Removing Duplicate Rows From Table in Oracle: 11 Answers

I'm testing something in Oracle and populated a table with some sample data, but in the process I
accidentally loaded duplicate records, so now I can't create a primary key using some of the columns.
How can I delete all duplicate rows and leave only one of them?
oracle duplicate-removal delete-row
edited Aug 17 at 13:54 by Sam; asked Feb 9 '09 at 17:34 by jmfsg
Use the rowid pseudocolumn.
DELETE FROM your_table
WHERE rowid not in
(SELECT MIN(rowid)
FROM your_table
GROUP BY column1, column2, column3);
Here column1, column2, and column3 make up the identifying key for each record. You might list
all of your columns.
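A minimal sketch of how this might be used end to end. It assumes the key is exactly column1, column2, column3, that none of those columns contain NULLs, and that your_table_pk is a hypothetical constraint name. Swapping DELETE for SELECT * previews the rows that would be removed before anything is changed.

```sql
-- Preview: the rows the DELETE below would remove
-- (every row except the lowest rowid in each key group).
SELECT *
FROM your_table
WHERE rowid NOT IN
      (SELECT MIN(rowid)
       FROM your_table
       GROUP BY column1, column2, column3);

-- Remove the extra copies, keeping one row per key.
DELETE FROM your_table
WHERE rowid NOT IN
      (SELECT MIN(rowid)
       FROM your_table
       GROUP BY column1, column2, column3);

-- Make the delete permanent once the preview looked right.
COMMIT;

-- With the duplicates gone, the primary key can be created.
-- your_table_pk is a hypothetical constraint name.
ALTER TABLE your_table
  ADD CONSTRAINT your_table_pk PRIMARY KEY (column1, column2, column3);
```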
edited Jun 12 at 14:13; answered Feb 9 '09 at 17:41 by Bill the Lizard
3 From here: devx.com/tips/Tip/14665 Bill the Lizard Feb 9 '09 at 17:44
+1 I had to find two duplicate phone numbers buried in 12,000+ records. Changed the DELETE to SELECT and
this found them in seconds. Saved me a ton of time, thank you. shimonyk Sep 23 '10 at 15:30
1 This approach did not work for me. I don't know why. When I replaced "DELETE" with "SELECT *", it returned
the rows I wanted to delete, but when I executed with "DELETE" it was just hanging indefinitely. aro_biz Jun 25
'12 at 12:05
Mine is also either hanging or just taking extremely long. It has been running for about 22 hours and is still going.
The table has 21M records. user1208908 Aug 22 '13 at 5:57
If you have a very large data set, I suggest adding further filtering to the WHERE clause where feasible; this
might help folks with long-running queries. Ricardo Apr 8 at 16:58
Removing duplicate rows from table in Oracle - Stack Overflow http://stackoverflow.com/questions/529098/removing-duplicate-rows-fro...
20/09/2014 5:34 PM

From DevX.com:
DELETE FROM our_table
WHERE rowid not in
(SELECT MIN(rowid)
FROM our_table
GROUP BY column1, column2, column3...) ;
Here column1, column2, etc. are the key columns you want to use.
answered Feb 9 '09 at 17:43 by Mark
DELETE FROM tablename a
WHERE a.ROWID > ANY (SELECT b.ROWID
FROM tablename b
WHERE a.fieldname = b.fieldname
AND a.fieldname2 = b.fieldname2)
answered Nov 9 '09 at 6:18 by user187624
Re my comment above on the top-voted answer, it was this request which actually solved my problem. aro_biz
Jun 25 '12 at 12:06
1 This will be -a lot- slower on huge tables than Bill's solution. Wouter May 15 at 14:01
create table t2 as select distinct * from t1;
answered Jan 11 '13 at 17:01 by Mohammed khaled
Not an answer: distinct * will keep every record that differs by at least one character in one column. All you need is
to select distinct values only from the columns you want to use as the primary key; Bill's answer is a great example of this
approach. Nogard Jan 11 '13 at 17:28
1 That was what I needed (remove entirely identical lines). Thanks ! Emmanuel Feb 20 '13 at 11:43
Another disadvantage of this method is that you have to create a copy of your table. For huge tables, this implies
providing additional tablespace, and deleting or shrinking the tablespace after the copy. Bill's method has more
benefits, and no additional disadvantages. Wouter May 15 at 13:59
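If the copy approach does fit (rows that are identical in every column), the swap might look like the sketch below. It assumes nothing else depends on t1: as far as I know, CREATE TABLE ... AS SELECT copies the data but not indexes, grants, triggers, or most constraints, so those would need to be recreated afterwards.

```sql
-- Build a de-duplicated copy (only fully identical rows collapse).
CREATE TABLE t2 AS SELECT DISTINCT * FROM t1;

-- Sanity check before dropping anything: compare row counts.
SELECT (SELECT COUNT(*) FROM t1) AS rows_before,
       (SELECT COUNT(*) FROM t2) AS rows_after
FROM dual;

-- Swap the copy into place under the original name.
DROP TABLE t1;
RENAME t2 TO t1;
```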
From Ask Tom
delete from t
where rowid IN ( select rid
                 from (select rowid rid,
                              row_number() over (partition by
                                  companyid, agentid, class, status, terminationdate
                                  order by rowid) rn
                       from t)
                 where rn <> 1);
answered Mar 18 '11 at 6:11 by Dead Programmer
Parenthesis missing in statement. I assume it should be at the end? user1208908 Aug 22 '13 at 6:02
DELETE FROM tableName WHERE ROWID NOT IN (SELECT MIN (ROWID) FROM table GROUP BY
answered Jan 12 at 5:32 by JgSudhakar
Same answer as the more elaborate answer of Bill the Lizard. Wouter May 15 at 13:55
delete from dept
where rowid in (
select rowid
from dept
minus
select max(rowid)
from dept
group by DEPTNO, DNAME, LOC
);
edited May 20 at 9:14 by Nic; answered May 20 at 8:49 by user3655760
Can you add more information about your way? Thanks. reporter May 20 at 9:16
The fastest way for really big tables

1. Create an exceptions table with the structure below:

   exceptions_table
   ROW_ID      ROWID
   OWNER       VARCHAR2(30)
   TABLE_NAME  VARCHAR2(30)
   CONSTRAINT  VARCHAR2(30)

2. Try to create a unique constraint or primary key that the duplicates will violate. You will get an
   error message because you have duplicates. The exceptions table will then contain the rowids of the
   duplicate rows.

   alter table add constraint
   unique or primary key
   (dupfield1,dupfield2) exceptions into exceptions_table;

3. Join your table with exceptions_table by rowid and delete the duplicates:

   delete original_dups where rowid in (select ROW_ID from exceptions_table);

4. If the number of rows to delete is large, create a new table (with all grants and indexes) by anti-joining
   with exceptions_table on rowid, then rename the original table to original_dups and rename
   new_table_with_no_dups to the original table name:

   create table new_table_with_no_dups AS (
   select field1, field2 ........
   from original_dups t1
   where not exists ( select null from exceptions_table T2 where t1.rowid = t2
   )
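The steps above can be sketched as follows. Here your_table and your_table_uk are hypothetical names, and dupfield1, dupfield2 are the key columns from the answer. One caveat worth checking before deleting: as far as I know, EXCEPTIONS INTO records every row in each duplicated group, including the copy you want to keep, so this sketch also excludes the lowest rowid per key from the delete.

```sql
-- Step 1: table that receives the rowids of violating rows.
CREATE TABLE exceptions_table (
  row_id      ROWID,
  owner       VARCHAR2(30),
  table_name  VARCHAR2(30),
  constraint  VARCHAR2(30)
);

-- Step 2: expected to fail while duplicates exist;
-- the violating rowids are still written to exceptions_table.
-- your_table and your_table_uk are hypothetical names.
ALTER TABLE your_table
  ADD CONSTRAINT your_table_uk UNIQUE (dupfield1, dupfield2)
  EXCEPTIONS INTO exceptions_table;

-- Step 3: delete only the recorded rows, but keep one row per key
-- (the one with the lowest rowid).
DELETE FROM your_table d
WHERE d.rowid IN (SELECT e.row_id FROM exceptions_table e)
  AND d.rowid NOT IN (SELECT MIN(k.rowid)
                      FROM your_table k
                      GROUP BY k.dupfield1, k.dupfield2);
```

Restricting the delete to rowids already listed in exceptions_table keeps the work proportional to the number of duplicates rather than the size of the table, which is the point of this approach.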
edited May 30 at 1:36 by Stephen Ostermiller; answered May 30 at 1:07 by user2158672
To select only the duplicates, the query format can be:

SELECT GroupFunction(column1), GroupFunction(column2), ..., COUNT(column1), column1, column2, ...
FROM our_table
GROUP BY column1, column2, column3, ...
HAVING COUNT(column1) > 1

So the correct query, as per the other suggestion, is:

DELETE FROM tablename a
WHERE a.ROWID > ANY (SELECT b.ROWID
                     FROM tablename b
                     WHERE a.fieldname = b.fieldname
                     AND a.fieldname2 = b.fieldname2
                     AND ... and so on, to identify the duplicate rows ...)

This query keeps the oldest record in the database for the criteria chosen in the WHERE clause.
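A concrete instance of the first query shape, assuming a table your_table whose duplicate key is column1, column2 (all three names are placeholders). COUNT(*) and MIN(rowid) stand in for the group functions here.

```sql
-- List each duplicated key once, with how many copies exist
-- and the lowest rowid in the group (the row a "keep the
-- lowest rowid" delete would leave in place).
SELECT column1,
       column2,
       COUNT(*)   AS copies,
       MIN(rowid) AS kept_rowid
FROM your_table
GROUP BY column1, column2
HAVING COUNT(*) > 1;
```

An empty result means there is nothing to delete for that key.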
Oracle Certified Associate (2008)
edited Jun 17 at 14:34; answered Jun 17 at 12:24 by user1799846
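You could write a small PL/SQL block using a cursor FOR loop and delete the rows you don't want to keep. For
instance:
declare
  prev_var my_table.var1%TYPE;  -- var1 of the previously fetched row
begin
  -- walk the rows in var1 order, so equal values sit next to each other
  for t in (select rowid as rid, var1 from my_table order by var1) loop
    if t.var1 = prev_var then
      delete from my_table where rowid = t.rid;  -- same value as the previous row: duplicate
    else
      prev_var := t.var1;  -- first row with this value: keep it
    end if;
  end loop;
end;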
answered Feb 9 '09 at 17:44 by Nick
I believe the downvote is because you are using PL/SQL when you can do it in SQL, in case you are wondering.
WW. Feb 10 '09 at 1:39
2 Just because you can do it in SQL doesn't mean it's the only solution. I posted this solution after I had seen the
SQL-only solution. I thought downvotes were for incorrect answers. Nick Feb 10 '09 at 2:43
create or replace procedure delete_duplicate_enq as
cursor c1 is
select *
from enquiry;
begin
for z in c1 loop
delete enquiry
where enquiry.enquiryno = z.enquiryno
and rowid > any
(select rowid
from enquiry
where enquiry.enquiryno = z.enquiryno);
end loop;
end delete_duplicate_enq;
edited Nov 26 '13 at 9:27 by Radim Köhler; answered Nov 26 '13 at 9:04 by Ashish sinha
A major disadvantage of this method is the inner join. For big tables this will be a lot slower than Bill's method.
Also, using PL/SQL for this is overkill; you could do the same with plain SQL. Wouter May 15 at 13:57