Different methods of deleting duplicate records.
I am using Employees table with FirstName and Department columns.
First Method.
Delete duplicate records/rows by creating identity column.
duplicate records in table looks like shown in first image.
First of all we need to create a identity column in our table byusing code mentioned below.
And table will look like image on the left.
1ALTER TABLE dbo.Employees ADD ID INT IDENTITY(1,1)Now write this query to delete duplicate rows.
1DELETE FROM dbo.Employees2WHERE ID NOT IN (SELECT MIN(ID)3FROM dbo.Employees GROUP BY FirstName,Department)This should remove all duplicate records from table.
Second Method.
Delete duplicate records using Row_Number()
If you do not want to make any changes in table design or don't want to create identity column on table then you can remove duplicate records using Row_Number in sql server 2005 onwards.
for this write below mentioned code and execute.
1WITH DuplicateRecords AS2(3SELECT *,row_number() OVER(PARTITION BY FirstName,Department ORDER BY 4 5FirstName) 6AS RowNumber FROM dbo.Employees7)8DELETE FROM DuplicateRecords WHERE RowNumber>1This should remove all duplicate records from table.
Third Method.
Remove duplicate rows/Records using temporary table
Use below mentioned code to delete duplicates by moving them to temporary table using DISTINCT.
1SELECT DISTINCT * INTO TempTable FROM dbo.Employees2GROUP BY FirstName,Department3HAVING COUNT(FirstName) > 14 5DELETE dbo.Employees WHERE FirstName6IN (SELECT FirstName FROM TempTable)7 8INSERT dbo.Employees SELECT * FROM TempTable9DROP TABLE TempTable 
 
No comments:
Post a Comment