Detect and remove duplicate or more similar records
7 years 7 months ago - 7 years 7 months ago #14449
by Kurt
Detect and remove duplicate or more similar records was created by Kurt
I am new to this, and this should be easy. My Excel file contains rows that have been duplicated, triplicated, and more. How should I clean the file and remove the redundancies?
Code Description
100 abc
100 abc
100 abc
101 bcd
101 bcd
101 aab
101 bba
102 bbc
102 bbc
102 bbc
102 bbc
Hoping to get the following output
100 abc
101 bcd
101 aab
101 bba
102 bbc
Last edit: 7 years 7 months ago by Kurt.
7 years 7 months ago - 2 years 3 months ago #14458
by KevinJohn
Replied by KevinJohn on topic Detect and remove duplicate or more similar records
We can detect duplicates by first declaring the columns that are supposed to be unique as the primary key, and then creating an SQL query in the transformation that compares every row against that key to find the duplicates.
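As a rough illustration of the SQL approach, here is a minimal sketch that uses Python's built-in sqlite3 module in place of the actual transformation. The table name `records` and the column names `code` and `description` are assumptions made up for this example, and the rows are copied from the question. It treats the pair (code, description) as the key that should be unique, lists the keys that repeat, and then keeps one row per key in the original order.

```python
import sqlite3

# Sample rows copied from the question (Code, Description).
ROWS = [
    (100, "abc"), (100, "abc"), (100, "abc"),
    (101, "bcd"), (101, "bcd"), (101, "aab"), (101, "bba"),
    (102, "bbc"), (102, "bbc"), (102, "bbc"), (102, "bbc"),
]

def load(conn):
    # "records" is a hypothetical table name chosen for this sketch.
    conn.execute("CREATE TABLE records (code INTEGER, description TEXT)")
    conn.executemany("INSERT INTO records VALUES (?, ?)", ROWS)

def find_duplicates(conn):
    # Treat (code, description) as the key that should be unique and
    # report every key that occurs more than once, with its count.
    return conn.execute(
        "SELECT code, description, COUNT(*) FROM records "
        "GROUP BY code, description HAVING COUNT(*) > 1 "
        "ORDER BY code, description"
    ).fetchall()

def deduplicated(conn):
    # Keep one row per key, ordered by first appearance (smallest rowid).
    return conn.execute(
        "SELECT code, description FROM records "
        "GROUP BY code, description ORDER BY MIN(rowid)"
    ).fetchall()

conn = sqlite3.connect(":memory:")
load(conn)
print(find_duplicates(conn))
print(deduplicated(conn))
```

If row order does not matter, `SELECT DISTINCT code, description FROM records` gives the same set of rows with a shorter query.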
We can also use the ETL-tools deduplicator.
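For readers who want to see what a deduplication step does in general, here is a small sketch in plain Python. It is not the ETL-tools deduplicator and says nothing about how that tool works internally; the function names and the "similar record" rule (ignore case and surrounding whitespace) are assumptions chosen only for illustration.

```python
def deduplicate(rows, key=lambda row: row):
    """Return rows with repeats removed, keeping the first occurrence."""
    seen = set()
    unique = []
    for row in rows:
        k = key(row)
        if k not in seen:
            seen.add(k)
            unique.append(row)
    return unique

def normalized(row):
    # Hypothetical similarity rule: compare descriptions without regard
    # to letter case or leading/trailing whitespace.
    code, description = row
    return (str(code).strip(), description.strip().lower())

# Made-up rows: the second one differs from the first only in case and spacing.
rows = [(100, "abc"), (100, "ABC "), (101, "bcd"), (101, "aab"), (101, "bcd")]

exact = deduplicate(rows)                  # exact matches only
loose = deduplicate(rows, key=normalized)  # also merges "similar" rows
print(exact)
print(loose)
```

Passing a different `key` function changes what counts as "the same record", which is usually the part that needs the most thought when rows are similar rather than identical.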
Further Reading
Last edit: 2 years 3 months ago by admin.
The following user(s) said Thank You: Kurt