Data cleansing is the process of correcting data errors and removing invalid information.
Bad data example

[Figure: data quality issues]

Inconsistent values: US and USA mean the same thing to a human reader, but a computer treats them as different values. This often happens when merging data from different data sources.

Missing values: UK and Germany values are missing. Most likely this data is incorrect and must be removed from the final dataset.

Data entry errors: Spain and SPain are treated as two different values.

Uniqueness: ORDER_ID must be unique.

Inconsistent date formats: A common problem when merging data from various countries.

Non-numeric characters inside numeric fields: Also common when merging data from various countries. They can be corrected easily with the Delete Characters transformation function.

Leading and trailing spaces: The invisible enemy of a data analyst. Use the Trim transformation function to correct this error.
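
Outside the ETL tool, the same kinds of problems can be spotted with a few lines of code. The following is a minimal Python sketch, not part of Advanced ETL Processor. The sample rows, the COUNTRY and AMOUNT column names, and the find_issues helper are all hypothetical; only ORDER_ID comes from the example above.

```python
from collections import Counter

# Hypothetical sample rows; COUNTRY and AMOUNT column names are assumptions.
rows = [
    {"ORDER_ID": "1", "COUNTRY": "US", "AMOUNT": "100"},
    {"ORDER_ID": "2", "COUNTRY": "USA", "AMOUNT": "$1,200"},
    {"ORDER_ID": "2", "COUNTRY": " Spain ", "AMOUNT": "75"},
    {"ORDER_ID": "3", "COUNTRY": "", "AMOUNT": "£ 40"},
]


def find_issues(rows):
    """Return (row_index, issue) tuples for a few common data quality problems."""
    issues = []
    # Count how often each ORDER_ID occurs so duplicates can be flagged.
    id_counts = Counter(row["ORDER_ID"] for row in rows)
    for index, row in enumerate(rows):
        country = row["COUNTRY"]
        if not country.strip():
            issues.append((index, "missing country"))
        elif country != country.strip():
            issues.append((index, "leading or trailing spaces"))
        if id_counts[row["ORDER_ID"]] > 1:
            issues.append((index, "duplicate ORDER_ID"))
        try:
            float(row["AMOUNT"])
        except ValueError:
            # Currency symbols or thousands separators make the value non-numeric.
            issues.append((index, "non-numeric amount"))
    return issues


for index, issue in find_issues(rows):
    print(f"row {index}: {issue}")
```

Note that a check like this cannot tell that US and USA are the same country; that kind of inconsistency needs a lookup table of known variants.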

Data Cleansing Example

Steps to follow

  • Download and install Advanced ETL Processor [Link]
  • Download and unzip the example [Link]
  • Create a new transformation and open the .ats file

[Figure: Data Cleansing]

  • Double click on the Reader object and amend the source file path
  • Double click on the Writer object and amend the target file path
  • Run the transformation by pressing the green arrow.

How the Data Validation process works

The data reader loads the Excel file into memory, and the validator rejects rows with an empty Country name field.
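
The validation rule can be illustrated in plain Python. This is only a sketch of the idea, not how the validator is implemented internally. The split_valid_rejected function and the "Country" dictionary key are hypothetical, and it assumes that a field containing only whitespace also counts as empty.

```python
def split_valid_rejected(rows, field="Country"):
    """Split rows into those with a non-empty field and those without.

    Assumption: a value that is missing, None, or whitespace-only is "empty".
    """
    valid, rejected = [], []
    for row in rows:
        value = row.get(field)
        if value is None or not str(value).strip():
            rejected.append(row)
        else:
            valid.append(row)
    return valid, rejected


# Hypothetical rows standing in for the loaded Excel data.
sample = [
    {"Order Id": 1, "Country": "UK"},
    {"Order Id": 2, "Country": ""},
    {"Order Id": 3, "Country": None},
    {"Order Id": 4, "Country": "   "},
]
valid, rejected = split_valid_rejected(sample)
print(len(valid), "valid,", len(rejected), "rejected")
```

Keeping the rejected rows in a separate list, rather than discarding them, makes it possible to review why records were dropped.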

[Figure: Removing Empty Values]

Cleansing the data

Once bad records are rejected, the transformer performs additional cleaning:

  • The Delete Characters transformation function deletes dollar sign, pound sign, comma and space characters from the Amount field.
  • The Date Format transformation function reformats the Order Date field into standard ODBC format.
  • The Lookup transformation function corrects the Country field values.
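
For readers who want to see the logic behind these three steps, here is a rough Python equivalent. It is a sketch under stated assumptions, not the tool's implementation: the function names, the lookup table contents and the list of accepted input date formats are all hypothetical, and "standard ODBC format" is assumed to mean yyyy-mm-dd.

```python
from datetime import datetime

# Assumed lookup table mapping known variants to one corrected value.
COUNTRY_LOOKUP = {"us": "USA", "usa": "USA", "spain": "Spain"}

# Assumed input formats, tried in order. A date such as 01/02/2020 is
# ambiguous, so the order here decides whether day or month comes first.
INPUT_DATE_FORMATS = ("%d/%m/%Y", "%Y-%m-%d", "%d.%m.%Y")


def clean_amount(text):
    """Delete dollar sign, pound sign, comma and space characters."""
    for ch in "$£, ":
        text = text.replace(ch, "")
    return text


def to_odbc_date(text):
    """Reformat a date string as yyyy-mm-dd, trying each assumed input format."""
    text = text.strip()
    for fmt in INPUT_DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {text!r}")


def fix_country(text):
    """Correct a country value via the lookup table; unknown values are only trimmed."""
    trimmed = text.strip()
    return COUNTRY_LOOKUP.get(trimmed.lower(), trimmed)


print(clean_amount("$1,200.50"), to_odbc_date("31/12/2020"), fix_country("SPain"))
```

The lookup is case-insensitive here so that entries such as SPain are corrected without listing every capitalisation separately.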

[Figures: transforming and cleansing data; delete characters; date format properties; lookup properties 1; lookup properties 2]

Please contact us if you need help with transforming the data.


