Question from the customer

I have 350+ text files in the same folder.

  1. There is no pattern in files names so they cannot be grouped together.
  2. All CSV files have certain columns that are found in all (Store Name, Address, City, State, Zip Code, Phone Number)
  3. Fields order is not guaranteed
  4. Some CSV file will have extra columns such as Fax Number, Store Website, Store Hours...

I would like to combine all 350 CSV files and line up the common columns but also include the extra columns
Also, I would like to add a new column to each record which would be the file name.
Also, a log of the operation that would identify the problem CSV (Ones unable to import/combine because if format errors)o add the file name and record name to the output file

Proposed Solution

The first thing we need to is to do create the list of fields this can be easily done with the help of validator, pivot and deduplication objects.

As usual, the list of fields is at the first line of the file so we need to generate file lines number and reject all lines except the first one.

Creating a List of Fields step by step guide
Generating File Line Number

Creating List Of Fields 1

Filtering First Line

Creating List Of Fields 2

Unpivot the Data

Creating List Of Fields Using Unpivot

Deduplicate the data

Creating List Of Fields Using Deduplicator

The result

Creating List Of Fields Result


Direct link, no registration required.