Background:

This article was created as an answer to one of our ETL forum posts, we hope it will help a lot of users in a similar situation.

"The scenario is based on files landing in a monitored directory. At this stage, I am not worried about volume or frequency. But will be handling each file individually for internal reasons. Each file has a unique name but a common format of xxx-yyyymmddhhmmssnnnn.ext

"xxx" represents a client prefix (eg. abc, bcd, cde). The format of each file for the same client will be the same. But the format is different for each client. The file extension '.ext' will generally be .csv, .xml, or even .xls.

The file transformations between clients will be different. However, the output is always the same format and the target is a SQL Server database table.

I am developing each of the transformations in isolation, but with the same basic structure.

My idea is to have a single monitor looking at a single folder for *.* That then executes a single Package based on the creation of a file. The Monitor passes the file to the Package that then does some environment management, put the source file in a staging location, then based on the file prefix conditionally (Probably through a CASE action) call the associated transformation and load to the SQL server and then clean up."

Notes:

The monitor (Directory monitor) is a windows service that is constantly monitoring a directory for changes. When it happens it adds predefined action to the execution queue. Monitor passes the file name as a variable into the action, this variable name is <Directory Monitor File Name>.

Solution:

Example of data warehouse workflow

"Get ClientID" script

 is a starting point, it extracts the first three characters of the file name. The Result is stored in '<ClientID>' variable.

var
FileName: String;
ClientID: String;
begin
FileName := GetVariable('<Directory Monitor File Name>');
ClientID:=UpperCase(SubString(FileName,1,3));
SetVariable('<ClientID>',ClientID);

Result:=true;
end;

Check Client directory

Checks if the client directory exists and if it does not it creates it together with subdirectories.

 Every transformation has an ID. We use this ID to execute a specific transformation for a specific customer.

Table Transformations

Holds mapping between ClientID and TransformationID:

CREATE TABLE [dbo].[Transformations]
( [ClientID] [char](3) NOT NULL, [TransformationID] [int] NULL, 
CONSTRAINT [PK_Transformations]
PRIMARY KEY CLUSTERED ( [ClientID] ASC)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
ON [PRIMARY]) ON [PRIMARY]
GO 

Loookup

"Insert Into Transformations"

Sql script inserts a record for missing clients:

insert into Transformations(
[ClientID],
[TransformationID])
values ('<ClientID>',0)

GO

Once the file is ready for ready processing it is moved into Staging Area.

Get transformation ID

executes the following SQL:

 Select TransformationID
from Transformations
where ClientID='<ClientID>'

The result is saved into <TransformationID> variable.

Execute Transformation script

begin
ExecuteObject(GetVariable('<TransformationID>'));
Result:=true;
end;

Upon completion, the file is moved into Processed Area.

Directory Structure

 DirectoryStructure

Variable as Mask

Mask

Error message

Acts as a default error handler

Direct link, no registration required.

User Rating: 5 / 5

Star ActiveStar ActiveStar ActiveStar ActiveStar Active
 
 
Xerox
Swiss banking
Bank Of Oklahoma
Red Cross
Alta Pacific bank
Copeinca
Gas alberta
NHS
Royal Brunei
First Oklahoma bank
Noresco
Iqvia

Testimonials

What customers say about us

  • swissbanking

    I used Advanced ETL Processor in 2 Enterprises for many business processes and Business automation (outside finance department). I did not find any other tool with so many functions and broad flexibility for that Price! If you need support for bugs or solution design you will get it very fast. Best Support I have ever seen.

    Lionel Albrecht
  • iqvia

    IQVIA and DB Software Laboratory (DBSL) partnered in 2010 and have been working in close cooperation ever since. Over this period of time, DBSL software components formed an integral part of a large number of IQVIA applications currently used by over 20 UK NHS Trusts (Hospitals).

    Dmitry Dorsky,
    Director
  • xerox

    The product is easy to learn and once a developer understands the ETL way for solving the problem at hand, the developer's productivity will increase. Even our DBAs now uses the ETL software to quickly create solutions instead of SSIS or SQL jobs.

    Daniel Fung
    Solutions Architect

 

This site uses cookies. By continuing to browse the site, you are agreeing to our use of cookies