Data Warehousing and Data integration

Variables, Conditions, Packages and Transformations

More
5 years 4 months ago #17870 by bruce.gibbins
Hi

I realise this will be a mixture of things and you have documented some of this in parts within the Wiki and Forum. However, I am struggling with a clean cut solution and in my opinion this would be a relatively common activity. Hence, I'd like to get the response all into one spot.

The scenario is based on files landing in a monitored directory. At this stage I am not worried about volume or frequency. But will be handling each file individually for internal reasons. Each file has a uniqiue name but a common format of xxx-yyyymmddhhmmssnnnn.ext

"xxx" represents a client prefix (eg. abc, bcd, cde). The format of each file for the same client will be the same. But the format is different for each client. The file extension '.ext' will generally be .csv, .xml, or even .xls.

The file transformations between clients will be different. However, the output is always the same format and the the target is a SQL Server database table.

I am developing each of the transformations in isolation, but with the same basic structure.

My idea is to have a single monitor looking at a single folder for *.* That then executes a single Package based on the creation of a file. The Monitor passes the file to the Package that then does some environment management, put the source file in a staging location, then based on the file prefix conditionally (Probably through a CASE action) call the associated transformation and load to the SQL server and then cleanup.

My problem is pure inexperience with AETL. I know I need to use a variable to hold the inbound file name prefix and then use that in a CASE Action. But for the life of me I cannot seem to get it to work. I am never sure if I should use a script and set variables that way, or use the SetVariable Package Action. I am then not sure how to get the CASE Statement to use this variable. The example in the Wiki uses vCase but I suspect it should be <vCase>. Which beings me to a another query as I am never sure when to use "<>" around a variable as I am also unsure of when to use "getVariable()" or indeed when to use "{}"

I was hoping that for my sake you could spend some time to provide a single complete answer here rather than me trying to piece together bits and pieces from all of the other content in the Wiki and Forums.

Please Log in or Create an account to join the conversation.

More
5 years 4 months ago #17877 by admin
We will have a look at it and provide you with some examples

Mike
ETL Architect

Please Log in or Create an account to join the conversation.

More
5 years 4 months ago - 5 years 4 months ago #17883 by Peter.Jonson
Hi Bruce.

We have created a working example for you.
Please see repository backup attached.

We also created the following article, We hope It will benefit a lot of users.
www.etl-tools.com/datawarehousing/exampl...ehouse-workflow.html

Peter Jonson
ETL Developer
Attachments:
Last edit: 5 years 4 months ago by Peter.Jonson.
The following user(s) said Thank You: bruce.gibbins

Please Log in or Create an account to join the conversation.

More
5 years 4 months ago #17884 by Peter.Jonson
The example in the Wiki uses vCase but I suspect it should be <vCase>

The variable name can be any string,
We decided to use <VariableFormat> but you do not have to.

You can use "Case" package action if you wish, but if you have a lot of customers the package will be massive therefore it will difficult to maintain/update.
I think what we created provides the more flexible solution.

"Which beings me to a another query as I am never sure when to use "<>" around a variable as I am also unsure of when to use "getVariable()" or indeed when to use "{}""

{} allows using simple calculations inside file names

getVariable() can be used inside {} or inside script

I hope that clears some confusion

Peter Jonson
ETL Developer
The following user(s) said Thank You: bruce.gibbins

Please Log in or Create an account to join the conversation.