Choosing right ETL tool

We have two ETL products Visual Importer ETL and Advanced ETL Processor.

Which one to choose depends on the complexity of the data and now you want to work with it

This article provide some examples of complex data

If you just want to load data from table, delimited file or simple XML file choose Visual Importer ETL.

If you work with complex data and require data validation, transformation or sorting and on top of it you also want to be able to generate the data select Advanced ETL Processor.

Both Advanced ETL Processor and Visual Importer ETL have 3 editions:

Standard Edition is designed for desktop user who just wants to run data load or transformation manually.

Professional Edition is designed for user who wants to execute business automation processes manually (Developers).

Enterprise Edition is designed for user who wants complete automation of business processes.

Selecting ETL Software

Download Visual Importer ETL Standard      Download Advanced ETL Processor Standard     
Download Visual Importer ETL Professional Download Advanced ETL Processor Professional
Download Visual Importer ETL Enterprise    Download Advanced ETL Processor Enterprise   

We made it easy to get support for etl software

Simple ways of getting support for ETL software.

In this article we are going to show you how easy it is to get support for our products.

First of all there are several ways of getting support.

  • Support Forum
  • Email To: This email address is being protected from spambots. You need JavaScript enabled to view it.
  • Online Contact Form

If you are stuck we can call you and connect remotely.
It is very important to provide as much information as possible first time.

  • Software name
  • Version number
  • Steps to reproduce the problem
  • Screen shots
  • Scripts to create the tables
  • Actual data example
  • Repository

Our support team is very fast and quite often they resolve the problem within one hour.

The software is designed in such a way that it is easy to contact support.

For Advanced ETL Processor click “Help” and then click “Email repository and Log Files to support” than the default email client will open and the use can enter additional information and attach more files. Using same menu the use can send email to support team or visit support forum.

Getting Support for Advanced ETL Processor

Using same menu the user can send email to support team or visit support forum.

Same principle works for Visual Importer ETL and Active Table Editor.

Getting Support for Visual Importer ETL

Getting Support for Active Table Editor

We encourage you to use our support forum first, so other users will benefit as well. If your issue is urgent contact us, We are always happy to connect remotely and assist.


SSIS excel problem

Solving SSIS problems with loading data from Excel

We have the following situation:

Imagine that you are working for large finance organization and would like to load the year end balances into SQL Server database. The excel file look like this.

Excel File

So we create a simple SSIS package using Excel connector and Ole DB destination, before loading the data we hit preview button.

SSIS Preview Result

For some reason last 3 lines have null values, This is because our connection string has not got IMEX=1

Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\DEMO\Excel\mixed_data_types.xls;Extended Properties="EXCEL 8.0;HDR=YES";

We correct it

Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\DEMO\Excel\mixed_data_types.xls;Extended Properties="EXCEL 8.0;HDR=YES;IMEX=1";

SSIS Preview Result

That looks much better so we load the data. Later same day, the finance manager says that the balance is wrong you look at the data and there are some null values again.

Why we have null in the database?

It is a common problem: trying to load the data from Excel file half of the data is coming as nulls, or columns with more than 255 characters are truncated

The logic behind Excel mixed data types

Our source file has mixed datatypes some account codes have anly numbers and some have characters as well. We have more cells with numbers only so Excel ODBC/Jet assumes the field type is numeric and shows the rest of it as nulls.

Setting IMEX=1 tells the driver to use Import mode. In this state, the registry setting ImportMixedTypes=Text will be noticed. This forces mixed data to be converted into text. For this to work reliably, you may also have to modify the registry setting, TypeGuessRows=8. The ISAM driver by default looks at the first eight rows and from that sampling determines the datatype. If this eight row sampling is all numeric, then setting IMEX=1 will not convert the default datatype to Text; it will remain numeric.

As partially explained here

http://support.microsoft.com/kb/257819

ODBC/MS Jet scans first TypeGuessRows to determine field type

Here how Excel ODBC/MS Jet works

(TypeGuessRows=8 IMEX=1)

In your eight (8) scanned rows, if the column contains five (5) numeric values and three (3) text values, the provider returns five (5) numbers and three (3) null values.
In your eight (8) scanned rows, if the column contains three (3) numeric values and five (5) text values, the provider returns three (3) null values and five (5) text values.
In your eight (8) scanned rows, if the column contains four (4) numeric values and four (4) text values, the provider returns four (4) numbers and four (4) null values.
In your eight (8) scanned rows all of them less than 255 characters the provider will truncate all data to 255 characters
In your eight (8) scanned rows, if the column contains five (5) values with more length than 255 the provider will return more than 255 characters

What else can we do?

We can change TypeGuessRows and set it to 0. This works very well for files up to 16384 lines.

The valid range of values for the TypeGuessRows key is 0 to 16. However, if the value is 0, the number of source rows scanned is 16384.
Read this Microsoft KB article: http://support.microsoft.com/kb/281517.

For larger files it may not work again, so the best solution to the problem is not to use excel as data source but use text files instead. But this may bring another problem if user open CSV after it was exported it may loose leading zeros.

The only way to make import from Excel work is

Set IMEX=1 in connection string

Close any programs that are running.
On the Start menu, click Run. Type regedit and click OK.
In the Registry Editor, expand the following key depending on the version of Excel that you are running:
Excel 97
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\3.5\Engines\Excel
Excel 2000 and later versions
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel

Select TypeGuessRows and on the Edit menu click Modify.
In the Edit DWORD Value dialogue box, click Decimal under Base.
Set the value to 1

Open Excel file

Make sure that the cells in the first line of the table have relevant data for example

  • mixed numbers and text characters for text fields
  • only numbers for numeric fields
  • If some of the data will be longer than 255 characters make sure that first line cell has more 255 characters otherwise it will be truncated

This solution apply to all versions of MS Excel ODBC driver, Ole DB, MS Jet, .NET, DTS and SSIS

We have spent enormous amount of time trying to get it fixed. So far we were not able to find a better solution.

The way Excel import works makes it not possible to automate it. You have to modify most of excel files manually in order to load them.

This why we are no longer using ODBC/OleDB/Ms Jet for Excel connections. Our ETL solutions work correctly with Excel all the time. Try it yourself now.

Advanced ETL Processor Working with Excel

  • Works directly with Excel 3.0-2007
  • No ODBC, OleDB or MS Jet Required
  • Works correctly with mixed data types
  • Works correctly with cells with more 255 characters
  • No need for IMEX=1, HDR=Yes or Registry hacks (TypeGuessRows)

  • Loads data correctly all the time + no need to edit Excel file
  • Can create Excel files in Excel 3.0-2007 format
  • Can insert data starting from specific cell
  • Can clear area before adding data into Excel
  • Can add headers

Related Microsoft KB's

  1. PRB: Excel Values Returned as NULL Using DAO OpenRecordset
  2. Data truncated to 255 characters with Excel ODBC driver
  3. Excel ODBC Driver May Determine Wrong Data Type
  4. BUG: Excel ODBC Driver Disregards the FirstRowHasNames or Header Setting
  5. How To Use ADO with Excel Data from Visual Basic or VBA

Working with repository

Why it is important to know how to work with repository.

The repository database stores all the information about connections, transformation scripts, packages, sql scripts, reports and execution logging.  This is where the results of ETL designer hard is stored and obviously no one wants to lose it.

Working with repositoryDefault repository.

The default repository is MS access. This type of repository works fine for development and for small production environment. From time to time we recommend to perform “compact and repair” using MS access. For heavy production environment for example when we want to run packages every minute in parallel we recommend using something else like MS SQL server or Oracle.

Note:

Information about current repository connection can be seen at the window header

Creating repository.

Repository can be created by running script or by using repository creation wizard.
They are located in C:\Users\Public\Documents\DBSL\Repository Scripts.

To create new MS Access repository copy
C:\Users\Public\Documents\DBSL\Repository Scripts\Repository\ Repository.mdb in to different directory and connect to it.

Default Repository connections.

When software is installed default repository connections are created such as MS Access, Oracle, SQL Server, MySQL, PosgreSQL.  When application first stars it connects to default MS Access repository.

Connecting to different repository.

  • Click Maintain tab
  • Select desired connection from the drop down box
  • Click reconnect.

Provided that you are using default settings all open objects will be saved and application will connect to different repository.

Creating new repository.

  • Click Maintain tab
  • Click create new
  • Follow the wizard steps
  • Once repository is created connect to it

Creating new repository connection

If you want to create new connection to the existing repository:

  • Click Maintain tab
  • Click options
  • Click plus
  • Fill all necessary details
  • Test connection
  • Click OK

Same dialogue can be used to amend exiting repository connections.

Backing up repository

  • Click Maintain tab
  • Click backup
  • Select the file to save data into
  • Done

Restoring repository

  • Click Maintain tab
  • Click restore
  • Select the file to restore data from
  • Done

To copy repository data from development to production environment

  • Connect to development environment
  • Backup repository
  • Connect to production environment
  • Restore repository

Controlling size of repository

While execution scheduled tasks the size of the repository is constantly growing. From time to time it might become necessary to clear execution log tables otherwise it might have negative impact on performance.

  • Click monitor tab
  • Click clear log button

clear log

 

Information above is relevant to both Advanced ETL Processor and Visual Importer ETL and some of of it to Active Table Editor

Magic button for Business Intelligence consultant

Advanced ETL Processor has a lot of useful features designed to make the life of Business intelligence consultants easier. One of it is "Magic Button". At any time the user can press it and see the result of data transformation or validation without actually loading the data.

Here is data sample we want to check if customer id is numeric and reject it if not

Source data Here is the mapping we will use for it.

Mapping ExampleIs Number transformation functionWe press the "Magic Button"

Magic buttonAnd we can immediately see the result of transformation: record rejected 

Mapping Result failure

record passed the test Mapping Result success

It is also possible to enter the data directly into source grid and test it the transformation that way

Mapping Result success

Learn MoreDownloadBuy Now

What is complex data

This Article provides examples of complex and simple data. We recommend using   Advanced ETL Processor for working with complex data structures and Visual Importer ETL with simple, if in doubt please contact us and we will help you to choose.

Simple data example : Excel file.

Data has stable structure and there is only one value in the cell. Most database tables and delimited files can be considered as simple data. except when single field has multiple values or holds xml data.

Simple excel fileSimple XML file:

XML tags are in the same order, number of tags is always the same and there are no attributes, This XML does not require any additional transformation and can be easily loaded into the database.

Simple XML

Complex XML message:

Structure is not stable, number of tags is different for different customer record, plus it uses attributes. This XML may need some additional transformations.

Complex XML

Complex Text file:

Data is split between different rows some empty rows should be ignored, some data must be taken from the header

Complex TEXT

Sql insert script:

Data can be easily extracted or generated using   Advanced ETL Processor

Insert Statement

Download Visual Importer ETL Enterprise Download Advanced ETL Processor Enterprise

ETL for Ordinary People

With our ETL software you can automate your business processes in couple of hours, than you can concentrate on what is important: running the business.

Our prices start only from 100 usd per licence.

Today 95 percent of the businesses are small and medium size companies. Most of ETL and business automation software is designed for large companies, it requires highly skilled staff, difficult to support and far too expensive.

Our ETL software was especially designed for ordinary users.

It takes seconds to install it, works straight out of the box, easy to use and easy to support. The trial is fully functional for 30 day and if necessary the trial period can be extended.

Our support team is always happy to connect remotely and assist

Transformation sample


Download Visual Importer ETL Enterprise

Download Advanced ETL Processor Enterprise

Which ETL tool is right for me?

More Articles...