You are here: HomeArticles

QVD vs QVX

Comparison of data loading strategy into QliKView

Using QVD files

Working With QVD files

There some advantages and disadvantages of using QVD files

  • Same data for all users
  • Loader QVS pulls data into memory and saves it into the file
  • Additional processing on the QlikView server
  • QVD’s can be generated only by QlikView

Using QVX files

Working With QVX files

Benefits of using QVX files

  • Same data for all users
  • Advanced ETL Processor data saves the data into the file without loading everything into the memory
  • AETL can perform data transformation and validation on the fly
  • Can automate file transfers via FTP/Emails/HTTP/File operations.
  • Can work in distributed environment

QlikView - Integration performance boost

Great News: our QlikView connector can generate 20 million records in 11 minutes. that 7+ times more than previous version

QlikView - Data Integration performance boost

More about ETL-Tools QlikView connector

ETL Version Control why you need it

We do a lot of consutancy work and we decided to introduce version control after one of our customers modified transformations beyond use. Version control helped us to restore project to it's original state. Another situation when version control is very usefull is a large project with multiple developers.

Advanced ETL Processor version control tab

More about ETL-Tools

Latest ETL Software changes

+ New Version control tab was introduced
+ Version control can submit several objects now
+ Version control dialogue can delete multiple objects now
- Various bug fixes

Choosing right ETL tool

We have two ETL products Visual Importer ETL and Advanced ETL Processor.

Which one to choose depends on the complexity of the data and now you want to work with it

This article provide some examples of complex data

If you just want to load data from table, delimited file or simple XML file choose Visual Importer ETL.

If you work with complex data and require data validation, transformation or sorting and on top of it you also want to be able to generate the data select Advanced ETL Processor.

Both Advanced ETL Processor and Visual Importer ETL have 3 editions:

Standard, Professional and Enterprise:

Standard Edition is designed for desktop user who just wants to run data load or transformation manually.

Professional Edition is designed for user who wants to execute business automation processes manually.

Enterprise Edition is designed for user who wants complete automation of business processes.

Selecting ETL Software

Download Visual Importer ETL Standard      Download Advanced ETL Processor Standard     
Download Visual Importer ETL Professional Download Advanced ETL Processor Professional
Download Visual Importer ETL Enterprise    Download Advanced ETL Processor Enterprise   

We made it easy to get support for etl software

Simple ways of getting support for ETL software.

In this article we are going to show you how easy it is to get support for our products.

First of all there are several ways of getting support.

  • Support Forum
  • Email To: This email address is being protected from spambots. You need JavaScript enabled to view it.
  • Online Contact Form

If you are stuck we can call you and connect remotely.
It is very important to provide as much information as possible first time.

  • Software name
  • Version number
  • Steps to reproduce the problem
  • Screen shots
  • Scripts to create the tables
  • Actual data example
  • Repository

Our support team is very fast and quite often they resolve the problem within one hour.

The software is designed in such a way that it is easy to contact support.

For Advanced ETL Processor click “Help” and then click “Email repository and Log Files to support” than the default email client will open and the use can enter additional information and attach more files. Using same menu the use can send email to support team or visit support forum.

Getting Support for Advanced ETL Processor

Using same menu the user can send email to support team or visit support forum.

Same principle works for Visual Importer ETL and Active Table Editor.

Getting Support for Visual Importer ETL

Getting Support for Active Table Editor

We encourage you to use our support forum first, so other users will benefit as well. If your issue is urgent contact us, We are always happy to connect remotely and assist.


SSIS excel problem

Solving SSIS problems with loading data from Excel

We have the following situation:

Imagine that you are working for large finance organization and would like to load the year end balances into SQL Server database. The excel file look like this.

Excel File

So we create a simple SSIS package using Excel connector and Ole DB destination, before loading the data we hit preview button.

SSIS Preview Result

For some reason last 3 lines have null values, This is because our connection string has not got IMEX=1

Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\DEMO\Excel\mixed_data_types.xls;Extended Properties="EXCEL 8.0;HDR=YES";

We correct it

Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\DEMO\Excel\mixed_data_types.xls;Extended Properties="EXCEL 8.0;HDR=YES;IMEX=1";

SSIS Preview Result

That looks much better so we load the data. Later same day, the finance manager says that the balance is wrong you look at the data and there are some null values again.

Why we have null in the database?

It is a common problem: trying to load the data from Excel file half of the data is coming as nulls, or columns with more than 255 characters are truncated

The logic behind Excel mixed data types

Our source file has mixed datatypes some account codes have anly numbers and some have characters as well. We have more cells with numbers only so Excel ODBC/Jet assumes the field type is numeric and shows the rest of it as nulls.

Setting IMEX=1 tells the driver to use Import mode. In this state, the registry setting ImportMixedTypes=Text will be noticed. This forces mixed data to be converted into text. For this to work reliably, you may also have to modify the registry setting, TypeGuessRows=8. The ISAM driver by default looks at the first eight rows and from that sampling determines the datatype. If this eight row sampling is all numeric, then setting IMEX=1 will not convert the default datatype to Text; it will remain numeric.

As partially explained here

http://support.microsoft.com/kb/257819

ODBC/MS Jet scans first TypeGuessRows to determine field type

Here how Excel ODBC/MS Jet works

(TypeGuessRows=8 IMEX=1)

In your eight (8) scanned rows, if the column contains five (5) numeric values and three (3) text values, the provider returns five (5) numbers and three (3) null values.
In your eight (8) scanned rows, if the column contains three (3) numeric values and five (5) text values, the provider returns three (3) null values and five (5) text values.
In your eight (8) scanned rows, if the column contains four (4) numeric values and four (4) text values, the provider returns four (4) numbers and four (4) null values.
In your eight (8) scanned rows all of them less than 255 characters the provider will truncate all data to 255 characters
In your eight (8) scanned rows, if the column contains five (5) values with more length than 255 the provider will return more than 255 characters

What else can we do?

We can change TypeGuessRows and set it to 0. This works very well for files up to 16384 lines.

The valid range of values for the TypeGuessRows key is 0 to 16. However, if the value is 0, the number of source rows scanned is 16384.
Read this Microsoft KB article: http://support.microsoft.com/kb/281517.

For larger files it may not work again, so the best solution to the problem is not to use excel as data source but use text files instead. But this may bring another problem if user open CSV after it was exported it may loose leading zeros.

The only way to make import from Excel work is

Set IMEX=1 in connection string

Close any programs that are running.
On the Start menu, click Run. Type regedit and click OK.
In the Registry Editor, expand the following key depending on the version of Excel that you are running:
Excel 97
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\3.5\Engines\Excel
Excel 2000 and later versions
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel

Select TypeGuessRows and on the Edit menu click Modify.
In the Edit DWORD Value dialogue box, click Decimal under Base.
Set the value to 1

Open Excel file

Make sure that the cells in the first line of the table have relevant data for example

  • mixed numbers and text characters for text fields
  • only numbers for numeric fields
  • If some of the data will be longer than 255 characters make sure that first line cell has more 255 characters otherwise it will be truncated

This solution apply to all versions of MS Excel ODBC driver, Ole DB, MS Jet, .NET, DTS and SSIS

We have spent enormous amount of time trying to get it fixed. So far we were not able to find a better solution.

The way Excel import works makes it not possible to automate it. You have to modify most of excel files manually in order to load them.

This why we are no longer using ODBC/OleDB/Ms Jet for Excel connections. Our ETL solutions work correctly with Excel all the time. Try it yourself now.

Advanced ETL Processor Working with Excel

  • Works directly with Excel 3.0-2007
  • No ODBC, OleDB or MS Jet Required
  • Works correctly with mixed data types
  • Works correctly with cells with more 255 characters
  • No need for IMEX=1, HDR=Yes or Registry hacks (TypeGuessRows)

  • Loads data correctly all the time + no need to edit Excel file
  • Can create Excel files in Excel 3.0-2007 format
  • Can insert data starting from specific cell
  • Can clear area before adding data into Excel
  • Can add headers

Related Microsoft KB's

  1. PRB: Excel Values Returned as NULL Using DAO OpenRecordset
  2. Data truncated to 255 characters with Excel ODBC driver
  3. Excel ODBC Driver May Determine Wrong Data Type
  4. BUG: Excel ODBC Driver Disregards the FirstRowHasNames or Header Setting
  5. How To Use ADO with Excel Data from Visual Basic or VBA

Working with repository

Why it is important to know how to work with repository.

The repository database stores all the information about connections, transformation scripts, packages, sql scripts, reports and execution logging.  This is where the results of ETL designer hard is stored and obviously no one wants to lose it.

Working with repositoryDefault repository.

The default repository is MS access. This type of repository works fine for development and for small production environment. From time to time we recommend to perform “compact and repair” using MS access. For heavy production environment for example when we want to run packages every minute in parallel we recommend using something else like MS SQL server or Oracle.

Note:

Information about current repository connection can be seen at the window header

Creating repository.

Repository can be created by running script or by using repository creation wizard.
They are located in C:\Users\Public\Documents\DBSL\Repository Scripts.

To create new MS Access repository copy
C:\Users\Public\Documents\DBSL\Repository Scripts\Repository\ Repository.mdb in to different directory and connect to it.

Default Repository connections.

When software is installed default repository connections are created such as MS Access, Oracle, SQL Server, MySQL, PosgreSQL.  When application first stars it connects to default MS Access repository.

Connecting to different repository.

  • Click Maintain tab
  • Select desired connection from the drop down box
  • Click reconnect.

Provided that you are using default settings all open objects will be saved and application will connect to different repository.

Creating new repository.

  • Click Maintain tab
  • Click create new
  • Follow the wizard steps
  • Once repository is created connect to it

Creating new repository connection

If you want to create new connection to the existing repository:

  • Click Maintain tab
  • Click options
  • Click plus
  • Fill all necessary details
  • Test connection
  • Click OK

Same dialogue can be used to amend exiting repository connections.

Backing up repository

  • Click Maintain tab
  • Click backup
  • Select the file to save data into
  • Done

Restoring repository

  • Click Maintain tab
  • Click restore
  • Select the file to restore data from
  • Done

To copy repository data from development to production environment

  • Connect to development environment
  • Backup repository
  • Connect to production environment
  • Restore repository

Controlling size of repository

While execution scheduled tasks the size of the repository is constantly growing. From time to time it might become necessary to clear execution log tables otherwise it might have negative impact on performance.

  • Click monitor tab
  • Click clear log button

clear log

 

Information above is relevant to both Advanced ETL Processor and Visual Importer ETL and some of of it to Active Table Editor

Page 3 of 6

Go to top