Solving SSIS problems with loading data from Excel

We have the following situation:

Imagine that you are working for a large financial organization and would like to load the year-end balances into the SQL Server database. The excel file looks like this.

Excel File

So we create a simple SSIS package using Excel connector and Ole DB destination, before loading the data we hit the preview button.

SSIS Preview Result

For some reason last 3 lines have null values, This is because our connection string has not got IMEX=1

Provider=Microsoft.ACE.OLEDB.16.0;Data Source=C:\DEMO\Excel\mixed_data_types.xls;Extended Properties="EXCEL 8.0;HDR=YES";

This can be easily corrected by adding IMEX=1

Provider=Microsoft.ACE.OLEDB.16.0;Data Source=C:\DEMO\Excel\mixed_data_types.xls;Extended Properties="EXCEL 8.0;HDR=YES;IMEX=1";

Note:

In this example, we are using Microsoft Access Engine 2016, the same applies to the older versions

SSIS Preview Result

That looks much better so we load the data. Later the same day, the finance manager says that the balance is wrong so you look at the data and there are some null values again.

Why do we have nulls in the database?

It is a common problem: trying to load the data from an Excel file half of the data is coming as nulls, or columns with more than 255 characters are truncated

The logic behind Excel mixed data types

Our source file has mixed data types some account codes have any numbers and some have characters as well. We have more cells with numbers only so Excel ODBC/Jet assumes the field type is numeric and shows the rest of it as nulls.

Setting IMEX=1 tells the driver to use the Import mode. In this state, the registry setting ImportMixedTypes=Text will be noticed. This forces mixed data to be converted into text. For this to work reliably, you may also have to modify the registry setting, TypeGuessRows=8. The driver by default looks at the first eight rows and from that sampling determines the datatype. If this eight-row sampling is all numeric, then setting IMEX=1 will not convert the default datatype to Text; it will remain numeric.

As partially explained here:

Initializing the Microsoft Excel driver

ODBC/MS Jet scans first TypeGuessRows to determine field type

Here is how Excel ODBC/MS Jet works

(TypeGuessRows=8 IMEX=1)

In your eight (8) scanned rows, if the column contains five (5) numeric values and three (3) text values, the provider returns five (5) numbers and three (3) null values.
In your eight (8) scanned rows, if the column contains three (3) numeric values and five (5) text values, the provider returns three (3) null values and five (5) text values.
In your eight (8) scanned rows, if the column contains four (4) numeric values and four (4) text values, the provider returns four (4) numbers and four (4) null values.
In your eight (8) scanned rows all of them less than 255 characters the provider will truncate all data to 255 characters
In your eight (8) scanned rows, if the column contains five (5) values with more length than 255 the provider will return more than 255 characters

What else can we do?

We can change TypeGuessRows and set it to 1048576.

  • The valid range of values for the TypeGuessRows key is 0 to 1048576
  • If the value is 0, the number of source rows scanned is 16384.

It makes things better but still, some values may be converted to nulls so the best solution to the problem is not to use excel as a data source but use text files instead. But this may bring another problem if a user opens CSV after it was exported it may lose leading zeros.

The only way to make import from Excel work is

Set IMEX=1 in the connection string

Close all programs that are running.
Click Start menu, click Run. Type Regedit and click OK.
In the Registry Editor, expand the following key depending on the version of Excel that you are running:

For 64 bit version of Excel

Excel 97
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\3.5\Engines\Excel
Excel 2000
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel
Excel 2007
HKEY_LOCAL_MACHINE\Software\Microsoft\Office\12.0\Access Connectivity Engine\Engines\Excel
Excel 2010
HKEY_LOCAL_MACHINE\Software\Microsoft\Office\14.0\Access Connectivity Engine\Engines\Excel
Excel 2013
HKEY_LOCAL_MACHINE\Software\Microsoft\Office\15.0\Access Connectivity Engine\Engines\Excel
Excel 2016
HKEY_LOCAL_MACHINE\Software\Microsoft\Office\16.0\Access Connectivity Engine\Engines\Excel

For 32 bit version of Excel

Excel 97
HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Microsoft\Jet\3.5\Engines\Excel
Excel 2000
HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Microsoft\Jet\4.0\Engines\Excel
Excel 2007
HKEY_LOCAL_MACHINE\Software\WOW6432Node\Microsoft\Office\12.0\Access Connectivity Engine\Engines\Excel
Excel 2010
HKEY_LOCAL_MACHINE\Software\WOW6432Node\Microsoft\Office\14.0\Access Connectivity Engine\Engines\Excel
Excel 2013
HKEY_LOCAL_MACHINE\Software\WOW6432Node\Microsoft\Office\15.0\Access Connectivity Engine\Engines\Excel
Excel 2016
HKEY_LOCAL_MACHINE\Software\WOW6432Node\Microsoft\Office\16.0\Access Connectivity Engine\Engines\Excel

 

Select TypeGuessRows and on the Edit menu click Modify.
In the Edit DWORD Value dialogue box, click Decimal under Base.
Set the value to 1

Open Excel file

Make sure that the cells in the first line of the table have relevant data for example

  • mixed numbers and text characters for text fields
  • only numbers for numeric fields
  • If some of the data will be longer than 255 characters make sure that the first line cell has more than 255 characters otherwise it will be truncated

This solution applies to all versions of MS Excel ODBC driver, Ole DB, MS Jet, .NET and SSIS

We have spent an enormous amount of time trying to get it fixed. So far we were not able to find a better solution.

The way Excel import works make it not possible to automate it. You have to modify most of the excel files manually in order to load them.

This is why we are no longer using ODBC/OleDB/Ms Jet for Excel connections. Our ETL solutions work correctly with Excel all the time. Try it yourself now.

Advanced ETL Processor Working with Excel

  • Works directly with all versions of Excel
  • No ODBC, OleDB or MS Jet Required
  • Works correctly with mixed data types
  • Works correctly with cells with more than 255 characters
  • No need for IMEX=1, HDR=Yes or Registry hacks (TypeGuessRows)
  • Supports custom formats
  • Supports calculations
 
  • Can insert data starting from certain cell
  • Loads data correctly all the time + no need to edit Excel file
  • Can create Excel files in Excel 3.0-2016 format
  • Can insert data starting from specific cell
  • Can clear area before adding data into Excel
  • Can add headers
  • Works with ranges

Video Tutorial

 

Direct link, no registration required.

User Rating: 5 / 5

Star ActiveStar ActiveStar ActiveStar ActiveStar Active
 
 
Xerox
Swiss banking
Bank Of Oklahoma
Red Cross
Alta Pacific bank
Copeinca
Gas alberta
NHS
Royal Brunei
First Oklahoma bank
Noresco
Iqvia

Testimonials

What customers say about us

  • swissbanking

    I used Advanced ETL Processor in 2 Enterprises for many business processes and Business automation (outside finance department). I did not find any other tool with so many functions and broad flexibility for that Price! If you need support for bugs or solution design you will get it very fast. Best Support I have ever seen.

    Lionel Albrecht
  • iqvia

    IQVIA and DB Software Laboratory (DBSL) partnered in 2010 and have been working in close cooperation ever since. Over this period of time, DBSL software components formed an integral part of a large number of IQVIA applications currently used by over 20 UK NHS Trusts (Hospitals).

    Dmitry Dorsky,
    Director
  • xerox

    The product is easy to learn and once a developer understands the ETL way for solving the problem at hand, the developer's productivity will increase. Even our DBAs now uses the ETL software to quickly create solutions instead of SSIS or SQL jobs.

    Daniel Fung
    Solutions Architect

Read ETL Software customers feedback

This site uses cookies. By continuing to browse the site, you are agreeing to our use of cookies