There are several ways to connect Excel Files and run SQL against it:

Data can be extracted via ODBC, OLE DB(Jet) or .Net providers
Odbc connection strings:
For Excel 2.0-2003
DRIVER={Microsoft Excel Driver (*.xls)};IMEX=1;MaxScanRows=16;DBQ=C:\Invoice.xls;
 
For Excel 2007+
DRIVER={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};IMEX=1;MaxScanRows=16;DBQ=C:\Invoice.xls;

Both Advanced ETL Processor and Visual Import ETL can use ODBC to connect to Excel

OLE DB connection strings:
For Excel 2.0-2003

Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\MyExcel.xls;Extended Properties="Excel 8.0;HDR=Yes;IMEX=1";

For Excel 2007+
Provider=Microsoft.ACE.OLEDB.16.0;Data Source=c:\myFolder\myExcel2007file.xlsx;Extended Properties="Excel 12.0 Xml;HDR=YES;IMEX=1";

"HDR=Yes;" indicates that the first row contains column names, not data. "HDR=No;" it does not

A very useful feature for the Excel file below when you want to retrieve the Invoice Number:

Invoice Excel File

Running SQL against Excel file

SQL syntax for ODBC and Ole DB is slightly different

  • For ODBC: SELECT "Column Name" FROM "Sheet One$". I.e. excel worksheet name followed by a "$" and wrapped in double-quotes.
  • For Ole DB: SELECT [Column Name] FROM [Sheet One$]. I.e. excel worksheet name followed by a "$" and wrapped in "[" "]" brackets.
Extract data from a single Excel worksheet
SELECT * FROM [Sheet1$]
To specify a named range of cells as your record source, simply use the defined name. For example:
SELECT * FROM MyRange
 

Visual   Importer - Data source properties dialog

Unnamed Range

To specify an unnamed range of cells as your record source, append standard Excel row/column notation to the end of the sheet name in the square brackets. For example:

SELECT * FROM [Sheet1$A1:B10]

Visual Importer - Extracting data from Excel

Common Excel Import problems

ADO/ODBC must determine the data type for each column in your Excel worksheet or range  (This is not affected by Excel cell formatting settings.) This is done by scanning number of rows defined by the registry setting TypeGuessRows (default value is 8). Quite often there are numeric values mixed with text values in the same column,

For example, sorted financial coding structures often have numbers at the beginning of the list 001-999 than text AAA-XXX

Both the Jet and the ODBC Provider return the data of the majority type but return NULL (empty) values for the minority data type. If the two types are equally mixed in the column, the provider chooses numeric over text.

Plus when first rows are less than 255 characters long it will truncate all data to 255 characters even if cell values below are longer

One of the ways of avoiding this problem is using Import Mode (IMEX=1). This forces mixed data to be converted to text. However, it only works when the first TypeGuessRows Rows have mixed values.
If all values are numeric than setting IMEX=1 will not convert the default datatype to Text, it will remain numeric.  

The best combination to avoid problems is TypeGuessRows=0 + IMEX=1.
Setting TypeGuessRows=0 forces driver to read all data to determine field type.
Unfortunately, our own experience shows that quite often it does not work.
And when it does work it slows everything down

So the only solution is not to use mixed values and be prepared for the data being truncated to 255 characters

As you can see there is plenty of room for improvement for Microsoft. The first version of Microsoft Office was released very long time ago and it is still not possible to read excel file correctly. Why do not they just add another setting to ODBC driver EverhingIsText=1? This so simple and will solve all the problems.

All our products work correctly with Excel because we do not use OleDb or ODBC

Direct link, no registration required.
Video Tutorial