Sie sind auf Seite 1von 4

Why SSIS always gets Excel data types

wrong, and how to fix it!

Working as a BI Developer with the Microsoft stack of tools, I often deal with multiple sources
of data, one of which being Excel, a very common data source in Data Warehousing projects. As
some of you may be aware, working with Excel in SSIS (SQL Server Integration Services) can
have its issues, if youre aware of these, this post will help people in more detail and provide you
with some possible solutions to save you some time.

This blog will also help answer questions about:

Why doesnt SSIS pick up my data types correctly when using Excel as a source
Why are some of my columns NULL when the excel file contains data
How do I fix problems with my excel source
Excel SSIS driver issues

The main driver for the problems we experience whilst working with SSIS and Excel originate
from the fact SSIS auto-determines the data types for when connecting to the Excel source. The
SSIS Excel driver determines the data type for each column based on reading values of the first 8
rows.

If the top 8 records contain equal number of numeric and character types then the
priority is numeric (column2 in figures 1 & 2)
If the majority of top 8 records are numeric then it assigns the data type as numeric and
all character values are read as NULLs (column3 in figures 1 & 2)
If the majority of top 8 records are of character type then it assigns the data type as string
and all numeric values are read as NULLs (column4 in figures 1 & 2)
Figure 1: Excel input data

Figure 2: SSIS Excel Source Component Metadata

Figure 3: Output Preview window in SSIS Excel Source Component

As you can see from the above screenshots, we receive NULLs in our destination table where we
had data to load, this is not the desired output. So, how do we fix this?

By adding IMEX=1 (Import Export Mode) property to the connection string, excel reads the data
correctly (see Figure 4).

Set the connection string property as below:


Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Folder1\Book1.xls;Extended
Properties=EXCEL 8.0;HDR=YES;IMEX=1;
Figure 4: SSIS Excel Connection Manager Connection String Property

After adding this property, the excel driver reads the columns with mixed types as Unicode
String data type (see Figure 5) and reads the data properly (see Figure 6).

Figure 5: SSIS Excel Source Component Metadat

Figure 6: Output Preview window in SSIS Excel Source Component with IMEX = 1

Also in situations where the 1st 8 rows are NULLs in the excel file we need to edit the registry
settings and set the TypeGuessRows property to 0 in order to let the excel driver to use all rows
in the file to determine the data types properly (see Figure 7) by sampling all of the rows rather
than the 1st 8. Range of values this property allows are from 0-16. So, we can have 1-16 rows, or
all rows as permissible sample sizes.

The location of the key needing to changes is as follows:

HKEY_LOCAL_MACHINE -> SOFTWARE -> Wow6432Node -> Microsoft -> Jet -> 4.0 -
>Engines -> Excel -> TypeGuessRows has the value 0. (It contains 8 by default)
Figure 7: Registry Editor Excel Driver TypeGuessRows property

I hope this helps investigating any Excel driver issues you may be experiencing. Please feel free
to comment on the post with your experiences.

Das könnte Ihnen auch gefallen