
Excel Source

The Excel source extracts data from worksheets or ranges in Microsoft Excel workbooks.
The Excel source provides four different data access modes for extracting data:
• A table or view.
• A table or view specified in a variable.
• The results of an SQL statement. The query can be a parameterized query.
• The results of an SQL statement stored in a variable.
Important
In Excel, a worksheet or range is the equivalent of a table or view. The list of available tables in
the Excel Source and Destination editors displays existing worksheets (identified by the $ sign
appended to the worksheet name, such as Sheet1$) and named ranges (identified by the absence
of the $ sign, such as MyRange). For more information, see the Usage Considerations section.
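The naming rule in the note above can be sketched in a few lines of Python. This is only an illustration of the `$` convention described here; the helper name `classify_excel_object` is hypothetical and is not part of Integration Services, and unusual names are not handled.

```python
def classify_excel_object(name: str) -> str:
    """Classify a name from the table list using the trailing-$ rule."""
    # Worksheets are listed with a $ appended; named ranges have no $.
    return "worksheet" if name.endswith("$") else "named range"


for name in ["Sheet1$", "MyRange"]:
    print(f"{name}: {classify_excel_object(name)}")
# Sheet1$: worksheet
# MyRange: named range
```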

The Excel source uses an Excel connection manager to connect to a data source, and the
connection manager specifies the workbook file to use. For more information, see Excel
Connection Manager.
The Excel source has one regular output and one error output.
Usage Considerations

The Excel Connection Manager uses the Microsoft OLE DB Provider for Jet 4.0 and its
supporting Excel ISAM (Indexed Sequential Access Method) driver to connect and read and
write data to Excel data sources.
Many existing Microsoft Knowledge Base articles document the behavior of this provider and
driver, and although these articles are not specific to Integration Services or its predecessor Data
Transformation Services, you may want to know about certain behaviors that can lead to
unexpected results. For general information on the use and behavior of the Excel driver, see
HOWTO: Use ADO with Excel Data from Visual Basic or VBA.
The following behaviors of the Jet provider with the Excel driver can lead to unexpected results
when reading data from an Excel data source.
• Data sources. The source of data in an Excel workbook can be a worksheet, to which the
$ sign must be appended (for example, Sheet1$), or a named range (for example,
MyRange). In a SQL statement, the name of a worksheet must be delimited (for example,
[Sheet1$]) to avoid a syntax error caused by the $ sign. The Query Builder automatically
adds these delimiters. When you specify a worksheet or range, the driver reads the
contiguous block of cells starting with the first non-empty cell in the upper-left corner of
the worksheet or range. Therefore you cannot have empty rows in the source data, or an
empty row between title or header rows and the data rows.
• Missing values. The Excel driver reads a certain number of rows (by default, 8 rows) in
the specified source to guess at the data type of each column. When a column appears to
contain mixed data types, especially numeric data mixed with text data, the driver decides
in favor of the majority data type, and returns null values for cells that contain data of the
other type. (In a tie, the numeric type wins.) Most cell formatting options in the Excel
worksheet do not seem to affect this data type determination. You can modify this
behavior of the Excel driver by specifying Import Mode. To specify Import Mode, add
IMEX=1 to the value of Extended Properties in the connection string of the Excel
connection manager in the Properties window. For more information, see PRB: Excel
Values Returned as NULL Using DAO OpenRecordset.
• Truncated text. When the driver determines that an Excel column contains text data, the
driver selects the data type (string or memo) based on the longest value that it samples. If
the driver does not discover any values longer than 255 characters in the rows that it
samples, it treats the column as a 255-character string column instead of a memo column.
Therefore, values longer than 255 characters may be truncated. To import data from a
memo column without truncation, you must make sure that the memo column in at least
one of the sampled rows contains a value longer than 255 characters, or you must
increase the number of rows sampled by the driver to include such a row. You can
increase the number of rows sampled by increasing the value of TypeGuessRows under
the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel
registry key. For more information, see PRB: Transfer of Data from Jet 4.0 OLEDB
Source Fails w/ Error.
• Data types. The Excel driver recognizes only a limited set of data types. For example, all
numeric columns are interpreted as doubles (DT_R8), and all string columns (other than
memo columns) are interpreted as 255-character Unicode strings (DT_WSTR).
Integration Services maps the Excel data types as follows:
○ Numeric – double-precision float (DT_R8)
○ Currency – currency (DT_CY)
○ Boolean – Boolean (DT_BOOL)
○ Date/time – datetime (DT_DATE)
○ String – Unicode string, length 255 (DT_WSTR)
○ Memo – Unicode text stream (DT_NTEXT)
• Data type and length conversions. Integration Services does not implicitly convert data
types. As a result, you may need to use Derived Column or Data Conversion
transformations to convert Excel data explicitly before loading it into a non-Excel
destination, or to convert non-Excel data before loading it into an Excel destination. In
this case, it may be useful to create the initial package by using the Import and Export
Wizard, which configures the necessary conversions for you. Some examples of the
conversions that may be required include the following:
○ Conversion between Unicode Excel string columns and non-Unicode string
columns with specific codepages
○ Conversion between 255-character Excel string columns and string columns of
different lengths
○ Conversion between double-precision Excel numeric columns and numeric
columns of other types
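The missing-value and truncation behaviors described above can be easier to reason about with a small model. The following Python sketch is not the driver's code; it is a simplified simulation built only from the rules stated in this section (sample the first 8 rows, majority type wins, numeric wins a tie, text is a 255-character string unless a sampled value is longer). All function names are hypothetical, and the model distinguishes only numbers and strings.

```python
from typing import Any, List

SAMPLE_ROWS = 8  # default number of rows the text says the driver samples


def guess_column_type(values: List[Any], sample_rows: int = SAMPLE_ROWS) -> str:
    """Guess 'numeric' or 'text' from the first sample_rows rows, ignoring empty cells."""
    sample = [v for v in values[:sample_rows] if v is not None]
    numeric = sum(isinstance(v, (int, float)) for v in sample)
    text = sum(isinstance(v, str) for v in sample)
    # The majority type wins; in a tie the numeric type wins.
    return "numeric" if numeric >= text else "text"


def read_column(values: List[Any], sample_rows: int = SAMPLE_ROWS) -> List[Any]:
    """Return the column as the model sees it: cells of the other type become None."""
    kind = guess_column_type(values, sample_rows)
    wanted = str if kind == "text" else (int, float)
    return [v if isinstance(v, wanted) else None for v in values]


def text_column_width(values: List[Any], sample_rows: int = SAMPLE_ROWS) -> str:
    """Return 'memo' if a sampled string exceeds 255 characters, else 'string(255)'."""
    sample = [v for v in values[:sample_rows] if isinstance(v, str)]
    return "memo" if any(len(v) > 255 for v in sample) else "string(255)"


mixed = [1, 2, 3, "N/A", 5, 6, "n/a", 8, "late text"]
print(read_column(mixed))
# [1, 2, 3, None, 5, 6, None, 8, None]

long_text = ["short"] * 8 + ["x" * 300]
print(text_column_width(long_text))                 # string(255)
print(text_column_width(long_text, sample_rows=9))  # memo
```

In the last two lines, raising `sample_rows` stands in for raising TypeGuessRows: once the long value falls inside the sampled rows, the model treats the column as memo.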
Configuring the Excel Source

You can set properties through SSIS Designer or programmatically.


For more information about the properties that you can set in the Excel Source Editor dialog box,
click one of the following topics:
• Excel Source Editor (Connection Manager Page)
• Excel Source Editor (Columns Page)
• Excel Source Editor (Error Output Page)
The Advanced Editor dialog box reflects all the properties that can be set programmatically. For
more information about the properties that you can set in the Advanced Editor dialog box or
programmatically, click one of the following topics:
• Common Properties
• Source Custom Properties
For more information about how to set the properties, click one of the following topics:
• How to: Map Query Parameters to Variables in a Data Flow Component
• How to: Set the Properties of a Data Flow Component
• How to: Sort Data for the Merge and Merge Join Transformations
For information about looping through a group of Excel files, see How to: Loop through Excel
Files and Tables by Using a Foreach Loop Container.

See Also

Other Resources
How to: Loop through Excel Files and Tables by Using a Foreach Loop Container
Excel Destination
Integration Services Variables
Designing Package Data Flow
Integration Services Sources
Working with Excel Files with the Script Task
64-bit Considerations for Integration Services
Connecting to an XLSX using SSIS
By dataintegrity

You have likely noticed that documents, templates, spreadsheets, and presentations that you
create in the 2007 Office release are saved with new file-name extensions with an x at the end of
the extension. For example, when you save a spreadsheet in Excel, the file now uses the .xlsx
extension, instead of the .xls extension. In the 2007 Office release, Microsoft has adopted an
XML-based file format that is said to improve file compression and provide better integration
and interoperability of data. Unfortunately, SSIS 2005 cannot connect to an XLSX file using the
Excel Connection Manager.
To use SSIS to connect to an XLSX file, we must use an OLE DB connection manager. First,
drag and drop an OLE DB source into your data flow view. (If you are exporting data to an
XLSX file, use the OLE DB destination instead.)

Double click on the OLE DB Source and create a new OLE DB connection manager. Select
‘New’. At the connection manager editor, choose the Microsoft Office 12.0 Access Database
Engine OLE DB Provider.
Once you have chosen the correct provider, type the full path of your XLSX file into the ‘Server
or file name’ field.
‘D:\XLSX\SAMPLE_SOURCE.XLSX’
Next, choose the ‘All’ button on the left side of the connection manager editor. Under
Extended Properties, enter Excel 12.0 and then click ‘Test Connection’.
After you have verified that you can connect to the XLSX file, click ‘OK’.
Choose the sheet that you wish to connect to (or the sheet you wish to write to, if you are using
the XLSX file as your destination) and click ‘OK’.
Configure the rest of your data flow, and you will be able to read from and write to an XLSX
file.

Tags: Excel, Excel 2007, SSIS, XLSX


This entry was posted on October 16, 2009 at 9:43 pm and is filed under Data Analytics, SSIS.

6 Responses to “Connecting to an XLSX using SSIS”

1. G Clark Says:
April 16, 2010 at 5:23 pm
Is this SQL Server 2008? I’m using SQL Server 2005 and the Connection Manager
dialog I’m seeing doesn’t look anything like your screen shots above. In SQL Server
2005 the OLE DB Source is expected to be SQL and doesn’t allow other connection
types like “Native OLE DB”.

2. dataintegrity Says:
April 16, 2010 at 5:48 pm
This is 2005 Integration Services. Odd that you do not see any additional connection
strings. Out of curiosity, what version of SS 2005 are you running? (Enterprise,
Developer, etc.)

3. G Clark Says:
April 16, 2010 at 6:00 pm
This is all Enterprise. With Remote Desktop I’m logging into the server and running
Visual Studio. When I click About Visual Studio I see this info:
Microsoft SQL Server Integration Services Designer
Version 9.00.1399.00
When I run SQL Server Management Studio and click About I see:
Microsoft SQL Server Management Studio 9.00.1399.00
Microsoft Analysis Services Client Tools 2005.090.1399.00
Microsoft Data Access Components (MDAC) 2000.086.3959.00
(srv03_sp2_rtm.070216-1710)
Microsoft MSXML 2.6 3.0 5.0 6.0
Microsoft Internet Explorer 6.0.3790.3959
Microsoft .NET Framework 2.0.50727.3607
Operating System 5.2.3790
I have a very conservative IT department that as a rule doesn’t install Service Packs for
fear of introducing more problems than solving existing ones. As a result, I believe SQL
Server 2005 hasn’t been updated on this server since it was installed from the CD in mid-
2008.
Thanks for your help with this.

4. dataintegrity Says:
April 16, 2010 at 6:25 pm
All of the posts in this blog have been implemented on SP 2 (9.2.3042.00). You are on
the release version of SQL Server. I would recommend upgrading to SP 2 to take
advantage of at least the security enhancements, the 150+ fixes, and the native client
enhancements. Depending on the complexity of your packages (and because of the
conservative nature of your IT group), you may want to roll out SP 2 on a development
server and verify that there is no data degradation before implementing it in production.
5. Jon Says:
May 3, 2010 at 7:03 pm
This does not work in 64 bit machines because MS DB ACCESS 12.0 does not work in
64 bit.

6. dataintegrity Says:
May 3, 2010 at 7:39 pm
Yes, unfortunately Microsoft admittedly neglected that feature.
http://msdn.microsoft.com/en-us/library/cc280527(SQL.100).aspx
Time to run in 32-bit mode… It looks like BIDS 2008 Excel Connection Manager now
supports the Excel 2007 file format.

Sep 10

Using Excel 2007 files as a Source in SSIS 2005


Excel, Integration Services, SQL Server 2005

Thanks to “jaegd” for his post here: http://forums.microsoft.com/… on how to accomplish this.
Below are some more details, along with an example that can be downloaded here
(ssis-2005-excel-2007-source-example).

Steps
1. Create a new OLEDB Connection by right-clicking in the Connection Manager
tray

New OLEDB Connection in SSIS

2. Choose any valid type or value for the connection and click OK
OLEDB Connection Manager

3. Rename the Connection Manager to something that makes sense, like “Excel 2007”
4. Now that the connection exists, right-click it and select Properties
5. Edit the Connection String property IN THE PROPERTIES WINDOW to match the string
below, updating it with your Excel file location
○ Data Source=C:\MyExcelFile.xlsx;Provider=Microsoft.ACE.OLEDB.12.0;Extended Properties="Excel 12.0;HDR=YES;"

You can now use this connection by referencing it from any task or object that would normally
connect to an OLEDB source. To query it, use [Sheet1$] as the table name (e.g., SELECT *
FROM [Sheet1$]).
To access native Excel 2007 data, create an OLEDB connection manager (not
an Excel connection) with the following connection string specific to the
source file.
Data Source=c:\data\myfilehere.xlsx;Provider=Microsoft.ACE.OLEDB.12.0;Extended Properties="Excel 12.0;HDR=YES";
Access the spreadsheet in a dataflow using an OLEDB source with its select
statement set to "select * from [Sheet1$]", replacing Sheet1$ with the name
of the appropriate worksheet.
You will need the OLEDB provider, naturally, which comes with the install of Office 2007.
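If you generate packages or configuration files from a script, the connection string above can be assembled rather than typed by hand. The Python sketch below is only an illustration: the function name `build_xlsx_connection_string` is hypothetical, and the `hdr` switch (HDR=YES or HDR=NO, for whether the first row holds column names) is the only option it exposes.

```python
def build_xlsx_connection_string(path: str, hdr: bool = True) -> str:
    """Assemble an ACE OLE DB connection string for an Excel 2007 workbook."""
    # Extended Properties carries the Excel version and the header-row flag.
    extended = f"Excel 12.0;HDR={'YES' if hdr else 'NO'}"
    return (
        f"Data Source={path};"
        "Provider=Microsoft.ACE.OLEDB.12.0;"
        f'Extended Properties="{extended}";'
    )


print(build_xlsx_connection_string(r"c:\data\myfilehere.xlsx"))
# Data Source=c:\data\myfilehere.xlsx;Provider=Microsoft.ACE.OLEDB.12.0;Extended Properties="Excel 12.0;HDR=YES";
```

The result is the value you would paste into the Connection String property of the OLEDB connection manager.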
Using Excel 2007 in SSIS 2005
posted 3/2/2010 3:16:14 PM by DevinKnight
Many companies are not in a rush to upgrade their SQL Servers because of the
enormous cost to upgrade. This results in the majority of companies still running
previous versions of SQL Server (2005, 2000, and even earlier). As a developer, you
are often forced to work with older server components, such as SQL Server and SSIS
2005, alongside newer file sources like Excel 2007. In this case, there are some
workarounds that let you combine what seem like two incompatible platforms. This
is a heavily blogged topic, but after some recent questions about it I thought I'd
throw one more into the mix.

Before following these instructions, ensure that you have the most up-to-date
service packs installed so that you have the correct data provider for this example.

Create a new OLE DB Connection Manager and select Microsoft Office 12.0 Access
Database Engine OLE DB Provider from the Provider list. Then change the Server or
file name to the Excel 2007 workbook file path.
Select the All page and change the Extended Properties to Excel 12.0. Then go back to
the Connection page and click Test Connection to verify that the setup worked.
Now you can use an OLE DB Source in your Data Flow to connect to any sheet in
that Excel workbook.
SSIS and Excel 2007
By Dinesh Asanka, 2008/10/24

Introduction
Importing Excel files into a SQL Server database is a common task for DBAs and
developers. In data warehousing, you also need to extract data from various
data sources, and Excel is often one of them.
As you are aware, importing data from and exporting data to Excel is simple. It is just a
matter of dragging and dropping a few data flow controls and configuring them according to your needs.
Importing Excel 2007 File
If you are asked to import an Excel file, you can use the Excel Source from the Data Flow Sources
in SQL Server Integration Services (SSIS) and select the correct version from the available list.

You can see that you can only import Excel files up to the Microsoft Excel 97-2005 version, which
means that you are not allowed to import Excel 2007 files with the above control.
However, if you follow the steps below, you can import Excel 2007 files into SQL Server.
1. Drag and drop an OLE DB Source from the data flow sources onto the data flow task.
2. Double-click the OLE DB Source and click the New button for the OLE DB connection manager.
3. Click the New button in the Configure OLE DB Connection Manager screen.
4. Select Native OLE DB\Microsoft Office 12.0 Access Database Engine OLE DB Provider
from the OLE DB Provider list.
5. Select the All option and, under Extended Properties, enter Excel 12.0. After this you will see a
screen like the following image.
You can see the selected provider at the top of the screen.
6. Enter the file name with its full path and make sure it has the .xlsx extension.
7. After you click the OK button, you are taken back to the initial screen, where you select
the worksheet you want.
Exporting to Excel 2007 File
Exporting to an Excel file is no different: you modify the destination
connection in the same way as above.
SQL Server 2008
Though things are difficult in SQL Server 2005, they have become easy with SQL Server
2008. In SQL Server 2008, you simply select Excel 2007 from the drop-down list.

As with import, you can export data to Excel 2007 in the same way by using the Excel destination in
SQL Server 2008.
The Excel Source and Connection Manager – The basics
Posted by BI Monkey on Tuesday, May 19, 2009
In this post I will be reviewing the Excel Source and Excel Connection Manager. The sample
package and files can be found here for 2008 and here for 2005 and guidelines on use are here.

Fig 1: The SSIS Excel Source

How do you read data from an Excel Workbook in SSIS?


The answer is that it depends on the version of Excel. If it is 2003 or earlier, you can use
the Excel Connection Manager and Excel Source. If it is 2007 or later, you use a specially
configured OLE DB Connection Manager, as described in the MSDN article How to: Connect to
an Excel Workbook. An example of this is in the sample package (Data Flow 4). Although the
sample package for this post includes an example of each version of Excel, I will only be
discussing the Excel source for 2003 and earlier from here on.
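The decision above can be written down as a small rule. The Python sketch below uses the file extension as a stand-in for the Excel version, which is an assumption on my part (.xls for 2003 and earlier, .xlsx for 2007 and later), and the function name `connection_approach` is hypothetical. Other workbook extensions are deliberately rejected rather than guessed at.

```python
from pathlib import PureWindowsPath


def connection_approach(workbook_path: str) -> str:
    """Suggest which SSIS components to use for a workbook, based on its extension."""
    ext = PureWindowsPath(workbook_path).suffix.lower()
    if ext == ".xls":
        # Excel 2003 or earlier
        return "Excel Connection Manager + Excel Source"
    if ext == ".xlsx":
        # Excel 2007 or later
        return "OLE DB Connection Manager + OLE DB Source"
    raise ValueError(f"Unsupported workbook extension: {ext!r}")


print(connection_approach(r"C:\data\budget.xls"))
# Excel Connection Manager + Excel Source
print(connection_approach(r"C:\data\budget.xlsx"))
# OLE DB Connection Manager + OLE DB Source
```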

Configuring the Excel Connection Manager


The Excel connection manager is pretty simple to set up – all it requires is the file path, the Excel
version, and whether the first row contains column names.

Fig 2: The Excel Connection Manager


If you look at the Properties of the Connection Manager once it is set up, you can see a Password
field – this is misleading – you cannot connect to a password protected workbook. So, if you
have to connect to a secure workbook you need to look at either other means of extracting that
data or alternative security for the workbook.

Configuring the Excel Source


The Excel Source is very similar to the OLE DB source. This can initially be confusing as in the
Data Access Mode drop-down it talks in terms of Tables, Views and SQL Commands. When it
says Table or View, what it means in Excel speak is Sheets and Named Ranges. When it talks in
terms of SQL – it really means it. You can construct SQL statements to pull restricted amounts
of, or modified versions of the spreadsheet data. An example is below:

Fig 3: SQL in the Excel Source


Just remember to qualify the Sheet / Range Name with square brackets (e.g. [Sheet1$]) if
hand-writing code. Examples of each type of access are available in Data Flows 1 to 3 in the sample
package.
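As a rough illustration of the kind of statement you might hand-write for the SQL command access mode, here is a small Python helper that brackets the sheet and column names for you. The helper names `bracket` and `build_select` are hypothetical, as are the column names `Region` and `Amount`; the sketch does no escaping, so it assumes names that contain no square brackets.

```python
from typing import Optional, Sequence


def bracket(name: str) -> str:
    """Wrap a sheet, range, or column name in square brackets."""
    return f"[{name}]"


def build_select(
    source: str,
    columns: Optional[Sequence[str]] = None,
    where: Optional[str] = None,
) -> str:
    """Build a SELECT statement against a worksheet (Sheet1$) or named range."""
    cols = ", ".join(bracket(c) for c in columns) if columns else "*"
    sql = f"SELECT {cols} FROM {bracket(source)}"
    if where:
        sql += f" WHERE {where}"
    return sql


print(build_select("Sheet1$"))
# SELECT * FROM [Sheet1$]
print(build_select("Sheet1$", ["Region", "Amount"], "[Amount] > 100"))
# SELECT [Region], [Amount] FROM [Sheet1$] WHERE [Amount] > 100
```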

Summary
Use the Excel Source and Excel Connection manager when reading from workbooks from Excel
2003 and prior. Be aware the driver behind it can behave unexpectedly at times, and it is worth
paying attention to the “Usage Considerations” section of the MSDN documentation if you are
having unexpected results.
Documentation for the Excel Source can be found here for 2008 and here for 2005. Similarly,
documentation for the Excel Connection Manager can be found here for 2008 and here for 2005.
