Excel & SSIS (Part 1)
Came across an irritating behaviour with the Excel source in SSIS. The Excel ODBC Jet Driver that SSIS uses has the annoying default configuration of only checking the first 8 lines in a spreadsheet, to evaluate the field length of any external columns in the Excel source task…
If for any reason you have a string of data further down the spreadsheet that is longer than the allotted 255 characters long, the SSIS package will fail with a ‘field would be truncated error’.
Possible solutions are:
HKEY_LOCAL_MACHINE > SOFTWARE > Microsoft > Jet > 4.0 > Engines > Excel > TypeGuessRows to a higher value. (Ensure it scans at least as far as the long text cells and also ensure this is done in all environments or you will be caught out!)
January 16th, 2007 at 12:40 am
I was caught out by this one too. Remember that the SSIS designer uses your desktop configurations regardless of which server you are running against. Best way to test is to log on remotely and use the dtexecui tool.
January 29th, 2008 at 11:29 am
Also remember that if you are going to use the DT_NTEXT datatype you cannot perform all the string manipulation functions that you can with DT_WSTR. You may need a datatype conversion downstream.
March 28th, 2008 at 11:39 am
It also uses the first 8 rows to evaluate the data type of each col, so if the first 8 rows contain strings and the 15th contains a int - the int will be ignored and in it’s place you’ll get a NULL. Most annoying.
The best solution I’ve found is to alter the Excel connection string, appending IMEX=1 to the end, so you end up with:
Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Dir\Source.xls;Extended Properties=”EXCEL 8.0;HDR=YES;IMEX=1″;
This tells the Jet Driver to use “import mode” which always reads intermixed data columns as text. Means you might have to do some data conversion in your data flow further down the pipe line, but at least you won’t be loosing any data.
(Obviously if you’re picking up your spreadsheet dynamically at run time, you’ll need to use an expression to build the connection string).
April 2nd, 2008 at 1:38 pm
Setting IMEX=1 can solve the problem, but not always. For this to work the excel sheet must be correctly formatted in the first place. So all cells need to be in text format, and simply formatting the cells alone isn’t always enough. You have re-enter the values after…
Lastly, even if you have correctly formatted the spreadsheet to text, reentered the values and set IMEX=1 in your connection string, it may still fail if your datatypes change after the initial sampling of rows. i.e. If you have numbers in your first 8 records and then a string after, it will still expect numeric data.
A royal pain in the a$£e!