Deprecated: Assigning the return value of new by reference is deprecated in /home/protean/public_html/b/wp-includes/cache.php on line 33
Protean » Blog Archive » Excel & SSIS (Part 1)

Excel & SSIS (Part 1)

Came across an irritating behaviour with the Excel source in SSIS. The Excel ODBC Jet Driver that SSIS uses has the annoying default configuration of only checking the first 8 lines in a spreadsheet, to evaluate the field length of any external columns in the Excel source task…

Professional Excel Development: The Definitive Guide to Developing Applications Using Microsoft Excel and VBA

 

If for any reason you have a string of data further down the spreadsheet that is longer than the allotted 255 characters long, the SSIS package will fail with a ‘field would be truncated error’.

Possible solutions are: 

  • Put the long string at the top of the spreadsheet, so that it is evaluated correctly at runtime.
  • Put a dummy record at the top of the spreadsheet which has character lengths as long as the desired character length within the SQL Schema.
  • Set the ODBC Jet Driver to scan more rows. This can be done by setting the registry
    HKEY_LOCAL_MACHINE > SOFTWARE > Microsoft > Jet > 4.0 > Engines > Excel > TypeGuessRows to a higher value. (Ensure it scans at least as far as the long text cells and also ensure this is done in all environments or you will be caught out!)
  • 4 Responses to “Excel & SSIS (Part 1)”

    1. baz1976 Says:

      I was caught out by this one too. Remember that the SSIS designer uses your desktop configurations regardless of which server you are running against. Best way to test is to log on remotely and use the dtexecui tool.

    2. FB Says:

      Also remember that if you are going to use the DT_NTEXT datatype you cannot perform all the string manipulation functions that you can with DT_WSTR. You may need a datatype conversion downstream.

    3. David Wynne Says:

      It also uses the first 8 rows to evaluate the data type of each col, so if the first 8 rows contain strings and the 15th contains a int - the int will be ignored and in it’s place you’ll get a NULL. Most annoying.

      The best solution I’ve found is to alter the Excel connection string, appending IMEX=1 to the end, so you end up with:

      Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Dir\Source.xls;Extended Properties=”EXCEL 8.0;HDR=YES;IMEX=1″;

      This tells the Jet Driver to use “import mode” which always reads intermixed data columns as text. Means you might have to do some data conversion in your data flow further down the pipe line, but at least you won’t be loosing any data.

      (Obviously if you’re picking up your spreadsheet dynamically at run time, you’ll need to use an expression to build the connection string).

    4. FB Says:

      Setting IMEX=1 can solve the problem, but not always. For this to work the excel sheet must be correctly formatted in the first place. So all cells need to be in text format, and simply formatting the cells alone isn’t always enough. You have re-enter the values after…

      Lastly, even if you have correctly formatted the spreadsheet to text, reentered the values and set IMEX=1 in your connection string, it may still fail if your datatypes change after the initial sampling of rows. i.e. If you have numbers in your first 8 records and then a string after, it will still expect numeric data.

      A royal pain in the a$£e!

    Leave a Reply