如何跳过ssis平面文件源中的坏行

2021-12-30 00:00:00 csv etl sql-server ssis flat-file

我正在将一个 17 列的 CSV 文件读入数据库.偶尔,该文件有一个少于 17 列"的行.我试图忽略该行,但即使所有列都设置为忽略,我也无法忽略该行并且程序包失败.

I am reading in a 17-column CSV file into a database. once in a while the file has a "less then 17-column" row. I am trying to ignore the row, but even when all columns are set to ignore, I can't ignore that row and the package fails.

如何忽略这些行?

推荐答案

解决方案概览

您可以通过添加一个平面文件连接管理器来实现这一点,只添加一个数据类型为DT_WSTR且长度为4000的列(假设它的名字是 Column0) - 所以所有的列都被认为是一个大列

Solution Overview

you can do this by adding one Flat File Connection Manager add only one column with Data type DT_WSTR and a length of 4000 (assuming it's name is Column0) - So all column are considered as one big column

  • Dataflow task中,在Flat File Source
  • 之后添加一个Script Component
  • Column0标记为输入列并添加17个输出列
  • Input0_ProcessInputRow方法中,通过分隔符分割Column0,然后检查数组的长度是否为= 17,然后为输出列赋值,否则忽略该行.
  • In the Dataflow task add a Script Component after the Flat File Source
  • In mark Column0 as Input Column and Add 17 Output Columns
  • In the Input0_ProcessInputRow method split Column0 by delimiter, Then check if the length of array is = 17 then assign values to output columns, Else ignore the row.
  1. 添加一个平面文件连接管理器,选择文本文件
  2. 转到高级选项卡,删除除一列之外的所有列
  3. 将剩余的列的数据类型更改为 DT_WSTR 并且长度 = 4000
  1. Add a Flat file connection manager, Select the text file
  2. Go to the Advanced Tab, Delete all Columns except one Column
  3. Change the datatype of the remianing Column to DT_WSTR and length = 4000

  1. 添加数据流任务
  2. 在数据流任务中添加平面文件源、脚本组件和 OLEDB 目标

  1. 在脚本组件中选择Column0作为输入列

  1. 添加 17 个输出列(最佳输出列)
  2. OutputBuffer SynchronousInput 属性更改为 None
  1. Add 17 Output Columns (the optimal output columns)
  2. Change the OutputBuffer SynchronousInput property to None

  1. 选择Visual Basic的脚本语言

  1. 在脚本编辑器中编写以下脚本

  1. In the Script Editor write the following Script

Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

    If Not Row.Column0_IsNull AndAlso
            Not String.IsNullOrEmpty(Row.Column0.Trim) Then


        Dim strColumns As String() = Row.Column0.Split(CChar(";"))

        If strColumns.Length <> 17 Then Exit Sub


        Output0Buffer.AddRow()
        Output0Buffer.Column = strColumns(0)
        Output0Buffer.Column1 = strColumns(1)
        Output0Buffer.Column2 = strColumns(2)
        Output0Buffer.Column3 = strColumns(3)
        Output0Buffer.Column4 = strColumns(4)
        Output0Buffer.Column5 = strColumns(5)
        Output0Buffer.Column6 = strColumns(6)
        Output0Buffer.Column7 = strColumns(7)
        Output0Buffer.Column8 = strColumns(8)
        Output0Buffer.Column9 = strColumns(9)
        Output0Buffer.Column10 = strColumns(10)
        Output0Buffer.Column11 = strColumns(11)
        Output0Buffer.Column12 = strColumns(12)
        Output0Buffer.Column13 = strColumns(13)
        Output0Buffer.Column14 = strColumns(14)
        Output0Buffer.Column15 = strColumns(15)
        Output0Buffer.Column16 = strColumns(16)

    End If

End Sub

  • 将输出列映射到目标列

  • Map the Output Columns to the Destination Columns

  • 相关文章