SSIS ragged right flat file does not recognize CRLF
In SSIS, I am trying to load data from a flat file. The flat file has fixed-width columns, but some columns are not present in every row (a column position can hold a CRLF instead, which must start a new line), like this:
a b c
the first rowok<CRLF>
iu jjrjdd<CRLF>
this is a newline<CRLF>
How can I get exactly the same number of rows, with the correct data, in my output?
I set up a flat file connection of the ragged right type.
In this sample, row 1 is retrieved correctly, but for row 2 the CRLF is not recognized, and the whole of row 3 ends up in column b.
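Recommended answer
Workaround
- In the flat file connection manager, read each entire row as a single column (add only one column, of type DT_STR and length 4000)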
- Then, in the Data Flow Task, add a Script Component
- Add three output columns (a, b, c) of type DT_STR
- Write a script that splits each row and puts the values into the columns, leaving a column null when its value is missing (I used VB.NET)
Tab-separated columns
    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
        If Not Row.Column0_IsNull AndAlso
           Not String.IsNullOrEmpty(Row.Column0.Trim) Then
            'Split the raw line on tabs; missing trailing columns simply yield fewer parts
            Dim str() As String = Row.Column0.Split(CChar(vbTab))
            If str.Length >= 3 Then
                Row.a = str(0)
                Row.b = str(1)
                Row.c = str(2)
            ElseIf str.Length = 2 Then
                Row.a = str(0)
                Row.b = str(1)
                Row.c_IsNull = True
            Else
                'Split always returns at least one element, so only column a is present here
                Row.a = str(0)
                Row.b_IsNull = True
                Row.c_IsNull = True
            End If
        Else
            'Null or blank line: every column is null
            Row.a_IsNull = True
            Row.b_IsNull = True
            Row.c_IsNull = True
        End If
    End Sub
Fixed-width columns
    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
        If Not Row.Column0_IsNull AndAlso
           Not String.IsNullOrEmpty(Row.Column0.Trim) Then
            'Assuming that
            'Col a => 0-5
            'Col b => 5-15
            'Col c => 15-
            Dim intlength As Integer = Row.Column0.Length
            If intlength <= 5 Then
                Row.a = Row.Column0
                Row.b_IsNull = True
                Row.c_IsNull = True
            ElseIf intlength > 5 AndAlso intlength <= 15 Then
                Row.a = Row.Column0.Substring(0, 5)
                'Take the rest of the line: Substring(5, 10) throws when the line has fewer than 15 characters
                Row.b = Row.Column0.Substring(5)
                Row.c_IsNull = True
            ElseIf intlength > 15 Then
                Row.a = Row.Column0.Substring(0, 5)
                Row.b = Row.Column0.Substring(5, 10)
                Row.c = Row.Column0.Substring(15)
            End If
        Else
            'Null or blank line: every column is null
            Row.a_IsNull = True
            Row.b_IsNull = True
            Row.c_IsNull = True
        End If
    End Sub
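The fixed-width script repeats the same bounds check for every column. As a minimal sketch, assuming the same layout as above (a at 0-5, b at 5-15, c from 15 on), the slicing could be pulled into one helper. The names `SliceField` and `FixedWidthSketch` are hypothetical and not part of SSIS, and the sample line is made up for illustration.

```vbnet
Imports System

Module FixedWidthSketch

    ' Returns the field that starts at 'start' (0-based) and is at most
    ' 'width' characters long, or Nothing when the line ends before the field.
    ' A negative width means "everything up to the end of the line".
    Function SliceField(line As String, start As Integer, width As Integer) As String
        If line Is Nothing OrElse line.Length <= start Then
            Return Nothing
        End If
        Dim available As Integer = line.Length - start
        If width < 0 OrElse width > available Then
            ' Ragged right: the last field on a line may be shorter than its nominal width
            width = available
        End If
        Return line.Substring(start, width)
    End Function

    Sub Main()
        ' Hypothetical short row: column a padded to 5 characters, a partial column b, no column c
        Dim line As String = "iu   jjrjdd"
        Dim a As String = SliceField(line, 0, 5)
        Dim b As String = SliceField(line, 5, 10)
        Dim c As String = SliceField(line, 15, -1)
        Console.WriteLine("a=[{0}] b=[{1}] c is missing: {2}", a, b, c Is Nothing)
    End Sub

End Module
```

Inside the Script Component, the function would presumably live in the script class next to Input0_ProcessInputRow, with the matching _IsNull property set to True whenever it returns Nothing.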
You can also achieve this with a Derived Column transformation.