SSIS 数据流任务在预执行阶段的执行中挂起

2021-12-30 00:00:00 etl sql-server ssis

我有一个正在执行的数据流任务.
流程很简单,对不同的表进行两次查询(都有几个连接),然后通过一个公共 id 对输出进行排序和合并,向所有记录添加一个静态列,将行数保存在用户变量中以备后用使用并最终插入到另一个数据库上的表中.我们正在使用 OLE DB 源和目标.源是 MSSQL 2000,目标是 MSSQL 2012

症状:

  • 执行时,数据流会获得通常的黄色正在运行"图标.但是,当您双击查看数据流时,所有元素都没有任何黄色、红色或绿色标记.
  • 这种情况持续了很长时间,起初持续了大约 20 分钟,之后开始变得越来越长,或者根本没有恢复.
  • 输出显示:
    信息:0x40043006 at Load Sandbox Table,SSIS.Pipeline:准备执行阶段开始.
    信息:加载沙盒表中的 0x40043007,SSIS.Pipeline:预执行阶段开始.

    在停止执行之前仅此而已.
  • 是的,这曾经奏效过.是的,我们使用了单个查询(在存储过程中)来执行此 ETL,但我们希望将所有步骤迁移到 SSIS.
  • I have a Data Flow Task that is hanging on excecution.
    The flow is simple, makes two queries to different tables (Both with a couple of joins), then sorts and merges the otuputs through a common id, adds a static column to all the records, saves the row count in a user variable for later use and finally inserts into a table on another DB. We are using OLE DB Sources and Destination. Source is MSSQL 2000 and Destination is MSSQL 2012

    Symptoms:

  • When excecuting, the Data Flow gets the usual yellow "running" icon. However when you double click to see the Data Flow, non of the elements have any yellow, red or green mark.
  • This goes on for long periods of time, at first it lasted around 20 minutes, after that it started getting longer or simply not returning at all.
  • Output shows:
    Information: 0x40043006 at Load Sandbox Table, SSIS.Pipeline: Prepare for Execute phase is beginning.
    Information: 0x40043007 at Load Sandbox Table, SSIS.Pipeline: Pre-Execute phase is beginning.

    And nothing more until the excecution is stopped.
  • Yes, this has worked before. And yes, we have used a single query (in a stored procedure) to do this ETL but we wanted to migrate all the steps to SSIS.
  • 失败的解决方案:

  • 没有查找.
  • 任务流的默认缓冲区大小增加到 40485760,然后增加到 80971520.
  • 任务的默认缓冲区最大行数设置为 1000000.
  • 该任务的延迟验证设置为 True.
  • 任务中的所有元素都将验证外部数据"设置为假".
  • 两个查询都有:
    SET FMTONLY OFF;
    SET NOCOUNT ON;

    在开始时添加.
  • 两个查询都将 MAXDOP 设置为 1.
  • 将项目的运行 64 位运行时设置为 False.
  • 将目标加载从表或视图更改为表或视图 - 快速加载,没有锁定或约束.
  • 将每批次的行数设置为 1000 以实现快速加载.
  • 一些变通办法建议将任务流分成两个或多个任务流.但这是不可能的,因为我们需要做的是合并在两个源查询中找到的信息.
  • Failed solutions:

  • There are no lookups.
  • Default buffer size for the task flow was increased to 40485760 then to 80971520.
  • Default buffer max rows for task was set to 1000000.
  • Delay Validation was set to True for the task.
  • All elements inside the task were set Validate External Data to False.
  • Both queries had:
    SET FMTONLY OFF;
    SET NOCOUNT ON;

    added at the beggining.
  • Both queries had MAXDOP set to 1.
  • Setting project's Run 64 bit Runtime to False.
  • Changed destination load from Table or View to Table or View - Fast load with no locks or constraints.
  • Set rows per batch to 1000 for fast load.
  • Some work arounds propose to separate the task flow into two or more task flows. But this is not possible since what we need to do is a merge of the information found on both source queries.
  • 额外的位:我真的希望有人能帮助我.我对 SSIS 相当陌生,这是我第一次使用它.我通常为我的 ETL 使用 Pentaho,但客户需要在 SSIS 上实施该解决方案.几天来我一直在与这个问题作斗争,我开始没有想法来解决它.

    Extra bits: I really hope someone can help me. I am fairly new to SSIS, this is the first time I use it. I usually work with Pentaho for my ETL but the client needs the solution to be implemented on SSIS. I've been battling with this issue for a couple of days now and I'm starting to run out of ideas to solve it.

    当运行命令行时它也卡住了,我得到以下输出:

    When ran through the command line it gets stuck too and I get the following output:

    Progress: 2013-03-19 14:36:26.21
       Source: Load Sandbox Table
       Validating: 0% complete
    End Progress
    Progress: 2013-03-19 14:36:26.21
       Source: Load Sandbox Table
       Validating: 12% complete
    End Progress
    Progress: 2013-03-19 14:36:26.22
       Source: Load Sandbox Table
       Validating: 25% complete
    End Progress
    Progress: 2013-03-19 14:36:26.22
       Source: Load Sandbox Table
       Validating: 37% complete
    End Progress
    Progress: 2013-03-19 14:36:26.23
       Source: Load Sandbox Table
       Validating: 50% complete
    End Progress
    Progress: 2013-03-19 14:36:26.25
       Source: Load Sandbox Table
       Validating: 62% complete
    End Progress
    Progress: 2013-03-19 14:36:26.25
       Source: Load Sandbox Table
       Validating: 75% complete
    End Progress
    Progress: 2013-03-19 14:36:26.25
       Source: Load Sandbox Table
       Validating: 87% complete
    End Progress
    Progress: 2013-03-19 14:36:26.25
       Source: Load Sandbox Table
       Validating: 100% complete
    End Progress
    Warning: 2013-03-19 14:36:26.26
       Code: 0x80047076
       Source: Load Sandbox Table SSIS.Pipeline
       Description: The output column "ITEM_OID (1)" (47) on output "Merge Join Outp
    ut" (28) and component "Merge Join" (11) is not subsequently used in the Data Fl
    ow task. Removing this unused output column can increase Data Flow task performa
    nce.
    End Warning
    Progress: 2013-03-19 14:36:26.27
       Source: Load Sandbox Table
       Prepare for Execute: 0% complete
    End Progress
    Progress: 2013-03-19 14:36:26.27
       Source: Load Sandbox Table
       Prepare for Execute: 12% complete
    End Progress
    Progress: 2013-03-19 14:36:26.27
       Source: Load Sandbox Table
       Prepare for Execute: 25% complete
    End Progress
    Progress: 2013-03-19 14:36:26.27
       Source: Load Sandbox Table
       Prepare for Execute: 37% complete
    End Progress
    Progress: 2013-03-19 14:36:26.27
       Source: Load Sandbox Table
       Prepare for Execute: 50% complete
    End Progress
    Progress: 2013-03-19 14:36:26.27
       Source: Load Sandbox Table
       Prepare for Execute: 62% complete
    End Progress
    Progress: 2013-03-19 14:36:26.27
       Source: Load Sandbox Table
       Prepare for Execute: 75% complete
    End Progress
    Progress: 2013-03-19 14:36:26.27
       Source: Load Sandbox Table
       Prepare for Execute: 87% complete
    End Progress
    Progress: 2013-03-19 14:36:26.27
       Source: Load Sandbox Table
       Prepare for Execute: 100% complete
    End Progress
    Progress: 2013-03-19 14:36:26.31
       Source: Load Sandbox Table
       Pre-Execute: 0% complete
    End Progress
    Progress: 2013-03-19 14:36:26.31
       Source: Load Sandbox Table
       Pre-Execute: 12% complete
    End Progress
    Progress: 2013-03-19 14:36:26.31
       Source: Load Sandbox Table
       Pre-Execute: 25% complete
    End Progress
    Progress: 2013-03-19 14:36:26.34
       Source: Load Sandbox Table
       Pre-Execute: 37% complete
    End Progress
    Progress: 2013-03-19 14:36:45.69
       Source: Load Sandbox Table
       Pre-Execute: 50% complete
    End Progress
    

    之后它再次冻结.

    解决方案 (在这里发布这个是因为我还有 5 个小时无法回答我自己的问题,我会在被允许的情况下回答.)
    我终于明白了.
    事实证明,验证存在问题,但不仅仅是 SSIS 元素通过了验证,如问题的第四个失败的解决方案中所述.
    CONNECTIONS 也得到验证并拥有自己的 Delay Validation 属性,该属性需要设置为 true.
    之后,执行时间从 40 多分钟或不运行到整个过程的不到一分钟(这只是更大过程的一个步骤)
    我希望遇到同样问题的人可以轻松找到这个解决方案,因为有很多人遇到了这个问题,而且几乎没有在线发布解决方案.

    SOLUTION (Posting this here because I can't answer my own question for another 5 hours, I'll do it when I'm allowed to.)
    I finally got it.
    It turns out there is an issue with the validation, but not only SSIS elements go through that validation, as stated in the fourth failed solution of the question.
    The CONNECTIONS also get validated and have their own Delay Validation property, which needs to be set to true.
    After that the excecution time went from 40+ mins or no running to less than a minute for the full process (This is just one step of a much bigger process)
    I hope people with this same problem can find this solution easily because there's a lot of people running into this problem and nearly no solutions posted online.

    简而言之:检查任务中涉及的所有元素,包括数据库连接的延迟验证属性是否设置为 True.

    In a nutshell: Check that all your elements involved in the task, including the DB connections have Delay Validation Property set to True.

    推荐答案

    我终于明白了.事实证明,验证存在问题,但不仅仅是 SSIS 元素通过该验证,如问题的第四个失败的解决方案中所述.CONNECTIONS 也得到验证并拥有自己的 Delay Validation 属性,该属性需要设置为 true.之后,执行时间从 40+ 分钟或不运行到整个过程的不到一分钟(这只是更大过程的一个步骤)我希望遇到同样问题的人可以轻松找到此解决方案,因为遇到此问题的人很多,而且几乎没有在线发布解决方案.

    I finally got it. It turns out there is an issue with the validation, but not only SSIS elements go through that validation, as stated in the fourth failed solution of the question. The CONNECTIONS also get validated and have their own Delay Validation property, which needs to be set to true. After that the excecution time went from 40+ mins or no running to less than a minute for the full process (This is just one step of a much bigger process) I hope people with this same problem can find this solution easily because there's a lot of people running into this problem and nearly no solutions posted online.

    简而言之:检查任务中涉及的所有元素(包括数据库连接)是否将延迟验证属性设置为 True.

    In a nutshell: Check that all your elements involved in the task, including the DB connections have Delay Verification Property set to True.

    相关文章