“ROW PER BATCH"是否有任何相关性?和“最大插入提交大小"在 SSIS 包中?

我有一个 SSIS 包,它将包含 1000 万条记录的 2.5 GB 数据导出到 Sql Server 数据库,该数据库有 10 个分区,包括 PRIMARY FILE GROUP.

I've have SSIS Package that is exporting 2.5 GB OF DATA containing 10 million records into Sql Server Database which has 10 partitions including PRIMARY FILE GROUP.

更改默认最大插入提交大小之前,即2147483647"和每批行.使用快速加载选项完成转换需要 7 分钟.

Before Changing default Max Insert Commit size i.e."2147483647" and Row per batch.It was taking 7 mins for completed transformation with fast load option.

但是在用一些公式改变了一些体面的值之后,执行只用了 2 分钟.

But After chaning it some decent value with some formula, the execution was done in only 2 minutes.

仅供参考 - DefaultMaxBufferRows &DefaultMaxBufferSize 是两个场景中的默认值,即分别为 10000 和 10 MB.

FYI- DefaultMaxBufferRows & DefaultMaxBufferSize were default value in both scenorio i.e. 10000 and 10 MB respectively.

计算最大插入提交大小 &每批次行使用以下计算.

To calculate Max Insert Commit size & Row per batch Below calucation are used.

1) 正在传输的源记录的计算长度.大约有 1038 个字节.

1) Calculated length of records from source that is being transfered. which comes around 1038 bytes.

CREATE TABLE [dbo].[Game_DATA2](
    [ID] [int] IDENTITY(1,1) NOT NULL, -- AUTO CALCULATED
    [Number] [varchar](255) NOT NULL, -- 255 bytes
    [AccountTypeId] [int] NOT NULL, -- 4 bytes
    [Amount] [float] NOT NULL,-- 4 bytes
    [CashAccountNumber] [varchar](255) NULL, -- 255 bytes
    [StartDate] [datetime] NULL,-- 8 bytes
    [Status] [varchar](255) NOT NULL,-- 255 bytes
    [ClientCardNumber] [varchar](255) NULL -- 255 bytes
)

2) 每批的行数 = 每条记录的包大小/字节数 =32767/1038 =32 大约

2) Rows per batch = packate_size/bytes per record =32767/1038 =32 approx.

3) 最大插入提交大小=包大小*事务数=32767*100=3276700(包装大小和交易数量可变,可根据需要更改)

3) Max insert commit size = packate size *number of transaction = 32767*100=3276700 (Packate size and number transaction are variable can change as per requirement)

问题:

  • 每批次的行数和最大插入提交大小是否有任何相关性?由于存档 文章 中没有提到用于调整 DFT(DATA FLOW TASK) 执行.

  • Is there any relevance of rows per batch and max insert commit size? As there's no information mentioned in an archive article for tunning DFT(DATA FLOW TASK) execution.

这些配置是否与 DefaultBuffermaxzie 和
DefualtBuffermaxrows?如果是,如何?

Are these configuration works along with DefaultBuffermaxzie and
DefualtBuffermaxrows?if yes how?

推荐答案

这些参数仅适用于具有快速加载模式的 DFT OLE DB 目标.快速加载中的 OLE DB 目标发出 insert bulk 命令.这两个参数通过以下方式对其进行控制:

These parameters refer to DFT OLE DB Destination with Fast Load mode only. OLE DB Destination in Fast Load issues an insert bulk command. These two parameters control it in the following way:

  • 最大插入提交大小 - 控制在单个批次中插入的数据量.因此,如果您将 MICS 设置为 5000,并且您有 9000 行,并且您在前 5000 个结果中遇到错误,那么整批 5000 将被回滚.MISC 等同于 BULK INSERT transact-sql 命令中的 BATCHSIZE 参数.
  • 每批行数 - 只是对查询优化器的提示.此值应设置为实际预期的行数.RPB 等同于 BULK INSERT transact-sql 命令的 ROWS_PER_BATCH 参数.
    为 MICS 指定一个值会产生一些影响.每个批次都被复制到事务日志中,这将导致它快速增长,但提供了在每个批次之后备份该事务日志的能力.此外,如果您在目标表上有索引,那么拥有大批量会对内存产生负面影响,如果您没有使用表锁定,则可能会发生更多阻塞.
  • Maximum insert commit size - controls how much data inserted in a single batch. So, if you have MICS set to 5000 and you have 9000 rows and you encounter an error in the first 5000 results, the entire batch of 5000 will be rolled back. MISC equates to the BATCHSIZE argument in the BULK INSERT transact-sql command.
  • Rows Per Batch - merely a hint to the query optimizer. The value of this should be set to the actual expected number of rows. RPB equates to the ROWS_PER_BATCH argument to the BULK INSERT transact-sql command.
    Specifying a value for the MICS will have a few effects. Each batch is copied to the transaction log, which will cause it to grow quickly, but offers the ability to back up that transaction log after each batch. Also, having a large batch will negatively affect memory if you have indexes on the target table, and if you are not using table locking, you might have more blocking going on.

批量插入 (Transact-SQL) - 关于此命令的 MS 文章.

DefaultBuffermaxsize 和 DefaultBuffermaxrows 控制 DFT 自身内部的 RAM 缓冲区管理,并且不干扰上述选项.

DefaultBuffermaxsize and DefaultBuffermaxrows controls RAM buffer management inside DFT itself, and has no interference with options mentioned above.

相关文章