Date calculation with a parameter in SSIS is not giving the correct result
I want to load data from the last n days from a data source. To do this, I have a project parameter "number_of_days". I use the parameter in an OLE DB Source with a SQL command that contains the clause
WHERE StartDate >= CAST(GETDATE() -? as date)
The ? placeholder is mapped to the project parameter, which is an Int32. However, when I ask for the last 10 days, I only get the last 8 days.
Version info:
- SQL Server Data Tools 15.1.61710.120
- Server is SQL Server 2017 standard edition.
I set up a test package, with as little data as possible. There is this data source:
Parameter:
Parameter mapping:
The T-SQL expression (wrong result):
CAST(GETDATE() -? as date)
The SSIS expression for date_calc (correct result):
(DT_DBTIMESTAMP) (DT_DBDATE) DATEADD("DD", - @[$Project::number_of_days] , GETDATE())
I would think that the T-SQL expression and the SSIS expression give the same result (today minus 10 days) but that is not the case when I run the package and store the results in a table. See column date_diff, which gives 8 days instead of 10:
If I replace the parameter by the actual value, I do get the correct result.
A data viewer also shows the incorrect date. When I deploy the package, I get the same result as from the debugger.
Is this a bug, or am I missing something here?
Solution
I think the main problem is how the OLE DB Source determines the parameter data type. I did not find official documentation that states this, but a small experiment shows it.
Enter the following query as the SQL command in the OLE DB Source:
SELECT ? as Column1
Then parse the query. You will get the following error:
The parameter type for '@P1' cannot be uniquely deduced; two possibilities are 'sql_variant' and 'xml'.
This means that the query parser tries to work out the data type of the parameter on its own; it is not related to the data type of the variable that you mapped to it.
Then try the following query:
SELECT CAST(? AS INT) AS Column1
Parse the query again, and you will get:
The SQL Statement was successfully parsed.
Now, let's apply this experiment to your query. Try
SELECT CAST(GETDATE() - ? AS DATE) AS Column1
and you will get a wrong value. Then try
SELECT CAST(GETDATE() - CAST(? AS INT) AS DATE) AS Column1
and you will get the correct value.
Update 1 - Info from official documentation
From the OLE DB Source documentation:
The parameters are mapped to variables that provide the parameter values at run time. The variables are typically user-defined variables, although you can also use the system variables that Integration Services provides. If you use user-defined variables, make sure that you set the data type to a type that is compatible with the data type of the column that the mapped parameter references.
This implies that the parameter data type is not related to the variable data type.
Update 2 - Experiments using SQL Profiler
As an experiment, I created an SSIS package that exports data from an OLE DB Source to a Recordset Destination. The data source is the result of the following query:
SELECT *
FROM dbo.DatabaseLog
WHERE PostTime < CAST(GETDATE() - ? as date)
The parameter ? is mapped to a variable of type Int32 with the value 10.
Before executing the package, I started a SQL Profiler trace on the SQL Server instance. After executing the package, the following statements were recorded in the trace:
exec [sys].sp_describe_undeclared_parameters N'SELECT *
FROM dbo.DatabaseLog
WHERE PostTime < CAST(GETDATE() -@P1 as date)'
declare @p1 int
set @p1=1
exec sp_prepare @p1 output,N'@P1 datetime',N'SELECT *
FROM dbo.DatabaseLog
WHERE PostTime < CAST(GETDATE() -@P1 as date)',1
select @p1
exec sp_execute 1,'1900-01-09 00:00:00'
exec sp_unprepare 1
The first command, exec [sys].sp_describe_undeclared_parameters, describes the parameter type. Run separately, its output shows that the parameter data type is considered to be datetime.
The other commands show some odd statements:
- First, a variable @p1 is set to 1 (in the trace this is the handle that sp_prepare returns, not the query parameter @P1).
- The final query is executed with the parameter value 1900-01-09 00:00:00.
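The value in the trace is what you get if the integer 10 is read as a day count starting at 30 December 1899. Here is a minimal Python sketch of that arithmetic. It assumes the conversion is a plain day offset from that date, and the helper name int_to_dt_date is made up for illustration; it is not an SSIS function.

```python
from datetime import datetime, timedelta

# Assumption: the Int32 value is turned into a date by counting whole days
# from the DT_DATE zero point, 30 December 1899.
DT_DATE_ZERO = datetime(1899, 12, 30)


def int_to_dt_date(days: int) -> datetime:
    """Hypothetical helper: interpret an integer as days since 1899-12-30."""
    return DT_DATE_ZERO + timedelta(days=days)


# The package variable holds 10; this reproduces the value seen in the trace.
print(int_to_dt_date(10))  # 1900-01-09 00:00:00
```

If this reading is right, the Int32 value never reaches the server as an integer. It arrives already converted to a datetime.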
Discussion
In the SQL Server database engine, the base datetime value is 1900-01-01 00:00:00, which can be retrieved by executing the following query:
declare @dt datetime
set @dt = 0
Select @dt
On the other hand, in SSIS:
A date structure that consists of year, month, day, hour, minute, seconds, and fractional seconds. The fractional seconds have a fixed scale of 7 digits.
The DT_DATE data type is implemented using an 8-byte floating-point number. Days are represented by whole number increments, starting with 30 December 1899, and midnight as time zero. Hour values are expressed as the absolute value of the fractional part of the number. However, a floating point value cannot represent all real values; therefore, there are limits on the range of dates that can be presented in DT_DATE.
On the other hand, DT_DBTIMESTAMP is represented by a structure that internally has individual fields for year, month, day, hours, minutes, seconds, and milliseconds. This data type has larger limits on ranges of the dates it can present.
Based on that, I think the base date differs between the SSIS date data type (1899-12-30) and the SQL Server datetime (1900-01-01), which leads to a two-day difference when an implicit conversion is performed to evaluate the parameter value.
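The two-day shift can be worked through with plain date arithmetic. The Python sketch below is only a model of the behaviour described in this answer, not of the actual SSIS or SQL Server internals. The function names and the example date are made up, and both conversion steps are assumptions.

```python
from datetime import date, datetime, timedelta

# Zero points taken from the discussion above.
SSIS_DT_DATE_ZERO = datetime(1899, 12, 30)  # DT_DATE day 0
SQL_DATETIME_ZERO = datetime(1900, 1, 1)    # SQL Server datetime day 0


def effective_days_back(number_of_days: int) -> int:
    """Hypothetical model: days actually subtracted by GETDATE() - ?."""
    # Assumed step 1: the Int32 is sent as a datetime counted from 1899-12-30.
    sent_value = SSIS_DT_DATE_ZERO + timedelta(days=number_of_days)
    # Assumed step 2: SQL Server reads that datetime as days since 1900-01-01.
    return (sent_value - SQL_DATETIME_ZERO).days


def cutoff(today: date, number_of_days: int, cast_to_int: bool) -> date:
    """Cutoff date with or without the explicit CAST(? AS INT)."""
    days = number_of_days if cast_to_int else effective_days_back(number_of_days)
    return today - timedelta(days=days)


today = date(2019, 3, 20)  # arbitrary example date
print(effective_days_back(10))               # 8
print(cutoff(today, 10, cast_to_int=False))  # 2019-03-12
print(cutoff(today, 10, cast_to_int=True))   # 2019-03-10
```

In this model the untyped parameter always ends up two days short, which matches the 8 days instead of 10 reported in the question. Casting the parameter to INT avoids the datetime round trip.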
References
- Integration Services Data Types
- Parsing Data
- Data type conversion (Database Engine)