Django中DateTime字段之间的聚合差异

2022-01-01 00:00:00 postgresql django orm aggregate sqlite

我有一个表格,其中包含一系列与时间段(特别是为客户工作的时间)相关的条目:

I have a table containing a series of entries which relate to time periods (specifically, time worked for a client):

task_time:
id     |    start_time    |    end_time       |    client (fk)
1        08/12/2011 14:48   08/12/2011 14:50     2

我正在尝试从我的 Django 应用程序中汇总为给定客户工作的所有时间:

I am trying to aggregate all the time worked for a given client, from my Django app:

time_worked_aggregate = models.TaskTime.objects.
                        filter(client = some_client_id).
                        extra(select = {'elapsed': 'SUM(task_time.end_time - task_time.start_time)'}).
                        values('elapsed')

if len(time_worked_aggregate) > 0:
    time_worked = time_worked_aggregate[0]['elapsed'].total_seconds()
else:
    time_worked = 0

这看起来不雅,但它确实有效.或者至少我是这么想的:事实证明它在 PostgreSQL 数据库上运行良好,但是当我转移到 SQLite 时,一切都消失了.

This seems inelegant, but it does work. Or at least so I thought: it turns out that it works fine on a PostgreSQL database, but when I move over to SQLite, everything dies.

一些挖掘表明,这样做的原因是 DateTime 不是 SQLite 中的一流数据.以下原始 SQLite 查询将完成我的工作:

A bit of digging suggests that the reason for this is that DateTimes aren't first-class data in SQLite. The following raw SQLite query will do my job:

SELECT SUM(strftime('%s', end_time) - strftime('%s', start_time)) FROM task_time WHERE ...;

我的问题如下:

  • 上面的 Python 示例似乎很迂回.我们可以更优雅地做到这一点吗?
  • 更重要的是,在这个阶段,我们能否以一种同时适用于 Postgres 和 SQLite 的方式来实现?理想情况下,我不希望编写原始 SQL 查询并打开恰好到位的数据库后端;一般,Django 非常擅长保护我们免受此类攻击.Django 对这个操作有合理的抽象吗?如果没有,对我来说在后端进行条件切换的明智方法是什么?
  • The Python sample above seems roundabout. Can we do this more elegantly?
  • More importantly at this stage, can we do it in a way that will work on both Postgres and SQLite? Ideally, I'd like not to be writing raw SQL queries and switching on the database backend that happens to be in place; in general, Django is extremely good at protecting us from this. Does Django have a reasonable abstraction for this operation? If not, what's a sensible way for me to do a conditional switch on the backend?

我应该在上下文中提到数据集有数千个条目;以下内容并不实用:

I should mention for context that the dataset is many thousands of entries; the following is not really practical:

sum([task_time.end_date - task_time.start_date for task_time in models.TaskTime.objects.filter(...)])

推荐答案

我认为从 Django 1.8 开始我们可以做得更好:

I think since Django 1.8 we can do better:

我只想绘制带有注释的部分,带有聚合的进一步部分应该很简单:

I would like just to draw the part with annotation, the further part with aggregation should be straightforward:

from django.db.models import F, Func
SomeModel.objects.annotate(
    duration = Func(F('end_date'), F('start_date'), function='age')
)

[有关 postgres 年龄函数的更多信息:http://www.postgresql.org/docs/8.4/static/functions-datetime.html ]

[more about postgres age function here: http://www.postgresql.org/docs/8.4/static/functions-datetime.html ]

SomeModel 的每个实例都将使用包含时差的 duration 字段进行注释,在 python 中将是一个 datetime.timedelta() 对象[更多关于 datetime timedelta 在这里:https://docs.python.org/2/library/datetime.html#timedelta-objects ]

each instance of SomeModel will be anotated with duration field containg time difference, which in python will be a datetime.timedelta() object [more about datetime timedelta here: https://docs.python.org/2/library/datetime.html#timedelta-objects ]

相关文章