Measuring Celery task execution time
Problem description
I have converted a standalone batch job to use Celery for dispatching the work to be done. I'm using RabbitMQ. Everything runs on a single machine, and no other processes are using the RabbitMQ instance. My script just creates a bunch of tasks, which are processed by workers.
Is there a simple way to measure the time from the start of my script until all tasks are finished? I know that this is a bit complicated by design when using message queues. But I don't want to do it in production, just for testing and getting a performance estimate.
Solution
You could use a chord: add a dummy task at the end that is passed the time at which the tasks were sent, and that reports the difference between the current time and that start time when it executes. Because a chord's callback runs only after every task in its header has finished, the difference approximates the total elapsed time.
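import time

from celery import chord, shared_task

@shared_task
def dummy_task(res=None, start_time=None):
    # Chord callback: runs once all header tasks are done; res holds their results.
    print(time.time() - start_time)

def send_my_task():
    # my_task is the task being profiled. The chord header must be a list (or group).
    # A float timestamp is used so start_time survives JSON serialization.
    chord([my_task.s()])(dummy_task.s(start_time=time.time()))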
send_my_task sends the task that you want to profile together with dummy_task, which prints roughly how long the whole run took. If you want more accurate numbers, I suggest passing start_time directly to your tasks and using Celery's signals.