How to use the Python multiprocessing module in a Django view

2022-01-12 00:00:00 python django multiprocessing

Question

I have a simple function that goes over a list of URLs, uses GET to retrieve some information, and updates the DB (PostgreSQL) accordingly. The function works perfectly. However, going over the URLs one at a time takes too much time.

In plain Python, I can do the following to run these tasks in parallel:

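from multiprocessing import Pool

def updateDB(ip):
    # code goes here...
    pass

if __name__ == '__main__':
    # `ip` is the iterable of addresses to process (defined elsewhere)
    with Pool(processes=4) as pool:       # one process per core
        pool.map(updateDB, ip)            # blocks until every item is done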

This works pretty well. However, I'm trying to find out how to do the same in a Django project. Currently I have a function (view) that goes over each URL to get the information and updates the DB.

The only thing I could find is Celery, but that seems a bit overpowered for the simple task I want to perform.

Is there anything simple that I can do, or do I have to use Celery?


Answer

"Currently I have a function (view) that goes over each URL to get the information and updates the DB."

That suggests response time is not critical for you: rather than running the work in the background (asynchronously), you are fine with running it in the foreground as long as the response time drops to roughly a quarter (using 4 sub-processes or threads). If that is the case, you can simply put your sample code in your view, like this:

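from multiprocessing import Pool

from django.http import HttpResponse

def updateDB(ip):
    # code goes here...
    pass

def my_view(request):
    # `ip` is the list of addresses to process, as in the question
    with Pool(processes=4) as pool:       # one process per core
        pool.map(updateDB, ip)            # blocks until every item is done
    return HttpResponse("SUCCESS")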
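Since the work described in the question is mostly waiting on HTTP responses, a thread pool may be a lighter way to get the same foreground speed-up. The following is only a minimal sketch under that assumption, not part of the original answer: `update_all` and `fake_update` are hypothetical names, and `fake_update` stands in for the real `updateDB`.

```python
from concurrent.futures import ThreadPoolExecutor


def update_all(update_one, items, max_workers=4):
    """Call update_one(item) for every item using a pool of worker threads.

    Returns the results in the same order as items.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in input order; turning it into a
        # list waits for every call and re-raises any worker exception.
        return list(executor.map(update_one, items))


# Hypothetical stand-in for updateDB: the real function would GET the
# URL and write the result to the database.
def fake_update(ip):
    return "updated " + ip
```

Inside the view, this would be called as `update_all(updateDB, ip)` before returning the `HttpResponse`. The request still blocks until all items are processed, so it only shortens the wait and does not move the work to the background.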

If you want to do it asynchronously in the background instead, you should use Celery or follow one of @BasicWolf's suggestions.
