Python manager.dict() is very slow compared to a regular dict

Question

I have a dict that stores objects:

jobs = {}
job = Job()
jobs[job.name] = job

Now I want to convert it to a manager dict, because I want to use multiprocessing and need to share this dict amongst processes:

import multiprocessing

mgr = multiprocessing.Manager()
jobs = mgr.dict()
job = Job()
jobs[job.name] = job

Just by switching to manager.dict(), things got extremely slow.

For example, with a native dict it took only 0.65 seconds to create 625 objects and store them in the dict.

The very same task now takes 126 seconds!

Is there any optimization I can do to keep manager.dict() on par with a plain Python {}?


Answer

The problem is that each insert is quite slow (117x slower on my machine). A likely reason is that every assignment on the proxy returned by mgr.dict() is a separate round trip to the manager process, with the key and value pickled each time. If you instead build a normal dict first and pass it to your manager.dict() with update(), the whole transfer is a single, fast operation.

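import multiprocessing

# build everything in a normal, process-local dictionary first
jobs = {}
job = Job()
jobs[job.name] = job
# insert other jobs in the normal dictionary

# then copy it into the shared dict with one call
mgr = multiprocessing.Manager()
mgr_jobs = mgr.dict()
mgr_jobs.update(jobs)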

Then use the mgr_jobs variable in place of jobs.
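To see the difference on your own machine, here is a minimal timing sketch comparing the two ways of filling the shared dict. The helper names (make_jobs, fill_one_by_one, fill_in_bulk, timed) are made up for this example, and plain dict records stand in for the Job objects from the question, so the absolute numbers will not match the ones quoted above.

```python
import multiprocessing
import time


def make_jobs(n):
    """Build n hypothetical job records in an ordinary dict (stand-ins for Job objects)."""
    return {f"job-{i}": {"name": f"job-{i}", "state": "pending"} for i in range(n)}


def fill_one_by_one(shared, jobs):
    """Assign each item separately: one proxy call per item."""
    for name, job in jobs.items():
        shared[name] = job


def fill_in_bulk(shared, jobs):
    """Send the whole dict to the manager with a single update() call."""
    shared.update(jobs)


def timed(fill, shared, jobs):
    """Return the wall-clock seconds taken by fill(shared, jobs)."""
    start = time.perf_counter()
    fill(shared, jobs)
    return time.perf_counter() - start


if __name__ == "__main__":
    jobs = make_jobs(625)
    with multiprocessing.Manager() as mgr:
        one_by_one = timed(fill_one_by_one, mgr.dict(), jobs)
        bulk = timed(fill_in_bulk, mgr.dict(), jobs)
    print(f"one by one: {one_by_one:.4f}s, bulk update: {bulk:.4f}s")
```

Both functions leave the shared dict with the same contents; only the number of calls made through the proxy differs.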

Another option is to use the widely adopted multiprocessing.Queue class.
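As a rough sketch of the Queue approach, the example below hands job names to worker processes through one queue and collects results through another, so no shared dict is needed. The worker and run_jobs functions, the None sentinel, and the "done" result strings are assumptions made for illustration, not part of the original code.

```python
import multiprocessing


def worker(tasks, results):
    """Consume job names until the None sentinel arrives."""
    for name in iter(tasks.get, None):
        # Placeholder for real work on the job.
        results.put((name, f"{name} done"))


def run_jobs(names, n_workers=2):
    """Distribute the job names over n_workers processes and gather the results."""
    tasks = multiprocessing.Queue()
    results = multiprocessing.Queue()
    procs = [
        multiprocessing.Process(target=worker, args=(tasks, results))
        for _ in range(n_workers)
    ]
    for p in procs:
        p.start()
    for name in names:
        tasks.put(name)
    for _ in procs:
        tasks.put(None)  # one sentinel per worker
    # Read every result before joining so no worker is left blocked on a full pipe.
    collected = dict(results.get() for _ in names)
    for p in procs:
        p.join()
    return collected


if __name__ == "__main__":
    print(run_jobs(["job-a", "job-b", "job-c"]))
```

The parent process ends up with an ordinary dict of results, so lookups afterwards run at native dict speed.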
