Python multiprocessing PicklingError: Can't pickle <type 'function'>

Problem description

I am sorry that I can't reproduce the error with a simpler example, and my code is too complicated to post. If I run the program in the IPython shell instead of regular Python, everything works.

I looked up some earlier posts on this problem. They were all caused by using a pool to call a function defined inside a class method. But that is not the case for me.

Exception in thread Thread-3:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/threading.py", line 552, in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.7/threading.py", line 505, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 313, in _handle_tasks
    put(task)
PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed

I would appreciate any help.

Update: The function I pickle is defined at the top level of the module, though it calls a function that contains a nested function. That is, f() calls g(), which calls h(), which has a nested function i(), and I am calling pool.apply_async(f). f(), g(), and h() are all defined at the top level. I tried a simpler example with this pattern, and it works.


Solution

The pickle documentation lists what can be pickled. In particular, functions are picklable only if they are defined at the top level of a module.
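As a rough illustration of that rule, here is a minimal sketch that assumes Python 3. The names square, make_nested, and can_pickle are made up for this example and are not part of the pickle module.

```python
import pickle


def square(x):
    """A module-level function: pickle stores it by module and name."""
    return x * x


def make_nested():
    """Return a function that is defined inside another function."""
    def nested(x):
        return x + 1
    return nested


def can_pickle(obj):
    """Hypothetical helper: True if pickle.dumps(obj) succeeds."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


if __name__ == "__main__":
    print(can_pickle(square))         # module-level function
    print(can_pickle(make_nested()))  # nested function
    print(can_pickle(lambda x: x))    # lambda
```

Run as a script, this should print True for square and False for the nested function and the lambda, since pickle cannot look either of those up by name in their module.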

This code:

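import multiprocessing as mp

class Foo():
    @staticmethod
    def work():  # a static method takes no self argument
        pass

if __name__ == '__main__':
    pool = mp.Pool()
    foo = Foo()
    # foo.work is not a module-level function, so the task cannot be pickled
    pool.apply_async(foo.work)
    pool.close()
    pool.join()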

yields an error almost identical to the one you posted:

Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 552, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 505, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 315, in _handle_tasks
    put(task)
PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed

The problem is that the pool methods all use an mp.SimpleQueue to pass tasks to the worker processes. Everything that goes through the mp.SimpleQueue must be picklable, and foo.work is not picklable since it is not defined at the top level of the module.

It can be fixed by defining a function at the top level that calls foo.work():

def work(foo):
    foo.work()

pool.apply_async(work, args=(foo,))

Notice that foo is picklable, since Foo is defined at the top level and foo.__dict__ is picklable.
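Put together, the fix might look like the sketch below. It assumes Python 3 (for the with-statement on the pool), and the value attribute and the return value of work() are hypothetical additions so that there is a result to print.

```python
import multiprocessing as mp


class Foo:
    """Hypothetical class; the value attribute is only for illustration."""

    def __init__(self, value):
        self.value = value

    def work(self):
        return self.value * 2


def work(foo):
    """Module-level wrapper, so the pool can pickle it by name."""
    return foo.work()


if __name__ == "__main__":
    foo = Foo(21)
    with mp.Pool(processes=2) as pool:
        # Both work and foo are pickled and sent to a worker process.
        result = pool.apply_async(work, args=(foo,))
        print(result.get(timeout=30))  # 42
```

Calling result.get() also re-raises any exception from the worker in the parent process, which makes failures easier to see than with a bare apply_async call.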
