Why do new objects created with multiprocessing have the same id?
Problem description
I am trying to create a new object in each process when using the multiprocessing module, but something confuses me.
When I use the multiprocessing module, the id of the new object is the same in every process:
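import multiprocessing
from nltk.tag import StanfordNERTagger  # assuming NLTK's Stanford NER wrapper

def worker():
    # Stanford named entity tagger; model_path and stanford_ner_path are defined elsewhere
    st = StanfordNERTagger(model_path, stanford_ner_path)
    print(id(st))  # all the processes print the same id

if __name__ == "__main__":
    for i in range(4):
        p = multiprocessing.Process(target=worker)
        p.start()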
But when I use threading, the ids are different:
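import threading
from nltk.tag import StanfordNERTagger  # assuming NLTK's Stanford NER wrapper

def worker():
    # Stanford named entity tagger; model_path and stanford_ner_path are defined elsewhere
    st = StanfordNERTagger(model_path, stanford_ner_path)
    print(id(st))  # threads print different ids

for i in range(4):
    p = threading.Thread(target=worker)
    p.start()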
Why do the two cases behave differently?
Solution
In CPython, id() returns the memory address of the given object. Because threads share a single address space, two instances that are alive at the same time are allocated at two different locations and therefore have two different ids (that is, virtual addresses).
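The thread case can be reproduced without the tagger. The following is a minimal sketch, assuming a hypothetical stand-in class Tagger (not part of the question) and keeping every instance alive in a list so that no address can be reused while the ids are compared:

```python
import threading


class Tagger(object):
    """Hypothetical stand-in for StanfordNERTagger (an assumption for illustration)."""


def collect_thread_ids(n_threads=4):
    objects = []             # holds references so the instances stay alive
    lock = threading.Lock()  # protects the shared list

    def worker():
        obj = Tagger()
        with lock:
            objects.append(obj)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # All instances are still alive here, so their addresses must differ.
    return [id(obj) for obj in objects]


if __name__ == "__main__":
    ids = collect_thread_ids()
    print(len(set(ids)) == len(ids))
```

Objects that live at the same time in one address space cannot overlap, which is why the ids are distinct here. If the instances were not kept alive, a later thread could be handed an address that an earlier one had already released.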
This is not the case for separate processes, each of which has its own address space. The objects simply happen to land at the same address in each process, which is likely when every child starts from a similar memory layout and performs the same allocations.
Keep in mind that these addresses are virtual: each one is meaningful only within the address space of its own process. That is why the same value can appear in several processes while referring to different objects.
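To see that an id is only meaningful together with the process it came from, each child can report its pid alongside the id. This is a rough sketch, assuming Python 3 on a POSIX system where the "fork" start method is available; Tagger and child_ids are hypothetical names introduced for illustration. Whether the ids actually coincide depends on the platform and allocator, so the code only prints them.

```python
import multiprocessing
import os


class Tagger:
    """Hypothetical stand-in for StanfordNERTagger (an assumption for illustration)."""


def worker(queue):
    obj = Tagger()
    # id(obj) is an address inside this child's own virtual address space.
    queue.put((os.getpid(), id(obj)))


def child_ids(n=4):
    # Assumption: the "fork" start method exists (POSIX only).
    ctx = multiprocessing.get_context("fork")
    queue = ctx.Queue()
    procs = [ctx.Process(target=worker, args=(queue,)) for _ in range(n)]
    for p in procs:
        p.start()
    results = [queue.get() for _ in procs]  # one (pid, id) pair per child
    for p in procs:
        p.join()
    return results


if __name__ == "__main__":
    for pid, obj_id in child_ids():
        print(pid, hex(obj_id))
```

The pids always differ, while the printed addresses may well be identical. The pair (pid, id) is what identifies an object across processes, not the id alone.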
It is usually better not to rely on id() to distinguish objects: an id is unique only for the lifetime of its object, so a new object may receive the id of one that has already been freed. That makes objects hard to track over time and tends to cause tricky bugs.
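One possible alternative is to give each object an explicit identifier of its own. The sketch below assumes a hypothetical class Tracked with a per-process counter; it is one way to do this, not the only one.

```python
import itertools


class Tracked:
    """Hypothetical class that carries its own identifier instead of relying on id()."""

    _counter = itertools.count()  # per-process counter shared by all instances

    def __init__(self):
        self.uid = next(Tracked._counter)


def create_and_discard(n=3):
    seen = []
    for _ in range(n):
        obj = Tracked()
        seen.append((obj.uid, id(obj)))
        del obj  # the address may now be handed to the next instance
    return seen


if __name__ == "__main__":
    for uid, address in create_and_discard():
        # uid is never repeated within the process; the address may be.
        print(uid, hex(address))
```

The counter is local to each process, so across processes it would have to be combined with something like os.getpid(), or replaced by uuid.uuid4(), to remain unique.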