Python "File exists" error when making directory

Problem description

I have several threads running in parallel from Python on a cluster system. Each Python thread writes its output to a directory mydir. Before writing, each script checks whether mydir exists and creates it if it does not:

if not os.path.isdir(mydir):
    os.makedirs(mydir)

But this yields the error:

    os.makedirs(self.log_dir)
  File "/usr/lib/python2.6/os.py", line 157, in makedirs
    mkdir(name,mode)
OSError: [Errno 17] File exists

I suspect this is due to a race condition, where one job creates the directory before another gets to it. Is this possible? If so, how can the error be avoided?

I'm not sure it is a race condition, so I was wondering whether other issues in Python could cause this odd error.


Solution

Any time other code can execute between when you check something and when you act on it, you have a race condition. One way to avoid this (and the usual way in Python) is to just try the operation and then handle the exception:

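import errno
import os

while True:
    # next_dir_name() is a placeholder for whatever yields the next candidate name
    mydir = next_dir_name()
    try:
        os.makedirs(mydir)
        break
    except OSError as e:
        # Re-raise anything other than "File exists"
        if e.errno != errno.EEXIST:
            raise
        # Another thread created this directory first, so try the next name.
        # time.sleep might help here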

If you have a lot of threads trying to make a predictable series of directories, this will still raise a lot of exceptions, but you will get there in the end. In that case it is better to have just one thread create the directories.
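
For the situation in the question, where every thread writes to the same mydir rather than a fresh name each time, the try-then-handle approach might look like the sketch below. The names ensure_dir and worker, and the temporary base directory, are illustrative assumptions and not part of the original post.

```python
import errno
import os
import tempfile
import threading


def ensure_dir(path):
    """Create path if needed; tolerate another thread or process creating it first."""
    try:
        os.makedirs(path)
    except OSError as e:
        # EEXIST is acceptable only if what exists really is a directory.
        if e.errno != errno.EEXIST or not os.path.isdir(path):
            raise


# Demo (hypothetical setup): several threads race to create the same directory.
base = tempfile.mkdtemp()
mydir = os.path.join(base, "mydir")
errors = []


def worker():
    try:
        ensure_dir(mydir)
    except OSError as e:
        errors.append(e)


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

On Python 3.2 and later, os.makedirs(mydir, exist_ok=True) covers the same case without the explicit handler.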
