Need some assistance with Python threading/queue
Problem description
import threading
import Queue
import urllib2
import time

class ThreadURL(threading.Thread):
    def __init__(self, queue):
        threading.Thread.__init__(self)
        self.queue = queue

    def run(self):
        while True:
            host = self.queue.get()
            sock = urllib2.urlopen(host)
            data = sock.read()
            self.queue.task_done()

hosts = ['http://www.google.com', 'http://www.yahoo.com', 'http://www.facebook.com', 'http://stackoverflow.com']
start = time.time()

def main():
    queue = Queue.Queue()
    for i in range(len(hosts)):
        t = ThreadURL(queue)
        t.start()
    for host in hosts:
        queue.put(host)
    queue.join()

if __name__ == '__main__':
    main()
    print 'Elapsed time: {0}'.format(time.time() - start)
I've been trying to get my head around how to perform Threading and after a few tutorials, I've come up with the above.
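What it should do is:
- Initialise the queue
- Create my thread pool and then queue up the list of hosts
- My ThreadURL class should then begin work once a host is in the queue and read the website data
- The program should finish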
What I want to know first off is, am I doing this correctly? Is this the best way to handle threads?
Secondly, my program fails to exit. It prints out the Elapsed time line and then hangs there. I have to kill my terminal for it to go away. I'm assuming this is due to my incorrect use of queue.join()?
Solution
Your code looks fine and is quite clean.
The reason your application still "hangs" is that the worker threads are still running: each one is blocked in queue.get(), waiting for the main application to put something in the queue, even though your main thread is finished. Python does not exit until every non-daemon thread has ended, so the process stays alive.
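To see this for yourself, here is a minimal sketch written for Python 3 (where the Queue module is named queue). The worker function, the placeholder items, and the None sentinel are my own additions for illustration, not part of your code; the sentinel is there only so that the demo itself can exit.

```python
import queue
import threading

def worker(q):
    # Same shape as ThreadURL.run: loop forever, blocking in q.get().
    while True:
        item = q.get()
        q.task_done()
        if item is None:
            # Sentinel added only so this demo can shut down.
            break

q = queue.Queue()
threads = [threading.Thread(target=worker, args=(q,)) for _ in range(4)]
for t in threads:
    t.start()

for item in ["a", "b", "c", "d"]:
    q.put(item)

q.join()  # returns once every queued item has been marked done

# The workers have not finished: they are parked in q.get() again.
alive_after_join = [t.is_alive() for t in threads]
print("alive after queue.join():", alive_after_join)

# Release the workers so the interpreter can exit.
for _ in threads:
    q.put(None)
for t in threads:
    t.join()
alive_after_sentinel = [t.is_alive() for t in threads]
print("alive after sentinels:", alive_after_sentinel)
```

queue.join() only waits for the queued items to be processed; it says nothing about whether the threads themselves have ended.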
The simplest way to fix this is to mark the threads as daemons, by doing t.daemon = True before your call to start. This way, the threads will not prevent the program from exiting.
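As a rough sketch of that change, here is your program ported to Python 3 with the daemon flag set. A few things here are assumptions on my part rather than your code: fetch is a hypothetical stand-in for urllib2.urlopen(host).read() so that the example runs without network access, the results dict is added only to show that every host was handled, and the queue variable is renamed q to avoid shadowing the Python 3 queue module.

```python
import queue
import threading
import time

def fetch(host):
    # Hypothetical stand-in for urllib2.urlopen(host).read(); no network used.
    time.sleep(0.01)
    return "data from " + host

class ThreadURL(threading.Thread):
    def __init__(self, q, results):
        super().__init__()
        self.q = q
        self.results = results  # added for illustration only

    def run(self):
        while True:
            host = self.q.get()
            try:
                self.results[host] = fetch(host)
            finally:
                # Mark the item done even if fetch raises, so q.join() can return.
                self.q.task_done()

hosts = ['http://www.google.com', 'http://www.yahoo.com',
         'http://www.facebook.com', 'http://stackoverflow.com']

def main():
    q = queue.Queue()
    results = {}
    workers = []
    for _ in hosts:
        t = ThreadURL(q, results)
        t.daemon = True  # must be set before start()
        t.start()
        workers.append(t)
    for host in hosts:
        q.put(host)
    q.join()  # wait until every host has been processed
    return results, workers

start = time.time()
results, workers = main()
print('Elapsed time: {0}'.format(time.time() - start))
```

The workers are still blocked in q.get() when the main thread finishes, but because they are daemons the interpreter exits anyway. Daemon threads are stopped abruptly at shutdown, which is fine here because queue.join() has already confirmed that all the work is done.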