QueueHandler for multiprocessing logging

Problem description

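I am trying to adapt my program so that logs from different processes are written to a single log file. I have been looking for a solution for days without success. I think I still do not understand how the queue handler works. As I understand it, the process is: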

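  • Create a queue (Q)
  • Add a QueueHandler (qHandler) to the main logger
  • All log records are redirected to Q, and Q then uses the other handlers attached to the logger (via logger.handle(record)).

I created a simplified version of the program to show how the logger behaves:
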
# logger.py

import logging
   
def listener_configurer():
    """This sets the settings for the root logger. The highest in the hierarchy. 
    All the handlers added to this root logger are available for all the subloggers.
    """
    root = logging.getLogger('main')
    file = logging.FileHandler(r'logs\temp.log', 'w')
    fmt = logging.Formatter('%(asctime)s %(processName)-10s %(name)s %(levelname)-8s %(message)s')
    stream = logging.StreamHandler()
    stream.setFormatter(fmt)
    file.setFormatter(fmt)
    root.addHandler(file)
    root.addHandler(stream)
    root.setLevel(logging.DEBUG)


def listener_process(queue):
    listener_configurer()
    while True:
        try:
            record = queue.get()
            if record is not None:
                print("-------------- using q ------------------ " + record.name + " -> " + record.message)
                logger = logging.getLogger(record.name)
                logger.handle(record)
            else:
                break
        except Exception:
            import sys, traceback
            logger.error('Whoops! Problem: %s', "problem", exc_info=1)
            traceback.print_exc(file=sys.stderr)

# saver.py (worker)
import logging
import typing

log = logging.getLogger('main.Saver')

class Saver:
    def __init__(self) -> None:
        log.warning("Instantiating a saver obj")
       
    def doStuff(self, input_line: typing.Tuple,) -> None:
        log.info(f"Exporting: {input_line}") # ASSUMING A TUPLE AS INPUT like: email, email_id, email_url
        (email, email_id, email_url, *other) = input_line
        log.info("Source URL: " + email_url)
        log.info(f"EmailName: {email}")
        log.warning(f"EmailID: {email_id}")
        log.debug("Exporting done!")

# manager.py
import logging
import logging.config
import logging.handlers
import multiprocessing
import logger
from saver import Saver

class Manager:

    def __init__(self) -> None:
        ### LOGGER
        # initializing listener -> this queue is going to be used for the multiprocessing logging
        self.queue = multiprocessing.Queue(-1)
        self.log = self.root_configurer(self.queue)  # getting a reference to the root logger -> used to log from this module
        self.listener = multiprocessing.Process(target=logger.listener_process, args=(self.queue,))
        self.listener.start()
        # utils
        self.log.info(f"Starting program at 10 am")
        # instantiate
        self.save = Saver()

    def root_configurer(self, queue):
        root = logging.getLogger('main')
        h = logging.handlers.QueueHandler(queue)  # Just the one handler needed
        root.setLevel(logging.DEBUG)
        root.addHandler(h)
        return root # this is the main function -> we need to retrieve the root logger here

    def run(self):
        tuples = [("email1","id1","url1",""), ("email2","id2","url2",""), ("email3","id3","url3",""), ("email4","id4","url4",""), ("email4","id4","url4","")]
        procs = []
        for res in tuples:
            proc = multiprocessing.Process(target=self.save.doStuff, args=(res,)) 
            procs.append(proc)
            proc.start()
        # complete the processes
        for proc in procs:
            proc.join()
        
        self.log.debug("We reached this part!")
        # close listener
        self.queue.put_nowait(None)
        self.listener.join()   

if __name__ == "__main__":
    m = Manager()
    m.run()

What I expected was a series of lines like:

-------- using q -------------  main.saver INFO Source URL: ...
-------- using q -------------  main.saver INFO EmailName ...
-------- using q -------------  main.saver WARNING EmailID
-------- using q -------------  main.saver DEBUG ....

plus all of those lines written to the log file. For some reason, this is what I get instead:

EmailID: id4
EmailID: id3
EmailID: id2
-------------- using q ------------------ main -> Starting program at 10 am
2021-07-01 11:42:16,385 MainProcess main INFO     Starting program at 10 am      
-------------- using q ------------------ main.Saver -> Instantiating a saver obj
2021-07-01 11:42:16,386 MainProcess main.Saver WARNING  Instantiating a saver obj
EmailID: id4
EmailID: id1
-------------- using q ------------------ main -> We reached this part!
2021-07-01 11:42:16,852 MainProcess main DEBUG    We reached this part!

and a file like this:

2021-07-01 11:42:16,385 MainProcess main INFO     Starting program at 10 am
2021-07-01 11:42:16,386 MainProcess main.Saver WARNING  Instantiating a saver obj
2021-07-01 11:42:16,852 MainProcess main DEBUG    We reached this part!

Any ideas?

EDIT: the code is taken from:

  • https://docs.python.org/3/howto/logging-cookbook.html#a-more-elaborate-multiprocessing-example

  • https://fanchenbao.medium.com/python3-logging-with-multiprocessing-f51f460b8778

Solution

Your worker processes do not write to the queue.

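Your code appears to be based on the Logging Cookbook's "Logging to a single file from multiple processes". You can see there that each worker receives the queue as an argument and configures its own logging with it (via worker_configurer). In your code, you configure only your manager, not your workers.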
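This would also account for the bare `EmailID: ...` lines in your output. The following is a minimal sketch, assuming your workers were started with the spawn start method (the default on Windows), so that the "main" logger inside each worker has no handlers at all. When logging finds no handler, it falls back to its last-resort handler, which writes only records of level WARNING or higher to stderr, with no formatting. The logger name and the `propagate = False` line are assumptions of this demo; they only keep it independent of whatever the root logger happens to be configured with.

```python
import contextlib
import io
import logging


def emit_without_handlers() -> str:
    """Log through a logger that has no handlers and return what reaches stderr."""
    log = logging.getLogger("demo_unconfigured.Saver")  # hypothetical name
    log.setLevel(logging.DEBUG)
    # Demo-only: stop the handler search here so the root logger's setup is irrelevant.
    log.propagate = False
    captured = io.StringIO()
    with contextlib.redirect_stderr(captured):
        log.info("Source URL: url1")    # below WARNING: dropped by the last-resort handler
        log.warning("EmailID: id1")     # WARNING: printed as the bare message
    return captured.getvalue()


if __name__ == "__main__":
    print(repr(emit_without_handlers()))
```

Only the WARNING message survives, and it carries no timestamp or process name, which matches the stray lines you saw.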

Simply add self.queue to the process arguments and copy the (slightly edited) root_configurer method into saver.py, so that it is called when doStuff starts, and it works.
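The change could be sketched roughly as follows. This is not your exact program: `worker_configurer`, `do_stuff`, `ListHandler` and `run` are names chosen for the illustration, the worker is a plain function rather than a `Saver` method, and a `logging.handlers.QueueListener` thread in the main process stands in for your separate listener process so that the collected records are easy to inspect.

```python
import logging
import logging.handlers
import multiprocessing


def worker_configurer(queue) -> None:
    """Attach a QueueHandler to the 'main' logger inside the worker process."""
    main_logger = logging.getLogger("main")
    # Do not stack a second QueueHandler if one is already there (e.g. inherited via fork).
    if not any(isinstance(h, logging.handlers.QueueHandler) for h in main_logger.handlers):
        main_logger.addHandler(logging.handlers.QueueHandler(queue))
    main_logger.setLevel(logging.DEBUG)


def do_stuff(queue, input_line) -> None:
    """Worker entry point: configure logging first, then do the work."""
    worker_configurer(queue)
    log = logging.getLogger("main.Saver")
    email, email_id, email_url, *_ = input_line
    log.info("Source URL: %s", email_url)
    log.warning("EmailID: %s", email_id)


class ListHandler(logging.Handler):
    """Hypothetical stand-in for the file and stream handlers: keeps lines in memory."""

    def __init__(self) -> None:
        super().__init__()
        self.lines = []

    def emit(self, record: logging.LogRecord) -> None:
        self.lines.append(f"{record.name} {record.levelname} {record.getMessage()}")


def run(rows) -> list:
    """Start one worker per row, passing the queue along, and return the collected lines."""
    queue = multiprocessing.Queue(-1)
    collector = ListHandler()
    listener = logging.handlers.QueueListener(queue, collector)
    listener.start()
    procs = [multiprocessing.Process(target=do_stuff, args=(queue, row)) for row in rows]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
    listener.stop()  # enqueues a sentinel and waits for the listener thread to drain the queue
    return collector.lines
```

As in your manager.py, `run(...)` should be called from under an `if __name__ == "__main__":` guard. The important part is that the queue travels in `args` and that the worker attaches its own QueueHandler before logging anything.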



Off-topic nitpicks (you did not ask for them, but they are free!):

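  • The "root logger" is unnamed; you get it by calling logging.getLogger() (with no argument). So the logger "main" is not the root. Consider naming it main_logger.
  • Add a comment on the break for the record is None case; I thought it was a bug at first.
  • If the first get from the queue raises an error, the logger variable is never set, so the exception handler will raise an UnboundLocalError before anything is written to stderr.
  • Give credit for code you did not write: what you posted is heavily based on the Logging Cookbook example.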
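The two nitpicks about the listener loop could be addressed along these lines. This is only a sketch: `listener_loop` is a name made up here, it takes any queue-like object with a blocking `get()`, and it returns a count purely so that its behaviour is easy to check.

```python
import logging
import sys
import traceback


def listener_loop(q) -> int:
    """Handle records from q until the None sentinel arrives; return how many were handled."""
    handled = 0
    while True:
        try:
            record = q.get()
            if record is None:
                # None is the shutdown sentinel put on the queue by the manager, not an error.
                break
            logging.getLogger(record.name).handle(record)
            handled += 1
        except Exception:
            # Nothing here depends on a local that may not be bound yet:
            # report straight to stderr and keep listening.
            print("Whoops! Problem while handling a log record:", file=sys.stderr)
            traceback.print_exc(file=sys.stderr)
    return handled
```

Because the exception handler no longer uses `logger`, a failure on the very first `get()` is reported instead of being masked by an UnboundLocalError.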
