Python Crawler, Part 3: Parsing XML Network Messages

2023-01-31 04:01:25 crawler, message, part 3

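This section explains how to parse an XML message obtained in a project and extract the fields of interest from it.
Tutorial on Python's XML parsing libraries: http://www.runoob.com/python/python-xml.html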

The XML file is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<Task version="1.3" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">
  <RegistrationInfo>
    <Date>2018-03-19T03:57:44.2908045</Date>
    <Author>FANBINGLIN\Administrator</Author>
    <Description>开机提醒事件</Description>
  </RegistrationInfo>
  <Triggers>
    <LogonTrigger>
      <Enabled>true</Enabled>
    </LogonTrigger>
  </Triggers>
  <Principals>
    <Principal id="Author">
      <UserId>FANBINGLIN\Administrator</UserId>
      <LogonType>InteractiveToken</LogonType>
      <RunLevel>LeastPrivilege</RunLevel>
    </Principal>
  </Principals>
  <Settings>
    <MultipleInstancesPolicy>IgnoreNew</MultipleInstancesPolicy>
    <DisallowStartIfOnBatteries>true</DisallowStartIfOnBatteries>
    <StopIfGoingOnBatteries>true</StopIfGoingOnBatteries>
    <AllowHardTerminate>true</AllowHardTerminate>
    <StartWhenAvailable>false</StartWhenAvailable>
    <RunOnlyIfNetworkAvailable>false</RunOnlyIfNetworkAvailable>
    <IdleSettings>
      <StopOnIdleEnd>true</StopOnIdleEnd>
      <RestartOnIdle>false</RestartOnIdle>
    </IdleSettings>
    <AllowStartOnDemand>true</AllowStartOnDemand>
    <Enabled>true</Enabled>
    <Hidden>false</Hidden>
    <RunOnlyIfIdle>false</RunOnlyIfIdle>
    <DisallowStartOnRemoteAppSession>false</DisallowStartOnRemoteAppSession>
    <UseUnifiedSchedulingEngine>false</UseUnifiedSchedulingEngine>
    <WakeToRun>false</WakeToRun>
    <ExecutionTimeLimit>P3D</ExecutionTimeLimit>
    <Priority>7</Priority>
  </Settings>
  <Actions Context="Author">
    <ShowMessage>
      <Title>每日提醒</Title>
      <Body>
1、掌握python基本语法,3.19-3.24 
2、VBA程序研究
3、工作任务总结</Body>
    </ShowMessage>
  </Actions>
</Task>
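
One detail worth knowing before walking a file like this with a DOM parser: because the file is indented, the whitespace between tags is reported as text nodes alongside the element nodes. The following is a minimal sketch of that behavior, assuming a small fragment copied from the <Triggers> part of the file above; the names SNIPPET, children, and elements are illustrative only.

```python
import xml.dom.minidom

# A small fragment taken from the <Triggers> part of the file above
SNIPPET = b"""<Triggers>
  <LogonTrigger>
    <Enabled>true</Enabled>
  </LogonTrigger>
</Triggers>"""

root = xml.dom.minidom.parseString(SNIPPET).documentElement

# Collect (nodeType, nodeName) for each direct child of <Triggers>
children = [(node.nodeType, node.nodeName) for node in root.childNodes]
for node_type, node_name in children:
    print(node_type, node_name)

# Keep only element nodes (nodeType 1, i.e. Node.ELEMENT_NODE)
elements = [node for node in root.childNodes
            if node.nodeType == node.ELEMENT_NODE]
print([node.nodeName for node in elements])
```

The text nodes (nodeType 3) on either side of <LogonTrigger> hold only the line breaks and indentation, which is why code that iterates over childNodes usually filters them out.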

The parsing code (some debugging statements are left in):

#!/usr/bin/python3
# coding: utf-8

import xml.dom.minidom

# Parse the XML file into a DOM tree
dom_tree = xml.dom.minidom.parse('开机提醒.xml')
# print(dir(dom_tree))
task = dom_tree.documentElement  # root <Task> element
# print(dir())
for line in task.childNodes:
    # print('line.nodeName:', line.nodeName, 'line.nodeType:', line.nodeType, 'line.nodeValue:', line.nodeValue)
    # print(line)
    # nodeType 3 is a text node (here, the whitespace between tags); skip it
    if 3 == line.nodeType:
        continue
    if 'Actions' == line.nodeName:
        # Children of <Actions>: the <ShowMessage> element
        for tmp in line.childNodes:
            # print(tmp)
            if 3 == tmp.nodeType:
                continue
            # Children of <ShowMessage>: <Title> and <Body>
            for tmp1 in tmp.childNodes:
                if 3 == tmp1.nodeType:
                    continue
                # Children of <Title>/<Body>: text nodes holding the content
                for tmp2 in tmp1.childNodes:
                    # print(tmp2)
                    print(tmp2.nodeValue)
    # for line1 in line.childNodes:
    #   if 3 == line1.nodeType:
    #       continue
    #   # print(line1.nodeName)
    #   # print(dir(line1))

    #   for line2 in line1.childNodes:
    #       if 3 == line2.nodeType:
    #           continue
            # print(line2.nodeValue)
            # print(line2.data)
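
The nested loops above can be shortened with getElementsByTagName, which searches all descendants of a node by tag name. The following is only a sketch of that alternative, not the approach used in the original code: it embeds a trimmed copy of the file so it runs on its own, and SAMPLE_XML, get_text, and read_message are hypothetical names introduced for illustration.

```python
import xml.dom.minidom

# Trimmed copy of the task file, embedded so the sketch is self-contained
SAMPLE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<Task version="1.3" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">
  <Actions Context="Author">
    <ShowMessage>
      <Title>每日提醒</Title>
      <Body>
1、掌握python基本语法,3.19-3.24
2、VBA程序研究
3、工作任务总结</Body>
    </ShowMessage>
  </Actions>
</Task>"""


def get_text(element):
    """Join the text nodes directly under an element (hypothetical helper)."""
    return ''.join(
        child.data for child in element.childNodes
        if child.nodeType == child.TEXT_NODE
    )


def read_message(dom_tree):
    """Return (title, body) of the first <ShowMessage> element."""
    message = dom_tree.getElementsByTagName('ShowMessage')[0]
    title = get_text(message.getElementsByTagName('Title')[0])
    body = get_text(message.getElementsByTagName('Body')[0])
    return title, body.strip()


# Parse from bytes so the encoding declaration in the XML header is honored
dom_tree = xml.dom.minidom.parseString(SAMPLE_XML.encode('utf-8'))
title, body = read_message(dom_tree)
print(title)
print(body)
```

To run this against the real file, replace the parseString call with xml.dom.minidom.parse('开机提醒.xml'). Looking elements up by name avoids the manual text-node checks, at the cost of assuming each tag name appears where expected.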

[Figure: screenshot of the program output]
