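Python Crawler, Part 3: Parsing XML Network Messages
This section explains how to parse an XML message obtained in a project and extract the fields you need.
A tutorial on Python's XML parsing libraries: http://www.runoob.com/python/Python-xml.html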
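A message fetched over the network usually arrives as a string rather than as a file on disk. As a minimal sketch of that case, the example below parses an in-memory string with `xml.dom.minidom.parseString`. The `message` text and the `parse_message` helper are hypothetical and only stand in for a real response body.

```python
import xml.dom.minidom

# Hypothetical message body, standing in for text returned by a network request.
message = '<?xml version="1.0" encoding="UTF-8"?><Task><Title>demo</Title></Task>'


def parse_message(text):
    """Parse an XML string and return its root element (hypothetical helper)."""
    return xml.dom.minidom.parseString(text).documentElement


root = parse_message(message)
print(root.nodeName)
```

The same root element is what `documentElement` returns when parsing from a file, so the traversal code is identical in both cases.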
The XML file is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<Task version="1.3" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">
<RegistrationInfo>
<Date>2018-03-19T03:57:44.2908045</Date>
<Author>FANBINGLIN\Administrator</Author>
<Description>开机提醒事件</Description>
</RegistrationInfo>
<Triggers>
<LogonTrigger>
<Enabled>true</Enabled>
</LogonTrigger>
</Triggers>
<Principals>
<Principal id="Author">
<UserId>FANBINGLIN\Administrator</UserId>
<LogonType>InteractiveToken</LogonType>
<RunLevel>LeastPrivilege</RunLevel>
</Principal>
</Principals>
<Settings>
<MultipleInstancesPolicy>IgnoreNew</MultipleInstancesPolicy>
<DisallowStartIfOnBatteries>true</DisallowStartIfOnBatteries>
<StopIfGoingOnBatteries>true</StopIfGoingOnBatteries>
<AllowHardTerminate>true</AllowHardTerminate>
<StartWhenAvailable>false</StartWhenAvailable>
<RunOnlyIfNetworkAvailable>false</RunOnlyIfNetworkAvailable>
<IdleSettings>
<StopOnIdleEnd>true</StopOnIdleEnd>
<RestartOnIdle>false</RestartOnIdle>
</IdleSettings>
<AllowStartOnDemand>true</AllowStartOnDemand>
<Enabled>true</Enabled>
<Hidden>false</Hidden>
<RunOnlyIfIdle>false</RunOnlyIfIdle>
<DisallowStartOnRemoteAppSession>false</DisallowStartOnRemoteAppSession>
<UseUnifiedSchedulingEngine>false</UseUnifiedSchedulingEngine>
<WakeToRun>false</WakeToRun>
<ExecutionTimeLimit>P3D</ExecutionTimeLimit>
<Priority>7</Priority>
</Settings>
<Actions Context="Author">
<ShowMessage>
<Title>每日提醒</Title>
<Body>
1、掌握python基本语法,3.19-3.24
2、VBA程序研究
3、工作任务总结</Body>
</ShowMessage>
</Actions>
</Task>
The parsing code (some debugging statements are left in):
#!/usr/bin/python3
# coding: utf-8
import xml.dom.minidom

# Parse the file into a DOM Document, then take its root <Task> element.
Root = xml.dom.minidom.parse('开机提醒.xml')
# print(dir(Root))
task = Root.documentElement
for line in task.childNodes:
    # print('line.nodeName:', line.nodeName, 'line.nodeType:', line.nodeType, 'line.nodeValue:', line.nodeValue)
    # print(line)
    # nodeType 3 is a text node (here, the whitespace between elements); skip it.
    if 3 == line.nodeType:
        continue
    if 'Actions' == line.nodeName:
        # Children of <Actions>: <ShowMessage>
        for tmp in line.childNodes:
            # print(tmp)
            if 3 == tmp.nodeType:
                continue
            # Children of <ShowMessage>: <Title> and <Body>
            for tmp1 in tmp.childNodes:
                if 3 == tmp1.nodeType:
                    continue
                # The text is held in the child text nodes, so these are not skipped.
                for tmp2 in tmp1.childNodes:
                    # print(tmp2)
                    print(tmp2.nodeValue)
    # Commented-out debugging traversal over every top-level element:
    # for line1 in line.childNodes:
    #     if 3 == line1.nodeType:
    #         continue
    #     # print(line1.nodeName)
    #     for line2 in line1.childNodes:
    #         if 3 == line2.nodeType:
    #             continue
    #         print(line2.nodeValue)
    #         print(line2.data)
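The nested loops above walk the tree one level at a time. As an alternative sketch (not the approach used above), `getElementsByTagName` can look an element up by name directly. The `SAMPLE` string is a shortened copy of the task file, and `element_text` is a hypothetical helper introduced only for this illustration.

```python
import xml.dom.minidom

# Shortened copy of the <Actions> part of the task file shown earlier.
SAMPLE = """<Task version="1.3" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">
<Actions Context="Author">
<ShowMessage>
<Title>每日提醒</Title>
<Body>
1、掌握python基本语法,3.19-3.24
2、VBA程序研究
3、工作任务总结</Body>
</ShowMessage>
</Actions>
</Task>"""


def element_text(root, tag):
    """Return the stripped text of the first <tag> element, or None if absent."""
    nodes = root.getElementsByTagName(tag)
    if not nodes:
        return None
    # Join the text-node children; element children are ignored.
    return ''.join(child.data for child in nodes[0].childNodes
                   if child.nodeType == child.TEXT_NODE).strip()


root = xml.dom.minidom.parseString(SAMPLE).documentElement
print(element_text(root, 'Title'))
print(element_text(root, 'Body'))
```

This avoids the manual checks for text nodes at each level, at the cost of matching the first element with that name anywhere under the root, which matters if a tag name such as Enabled appears in several places.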
Result: the script prints the text of the Title and Body elements under ShowMessage.