find() after replaceWith() doesn't work (using BeautifulSoup)
Problem description
Please consider the following Python session:
>>> from BeautifulSoup import BeautifulSoup
>>> s = BeautifulSoup("<p>This <i>is</i> a <i>test</i>.</p>"); myi = s.find("i")
>>> myi.replaceWith(BeautifulSoup("was"))
>>> s.find("i")
>>> s = BeautifulSoup("<p>This <i>is</i> a <i>test</i>.</p>"); myi = s.find("i")
>>> myi.replaceWith("was")
>>> s.find("i")
<i>test</i>
Please note the missing output of s.find("i") after line 4 of the session, i.e. the first s.find("i") call prints nothing!
What's the reason for this? Is there a workaround?
Actually, the example doesn't demonstrate the real use case, which is:
myi.replaceWith(BeautifulSoup("wa<b>s</b>"))
Whenever the inserted part itself contains nontrivial HTML, I don't see how this syntax could be replaced with something else. Just having
myi.replaceWith("wa<b>s</b>")
will replace the HTML special characters with entities.
Solution
Simpler answer: after your call to replaceWith, regenerate and clean s by calling s = BeautifulSoup(s.renderContents()). Then you can call find again.
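For readers on a current Python, here is a minimal sketch of the same replace-then-re-parse workaround. It assumes the bs4 package (Beautiful Soup 4) rather than the BeautifulSoup 3 module used in the question, so the method is spelled replace_with and the tree is serialized with str() instead of renderContents(). The helper name replace_and_reparse is hypothetical and not part of the library. Whether a given bs4 version still needs the re-parse step is not something this sketch establishes; it only shows that re-parsing is harmless and leaves find usable.

```python
# Assumption: bs4 (Beautiful Soup 4) is installed; the question uses BeautifulSoup 3.
from bs4 import BeautifulSoup


def replace_and_reparse(html, tag_name, replacement_html):
    """Replace the first <tag_name> element with parsed HTML, then re-parse.

    Hypothetical helper for illustration; not part of Beautiful Soup.
    """
    soup = BeautifulSoup(html, "html.parser")
    target = soup.find(tag_name)
    if target is None:
        # Nothing to replace, so return the parsed document unchanged.
        return soup
    # Parse the fragment first so its markup is inserted as tags
    # instead of being escaped into entities.
    target.replace_with(BeautifulSoup(replacement_html, "html.parser"))
    # Serialize and parse again so the tree is rebuilt from scratch.
    return BeautifulSoup(str(soup), "html.parser")


s = replace_and_reparse("<p>This <i>is</i> a <i>test</i>.</p>", "i", "wa<b>s</b>")
print(s.find("i"))  # the remaining <i> element
print(s.find("b"))  # the inserted <b> element
```

The cost of this approach is a full re-parse of the document after each replacement, so for many replacements it may be worth doing them all first and re-parsing once at the end.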