Regex for nested tags (innermost, to make it easier)
I researched this quite a bit but couldn't find a working example of how to match nested HTML tags with attributes. I know it is possible to match balanced/nested innermost tags without attributes (for example, a regex for <div> and </div> would be #<div[^>]*>(?:(?> [^<]+ ) |<(?!div[^>]*>))*?</div>#x).
However, I would like to see a regex pattern that finds an HTML tag pair with attributes.
Example: it should basically match
<div class="aaa"> **<div class="aaa">** <div> <div> </div> **</div>** </div>
and not
<div class="aaa"> **<div class="aaa">** <div> <div> **</div>** </div> </div>
Does anybody have any ideas?
For testing purposes we could use: http://www.lumadis.be/regex/test_regex.php
PS. Steven mentioned a solution on his blog (actually in a comment), but it doesn't work:
http://blog.stevenlevithan.com/archives/match-innermost-html-element
$regex = '/<div[^>]+?id\s*=\s*"MyID"[^>]*>(?:((?:[^<]++|<(?!/?div[^>]*>))+)|(<div[^>]*>(?>(?1)|(?2))*</div>))?</div>/i';
Solution
Matching innermost pairs of <div> & </div> tags, plus their attributes & content:
#<div(?:(?!(<div|</div>)).)*</div>#s
The key here is that (?:(?!STRING).)* is to strings as [^CHAR]* is to characters.
Credit: https://stackoverflow.com/a/6996274
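To make the analogy concrete, here is a minimal sketch comparing the two constructs on the same input. The sample string is made up purely for illustration and is not from the question:

```php
<?php
// Hypothetical sample text, chosen only to illustrate the analogy.
$text = 'aaa<b>bbb</div>ccc';

// [^<]* consumes characters until it reaches the single character "<".
preg_match('#^[^<]*#', $text, $m1);

// (?:(?!</div>).)* consumes characters until it reaches the string "</div>".
// Before each character, the negative lookahead checks that "</div>" does not start there.
preg_match('#^(?:(?!</div>).)*#s', $text, $m2);

echo $m1[0], "\n"; // aaa
echo $m2[0], "\n"; // aaa<b>bbb
```

The first pattern stops at the very first "<", while the second runs past "<b>" and only stops where the full string "</div>" begins. The solution pattern uses the same idea with two forbidden strings, <div and </div>, so a match can never contain another opening or closing div.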
Example in PHP:
<?php
$text = <<<'EOD'
<div id="1">
in 1
<div id="2">
in 2
<div id="3">
in 3
</div>
</div>
</div>
<div id="4">
in 4
<div id="5">
in 5
</div>
</div>
EOD;
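// Collect every innermost <div>...</div> pair; the "s" modifier lets "." match newlines.
$matches = array();
preg_match_all('#<div(?:(?!(<div|</div>)).)*</div>#s', $text, $matches);
// $matches[0] holds the full-pattern matches.
foreach ($matches[0] as $index => $match) {
    echo "************" . "\n" . $match . "\n";
}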
Outputs:
************
<div id="3">
in 3
</div>
************
<div id="5">
in 5
</div>