Is there a Perl equivalent of Python's re.findall/re.finditer (iterating over regex results)?
Problem description
In Python, compiled regex patterns have a findall method that does the following:
Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match.
What's the canonical way of doing this in Perl? A naive algorithm I can think of is along the lines of "while a search and replace with the empty string is successful, do [suite]". I'm hoping there's a nicer way. :-)
Thanks in advance!
Solution
Use the /g modifier in your match. From the perlop manual:
The "/g" modifier specifies global pattern matching--that is, matching as many times as possible within the string. How it behaves depends on the context. In list context, it returns a list of the substrings matched by any capturing parentheses in the regular expression. If there are no parentheses, it returns a list of all the matched strings, as if there were parentheses around the whole pattern.
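As a rough illustration of the list-context behaviour described above, the following sketch mimics re.findall. The sample string and all variable names are made up for the example. One difference from Python is that with several capture groups Perl returns a single flat list rather than a list of tuples, so the sketch regroups it by hand.

```perl
use strict;
use warnings;

# Hypothetical sample data, chosen only for illustration.
my $text = "alice=1, bob=22, carol=333";

# No capture groups: list context returns every whole match,
# much like re.findall with a group-free pattern.
my @numbers = $text =~ /\d+/g;         # ("1", "22", "333")

# One capture group: returns what the group captured in each match.
my @names = $text =~ /(\w+)=/g;        # ("alice", "bob", "carol")

# Two capture groups: the result is one FLAT list, not a list of tuples.
my @flat = $text =~ /(\w+)=(\d+)/g;    # ("alice", "1", "bob", "22", "carol", "333")

# Regroup the flat list into pairs (2 = number of capture groups).
my @pairs;
for (my $i = 0; $i < @flat; $i += 2) {
    push @pairs, [ @flat[$i, $i + 1] ];
}

print "@numbers\n";
print "@names\n";
print join(", ", map { "($_->[0], $_->[1])" } @pairs), "\n";
```

The list assignment (`my @numbers = ...`) is what supplies list context here; assigning the same match to a scalar would instead perform a single scalar-context match.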
In scalar context, each execution of "m//g" finds the next match, returning true if it matches, and false if there is no further match. The position after the last match can be read or set using the pos() function; see "pos" in perlfunc. A failed match normally resets the search position to the beginning of the string, but you can avoid that by adding the "/c" modifier (e.g. "m//gc"). Modifying the target string also resets the search position.
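The scalar-context form is probably the closest counterpart to re.finditer, since each loop iteration sees one match together with its captures and offsets. A minimal sketch, again using a made-up sample string and hypothetical variable names:

```perl
use strict;
use warnings;

# Hypothetical sample data, chosen only for illustration.
my $text = "alice=1, bob=22, carol=333";

my @found;
# The while condition is scalar context, so each evaluation of the
# match resumes where the previous one stopped and finds the next match.
while ($text =~ /(\w+)=(\d+)/g) {
    my ($name, $value) = ($1, $2);
    my $start = $-[0];          # offset where the whole match begins
    my $end   = pos($text);     # offset just past the match (same as $+[0])
    push @found, [$name, $value, $start, $end];
    print "$name => $value at offsets $start..$end\n";
}

# Without /c, the final failed match has reset the search position.
print "pos is reset\n" unless defined pos($text);
```

Unlike the search-and-replace loop suggested in the question, this leaves the target string unmodified, which matters because modifying the string would reset the search position.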