Batch script to replace PHP short open tags with <?php
I have a large collection of PHP files written over the years, and I need to replace all the short open tags with proper explicit open tags.
change "<?" into "<?php"
I think this regular expression will properly select them :
<\?(\s|\n|\t|[^a-zA-Z])
which takes care of cases like
<?//
<?/*
but I am not sure how to process a whole folder tree, detect the .php file extension, apply the regular expression, and save each file after it has been changed.
I have the feeling this can be pretty straightforward if you master the right tools. (There is an interesting hack in the sed manual: 4.3 Example/Rename files to lower case).
Maybe I'm wrong.
Or maybe this could be a oneliner?
Recommended answer
don't use regexps for parsing formal languages - you'll always run into haystacks you did not anticipate. like:
<?
$bla = '?> now what? <?';
it's safer to use a processor that knows about the structure of the language. for html, that would be an xml processor; for php, the built-in tokenizer extension. it has the T_OPEN_TAG parser token, which matches <?php, <? or <%, and T_OPEN_TAG_WITH_ECHO, which matches <?= or <%=. to replace all short open tags, you find all these tokens and replace T_OPEN_TAG with <?php and T_OPEN_TAG_WITH_ECHO with <?php echo.
the implementation is left as an exercise for the reader :)
EDIT 1: ringmaster was kind enough to provide one.
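For illustration, here is a minimal sketch of what such a token-based replacement might look like. It is not ringmaster's code: the function name replace_short_open_tags is made up for this example, and PHP 7.1 or later is assumed. It relies on token_get_all() returning every piece of the source, so concatenating the token texts reproduces the file except where a tag is swapped.

```php
<?php
// Hypothetical helper (the name is illustrative, not from the original answer).
// Rewrites short open tags in $source to explicit ones using the tokenizer.
function replace_short_open_tags(string $source): string
{
    $output = '';
    foreach (token_get_all($source) as $token) {
        if (!is_array($token)) {
            // Single-character tokens such as ';' or '{' come back as plain strings.
            $output .= $token;
            continue;
        }
        [$id, $text] = $token;
        if ($id === T_OPEN_TAG_WITH_ECHO) {
            // The short echo tag becomes an explicit echo statement.
            $output .= '<?php echo ';
        } elseif ($id === T_OPEN_TAG && strncasecmp($text, '<?php', 5) !== 0) {
            // A short open tag: emit the explicit tag followed by a space.
            $output .= '<?php ';
        } else {
            // Everything else, including an existing explicit tag, is copied verbatim.
            $output .= $text;
        }
    }
    return $output;
}
```

The trailing space after <?php is deliberate: without it, <?// would become <?php//, which PHP does not treat as an open tag. String contents such as the '?> now what? <?' example above arrive as a single string token and are left untouched.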
EDIT 2: on systems with short_open_tag turned off in php.ini, <?, <%, and <?= won't be recognized by a replacement script. to make the script work on such systems, enable short_open_tag via command line option:
php -d short_open_tag=On short_open_tag_replacement_script.php
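As for the folder-tree part of the question, a rough sketch of a driver follows. It assumes PHP 7 or later; the function name rewrite_php_tree and the $transform callback are hypothetical, with $transform standing in for whichever token-based replacement function you use. It walks the tree, picks files with a .php extension, and saves only the files whose contents actually changed.

```php
<?php
// Hypothetical driver (names are illustrative): walks $root recursively and
// rewrites every *.php file with $transform, saving only files that changed.
// Returns the number of files that were rewritten.
function rewrite_php_tree(string $root, callable $transform): int
{
    $changed = 0;
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        // Skip anything that is not a regular file with a .php extension
        // (the extension check is case-insensitive).
        if (!$file->isFile() || strtolower($file->getExtension()) !== 'php') {
            continue;
        }
        $path = $file->getPathname();
        $before = file_get_contents($path);
        if ($before === false) {
            continue; // unreadable file: leave it alone
        }
        $after = $transform($before);
        if ($after !== $before) {
            file_put_contents($path, $after);
            $changed++;
        }
    }
    return $changed;
}
```

Since files are overwritten in place, it is probably wise to run something like this on a copy of the tree or under version control, and to start the script with the -d short_open_tag=On option shown above.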
p.s. the man page for token_get_all() and googling for creative combinations of tokenizer, token_get_all, and the parser token names might help.
p.p.s. see also "Regex to parse define() contents, possible?" here on SO.