Batch script to replace PHP short open tags with <?php

2021-12-24 00:00:00 find sed awk php

I have a large collection of PHP files written over the years, and I need to replace all the short open tags with proper explicit open tags.

change "<?" into "<?php"

I think this regular expression will select them properly:

<\?(\s|\n|\t|[^a-zA-Z])

which takes care of cases like

<?//
<?/*

but I am not sure how to process a whole folder tree, detect the .php file extension, apply the regular expression, and save each file after it has been changed.

I have the feeling this can be pretty straightforward if you master the right tools. (There is an interesting hack in the sed manual: 4.3 Example/Rename files to lower case.)

Maybe I'm wrong.
Or maybe this could be a one-liner?

Recommended answer

Don't use regexps for parsing formal languages: you'll always run into haystacks you did not anticipate. Like:

<?
$bla = '?> now what? <?';

It's safer to use a processor that knows about the structure of the language. For HTML, that would be an XML processor; for PHP, the built-in tokenizer extension. It has the T_OPEN_TAG parser token, which matches <?php, <? or <%, and T_OPEN_TAG_WITH_ECHO, which matches <?= or <%= (the ASP-style <% forms only exist before PHP 7). To replace all short open tags, you find all these tokens and replace T_OPEN_TAG with <?php and T_OPEN_TAG_WITH_ECHO with <?php echo .

The implementation is left as an exercise for the reader :)
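For anyone attempting that exercise, here is one rough sketch of what it could look like. It is not ringmaster's script; the function names convert_short_tags and convert_tree are made up for illustration, and it assumes PHP 7.1 or later with short_open_tag enabled at startup. It only handles <? and <?=, not the old ASP-style tags. It walks the tokens from token_get_all(), rewrites the two short tag tokens, and copies everything else through unchanged, so a "<?" inside a string literal is left alone.

```php
<?php
// Sketch only: rewrite short open tags with the tokenizer instead of a regex.
// Assumes short_open_tag=On; otherwise "<?" is reported as inline HTML
// and passes through untouched.

function convert_short_tags(string $source): string
{
    $tokens = token_get_all($source);
    $out = '';
    foreach ($tokens as $i => $token) {
        if (!is_array($token)) {
            $out .= $token; // single-character tokens such as ";" or "{"
            continue;
        }
        [$id, $text] = $token;
        if ($id === T_OPEN_TAG && $text === '<?') {
            $next = $tokens[$i + 1] ?? '';
            $nextText = is_array($next) ? $next[1] : $next;
            // "<?php" needs whitespace after it, so add a space
            // when the next token does not already start with one.
            $needsSpace = ltrim($nextText) === $nextText;
            $out .= $needsSpace ? '<?php ' : '<?php';
        } elseif ($id === T_OPEN_TAG_WITH_ECHO && $text === '<?=') {
            $out .= '<?php echo ';
        } else {
            $out .= $text; // everything else is copied verbatim
        }
    }
    return $out;
}

// Hypothetical driver: apply the conversion to every .php file under $root.
function convert_tree(string $root): void
{
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        if (!$file->isFile() || strtolower($file->getExtension()) !== 'php') {
            continue;
        }
        $path = $file->getPathname();
        $old = file_get_contents($path);
        if ($old === false) {
            continue; // unreadable file, skip it
        }
        $new = convert_short_tags($old);
        if ($new !== $old) {
            file_put_contents($path, $new); // overwrites in place
        }
    }
}
```

To use it you would add a call such as convert_tree('/path/to/project') at the end and run the file from the command line. Since it overwrites files in place, try it on a copy or a version-controlled tree first.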

EDIT 1: ringmaster was kind enough to provide one.

EDIT 2: On systems with short_open_tag turned off in php.ini, <?, <%, and <?= won't be recognized by a replacement script (since PHP 5.4, <?= is recognized regardless of this setting). To make the script work on such systems, enable short_open_tag via a command line option:

php -d short_open_tag=On short_open_tag_replacement_script.php

p.s. The manual page for token_get_all() and googling for creative combinations of tokenizer, token_get_all, and the parser token names might help.

p.p.s. See also "Regex to parse define() contents, possible?" on SO.
