What does the `u` flag in preg_match_all depend on?

2022-03-29 00:00:00 regex php preg-match

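I have some code in a PHP application that returns null when I try to use it on the production server, but works fine on the development server. Here is the line of code: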

// use the regex unicode support to separate the UTF-8 characters into an array
preg_match_all( '/./us', $str, $match );

What does the u flag depend on? I have tested with mb_string both enabled and disabled, and that does not seem to affect it.

The error I am getting is:

preg_match_all: Compilation failed: unknown option bit(s) set at offset -1

More information

This is one of the configure options on the production server:

'--with-pcre-regex=/opt/pcre'

Here is the PCRE section

I believe this is the note @Wesley was referring to:

In  order  process  UTF-8 strings, you must build PCRE to include UTF-8
support in the code, and, in addition,  you  must  call  pcre_compile()
with  the  PCRE_UTF8  option  flag,  or the pattern must start with the
sequence (*UTF8). When either of these is the case,  both  the  pattern
and  any  subject  strings  that  are matched against it are treated as
UTF-8 strings instead of strings of 1-byte characters.

Are there any links or tips on how to "build PCRE to include UTF-8"?

Running

pcretest -C

gives:

PCRE version 6.6 06-Feb-2006
Compiled with
  UTF-8 support
  Unicode properties support
  Newline character is LF
  Internal link size = 2
  POSIX malloc threshold = 10
  Default match limit = 10000000
  Default recursion depth limit = 10000000
  Match recursion uses stack

Solution

This flag depends on PCRE having been built with Unicode support enabled.

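PHP bundles this library, and the bundled copy is normally built with Unicode support enabled: when PHP is built with the bundled PCRE library, the u modifier is available and has always worked since PHP 4.1.0.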

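However, some Linux distributions build PHP against their own PCRE build, and that PCRE build does not have Unicode support enabled, so the u modifier does not work on those builds.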
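To find out which case applies to a given server, one option is to ask PHP directly instead of inspecting the system's PCRE install, since PHP may be linked against a different PCRE build than the one pcretest reports on. The following is a minimal sketch; the helper name pcre_has_utf8_support is hypothetical, not part of PHP.

```php
<?php
// Hypothetical helper (not a PHP built-in): reports whether the PCRE library
// that this PHP binary is linked against accepts the /u modifier.
function pcre_has_utf8_support()
{
    // preg_match() returns false and raises a warning when the pattern
    // fails to compile; @ hides the warning so only the return value matters.
    return @preg_match('/./u', 'a') === 1;
}

// PCRE_VERSION is the version of the PCRE library PHP was built with.
echo 'PCRE version: ', PCRE_VERSION, PHP_EOL;
echo 'u modifier:   ', pcre_has_utf8_support() ? 'supported' : 'not supported', PHP_EOL;
```

Running this on both the development and the production server should show whether the two PHP builds differ in their PCRE library.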

The solution is to use an alternative PHP package.
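If switching packages is not possible, a workaround for this particular task (splitting a UTF-8 string into characters) is to describe the byte layout of a UTF-8 character in the pattern itself, so that the u modifier is not needed. This is only a sketch under the assumption that the input is valid UTF-8; the function name utf8_split_bytes is hypothetical, and bytes that do not form part of a well-shaped sequence are silently skipped.

```php
<?php
// Hypothetical helper: split a UTF-8 string into characters without /u.
// Assumes $str is valid UTF-8; stray bytes that match neither branch are dropped.
function utf8_split_bytes($str)
{
    // Without /u the pattern is applied byte by byte, so a "character" is either
    //   one ASCII byte (0x00-0x7F), or
    //   a lead byte (0xC2-0xF4) followed by 1 to 3 continuation bytes (0x80-0xBF).
    preg_match_all('/[\x00-\x7F]|[\xC2-\xF4][\x80-\xBF]{1,3}/', $str, $match);
    return $match[0];
}

// "h", "é" (2 bytes), "l", "l", "o"
$chars = utf8_split_bytes("h\xC3\xA9llo");
echo count($chars), PHP_EOL; // prints 5
```

On PHP 7.4 and later with the mbstring extension loaded, mb_str_split() is another way to get the same array without going through PCRE at all.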
