Matching any Unicode whitespace character in a string with a PHP regular expression

2021-12-28 00:00:00 string regex split php

I want to split a text message into an array at every space. This worked fine until I received the text message below. Here are the few lines of code that process the text string:

$str = 'T bw4 05/09/19 07:51 am BW6N 499.803';
$cleanStr = iconv("UTF-8", "ISO-8859-1", $str);
$strArr = preg_split('/[\s\t]/', $cleanStr);
var_dump($strArr);

Var_dump yields this result:

array:6 [▼
  0 => "T"
  1 => b"bw4 05/09/19"
  2 => "07:51"
  3 => "am"
  4 => "BW6N"
  5 => "499.803"
]

Item #1 in the array, 1 => b"bw4 05/09/19", is not correct, and I cannot figure out what the letter "b" in front of the array value means. Also, the space(s) between "bw4" and "05/09/19" are not being split on. Any suggestions on how to better achieve the string splitting are greatly appreciated. Here is the original string: https://3v4l.org/2L35M and here is an image of the result from my localhost: http://prntscr.com/jjbvny

Recommended answer

To match one or more Unicode whitespace characters, you may use

'~\s+~u'

Your '/[\s\t]/' pattern matches only a single whitespace character (\s) or a tab (\t); the \t is redundant, because \s already matches tabs. More importantly, because the u modifier is missing, \s cannot match the \u00A0 characters (hard, i.e. no-break, spaces) that follow bw4.
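The effect of the u modifier can be checked with a minimal sketch like the one below. It assumes PHP 7 or later (for the "\u{00A0}" escape), PHP's default C locale, and that the message contained two no-break spaces after bw4; the variable names $withoutU and $withU are only illustrative.

```php
<?php
// Sample string with two explicit U+00A0 (no-break space) characters after "bw4".
// Assumption: this mirrors the hidden characters in the original message.
$str = "T bw4\u{00A0}\u{00A0}05/09/19 07:51 am BW6N 499.803";

// Without the u modifier the subject is processed byte by byte.
// U+00A0 is encoded as the bytes C2 A0 in UTF-8, and \s matches neither byte
// under the default locale, so "bw4" and the date stay glued together.
$withoutU = preg_split('/\s+/', $str);

// With the u modifier the subject is treated as UTF-8 and \s also matches
// Unicode whitespace such as U+00A0.
$withU = preg_split('/\s+/u', $str);

echo count($withoutU), "\n"; // 6
echo count($withU), "\n";    // 7
echo $withU[1], "\n";        // bw4
```

Using \s+ rather than \s also collapses runs of consecutive whitespace, so the two adjacent hard spaces do not produce an empty array element.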

So, use

$str = 'T bw4 05/09/19 07:51 am BW6N 499.803';
$strArr = preg_split('/\s+/u', $str);
print_r($strArr);

See the PHP demo, which outputs:

Array
(
    [0] => T
    [1] => bw4
    [2] => 05/09/19
    [3] => 07:51
    [4] => am
    [5] => BW6N
    [6] => 499.803
)
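As for the "b" prefix in the question's dump: this is most likely the dumper marking the value as a binary (non-UTF-8) string, which would be a side effect of the iconv() call. The sketch below is a diagnostic under that assumption, not a confirmed explanation; it assumes PHP 7 or later with the iconv extension, and $chunk is a hypothetical reconstruction of the affected part of the message.

```php
<?php
// Hypothetical reconstruction of the problematic chunk: "bw4", two U+00A0, then the date.
$chunk = "bw4\u{00A0}\u{00A0}05/09/19";

// In UTF-8 each U+00A0 is the two-byte sequence C2 A0.
echo bin2hex($chunk), "\n";  // 627734c2a0c2a030352f30392f3139

// Converting to ISO-8859-1 turns each U+00A0 into the single byte A0.
$latin1 = iconv('UTF-8', 'ISO-8859-1', $chunk);
echo bin2hex($latin1), "\n"; // 627734a0a030352f30392f3139

// A lone A0 byte is not valid UTF-8. An empty pattern with the u modifier
// is a simple validity check: preg_match() returns false on malformed UTF-8.
$isValidUtf8 = preg_match('//u', $latin1) === 1;
var_dump($isValidUtf8);      // bool(false)
```

If that is what happened, dropping the iconv() step and splitting the original UTF-8 string with the u modifier, as in the answer, avoids both symptoms.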