How to iterate a UTF-8 string in PHP?

2021-12-28 00:00:00 utf-8 php

How can I iterate over a UTF-8 string character by character using indexing?

When you access a UTF-8 string with the bracket operator, e.g. $str[0], you get a single byte, so any non-ASCII character is spread across 2 or more elements.

For example:

$str = "Kąt";
$str[0]; // "K"
$str[1]; // "\xC4", first byte of "ą" (displayed as "�")
$str[2]; // "\x85", second byte of "ą" (displayed as "�")
$str[3]; // "t"

But I want:

$str[0]; // "K"
$str[1]; // "ą"
$str[2]; // "t"

This is possible with mb_substr, but it is extremely slow when called once per character, since each call has to locate the offset by walking the string from the start, e.g.:

mb_substr($str, 0, 1); // "K"
mb_substr($str, 1, 1); // "ą"
mb_substr($str, 2, 1); // "t"

Is there another way to iterate the string character by character without using mb_substr?

Recommended answer

Use preg_split. With the "u" modifier the pattern and subject are treated as UTF-8, so an empty pattern splits the string between characters rather than between bytes.

$chrArray = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
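
As a minimal sketch of how the split array could then be used for indexed iteration (the variable name $chars and the exception message are illustrative, not part of the original answer):

```php
<?php
// Sketch: split a UTF-8 string into characters once, then index the array.
$str = "Kąt";

// With the "u" modifier the empty pattern matches between code points;
// PREG_SPLIT_NO_EMPTY drops the empty strings at the start and end.
$chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);

if ($chars === false) {
    // preg_split returns false on failure, e.g. when $str is not valid UTF-8.
    throw new InvalidArgumentException('Input is not valid UTF-8');
}

// $chars[0] is "K", $chars[1] is "ą", $chars[2] is "t".
foreach ($chars as $i => $ch) {
    echo $i, ': ', $ch, PHP_EOL;
}
```

The string is scanned only once, and each later lookup is a plain array access. Note that this splits on code points, so a character built from a base letter plus a combining mark would come out as separate elements. On PHP 7.4 or later, mb_str_split($str) should give the same array for valid UTF-8 input.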
