PHP unserialize fails with non-encoded characters?

2021-12-21 00:00:00 encode serialization character php


$ser = 'a:2:{i:0;s:5:"héllö";i:1;s:5:"wörld";}'; // fails
$ser2 = 'a:2:{i:0;s:5:"hello";i:1;s:5:"world";}'; // works
$out = unserialize($ser);
$out2 = unserialize($ser2);
print_r($out);
print_r($out2);
echo "<hr>";

But why?
Should I encode before serializing, then? How?

I am using JavaScript to write the serialized string to a hidden field, and then read it in PHP via $_POST.
In JS I have something like:

function writeImgData() {
    var caption_arr = new Array();
    $('.album img').each(function(index) {
         caption_arr.push($(this).attr('alt'));
    });
    $("#hidden-field").attr("value", serializeArray(caption_arr));
};

Solution

The reason why unserialize() fails with:

$ser = 'a:2:{i:0;s:5:"héllö";i:1;s:5:"wörld";}';

is that the declared lengths for héllö and wörld are wrong. The s:N length in PHP's serialized format counts bytes, not characters, and strlen() also returns bytes. In UTF-8, é and ö each take two bytes:

echo strlen('héllö'); // 7
echo strlen('wörld'); // 6
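
The same mismatch probably occurs on the JavaScript side, since a string's `length` property counts UTF-16 code units rather than UTF-8 bytes. A minimal sketch, assuming the form is submitted as UTF-8; `utf8ByteLength` is a hypothetical helper name, not something from the question:

```javascript
// Hypothetical helper: number of bytes a string occupies when encoded as UTF-8.
function utf8ByteLength(str) {
  // TextEncoder always encodes to UTF-8 and returns a Uint8Array.
  return new TextEncoder().encode(str).length;
}

// 'h\u00e9ll\u00f6' is "héllö" written with escapes to avoid editor encoding issues.
const word = 'h\u00e9ll\u00f6';

console.log(word.length);          // 5 (UTF-16 code units)
console.log(utf8ByteLength(word)); // 7 (UTF-8 bytes, same as PHP's strlen)
```

If the JavaScript serializer uses `length` for the `s:N` part, it writes 5 where PHP expects 7.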

However, if you unserialize() the following string, which declares the correct byte lengths:

$ser = 'a:2:{i:0;s:7:"héllö";i:1;s:6:"wörld";}';

echo '<pre>';
print_r(unserialize($ser));
echo '</pre>';

It works:

Array
(
    [0] => héllö
    [1] => wörld
)

If you build the string with PHP's serialize(), it computes the byte lengths of multi-byte strings correctly.
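
When the string has to be produced in JavaScript instead, the lengths need to be byte counts there as well. The question's `serializeArray` is not shown, so the following is only a sketch of a possible replacement for a flat array of strings, assuming UTF-8 on both ends; `phpSerializeStringArray` is a hypothetical name:

```javascript
// Hypothetical sketch: emit PHP's serialized format for a flat array of strings,
// using UTF-8 byte counts for the s:N lengths.
function phpSerializeStringArray(items) {
  const encoder = new TextEncoder();
  const parts = items.map((item, index) => {
    const text = String(item);
    // PHP expects the byte length here, not the character count.
    const byteLength = encoder.encode(text).length;
    return `i:${index};s:${byteLength}:"${text}";`;
  });
  return `a:${items.length}:{${parts.join('')}}`;
}

console.log(phpSerializeStringArray(['h\u00e9ll\u00f6', 'w\u00f6rld']));
// a:2:{i:0;s:7:"héllö";i:1;s:6:"wörld";}
```

The output matches the corrected string above, so unserialize() should accept it as long as the posted data is not re-encoded on the way to PHP.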

On the other hand, if you need to exchange serialized data between multiple programming languages, drop PHP's serialization format and move to something like JSON, which is far more standardized.
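
A minimal sketch of the JSON route on the JavaScript side; `encodeCaptions` is a hypothetical name, and the sample captions stand in for the `alt` values collected in the question:

```javascript
// Hypothetical helper: turn the caption array into a JSON string for the hidden field.
function encodeCaptions(captions) {
  // JSON carries no length prefixes, so byte counts never come into play.
  return JSON.stringify(captions);
}

const payload = encodeCaptions(['h\u00e9ll\u00f6', 'w\u00f6rld']);
console.log(payload);             // ["héllö","wörld"]
console.log(JSON.parse(payload)); // round-trips to the original array
```

In the question's code this would replace the `serializeArray(caption_arr)` call, and on the PHP side the posted value can be decoded with json_decode() instead of unserialize().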
