PHP/PDO/MySQL:插入 MEDIUMBLOB 存储坏数据
我有一个简单的 PHP Web 应用程序,它通过文件上传接受图标图像并将它们存储在 MEDIUMBLOB 列中.
I have a simple PHP web app that accepts icon images via file upload and stores them in a MEDIUMBLOB column.
在我的机器 (Windows) 和两台 Linux 服务器上,这工作正常.在第三台 Linux 服务器上,插入的图像已损坏:在 SELECT 后无法读取,并且 MySQL length() 函数报告的列数据长度比上传文件的大小大 40% 左右.
On my machine (Windows) plus two Linux servers, this works fine. On a third Linux server, the inserted image is corrupted: unreadable after a SELECT, and the length of the column data as reported by the MySQL length() function is about 40% larger than the size of the uploaded file.
(每个服务器连接到一个单独的 MySQL 实例.)
(Each server connects to a separate instance of MySQL.)
当然,这让我想到了编码和字符集问题.BLOB 列没有关联的字符集,因此似乎最有可能的罪魁祸首是 PDO 及其对该列的参数值的解释.
Of course, this leads me to think about encoding and character set issues. BLOB columns have no associated charsets, so it seems like the most likely culprit is PDO and its interpretation of the parameter value for that column.
- 我尝试将 bindValue 与 PDO::PARAM_LOB 结合使用,但没有效果.
- 我已经确认服务器上正确接收了图像(即在上传后读取它们没有问题),因此这绝对是 DB/PDO 问题.
- 我已经搜索了服务器之间明显的配置差异,但我不是 PHP 配置方面的专家,所以我可能遗漏了一些东西.
插入代码大致如下:
$imagedata = file_get_contents($_FILES["icon"]["tmp_name"]);
$stmt = $pdo->prepare('insert into foo (theimage) values (:theimage)');
$stmt->bindValue(':theimage', $imagedata, PDO::PARAM_LOB);
$stmt->execute();
任何帮助将不胜感激.
UPDATE:有问题的服务器上的默认 MySQL 字符集是 utf8;其他人是latin1.
UPDATE: The default MySQL charset on the problematic server is utf8; it's latin1 on the others.
问题通过添加PDO::MYSQL_ATTR_INIT_COMMAND =>SET NAMES latin1 COLLATE latin1_general_ci"
到 PDO 构造函数.
The problem is "solved" by adding PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES latin1 COLLATE latin1_general_ci"
to the PDO constructor.
这对我来说似乎一个错误糟糕的设计:为什么连接的字符集会对二进制列的数据产生任何影响,特别是当它被识别为用 PARAM_LOB 二进制到 PDO 本身?
This seems like a bug poor design to me: why should the charset of the connection have any effect on data for a binary column, particularly when it's been identified as binary to PDO itself with PARAM_LOB?
请注意,数据库表在所有情况下都定义为 latin1:只有服务器的默认字符集不一致.
Note that the DB tables are defined as latin1 in all cases: it's only the servers' default charsets that are inconsistent.
推荐答案
这对我来说似乎是一个错误:为什么连接的字符集会对二进制列的数据产生任何影响,尤其是当它被识别为 PDO 本身的二进制数据时,PARAM_LOB?
This seems like a bug to me: why should the charset of the connection have any effect on data for a binary column, particularly when it's been identified as binary to PDO itself with PARAM_LOB?
我不认为这一定是一个错误.我可以想象,每当客户端与服务器交谈并说以下命令是 UTF-8 并且服务器需要它是 Latin-1 时,那么查询可能会在解析和执行之前重新编码.所以这是数据传输的编码问题.由于整个查询优先解析会受到这种重新编码的影响,BLOB 列的二进制数据也会发生变化.
I do not think that this must be a bug. I can imagine that whenever the client talks with the server and says that the following command is in UTF-8 and the server needs it in Latin-1, then the query might get re-encoded prior parsing and execution. So this is an encoding issue for the transportation of the data. As the whole query prior parsing will get influenced by this re-encoding, the binary data for the BLOB column will get changed as well.
来自 Mysql 手册:
服务器收到语句后应该翻译成什么字符集?
为此,服务器使用 character_set_connection 和 collation_connection 系统变量.它将客户端发送的语句从 character_set_client 转换为 character_set_connection(具有介绍人的字符串文字除外,例如 _latin1 或 _utf8).collation_connection 对于文字字符串的比较很重要.对于字符串与列值的比较,collation_connection 无关紧要,因为列有自己的排序规则,具有更高的排序规则优先级.
For this, the server uses the character_set_connection and collation_connection system variables. It converts statements sent by the client from character_set_client to character_set_connection (except for string literals that have an introducer such as _latin1 or _utf8). collation_connection is important for comparisons of literal strings. For comparisons of strings with column values, collation_connection does not matter because columns have their own collation, which has a higher collation precedence.
或者在返回的路上:来自商店的 Latin1 数据将被转换为 UTF-8,因为客户端告诉服务器它更喜欢使用 UTF-8 进行传输.
Or on the way back: Latin1 data from the store will get converted into UTF-8 because the client told the server that it prefers UTF-8 for the transportation.
您命名的 PDO 本身的标识符看起来完全不同:
The identifier for PDO itself you name looks like being something entirely different:
PDO::PARAM_LOB 告诉 PDO 将数据映射为流,以便您可以使用 PHP Streams API 对其进行操作.(参考)
PDO::PARAM_LOB tells PDO to map the data as a stream, so that you can manipulate it using the PHP Streams API. (Ref)
我不是 MySQL 专家,但我会这样解释.客户端和服务器需要协商他们使用的字符集,我认为他们这样做是有原因的.
I'm no MySQL expert but I would explain it this way. Client and server need to negotiate which charsets they are using and I assume they do this for a reason.
相关文章