Non-Latin characters & oh dear

2021-12-21 00:00:00 utf-8 character-encoding php mysql cakephp

I'm getting to know Cake PHP, which has unearthed a general question about best practice in terms of PHP / MySQL character set stuff, which I'm hoping can be answered here.

My (practice) system contains a mysql table of movies. This list was sourced from an Excel sheet, which was exported as CSV, and imported via phpMyAdmin.

I noticed that titles with more "exotic" glyphs have issues rendering in the browser, e.g. the é in Amélie. Using Cake or plain PHP, it renders as a ?, unless transformed via htmlentities into &eacute;. Links with the special characters don't render at all.

If I use my Cake input form to enter an <alt>0233, this is rendered correctly in source, but as &Atilde;&copy; via htmlentities.

After a quick SO search, I decided maybe UTF-8 would fix stuff, hence I

  • changed the PHP source, and CSV file encoding to UTF-8
  • made sure the <meta> stuff was there (it was already via Cake's default layout).
  • made sure my browsers think the doc is UTF-8 (they do)
  • changed the collation on the MySQL DB to utf8_general_ci (as an educated stab from the available UTF-8 options)
  • deleted and reimported my data

However, I'm still stuck. I note that phpMyAdmin manages to render the characters "correctly" in its HTML source when browsing records.

I sense that document encoding is to blame; however, I'm wondering if someone can provide the best answer to:

  • what's the best way to move my data from Excel to MySQL to preserve glyphs?
  • what are the optimum settings for my tables to accommodate this?
  • I'd prefer to use UTF-8 to natively display the likes of é. What can I do in Cake to avoid making loads of calls to the likes of htmlentities? That is, is there a configuration setting, or a way to set things up, that makes this friendlier and lets native Cake helpers like Html->link work?

Some code, just in case:

Movies controller excerpt

function index() {
        $this->set('movies' , $this->Movie->find('all'));

}

index.ctp view excerpt

<?php foreach ($movies as $movie): ?>
<tr>
    <td><?php echo $movie['Movie']['id']; ?></td>
    <td><?php echo htmlentities($movie['Movie']['title']); ?></td>
    <td><?php echo $this->Html->link($movie['Movie']['title'] , 
    array('controller' => 'movies' , 'action' => 'view' , $movie['Movie']['id'])); ?>
    </td>

    <td><?php echo $this->Html->link("Edit", 
    array('action' => 'edit' , $movie['Movie']['id'])); ?>
    </td>

    <td>
    <?php echo $this->Html->link('Delete', array('action' => 'delete', $movie['Movie']['id']), null, 'Are you sure?')?>
    </td>

</tr>
<?php endforeach; ?>

Thanks in advance for any help / tips.

Recommended answer

Make sure the MySQL connection is set to UTF-8 while importing the data. The collation is only used for sorting and comparison, not for saving data.
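
For the application side, CakePHP can request the connection charset through its database configuration, so the pages read the data with the same encoding it was imported with. A minimal sketch, assuming a CakePHP 1.x-style app/config/database.php; the host, login, password and database name are placeholders, and the file layout differs in later Cake versions:

```php
<?php
// app/config/database.php (CakePHP 1.x layout assumed; credentials are placeholders)
class DATABASE_CONFIG {
    var $default = array(
        'driver'     => 'mysql',
        'persistent' => false,
        'host'       => 'localhost',
        'login'      => 'user',       // placeholder
        'password'   => 'password',   // placeholder
        'database'   => 'movies_db',  // hypothetical database name
        'prefix'     => '',
        // Asks Cake to use this charset for the MySQL connection.
        'encoding'   => 'utf8',
    );
}
```

With this in place the connection charset no longer depends on the server default, which is a common reason for data looking right in phpMyAdmin but wrong in the application.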

You can set the charset of the connection using SET NAMES 'utf8'; at the beginning of your SQL file. Note that MySQL's name for the charset is utf8 (or utf8mb4 for full four-byte UTF-8), not utf-8.
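
The &Atilde;&copy; output described in the question is what appears when the two UTF-8 bytes of é are treated as two ISO-8859-1 characters. The sketch below illustrates that with plain PHP by passing the charset to htmlentities explicitly; the variable names are made up for the illustration, and whether your PHP version defaults to ISO-8859-1 or UTF-8 is something to check rather than assume:

```php
<?php
// "é" encoded as UTF-8 is the two bytes 0xC3 0xA9.
$title = "Am\xC3\xA9lie";

// Wrong charset: each byte is treated as a separate Latin-1 character.
$wrong = htmlentities($title, ENT_QUOTES, 'ISO-8859-1');

// Correct charset: the two bytes are decoded as the single character é.
$right = htmlentities($title, ENT_QUOTES, 'UTF-8');

// On a UTF-8 page only the HTML metacharacters need escaping,
// so the é can be sent to the browser as-is.
$plain = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');

echo $wrong, "\n"; // Am&Atilde;&copy;lie
echo $right, "\n"; // Am&eacute;lie
echo $plain, "\n"; // Amélie (unchanged UTF-8 bytes)
```

If the page, the connection and the stored data are all UTF-8, the third form is usually enough, which avoids scattering htmlentities calls through the views.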
