python 3.2 UnicodeEncodeError: 'charmap' 编解码器无法对位置 9629 的字符 'u2013' 进行编码:字符映射到 <undefined>

2021-12-02 00:00:00 python python-3.x sqlite

我正在尝试制作一个从 sqlite3 数据库中获取数据的脚本,但我遇到了一个问题.

数据库中的字段是文本类型,包含一个 html 格式的文本.看下面的文字

<头><title>Yahoo!</title><身体><style type="text/css">html {}.yshortcuts {border-bottom:none !important;}.ReadMsgBody {width:100%;}.ExternalClass{width:100%;}</风格><table cellpadding="0" cellspacing="0" bgcolor="#ffffff"><tr><td width="550" valign="top" align="left"><table cellpadding="0" cellspacing="0" width="500"><tr><td colspan="3"><img src="http://mail.yimg.com/nq/assets/sharedmessages/v1/us/logo.gif" width="292" height="51"样式=显示:块;"border="0" alt="雅虎邮件"></td></tr><tr><td rowspan="3" width="1" bgcolor="#c7c4ca"></td><td width="498" height="1" bgcolor="#c7c4ca"></td><td rowspan="3" width="1" bgcolor="#c7c4ca"></td></tr><tr><td width="498" valign="top" align="left"><table cellpadding="0" cellspacing="0"><tr><td width="498" bgcolor="#61399d" align="left" valign="top"><table cellpacing="0" cellpadding="0"><tr><td height="24"></td></tr></table><div style="font-family:Arial, Helvetica, sans-serif;font-size:23px;line-height:27px;margin-bottom:10px;color:#ffffff;margin-left:15px;"><span style="color:#ffffff;text-decoration:none;font-weight:bold;line-height:27px;">Välkommen 直到 Yahoo!邮件.</span></div><div style="font-family:Arial, Helvetica, sans-serif;font-size:22px;line-height:26px;margin-bottom:1px;color:#ffffff;margin-left:15px;margin-bottom:7px;margin-right:15px;">Ansluta och dela går snabbt och enkelt och ärtilgängligt överallt.</div></td></tr><tr><td><img src="http://mail.yimg.com/nq/assets/sharedmessages/v1/all/b1.gif" width="498" height="18" style="display:block;"border="0"></td></tr><table cellpadding="0" cellspacing="0" width="498"><tr><td width="292" valign="top"><table cellpadding="0" cellspacing="0"><tr><td><img src="http://mail.yimg.com/nq/assets/sharedmessages/v1/all/grad.gif" width="292" height="9" style="display:block;"></td></tr><tr><td width="292" bgcolor="#ffffff" align="left" valign="top"><table cellpacing="0" cellpadding="0"><tr><td height="11"></td></tr></table><div style="margin-left:15px;"><div style="font-family:Arial, Helvetica, sans-serif;font-size:14px;line-height:18px;color:#333333;margin-bottom:11px;font-weight:bold;">Det är lätt som en plätt at komma igång.</div><table cellpadding="0" cellspacing="0" width="267"><tr><td width="16" align="left" valign="top"><div style="font-family:Arial, Helvetica, sans-serif;font-size:14px;line-height:16px;color:#61399d;margin-bottom:9px;font-weight:bold;">1.</div></td><td align="left" valign="top"><div style="font-family:Arial, Helvetica, sans-serif;font-size:13px;line-height:16px;color:#61399d;margin-bottom:9px;"><a rel="nofollow" target="_blank" href="http://us-mg999.mail.yahoo.com/neo/launch?action=contacts" style="text-decoration:underline;color:#61399d;"><span>Lägg until alla dina kontakter på en plats</span></a>.</div></td></tr><tr><td align="left" valign="top"><div style="font-family:Arial, Helvetica, sans-serif;font-size:14px;line-height:16px;color:#61399d;margin-bottom:9px;font-weight:bold;">2.</div></td><td align="left" valign="top"><div style="font-family:Arial, Helvetica, sans-serif;font-size:13px;line-height:16px;color:#61399d;margin-bottom:9px;"><a rel="nofollow" target="_blank" href="http://mrd.mail.yahoo.com/themes" style="text-decoration:underline;color:#61399d;"><span>Anpassa dinnyainkorg</span></a>.</div></td></tr><tr><td align="left" valign="top"><div style="font-family:Arial, Helvetica, sans-serif;font-size:14px;line-height:16px;color:#61399d;margin-bottom:9px;font-weight:bold;">3.</div></td><td align="left" valign="top"><div style="font-family:Arial, Helvetica, sans-serif;font-size:13px;line-height:16px;color:#61399d;"><a rel="nofollow" target="_blank" href="http://se.overview.mail.yahoo.com/mobile" style="text-decoration:underline;color:#61399d;"><span>Anslut överallt på dina mobila enheter</span></a>.</div></td></tr>

</td></tr><tr><td height="13"></td></tr></td><td width="196" valign="top"><table cellpadding="0" cellspacing="0"><tr><td width="1" bgcolor="#fbfbfd" valign="top"><img src="http://mail.yimg.com/nq/assets/sharedmessages/v1/all/g1.gif" width="1" height="21" style="display:block;"></td><td width="1" bgcolor="#f5f6fa" valign="top"><img src="http://mail.yimg.com/nq/assets/sharedmessages/v1/all/g2.gif" width="1" height="21" style="display:block;"></td><td width="1" bgcolor="#e8eaf1" valign="top"><img src="http://mail.yimg.com/nq/assets/sharedmessages/v1/all/g3.gif" width="1" height="21" style="display:block;"></td><td width="1" bgcolor="#d4d4d4"></td><td width="186" bgcolor="#f0f0f0" align="left" valign="top"><table cellpacing="0" cellpadding="0"><tr><td height="3"></td></tr></table><div style="margin-left:11px;"><div style="font-family:Arial, Helvetica, sans-serif;font-size:13px;line-height:16px;color:#333333;margin-bottom:9px;"><b>信息挖:</b></div><div style="font-family:Arial, Helvetica, sans-serif;font-size:12px;color:#43494e;line-height:18px;margin-bottom:10px;">Yahoo!-ID 和电子邮政地址:<br/><div style="font-family:Arial, Helvetica, sans-serif;font-size:12px;color:#43494e;line-height:18px;">Håll ditt konto och inställningar aktuella.<br><a rel="nofollow" target="_blank" href="https://edit.yahoo.com/config/eval_profile" style="text-decoration:underline;color:#61399d;"><span>Mitt konto</span></a>

<table cellpacing="0" cellpadding="0"><tr><td height="20"></td></tr></table></td><td width="1" bgcolor="#dbdbdb"></td><td width="1" bgcolor="#ced2de"></td><td width="1" bgcolor="#dbdfed"></td><td width="1" bgcolor="#e8ebf3"></td><td width="1" bgcolor="#f3f4f9"></td><td width="1" bgcolor="#fafbfc"></td></tr><tr><td colspan="11"><img src="http://mail.yimg.com/nq/assets/sharedmessages/v1/all/b2.gif" width="196" height="8"样式=显示:块;"border="0"></td></tr><tr><td height="13"></td></tr></td><td width="10"></td></tr></td></tr><tr><td width="498" height="1" bgcolor="#c7c4ca"></td></tr><table cellpadding="0" cellspacing="0" width="500"><tr><td align="center" valign="top"><table cellspacing="0" cellpadding="0"><tr><td height="10"></td></tr></table><div style="font-family:Arial, Helvetica, sans-serif;font-size:11px;line-height:18px;margin-bottom:10px;"><a rel="nofollow" target="_blank" href="http://info.yahoo.com/legal/se/yahoo/utos.html" style="text-decoration:underline;color:#61399d;>雅虎!Villkor for användning</a>&nbsp;&nbsp;|&nbsp;&nbsp;<a rel="nofollow" target="_blank" href="http://info.yahoo.com/legal/se/yahoo/mail/atos.html" style="text-decoration:underline;color:#61399d;">Yahoo!邮件 – Villkor for användning</a>&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;<a rel="nofollow" target="_blank" href="http://info.yahoo.com/privacy/se/yahoo/details.html" style="text-decoration:underline;color:#61399d;">Yahoo!Sekretesspolicy</a>

</td></tr><tr><td align="left" valign="top"><div style="font-family:Arial, Helvetica, sans-serif;font-size:11px;line-height:14px;color:#545454;margin-left:16px;margin-right:14px;">Var 神 svara inte på detta meddelande.Detta är ett servicemeddelande som rör din användning av Yahoo!邮件.Om du vill veta mer om Yahoo!s användning av personlig 信息,inklusive användning av webb-beacons i HTML-baserad e-post,kan du läsa vår Yahoo!秘密政策.雅虎地址 är 701 First Avenue, Sunnyvale, CA 94089, USA.</td></tr></td></tr><img width="1" height="1" src="http://pclick.internal.yahoo.com/p/s=2143684696"></html>`

尝试提取数据的python代码如下.

<预><代码>>>>导入 sqlite3>>>conn = sqlite3.connect('C:/temp/Mobils/export/com.yahoo.mobile.client.android.mail/databases/mail.db')>>>c = conn.cursor()>>>conn.row_factory=sqlite3.Row>>>c.execute('select body from messages_1 where _id=7')<sqlite3.Cursor 对象在 0x0000000001FB78F0>>>>r = c.fetchone()>>>r.keys()['身体']>>>打印(r ['body'])回溯(最近一次调用最后一次):文件<stdin>",第 1 行,在 <module> 中文件C:Python32libencodingscp850.py",第 19 行,编码返回 codecs.charmap_encode(input,self.errors,encoding_map)[0]UnicodeEncodeError: 'charmap' codec can't encode character 'u2013' in position 9629: character maps to <undefined>>>>

有没有人知道如何将其打印/写入文件.是的,我知道这是打印到标准输出的,但是当我尝试写入文件时,我得到了相同的 UnicodeEncodeError.我尝试了文件对象的 write 方法和 print(r['body'], file=f).

解决方案

当您打开要写入的文件时,请使用可以处理所有字符的特定编码打开它.

 with open('filename', 'w', encoding='utf-8') as f:打印(r['body'],文件=f)

I'm trying to make a script that gets data out from an sqlite3 database, but I have run in to a problem.

The field in the database is of type text and the contains a html formated text. see the text below

<html>
<head>
<title>Yahoo!</title>
</head>
<body>
<style type="text/css">
html {}
.yshortcuts {border-bottom:none !important;}
.ReadMsgBody {width:100%;}
.ExternalClass{width:100%;}
</style>
<table cellpadding="0" cellspacing="0" bgcolor="#ffffff">    
<tr>
<td width="550" valign="top" align="left">

    <table cellpadding="0" cellspacing="0" width="500">
        <tr>
            <td colspan="3"><img        src="http://mail.yimg.com/nq/assets/sharedmessages/v1/us/logo.gif" width="292" height="51" style="display:block;" border="0" alt="Yahoo! Mail"></td>
        </tr>
        <tr>
            <td rowspan="3" width="1" bgcolor="#c7c4ca"></td>
            <td width="498" height="1" bgcolor="#c7c4ca"></td>
            <td rowspan="3" width="1" bgcolor="#c7c4ca"></td>
        </tr>
        <tr>
            <td width="498" valign="top" align="left">
            <table cellpadding="0" cellspacing="0">
                <tr>
                    <td width="498" bgcolor="#61399d" align="left" valign="top">
                    <table cellspacing="0" cellpadding="0"><tr><td height="24"></td></tr></table>
                    <div style="font-family:Arial, Helvetica, sans-serif;font-size:23px;line-height:27px;margin-bottom:10px;color:#ffffff;margin-left:15px;"><span style="color:#ffffff;text-decoration:none;font-weight:bold;line-height:27px;">Välkommen till Yahoo! Mail.</span></div>
                    <div style="font-family:Arial, Helvetica, sans-serif;font-size:22px;line-height:26px;margin-bottom:1px;color:#ffffff;margin-left:15px;margin-bottom:7px;margin-right:15px;">Ansluta och dela går snabbt och enkelt och är tillgängligt överallt.</div>
                    </td>
                </tr>
                <tr>
                    <td><img src="http://mail.yimg.com/nq/assets/sharedmessages/v1/all/b1.gif" width="498" height="18" style="display:block;" border="0"></td>
                </tr>
            </table>
            <table cellpadding="0" cellspacing="0" width="498">
                <tr>
                    <td width="292" valign="top">
                    <table cellpadding="0" cellspacing="0">
                        <tr>
                            <td><img src="http://mail.yimg.com/nq/assets/sharedmessages/v1/all/grad.gif" width="292" height="9" style="display:block;"></td>
                        </tr>
                        <tr>
                            <td width="292" bgcolor="#ffffff" align="left" valign="top">
                            <table cellspacing="0" cellpadding="0"><tr><td height="11"></td></tr></table>
                            <div style="margin-left:15px;">                  
                                <div style="font-family:Arial, Helvetica, sans-serif;font-size:14px;line-height:18px;color:#333333;margin-bottom:11px;font-weight:bold;">Det är lätt som en plätt att komma igång.</div>
                                <table cellpadding="0" cellspacing="0" width="267">
                                    <tr>
                                        <td width="16" align="left" valign="top"><div style="font-family:Arial, Helvetica, sans-serif;font-size:14px;line-height:16px;color:#61399d;margin-bottom:9px;font-weight:bold;">1. </div></td>
                                        <td align="left" valign="top"><div style="font-family:Arial, Helvetica, sans-serif;font-size:13px;line-height:16px;color:#61399d;margin-bottom:9px;"><a rel="nofollow" target="_blank" href="http://us-mg999.mail.yahoo.com/neo/launch?action=contacts" style="text-decoration:underline;color:#61399d;"><span>Lägg till alla dina kontakter på en plats</span></a>.</div></td>
                                    </tr>
                                    <tr>
                                        <td align="left" valign="top"><div style="font-family:Arial, Helvetica, sans-serif;font-size:14px;line-height:16px;color:#61399d;margin-bottom:9px;font-weight:bold;">2. </div></td>
                                        <td align="left" valign="top"><div style="font-family:Arial, Helvetica, sans-serif;font-size:13px;line-height:16px;color:#61399d;margin-bottom:9px;"><a rel="nofollow" target="_blank" href="http://mrd.mail.yahoo.com/themes" style="text-decoration:underline;color:#61399d;"><span>Anpassa din nya inkorg</span></a>.</div></td>
                                    </tr>
                                    <tr>
                                        <td align="left" valign="top"><div style="font-family:Arial, Helvetica, sans-serif;font-size:14px;line-height:16px;color:#61399d;margin-bottom:9px;font-weight:bold;">3. </div></td>
                                        <td align="left" valign="top"><div style="font-family:Arial, Helvetica, sans-serif;font-size:13px;line-height:16px;color:#61399d;"><a rel="nofollow" target="_blank" href="http://se.overview.mail.yahoo.com/mobile" style="text-decoration:underline;color:#61399d;"><span>Anslut överallt på dina mobila enheter</span></a>.</div></td>
                                    </tr>
                                </table>

                            </div>
                            </td>
                        </tr>
                        <tr><td height="13"></td></tr>
                    </table>
                    </td>
                    <td width="196" valign="top">
                    <table cellpadding="0" cellspacing="0">
                        <tr>
                            <td width="1" bgcolor="#fbfbfd" valign="top"><img src="http://mail.yimg.com/nq/assets/sharedmessages/v1/all/g1.gif" width="1" height="21" style="display:block;"></td>
                            <td width="1" bgcolor="#f5f6fa" valign="top"><img src="http://mail.yimg.com/nq/assets/sharedmessages/v1/all/g2.gif" width="1" height="21" style="display:block;"></td>
                            <td width="1" bgcolor="#e8eaf1" valign="top"><img src="http://mail.yimg.com/nq/assets/sharedmessages/v1/all/g3.gif" width="1" height="21" style="display:block;"></td>
                            <td width="1" bgcolor="#d4d4d4"></td>
                            <td width="186" bgcolor="#f0f0f0" align="left" valign="top">  
                            <table cellspacing="0" cellpadding="0"><tr><td height="3">   </td></tr></table>
                            <div style="margin-left:11px;">
                            <div style="font-family:Arial, Helvetica, sans-serif;font-size:13px;line-height:16px;color:#333333;margin-bottom:9px;"><b>Info för dig:</b></div>
                            <div style="font-family:Arial, Helvetica, sans-serif;font-size:12px;color:#43494e;line-height:18px;margin-bottom:10px;">
                            Yahoo!-ID och e-postadress:<br />
                            <div style="font-family:Arial, Helvetica, sans-serif;font-size:12px;color:#43494e;line-height:18px;">
                            Håll ditt konto och inställningar aktuella. <br><a rel="nofollow" target="_blank" href="https://edit.yahoo.com/config/eval_profile" style="text-decoration:underline;color:#61399d;"><span>Mitt konto</span></a> 
                            </div>
                            </div>
                            <table cellspacing="0" cellpadding="0"><tr><td height="20"></td></tr></table>
                            </td>
                            <td width="1" bgcolor="#dbdbdb"></td>
                            <td width="1" bgcolor="#ced2de"></td>
                            <td width="1" bgcolor="#dbdfed"></td>
                            <td width="1" bgcolor="#e8ebf3"></td>
                            <td width="1" bgcolor="#f3f4f9"></td>
                            <td width="1" bgcolor="#fafbfc"></td>
                        </tr>
                        <tr>
                            <td colspan="11"><img src="http://mail.yimg.com/nq/assets/sharedmessages/v1/all/b2.gif" width="196" height="8" style="display:block;" border="0"></td>
                        </tr>
                        <tr><td height="13"></td></tr>
                    </table>
                    </td>
                    <td width="10"></td>
                </tr>
            </table>
            </td>
        </tr>
        <tr>
            <td width="498" height="1" bgcolor="#c7c4ca"></td>
        </tr>
    </table>
    <table cellpadding="0" cellspacing="0" width="500">
        <tr>
            <td align="center" valign="top">
            <table cellspacing="0" cellpadding="0"><tr><td height="10"></td></tr></table>
                <div style="font-family:Arial, Helvetica, sans-serif;font-size:11px;line-height:18px;margin-bottom:10px;">
                <a rel="nofollow" target="_blank" href="http://info.yahoo.com/legal/se/yahoo/utos.html" style="text-decoration:underline;color:#61399d;">Yahoo! Villkor för användning</a>&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;<a rel="nofollow" target="_blank" href="http://info.yahoo.com/legal/se/yahoo/mail/atos.html" style="text-decoration:underline;color:#61399d;">Yahoo! Mail –Villkor för användning</a>&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;&nbsp;<a rel="nofollow" target="_blank" href="http://info.yahoo.com/privacy/se/yahoo/details.html" style="text-decoration:underline;color:#61399d;">Yahoo! Sekretesspolicy</a>
                </div>
            </td>
        </tr>
        <tr>
            <td align="left" valign="top">
                <div style="font-family:Arial, Helvetica, sans-serif;font-size:11px;line-height:14px;color:#545454;margin-left:16px;margin-right:14px;">Var god svara inte på detta meddelande. Detta är ett servicemeddelande som rör din användning av Yahoo! Mail. Om du vill veta mer om Yahoo!s användning av personlig information, inklusive användning av webb-beacons i HTML-baserad e-post, kan du läsa vår Yahoo! Sekretesspolicy. Yahoo!s adress är 701 First Avenue, Sunnyvale, CA 94089, USA.<br /><br />RefID: lp-1037111</div>
            </td>
        </tr>
    </table>





    </td>
</tr>
</table>
<img width="1" height="1" src="http://pclick.internal.yahoo.com/p/s=2143684696">
</body>
</html>`

and the python code that try to extract the data is as follows.

>>> import sqlite3
>>> conn = sqlite3.connect('C:/temp/Mobils/export/com.yahoo.mobile.client.android.mail/databases/mail.db')
>>> c = conn.cursor()
>>> conn.row_factory=sqlite3.Row
>>> c.execute('select body from messages_1 where _id=7')
<sqlite3.Cursor object at 0x0000000001FB78F0>
>>> r = c.fetchone()
>>> r.keys()
['body']
>>> print(r['body'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:Python32libencodingscp850.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character 'u2013' in position 9629: character maps to <undefined>
>>>

Does anybody have any idea of how to print/write this to a file. Yes I know that this is printed to stdout, but I get the same UnicodeEncodeError when I try to write to a file. I tried both write method of a file object and print(r['body'], file=f).

解决方案

When you open the file you want to write to, open it with a specific encoding that can handle all the characters.

with open('filename', 'w', encoding='utf-8') as f:
    print(r['body'], file=f)

相关文章