How do I .decode('string-escape') in Python 3?

2022-01-31 00:00:00 python python-3.x escaping

Question

I have some escaped strings that need to be unescaped. I'd like to do this in Python.

For example, in Python 2.7 I can do this:

>>> "\123omething special".decode('string-escape')
'Something special'
>>> 

How do I do it in Python 3? This doesn't work:

>>> b"\123omething special".decode('string-escape')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: unknown encoding: string-escape
>>> 

My goal is to be able to take a string like this:

s\000u\000p\000p\000o\000r\000t\000@\000p\000s\000i\000l\000o\000c\000.\000c\000o\000m\000

and turn it into:

"support@psiloc.com"

After I do the conversion, I'll probe to see whether the string I have is encoded in UTF-8 or UTF-16.


Solution

If you want str-to-str decoding of escape sequences, so that both input and output are Unicode:

def string_escape(s, encoding='utf-8'):
    return (s.encode('latin1')         # To bytes, required by 'unicode-escape'
             .decode('unicode-escape') # Perform the actual octal-escaping decode
             .encode('latin1')         # 1:1 mapping back to bytes
             .decode(encoding))        # Decode original encoding
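
To make the four chained calls easier to follow, here is a small walk-through that stores each intermediate value. The sample input `r'h\000i\000'` and the names `raw` and `step1` to `step4` are made up for illustration; the codec calls are the same ones used in `string_escape` above.

```python
# Walk through each stage of the chain for a short UTF-16-LE example.
raw = r'h\000i\000'                      # 10 characters: literal backslashes and digits

step1 = raw.encode('latin1')             # str -> bytes, one byte per character
step2 = step1.decode('unicode_escape')   # interprets each \000 as U+0000
step3 = step2.encode('latin1')           # code points 0-255 back to the same byte values
step4 = step3.decode('utf-16-le')        # decode the recovered bytes

print(step1)        # b'h\\000i\\000'
print(repr(step2))  # 'h\x00i\x00'
print(step3)        # b'h\x00i\x00'
print(step4)        # hi
```

One caveat worth keeping in mind: both `encode('latin1')` steps raise `UnicodeEncodeError` if the string contains a character above U+00FF, so this approach assumes the escaped input is limited to that range.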

Testing:

>>> string_escape(r'\123omething special')
'Something special'

>>> string_escape(r's\000u\000p\000p\000o\000r\000t\000@'
                  r'\000p\000s\000i\000l\000o\000c\000.\000c\000o\000m\000',
                  'utf-16-le')
'support@psiloc.com'
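
If the escaped data arrives as bytes and the encoding is not known in advance, as in the question, a bytes-to-bytes variant followed by a rough probe might look like the sketch below. Both `unescape_bytes` and `guess_decode` are hypothetical helpers, not part of the original answer, and the NUL-byte check is only a heuristic that assumes ASCII-range text; it is not a general UTF-16 detector.

```python
def unescape_bytes(data: bytes) -> bytes:
    """Hypothetical helper: expand backslash escapes in bytes, returning bytes."""
    # unicode_escape decodes bytes to str; latin1 maps code points 0-255 back to bytes.
    return data.decode('unicode_escape').encode('latin1')


def guess_decode(data: bytes) -> str:
    """Hypothetical heuristic: pick UTF-16-LE when every second byte is NUL, else UTF-8."""
    if len(data) >= 2 and len(data) % 2 == 0 and all(b == 0 for b in data[1::2]):
        return data.decode('utf-16-le')
    return data.decode('utf-8')


print(guess_decode(unescape_bytes(rb'h\000i\000')))    # hi
print(guess_decode(unescape_bytes(rb'\123omething')))  # Something
```

Keeping the unescaping step in bytes defers the choice of text encoding until after the raw byte values have been recovered, which is where the probing has to happen anyway.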
