How do I parse a data URI in Python?
Question
HTML image elements have this simplified format:
<img src='something'>
That something can be a data URI, for example:
...
Is there a standard way of parsing this with Python, so that I get the content_type and the base64 data separately, or should I write my own parser for this?
Solution
Split the data URI on the first comma to get the base64-encoded data without the header. Call base64.b64decode to decode that to bytes. Finally, write the bytes to a file.
from base64 import b64decode

data_uri = "..."

# Python 2 and Python < 3.4
header, encoded = data_uri.split(",", 1)
data = b64decode(encoded)

# Python 3.4+
# from urllib import request
# with request.urlopen(data_uri) as response:
#     data = response.read()

with open("image.png", "wb") as f:
    f.write(data)
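
The question also asks for the content type, which the snippet above leaves inside the header string. A minimal sketch of recovering it follows. The function names parse_data_uri and parse_data_uri_with_urlopen are hypothetical, and the sketch assumes a URI of the form data:[<mediatype>][;base64],<data>, falling back to text/plain when the media type is omitted. Media type parameters such as charset are dropped, not returned.

```python
from base64 import b64decode
from urllib.parse import unquote_to_bytes
from urllib.request import urlopen


def parse_data_uri(data_uri):
    """Split a data URI into (content_type, payload_bytes) by hand."""
    if not data_uri.startswith("data:"):
        raise ValueError("not a data URI")
    header, sep, payload = data_uri[len("data:"):].partition(",")
    if not sep:
        raise ValueError("data URI has no comma separating header and data")
    # The header looks like "image/png;base64" or "text/plain;charset=utf-8".
    parts = header.split(";")
    content_type = parts[0] or "text/plain"  # assumed default when omitted
    if "base64" in parts[1:]:
        data = b64decode(payload)
    else:
        # Without ";base64" the payload is percent-encoded text.
        data = unquote_to_bytes(payload)
    return content_type, data


def parse_data_uri_with_urlopen(data_uri):
    """Same result on Python 3.4+, letting urllib handle the data: scheme."""
    with urlopen(data_uri) as response:
        return response.headers.get_content_type(), response.read()


content_type, data = parse_data_uri("data:text/plain;base64,SGVsbG8=")
print(content_type, data)
```

The hand-written version works on any Python 3 release and makes the header handling explicit, while the urlopen version is shorter but relies on the data: URL support added in Python 3.4.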