Reading integers from a binary file in Python

2022-01-09 00:00:00 python file binary integer

Problem description

I'm trying to read a BMP file in Python. I know the first two bytes are the BMP signature and the next 4 bytes are the file size. When I execute:

fin = open("hi.bmp", "rb")
firm = fin.read(2)  
file_size = int(fin.read(4))  

I get:

ValueError: invalid literal for int() with base 10: 'F#\x13'

What I want is to read those four bytes as an integer, but it seems Python is reading them as characters and returning a string, which cannot be converted to an integer. How can I do this correctly?


Solution

The read method returns the raw bytes of the file (a bytes object in Python 3, a str in Python 2). int() tries to parse that value as decimal text, which is why it fails. To convert a raw byte sequence to a number, use the struct module from the standard library: http://docs.python.org/library/struct.html.

import struct

print(struct.unpack('i', fin.read(4)))

Note that unpack always returns a tuple, even for a single value, so struct.unpack('i', fin.read(4))[0] gives the integer you are after.
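As a minimal sketch of the tuple behaviour described above: the four bytes here are produced with struct.pack as a stand-in for fin.read(4), and the value 1254214 is an arbitrary illustration, not taken from the original file. The '<i' format is used so that the result does not depend on the platform.

```python
import struct

# Stand-in for fin.read(4): four bytes holding an arbitrary example value.
raw = struct.pack("<i", 1254214)

# unpack returns a tuple with one element per format code.
result = struct.unpack("<i", raw)
print(result)

# Two equivalent ways to get the bare integer out of the tuple.
value_by_index = result[0]
(value_by_unpacking,) = struct.unpack("<i", raw)
print(value_by_index, value_by_unpacking)
```

The trailing comma in (value_by_unpacking,) makes Python raise an error if the format ever yields more than one value, which some people prefer over indexing with [0].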

You should probably use the format string '<i'. The < modifier selects little-endian byte order with standard size and no alignment padding; without it, struct uses the platform's native byte order, size, and alignment. According to the BMP format spec, the bytes are written in Intel/little-endian byte order.
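A sketch of reading both header fields in one call, under a few assumptions: read_bmp_header is a hypothetical helper name, the sample data is built in memory rather than read from hi.bmp, and the size field is unpacked as unsigned ('I') because the BMP header defines it as an unsigned 32-bit value, whereas the answer above uses the signed 'i'.

```python
import io
import struct

def read_bmp_header(stream):
    """Return (signature, file_size) from the first 6 bytes of a binary stream.

    Hypothetical helper for illustration; it does not validate the signature.
    """
    header = stream.read(6)
    if len(header) != 6:
        raise ValueError("stream too short for a BMP header")
    # '<'  : little-endian, standard sizes, no padding
    # '2s' : two raw bytes (the signature)
    # 'I'  : unsigned 32-bit integer (the file size)
    signature, file_size = struct.unpack("<2sI", header)
    return signature, file_size

# In-memory stand-in for open("hi.bmp", "rb"): signature, size field, filler bytes.
sample = io.BytesIO(b"BM" + (70).to_bytes(4, "little") + b"\x00" * 64)
signature, file_size = read_bmp_header(sample)
print(signature, file_size)
```

With a real file, the same function can be called as read_bmp_header(fin) right after opening it in "rb" mode. For a single field, int.from_bytes(data, "little") (Python 3.2 and later) is an alternative that avoids the tuple.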
