Reading and Writing Binary Files in Python

2023-05-15 17:05:21 python binary file read/write

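1. Introduction

Reading and writing binary data in files from Python is usually done with the struct module, which converts between C/C++ data layouts and Python values.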

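2. Overview of the struct module

The most commonly used functions in the struct module are pack and unpack. The main functions are used as follows: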

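Function	Returns	Description
pack(fmt, v1, v2, ...)	bytes	Packs the values according to the format string fmt and returns the result as a bytes object.
pack_into(fmt, buffer, offset, v1, v2, ...)	None	Packs the values according to fmt and writes the resulting bytes into buffer starting at position offset. The buffer must be writable, for example a bytearray or an array from the array module.
unpack(fmt, buffer)	tuple	Unpacks the bytes in buffer according to fmt and returns the result as a tuple. The buffer must be exactly calcsize(fmt) bytes long.
unpack_from(fmt, buffer, offset=0)	tuple	Unpacks from buffer starting at position offset according to fmt and returns the result as a tuple.
calcsize(fmt)	int	Returns the number of bytes that the format fmt occupies. The result depends on the alignment mode in use.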
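The functions in the table can be tried out together in a few lines. The following is a minimal sketch; the format string "<Hf" and the sample values are chosen only for illustration.

```python
import struct

# Little-endian, unpadded: one 16-bit unsigned integer and one 32-bit float.
fmt = "<Hf"
size = struct.calcsize(fmt)               # 2 + 4 bytes
packed = struct.pack(fmt, 7, 1.5)

# unpack needs a buffer of exactly calcsize(fmt) bytes and always returns a tuple.
count, ratio = struct.unpack(fmt, packed)

# pack_into writes into an existing writable buffer at a byte offset.
buffer = bytearray(10)
struct.pack_into(fmt, buffer, 4, 7, 1.5)

# unpack_from reads from that offset without slicing the buffer first.
count2, ratio2 = struct.unpack_from(fmt, buffer, 4)

print(size, packed, (count, ratio), (count2, ratio2))
```

Note that unpack returns a tuple even when the format describes a single value, which is why single-field reads are usually followed by [0].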

3. struct format characters and the corresponding C/C++ and Python types

Format	C type	Python type	Standard size
x	pad byte	no value
c	char	bytes of length 1	1
b	signed char	integer	1
B	unsigned char	integer	1
?	_Bool	bool	1
h	short	integer	2
H	unsigned short	integer	2
i	int	integer	4
I	unsigned int	integer	4
l	long	integer	4
L	unsigned long	integer	4
q	long long	integer	8
Q	unsigned long long	integer	8
f	float	float	4
d	double	float	8
s	char[]	bytes
p	char[]	bytes
P	void *	integer
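The "Standard size" column applies only when the format string starts with <, >, = or !. In the default native mode (@), sizes and padding follow the platform's C compiler, so for example l may occupy 8 bytes instead of 4. A short sketch that checks a few sizes with calcsize (the particular formats are illustrative):

```python
import struct

# Standard sizes are fixed and no padding is inserted between fields.
char_int = struct.calcsize("<ci")    # 1 + 4 = 5
long_size = struct.calcsize("<l")    # 4 in standard mode
print(char_int, long_size)

# Native mode uses the C compiler's sizes and alignment, so this value is
# platform dependent and may include padding bytes.
native_size = struct.calcsize("@ci")
print(native_size)

# A repeat count multiplies numeric codes, but for "s" it is the byte length
# of a single field.
three_shorts = struct.unpack("<3h", bytes(6))   # three values
one_field = struct.unpack("<3s", b"abc")        # one 3-byte value
print(three_shorts, one_field)
```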

4. Example

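Note: in the format strings below, < means little-endian and > means big-endian. Both prefixes also select standard sizes with no alignment padding.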
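To make the difference between the two byte orders concrete, here is a small sketch that packs the same 32-bit value both ways; the value 0x01020304 is picked only because its bytes are easy to tell apart.

```python
import struct

value = 0x01020304

little = struct.pack("<I", value)   # least significant byte first
big = struct.pack(">I", value)      # most significant byte first

print(little.hex())  # 04030201
print(big.hex())     # 01020304

# Reading with the wrong byte order does not raise an error,
# it silently yields a different number.
misread = struct.unpack(">I", little)[0]
print(hex(misread))  # 0x4030201
```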

import struct

# Open the file for writing in binary mode
with open("binary_file.bin", "wb") as f:

    # Write a 4-byte signed integer (value 12345)
    int_value = 12345
    f.write(struct.pack("<i", int_value))

    # Write an 8-byte double-precision float (value 3.14159)
    double_value = 3.14159
    f.write(struct.pack("<d", double_value))

    # Write a 1-byte boolean (value True)
    bool_value = True
    f.write(struct.pack("<?", bool_value))

    # Write a fixed-length string (5 bytes, value "hello" encoded as UTF-8)
    string_value = "hello".encode("utf-8")
    f.write(struct.pack("<5s", string_value))

    # Write a fixed-length byte array (20 bytes, value b"\x01\x02\x03...\x14")
    byte_array_value = bytes(range(1, 21))
    f.write(struct.pack("<20s", byte_array_value))

# Open the file for reading in binary mode
with open("binary_file.bin", "rb") as f:

    # Read 4 bytes and unpack them as a signed integer
    int_value = struct.unpack("<i", f.read(4))[0]

    # Read 8 bytes and unpack them as a double-precision float
    double_value = struct.unpack("<d", f.read(8))[0]

    # Read 1 byte and unpack it as a boolean
    bool_value = struct.unpack("<?", f.read(1))[0]

    # Read 5 bytes and decode them as a fixed-length UTF-8 string
    string_value = struct.unpack("<5s", f.read(5))[0].decode("utf-8")

    # Read 20 bytes as a fixed-length byte array
    byte_array_value = struct.unpack("<20s", f.read(20))[0]

    # Print the results
    print(f"int_value: {int_value}")
    print(f"double_value: {double_value}")
    print(f"bool_value: {bool_value}")
    print(f"string_value: {string_value}")
    print(f"byte_array_value: {byte_array_value}")
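When the fields always appear in the same order, they can also be packed and unpacked as one record with a precompiled struct.Struct object. The sketch below is one possible variant, not the only way to do it: the combined format "<id?5s20s" is an assumption derived from the fields above, and io.BytesIO stands in for a file opened in binary mode.

```python
import io
import struct

# Assumed record layout: int32, float64, bool, 5-byte string, 20-byte array
# (little-endian, no padding).
record = struct.Struct("<id?5s20s")

payload = record.pack(12345, 3.14159, True, b"hello", bytes(range(1, 21)))

# io.BytesIO behaves like a binary file object held in memory.
stream = io.BytesIO()
stream.write(payload)
stream.seek(0)

# Read exactly one record and guard against a truncated file.
data = stream.read(record.size)
if len(data) != record.size:
    raise ValueError("truncated record")

int_value, double_value, bool_value, raw_text, byte_array_value = record.unpack(data)
text = raw_text.decode("utf-8")
print(record.size, int_value, double_value, bool_value, text)
```

Checking the length of the data returned by read before unpacking gives a clearer error than the struct.error raised on a short buffer.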

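5. Meaning of the u, r, b and f prefixes on Python string literals

5.1. The u prefix

The u prefix marks the literal as a Unicode string. In Python 2 it was commonly placed before strings containing Chinese characters to avoid garbled text caused by the encoding of the source file. In Python 3 every str is already Unicode, so the prefix is accepted only for backward compatibility and has no effect.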

s = u'hello'

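5.2. The r prefix

The r prefix turns off backslash escape processing. (Escape sequences are a backslash followed by a character that together carry a special meaning; the most common are "\n" for a newline and "\t" for a tab.) In a raw string the backslash and the following character are kept exactly as written.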

s = r'hello\n\t\n'
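The effect is easiest to see by comparing the lengths of an ordinary literal and a raw literal with the same source text. A minimal sketch (the file path at the end is hypothetical):

```python
normal = 'hello\n\t\n'
raw = r'hello\n\t\n'

normal_len = len(normal)  # 8: five letters plus three control characters
raw_len = len(raw)        # 11: every backslash and letter is kept as written
print(normal_len, raw_len)

# A raw literal equals the ordinary literal with each backslash doubled.
same = raw == 'hello\\n\\t\\n'
print(same)

# Typical uses are Windows-style paths and regular expressions.
path = r'C:\new\table.bin'  # hypothetical path; \n and \t stay literal
print(path)
```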

5.3. The b prefix

The b prefix makes the literal a bytes object instead of a str.

data = b'hello'

In Python 3, bytes and str are converted into each other with the following methods:

str.encode('utf-8')    # str -> bytes
bytes.decode('utf-8')  # bytes -> str
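A short round trip shows why the distinction matters for binary files: the number of characters and the number of bytes differ as soon as the text contains non-ASCII characters. This is a minimal sketch with an arbitrary sample string.

```python
text = 'café'
data = text.encode('utf-8')        # str -> bytes

print(type(data).__name__)         # bytes
print(len(text), len(data))        # 4 characters, 5 bytes
print(data)                        # b'caf\xc3\xa9'

restored = data.decode('utf-8')    # bytes -> str
print(restored == text)

# Cutting a multi-byte character in half makes decoding fail.
try:
    data[:4].decode('utf-8')
    truncated_ok = True
except UnicodeDecodeError:
    truncated_ok = False
print(truncated_ok)
```

For this reason a fixed-length field such as "5s" should be sized in bytes after encoding, not in characters.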

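5.4. The f prefix

A literal that starts with f is a formatted string literal (f-string, available since Python 3.6). Python expressions inside curly braces are evaluated and inserted into the string, which is a convenient alternative to string concatenation.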

name = 'Lily'
print(f'My name is {name}.')
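The braces accept arbitrary expressions and an optional format specification after a colon. A few illustrative cases, with variable names and values made up for this sketch:

```python
name = 'Lily'
size = 38
ratio = 3.14159

greeting = f'My name is {name.upper()}.'      # method call inside the braces
bits = f'{size} bytes = {size * 8} bits'      # arithmetic inside the braces
rounded = f'{ratio:.2f}'                      # two decimal places
hex_size = f'{size:#06x}'                     # zero-padded hex with 0x prefix
braces = f'{{not a placeholder}}'             # doubled braces are literal

print(greeting)
print(bits)
print(rounded, hex_size, braces)
```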

