How does memory allocation work for a `dict` in Python?

2022-03-08 00:00:00 python dictionary hashmap size

Problem description

I noticed this while playing around with dictionaries.

import sys

Square1 = {}
Square2 = {}
Square3 = {}

for i in range(1, 8):
    Square1[i] = i**2

for i in range(1, 11):
    Square2[i] = i**2

for i in range(1, 12):
    Square3[i] = i**2


print(sys.getsizeof(Square1), len(Square1))
print(sys.getsizeof(Square2), len(Square2))
print(sys.getsizeof(Square3), len(Square3))

Output:

196 7
196 10
344 11

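The dictionaries of length 7 and length 10 have the same size, 196, but the dictionary of length 11 has a size of 344. Why are the first two the same? Why does the size jump at length 11? How does dictionary sizing work in Python?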

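Solution

When an empty dictionary is created, it preallocates memory in chunks for the first few references it can store. As more key-value pairs are added, the dictionary needs more memory.

But it does not grow on every insertion. Each time more space is needed, it allocates enough memory to hold some number "X" of additional key-value pairs, and once those "X" slots are filled, another, larger allocation is made for the dictionary.

Here is sample code that shows how the dictionary size changes as the number of keys increases: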

import sys

my_dict = {}
print("Size with {} keys:\t {}".format(0, sys.getsizeof(my_dict)))

for i in range(21):
    my_dict[i] = ''
    print("Size with {} keys:\t {}".format(i+1, sys.getsizeof(my_dict)))

Here is the output in Python 3.6.2:

#same size for key count 0 - 5 : 240 Bytes
Size with 0 keys:    240
Size with 1 keys:    240
Size with 2 keys:    240
Size with 3 keys:    240
Size with 4 keys:    240
Size with 5 keys:    240

#same size for key count 6 - 10 : 368 Bytes
Size with 6 keys:    368
Size with 7 keys:    368
Size with 8 keys:    368
Size with 9 keys:    368
Size with 10 keys:   368

#same size for key count 11 - 20 : 648 Bytes
Size with 11 keys:   648
Size with 12 keys:   648
Size with 13 keys:   648
Size with 14 keys:   648
Size with 15 keys:   648
Size with 16 keys:   648
Size with 17 keys:   648
Size with 18 keys:   648
Size with 19 keys:   648
Size with 20 keys:   648
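
The growth steps in the output above can also be located programmatically. The following is a minimal sketch, not part of the original answer: the helper name `resize_points` is hypothetical, and the key counts and byte sizes it reports depend on the Python version and build, so no specific output is shown.

```python
import sys

def resize_points(max_keys):
    """Return (key_count, size_in_bytes) pairs, one per distinct reported size.

    Hypothetical helper for illustration only. The first pair describes the
    empty dict; each later pair marks the key count at which
    sys.getsizeof() first reported a new size.
    """
    d = {}
    last_size = sys.getsizeof(d)
    points = [(0, last_size)]
    for i in range(max_keys):
        d[i] = ''
        size = sys.getsizeof(d)
        if size != last_size:
            points.append((len(d), size))
            last_size = size
    return points

for count, size in resize_points(100):
    print("from {} keys on: {} bytes".format(count, size))
```

Only a handful of lines are printed for 100 insertions, which matches the step-wise growth described above: most insertions reuse space that was already allocated.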

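In addition, a dictionary stores only references to the memory that holds its keys and values; the keys and values themselves are not stored as part of the dict object. As a result, neither the type nor the size of the data affects the result of sys.getsizeof() for the dictionary.

For example, both of the following dictionaries have a size of 280 bytes (the exact figure depends on the Python version and build):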

>>> sys.getsizeof({'a': 'a'})
280

>>> sys.getsizeof({'a'*100000: 'a'*1000000})
280

However, here is the size difference between 'a' and 'a' * 1000000:

>>> sys.getsizeof('a')
38

>>> sys.getsizeof('a'*1000000)
1000037
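
Because sys.getsizeof() counts only the dict object itself, a rough total has to add the sizes of the keys and values separately. The following is a minimal sketch under simplifying assumptions: the helper name `shallow_plus_contents` is hypothetical, it looks only one level deep (nested containers are not followed), and it counts each distinct object once even if it is referenced several times.

```python
import sys

def shallow_plus_contents(d):
    """Rough total: the dict's own size plus the sizes of its keys and values.

    Hypothetical helper for illustration; one level deep, and each distinct
    object (identified by id()) is counted only once.
    """
    seen = set()
    total = sys.getsizeof(d)
    for obj in list(d.keys()) + list(d.values()):
        if id(obj) not in seen:
            seen.add(id(obj))
            total += sys.getsizeof(obj)
    return total

small = {'a': 'a'}
large = {'a': 'a' * 1000000}

# Compare the sizes reported for the two dict objects themselves.
print(sys.getsizeof(small) == sys.getsizeof(large))

# Compare the totals once the keys and values are included.
print(shallow_plus_contents(large) - shallow_plus_contents(small))
```

The first comparison looks only at the dict objects, while the second shows how much of the real memory footprint lives in the referenced strings rather than in the dictionary.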
