Forcing garbage collection in Python to free memory

Problem description

I have a Python 2.7 app which uses lots of dict objects which mostly contain strings for keys and values.

Sometimes those dicts and strings are no longer needed, and I would like to remove them from memory.

I tried different things: del dict[key], del dict, etc. But the app still uses the same amount of memory.

Below is an example where I would expect the memory to be freed. But it doesn't :(

import gc
import resource

def mem():
    # Note: ru_maxrss is reported in bytes on macOS but in kilobytes
    # on Linux, so this double division to get MB assumes macOS.
    print('Memory usage         : % 2.2f MB' % round(
        resource.getrusage(resource.RUSAGE_SELF).ru_maxrss/1024.0/1024.0,1)
    )

mem()

print('...creating list of dicts...')
n = 10000
l = []
for i in xrange(n):
    a = 1000*'a'
    b = 1000*'b'
    l.append({ 'a' : a, 'b' : b })

mem()

print('...deleting list items...')

for i in xrange(n):
    l.pop(0)

mem()

print('GC collected objects : %d' % gc.collect())

mem()

Output:

Memory usage         :  4.30 MB
...creating list of dicts...
Memory usage         :  36.70 MB
...deleting list items...
Memory usage         :  36.70 MB
GC collected objects : 0
Memory usage         :  36.70 MB

I would expect some objects to be 'collected' here and some memory to be freed.

Am I doing something wrong? Are there any other ways to delete unused objects, or at least to find where the objects are unexpectedly referenced?
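For the last part of the question, finding who still holds a reference to an object, gc.get_referrers can help. A small sketch (the leaked and container names are purely illustrative):

```python
import gc

leaked = {'a': 1000 * 'a'}
container = [leaked]  # a forgotten extra reference keeping the dict alive

# gc.get_referrers returns every object the collector knows of
# that refers to its argument.
refs = gc.get_referrers(leaked)
print(any(r is container for r in refs))  # True: the list still holds it
```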


Solution

Fredrik Lundh explains:

If you create a large object and delete it again, Python has probably released the memory, but the memory allocators involved don’t necessarily return the memory to the operating system, so it may look as if the Python process uses a lot more virtual memory than it actually uses.
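The gc.collect() == 0 result in the question follows from the same point: CPython frees non-cyclic objects immediately through reference counting, so the cyclic collector only has work to do when reference cycles exist. A minimal sketch (compatible with both Python 2 and 3):

```python
import gc

class Node(object):
    pass

gc.collect()  # clear out any pre-existing garbage first

# A plain object is freed the moment its refcount hits zero,
# so the collector finds nothing afterwards.
a = Node()
del a
print(gc.collect())  # 0

# A reference cycle is invisible to refcounting and is only
# reclaimed by the cyclic collector.
b = Node()
b.self_ref = b
del b
print(gc.collect())  # > 0: the cycle was found and collected
```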

Alex Martelli writes:

The only really reliable way to ensure that a large but temporary use of memory DOES return all resources to the system when it's done, is to have that use happen in a subprocess, which does the memory-hungry work then terminates.

So, you could use multiprocessing to spawn a subprocess, perform the memory-hogging calculation, and then ensure the memory is released when the subprocess terminates:

import multiprocessing as mp
import resource

def mem():
    # Note: ru_maxrss is in kilobytes on Linux, so a single division
    # by 1024 yields MB here (unlike the macOS variant above).
    print('Memory usage         : % 2.2f MB' % round(
        resource.getrusage(resource.RUSAGE_SELF).ru_maxrss/1024.0,1)
    )

mem()

def memoryhog():
    print('...creating list of dicts...')
    n = 10**5
    l = []
    for i in xrange(n):
        a = 1000*'a'
        b = 1000*'b'
        l.append({ 'a' : a, 'b' : b })
    mem()

proc = mp.Process(target=memoryhog)
proc.start()
proc.join()

mem()

Output:

Memory usage         :  5.80 MB
...creating list of dicts...
Memory usage         :  234.20 MB
Memory usage         :  5.90 MB
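If the memory-hungry step also needs to hand a result back to the parent, a multiprocessing.Pool can do both: the child builds the large structure, returns only a small summary, and all of its memory is released when the worker exits. A sketch along the same lines (build_and_summarize is a hypothetical stand-in for the real workload):

```python
import multiprocessing as mp

def build_and_summarize(n):
    # Hypothetical workload: build the big list of dicts in the
    # child process and return only a small summary to the parent.
    l = [{'a': 1000 * 'a', 'b': 1000 * 'b'} for _ in range(n)]
    return len(l)

if __name__ == '__main__':
    pool = mp.Pool(processes=1)
    result = pool.apply(build_and_summarize, (10000,))
    pool.close()
    pool.join()
    print(result)  # 10000; the big list died with the worker process
```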
