Membership testing in Python faster than set()

2022-01-17 00:00:00 python performance set fastq

Question

I have to check for the presence of millions of elements (strings of 20-30 characters) in a list containing 10-100k of those elements. Is there a faster way of doing that in Python than set()?

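import sys

# Load the IDs to look for, one per line.
idfile = sys.argv[1]  # assumption: path to the ID file is the first argument
with open(idfile) as handle:
    ids = set(line.strip() for line in handle)

for line in sys.stdin:
    read_id = line.strip()  # avoid shadowing the built-in id()
    if read_id in ids:
        # print fastq
        print(read_id)
        # update ids so that each ID is reported only once
        ids.remove(read_id)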


Solution

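set is as fast as it gets: a membership test is a hash lookup that takes constant time on average, regardless of how many elements the set holds.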
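To get a feel for the difference, here is a minimal timing sketch comparing a membership test against a list and against a set. The ID length, the collection size, and the helper name random_id are assumptions made up for illustration; actual timings depend on the machine and the data.

```python
import random
import string
import timeit


def random_id(rng, length=25):
    """Build a random uppercase string (hypothetical stand-in for a real ID)."""
    return "".join(rng.choices(string.ascii_uppercase, k=length))


rng = random.Random(0)  # fixed seed so the data is reproducible
id_list = [random_id(rng) for _ in range(10_000)]
id_set = set(id_list)

# A fresh random ID is almost certainly absent, which forces the list
# lookup to scan every element before giving up.
probe = random_id(rng)

list_time = timeit.timeit(lambda: probe in id_list, number=200)
set_time = timeit.timeit(lambda: probe in id_set, number=200)
print(f"list: {list_time:.6f}s  set: {set_time:.6f}s")
```

The list lookup scans elements one by one, while the set lookup hashes the string and probes the table directly, so the gap grows with the size of the collection.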

However, if you rewrite your code to create the set once and never change it, you can use the built-in frozenset type. It behaves exactly like set except that it is immutable.
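A rough sketch of what the frozenset variant might look like, assuming the IDs no longer need to be removed once found. The function name matching_ids and the sample data are hypothetical and stand in for the ID file and stdin.

```python
def matching_ids(lines, wanted):
    """Yield each stripped line that is present in the immutable set `wanted`."""
    for line in lines:
        candidate = line.strip()
        if candidate in wanted:
            yield candidate


# Hypothetical sample data standing in for the ID file and for sys.stdin.
wanted = frozenset(["read_1", "read_3"])
stream = ["read_1\n", "read_2\n", "read_3\n", "read_1\n"]

print(list(matching_ids(stream, wanted)))
```

Because nothing is removed from the frozenset, an ID that occurs twice in the input is reported twice, unlike in the original loop. If each ID must be reported only once, a separate mutable set of already-seen IDs would be needed.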

If you are still having speed problems, you need to speed your program up in other ways, such as by running it on PyPy instead of CPython.
