Is there any built-in way to get the length of an iterable in Python?

2022-01-10 00:00:00 python iterator

Question

For example, files in Python are iterable: they iterate over the lines in the file. I want to count the number of lines.

One quick way is to do this:

lines = len(list(open(fname)))

However, this loads the whole file into memory at once, which rather defeats the purpose of an iterator (it only needs to keep the current line in memory).

This does not work:

lines = len(line for line in open(fname))

because generators have no length: calling len() on one raises a TypeError.

Is there any way to do this short of defining a count function?

def count(i):
    c = 0
    for _ in i:
        c += 1
    return c

To clarify, I understand that the whole file will have to be read! I just don't want it all in memory at once.


Solution

Short of iterating through the iterable and counting the number of iterations, no. That's what makes it an iterable and not a list. This isn't really even a Python-specific problem. Look at the classic linked-list data structure: finding the length is an O(n) operation that involves iterating over the whole list to count its elements.
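To make the linked-list comparison concrete, here is a minimal sketch. The Node class and linked_list_length function are hypothetical names made up for illustration; the point is only that the length has to be discovered by walking every node.

```python
class Node:
    """A node in a singly linked list (hypothetical, for illustration only)."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


def linked_list_length(head):
    """Count nodes by walking from head to tail: O(n) time, O(1) extra space."""
    count = 0
    node = head
    while node is not None:
        count += 1
        node = node.next
    return count


# Build the list 1 -> 2 -> 3 and measure it.
head = Node(1, Node(2, Node(3)))
print(linked_list_length(head))  # 3
```

An iterable is in the same position: nothing records how many items remain, so the only way to find out is to consume them.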

As mcrute mentioned above, you can probably reduce your function to:

def count_iterable(i):
    return sum(1 for e in i)
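Applied to the original file example, the same idea might look like the sketch below. The count_lines name and the temporary demo file are assumptions for illustration; the file is streamed line by line, so only the current line needs to be held in memory.

```python
import os
import tempfile


def count_lines(path):
    """Count lines by streaming the file instead of building a list."""
    with open(path) as f:  # the context manager closes the file afterwards
        return sum(1 for _ in f)


# Demo on a throwaway file (hypothetical data).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\nthird\n")
    demo_path = tmp.name

print(count_lines(demo_path))  # 3
os.remove(demo_path)
```

Keep in mind that counting a one-shot iterator (a generator or an open file) consumes it, so the items are no longer available afterwards unless the source can be reopened or recreated.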

Of course, if you're defining your own iterable object, you can always implement __len__ yourself and keep an element count somewhere.
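As a rough sketch of that last point, a custom iterable can update a counter whenever an element is added, which makes len() an O(1) lookup. LinkedStack is a hypothetical class invented for this example, not something from the standard library.

```python
class LinkedStack:
    """Hypothetical singly linked stack that tracks its own size."""

    def __init__(self):
        self._head = None  # chain of (value, next) tuples, or None when empty
        self._size = 0     # element count maintained on every push

    def push(self, value):
        self._head = (value, self._head)
        self._size += 1

    def __iter__(self):
        # Yield values from the most recently pushed to the oldest.
        node = self._head
        while node is not None:
            value, node = node
            yield value

    def __len__(self):
        # No traversal needed: the count is already stored.
        return self._size


stack = LinkedStack()
for word in ("a", "b", "c"):
    stack.push(word)

print(len(stack))   # 3
print(list(stack))  # ['c', 'b', 'a']
```

The trade-off is that every mutating method has to keep the counter in sync, which is only possible when you control the class.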
