How does str.startswith really work?
Question
I've been playing with startswith() for a bit and I've discovered something interesting:
>>> tup = ('1', '2', '3')
>>> lis = ['1', '2', '3', '4']
>>> '1'.startswith(tup)
True
>>> '1'.startswith(lis)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: startswith first arg must be str or a tuple of str, not list
Now, the error is obvious, and casting the list to a tuple works just as well as passing a tuple did in the first place:
>>> '1'.startswith(tuple(lis))
True
Now, my question is: why must the first argument be a str or a tuple of str prefixes, but not a list of str prefixes?
AFAIK, the Python code for startswith() might look like this:
def startswith(src, prefix):
    return src[:len(prefix)] == prefix
But that just confuses me more, because even with that in mind, it still shouldn't make any difference whether the argument is a list or a tuple. What am I missing?
Answer
No, there is no technical reason the method couldn't accept other sequence types. The source code roughly does this:
if isinstance(prefix, tuple):
    for substring in prefix:
        if not isinstance(substring, str):
            raise TypeError(...)
        return tailmatch(...)
elif not isinstance(prefix, str):
    raise TypeError(...)
return tailmatch(...)
(where tailmatch(...) does the actual matching work).
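The check outlined above can be modelled in pure Python. This is only a rough sketch: the real method is implemented in C, the name startswith_model is made up for illustration, a slice comparison stands in for tailmatch(...), and the error messages are approximations.

```python
def startswith_model(src, prefix):
    """Illustrative stand-in for str.startswith; not the CPython source."""
    if isinstance(prefix, tuple):
        for substring in prefix:
            # Every element of the tuple must itself be a str.
            if not isinstance(substring, str):
                raise TypeError("tuple for startswith must only contain str")
            # Slice comparison is an assumption standing in for tailmatch().
            if src[:len(substring)] == substring:
                return True
        return False
    if not isinstance(prefix, str):
        # Anything that is neither a str nor a tuple is rejected, lists included.
        raise TypeError(
            "startswith first arg must be str or a tuple of str, "
            f"not {type(prefix).__name__}"
        )
    return src[:len(prefix)] == prefix
```

Under this model a list is rejected purely by the type check, before any element is looked at, which matches the behaviour seen in the question.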
So yes, any iterable would do for that for loop. But all the other string test APIs that take multiple values (as well as isinstance() and issubclass()) also accept only tuples, and this tells you as a user of the API that it is safe to assume the value won't be mutated. You can't mutate a tuple, but the method could in theory mutate a list.
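The consistency across these APIs is easy to check. The snippet below is a small sketch; the helper name accepts is hypothetical and only reports whether a call raises TypeError.

```python
def accepts(call):
    """Return True if call() runs without raising TypeError."""
    try:
        call()
    except TypeError:
        return False
    return True


# Each API is tried once with a tuple and once with a list of the same items.
results = {
    "endswith tuple": accepts(lambda: "a.txt".endswith((".txt", ".md"))),
    "endswith list": accepts(lambda: "a.txt".endswith([".txt", ".md"])),
    "isinstance tuple": accepts(lambda: isinstance(1, (int, str))),
    "isinstance list": accepts(lambda: isinstance(1, [int, str])),
    "issubclass tuple": accepts(lambda: issubclass(bool, (int, str))),
    "issubclass list": accepts(lambda: issubclass(bool, [int, str])),
}

for name, ok in results.items():
    print(f"{name}: {'accepted' if ok else 'TypeError'}")
```

In each pair the tuple form is accepted and the list form raises TypeError.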
Also note that you usually test against a fixed number of prefixes, suffixes, or classes (in the case of isinstance() and issubclass()); the implementation is not suited to a large number of elements. A tuple implies a limited number of elements, while lists can be arbitrarily large.
Next, if any iterable or sequence type were acceptable, that would include strings; a single string is also a sequence. Should a single string argument then be treated as separate characters, or as a single prefix?
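To make the ambiguity concrete, here is a sketch of a hypothetical variant that accepts any iterable of prefixes. The function startswith_any_iterable is not part of Python; it exists only to show how a bare string argument would change meaning.

```python
def startswith_any_iterable(src, prefixes):
    """Hypothetical variant: treat the argument as any iterable of prefixes."""
    return any(src.startswith(p) for p in prefixes)


# The real method treats a str argument as one prefix:
# "bcd" does not start with "ab".
as_single_prefix = "bcd".startswith("ab")

# The hypothetical variant iterates over the str, so "a" and "b" become
# separate prefixes, and "bcd" does start with "b".
as_characters = startswith_any_iterable("bcd", "ab")

print(as_single_prefix, as_characters)
```

The same argument gives opposite answers under the two readings, which is why an API accepting any iterable would need a special case for str.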
So in other words, the restriction self-documents that the sequence won't be mutated, is consistent with other APIs, implies a limited number of items to test against, and removes ambiguity about how a single string argument should be treated.
Note that this has come up before on the Python Ideas list; see this thread. Guido van Rossum's main argument there is that you must either special-case single strings or accept only a tuple. He picked the latter and doesn't see a need to change it.