How to compare multiple lists of tuples in Python?

2022-01-20 00:00:00 python list tuples

Problem description

How can I compare multiple lists of tuples like this:

[[(1,2), (3,6), (5,3)], [(1,5), (3,5)], [(2,1), (1,8), (3,9)]]

The output should be:

[(1,2), (1,5), (1,8)],[(3,6), (3,5), (3,9)]

This means I want only those tuples whose x-axis value (the first element) matches that of other tuples.
(5,3) and (2,1) should be discarded!


Solution

One possible option

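>>> from itertools import chain, groupby
>>> from operator import itemgetter
>>> some_list = [[(1,2), (3,6), (5,3)], [(1,5), (3,5)], [(2,1), (1,8), (3,9)]]
>>> def group(seq):
    # Flatten, sort by first element, then group runs of equal first elements
    for k, v in groupby(sorted(chain(*seq), key = itemgetter(0)), itemgetter(0)):
        v = list(v)
        if len(v) > 1:
            yield v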


>>> list(group(some_list))
[[(1, 2), (1, 5), (1, 8)], [(3, 6), (3, 5), (3, 9)]]
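The sort step matters because `groupby` only merges consecutive items that share a key. As a minimal sketch of the same idea in script form (the name `group_by_first` is made up for this illustration and is not part of the original answer):

```python
from itertools import chain, groupby
from operator import itemgetter


def group_by_first(seq):
    """Return groups of tuples sharing a first element, keeping groups of size > 1."""
    first = itemgetter(0)
    # Flatten the sublists, then sort so equal keys become adjacent;
    # groupby only merges consecutive items with the same key.
    flat = sorted(chain.from_iterable(seq), key=first)
    groups = (list(items) for _, items in groupby(flat, key=first))
    return [g for g in groups if len(g) > 1]


some_list = [[(1, 2), (3, 6), (5, 3)], [(1, 5), (3, 5)], [(2, 1), (1, 8), (3, 9)]]
print(group_by_first(some_list))
# [[(1, 2), (1, 5), (1, 8)], [(3, 6), (3, 5), (3, 9)]]
```

Because Python's sort is stable, tuples within each group keep the order in which they appeared in the input.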

Another popular option

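>>> from collections import defaultdict
>>> from itertools import chain
>>> def group(seq):
    # Bucket every tuple under its first element, then keep buckets with more than one tuple
    some_dict = defaultdict(list)
    for e in chain(*seq):
        some_dict[e[0]].append(e)
    return (v for v in some_dict.values() if len(v) > 1)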

>>> list(group(some_list))
[[(1, 2), (1, 5), (1, 8)], [(3, 6), (3, 5), (3, 9)]]
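Both versions flatten the input first, so two tuples with the same first element inside a single sublist also count as a match. If "matches others" is meant to require a match across different sublists (an assumption about the intent; the question does not say), a possible variant tracks which sublists each key came from. The name `group_across_sublists` is hypothetical:

```python
from collections import defaultdict


def group_across_sublists(seq):
    """Keep groups whose first element occurs in at least two different sublists."""
    groups = defaultdict(list)   # first element -> matching tuples
    sources = defaultdict(set)   # first element -> indices of sublists it came from
    for index, sublist in enumerate(seq):
        for item in sublist:
            groups[item[0]].append(item)
            sources[item[0]].add(index)
    return [items for key, items in groups.items() if len(sources[key]) > 1]


data = [[(1, 2), (1, 3)], [(4, 5)], [(4, 6), (7, 8)]]
print(group_across_sublists(data))
# [[(4, 5), (4, 6)]]
```

On the example data from the question this returns the same two groups as the versions above, since no sublist repeats a first element.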

So which of them fares better with the example data?

>>> def group_sort(seq):
    for k, v in groupby(sorted(chain(*seq), key = itemgetter(0)), itemgetter(0)):
        v = list(v)
        if len(v) > 1:
            yield v


>>> def group_hash(seq):
    some_dict = defaultdict(list)
    for e in chain(*seq):
        some_dict[e[0]].append(e)
    return (v for v in some_dict.values() if len(v) > 1)

>>> from timeit import Timer
>>> t1_sort = Timer(stmt="list(group_sort(some_list))", setup = "from __main__ import some_list, group_sort, chain, groupby")
>>> t1_hash = Timer(stmt="list(group_hash(some_list))", setup = "from __main__ import some_list, group_hash,chain, defaultdict")
>>> t1_hash.timeit(100000)
3.340240917954361
>>> t1_sort.timeit(100000)
0.14324535970808938

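And with a much larger random list (note that the two runs below use different repetition counts, 100 and 1000, so compare them per call: roughly 13.8 ms for the sort version and 34.0 ms for the hash version)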

>>> from random import sample
>>> some_list = [[sample(range(1000), 2) for _ in range(100)] for _ in range(100)]
>>> t1_sort.timeit(100)
1.3816694363194983
>>> t1_hash.timeit(1000)
34.015403087978484
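To rerun the comparison with equal repetition counts, something like the following self-contained sketch could be used. It reimplements both functions so it runs on its own; `big_list` and `REPEATS` are names and values chosen for this illustration, and the timings will vary by machine and Python version, so no expected output is shown.

```python
from collections import defaultdict
from itertools import chain, groupby
from operator import itemgetter
from random import sample
from timeit import timeit


def group_sort(seq):
    # Sort by first element so groupby sees equal keys next to each other.
    first = itemgetter(0)
    flat = sorted(chain.from_iterable(seq), key=first)
    return [g for g in (list(v) for _, v in groupby(flat, key=first)) if len(g) > 1]


def group_hash(seq):
    # Bucket tuples by first element in a dictionary, no sorting needed.
    buckets = defaultdict(list)
    for item in chain.from_iterable(seq):
        buckets[item[0]].append(item)
    return [v for v in buckets.values() if len(v) > 1]


# Same shape as the random data above, but with tuples instead of lists.
big_list = [[tuple(sample(range(1000), 2)) for _ in range(100)] for _ in range(100)]

REPEATS = 20  # arbitrary; identical for both so the results are directly comparable
for func in (group_sort, group_hash):
    total = timeit(lambda: func(big_list), number=REPEATS)
    print(f"{func.__name__}: {total / REPEATS * 1000:.2f} ms per call")
```

Passing a callable to `timeit` avoids the `from __main__ import ...` setup string, which makes the harness easier to move into a script.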