How to reduce a list of tuples in Python

2022-01-13 00:00:00 python python-2.7 mapreduce

Problem description

I have an array, and I want to count the occurrences of each item in it.

I have managed to use a map function to produce a list of tuples:

def mapper(a):
    return (a, 1)

r = list(map(mapper, arr))

# output example:
# (11817685, 1), (2014036792, 1), (2014047115, 1), (11817685, 1)

I expect the reduce function to help me group the counts by the first number (the id) in each tuple. For example:

(11817685, 2), (2014036792, 1), (2014047115, 1)

I tried:

cnt = reduce(lambda a, b: a + b, r)

and a few other approaches, but none of them did the trick.

NOTE: Thanks for all the advice on other ways to solve the problem, but I'm just learning Python and how to implement map-reduce here. I have simplified my real business problem a lot to make it easy to understand, so please kindly show me a correct way of doing map-reduce.


Solution

You can use Counter:

from collections import Counter
arr = [11817685, 2014036792, 2014047115, 11817685]
counter = Counter(arr)
# Parenthesized form works on both Python 2.7 and Python 3
print(list(zip(counter.keys(), counter.values())))

As pointed out by @ShadowRanger, Counter has an items() method:

from collections import Counter
arr = [11817685, 2014036792, 2014047115, 11817685]
print(list(Counter(arr).items()))
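
Since the question asks specifically for a map-reduce version, here is a minimal sketch of one way it could look. The attempt with `lambda a, b: a + b` most likely fails because `+` on two tuples concatenates them, so the result is one long flat tuple rather than grouped counts. One option is to have reduce fold each (id, 1) pair into a dict accumulator. The names `reducer`, `pairs` and `counts` are made up for this illustration and do not come from the original post.

```python
from functools import reduce  # a builtin on Python 2; this import works on 2.6+ and 3

def mapper(item):
    # Map step: emit an (id, 1) pair for every item.
    return (item, 1)

def reducer(counts, pair):
    # Reduce step: fold one (id, n) pair into the running dict of totals.
    key, n = pair
    counts[key] = counts.get(key, 0) + n
    return counts

arr = [11817685, 2014036792, 2014047115, 11817685]

pairs = map(mapper, arr)
# The third argument is the initial accumulator, an empty dict.
counts = reduce(reducer, pairs, {})

# Sort by id so the order is predictable regardless of Python version.
result = sorted(counts.items())
print(result)
```

The initial value `{}` matters here: without it, reduce would use the first (id, 1) tuple as the accumulator and `reducer` would fail when it tries to treat that tuple as a dict.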
