I applied sum() on a groupby and I want to sort by the values of the last column
Problem description
Given the following DataFrame:
user_ID product_id amount
1 456 1
1 87 1
1 788 3
1 456 5
1 87 2
... ... ...
The first column is the customer's ID, the second is the ID of the product the customer bought, and 'amount' is the quantity of that product purchased on a given day (the date is also taken into account). A customer can buy as many products as they like each day.
I want to calculate the total number of times each product was bought by each customer, so I applied a groupby:
df.groupby(['user_ID','product_id'], sort=True).sum()
Now I want to sort by the sum of amount within each group. Any help?
Solution
Suppose df is:
user_ID product_id amount
0 1 456 1
1 1 87 1
2 1 788 3
3 1 456 5
4 1 87 2
5 2 456 1
6 2 788 3
7 2 456 5
Then you can use groupby and sum as before. In addition, you can sort the values by two columns, [user_ID, amount], with ascending=[True,False], which sorts users in ascending order and, for each user, amounts in descending order:
new_df = df.groupby(['user_ID','product_id'], sort=True).sum().reset_index()
new_df = new_df.sort_values(by = ['user_ID', 'amount'], ascending=[True,False])
print(new_df)
Output:
user_ID product_id amount
1 1 456 6
0 1 87 3
2 1 788 3
3 2 456 6
4 2 788 3
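For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds the sample df from the table above and, as a variation not used in the answer, passes as_index=False to groupby so the group keys stay as ordinary columns and no reset_index() call is needed. The variable names totals and totals_sorted are only illustrative.

```python
import pandas as pd

# Sample data copied from the table in the answer above.
df = pd.DataFrame({
    "user_ID":    [1, 1, 1, 1, 1, 2, 2, 2],
    "product_id": [456, 87, 788, 456, 87, 456, 788, 456],
    "amount":     [1, 1, 3, 5, 2, 1, 3, 5],
})

# as_index=False keeps user_ID and product_id as regular columns
# instead of moving them into a MultiIndex.
totals = df.groupby(["user_ID", "product_id"], as_index=False)["amount"].sum()

# Users in ascending order; within each user, largest total first.
totals_sorted = totals.sort_values(
    by=["user_ID", "amount"], ascending=[True, False]
)
print(totals_sorted)
```

Selecting ["amount"] before sum() makes it explicit which column is being summed, which may matter if the real data has extra columns (such as the date mentioned in the question) that should not be added up.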