Conditional sums for pandas aggregate

2022-01-13 00:00:00 python pandas r data.table

Question

I recently switched from R to Python and have had some trouble getting used to data frames again, as opposed to R's data.table. I'd like to take a column of strings, check each one for a value, and then sum the count of matches, broken down by user. So I would like to take this data:

   A_id       B    C
1:   a1    "up"  100
2:   a2  "down"  102
3:   a3    "up"  100
3:   a3    "up"  250
4:   a4  "left"  100
5:   a5 "right"  102

and return:

   A_id_grouped   sum_up   sum_down  ...  over_200_up
1:           a1        1          0  ...            0
2:           a2        0          1                 0
3:           a3        2          0  ...            1
4:           a4        0          0                 0
5:           a5        0          0  ...            0

Previously I did this in R (using data.table):

> DT[, list(sum_up = sum(B == "up"),
+           sum_down = sum(B == "down"),
+           ...,
+           over_200_up = sum(B == "up" & C > 200)),
+    by = list(A_id_grouped = A_id)]

However, all of my recent attempts in Python have failed:

DT.agg({"D": [np.sum(DT[DT["B"]=="up"]),np.sum(DT[DT["B"]=="up"])], ...
    "C": np.sum(DT[(DT["B"]=="up") & (DT["C"]>200)])
    })

Thank you in advance! It seems like a simple question, but I couldn't find an answer anywhere.


Solution

To complement unutbu's answer, here's an approach that uses apply on the groupby object. The lambda receives each group's rows as a DataFrame and returns a Series with one conditional count per output column.

>>> df.groupby('A_id').apply(lambda x: pd.Series(dict(
    sum_up=(x.B == 'up').sum(),
    sum_down=(x.B == 'down').sum(),
    over_200_up=((x.B == 'up') & (x.C > 200)).sum()
)))
      over_200_up  sum_down  sum_up
A_id                               
a1              0         0       1
a2              0         1       0
a3              1         0       2
a4              0         0       0
a5              0         0       0
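
For readers who want to run this end to end, here is a minimal sketch that rebuilds the sample data from the question and produces the same counts without apply, by summing boolean columns per group. The names `flags` and `result` are illustrative and do not come from the original answer; this is one possible alternative under those assumptions, not the answer's own method.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A_id": ["a1", "a2", "a3", "a3", "a4", "a5"],
    "B": ["up", "down", "up", "up", "left", "right"],
    "C": [100, 102, 100, 250, 100, 102],
})

# One boolean column per condition; True counts as 1 when summed.
flags = pd.DataFrame({
    "sum_up": df["B"].eq("up"),
    "sum_down": df["B"].eq("down"),
    "over_200_up": df["B"].eq("up") & df["C"].gt(200),
})

# Group the flags by user id and count the True values in each group.
result = flags.groupby(df["A_id"]).sum().astype(int)
print(result)
```

Because each condition is evaluated once over the whole column instead of once per group, this form tends to scale better than apply on large frames, at the cost of building an intermediate frame of flags.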
