Pandas boolean comparison on a dataframe

2022-01-19 00:00:00 python pandas dataframe boolean

Problem description

I get an error when I make a comparison on a single element of a dataframe, but I don't understand why.

I have a dataframe df with time series data for a number of customers, with some null values in it:

df.head()
                    8143511  8145987  8145997  8146001  8146235  8147611  
2012-07-01 00:00:00      NaN      NaN      NaN      NaN      NaN      NaN   
2012-07-01 00:30:00    0.089      NaN    0.281    0.126    0.190    0.500   
2012-07-01 01:00:00    0.090      NaN    0.323    0.141    0.135    0.453   
2012-07-01 01:30:00    0.061      NaN    0.278    0.097    0.093    0.424   
2012-07-01 02:00:00    0.052      NaN    0.278    0.158    0.170    0.462  

In my script, the line if pd.isnull(df[[customer_ID]].loc[ts]): raises this error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

However, if I put a breakpoint on that line and, when the script stops, type this into the console:

pd.isnull(df[[customer_ID]].loc[ts])

The output is:

8143511    True
Name: 2012-07-01 00:00:00, dtype: bool

If I let the script continue from that point, the error is raised immediately.

If the boolean expression can be evaluated and has the value True, why does it raise an error in the if statement? This makes no sense to me.


Solution

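The second set of [] was returning a Series, which I mistook for a single value. df[[customer_ID]] selects a one-column DataFrame, so .loc[ts] yields a one-element Series, and pd.isnull then returns a boolean Series that if cannot reduce to a single True or False. The simplest solution is to remove the extra []: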

if pd.isnull(df[customer_ID].loc[ts]):
    pass
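
The difference between the two selections can be reproduced on a small frame. The sketch below is only an illustration: the two-row df, its values, and the names as_series, as_scalar and also_scalar are made up for the example, not taken from the original script. It also shows df.at as one alternative way to read a single cell.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row miniature of the frame in the question
index = pd.to_datetime(["2012-07-01 00:00:00", "2012-07-01 00:30:00"])
df = pd.DataFrame({8143511: [np.nan, 0.089], 8145997: [np.nan, 0.281]}, index=index)

customer_ID = 8143511
ts = index[0]

# Double brackets select a one-column DataFrame, so .loc[ts] yields a Series
as_series = pd.isnull(df[[customer_ID]].loc[ts])
print(type(as_series).__name__)  # Series

# Using a Series as an if condition raises ValueError, even with one element
error_message = None
try:
    if as_series:
        pass
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Single brackets select the column as a Series, so .loc[ts] yields a scalar
as_scalar = pd.isnull(df[customer_ID].loc[ts])
if as_scalar:
    print("value is missing")

# df.at reads one cell by row and column label and also returns a scalar
also_scalar = pd.isnull(df.at[ts, customer_ID])
```

If the one-element Series has to be kept for some reason, reducing it explicitly with as_series.all() or as_series.any() also gives a single boolean that if can evaluate.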
