Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

2022-01-19 00:00:00 python pandas dataframe boolean filtering

Problem description

I'm having trouble filtering my result dataframe with an or condition. I want my result df to keep every row whose var value is above 0.25 or below -0.25.

The logic below raises an ambiguous truth value error, but it works when I split the filtering into two separate operations. What is happening here? I'm not sure where to use the suggested a.empty, a.bool(), a.item(), a.any() or a.all().

result = result[(result['var'] > 0.25) or (result['var'] < -0.25)]


Solution

The Python statements or and and require truth values. For pandas these are considered ambiguous, so you should use the "bitwise" | (or) or & (and) operations instead:

result = result[(result['var']>0.25) | (result['var']<-0.25)]

These operators are overloaded for this kind of data structure to yield the element-wise or (or and).
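As a minimal, runnable sketch of the fix, assume a small made-up dataframe (the values are illustrative and not taken from the question). It applies the element-wise | filter and, as an alternative under the same assumption, an equivalent form based on abs():

```python
import pandas as pd

# Hypothetical data standing in for the asker's `result` dataframe.
result = pd.DataFrame({"var": [-0.5, -0.1, 0.0, 0.2, 0.3, 0.9]})

# Each comparison yields a boolean Series; | combines them row by row.
mask = (result["var"] > 0.25) | (result["var"] < -0.25)
filtered = result[mask]

# The same selection expressed through the absolute value.
filtered_abs = result[result["var"].abs() > 0.25]

print(filtered["var"].tolist())
```

Both forms select the same rows here; the abs() version only works because the two thresholds are symmetric around zero.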

Just to add some more explanation to this statement:

The exception is thrown when you try to get the bool of a pandas.Series:

>>> import pandas as pd
>>> x = pd.Series([1])
>>> bool(x)
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

What you hit was a place where the operator implicitly converted the operands to bool (you used or, but it also happens for and, if and while):

>>> x or x
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
>>> x and x
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
>>> if x:
...     print('fun')
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
>>> while x:
...     print('fun')
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Besides these 4 statements, there are several Python functions that hide some bool calls (like any, all, filter, ...). These are normally not problematic with pandas.Series, but for completeness I wanted to mention them.
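To illustrate where such a hidden bool call can and cannot bite, here is a small sketch with a made-up Series s. Iterating over the Series itself only calls bool on scalars, while iterating over a list of Series calls bool on a whole Series:

```python
import pandas as pd

s = pd.Series([0, 1, 2])  # illustrative values

# Iterating over the Series: bool() is applied to each scalar, which is fine.
any_scalar = any(s)
all_scalar = all(s)

# Iterating over a list *of Series*: bool() is applied to a whole Series and fails.
try:
    any([s, s])
    hidden_bool_error = None
except ValueError as exc:
    hidden_bool_error = str(exc)

print(any_scalar, all_scalar)
print(hidden_bool_error)
```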

In your case the exception isn't really helpful, because it doesn't mention the right alternatives. For and and or you can use the following (if you want element-wise comparisons):

  • numpy.logical_or:

    >>> import numpy as np
    >>> np.logical_or(x, y)

    or simply the | operator:

    >>> x | y

  • numpy.logical_and:

    >>> np.logical_and(x, y)

    or simply the & operator:

    >>> x & y
    

    If you're using the operators, make sure you set your parentheses correctly because of operator precedence: | and & bind more tightly than comparisons such as > and <.
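A short sketch of the precedence pitfall, using a made-up dataframe: without parentheses Python first tries to evaluate 0.25 | result["var"], so the whole expression fails. The exact exception type may differ between pandas versions, which is why the sketch catches both TypeError and ValueError.

```python
import pandas as pd

# Hypothetical data, for illustration only.
result = pd.DataFrame({"var": [-0.5, 0.0, 0.9]})

# With parentheses, each comparison is evaluated before | combines them.
ok = result[(result["var"] > 0.25) | (result["var"] < -0.25)]

# Without parentheses, | is applied to 0.25 and the Series first,
# and the expression raises instead of filtering.
try:
    result[result["var"] > 0.25 | result["var"] < -0.25]
    failed = False
except (TypeError, ValueError) as exc:
    failed = True
    print(type(exc).__name__)

print(ok["var"].tolist())
```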

    There are several logical numpy functions that should work on pandas.Series.
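As a sketch of how these numpy functions behave on pandas.Series, assume two made-up boolean Series x and y (the snippets above never define y, so both are illustrative here):

```python
import numpy as np
import pandas as pd

# Illustrative boolean Series; not taken from the original answer.
x = pd.Series([True, False, False, True])
y = pd.Series([False, False, True, True])

either = np.logical_or(x, y)     # element-wise or, same result as x | y
both = np.logical_and(x, y)      # element-wise and, same result as x & y
exactly_one = np.logical_xor(x, y)
negated = np.logical_not(x)      # same result as ~x for a boolean Series

print(either.tolist())
print(both.tolist())
print(exactly_one.tolist())
print(negated.tolist())
```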

    The alternatives mentioned in the exception are better suited if you encountered it when doing if or while. I'll briefly explain each of them:

  • If you want to check whether your Series is empty:

    >>> x = pd.Series([])
    >>> x.empty
    True
    >>> x = pd.Series([1])
    >>> x.empty
    False
    

    Python normally interprets the length of a container (like list, tuple, ...) as its truth value if it has no explicit boolean interpretation. So if you want the Python-like check, you could write if x.size or if not x.empty instead of if x.
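A minimal sketch of that check, using a hypothetical helper named describe (not part of pandas) and two made-up Series:

```python
import pandas as pd

def describe(series):
    """Hypothetical helper: report whether a Series holds any elements."""
    # `if series:` would raise ValueError, so test emptiness explicitly.
    if not series.empty:
        return "has data"
    return "empty"

# An explicit dtype avoids a warning that some pandas versions emit for pd.Series([]).
empty = pd.Series([], dtype="float64")
filled = pd.Series([1])

print(describe(empty), describe(filled))
print(bool(empty.size), bool(filled.size))
```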

  • If your Series contains one and only one boolean value:

    >>> x = pd.Series([100])
    >>> (x > 50).bool()
    True
    >>> (x < 50).bool()
    False
    

  • If you want to check the first and only item of your Series (like .bool(), but it works even for non-boolean contents):

    >>> x = pd.Series([100])
    >>> x.item()
    100
    

  • If you want to check whether all or any items are non-zero, non-empty or not False:

    >>> x = pd.Series([0, 1, 2])
    >>> x.all()   # because one element is zero
    False
    >>> x.any()   # because one (or more) elements are non-zero
    True
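Tying this back to the question, here is a sketch of how any() and all() might be used in an if statement. The dataframe is made up for illustration; the point is that the boolean Series is reduced to a single value before if sees it:

```python
import pandas as pd

# Hypothetical data, for illustration only.
result = pd.DataFrame({"var": [-0.5, 0.0, 0.9]})
outside = result["var"].abs() > 0.25  # boolean Series, one entry per row

# any() reduces the Series to one truth value, so `if` no longer raises.
if outside.any():
    message = "at least one value lies outside [-0.25, 0.25]"
else:
    message = "all values lie within [-0.25, 0.25]"

all_outside = bool(outside.all())

print(message)
print(all_outside)
```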
    
