Finding local maxima and minima with pandas

2022-01-11 00:00:00 python numpy pandas dataframe time-series

Problem description

I have a pandas DataFrame with two columns: one is temperature, the other is time.

I would like to add third and fourth columns called min and max. Each of these columns would be filled with NaN, except at rows where there is a local minimum or maximum, where it would hold the value of that extremum.

Here is a sample of what the data looks like; essentially, I am trying to identify all the peaks and low points in the figure.

Are there any built-in tools in pandas that can accomplish this?


Solution

Assuming that the column of interest is labelled data, one solution would be:

df['min'] = df.data[(df.data.shift(1) > df.data) & (df.data.shift(-1) > df.data)]
df['max'] = df.data[(df.data.shift(1) < df.data) & (df.data.shift(-1) < df.data)]
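To see what the two comparisons select, here is a minimal sketch on a small hand-made series (the values are made up for illustration and are not from the question). Each row is compared with its previous neighbour (`shift(1)`) and its next neighbour (`shift(-1)`). The first and last rows have a NaN neighbour, so they are never marked, and because the comparisons are strict, a flat plateau such as the repeated 2.5 is not marked either.

```python
import pandas as pd

# Hypothetical toy data: one clear peak (3.0), one clear trough (0.5),
# and a flat plateau (2.5, 2.5) that strict comparisons will skip.
df = pd.DataFrame({'data': [1.0, 3.0, 2.0, 0.5, 2.5, 2.5, 1.0]})

prev_val = df['data'].shift(1)   # value of the previous row (NaN for the first row)
next_val = df['data'].shift(-1)  # value of the next row (NaN for the last row)

# Keep only rows strictly below (or above) both neighbours;
# assignment aligns on the index, so every other row becomes NaN.
df['min'] = df['data'][(prev_val > df['data']) & (next_val > df['data'])]
df['max'] = df['data'][(prev_val < df['data']) & (next_val < df['data'])]

print(df)
```

Only row 3 ends up with a value in min and only row 1 with a value in max; every other entry in those two columns is NaN.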

A complete example:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Generate a noisy AR(1) sample
np.random.seed(0)
rs = np.random.randn(200)
xs = [0]
for r in rs:
    xs.append(xs[-1]*0.9 + r)
df = pd.DataFrame(xs, columns=['data'])

# Find local minima and maxima (a point strictly below/above both neighbours)
df['min'] = df.data[(df.data.shift(1) > df.data) & (df.data.shift(-1) > df.data)]
df['max'] = df.data[(df.data.shift(1) < df.data) & (df.data.shift(-1) < df.data)]

# Plot results: minima in red, maxima in green, on top of the series
plt.scatter(df.index, df['min'], c='r')
plt.scatter(df.index, df['max'], c='g')
df.data.plot()
plt.show()
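On noisy data, comparing each point only with its immediate neighbours marks many small wiggles as extrema. One possible variation, which is not part of the answer above, is to compare each point with the minimum and maximum of a centred rolling window. The sketch below assumes a half-width `n` of 5 points (an arbitrary choice for illustration) and rebuilds the same AR(1) sample. Unlike the strict comparisons, this version marks every point of a flat plateau that ties for the window extreme.

```python
import numpy as np
import pandas as pd

# Same noisy AR(1) sample as in the example above
np.random.seed(0)
rs = np.random.randn(200)
xs = [0]
for r in rs:
    xs.append(xs[-1] * 0.9 + r)
df = pd.DataFrame(xs, columns=['data'])

n = 5               # assumed half-width: compare with n points on each side
window = 2 * n + 1  # full centred window length

# Rolling extremes over the centred window; rows too close to either end
# have an incomplete window and yield NaN, so they are never marked.
roll_min = df['data'].rolling(window, center=True).min()
roll_max = df['data'].rolling(window, center=True).max()

# Keep the value where it equals the window extreme, NaN elsewhere
df['min'] = df['data'].where(df['data'] == roll_min)
df['max'] = df['data'].where(df['data'] == roll_max)

print(df.dropna(how='all', subset=['min', 'max']))
```

A larger `n` keeps only the more prominent peaks and troughs. For a window of three points on data without ties, this gives the same rows as the `shift` comparisons.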
