How to detect sudden changes in a Pandas time series plot

2022-01-11 00:00:00 python pandas time-series

Question

I am trying to detect a sudden drop in velocity in a series, and I'm not sure how to capture it. The details and code are below:

This is a snippet of the Series that I have, along with the code that produces it:

velocity_df.velocity.car1

Index   velocity
200     17.9941
201     17.9941
202     18.4031
203     18.4031

[Plot of the entire velocity series]

I'm trying to detect the sudden drop that runs from index 220 to roughly 230-240 and save it as a Series that looks like this:

Index   velocity
220      14.927
221      14.927
222      14.927
223      14.927
224      14.518
225      14.518
226     16.1538
227     12.2687
228     9.20155
229     6.33885
230     4.49854

I just want to capture an approximate index range where the speed suddenly decreases, so that I can use it to build other features.

If I can add any additional information, please let me know. Thank you!


Answer

If you want to compare consecutive values pair by pair, this is a simple approach:

Given the series from your question, called s, you can construct the absolute discrete derivative of your data by subtracting each value from the next one (that is, from the series shifted by 1):

d = pd.Series(s.values[1:] - s.values[:-1], index=s.index[:-1]).abs()
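As a rough cross-check, the same absolute differences can probably be obtained with the built-in `Series.diff`. The sketch below assumes `s` holds the eleven sample values listed in the question (index 220 to 230); the name `d_builtin` is introduced here only for illustration.

```python
import pandas as pd

# Sample values copied from the question (index 220-230).
s = pd.Series(
    [14.927, 14.927, 14.927, 14.927, 14.518, 14.518,
     16.1538, 12.2687, 9.20155, 6.33885, 4.49854],
    index=range(220, 231),
)

# Manual forward difference, labeled with the earlier index of each pair.
d = pd.Series(s.values[1:] - s.values[:-1], index=s.index[:-1]).abs()

# Built-in variant: diff(-1) computes s[i] - s[i+1]; the last entry is NaN,
# so it is dropped to match the length of d.
d_builtin = s.diff(-1).abs().iloc[:-1]

# Largest element-wise disagreement between the two versions.
print((d - d_builtin).abs().max())
```

Note that a plain `s.diff()` would give the same magnitudes but label each difference with the later index of the pair, which shifts every match by one position.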

If you now take the maximum m of that series of absolute differences, you can multiply it by a factor a between 0 and 1 and use the result as a threshold:

a = .7
m = d.max()
print(d > m * a)

The last line prints a boolean Series that is True at the indices where the difference exceeds the threshold.
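To get just the matching index labels rather than the full boolean Series, the mask can be used to filter itself. This is a minimal sketch that assumes `s` contains the sample values from the question; `mask` and `matches` are illustrative names, not part of the original answer.

```python
import pandas as pd

# Sample values copied from the question (index 220-230).
s = pd.Series(
    [14.927, 14.927, 14.927, 14.927, 14.518, 14.518,
     16.1538, 12.2687, 9.20155, 6.33885, 4.49854],
    index=range(220, 231),
)

# Absolute difference between each value and the next one.
d = pd.Series(s.values[1:] - s.values[:-1], index=s.index[:-1]).abs()

a = 0.7            # fraction of the largest jump used as the threshold
m = d.max()        # largest absolute jump in the series
mask = d > m * a   # True where the jump exceeds the threshold

# Keep only the labels where the mask is True.
matches = mask[mask].index.tolist()
print(matches)
```

On this sample the threshold is 70% of the largest jump (3.8851, between 226 and 227), so only the steepest steps are flagged; lowering `a` widens the detected range.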

Building on this, you could use a sliding-window technique, such as kernel density estimation or a Parzen window, to get smoother results:

r = d.rolling(3, min_periods=1, win_type='parzen').sum()
n = r.max()

As before, we can print the matching elements:

print(r > n * a)

which gives the following output:

Index
220    False
221    False
222    False
223    False
224    False
225    False
226    False
227     True
228     True
229     True
dtype: bool
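Since the question asks for the drop saved as a Series, one possible follow-up is to slice the original data between the first and last flagged labels. The sketch below makes two assumptions that are not in the original answer: it uses a plain unweighted rolling sum instead of the Parzen-weighted one (so it does not need SciPy), and it widens the slice by a hypothetical margin `pad`. The names `hits` and `drop` are also illustrative.

```python
import pandas as pd

# Sample values copied from the question (index 220-230).
s = pd.Series(
    [14.927, 14.927, 14.927, 14.927, 14.518, 14.518,
     16.1538, 12.2687, 9.20155, 6.33885, 4.49854],
    index=range(220, 231),
)

# Absolute difference between each value and the next one.
d = pd.Series(s.values[1:] - s.values[:-1], index=s.index[:-1]).abs()

# Unweighted 3-sample rolling sum (stand-in for the Parzen-weighted window).
r = d.rolling(3, min_periods=1).sum()

a = 0.7
mask = r > r.max() * a

# Labels flagged by the mask, then a label-based slice of the original data.
hits = mask[mask].index
pad = 1  # hypothetical margin (in index steps) added on both sides
drop = s.loc[hits.min() - pad : hits.max() + pad]
print(drop)
```

Both `a` and `pad` are tuning knobs: a smaller `a` or a larger `pad` captures more of the lead-up to the drop, at the cost of including flatter samples.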
