Differences between consecutive entries of a pandas DatetimeIndex without a frequency

2022-01-11 00:00:00 python pandas time-series data-science

Problem description

An irregular time series, data, is stored in a pandas.DataFrame, and a DatetimeIndex has been set. I need the time difference between consecutive entries in the index.

I thought it would be as simple as

data.index.diff()

but got

AttributeError: 'DatetimeIndex' object has no attribute 'diff'

I also tried

data.index - data.index.shift(1)

but got

ValueError: Cannot shift with no freq

I do not want to infer or enforce a frequency before doing this operation. The time series contains large gaps, and resampling to a fixed frequency would expand them into long runs of NaN. The whole point is to find these gaps first.

So, what is a clean way to perform this seemingly simple operation?


Solution

At the time of writing, no diff method is implemented for Index.

However, you can convert the index to a Series first and call diff on that. Use Index.to_series if you need to preserve the original index as the index of the result. Use the Series constructor without an index argument if a default integer index is wanted instead.

Code example:

import pandas as pd

rng = pd.to_datetime(['2015-01-10','2015-01-12','2015-01-13'])
data = pd.DataFrame({'a': range(3)}, index=rng)
print(data)
             a
 2015-01-10  0
 2015-01-12  1
 2015-01-13  2

a = data.index.to_series().diff()
print(a)

2015-01-10      NaT
2015-01-12   2 days
2015-01-13   1 days
dtype: timedelta64[ns]

a = pd.Series(data.index).diff()
print(a)
 0      NaT
 1   2 days
 2   1 days
dtype: timedelta64[ns]
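
Since the stated goal is to locate the gaps, the differences can then be compared against a threshold. The following is only a minimal sketch: the extra timestamp 2015-02-20 and the 7-day cutoff are made-up values for illustration, not part of the original question. It also shows an alternative that subtracts two offset slices of the index directly, which avoids building a Series but yields one fewer element than the index.

```python
import pandas as pd

# Illustrative data only: irregular timestamps with one large gap at the end.
rng = pd.to_datetime(['2015-01-10', '2015-01-12', '2015-01-13', '2015-02-20'])
data = pd.DataFrame({'a': range(4)}, index=rng)

# Time elapsed since the previous entry, aligned with the original index.
# The first entry has no predecessor, so its value is NaT.
deltas = data.index.to_series().diff()

# Assumed cutoff: anything longer than 7 days is treated as a gap.
threshold = pd.Timedelta(days=7)

# Comparisons with NaT evaluate to False, so the first row is dropped here.
gaps = deltas[deltas > threshold]
print(gaps)

# Alternative without a Series: subtract two offset slices of the index.
# The result is a TimedeltaIndex with len(data.index) - 1 elements.
step = data.index[1:] - data.index[:-1]
print(step)
```

Each label in gaps marks the timestamp at which a gap ends, so the gap itself spans from the preceding index entry to that label.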
