Python pandas: plotting a time series with gaps

2022-01-11 00:00:00 python pandas time-series plot

Problem description


I am trying to plot a pandas DataFrame with a Timestamp index that has a time gap in it. Using pandas.plot() results in linear interpolation between the last Timestamp of the former segment and the first Timestamp of the next. I do not want linear interpolation, nor do I want empty space between the two date segments. Is there a way to do that?

Suppose we have a DataFrame with a Timestamp index:

>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> df = pd.DataFrame(np.random.randn(1000), index=pd.date_range('1/1/2000', periods=1000))
>>> df = df.cumsum()

Now let's take two time chunks of it and plot them:

>>> df = pd.concat([df['Jan 2000':'Aug 2000'], df['Jan 2001':'Aug 2001']])
>>> df.plot()
>>> plt.show()

The resulting plot has an interpolated line connecting the Timestamps on either side of the gap. I cannot figure out how to upload pictures on this machine, but these pictures from Google Groups show my problem (interpolated.jpg, no-interpolation.jpg and no gaps.jpg). I can recreate the first as shown above. The second is achievable by replacing all gap values with NaN (see also this question). How can I achieve the third version, where the time gap is omitted?

Solution

Try:

df.plot(x=df.index.astype(str))

You may want to customize ticks and tick labels.

EDIT

That works for me using pandas 0.17.1 and numpy 1.10.4.

All you really need is a way to convert the DatetimeIndex to another type which is not datetime-like. To get meaningful labels, I chose str. If x=df.index.astype(str) does not work with your combination of pandas/numpy/whatever, you can try other options:

df.index.to_series().dt.strftime('%Y-%m-%d')
df.index.to_series().apply(lambda x: x.strftime('%Y-%m-%d'))
...

I realized that resetting the index is not necessary so I removed that part.
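For readers on newer library versions, where passing an index object as x to df.plot() may not be accepted, the same idea (plot against something that is not datetime-like, then label the ticks with dates) can be sketched directly with matplotlib. This is only an illustration, not part of the original answer: the helper name plot_without_gaps, its date_format parameter, and the choice of about 8 ticks are assumptions made for the example.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter, MaxNLocator


def plot_without_gaps(df, ax=None, date_format="%Y-%m-%d"):
    """Plot each column against row position, so missing dates take no space.

    Hypothetical helper for illustration; not a pandas or matplotlib API.
    """
    if ax is None:
        _, ax = plt.subplots()

    # Integer positions 0..n-1 replace the datetime axis.
    positions = np.arange(len(df))
    for column in df.columns:
        ax.plot(positions, df[column].to_numpy(), label=str(column))

    def format_tick(value, _pos):
        # Map a tick position back to the date stored at that row.
        i = int(round(value))
        if 0 <= i < len(df):
            return df.index[i].strftime(date_format)
        return ""

    ax.xaxis.set_major_locator(MaxNLocator(nbins=8, integer=True))
    ax.xaxis.set_major_formatter(FuncFormatter(format_tick))
    ax.legend()
    return ax


# Same kind of data as in the question: a random walk with a gap in the index.
rng = np.random.default_rng(0)
full = pd.DataFrame(
    rng.standard_normal(1000),
    index=pd.date_range("2000-01-01", periods=1000),
).cumsum()
gapped = pd.concat([full.loc["2000-01":"2000-08"], full.loc["2001-01":"2001-08"]])

ax = plot_without_gaps(gapped)
ax.figure.autofmt_xdate()  # rotate the date labels so they do not overlap
```

Call plt.show() afterwards to display the figure. Note that the x axis is no longer a true time axis: equal distances correspond to equal numbers of rows, not equal time spans, which is exactly what removes the gap.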
