Comparing multiple years of data on a single plot in Python

2022-01-11 00:00:00 python pandas time-series matplotlib plot

Problem description

I have two time series from different years stored in pandas dataframes. For example:

import pandas as pd

# 'M' means month-end frequency; pandas 2.2 and later spell it 'ME'.
data15 = pd.DataFrame(
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
    index=pd.date_range(start='2015-01', end='2016-01', freq='M'),
    columns=['2015']
)
data16 = pd.DataFrame(
    [5, 4, 3, 2, 1],
    index=pd.date_range(start='2016-01', end='2016-06', freq='M'),
    columns=['2016']
)

I'm actually working with daily data, but if this question is answered sufficiently I can figure out the rest.

What I'm trying to do is overlay the plots of these different data sets onto a single plot from January through December to compare the differences between the years. I can do this by creating a "false" index for one of the datasets so they have a common year:

# Overwrite the 2016 dates with the matching 2015 dates so both share one year
data16.index = data15.index[:len(data16)]
ax = data15.plot()
data16.plot(ax=ax)

But I would like to avoid messing with the index if possible. Another problem with this method is that the year (2015) appears in the x-axis tick labels, which I don't want. Does anyone know of a better way to do this?

Solution

One way to do this would be to overlay a second, transparent axes on top of the first and plot the second dataframe in that one, but then you'd need to update the x-limits of both axes at the same time (similar to twinx). However, I think that's far more work and has a few more downsides: for example, you can no longer zoom interactively into a specific region unless you make sure both axes are linked via their x-limits. Really, the easiest approach is to account for that offset by "messing with the index".
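If the concern is changing the original dataframes, the index shift can be done on a copy instead. Below is a minimal sketch of that idea, not the only way to do it. The helper name `to_common_year`, the placeholder year 2000 (chosen because it is a leap year, so a Feb 29 entry in daily data stays valid), and the sample daily data are all assumptions made for illustration.

```python
import pandas as pd

def to_common_year(df, year=2000):
    """Return a copy of df with every index date moved into `year`.

    The default, 2000, is a leap year, so a Feb 29 entry stays valid.
    """
    out = df.copy()
    out.index = pd.DatetimeIndex([ts.replace(year=year) for ts in df.index])
    return out

# Hypothetical daily data standing in for the real series.
days15 = pd.date_range("2015-01-01", "2015-12-31", freq="D")
days16 = pd.date_range("2016-01-01", "2016-05-31", freq="D")
data15 = pd.DataFrame({"2015": list(range(len(days15)))}, index=days15)
data16 = pd.DataFrame({"2016": list(range(len(days16)))}, index=days16)

# The originals keep their real dates; only the copies share a year.
aligned15 = to_common_year(data15)
aligned16 = to_common_year(data16)
```

The aligned copies can then be drawn on one axes, while `data15` and `data16` keep their true dates for any other processing.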

As for the tick labels, you can easily change the format so that they don't show the year by specifying the x-tick format:

import matplotlib.dates as mdates
# %b = locale's abbreviated month name, %d = day of the month, e.g. "Jan 31"
month_day_fmt = mdates.DateFormatter('%b %d')
ax.xaxis.set_major_formatter(month_day_fmt)

Have a look at the matplotlib API example for specifying the date format.
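Putting the pieces together, here is a small self-contained sketch of two series that already share a placeholder year, drawn with year-free tick labels. The placeholder year 2000, the made-up values, and the `Agg` backend (used only so it runs without a display) are assumptions. The lines are drawn with matplotlib's own `ax.plot` so that the axis uses matplotlib date units, which is what `DateFormatter` expects. When plotting through pandas, passing `x_compat=True` to `DataFrame.plot` is meant to serve the same purpose.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Both series share the placeholder year 2000 (an assumption for illustration).
days = pd.date_range("2000-01-01", "2000-12-31", freq="D")
n16 = 152  # Jan 1 through May 31 in a leap year

fig, ax = plt.subplots()
ax.plot(days, list(range(len(days))), label="2015")
ax.plot(days[:n16], list(range(n16, 0, -1)), label="2016")

# One tick per month, labelled with the month abbreviation only (no year).
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b"))
ax.legend()

fig.canvas.draw()  # forces the tick labels to be generated
labels = [t.get_text() for t in ax.get_xticklabels()]
```

Swapping `"%b"` for `"%b %d"` gives the month-and-day labels from the snippet above. With daily data that format is probably more useful once you zoom in.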
