Computing a daily climatology with pandas in Python

2022-01-11 00:00:00 python pandas time-series

Problem description

I am trying to use pandas to compute a daily climatology. My code is:

    import random
    import pandas as pd

    # Daily dates covering 1950-1953 (1952 is a leap year)
    dates = pd.date_range('1950-01-01', '1953-12-31', freq='D')
    # One random integer in [0, 1000) per day
    rand_data = [int(1000 * random.random()) for i in range(len(dates))]
    cum_data = pd.Series(rand_data, index=dates)
    cum_data.to_csv('test.csv', sep=" ")

cum_data is a Series (not a data frame) indexed by daily dates from 1 January 1950 to 31 December 1953. I want to create a new vector of length 365 whose first element is the average of rand_data for 1 January across 1950, 1951, 1952 and 1953, whose second element is the same average for 2 January, and so on.

Any suggestions on how I can do this using pandas?

Solution

You can group by the day of the year, and then calculate the mean for these groups:

    cum_data.groupby(cum_data.index.dayofyear).mean()

However, you have to be aware of leap years, which cause problems with this approach: in a leap year, every date after 28 February has a dayofyear one higher than in other years, so the groups no longer line up with calendar dates. As an alternative, you can group by the month and the day:

    In [13]: cum_data.groupby([cum_data.index.month, cum_data.index.day]).mean()
    Out[13]:
    1   1     462.25
        2     631.00
        3     615.50
        4     496.00
    ...
    12  28    378.25
        29    427.75
        30    528.50
        31    678.50
    Length: 366, dtype: float64
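Note that the result has 366 entries, because 29 February (present only in 1952) gets its own group, averaged over a single value. The question asked for a vector of length 365, so here is a minimal sketch of one way to get there, assuming it is acceptable to simply discard 29 February before grouping. The seeded NumPy generator and the names doy_1951, doy_1952, no_feb29 and clim_365 are illustrative choices, not part of the original post.

```python
import numpy as np
import pandas as pd

# Synthetic daily series for 1950-1953; the seed is arbitrary (illustration only).
rng = np.random.default_rng(0)
dates = pd.date_range("1950-01-01", "1953-12-31", freq="D")
cum_data = pd.Series(rng.integers(0, 1000, size=len(dates)), index=dates)

# Why dayofyear misaligns: 1 March is day 60 in 1951 but day 61 in leap year 1952.
doy_1951 = pd.Timestamp("1951-03-01").dayofyear
doy_1952 = pd.Timestamp("1952-03-01").dayofyear

# Grouping by (month, day) keeps calendar dates aligned, but yields 366 groups.
clim = cum_data.groupby([cum_data.index.month, cum_data.index.day]).mean()

# Assumption: drop 29 February first, so that exactly 365 groups remain.
is_feb29 = (cum_data.index.month == 2) & (cum_data.index.day == 29)
no_feb29 = cum_data[~is_feb29]
clim_365 = no_feb29.groupby([no_feb29.index.month, no_feb29.index.day]).mean()

print(doy_1951, doy_1952, len(clim), len(clim_365))
```

If a plain array is preferred over a Series with a (month, day) index, clim_365.to_numpy() gives the 365 values in calendar order.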
