Get the monthly maximum of a daily DataFrame together with the corresponding index value

2022-01-11 00:00:00 python pandas time-series max group-by

Problem description

I have downloaded daily data from Yahoo Finance:

                    Open          High           Low         Close     Volume  
Date                                                                            
2016-01-04  10485.809570  10485.910156  10248.580078  10283.440430  116249000   
2016-01-05  10373.269531  10384.259766  10173.519531  10310.099609   82348000   
2016-01-06  10288.679688  10288.679688  10094.179688  10214.019531   87751700   
2016-01-07  10144.169922  10145.469727   9810.469727   9979.849609  124188100   
2016-01-08  10010.469727  10122.459961   9849.339844   9849.339844   95672200   
...
2016-02-23   9503.120117   9535.120117   9405.219727   9416.769531   87240700   
2016-02-24   9396.480469   9415.330078   9125.190430   9167.799805   99216000   
2016-02-25   9277.019531   9391.309570   9199.089844   9331.480469          0   
2016-02-26   9454.519531   9576.879883   9436.330078   9513.299805   95662100   
2016-02-29   9424.929688   9498.570312   9332.419922   9495.400391   90978700   

I would like to find the maximum closing price for each month, along with the date on which that closing price occurred.

With a groupby, dfM = df['Close'].groupby(df.index.month).max() returns the monthly maximums, but I lose the daily index position.

   grouped by month 
1      10310.099609
2       9757.879883

Is there a good way to keep the index?

I am looking for a result like this:

            grouped by month 
2016-01-05      10310.099609
2016-02-01       9757.879883


Solution

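You can get the max value per month using pd.Grouper together with groupby. (pd.Grouper replaces TimeGrouper, which has since been removed from pandas. On pandas 2.2 and later, the month-end frequency alias is 'ME' rather than 'M'.)

>>> import pandas as pd
>>> from pandas_datareader.data import DataReader  # pandas.io.data moved to the pandas-datareader package
>>> aapl = DataReader('AAPL', data_source='yahoo', start='2015-6-1')
>>> aapl.groupby(pd.Grouper(freq='M')).Close.max()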
Date
2015-06-30    130.539993
2015-07-31    132.070007
2015-08-31    119.720001
2015-09-30    116.410004
2015-10-31    120.529999
2015-11-30    122.570000
2015-12-31    119.029999
2016-01-31    105.349998
2016-02-29     98.120003
2016-03-31    100.529999
Freq: M, Name: Close, dtype: float64
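One reason to group on calendar months rather than on df.index.month, as the question does, is that the month number alone merges the same month from different years. The following is a minimal sketch on made-up data (the values and dates are hypothetical, not taken from the question), and it uses to_period("M") as one way to form calendar-month groups:

```python
import pandas as pd

# Hypothetical data: two January observations in each of two different years.
s = pd.Series(
    [100.0, 105.0, 90.0, 95.0],
    index=pd.to_datetime(["2015-01-05", "2015-01-20", "2016-01-04", "2016-01-19"]),
)

# Grouping by month number puts both Januaries into one group.
by_month_number = s.groupby(s.index.month).max()

# Grouping by monthly period keeps January 2015 and January 2016 apart.
by_calendar_month = s.groupby(s.index.to_period("M")).max()

print(by_month_number)
print(by_calendar_month)
```

For data that spans a single year, as in the question's sample, both groupings give the same groups.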

Using idxmax returns the date on which each monthly maximum occurred.

>>> aapl.groupby(pd.Grouper(freq='M')).Close.idxmax()
Date
2015-06-30   2015-06-01
2015-07-31   2015-07-20
2015-08-31   2015-08-10
2015-09-30   2015-09-16
2015-10-31   2015-10-29
2015-11-30   2015-11-03
2015-12-31   2015-12-04
2016-01-31   2016-01-04
2016-02-29   2016-02-17
2016-03-31   2016-03-01
Name: Close, dtype: datetime64[ns]
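To get the shape of result the question asks for (the maximum close indexed by its original daily date), the dates returned by idxmax can be used to select from the original series. This is a sketch under assumptions: it uses only the ten rows visible in the question, so the February maximum here differs from the one in the question's expected output, which comes from rows that were elided. The variable names are illustrative.

```python
import pandas as pd

# Only the rows visible in the question; the elided rows are not reproduced.
close = pd.Series(
    [10283.440430, 10310.099609, 10214.019531, 9979.849609, 9849.339844,
     9416.769531, 9167.799805, 9331.480469, 9513.299805, 9495.400391],
    index=pd.to_datetime([
        "2016-01-04", "2016-01-05", "2016-01-06", "2016-01-07", "2016-01-08",
        "2016-02-23", "2016-02-24", "2016-02-25", "2016-02-26", "2016-02-29",
    ]),
    name="Close",
)
close.index.name = "Date"

# For each calendar month, find the date on which the maximum close occurs.
max_dates = close.groupby(close.index.to_period("M")).idxmax()

# Selecting those dates from the original series keeps the daily index.
monthly_max = close.loc[max_dates.to_numpy()]
print(monthly_max)
```

The same selection applied to the full DataFrame, df.loc[...], would return the complete rows (Open, High, Low, Close, Volume) for those dates.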

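To get the results side by side, use named aggregation (passing a renaming dict to agg on a grouped Series is no longer supported in pandas 1.0 and later):

>>> aapl.groupby(pd.Grouper(freq='M')).Close.agg(**{'max price': 'max', 'max date': 'idxmax'})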
             max price   max date
Date                             
2015-06-30  130.539993 2015-06-01
2015-07-31  132.070007 2015-07-20
2015-08-31  119.720001 2015-08-10
2015-09-30  116.410004 2015-09-16
2015-10-31  120.529999 2015-10-29
2015-11-30  122.570000 2015-11-03
2015-12-31  119.029999 2015-12-04
2016-01-31  105.349998 2016-01-04
2016-02-29   98.120003 2016-02-17
2016-03-31  100.529999 2016-03-01
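For readers without access to the Yahoo download, the side-by-side table can be reproduced on a small in-memory series. This is a sketch only: it uses a few rows copied from the question, groups with to_period("M") rather than a frequency alias, and the column names max_price and max_date are illustrative choices.

```python
import pandas as pd

# A few rows copied from the question's sample data.
close = pd.Series(
    [10283.440430, 10310.099609, 10214.019531,
     9331.480469, 9513.299805, 9495.400391],
    index=pd.to_datetime([
        "2016-01-04", "2016-01-05", "2016-01-06",
        "2016-02-25", "2016-02-26", "2016-02-29",
    ]),
    name="Close",
)

# Named aggregation: each keyword becomes a column of the result.
summary = close.groupby(close.index.to_period("M")).agg(
    max_price="max",
    max_date="idxmax",
)
print(summary)
```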
