How to split a pandas time series by NaN values

2022-01-11 00:00:00 python numpy pandas time-series split

Problem description

I have a pandas TimeSeries that looks like this:

    2007-02-06 15:00:00    0.780
    2007-02-06 16:00:00    0.125
    2007-02-06 17:00:00    0.875
    2007-02-06 18:00:00      NaN
    2007-02-06 19:00:00    0.565
    2007-02-06 20:00:00    0.875
    2007-02-06 21:00:00    0.910
    2007-02-06 22:00:00    0.780
    2007-02-06 23:00:00      NaN
    2007-02-07 00:00:00      NaN
    2007-02-07 01:00:00    0.780
    2007-02-07 02:00:00    0.580
    2007-02-07 03:00:00    0.880
    2007-02-07 04:00:00    0.791
    2007-02-07 05:00:00      NaN

I would like to split the pandas TimeSeries every time one or more NaN values occur in a row. The goal is to end up with separate events:

    Event1:
    2007-02-06 15:00:00    0.780
    2007-02-06 16:00:00    0.125
    2007-02-06 17:00:00    0.875

    Event2:
    2007-02-06 19:00:00    0.565
    2007-02-06 20:00:00    0.875
    2007-02-06 21:00:00    0.910
    2007-02-06 22:00:00    0.780

I could loop through every row, but is there a smarter way of doing this?

Solution

You can use numpy.split and then filter the resulting list. Here is one example, assuming the data is in a DataFrame df and the column with the values is labeled "value":

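    import numpy as np

    # positions of the NaN rows; these are the split points
    nan_positions = np.where(df["value"].isna())[0]
    # split the row positions (not the DataFrame itself), then select each chunk
    events = [df.iloc[idx] for idx in np.split(np.arange(len(df)), nan_positions)]
    # removing NaN entries
    events = [ev[ev["value"].notna()] for ev in events]
    # removing empty DataFrames
    events = [ev for ev in events if not ev.empty]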

You will end up with a list containing all the events, separated wherever NaN values occurred.
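For comparison, here is a minimal sketch of a different technique, not the numpy.split approach described above: label each run of valid values with a running count of the NaNs seen so far, then group by that label. The helper name split_on_nan and the variable names are made up for this illustration, and the data is the sample from the question rebuilt as a plain Series rather than a DataFrame.

```python
import numpy as np
import pandas as pd

# Sample data from the question, rebuilt as an hourly Series (assumed layout).
index = pd.date_range("2007-02-06 15:00:00", periods=15, freq=pd.Timedelta(hours=1))
values = [0.780, 0.125, 0.875, np.nan, 0.565, 0.875, 0.910, 0.780,
          np.nan, np.nan, 0.780, 0.580, 0.880, 0.791, np.nan]
ts = pd.Series(values, index=index)


def split_on_nan(series):
    """Return a list of sub-series, one per run of consecutive non-NaN values.

    Hypothetical helper for illustration only.
    """
    is_nan = series.isna()
    # Every NaN bumps the counter, so all valid values between two gaps
    # share the same label; runs of several NaNs just skip some labels.
    run_label = is_nan.cumsum()
    valid = series[~is_nan]
    return [chunk for _, chunk in valid.groupby(run_label[~is_nan])]


events = split_on_nan(ts)
print([len(ev) for ev in events])  # → [3, 4, 4]
print(events[0])
```

Because the NaN rows are dropped before grouping, no empty chunks are produced, so the two filtering passes are not needed in this variant.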
