Pybrain time series prediction using LSTM recurrent nets

Question

I have a question about using pybrain for time series regression. I plan to use the LSTM layer in pybrain to train on and predict a time series.

I found example code at the link below:

Request for example: Recurrent neural network for predicting next value in a sequence

In the example above, the network is able to predict a sequence after being trained. The issue is that the network takes in all the sequential data at once through the input layer. For example, if each training sample has 10 features, all 10 features are fed simultaneously into 10 input nodes.

As I understand it, this is no longer time series prediction, since there is no difference in the time at which each feature is fed into the network. Am I right? Please correct me if I am wrong on this.

Therefore, what I am trying to achieve is a recurrent network that has only ONE input node and ONE output node. The input node is where all the time series data will be fed in sequentially, one value per time step. The network will be trained to reproduce the input at the output node.

Could you please suggest how to construct the network I described? Thank you very much in advance.


Answer

You can train an LSTM network with a single input node and a single output node to do time series prediction like this:

First, as good practice, let's use Python 3's print function:

from __future__ import print_function

Then, make a simple time series:

data = [1] * 3 + [2] * 3
data *= 3
print(data)

[1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 2, 2]

Now put this time series into a supervised dataset, where the target for each sample is the next sample:

from pybrain.datasets import SequentialDataSet
from itertools import cycle

ds = SequentialDataSet(1, 1)
for sample, next_sample in zip(data, cycle(data[1:])):
    ds.addSample(sample, next_sample)
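
To make the pairing concrete, here is a small standalone sketch (no PyBrain needed) that lists the (sample, next sample) pairs the loop above feeds to the dataset. The name `pairs` is introduced here only for illustration. Note that `cycle(data[1:])` wraps around to `data[1]` rather than `data[0]` for the final pair; in this toy series both values are 1, so the result is the same.

```python
from itertools import cycle

data = [1] * 3 + [2] * 3
data *= 3

# Pair every sample with the one that follows it; cycle() supplies a
# wrap-around target for the last sample so no sample is dropped.
pairs = list(zip(data, cycle(data[1:])))

print(pairs[:4])   # the first few (sample, next_sample) pairs
print(pairs[-1])   # the wrap-around pair for the final sample
```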

Build a simple LSTM network with 1 input node, 5 LSTM cells and 1 output node:

from pybrain.tools.shortcuts import buildNetwork
from pybrain.structure.modules import LSTMLayer

net = buildNetwork(1, 5, 1, 
                   hiddenclass=LSTMLayer, outputbias=False, recurrent=True)
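
As a rough illustration of why a single input node is enough for a recurrent network, the sketch below feeds one value per time step into a plain tanh recurrent unit. This is not PyBrain's LSTMLayer, and the weights `w_in` and `w_rec` are arbitrary values chosen for illustration. The point is only that the hidden state carries history forward, so the same input value can produce different outputs at different time steps.

```python
import math

def run_recurrent(sequence, w_in=1.0, w_rec=0.5):
    """Feed one value per time step; the hidden state h carries history forward."""
    h = 0.0
    outputs = []
    for x in sequence:
        # The new state depends on the current input AND the previous state.
        h = math.tanh(w_in * x + w_rec * h)
        outputs.append(h)
    return outputs

# The input is 1 at every step, yet the outputs differ because the state differs.
outs = run_recurrent([1, 1, 1])
print(outs)
```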

Train the network:

from pybrain.supervised import RPropMinusTrainer
from sys import stdout

trainer = RPropMinusTrainer(net, dataset=ds)
train_errors = [] # save errors for plotting later
EPOCHS_PER_CYCLE = 5
CYCLES = 100
EPOCHS = EPOCHS_PER_CYCLE * CYCLES
for i in range(CYCLES):  # range works on both Python 2 and Python 3
    trainer.trainEpochs(EPOCHS_PER_CYCLE)
    train_errors.append(trainer.testOnData())
    epoch = (i+1) * EPOCHS_PER_CYCLE
    print(" epoch {}/{}".format(epoch, EPOCHS), end="")
    stdout.flush()

print()
print("final error =", train_errors[-1])

Plot the errors (note that in this simple toy example, we are testing and training on the same dataset, which is of course not what you'd do for a real project!):

import matplotlib.pyplot as plt

plt.plot(range(0, EPOCHS, EPOCHS_PER_CYCLE), train_errors)
plt.xlabel('epoch')
plt.ylabel('error')
plt.show()
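
For a real project, one common alternative to testing on the training data is to hold out the later part of the series. The sketch below is a minimal, hypothetical helper (`chronological_split` is not part of PyBrain) that splits a series in time order, so the test part always comes after the training part.

```python
def chronological_split(series, train_fraction=0.75):
    """Split a series into an earlier training part and a later test part."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]

data = [1, 1, 1, 2, 2, 2] * 3
train, test = chronological_split(data)
print(len(train), len(test))
```

Each part could then be loaded into its own SequentialDataSet in the same way as above, with the trainer built on the training part and `testOnData` called with the held-out part.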

Now ask the network to predict the next sample:

for sample, target in ds.getSequenceIterator(0):
    print("               sample = %4.1f" % sample)
    print("predicted next sample = %4.1f" % net.activate(sample))
    print("   actual next sample = %4.1f" % target)
    print()

(The code above is based on example_rnn.py and the PyBrain documentation.)
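
If you want more than one step ahead, one possible approach is to feed each prediction back in as the next input. The sketch below is a hypothetical helper, not part of PyBrain: `forecast` takes any one-step function `step`, and a trivial stand-in is used here in place of a trained network so that the example runs on its own.

```python
def forecast(step, history, n_steps):
    """Warm up on known history, then feed each prediction back in as input."""
    if not history:
        raise ValueError("history must contain at least one value")
    prediction = None
    for value in history:
        # Each call advances the model by one time step.
        prediction = float(step(value))
    predictions = []
    for _ in range(n_steps):
        predictions.append(prediction)
        prediction = float(step(prediction))
    return predictions

# Stand-in for a trained network: always predicts "input plus one".
demo = forecast(lambda x: x + 1, [1, 2, 3], 3)
print(demo)  # [4.0, 5.0, 6.0]
```

With the network trained above, one might call `net.reset()` first and pass `lambda x: net.activate([x])[0]` as `step`. That wiring is an assumption and has not been tested here.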
