Finding a moving average from data points in Python

2022-01-09 00:00:00 python plot average sum

Question

I am playing with Python a bit again, and I found a neat book with examples. One of the examples is to plot some data. I have a .txt file with two columns of data. I plotted the data just fine, but the exercise then says: Modify your program further to calculate and plot the running average of the data, defined by:

$Y_k=\frac{1}{2r}\sum_{m=-r}^{r} y_{k+m}$

where r=5 in this case (and y_k is the second column in the data file). Have the program plot both the original data and the running average on the same graph.

So far I have this:

from pylab import plot, ylim, xlim, show, xlabel, ylabel
from numpy import linspace, loadtxt

data = loadtxt("sunspots.txt", float)
r=5.0

x = data[:,0]
y = data[:,1]

plot(x,y)
xlim(0,1000)
xlabel("Months since Jan 1749.")
ylabel("No. of Sun spots")
show()

So how do I calculate the sum? In Mathematica it's simple, since it's symbolic manipulation (Sum[i, {i,0,10}], for example), but how do I calculate a sum in Python that takes every ten points in the data, averages them, and does so until the end of the data?

I looked at the book, but found nothing that would explain this.

heltonbiker's code did the trick ^^ :D

from pylab import plot, xlim, show, xlabel, ylabel, grid
from numpy import loadtxt
import numpy

data = loadtxt("sunspots.txt", float)

def movingaverage(interval, window_size):
    window= numpy.ones(int(window_size))/float(window_size)
    return numpy.convolve(interval, window, 'same')

x = data[:,0]
y = data[:,1]


plot(x,y,"k.")
y_av = movingaverage(y, 10)
plot(x, y_av,"r")
xlim(0,1000)
xlabel("Months since Jan 1749.")
ylabel("No. of Sun spots")
grid(True)
show()

And I got this:

Thank you very much ^^ :)


Answer

Before reading this answer, bear in mind that there is another answer below, from Roman Kh, which uses numpy.cumsum and is much faster than this one.
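For readers who want to see the cumulative-sum idea, here is a minimal sketch. It is not Roman Kh's code; the function name moving_average_cumsum is made up for illustration, and it returns only the positions where a full window fits (the same values numpy.convolve gives in 'valid' mode).

```python
import numpy as np

def moving_average_cumsum(values, window_size):
    """Moving average via a cumulative sum (hypothetical helper).

    Returns len(values) - window_size + 1 averages, one per full window.
    """
    values = np.asarray(values, dtype=float)
    # Prepend 0 so that csum[i] holds the sum of the first i samples.
    csum = np.cumsum(np.insert(values, 0, 0.0))
    # Each window sum is the difference of two cumulative sums.
    return (csum[window_size:] - csum[:-window_size]) / window_size

y = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
print(moving_average_cumsum(y, 3))  # [2. 3. 4. 5.]
```

The speed-up comes from doing one pass over the data instead of summing window_size samples at every position. For very long series the running total can accumulate floating-point error, so this is a sketch rather than a drop-in replacement.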


One common way to apply a moving/sliding average (or any other sliding window function) to a signal is by using numpy.convolve().

def movingaverage(interval, window_size):
    window = numpy.ones(int(window_size))/float(window_size)
    return numpy.convolve(interval, window, 'same')

Here, interval is the array of values to smooth (your y array, not x), and window_size is the number of samples to consider. The window is centered on each sample, so it takes samples before and after the current sample in order to calculate the average. Your code would become:

plot(x,y)
xlim(0,1000)

y_av = movingaverage(y, r)  # smooth the y values; r is the window size in samples
plot(x, y_av)

xlabel("Months since Jan 1749.")
ylabel("No. of Sun spots")
show()
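One thing worth knowing about the 'same' mode: near the ends of the array the window hangs over the edge, and numpy.convolve treats the missing samples as zeros, so the first and last few averages are pulled toward zero. A small sketch with made-up constant data (not the sunspot file) makes this visible:

```python
import numpy as np

def movingaverage(interval, window_size):
    window = np.ones(int(window_size)) / float(window_size)
    return np.convolve(interval, window, 'same')

# Constant test signal: every true average should be 3.
y = np.array([3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0])
window_size = 3

smoothed = movingaverage(y, window_size)
print(smoothed)  # [2. 3. 3. 3. 3. 3. 2.]

# Keep only the positions where the full window fits inside the data.
half = window_size // 2
interior = smoothed[half:len(y) - half]
print(interior)  # [3. 3. 3. 3. 3.]
```

If the edge values matter for your plot, you can either trim them as above or pass 'valid' instead of 'same' to numpy.convolve, in which case the x array has to be trimmed to the same length before plotting.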

Hope this helps!
