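Averaging data in bins
Problem description
I have two lists, one of depths and one of chlorophyll values, that correspond to each other element by element. I want to average the chlorophyll data over every 0.5 m of depth.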
chl = [0.4,0.1,0.04,0.05,0.4,0.2,0.6,0.09,0.23,0.43,0.65,0.22,0.12,0.2,0.33]
depth = [0.1,0.3,0.31,0.44,0.49,1.1,1.145,1.33,1.49,1.53,1.67,1.79,1.87,2.1,2.3]
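The number of samples in each depth bin is not always the same, and the depths do not always start at 0.0 or fall on 0.5 m intervals. The chlorophyll values always line up with the depth values, though. The averaged chlorophyll values must not be sorted in ascending order; they have to stay in depth order. Both lists are very long, so I can't do this by hand.
How can I make 0.5 m depth bins that contain the averaged chlorophyll data?
Goal: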
depth = [0.5,1.0,1.5,2.0,2.5]
chlorophyll = [avg1,avg2,avg3,avg4,avg5]
For example:
avg1 = np.mean([0.4,0.1,0.04,0.05,0.4])
Solution
One way is to use numpy.digitize to assign each depth to a bin,
then use a dictionary or list comprehension to compute the mean for each bin.
import numpy as np
chl = np.array([0.4,0.1,0.04,0.05,0.4,0.2,0.6,0.09,0.23,0.43,0.65,0.22,0.12,0.2,0.33])
depth = np.array([0.1,0.3,0.31,0.44,0.49,1.1,1.145,1.33,1.49,1.53,1.67,1.79,1.87,2.1,2.3])
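bins = np.array([0,0.5,1.0,1.5,2.0,2.5])  # bin edges in metres
# Pair each sample's bin index (1-based, from np.digitize) with its chlorophyll value
A = np.vstack((np.digitize(depth, bins), chl)).T
# Mean chlorophyll per occupied bin, keyed by the bin's upper edge
res = {bins[int(i)]: np.mean(A[A[:, 0] == i, 1]) for i in np.unique(A[:, 0])}
# approximately {0.5: 0.198, 1.5: 0.28, 2.0: 0.355, 2.5: 0.265}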
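Or, in the exact format you asked for:
# Start at 1: index 0 would only hold depths below bins[0], which is not one of the target bins
res_lst = [np.mean(A[A[:, 0] == i, 1]) for i in range(1, len(bins))]
# approximately [0.198, nan, 0.28, 0.355, 0.265]
# The 0.5-1.0 bin is empty, so its mean is nan (NumPy emits a RuntimeWarning for it)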
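Since the lists are described as very long, a vectorised variant may be worth considering. The following is only a sketch and not part of the original answer: bin_means is a hypothetical helper name, and it assumes the bin edges are passed in explicitly, that samples outside the edges should be ignored, and that empty bins should be reported as nan. It combines numpy.digitize with numpy.bincount to get per-bin sums and counts without a Python loop.

```python
import numpy as np

def bin_means(depth, values, edges):
    """Hypothetical helper: mean of `values` in each [edges[i], edges[i+1]) bin.

    Returns (upper_edges, means); bins without samples get nan.
    """
    depth = np.asarray(depth, dtype=float)
    values = np.asarray(values, dtype=float)
    edges = np.asarray(edges, dtype=float)
    n_bins = len(edges) - 1

    # 0-based bin index per sample; -1 or n_bins means the sample is out of range
    idx = np.digitize(depth, edges) - 1
    inside = (idx >= 0) & (idx < n_bins)

    # Per-bin sum of values and per-bin sample count
    sums = np.bincount(idx[inside], weights=values[inside], minlength=n_bins)
    counts = np.bincount(idx[inside], minlength=n_bins)

    # Divide only where a bin has samples; leave nan elsewhere
    means = np.full(n_bins, np.nan)
    np.divide(sums, counts, out=means, where=counts > 0)
    return edges[1:], means

chl = [0.4,0.1,0.04,0.05,0.4,0.2,0.6,0.09,0.23,0.43,0.65,0.22,0.12,0.2,0.33]
depth = [0.1,0.3,0.31,0.44,0.49,1.1,1.145,1.33,1.49,1.53,1.67,1.79,1.87,2.1,2.3]
edges = np.array([0, 0.5, 1.0, 1.5, 2.0, 2.5])

upper, means = bin_means(depth, chl, edges)
print(upper)  # upper edge of each bin: 0.5, 1.0, 1.5, 2.0, 2.5
print(means)  # roughly 0.198, nan, 0.28, 0.355, 0.265
```

The results stay in depth order because they are indexed by bin rather than sorted by value, and the empty 0.5-1.0 bin yields nan without triggering a warning.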