Calculating root squared error of an xarray dataset

2022-01-21 00:00:00 python python-3.x dataset python-xarray

Question

I have an xarray dataset monthly_data containing only January data, with the following info:

lat: float64 (192)
lon: float64 (288)
time: object (1200)(monthly data)

Data Variables:
tas: (time, lat, lon)[[[45,78,...],...]...]

I have a ground truth dataset grnd_trth, which holds the true data for January:

Coordinates:
lat: float64 (192)
lon: float64 (288)

Data Variables:
tas: (lat, lon)

Now I want to calculate the root squared error for each month of monthly_data with respect to grnd_trth. I tried using loops and I guess it works fine. Here's my attempt:

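rms = []

for i in range(1200):
  err = 0
  # accumulate the squared difference over every (lat, lon) pixel
  for j in (grnd_trth.tas[0] - monthly_data.tas[i]).values:
    for k in j:
      err += k**2
  # parenthesise the exponent: err**1/2 would evaluate as (err**1)/2
  rms.append(err**(1/2))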

I just want to know whether there is a more efficient way, or a direct function, to do this.

Output of monthly_data.tas:

xarray.DataArray 'tas': (time: 1200 lat: 192 lon: 288)
array([[[45,46,45,4....],....]...]

Coordinates:
lat:
array([-90. , -89.75,...])

lon:
array([0., 1.25.,.... ])

time:
array([cftime.DatetimeNoLeap(0001-01-15 12:00:00),
       cftime.DatetimeNoLeap(0002-01-15 12:00:00),
       cftime.DatetimeNoLeap(0003-01-15 12:00:00), ...,
       cftime.DatetimeNoLeap(1198-01-15 12:00:00),
       cftime.DatetimeNoLeap(1199-01-15 12:00:00),
       cftime.DatetimeNoLeap(1200-01-15 12:00:00)]

Output of grnd_trth.tas:

xarray.DataArray 'tas': (lat: 192 lon: 288)
array([[45,46,45,4....],....]

Coordinates:
lat:
array([-90. , -89.75,...])

lon:
array([0., 1.25.,.... ])

time:
array([cftime.DatetimeNoLeap(0001-01-15 12:00:00)]

But when I just use .values, it only returns the array of tas values!


Solution

In terms of doing this in a more 'efficient' way, there are two things to point out.

1) You can do arithmetic operations directly on xarray objects, e.g.

for time_idx in range(1200):
    # For each time idx, find the root squared error at
    # each pixel between grnd_trth and monthly_data

    err2 = (grnd_trth.tas - monthly_data.tas[time_idx,...])**2
    err  = err2**(1/2)
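The same arithmetic also works without the loop, because xarray matches operands by dimension name and broadcasts the 2-D ground truth along time. The following is a minimal sketch on small synthetic arrays; the names monthly_tas and truth_tas, the shapes, and the integer time coordinate are illustrative assumptions, not the real data.

```python
import numpy as np
import xarray as xr

# Small synthetic stand-ins for the real data (shapes are illustrative only).
rng = np.random.default_rng(0)
lat = np.linspace(-90, 90, 4)
lon = np.linspace(0, 360, 5, endpoint=False)

monthly_tas = xr.DataArray(
    rng.normal(size=(3, 4, 5)),
    dims=("time", "lat", "lon"),
    coords={"time": [1, 2, 3], "lat": lat, "lon": lon},
    name="tas",
)
truth_tas = xr.DataArray(
    rng.normal(size=(4, 5)),
    dims=("lat", "lon"),
    coords={"lat": lat, "lon": lon},
    name="tas",
)

# One subtraction covers every time step: truth_tas is broadcast along "time".
diff = monthly_tas - truth_tas
sq_err = diff ** 2

print(diff.dims)     # dimension names of the result
print(sq_err.shape)  # one squared error per (time, lat, lon) cell
```

Broadcasting is by name rather than by position, so it does not matter that the ground truth has no time axis, as long as its lat and lon coordinates match those of the monthly data.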

2) There's a method, .sum(), which sums all the elements in an array, so you don't need the for k in j: line to sum over the pixels, e.g.

rms = []

for time_idx in range(1200):
    # same two lines as before...

    # sum over every pixel and extract the value from the DataArray
    err_tot = err.sum().values

    # Append this month's total to the list
    rms.append(err_tot)
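One detail worth checking is the order of the square root and the sum. Taking the root at each pixel and then summing gives the sum of absolute differences, whereas the loop in the question sums the squares first and takes the root once. The sketch below computes both per time step by reducing over the lat and lon dimensions, and cross-checks the second against an explicit loop. The arrays are synthetic and the variable names are assumptions for illustration.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(1)
monthly_tas = xr.DataArray(rng.normal(size=(3, 4, 5)), dims=("time", "lat", "lon"))
truth_tas = xr.DataArray(rng.normal(size=(4, 5)), dims=("lat", "lon"))

sq_err = (monthly_tas - truth_tas) ** 2

# Sum of per-pixel absolute differences (root first, then sum).
abs_err_sum = np.sqrt(sq_err).sum(dim=["lat", "lon"])

# Root of the summed squares (sum first, then root), one value per time step.
root_sq_err = np.sqrt(sq_err.sum(dim=["lat", "lon"]))

# Cross-check against an explicit loop over time steps.
loop_result = [
    float(((monthly_tas[i] - truth_tas).values ** 2).sum() ** 0.5)
    for i in range(monthly_tas.sizes["time"])
]

print(root_sq_err.values)
print(loop_result)
```

Passing dim=["lat", "lon"] leaves the time dimension in place, so a single call returns all per-month values at once.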

Now, one thing to point out here is that by simply extracting the values from the DataArray, you lose all of the metadata about the array! So this isn't really best practice, but for now I think this answers your question?
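To keep the metadata, one option is to leave the result as a DataArray and only convert to NumPy at the point where a plain array is needed. This is a sketch under assumptions: the integer years stand in for the cftime time axis, and the result name and the description attribute are hypothetical labels chosen for illustration.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(2)
years = [1, 2, 3]  # hypothetical stand-in for the cftime time axis
monthly = xr.Dataset(
    {"tas": (("time", "lat", "lon"), rng.normal(size=(3, 4, 5)))},
    coords={"time": years},
)
truth = xr.Dataset({"tas": (("lat", "lon"), rng.normal(size=(4, 5)))})

# Reduce over space only; the result is a 1-D DataArray indexed by time.
rse = np.sqrt(((monthly.tas - truth.tas) ** 2).sum(dim=["lat", "lon"]))
rse = rse.rename("tas_root_sq_err")
rse.attrs["description"] = "root of summed squared error vs. ground truth"

# The time coordinate survives, so results can be selected by label.
print(rse.sel(time=2).item())

# Convert to plain NumPy only where it is actually needed.
rse_values = rse.values
```

Because the time coordinate is still attached, each error value stays paired with the month it belongs to, which a bare Python list does not guarantee.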
