Calculating the mean of two xarray datasets is very slow. Why?

Question

I have two xarray datasets that contain spatial temperature data. Each has lat/lon coordinates and a temperature data variable, and the coordinates match. I want to combine the two datasets by averaging them. They contain NaNs, so the averaging needs to skip them.

The way I am doing it now is:

(xr1+xr2)/2

which is slow. I also tried:

np.nanmean(np.dstack((xr1.temp.values,xr2.temp.values)),2)

The np.nanmean operation itself is very fast. The problem is that .values is very slow. I also tried assigning the data variable from one dataset to the other and then taking the mean, but that is also very slow. I also tried xarray's merge, but it doesn't work well with NaNs.

Please share any useful ideas.

Answer 1

Score: 3

Thanks to @Brian61354270 in the comments, I realized that the xarray datasets were linked to a URL containing the data, and the data had not yet been loaded onto my computer, so accessing .values had to fetch it first. After calling .load() on both datasets, the numerical operations were very fast.
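A minimal sketch of that fix, for anyone who lands here. The original datasets are not available, so two small in-memory datasets stand in for them; make_dataset, the grid values, and the "source" dimension name are assumptions made for illustration. On in-memory data .load() is effectively a no-op, but with remotely backed data it should pull the values into memory once, before any arithmetic.

```python
import numpy as np
import xarray as xr


def make_dataset(values):
    """Hypothetical stand-in for the real data: `temp` on a 2x2 lat/lon grid."""
    return xr.Dataset(
        {"temp": (("lat", "lon"), np.asarray(values, dtype=float))},
        coords={"lat": [10.0, 20.0], "lon": [100.0, 110.0]},
    )


xr1 = make_dataset([[1.0, np.nan], [3.0, np.nan]])
xr2 = make_dataset([[3.0, 4.0], [np.nan, np.nan]])

# Pull the data into memory up front (this is the step from the answer).
xr1 = xr1.load()
xr2 = xr2.load()

# Plain arithmetic propagates NaN: a cell is NaN if either input is NaN.
plain_mean = (xr1 + xr2) / 2

# Stack along a new dimension and average with skipna=True so a cell
# is NaN only when both inputs are NaN.
stacked = xr.concat([xr1, xr2], dim="source")
nan_aware_mean = stacked.mean(dim="source", skipna=True)

print(plain_mean["temp"].values)
print(nan_aware_mean["temp"].values)
```

Note that (xr1 + xr2) / 2 from the question does not skip NaNs, so the concat-then-mean form (or the np.nanmean approach on the loaded arrays) is closer to what was asked for.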

huangapple
  • Published on July 27, 2023 at 23:53:15
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76781503.html