Calculating the mean of two xarray datasets is very slow. Why?
Question
I have two xarray datasets that contain spatial temperature data. Each has lat and lon coordinates and a temperature data variable, and the coordinates match. I want to combine the two datasets by averaging them. They contain NaNs, so the averaging needs to skip them.
The way I am doing it now is:
(xr1 + xr2) / 2
which is slow. I also tried:
np.nanmean(np.dstack((xr1.temp.values, xr2.temp.values)), 2)
The np.nanmean operation itself is very fast. The problem is that .values is very slow. I also tried assigning a data variable from one dataset to the other and then taking the mean, but that is also very slow. I also tried xarray's merge, but it doesn't work well with NaNs.
Please share any useful ideas.
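Apart from speed, the two approaches in the question do not give the same result when NaNs are present. The sketch below uses small made-up NumPy arrays as stand-ins for xr1.temp and xr2.temp (the values are hypothetical, not from the question's data) to show that plain addition propagates NaN, while the np.dstack plus np.nanmean version skips it.

```python
import numpy as np

# Hypothetical 2x2 temperature grids standing in for xr1.temp and xr2.temp.
a = np.array([[10.0, np.nan], [14.0, 16.0]])
b = np.array([[12.0, 20.0], [np.nan, 18.0]])

# Plain arithmetic: a cell that is NaN in either grid becomes NaN in the result.
plain_mean = (a + b) / 2

# Stack the grids along a third axis, then average over that axis while skipping NaNs.
nan_mean = np.nanmean(np.dstack((a, b)), axis=2)

print(plain_mean)
print(nan_mean)
```

So if the average really has to skip NaNs, (xr1 + xr2) / 2 is not a drop-in replacement for the nanmean version, regardless of how fast either one runs.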
Answer 1
Score: 3
Thanks to @Brian61354270 in the comments, I realized that the xarray datasets were linked to a URL containing the data, but the data had not yet been loaded onto my computer. After calling .load() on both datasets, the numerical operations were very fast.
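A minimal sketch of that fix, combined with a NaN-skipping average done in xarray itself. The datasets here are built in memory with made-up values because the original data source is not given; the make_ds helper and the "source" dimension name are assumptions for illustration. With datasets opened lazily from a remote source, load() is the step that fetches the data; on in-memory data like this it changes nothing.

```python
import numpy as np
import xarray as xr


def make_ds(values):
    """Build a small hypothetical dataset with lat/lon coordinates and a temp variable."""
    return xr.Dataset(
        {"temp": (("lat", "lon"), np.array(values, dtype=float))},
        coords={"lat": [0.0, 1.0], "lon": [10.0, 20.0]},
    )


xr1 = make_ds([[10.0, np.nan], [14.0, 16.0]])
xr2 = make_ds([[12.0, 20.0], [np.nan, 18.0]])

# Pull the data into memory once, before doing any arithmetic.
xr1 = xr1.load()
xr2 = xr2.load()

# Stack the two datasets along a new "source" dimension and average over it,
# skipping NaNs so a cell present in only one dataset keeps that value.
combined = xr.concat([xr1, xr2], dim="source").mean(dim="source", skipna=True)

print(combined["temp"].values)
```

Loading up front means the transfer happens once, rather than whenever .values or an arithmetic operation touches the lazily backed arrays.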