问题是将一幅栅格图像重新采样到另一幅栅格图像的分辨率。

huangapple go评论55阅读模式
英文:

Issues resampling raster to the resolution of another raster

问题

我正在尝试将人口栅格数据重新采样和投影,以匹配降水栅格数据的形状和分辨率。

数据链接:

人口数据:https://figshare.com/ndownloader/files/10257111

降水数据:https://www.ncei.noaa.gov/data/nclimgrid-monthly/access/nclimgrid_prcp.nc

人口数据是五个不同人口模型覆盖美国大陆的每十年一次的一系列栅格数据。如果您只选择其中一个栅格数据,我可以解决其余部分(我已经合并为多波段栅格数据)。例如,如果您使用"pop_m4_2010"栅格数据,那将很有帮助。分辨率为1x1公里,投影是Albers等面积圆锥投影NAD 83 ESRI:102003。

降水数据是涵盖美国大陆的月降水数据的netcdf文件。分辨率为5x5公里,投影是WGS84 EPSG:4326。

我使用以下代码将netcdf文件转换为tiff文件:

import xarray as xr 
import rioxarray as rio

prcp_file = xr.open_dataset('nclimgrid_prcp.nc')
prp = prcp_file['prcp']
prp = prp.rio.set_spatial_dims(x_dim='lon', y_dim='lat')
prp.rio.write_crs("epsg:4326", inplace=True)
prp.rio.to_raster('prp_raster.tiff')

我还使用QGIS打开了人口文件(添加栅格图层,导航到下载文件夹以选择"w001001.adf"文件)。当我在WGS84项目中执行此操作时,QGIS似乎会自动进行投影转换,但我对此还不太熟悉,所以不确定是否正确。

从这一点开始,我尝试了几种方法来将人口栅格数据重新采样以匹配降水栅格数据的5x5分辨率。

  1. 在QGIS处理工具箱中使用GRASS r.resample
  2. 在QGIS处理工具箱中使用Raster Layer Zonal Statistics
  3. 使用Python,老实说,我已经迷失在各种不同的论坛帖子和教程中,我已经尝试了GDAL.Warp、Rasterio.Warp、仿射变换、rio.reproject_match等。以下是一些代码尝试的示例。

许多方法似乎有效(尤其是rio.reproject_match似乎简单有效)。然而,似乎都无法按预期工作。当我通过传递县矢量形状文件的分区统计来测试生成的人口栅格数据的准确性时,结果区域内的人口总数要么为0,要么与实际相差甚远。

我做错了什么?

Reproject_Match:

import rioxarray
import xarray
import matplotlib.pyplot as plt

def print_raster(raster):
    print(
        f"shape: {raster.rio.shape}\n"
        f"resolution: {raster.rio.resolution()}\n"
        f"bounds: {raster.rio.bounds()}\n"
        f"sum: {raster.sum().item()}\n"
        f"CRS: {raster.rio.crs}\n"
    )

xds = rioxarray.open_rasterio('pop_m4_2010.tif')
xds_match = rioxarray.open_rasterio('prp_raster.tiff')

fig, axes = plt.subplots(ncols=2, figsize=(12,4))
xds.plot(ax=axes[0])
xds_match.plot(ax=axes[1])
plt.draw()

print("Original Raster:\n----------------\n")
print_raster(xds)
print("Raster to Match:\n----------------\n")
print_raster(xds_match)

xds_repr_match = xds.rio.reproject_match(xds_match)

print("Reprojected Raster:\n-------------------\n")
print_raster(xds_repr_match)
print("Raster to Match:\n----------------\n")
print_raster(xds_match)

xds_repr_match.rio.to_raster("reproj_pop.tif")

另一种使用Rasterio.Warp的方法:

import rasterio
from rasterio.warp import calculate_default_transform, reproject, Resampling

# 打开源栅格
srcRst = rasterio.open('pop_m4_2010.tif') 
print("源栅格CRS:")
print(srcRst.crs)

dstCrs = {'init': 'EPSG:4326'}
print("目标栅格CRS:")
print(dstCrs)

# 计算变换数组和重新投影后栅格的形状
transform, width, height = calculate_default_transform(
        srcRst.crs, dstCrs, srcRst.width, srcRst.height, *srcRst.bounds)

print("源栅格的变换数组:")
print(srcRst.transform)

print("目标栅格的变换数组:")
print(transform)

# 针对目标栅格配置元数据
kwargs = srcRst.meta.copy()
kwargs.update({
        'crs': dstCrs,
        'transform': transform,
        'width': width,
        'height': height
    })

# 打开目标栅格
dstRst = rasterio.open('pop_m4_2010_reproj4326.tif', 'w', **kwargs)

# 重新投影并保存栅格波段数据
for i in range(1, srcRst.count + 1):
    reproject(
        source=rasterio.band(srcRst, i),
        destination=rasterio.band(dstRst, i),
        src_crs=srcRst.crs,
        dst_crs=dstCrs,
        resampling=Resampling.bilinear)
    print(i)
    
# 关闭目标栅格
dstRst.close()

以下是使用Rasterio.Warp的第二种尝试:

import rasterio
from rasterio.warp import calculate_default_transform, reproject, Resampling

prcp = rasterio.open('prp_raster.tiff', mode='r')

with rasterio.open('pop_m4_2010.tif') as dataset:
    # 重新采样数据以匹配目标形状
    data

<details>
<summary>英文:</summary>

I am trying to take a population raster and resample+reproject it to match the shape and resolution of a precipitation raster.


 

Data Links:

Population Data: https://figshare.com/ndownloader/files/10257111

Precipitation Data: https://www.ncei.noaa.gov/data/nclimgrid-monthly/access/nclimgrid_prcp.nc

The Population Data is a series of rasters per decade of 5 different population models covering the continental US. If you simply select one of the rasters I can work out the rest (I have combined into a multiband raster anyways). For example if you use the pop_m4_2010 raster that would help. The resolution is 1x1km, and the projection is Albers Equal Area Conic NAD 83 ESRI:102003.

The Precipitation Data is a netcdf file covering monthly precipitation data for the continental US. The resolution is 5x5km and the projection is WGS84 EPSG:4326.

I converted the netcdf to tiff using the following code:


import xarray as xr
import rioxarray as rio
prcp_file = xr.open_dataset('nclimgrid_prcp.nc')
prp = prcp_file['prcp']
prp = prp.rio.set_spatial_dims(x_dim='lon', y_dim='lat')
prp.rio.write_crs("epsg:4326", inplace=True)
prp.rio.to_raster('prp_raster.tiff')


I also used QGIS to open the population files (add raster layer, navigate into the downloaded folder for pop_m4_2010 and select the &quot;w001001.adf&quot; file). When I do this in a WGS84 project QGIS automatically appears to force reprojection but I am new to this so I am unsure if it is correct.

From this point I have tried several things to resample the population raster to match the 5x5 resolution of the precipitation raster.

1. In QGIS Processing Toolbox GRASS r.resample
2. In QGIS Processing Toolbox Raster Layer Zonal Statistics
3. In Python, honestly I have lost track of all of the different forum posts and tutorials I have followed on GDAL.Warp, Rasterio.Warp, affine transformations, rio.reproject_match, etc. Below are a few examples of the code attempts. 

Many of these appear to work (particularly the rio.reproject_match seemed simple and effective). However, none of these appear to be working as intended. When I test the accuracy of the resulting population raster by passing zonal stats of a county vector shapefile the resulting sum of population in the area is either 0, or wildly inaccurate. 


What am I doing wrong?

Reproject_Match:

import rioxarray # for the extension to load
import xarray

import matplotlib.pyplot as plt

%matplotlib inline

def print_raster(raster):
print(
f"shape: {raster.rio.shape}\n"
f"resolution: {raster.rio.resolution()}\n"
f"bounds: {raster.rio.bounds()}\n"
f"sum: {raster.sum().item()}\n"
f"CRS: {raster.rio.crs}\n"
)

xds = rioxarray.open_rasterio('pop_m4_2010.tif')
xds_match = rioxarray.open_rasterio('prp_raster.tiff')

fig, axes = plt.subplots(ncols=2, figsize=(12,4))
xds.plot(ax=axes[0])
xds_match.plot(ax=axes[1])
plt.draw()

print("Original Raster:\n----------------\n")
print_raster(xds)
print("Raster to Match:\n----------------\n")
print_raster(xds_match)

xds_repr_match = xds.rio.reproject_match(xds_match)

print("Reprojected Raster:\n-------------------\n")
print_raster(xds_repr_match)
print("Raster to Match:\n----------------\n")
print_raster(xds_match)

xds_repr_match.rio.to_raster("reproj_pop.tif")


Another way with Rasterio.Warp:

import rasterio
from rasterio.warp import calculate_default_transform, reproject, Resampling

#open source raster
srcRst =rasterio.open('pop_m4_2010.tif')
print("source raster crs:")
print(srcRst.crs)

dstCrs = {'init': 'EPSG:4326'}
print("destination raster crs:")
print(dstCrs)

#calculate transform array and shape of reprojected raster
transform, width, height = calculate_default_transform(
srcRst.crs, dstCrs, srcRst.width, srcRst.height, *srcRst.bounds)

print("transform array of source raster")
print(srcRst.transform)

print("transform array of destination raster")
print(transform)

#working of the meta for the destination raster
kwargs = srcRst.meta.copy()
kwargs.update({
'crs': dstCrs,
'transform': transform,
'width': width,
'height': height
})

#open destination raster
dstRst = rasterio.open('pop_m4_2010_reproj4326.tif', 'w', **kwargs)

#reproject and save raster band data
for i in range(1, srcRst.count + 1):
reproject(
source=rasterio.band(srcRst, i),
destination=rasterio.band(dstRst, i),
#src_transform=srcRst.transform,
src_crs=srcRst.crs,
#dst_transform=transform,
dst_crs=dstCrs,
resampling=Resampling.bilinear)
print(i)

#close destination raster
dstRst.close()


And here is a second attempt with Rasterio.Warp:


import rasterio
from rasterio.warp import calculate_default_transform, reproject, Resampling

prcp = rasterio.open('prp_raster.tiff', mode = 'r')

with rasterio.open('pop_m4_2010.tif') as dataset:
# resample data to target shape
data = dataset.read(out_shape=(dataset.count,prcp.height,prcp.width), resampling=Resampling.bilinear)

# scale image transform
transform = dataset.transform * dataset.transform.scale((dataset.width / data.shape[-1]),
                                                        (dataset.height / data.shape[-2]))

# Register GDAL format drivers and configuration options with a
# context manager.
with rasterio.Env():

    profile = src.profile

    profile.update(
        dtype=rasterio.float32,
        count=1,
        compress=&#39;lzw&#39;)

    with rasterio.open(&#39;pop_m4_2010_resampledtoprcp.tif&#39;, &#39;w&#39;, **profile) as dst:
        dst.write(data.astype(rasterio.float32))


</details>


# 答案1
**得分**: 1

以下是R代码的中文翻译:

```R
# 这是如何使用R来完成这个任务的示例。

# 加载terra库
library(terra)

# 读取人口数据和气候数据
pop <- rast("USA_HistoricalPopulationDataset/pop_m5_2010")
wth <- rast("nclimgrid_prcp.nc")

# 对人口数据进行投影,使用"sum"函数计算总和
wpop <- project(pop, wth, "sum")

# 检查结果
wpop
# 类型:SpatRaster
# 尺寸:596, 1385, 1 (行数,列数,层数)
# 分辨率:0.04166666, 0.04166667 (x,y)
# 范围:-124.7083, -67, 24.5417, 49.37503 (最小x,最大x,最小y,最大y)
# 坐标参考:经纬度,WGS 84
# 来源:内存
# 名称:pop_m5_2010
# 最小值:0.0
# 最大值:423506.7

# 计算全球总人口
global(pop, "sum", na.rm=TRUE)
#         sum
# pop_m5_2010 306620886

global(wpop, "sum", na.rm=TRUE)
#         sum
# pop_m5_2010 306620761

# 将结果保存到文件
writeRaster(wpop, "pop.tif")

# 一次性处理所有人口数据
ff <- list.files(pattern="0$", "USA_HistoricalPopulationDataset", full=TRUE)
apop <- rast(ff)
wapop <- project(apop, wth, "sum")

# 人口数据的数量可能不准确,因为在投影(变形)时使用了双线性插值。这不适用于(人口)计数数据。
# 你可以先将其转换为人口密度,然后再进行投影,并最后转换回来。
# 下面是如何执行这个操作,得到与上面直接方法类似的结果:

# 计算人口数据和气候数据的单元格大小
csp <- cellSize(pop)
csw <- cellSize(wth[[1]])

# 计算人口密度
popdens <- pop / csp

# 对人口密度进行投影,使用"bilinear"插值方法
popdens <- project(popdens, wth, "bilinear")

# 计算人口数量
popcount <- popdens * csw

# 检查结果
popcount
# 类型:SpatRaster
# 尺寸:596, 1385, 1 (行数,列数,层数)
# 分辨率:0.04166666, 0.04166667 (x,y)
# 范围:-124.7083, -67, 24.5417, 49.37503 (最小x,最大x,最小y,最大y)
# 坐标参考:经纬度,WGS 84
# 来源:内存
# 名称:pop_m5_2010
# 最小值:0.0
# 最大值:393982.5

global(popcount, "sum", na.rm=TRUE)
#         sum
# pop_m5_2010 304906042

请注意,这是R代码的中文翻译,仅包括代码部分,不包括其他内容。

英文:

This is how you can do that with R.

library(terra)
pop &lt;- rast(&quot;USA_HistoricalPopulationDataset/pop_m5_2010&quot;)
wth &lt;- rast(&quot;nclimgrid_prcp.nc&quot;)

wpop &lt;- project(pop, wth, &quot;sum&quot;)

Inspect the results.

wpop
#class       : SpatRaster 
#dimensions  : 596, 1385, 1  (nrow, ncol, nlyr)
#resolution  : 0.04166666, 0.04166667  (x, y)
#extent      : -124.7083, -67, 24.5417, 49.37503  (xmin, xmax, ymin, ymax)
#coord. ref. : lon/lat WGS 84 
#source(s)   : memory
#name        : pop_m5_2010 
#min value   :         0.0 
#max value   :    423506.7 

global(pop, &quot;sum&quot;, na.rm=TRUE)
#                  sum
#pop_m5_2010 306620886

global(wpop, &quot;sum&quot;, na.rm=TRUE)
#                  sum
#pop_m5_2010 306620761

You can save the results to file with something like this

writeRaster(wpop, &quot;pop.tif&quot;)

And you could do this in one step for all population data like this:

ff &lt;- list.files(pattern=&quot;0$&quot;, &quot;USA_HistoricalPopulationDataset&quot;, full=TRUE)
apop &lt;- rast(ff)
wapop &lt;- project(apop, wth, &quot;sum&quot;)

The population numbers you are getting are probably wrong because you are using bilinear interpolation when projecting (warping). That is not appropriate for (population) count data. You could first transform it to population density, warp, and transform back. I do that below, getting a result that is similar to what you get with the more direct approach that I have shown above.

csp &lt;- cellSize(pop)
csw &lt;- cellSize(wth[[1]])
popdens &lt;- pop / csp
popdens &lt;- project(popdens, wth, &quot;bilinear&quot;)
popcount &lt;- popdens * csw

popcount
#class       : SpatRaster 
#dimensions  : 596, 1385, 1  (nrow, ncol, nlyr)
#resolution  : 0.04166666, 0.04166667  (x, y)
#extent      : -124.7083, -67, 24.5417, 49.37503  (xmin, xmax, ymin, ymax)
#coord. ref. : lon/lat WGS 84 
#source(s)   : memory
#name        : pop_m5_2010 
#min value   :         0.0 
#max value   :    393982.5 

global(popcount, &quot;sum&quot;, na.rm=TRUE)
#                  sum
#pop_m5_2010 304906042

huangapple
  • 本文由 发表于 2023年2月8日 08:26:44
  • 转载请务必保留本文链接:https://go.coder-hub.com/75380315.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定