R Spatial Information Aggregation
Question
I am trying to build a mapping function with R, using the dplyr, sp and rgdal packages. To start with, I'll describe the data. I have downloaded the shapefile of the entire world from opendatasoft.com. I extracted the country names from the dataframe embedded in this sp object, and then made a dataframe with some values that are distributed across the countries. Finally, I wrote the following UDF to merge the data with the sp object.
    library(dplyr)
    library(sp)
    library(rgdal)

    createDataMap <- function(shapeFileName = "world-administrative-boundaries", countryValueData) {
      # The layer name ends in "-areas" only because I renamed the files in the folder
      SpatialInformation <- readOGR(shapeFileName, paste(shapeFileName, "areas", sep = "-"))
      ValueInformation <- countryValueData
      namesOfCountriesInTheWorldMap <- SpatialInformation@data$name
      namesOfCountriesInValueList <- ValueInformation$Country
      namesInValuesButNotInTheWorld <- setdiff(namesOfCountriesInValueList, namesOfCountriesInTheWorldMap)
      namesInTheWorldButNotInTheDataFile <- setdiff(namesOfCountriesInTheWorldMap, namesOfCountriesInValueList)
      if (length(namesInValuesButNotInTheWorld) > 0) {
        return("You seem to have Values outside this world. Please wait until we achieve Type 1 Civilization Status.")
      }
      if (length(namesInTheWorldButNotInTheDataFile) > 0) {
        return("Please add all the country rows in the data file, even if you do not have Values everywhere.")
      }
      # Map each country name to the ID of its polygon
      IDtoCountry <- data.frame(
        Country = namesOfCountriesInTheWorldMap,
        polygonID = sapply(SpatialInformation@polygons, function(x) x@ID)
      )
      # Attach the polygon IDs and the shapefile attributes to the values
      mergedData <- left_join(ValueInformation, IDtoCountry, by = "Country")
      mergedData <- left_join(mergedData, SpatialInformation@data, by = c("Country" = "name"))
      # Row names must equal the polygon IDs so that match.ID = TRUE lines the rows up
      rownames(mergedData) <- mergedData$polygonID
      mergedData$polygonID <- NULL
      return(SpatialPolygonsDataFrame(SpatialInformation, mergedData, match.ID = TRUE))
    }
As for countryValueData, it is just a dataframe of random values and NAs associated with each country, which I created with random number generation and the list of countries I got from the sp object.
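For reference, a dataframe of this shape could be produced along the following lines. This is only an illustrative sketch, not the exact code I used: the four country names stand in for `SpatialInformation@data$name`, and the helper `randomValuesWithNAs`, the value range and the share of NAs are all made up.

```r
# Hypothetical stand-in for SpatialInformation@data$name
countryNames <- c("Uganda", "Uzbekistan", "Ireland", "Eritrea")

set.seed(42)  # make the random draws reproducible

# Draw n random whole numbers and overwrite a random subset with NA
# (helper name, range and NA share are assumptions for illustration)
randomValuesWithNAs <- function(n, naShare = 0.3) {
  values <- round(runif(n, min = 500, max = 5000))
  values[runif(n) < naShare] <- NA
  values
}

# One row per country, four value columns
countryValueData <- data.frame(
  Country = countryNames,
  Value1 = randomValuesWithNAs(length(countryNames)),
  Value2 = randomValuesWithNAs(length(countryNames)),
  Value3 = randomValuesWithNAs(length(countryNames)),
  Value4 = randomValuesWithNAs(length(countryNames)),
  stringsAsFactors = FALSE
)

str(countryValueData)
```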
This merge works fine. Now I have a dataframe embedded in the sp object with the following structure.
    'data.frame': 256 obs. of 12 variables:
     $ Country    : chr "Uganda" "Uzbekistan" "Ireland" "Eritrea" ...
     $ Value1     : num NA NA 1660 NA NA ...
     $ Value2     : num 2727 734 734 2727 734 ...
     $ Value3     : num 2574 4383 3024 4293 NA ...
     $ Value4     : num 2727 734 734 2727 1404 ...
     $ iso3       : chr "UGA" "UZB" "IRL" "ERI" ...
     $ status     : chr "Member State" "Member State" "Member State" "Member State" ...
     $ color_code : chr "UGA" "UZB" "IRL" "ERI" ...
     $ continent  : chr "Africa" "Asia" "Europe" "Africa" ...
     $ region     : chr "Eastern Africa" "Central Asia" "Northern Europe" "Eastern Africa" ...
     $ iso_3166_1_: chr "UG" "UZ" "IE" "ER" ...
     $ french_shor: chr "Ouganda" "Ouzbékistan" "Irlande" "Érythrée" ...
where Value1, Value2, Value3 and Value4 are the random values I mentioned before. What I am now trying to do is aggregate this data at the continent level. In theory, since all the polygons in the sp object have well-defined borders, and the data frame contains a mapping from countries to continents, this aggregation should not be impossible. I want a mechanism that lets me switch my bubble plot from country level to continent level, but first I need to do this aggregation. Has anyone ever tried to do this?
Answer 1
Score: 1
With, e.g., {sf} it's straightforward to join spatial dataframes with supplementary data and "slice'n'dice" the non-geometry columns ("attributes") as one would do with other dataframes. When aggregating across rows, the spatial features are merged by default.
Example:

- read in the shapefile as a spatial dataframe:

    library(sf)
    the_countries <- read_sf('/path/to/my/shapefile.shp')

    > the_countries
    Simple feature collection with 9 features and 8 fields
    Geometry type: MULTIPOLYGON
    Dimension:     XY
    Bounding box:  xmin: -4.79028 ymin: 41.36492 xmax: 17.16639 ymax: 55.05653
    Geodetic CRS:  WGS 84
    # A tibble: 9 x 9
      iso3  status       color_code name    continent region iso_3166_1_ french_shor
      <chr> <chr>        <chr>      <chr>   <chr>     <chr>  <chr>       <chr>
    1 CHE   Member State CHE        Switze~ Europe    Weste~ CH          Suisse
    2 AUT   Member State AUT        Austria Europe    Weste~ AT          Autriche
    ## ...
- generate some additional toy data:

    country_value_data <- data.frame(iso3 = c('FRA', 'BEL'), random_indicator = rnorm(2))
- join the toy data to the existing attributes and aggregate, e.g. by continent:

    library(dplyr)

    the_countries |>
      left_join(country_value_data, by = 'iso3') |>
      group_by(continent) |>
      summarise(random_indicator = mean(random_indicator, na.rm = TRUE))

    Simple feature collection with 1 feature and 2 fields
    Geometry type: MULTIPOLYGON
    Dimension:     XY
    Bounding box:  xmin: -4.79028 ymin: 41.36492 xmax: 17.16639 ymax: 55.05653
    Geodetic CRS:  WGS 84
    # A tibble: 1 x 3
      continent random_indicator geometry
      <chr>                <dbl> <MULTIPOLYGON [°]>
    1 Europe              -0.968 (((9.455 42.71861, 9.4675 42.76555, 9.4884 42.8070~
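To tie this back to the bubble plot in the question, here is a minimal sketch rather than a tested recipe for the actual shapefile. It uses four made-up square "countries" on two made-up continents, so `make_square`, `toy_countries` and all the numbers are assumptions for illustration only. It sums several value columns per continent in one go and then derives one point per continent to anchor a bubble on. An existing sp object can most likely be brought into this workflow first with `st_as_sf()`.

```r
library(sf)
library(dplyr)

# Build a unit square whose lower-left corner is (x, y)
make_square <- function(x, y) {
  st_polygon(list(rbind(
    c(x, y), c(x + 1, y), c(x + 1, y + 1), c(x, y + 1), c(x, y)
  )))
}

# Made-up stand-in for the world map: four countries on two continents
toy_countries <- st_sf(
  Country = c("A", "B", "C", "D"),
  continent = c("North", "North", "South", "South"),
  Value1 = c(10, NA, 5, 7),
  Value2 = c(1, 2, 3, 4),
  geometry = st_sfc(
    make_square(0, 0), make_square(1, 0),
    make_square(0, 5), make_square(1, 5)
  )
)

# One row per continent: every Value column is summed (NAs ignored),
# and the member polygons are unioned into a single geometry
by_continent <- toy_countries |>
  group_by(continent) |>
  summarise(across(starts_with("Value"), ~ sum(.x, na.rm = TRUE)))

# Replace each continent's polygon with its centroid, giving one
# point per continent on which a bubble can be drawn
bubbles <- st_set_geometry(by_continent, st_centroid(st_geometry(by_continent)))

print(by_continent)
print(bubbles)
```

Whether `sum`, `mean` or something else is the right summary depends on what the values represent. For irregular shapes the centroid can fall outside the polygon, in which case `st_point_on_surface()` may be the better anchor.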