英文:
How can I merge two datasets based on their geometry columns, where one dataset has point geometries and the other has polygon geometries?
问题
我使用了这段代码。然而,尽管我有几何列,但它要求我将数据框转换为地理数据框,所以我编写了这些代码,但然后它并未匹配一些点,我无法理解为什么。
英文:
I would like to merge the following datasetS based on the "geometry" variable. The reason I am trying to merge is that I need the names of the subnational regions in the format of iso_3166_2 and iso_3166_2 in the dataset called iso_geo. If the point geometries lie in polygon/multiploygon geometries then it should match.
smart_city
year | country_id | geometry |
---|---|---|
2019 | Germany | POINT (7.08000 51.51000) |
2018 | Monaco | POINT (7.41700 43.74000) |
iso_geo
n_region_iso_name | iso_3166_2 | geometry |
---|---|---|
Escut de Canillo | AD-02 | POLYGON ((1.59735 42.62192, 1.60830 42.61812, ... |
Escut d'Encamp | AD-03 | MULTYPOLYGON ((1.71709 42.53989, 1.71062 42.52774, ... |
import geopandas as gpd
from shapely.geometry import Point
# Assuming 'smart_city' is the DataFrame with point geometries and 'iso_geometry' is the DataFrame with polygon geometries
# Convert the 'geometry' column in 'smart_city' DataFrame to Point geometries
smart_city['geometry'] = smart_city['geometry'].apply(lambda row: Point(row.x, row.y))
# Convert 'smart_city' DataFrame to a GeoDataFrame
smart_city_geo = gpd.GeoDataFrame(smart_city, geometry='geometry')
# Convert 'iso_geometry' DataFrame to a GeoDataFrame
iso_geometry_geo = gpd.GeoDataFrame(iso_geometry, geometry='geometry')
# Perform the spatial join
merged = gpd.sjoin(smart_city_geo, iso_geometry_geo, how='left', op='within')
# The 'merged' GeoDataFrame will contain the attributes from both DataFrames based on the spatial relationship
# between the points and polygons.
I used this code. However, even though I have geometry columns it asked me to convert the dataframe to geodataframe so I write those codes but then it does not match some points and I could not understand why.#
答案1
得分: 0
以下是您要的翻译内容:
当一个问题缺少完整的代码、数据或其他信息(模块版本、运行环境等),回答起来并不容易。
让我以这种方式帮助您。我提供下面经过测试的代码,您只需测试它。尝试编辑数据并重新运行,看看是否能够得到结果。
这段代码在我的机器上运行良好。数据足够简单易懂。如果出现错误,请在下面的评论部分发布错误消息。
import os
os.environ['USE_PYGEOS'] = '0'
import geopandas
import geopandas as gpd
import numpy as np
from shapely.geometry import Polygon, Point, MultiPolygon
# 非洲的四个正方形多边形
a = Polygon([(0, 0), (0, 10), (10, 10), (10, 0)])
b = Polygon([(0, 10), (0, 20), (10, 20), (10, 10)])
c = Polygon([(10, 0), (10, 10), (20, 10), (20, 0)])
d = MultiPolygon([
(
((10, 10), (10, 20), (20, 20), (20, 10)),
[((20-2,20-1), (10+2,20-1), (15,17)),
((10+2,10+1), (20-2,10+1), (15,13))
]
),
# 更多多边形在这里
]) # 带有2个洞的多边形
# 与多边形一起使用的一些点
pa = Point([5,5]) # 在a中
pb = Point([15,5])
pc = Point([5,15])
pd = Point([15,15])
pe = Point([15,15+3])
pf = Point([5-6,5])
# 包含所有多边形的GeoDataFrame
poly_gdf = gpd.GeoDataFrame({"Poly_name": ["A", "B", "C", "D"], "geometry": [a, b, c, d],
"colors": ["red","green","gray","brown"]}, crs="EPSG:4326")
pnt_gdf = gpd.GeoDataFrame({"Point_name": ["pA", "pB", "pC", "pD", "pE", "pF"],
"Value": [1,2,3,4,5,6],
"geometry": [pa, pb, pc, pd, pe, pf],
"colors": ["red","gray","green","brown", "black", "blue"]}, crs="EPSG:4326")
# 将它们都绘制出来
ax1 = poly_gdf.plot(ec='k', fc=poly_gdf['colors'], alpha=0.5, zorder=9)
pnt_gdf.plot(color=pnt_gdf['colors'], ax=ax1, alpha=1, zorder=10)
可视化/验证的输出图:
快速空间连接
points_w_poly = geopandas.sjoin(pnt_gdf, poly_gdf)
points_w_poly
Point_name Value geometry colors_left index_right Poly_name colors_right
0 pA 1 POINT (5.00000 5.00000) red 0 A red
1 pB 2 POINT (15.00000 5.00000) gray 2 C gray
2 pC 3 POINT (5.00000 15.00000) green 1 B green
3 pD 4 POINT (15.00000 15.00000) brown 3 D brown
所有多边形中心的点可以成功连接。位于多边形内部孔中或多边形外部外环的点将不会连接。
查看已连接数据的属性(Value):
points_w_poly[["Poly_name", "Value"]]
Poly_name Value
0 A 1
1 C 2
2 B 3
3 D 4
更详细的空间连接
sjoin
的语法可能在不同版本的geopandas中有所不同。
gpd.sjoin(pnt_gdf, poly_gdf, how='left', predicate='within', lsuffix='*Point', rsuffix='*Poly')
输出:
Point_name Value geometry colors_*Point index_*Poly Poly_name colors_*Poly
0 pA 1 POINT (5.00000 5.00000) red 0.0 A red
1 pB 2 POINT (15.00000 5.00000) gray 2.0 C gray
2 pC 3 POINT (5.00000 15.00000) green 1.0 B green
3 pD 4 POINT (15.00000 15.00000) brown 3.0 D brown
4 pE 5 POINT (15.00000 18.00000) black NaN NaN NaN
5 pF 6 POINT (-1.00000 5.00000) blue NaN NaN NaN
英文:
When a question lacks complete code, data, or other informations (module version, running environment), it is not easy to answer.
Let me help this way. I offer the tested code below, you just test it. Try edit the data and rerun to see if you can get the result or not.
The code runs well on my machine. The data is simple enough to grasp. If you get errors, post the error messages in the comment section below.
import os
os.environ['USE_PYGEOS'] = '0'
import geopandas
import geopandas as gpd
import numpy as np
from shapely.geometry import Polygon, Point, MultiPolygon
# Four square polygons in Africa
a = Polygon([(0, 0), (0, 10), (10, 10), (10, 0)])
b = Polygon([(0, 10), (0, 20), (10, 20), (10, 10)])
c = Polygon([(10, 0), (10, 10), (20, 10), (20, 0)])
d = MultiPolygon([
(
((10, 10), (10, 20), (20, 20), (20, 10)),
[((20-2,20-1), (10+2,20-1), (15,17)),
((10+2,10+1), (20-2,10+1), (15,13))
]
),
# More polygon here
]) # Polygon with 2 holes
#((10,20), (20,20), (15,21))
# Some points to go with the polygons
pa = Point([5,5]) # in a
pb = Point([15,5])
pc = Point([5,15])
pd = Point([15,15])
pe = Point([15,15+3])
pf = Point([5-6,5])
# GeoDataFrame of all the polygons
poly_gdf = gpd.GeoDataFrame({"Poly_name": ["A", "B", "C", "D"], "geometry": [a, b, c, d],
"colors": ["red","green","gray","brown"]}, crs="EPSG:4326")
pnt_gdf = gpd.GeoDataFrame({"Point_name": ["pA", "pB", "pC", "pD", "pE", "pF"],
"Value": [1,2,3,4,5,6],
"geometry": [pa, pb, pc, pd, pe, pf],
"colors": ["red","gray","green","brown", "black", "blue"]}, crs="EPSG:4326")
# Plot them both
ax1 = poly_gdf.plot(ec="k", fc=poly_gdf["colors"], alpha=0.5, zorder=9)
pnt_gdf.plot(color=pnt_gdf["colors"], ax=ax1, alpha=1, zorder=10)
Output plot for visualization/verification:
Quick Spatial Join
points_w_poly = geopandas.sjoin(pnt_gdf, poly_gdf)
points_w_poly
<pre>
Point_name Value geometry colors_left index_right Poly_name colors_right
0 pA 1 POINT (5.00000 5.00000) red 0 A red
1 pB 2 POINT (15.00000 5.00000) gray 2 C gray
2 pC 3 POINT (5.00000 15.00000) green 1 B green
3 pD 4 POINT (15.00000 15.00000) brown 3 D brown
</pre>
The points at the center of all the polygons can join successfully. The point inside the hole of a polygon or outside the outer-linear-rings of the polygons will not join.
# View the attribute of the joined data (Value) for each polygon.
points_w_poly[["Poly_name", "Value"]]
<pre>
Poly_name Value
0 A 1
1 C 2
2 B 3
3 D 4</pre>
More detailed spatial join
The syntax of sjoin
may be different for each version of geopandas.
gpd.sjoin(pnt_gdf, poly_gdf, how='left', predicate='within', lsuffix='*Point', rsuffix='*Poly')
Output:
<pre>
Point_name Value geometry colors_*Point index_*Poly Poly_name colors_*Poly
0 pA 1 POINT (5.00000 5.00000) red 0.0 A red
1 pB 2 POINT (15.00000 5.00000) gray 2.0 C gray
2 pC 3 POINT (5.00000 15.00000) green 1.0 B green
3 pD 4 POINT (15.00000 15.00000) brown 3.0 D brown
4 pE 5 POINT (15.00000 18.00000) black NaN NaN NaN
5 pF 6 POINT (-1.00000 5.00000) blue NaN NaN NaN
</pre>
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论