How can I merge two datasets based on their geometry columns, where one dataset has point geometries and the other has polygon geometries?


Question

I would like to merge the following datasets based on the "geometry" variable. The reason I am trying to merge is that I need the names of the subnational regions (n_region_iso_name) and their iso_3166_2 codes from the dataset called iso_geo. A point should match a region if the point geometry lies within that region's polygon/multipolygon geometry.

smart_city

year  country_id  geometry
2019  Germany     POINT (7.08000 51.51000)
2018  Monaco      POINT (7.41700 43.74000)

iso_geo

n_region_iso_name  iso_3166_2  geometry
Escut de Canillo   AD-02       POLYGON ((1.59735 42.62192, 1.60830 42.61812, ...
Escut d'Encamp     AD-03       MULTIPOLYGON ((1.71709 42.53989, 1.71062 42.52774, ...

import geopandas as gpd
from shapely.geometry import Point

# Assuming 'smart_city' is the DataFrame with point geometries and 'iso_geometry' is the DataFrame with polygon geometries

# Convert the 'geometry' column in 'smart_city' DataFrame to Point geometries
smart_city['geometry'] = smart_city['geometry'].apply(lambda row: Point(row.x, row.y))

# Convert 'smart_city' DataFrame to a GeoDataFrame
smart_city_geo = gpd.GeoDataFrame(smart_city, geometry='geometry')

# Convert 'iso_geometry' DataFrame to a GeoDataFrame
iso_geometry_geo = gpd.GeoDataFrame(iso_geometry, geometry='geometry')

# Perform the spatial join
merged = gpd.sjoin(smart_city_geo, iso_geometry_geo, how='left', op='within')

# The 'merged' GeoDataFrame will contain the attributes from both DataFrames based on the spatial relationship
# between the points and polygons.

I used this code. Even though both datasets have geometry columns, I was asked to convert the DataFrames to GeoDataFrames, so I wrote the conversion code above. After that, the join still does not match some of the points, and I cannot understand why.

Answer 1

Score: 0

When a question lacks complete code, data, or other information (module versions, running environment), it is not easy to answer.

Let me help this way: I offer tested code below for you to try. Edit the data and rerun it to see whether you get the result you expect.

The code runs well on my machine, and the data is simple enough to grasp. If you get errors, post the error messages in the comment section below.

import os
os.environ['USE_PYGEOS'] = '0'  # set before geopandas is imported
import geopandas
import geopandas as gpd  # both names are used in the snippets below
from shapely.geometry import Polygon, Point, MultiPolygon

# Four square polygons in Africa
a = Polygon([(0, 0), (0, 10), (10, 10), (10, 0)])
b = Polygon([(0, 10), (0, 20), (10, 20), (10, 10)])
c = Polygon([(10, 0), (10, 10), (20, 10), (20, 0)])
# A MultiPolygon holding one square polygon with 2 triangular holes
d = MultiPolygon([
    (
        ((10, 10), (10, 20), (20, 20), (20, 10)),
        [((20-2, 20-1), (10+2, 20-1), (15, 17)),
         ((10+2, 10+1), (20-2, 10+1), (15, 13))]
    ),
    # More polygons here
])

# Some points to go with the polygons
pa = Point([5, 5])       # in a
pb = Point([15, 5])      # in c
pc = Point([5, 15])      # in b
pd = Point([15, 15])     # in d, between the two holes
pe = Point([15, 15+3])   # inside the upper hole of d
pf = Point([5-6, 5])     # outside all polygons

# GeoDataFrame of all the polygons
poly_gdf = gpd.GeoDataFrame({"Poly_name": ["A", "B", "C", "D"], "geometry": [a, b, c, d],
                             "colors": ["red", "green", "gray", "brown"]}, crs="EPSG:4326")
# GeoDataFrame of all the points
pnt_gdf = gpd.GeoDataFrame({"Point_name": ["pA", "pB", "pC", "pD", "pE", "pF"],
                            "Value": [1, 2, 3, 4, 5, 6],
                            "geometry": [pa, pb, pc, pd, pe, pf],
                            "colors": ["red", "gray", "green", "brown", "black", "blue"]}, crs="EPSG:4326")

# Plot them both
ax1 = poly_gdf.plot(ec="k", fc=poly_gdf["colors"], alpha=0.5, zorder=9)
pnt_gdf.plot(color=pnt_gdf["colors"], ax=ax1, alpha=1, zorder=10)

[Figure: output plot of the polygons and points, for visualization/verification]

Quick Spatial Join

points_w_poly = geopandas.sjoin(pnt_gdf, poly_gdf)
points_w_poly

<pre>
   Point_name  Value  geometry                   colors_left  index_right  Poly_name  colors_right
0  pA          1      POINT (5.00000 5.00000)    red          0            A          red
1  pB          2      POINT (15.00000 5.00000)   gray         2            C          gray
2  pC          3      POINT (5.00000 15.00000)   green        1            B          green
3  pD          4      POINT (15.00000 15.00000)  brown        3            D          brown
</pre>

The points at the center of each polygon join successfully. A point inside a hole of a polygon, or outside the outer rings of all the polygons, does not join; because sjoin defaults to an inner join, such points (pE and pF here) are dropped from the result.
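To make the hole behaviour concrete, here is a minimal sketch using shapely alone. It rebuilds only polygon D's outer square and its upper triangular hole (the variable names are made up for this illustration) and checks three points with `within`, which is the relationship the join tests.

```python
from shapely.geometry import Point, Polygon

# Outer square of polygon D and its upper triangular hole (coordinates from the answer).
shell = [(10, 10), (10, 20), (20, 20), (20, 10)]
hole = [(18, 19), (12, 19), (15, 17)]
square_with_hole = Polygon(shell, [hole])

in_solid_part = Point(15, 15)  # inside the square, clear of the hole
in_hole = Point(15, 18)        # inside the square, but within the hole
outside = Point(-1, 5)         # outside the square altogether

for name, pt in [("in_solid_part", in_solid_part),
                 ("in_hole", in_hole),
                 ("outside", outside)]:
    # Only a point in the solid part of the polygon counts as "within" it.
    print(name, pt.within(square_with_hole))
```

Only the first point reports True, which matches the join output: pD matches D, while pE (in the hole) and pF (outside) do not.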

# View the attribute of the joined data (Value) for each polygon.
points_w_poly[["Poly_name", "Value"]]

<pre>
   Poly_name  Value
0  A          1
1  C          2
2  B          3
3  D          4
</pre>

More detailed spatial join

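The syntax of sjoin differs between geopandas versions: newer releases name the spatial-relationship keyword predicate, while older releases call it op (as in the question's code). With how='left', every point is kept, and points that match no polygon get NaN in the polygon columns.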
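If the same script has to run on both old and new geopandas installs, one possible approach is to try the newer keyword first and fall back to the older one. This is only a sketch: `sjoin_within` is a hypothetical helper name, and the two tiny GeoDataFrames are made up for the illustration.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon


def sjoin_within(points, polygons, how="left"):
    """Hypothetical helper: point-in-polygon join under either keyword name."""
    try:
        # Newer geopandas releases name the keyword `predicate`.
        return gpd.sjoin(points, polygons, how=how, predicate="within")
    except TypeError:
        # Older releases do not accept `predicate`; they use `op` instead.
        return gpd.sjoin(points, polygons, how=how, op="within")


# Made-up sample data: one square and two points (one inside, one outside).
polygons = gpd.GeoDataFrame(
    {"Poly_name": ["A"]},
    geometry=[Polygon([(0, 0), (0, 10), (10, 10), (10, 0)])],
    crs="EPSG:4326",
)
points = gpd.GeoDataFrame(
    {"Point_name": ["pA", "pF"]},
    geometry=[Point(5, 5), Point(-1, 5)],
    crs="EPSG:4326",
)

joined = sjoin_within(points, polygons)
print(gpd.__version__)
print(joined[["Point_name", "Poly_name"]])
```

When the installed version is known, calling sjoin directly with the keyword that version expects is simpler than a fallback like this.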

gpd.sjoin(pnt_gdf, poly_gdf, how='left', predicate='within', lsuffix='*Point', rsuffix='*Poly')

Output:

<pre>
   Point_name  Value  geometry                   colors_*Point  index_*Poly  Poly_name  colors_*Poly
0  pA          1      POINT (5.00000 5.00000)    red            0.0          A          red
1  pB          2      POINT (15.00000 5.00000)   gray           2.0          C          gray
2  pC          3      POINT (5.00000 15.00000)   green          1.0          B          green
3  pD          4      POINT (15.00000 15.00000)  brown          3.0          D          brown
4  pE          5      POINT (15.00000 18.00000)  black          NaN          NaN        NaN
5  pF          6      POINT (-1.00000 5.00000)   blue           NaN          NaN        NaN
</pre>
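When some points come back with NaN as in the last two rows, a few quick checks can narrow down the cause. The sketch below is not a diagnosis of the question's data. It assumes a geopandas version that accepts `predicate`, uses made-up sample data, and the names `unmatched` and `outside_bbox` are chosen for this illustration. It checks that both layers share a CRS, lists the unmatched points, and flags those that fall outside the bounding box of the whole polygon layer.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Made-up sample data: one square and two points (one inside, one outside).
polygons = gpd.GeoDataFrame(
    {"Poly_name": ["A"]},
    geometry=[Polygon([(0, 0), (0, 10), (10, 10), (10, 0)])],
    crs="EPSG:4326",
)
points = gpd.GeoDataFrame(
    {"Point_name": ["pA", "pF"]},
    geometry=[Point(5, 5), Point(-1, 5)],
    crs="EPSG:4326",
)

# 1. Both layers should use the same CRS; reproject the points if they differ.
#    (to_crs needs a CRS to be set on the points in the first place.)
if points.crs is not None and points.crs != polygons.crs:
    points = points.to_crs(polygons.crs)

# 2. A left join keeps every point; unmatched points get NaN polygon columns.
joined = gpd.sjoin(points, polygons, how="left", predicate="within")

# 3. List the points that matched no polygon.
unmatched = joined[joined["Poly_name"].isna()]
print(unmatched[["Point_name", "geometry"]])

# 4. A point outside the bounding box of the whole polygon layer can never match.
minx, miny, maxx, maxy = polygons.total_bounds
xs, ys = unmatched.geometry.x, unmatched.geometry.y
outside_bbox = unmatched[(xs < minx) | (xs > maxx) | (ys < miny) | (ys > maxy)]
print(outside_bbox["Point_name"].tolist())
```

Unmatched points that are inside the bounding box are the ones worth plotting over the polygons, since they may sit in a hole, in a gap between regions, or in an area the polygon layer does not cover.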

huangapple
  • Posted on June 13, 2023, 04:52:25
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76460254.html