How to calculate values from a different data frame based on boundaries
Question
I have a GeoDataFrame containing the following columns:
- number of mobile subscriptions
- longitude (X)
- latitude (Y)
and another GeoDataFrame called "boundaries", which contains the boundary geometries.
I want to add a column to the boundaries GeoDataFrame that holds the sum of mobile subscriptions for the points whose latitude and longitude fall within each boundary.
I really hope someone can help me with this issue. I appreciate your kind assistance.
I have tried merging both data frames, but I have no idea how to calculate the values based on the boundaries.
Answer 1
Score: 0
This answer computes the total number of subscriptions that fall within each boundary area:
import geopandas as gpd
import pandas as pd
# create a dummy boundaries GeoDataFrame
df = pd.DataFrame({'name': ['first boundary', 'second boundary'],
                   'area': ['POLYGON ((-10 -3, -10 3, 3 3, 3 -10, -10 -3))', 'POLYGON ((-20 -21, -12 -17, 2 -15, 5 -20, -20 -21))']})
boundaries = gpd.GeoDataFrame(df[['name']], geometry=gpd.GeoSeries.from_wkt(df['area'], crs='epsg:4326'))
# create a dummy GeoDataFrame with some points (replace these with your own coordinates)
points = pd.DataFrame({'num_sub': [1, 2, 3, 4, 5],
                       'coordinates': ['POINT(-7 1)', 'POINT(1 -2)', 'POINT(-17 -20)', 'POINT(0 -18)', 'POINT(-5 0)']})
subs_coordinates = gpd.GeoDataFrame(points[['num_sub']], geometry=gpd.GeoSeries.from_wkt(points['coordinates'], crs='epsg:4326'))
# sum num_sub over the points that fall within each boundary and store it in a num_subs column
# (selecting num_sub before .sum() gives the subscription total rather than a count of points)
boundaries['num_subs'] = boundaries.geometry.apply(lambda poly: subs_coordinates.loc[subs_coordinates.within(poly), 'num_sub'].sum())
If you have the X and Y coordinates in different columns (named X and Y in this example), you can do as follows:
points = pd.DataFrame({'num_sub': [1, 2, 3, 4, 5],
                       'X': [-7, 1, -17, 0, -5],
                       'Y': [1, -2, -20, -18, 0]})
# build the point geometries directly from the X and Y columns and create the GeoDataFrame
subs_coordinates = gpd.GeoDataFrame(points[['num_sub']], geometry=gpd.points_from_xy(points['X'], points['Y']), crs='epsg:4326')
# sum num_sub over the points that fall within each boundary and store it in a num_subs column
boundaries['num_subs'] = boundaries.geometry.apply(lambda poly: subs_coordinates.loc[subs_coordinates.within(poly), 'num_sub'].sum())
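For larger datasets, a spatial join followed by a group-by may be a better fit than calling apply once per polygon. The following is only a sketch under a few assumptions: it reuses the dummy polygons and points from this answer, it assumes a geopandas version where sjoin accepts the predicate keyword (older releases call it op), and it assumes the boundaries index is unnamed so the join column is called index_right.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon

# Dummy boundaries (same shapes as the WKT polygons used in this answer)
boundaries = gpd.GeoDataFrame(
    {"name": ["first boundary", "second boundary"]},
    geometry=[
        Polygon([(-10, -3), (-10, 3), (3, 3), (3, -10)]),
        Polygon([(-20, -21), (-12, -17), (2, -15), (5, -20)]),
    ],
    crs="EPSG:4326",
)

# Dummy subscription points with separate X (longitude) and Y (latitude) columns
points = pd.DataFrame({
    "num_sub": [1, 2, 3, 4, 5],
    "X": [-7, 1, -17, 0, -5],
    "Y": [1, -2, -20, -18, 0],
})
subs = gpd.GeoDataFrame(
    points[["num_sub"]],
    geometry=gpd.points_from_xy(points["X"], points["Y"]),
    crs="EPSG:4326",
)

# Attach to each point the index of the boundary it falls within
joined = gpd.sjoin(subs, boundaries, how="inner", predicate="within")

# Sum the subscriptions per boundary; boundaries without any point get 0
totals = joined.groupby("index_right")["num_sub"].sum()
boundaries["num_subs"] = totals.reindex(boundaries.index, fill_value=0)

print(boundaries[["name", "num_subs"]])
```

Both layers need to share the same CRS for the join to be meaningful. If the boundaries overlap, a point is counted once for every boundary that contains it.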
Hope it works for you.