How to calculate values from a different data frame based on boundaries

Question

I have a GeoDataFrame containing the following columns:

  1. number of mobile subscriptions
  2. longitude (X)
  3. latitude (Y)

I have another GeoDataFrame called "boundaries", which contains the boundary geometries.

I want to add a column to the boundaries GeoDataFrame that holds the sum of mobile subscriptions for all points whose longitude and latitude fall inside each boundary.

I have tried merging the two data frames, but I don't know how to aggregate the data by boundary.

I would appreciate any help with this.

Answer 1

Score: 0

This answer computes the total number of subscriptions that fall inside each boundary:

import geopandas as gpd
import pandas as pd

# Create a dummy boundaries GeoDataFrame from WKT polygons
df = pd.DataFrame({'name': ['first boundary', 'second boundary'],
                   'area': ['POLYGON ((-10 -3, -10 3, 3 3, 3 -10, -10 -3))', 'POLYGON ((-20 -21, -12 -17, 2 -15, 5 -20, -20 -21))']})

boundaries = gpd.GeoDataFrame(df[['name']], geometry=gpd.GeoSeries.from_wkt(df['area'], crs='epsg:4326'))

# Create a dummy GeoDataFrame of points (replace these with your own coordinates)
points = pd.DataFrame({'num_sub': [1, 2, 3, 4, 5],
                       'coordinates': ['POINT(-7 1)', 'POINT(1 -2)', 'POINT(-17 -20)', 'POINT(0 -18)', 'POINT(-5 0)']})

subs_coordinates = gpd.GeoDataFrame(points[['num_sub']], geometry=gpd.GeoSeries.from_wkt(points['coordinates'], crs='epsg:4326'))

# For each boundary, select the points inside it and sum their num_sub values
# (summing the boolean mask alone would only count the points)
boundaries['num_subs'] = boundaries.geometry.apply(lambda poly: subs_coordinates.loc[subs_coordinates.within(poly), 'num_sub'].sum())
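For larger datasets, the same per-boundary sum can also be sketched with a spatial join followed by a groupby, which avoids testing every point against every polygon in a Python loop. This is an alternative to the snippet above, not part of the original answer. It assumes a GeoPandas version in which `sjoin` accepts the `predicate` keyword (older releases used `op`) and in which an unnamed right-hand index appears in the result as `index_right`. The data mirrors the dummy example above.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Same dummy boundaries and points as in the answer, built directly from shapely objects
boundaries = gpd.GeoDataFrame(
    {"name": ["first boundary", "second boundary"]},
    geometry=[
        Polygon([(-10, -3), (-10, 3), (3, 3), (3, -10)]),
        Polygon([(-20, -21), (-12, -17), (2, -15), (5, -20)]),
    ],
    crs="EPSG:4326",
)
subs = gpd.GeoDataFrame(
    {"num_sub": [1, 2, 3, 4, 5]},
    geometry=[Point(-7, 1), Point(1, -2), Point(-17, -20), Point(0, -18), Point(-5, 0)],
    crs="EPSG:4326",
)

# Attach to each point the index of the boundary that contains it
joined = gpd.sjoin(subs, boundaries, how="inner", predicate="within")

# Sum subscriptions per boundary; boundaries without any point get 0
totals = joined.groupby("index_right")["num_sub"].sum()
boundaries["num_subs"] = totals.reindex(boundaries.index, fill_value=0)

print(boundaries[["name", "num_subs"]])
```

With the dummy data, the first boundary contains the points carrying 1, 2 and 5 subscriptions and the second contains those carrying 3 and 4.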

If you have the X and Y coordinates in separate columns (named X and Y in this example), you can do the following:

points = pd.DataFrame({'num_sub': [1, 2, 3, 4, 5],
                       'X': [-7, 1, -17, 0, -5],
                       'Y': [1, -2, -20, -18, 0]})

# Build the point geometries directly from the X (longitude) and Y (latitude) columns
subs_coordinates = gpd.GeoDataFrame(points[['num_sub']], geometry=gpd.points_from_xy(points['X'], points['Y']), crs='epsg:4326')

# For each boundary, select the points inside it and sum their num_sub values
boundaries['num_subs'] = boundaries.geometry.apply(lambda poly: subs_coordinates.loc[subs_coordinates.within(poly), 'num_sub'].sum())
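Points that fall outside every boundary contribute to no total, so it can be worth checking for them before trusting the sums. The following is a minimal sketch of such a check using a left spatial join. The extra point at (50, 50) is hypothetical and added only to illustrate an unmatched row, and the `predicate` keyword again assumes a reasonably recent GeoPandas release.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon

boundaries = gpd.GeoDataFrame(
    {"name": ["first boundary", "second boundary"]},
    geometry=[
        Polygon([(-10, -3), (-10, 3), (3, 3), (3, -10)]),
        Polygon([(-20, -21), (-12, -17), (2, -15), (5, -20)]),
    ],
    crs="EPSG:4326",
)

# Same X/Y layout as above, plus one hypothetical point outside both boundaries
points = pd.DataFrame({"num_sub": [1, 2, 3, 4, 5, 6],
                       "X": [-7, 1, -17, 0, -5, 50],
                       "Y": [1, -2, -20, -18, 0, 50]})
subs = gpd.GeoDataFrame(
    points,
    geometry=gpd.points_from_xy(points["X"], points["Y"]),
    crs="EPSG:4326",
)

# A left join keeps every point; points inside no boundary get NaN in the boundary columns
labelled = gpd.sjoin(subs, boundaries, how="left", predicate="within")
unmatched = labelled[labelled["name"].isna()]

print(unmatched[["num_sub", "X", "Y"]])
```

If `unmatched` is not empty, the per-boundary totals will add up to less than the overall number of subscriptions, which usually points to a coordinate reference system mismatch or to swapped X and Y columns.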

Hope it works for you.

huangapple
  • Posted on May 22, 2023, 17:36:11
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76304809.html