Compute areas for multiple entries of GeoDataFrame
Question
I have some working code that computes the area of a city by name:
    import osmnx as ox

    def get_area_of_city(city_name):
        # Fetch the geodataframe for the specified city
        city_gdf = ox.geocoder.geocode_to_gdf(city_name, which_result=1)
        # Access the geometry (polygon) of the city from the geodataframe
        city_polygon = city_gdf['geometry']
        # Get the latitude and longitude of the city (centroid of the polygon)
        city_latitude, city_longitude = city_polygon.geometry.centroid.y, city_polygon.geometry.centroid.x
        # Define the Cylindrical Equal Area (CEA) projection centered at the city
        cea_projection = f"+proj=cea +lon_0={city_longitude} +lat_ts={city_latitude} +x_0=0 +y_0=0 +ellps=WGS84 +units=m +no_defs"
        # Reproject the polygon to the CEA projection
        city_polygon_cea = city_polygon.to_crs(cea_projection)
        # Compute the area of the polygon in square meters
        area_square_meters = city_polygon_cea.area
        # You can also convert the area to square kilometers if needed
        area_square_kilometers = area_square_meters / 1000000.0
        return area_square_kilometers
Now I want to adapt this code so that it works with any GeoDataFrame that contains multiple cities. The code should construct a projection for each city and apply it to that city's polygon to get the area. How can I do this? I currently have the following code:
    def get_area_of_geodataframe(gdf):
        # Get a copy of the original GeoDataFrame
        gdf_copy = gdf.copy()
        # Get the latitude and longitude of the centroid of all geometries in the GeoDataFrame
        gdf_copy['city_latitude'] = gdf_copy.geometry.centroid.y
        gdf_copy['city_longitude'] = gdf_copy.geometry.centroid.x
        # Define the Cylindrical Equal Area (CEA) projection for each geometry
        gdf_copy['cea_projection'] = gdf_copy.apply(lambda row: f"+proj=cea +lon_0={row['city_longitude']} +lat_ts={row['city_latitude']} +x_0=0 +y_0=0 +ellps=WGS84 +units=m +no_defs", axis=1)
        # Reproject each geometry to the CEA projection
        gdf_copy['city_polygon_cea'] = gdf_copy.apply(lambda row: row.to_crs(row['cea_projection']), axis=1)
        # Compute the area of each geometry in square meters
        gdf_copy['area_square_meters'] = gdf_copy['city_polygon_cea'].area
        # Convert the area to square kilometers
        gdf_copy['area_square_kilometers'] = gdf_copy['area_square_meters'] / 1000000.0
        # Drop the intermediate columns and return the modified GeoDataFrame
        gdf_copy = gdf_copy.drop(columns=['city_latitude', 'city_longitude', 'cea_projection', 'city_polygon_cea'])
        return gdf_copy

However, when I call row.to_crs(), I get AttributeError: 'Series' object has no attribute 'to_crs'.
How do I have to adapt my code?
Answer 1
Score: 1
Why is it not working?
Surprisingly or not, when you take a single row of a geopandas.GeoDataFrame, it ends up being a pandas.Series, so it does not have a to_crs() method.

    type(gdf_copy.iloc[0])
    # returns <class 'pandas.core.series.Series'>

There seems to be a debate on whether getting back a pandas.Series should be the expected behavior. A related answer provides an interesting indexing trick (notice the double square brackets):

    type(gdf_copy.iloc[[0]])
    # returns <class 'geopandas.geodataframe.GeoDataFrame'>

I don't think we can make use of that trick here though, as apply() hands us rows that are already pandas.Series objects.
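Outside of apply(), the double-bracket trick can still be used in an explicit loop. The following is only a sketch of that idea, not a tested drop-in replacement: it assumes the input GeoDataFrame has a CRS set (EPSG:4326 here), and the `sample` frame with a one-degree square is a hypothetical stand-in for real city polygons.

```python
import geopandas as gpd
from shapely.geometry import Polygon


def get_area_of_geodataframe(gdf):
    """Return a copy of gdf with an added area_square_kilometers column."""
    areas = []
    for i in range(len(gdf)):
        # Double brackets keep a one-row GeoDataFrame, which has to_crs()
        row_gdf = gdf.iloc[[i]]
        centroid = row_gdf.geometry.iloc[0].centroid
        # CEA projection centred on this row's geometry
        cea = (
            f"+proj=cea +lon_0={centroid.x} +lat_ts={centroid.y} "
            "+x_0=0 +y_0=0 +ellps=WGS84 +units=m +no_defs"
        )
        areas.append(row_gdf.to_crs(cea).area.iloc[0] / 1_000_000.0)
    result = gdf.copy()
    result["area_square_kilometers"] = areas
    return result


# Hypothetical sample: a one-degree square near the equator
sample = gpd.GeoDataFrame(
    {"city": ["demo"]},
    geometry=[Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])],
    crs="EPSG:4326",
)
result = get_area_of_geodataframe(sample)
```

Only plain floats are stored in the new column, so the frame never has to hold geometries in several CRS at once.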
Should you really want to go that way, you could probably accomplish that by working directly with shapely and pyproj.
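A minimal sketch of that shapely and pyproj route could look like the following. The helper name `area_km2` and the one-degree test polygon are made up for illustration, and the source CRS is assumed to be EPSG:4326 (longitude/latitude).

```python
import pyproj
from shapely.geometry import Polygon
from shapely.ops import transform


def area_km2(geom, src_crs="EPSG:4326"):
    """Area of one shapely geometry in km^2, via a CEA projection centred on it."""
    centroid = geom.centroid
    cea = pyproj.CRS.from_proj4(
        f"+proj=cea +lon_0={centroid.x} +lat_ts={centroid.y} "
        "+x_0=0 +y_0=0 +ellps=WGS84 +units=m +no_defs"
    )
    # always_xy=True keeps (longitude, latitude) axis order for the source CRS
    to_cea = pyproj.Transformer.from_crs(src_crs, cea, always_xy=True).transform
    return transform(to_cea, geom).area / 1_000_000.0


# Hypothetical one-degree square near the equator, used only as a smoke test
cell = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
cell_area = area_km2(cell)
```

Because the helper takes a single shapely geometry and returns a float, it should fit a call such as gdf.geometry.apply(area_km2), assuming the GeoDataFrame is in EPSG:4326.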
Why might it not be a good idea anyway?
Had it worked the way you were trying to, you would have ended up with shapely objects with different CRS in a single geometry column, which geopandas does not support. You can have different geometry columns with different CRS though, despite the fact that only one is considered the main geometry of the GeoDataFrame.
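Since a geometry column carries a single CRS, another option is to reproject the whole column once. A cylindrical equal-area projection preserves area whatever standard parallel is chosen, so a single CEA definition should give the same areas as per-city ones, up to numerical precision. This is a sketch under that assumption; the two one-degree squares are hypothetical stand-ins for city polygons.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# Hypothetical input: two one-degree squares in EPSG:4326
gdf = gpd.GeoDataFrame(
    {"city": ["a", "b"]},
    geometry=[
        Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
        Polygon([(10, 45), (11, 45), (11, 46), (10, 46)]),
    ],
    crs="EPSG:4326",
)

# One equal-area CRS shared by every geometry in the column
equal_area = "+proj=cea +lon_0=0 +lat_ts=0 +x_0=0 +y_0=0 +ellps=WGS84 +units=m +no_defs"
gdf["area_square_kilometers"] = gdf.geometry.to_crs(equal_area).area / 1_000_000.0
```

This avoids row-wise work entirely, at the cost of dropping the per-city projection the question asked for.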
Suggested way to go...
Why not reuse your nice and working get_area_of_city(city_name) function?

    import pandas as pd

    df = pd.DataFrame({'city': ['Lyon', 'Paris', 'Marseille']})
    # Call the single-city function once per row
    df['area_square_kilometers'] = df.apply(lambda row: get_area_of_city(row['city']), axis=1)
    #         city  area_square_kilometers
    # 0       Lyon               47.985400
    # 1      Paris              105.390441
    # 2  Marseille              242.129201
Hope this helps!