Extracting a geojson file/object embedded in pandas dataframe
Question
I have a pandas dataframe that is a more complex version of this simplified one:
import pandas as pd

# Test data frame
data = {'Geojson': ['{"geometry": {"coordinates": [[[24.950899, 60.169158], [24.953492, 60.169158],[24.953510, 60.170104],[24.950958, 60.169990]]],"type": "Polygon"},"id": 1,"properties": {"GlobalID": "84756blabla","NAME": "Helsinki Senate Square","OBJECTID": 1,"OBS_CREATEDATE": 1641916981000,"OBS_UPDATEDATE": null, "Area_m2": 6861.47},"type": "Feature"}'],
        'Name': ["Helsinki Senate Square"],
        'Type': ["Polygon"]}
df = pd.DataFrame(data)
df.head()
...
Geojson Name Type
0 {"geometry": {"coordinates": [[[24.950899, 60.... Helsinki Senate Square Polygon
As you can see, there is a GeoJSON feature embedded as a string in the first column. What I would like to do is extract the GeoJSON value from that column and save it separately as a GeoJSON file, but I've been having trouble doing this. Finding help online is not easy, as most examples show how to extract plain JSON, whose structure differs from that of GeoJSON.
If possible, I'd also like to extract the GeoJSON as a geopandas GeoDataFrame within the same Python script.
As you may have guessed, the end goal is to be able to map the data or use it in a GIS context. Since there are many GeoJSONs in the column (not just one, as in my example), the solution may require iteration. The data type is polygon, but I'd also be interested in a solution that could take into account different feature types, e.g. point, multiline, multipolygon, etc.
Any suggestions/solutions would be most welcome.
Answer 1
Score: 0
GeoJSON is a JSON-based format, so the standard json module can parse it. I'd parse each feature and add it to a FeatureCollection.
Here's an example using your test data:
import json
import geopandas as gpd
# list of features: just the one feature in question here
test_features = ['{"geometry": {"coordinates": [[[24.950899, 60.169158], [24.953492, 60.169158],[24.953510, 60.170104],[24.950958, 60.169990]]],"type": "Polygon"},"id": 1,"properties": {"GlobalID": "84756blabla","NAME": "Helsinki Senate Square","OBJECTID": 1,"OBS_CREATEDATE": 1641916981000,"OBS_UPDATEDATE": null, "Area_m2": 6861.47},"type": "Feature"}']
new_feature_collection = {
    'type': 'FeatureCollection',
    'features': []
}
for feature in test_features:
    feature = json.loads(feature)
    new_feature_collection['features'].append(feature)
# convert to GeoJSON-formatted string
geojson_out = json.dumps(new_feature_collection, indent=4)
# show it
print(geojson_out)
# Alternatively, if you're just interested in getting a GeoDataFrame:
gdf = gpd.GeoDataFrame.from_features([json.loads(feature) for feature in test_features])
print(gdf)
Output:
{
    "type": "FeatureCollection",
    "features": [
        {
            "geometry": {
                "coordinates": [
                    [
                        [
                            24.950899,
                            60.169158
                        ],
                        [
                            24.953492,
                            60.169158
                        ],
                        [
                            24.95351,
                            60.170104
                        ],
                        [
                            24.950958,
                            60.16999
                        ]
                    ]
                ],
                "type": "Polygon"
            },
            "id": 1,
            "properties": {
                "GlobalID": "84756blabla",
                "NAME": "Helsinki Senate Square",
                "OBJECTID": 1,
                "OBS_CREATEDATE": 1641916981000,
                "OBS_UPDATEDATE": null,
                "Area_m2": 6861.47
            },
            "type": "Feature"
        }
    ]
}
| | geometry | GlobalID | NAME | OBJECTID | OBS_CREATEDATE | OBS_UPDATEDATE | Area_m2 |
|---|---|---|---|---|---|---|---|
| 0 | POLYGON ((24.95090 60.16916, ... | 84756blabla | Helsinki Senate Square | 1 | 1641916981000 | None | 6861.47 |
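The example above only prints the result, so here is a minimal sketch of the remaining steps from the question: iterating over many rows, saving the result as a file, and coping with mixed geometry types. The helper names `build_feature_collection` and `save_geojson`, the sample rows, and the output file name are all hypothetical, and the sketch assumes every non-empty cell holds a single GeoJSON Feature as a string.

```python
import json
import os
import tempfile
from collections import Counter


def build_feature_collection(geojson_strings):
    """Parse an iterable of GeoJSON Feature strings into one FeatureCollection dict."""
    features = []
    for raw in geojson_strings:
        # Skip missing cells (None, NaN, empty strings) instead of failing on them
        if not isinstance(raw, str) or not raw.strip():
            continue
        feature = json.loads(raw)
        # Assumption: each cell is a single Feature, as in the question's data
        if feature.get("type") != "Feature":
            raise ValueError(f"Expected a Feature, got {feature.get('type')!r}")
        features.append(feature)
    return {"type": "FeatureCollection", "features": features}


def save_geojson(feature_collection, path):
    """Write a FeatureCollection dict to disk as a GeoJSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(feature_collection, f, indent=4)


# Hypothetical sample rows with mixed geometry types and one missing value
rows = [
    '{"type": "Feature", "properties": {"NAME": "A point"}, '
    '"geometry": {"type": "Point", "coordinates": [24.95, 60.17]}}',
    '{"type": "Feature", "properties": {"NAME": "A polygon"}, '
    '"geometry": {"type": "Polygon", "coordinates": [[[24.950899, 60.169158], '
    '[24.953492, 60.169158], [24.953510, 60.170104], [24.950899, 60.169158]]]}}',
    None,
]

collection = build_feature_collection(rows)

# Count geometry types; a FeatureCollection may mix them freely
geometry_types = Counter(
    (feature.get("geometry") or {}).get("type")
    for feature in collection["features"]
)

# Hypothetical output location in the system's temporary directory
out_path = os.path.join(tempfile.gettempdir(), "extracted_features.geojson")
save_geojson(collection, out_path)
```

With the DataFrame from the question, `build_feature_collection(df['Geojson'])` should work in the same way, since iterating over a pandas Series yields the cell values. The saved file can then be opened in a GIS or loaded with `gpd.read_file(out_path)`. When building the GeoDataFrame directly, passing `crs="EPSG:4326"` to `gpd.GeoDataFrame.from_features` is probably appropriate, on the assumption that the coordinates are WGS84 longitude/latitude as the GeoJSON specification expects.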