Extracting a geojson file/object embedded in pandas dataframe

Question

I have a pandas dataframe that is a more complex version of this simplified one:

import pandas as pd

# Test data frame
data = {'Geojson': ['{"geometry": {"coordinates": [[[24.950899, 60.169158], [24.953492, 60.169158],[24.953510, 60.170104],[24.950958, 60.169990]]],"type": "Polygon"},"id": 1,"properties": {"GlobalID": "84756blabla","NAME": "Helsinki Senate Square","OBJECTID": 1,"OBS_CREATEDATE": 1641916981000,"OBS_UPDATEDATE": null, "Area_m2": 6861.47},"type": "Feature"}'],'Name': ["Helsinki Senate Square"], 'Type': ["Polygon"]}

df = pd.DataFrame(data)

df.head()
...
	Geojson	Name	Type
0	{"geometry": {"coordinates": [[[24.950899, 60....	Helsinki Senate Square	Polygon

As you can see, there is a GeoJSON object embedded in the first column. I would like to extract the GeoJSON value from that column and save it separately as a GeoJSON file, but I've been having trouble doing this. Finding help online is not easy, because most examples show how to extract plain JSON, which is structured differently from GeoJSON.

If possible, I'd also like to load the GeoJSON into a geopandas GeoDataFrame within the same Python script.

As you may have guessed, the end goal is to map the data or use it in a GIS. Since the column holds many GeoJSON objects (not just one, as in my example), the solution may require iteration. The geometry type here is polygon, but I'd also be interested in a solution that can handle other feature types, e.g. point, multiline, multipolygon, etc.

Any suggestions/solutions would be most welcome.

Answer 1

Score: 0

GeoJSON is a JSON-based format, so Python's standard json module can parse it. I'd parse each feature and add it to a FeatureCollection.

Here's an example using your test data:

import json
import geopandas as gpd

# list of features: just the one feature in question here
test_features = ['{"geometry": {"coordinates": [[[24.950899, 60.169158], [24.953492, 60.169158],[24.953510, 60.170104],[24.950958, 60.169990]]],"type": "Polygon"},"id": 1,"properties": {"GlobalID": "84756blabla","NAME": "Helsinki Senate Square","OBJECTID": 1,"OBS_CREATEDATE": 1641916981000,"OBS_UPDATEDATE": null, "Area_m2": 6861.47},"type": "Feature"}']

new_feature_collection = {
    'type': 'FeatureCollection',
    'features': []
}

for feature in test_features:
    feature = json.loads(feature)
    new_feature_collection['features'].append(feature)

# convert to GeoJSON-formatted string
geojson_out = json.dumps(new_feature_collection, indent=4)

# show it
print(geojson_out)


# Alternatively, if you're just interested in getting a GeoDataFrame:
gdf = gpd.GeoDataFrame.from_features([json.loads(feature) for feature in test_features])
print(gdf)

Output:

{
    "type": "FeatureCollection",
    "features": [
        {
            "geometry": {
                "coordinates": [
                    [
                        [
                            24.950899,
                            60.169158
                        ],
                        [
                            24.953492,
                            60.169158
                        ],
                        [
                            24.95351,
                            60.170104
                        ],
                        [
                            24.950958,
                            60.16999
                        ]
                    ]
                ],
                "type": "Polygon"
            },
            "id": 1,
            "properties": {
                "GlobalID": "84756blabla",
                "NAME": "Helsinki Senate Square",
                "OBJECTID": 1,
                "OBS_CREATEDATE": 1641916981000,
                "OBS_UPDATEDATE": null,
                "Area_m2": 6861.47
            },
            "type": "Feature"
        }
    ]
}

                           geometry     GlobalID                    NAME  OBJECTID  OBS_CREATEDATE  OBS_UPDATEDATE  Area_m2
0  POLYGON ((24.95090 60.16916, ...  84756blabla  Helsinki Senate Square         1   1641916981000            None  6861.47
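
The question also asks for the result to be saved as a separate GeoJSON file and for a column holding many rows. A minimal sketch of that step, using only the standard library, might look like the following. The helper names `build_feature_collection` and `save_geojson`, the output filename `features.geojson`, and the sample rows are all made up for illustration; in the real script the `rows` list would presumably be replaced by `df['Geojson']`, since a pandas Series can be iterated like a list.

```python
import json
from pathlib import Path


def build_feature_collection(rows):
    """Parse an iterable of GeoJSON Feature strings into one FeatureCollection dict.

    Rows that are not strings (e.g. None/NaN), that are not valid JSON,
    or that are not Feature objects are skipped instead of raising.
    """
    features = []
    for row in rows:
        if not isinstance(row, str):
            continue  # missing value in the column
        try:
            obj = json.loads(row)
        except json.JSONDecodeError:
            continue  # row is not valid JSON
        if isinstance(obj, dict) and obj.get("type") == "Feature":
            features.append(obj)
    return {"type": "FeatureCollection", "features": features}


def save_geojson(collection, path):
    """Write a FeatureCollection dict to disk as a GeoJSON text file."""
    Path(path).write_text(json.dumps(collection, indent=4), encoding="utf-8")


# Hypothetical stand-in for df['Geojson']; the last two rows are deliberately bad.
rows = [
    '{"type": "Feature", "id": 1, '
    '"geometry": {"type": "Point", "coordinates": [24.9522, 60.1695]}, '
    '"properties": {"NAME": "Helsinki Senate Square"}}',
    None,
    "not json",
]

collection = build_feature_collection(rows)
save_geojson(collection, "features.geojson")
print(len(collection["features"]))  # prints 1
```

Silently skipping bad rows is a design choice made here for brevity; logging or counting them may be preferable for real data. If a GeoDataFrame is wanted as well, `gpd.GeoDataFrame.from_features(collection["features"], crs="EPSG:4326")` followed by `gdf.to_file("features.geojson", driver="GeoJSON")` should be an alternative way to write the file, assuming the coordinates are WGS 84 longitude/latitude as the GeoJSON specification expects.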

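On the question's point about different feature types: parsing with the json module does not depend on the geometry type, so points, lines and polygons can share one FeatureCollection. Some GIS formats (shapefiles, for example) hold only one geometry type per layer, so it can be useful to split the features by type before exporting. The sketch below shows one way to do that; `split_by_geometry_type` and the sample data are hypothetical and not part of any library.

```python
from collections import defaultdict


def split_by_geometry_type(collection):
    """Return {geometry type: FeatureCollection} for a FeatureCollection dict."""
    groups = defaultdict(list)
    for feature in collection["features"]:
        geometry = feature.get("geometry")
        # GeoJSON allows a null geometry; group those under the key "None"
        geom_type = geometry["type"] if geometry else "None"
        groups[geom_type].append(feature)
    return {
        geom_type: {"type": "FeatureCollection", "features": feats}
        for geom_type, feats in groups.items()
    }


# Made-up sample with two geometry types
mixed = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"NAME": "a"},
         "geometry": {"type": "Point", "coordinates": [24.95, 60.17]}},
        {"type": "Feature", "properties": {"NAME": "b"},
         "geometry": {"type": "Polygon", "coordinates": [[
             [24.95, 60.16], [24.96, 60.16], [24.96, 60.17], [24.95, 60.16]]]}},
        {"type": "Feature", "properties": {"NAME": "c"},
         "geometry": {"type": "Point", "coordinates": [24.94, 60.18]}},
    ],
}

by_type = split_by_geometry_type(mixed)
for geom_type, fc in by_type.items():
    print(geom_type, len(fc["features"]))
```

Each value in `by_type` is itself a valid FeatureCollection, so it can be dumped with `json.dumps` or passed to `gpd.GeoDataFrame.from_features` in the same way as the full collection.
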
huangapple
  • Posted on 2023-02-27 08:29:06
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75575893.html