March 4, 2023, 04:34:14

geopandas renaming columns when saving to file

Question

I am trying to save a shapefile locally with GeoPandas, preferably as a zipped file, though I have tried both compressed and uncompressed output. After saving the file and reading it back in, three columns have changed: most importantly 'geom' has reverted to 'geometry', 'parcel_apn_2' is now 'parcel_a_1', and 'fips_county' is now 'fips_count'. Am I missing something that would cause this behavior?

Checking the column names prior to saving:

# shp_prior_to_writing is the original GeoDataFrame 
shp_prior_to_writing.columns

returns...

Index(['xref_id', 'fips_state', 'fips_county', 'county', 'parcel_apn',
       'parcel_apn_2', 'address', 'city', 'state', 'zip', 'src_id', 'latitude',
       'longitude', 'geom'],
      dtype='object')

then writing the same file locally...

shp_prior_to_writing.to_file('test_shp.shp', driver='ESRI Shapefile')

and reading it back in...

import geopandas as gpd

same_shape_file = gpd.read_file('test_shp.shp')
same_shape_file.columns

returns...

Index(['xref_id', 'fips_state', 'fips_count', 'county', 'parcel_apn',
       'parcel_a_1', 'address', 'city', 'state', 'zip', 'src_id', 'latitude',
       'longitude', 'geometry'],
      dtype='object')

I've tried zipped and uncompressed output. I've tried without explicitly setting a driver (I believe it defaults to ESRI Shapefile anyway), and I've tried restarting the Jupyter kernel in my notebook. I've also tried explicitly renaming those columns again before saving, but the result always appears to be the same.

Answer 1

Score: 0

The shapefile format has a hard limit of 10 characters on column (field) names. That limit is baked into the format specification from ESRI (attributes are stored in a dBASE .dbf table), so it is not the fault of geopandas or of Fiona, which exposes the GDAL shapefile driver.
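
To check ahead of time which names would be affected, the shortening can be approximated in plain Python. This is only a sketch: `predict_shapefile_names` is a hypothetical helper, not part of geopandas, Fiona, or GDAL, and the clash rule (first 8 characters plus a numeric suffix) is an assumption modelled on the names reported in the question, so the real driver may differ in edge cases.

```python
def predict_shapefile_names(columns):
    """Approximate the field names a shapefile writer would end up with.

    Hypothetical helper: assumes names are cut to 10 characters and that a
    clash is resolved with the first 8 characters plus a numeric suffix.
    """
    used = set()   # upper-cased names already taken (clashes ignore case)
    result = {}
    for name in columns:
        candidate = name[:10]
        n = 1
        # First try suffixes _1 .. _9 on the first 8 characters.
        while candidate.upper() in used and n < 10:
            candidate = f"{name[:8]}_{n}"
            n += 1
        # Then fall back to two-digit suffixes 10 .. 99.
        while candidate.upper() in used and n < 100:
            candidate = f"{name[:8]}{n:02d}"
            n += 1
        used.add(candidate.upper())
        result[name] = candidate
    return result


# Attribute columns from the question (the geometry column is not a field).
columns = ['xref_id', 'fips_state', 'fips_county', 'county', 'parcel_apn',
           'parcel_apn_2', 'address', 'city', 'state', 'zip', 'src_id',
           'latitude', 'longitude']

mapping = predict_shapefile_names(columns)
changed = {old: new for old, new in mapping.items() if old != new}
print(changed)  # {'fips_county': 'fips_count', 'parcel_apn_2': 'parcel_a_1'}
```

Column order matters here: 'parcel_apn_2' only gets a suffix because 'parcel_apn' was already taken by an earlier column.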

See e.g. a discussion of the ESRI Shapefile standard’s limitations on Wikipedia, which lists the 10-character limit. Also see GIS StackExchange: Bypassing 10 character limit of field name in shapefiles? for a discussion of options.

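Because of this 10-character limit, your column names are shortened when the file is written, which is the name change you are seeing: 'fips_county' is truncated to 'fips_count', and 'parcel_apn_2' would truncate to 'parcel_apn', which already exists, so it becomes 'parcel_a_1' instead. The change from 'geom' to 'geometry' has a different cause: a shapefile does not store a name for the geometry column, so geopandas falls back to its default name, 'geometry', when reading the file. If you want to keep these column names and have them round-trip to disk, you will need to use a different file format.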
