How to read a dbf file in Python and convert it to a dataframe

Question

I am trying to read a dbf file using the `simpledbf` library and convert it to a dataframe for further processing.

```python
from simpledbf import Dbf5

dbf = Dbf5(r"C:\Users\Prashant.kumar\Downloads\dbf\F1_1.DBF")
df1 = dbf.to_dataframe()
```

Unfortunately, I am getting the following error.

[![enter image description here][1]][1]

I tried to find a solution but couldn't find one, nor can I find an alternative way to convert the dbf file to a dataframe for post-processing.

Here is the file:
https://mega.nz/folder/gKIBUKIa#rE7TmE5FToLzCblMhLLFbw

Is there a way to read this dbf file into Python as a dataframe?

  [1]: https://i.stack.imgur.com/DGaOj.png
# Answer 1

**Score**: 1

Use `dbfread` instead of `simpledbf`:

```python
# pip install dbfread
from dbfread import DBF
from pandas import DataFrame

# DBF yields one dict-like record per row; DataFrame consumes the iterator.
dbf = DBF('F1_1.DBF')
df = DataFrame(iter(dbf))
```

Output:

```
>>> df
     RESPONDENT                                         RESPONDEN2  RESPONDEN3  STATUS  FORM_TYPE  STATUS_DAT  SORT_NAME  PSWD_GEN  _NullFlags
0             1                             AEP Generating Company                   A          0  1990-01-01                          b'\x00'
1             2                              ALABAMA POWER COMPANY                   A          0  2000-05-03                          b'\x00'
2             3            Alaska Electric Light and Power Company                   A          0  1990-01-01                          b'\x00'
3             4                        Alcoa Power Generating Inc.                   A          0  1990-01-01                          b'\x00'
4             5                   THE ALLEGHENY GENERATING COMPANY                   A          0  1990-01-01                          b'\x00'
..          ...                                                ...         ...     ...        ...         ...        ...       ...         ...
389         538                                    DesertLink, LLC                   A         -1  2020-11-17                          b'\x00'
390         539  NextEra Energy Transmission MidAtlantic Indian...                   A         -1  2020-12-03                          b'\x00'
391         540                      Wilderness Line Holdings, LLC                   A         -1  2020-12-15                          b'\x00'
392         541                McKenzie Electric Cooperative, Inc.                   A         -1  2021-04-19                          b'\x00'
393         542               LS Power Grid New York Corporation I                   A          0  2021-08-27                          b'\x00'

[394 rows x 9 columns]
```
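For the further processing mentioned in the question, it may help to tidy the frame once it is loaded. The sketch below is only an illustration under stated assumptions: `records_to_frame`, `drop_columns`, and `date_columns` are hypothetical names (not part of `dbfread` or pandas), the sample records are hand-written to mirror two rows of the output above, and it assumes the `_NullFlags` column is bookkeeping data that is not needed for analysis and that date fields arrive as `datetime.date` objects.

```python
import datetime

import pandas as pd

# Hand-written sample records mirroring two rows of the output above
# (only a subset of the columns). Assumption: the reader yields dict-like
# records and parses date fields into datetime.date objects.
sample_records = [
    {"RESPONDENT": 1, "RESPONDEN2": "AEP Generating Company", "STATUS": "A",
     "FORM_TYPE": 0, "STATUS_DAT": datetime.date(1990, 1, 1),
     "_NullFlags": b"\x00"},
    {"RESPONDENT": 2, "RESPONDEN2": "ALABAMA POWER COMPANY", "STATUS": "A",
     "FORM_TYPE": 0, "STATUS_DAT": datetime.date(2000, 5, 3),
     "_NullFlags": b"\x00"},
]


def records_to_frame(records, drop_columns=("_NullFlags",),
                     date_columns=("STATUS_DAT",)):
    """Hypothetical helper: build a DataFrame from records and tidy it."""
    df = pd.DataFrame(list(records))
    # Drop unwanted columns; errors="ignore" skips names that are absent.
    df = df.drop(columns=list(drop_columns), errors="ignore")
    # Convert date objects to pandas datetimes so the .dt accessor works.
    for col in date_columns:
        if col in df.columns:
            df[col] = pd.to_datetime(df[col])
    return df


clean = records_to_frame(sample_records)
print(clean.dtypes)
```

With the real file, the same helper could presumably be called as `records_to_frame(DBF('F1_1.DBF'))`, since a `DBF` object can be iterated record by record.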

huangapple
  • Posted on 2023-02-14 21:00:27
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75448200.html