How to read a dbf file in Python and convert it to a dataframe
Question
I am trying to read a dbf file using the `simpledbf` library and convert it to a dataframe for further processing.
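```python
from simpledbf import Dbf5

dbf = Dbf5(r"C:\Users\Prashant.kumar\Downloads\dbf\F1_1.DBF")
df1 = dbf.to_dataframe()
```
Unfortunately, I get the following error:
[![enter image description here](https://i.stack.imgur.com/DGaOj.png)](https://i.stack.imgur.com/DGaOj.png)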
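I looked for a solution but couldn't find one, nor could I find an alternative way to convert the dbf file to a dataframe for further processing.
Here is the file:
https://mega.nz/folder/gKIBUKIa#rE7TmE5FToLzCblMhLLFbw
Is there a way to read this dbf file into Python as a dataframe?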
# Answer 1
**Score**: 1
Use `dbfread` instead of `simpledbf`:
```python
# pip install dbfread
from dbfread import DBF
from pandas import DataFrame

# Iterating over a DBF object yields one dict-like record per row
dbf = DBF('F1_1.DBF')
df = DataFrame(iter(dbf))
```
Output:
```
>>> df
     RESPONDENT                                          RESPONDEN2  RESPONDEN3  STATUS  FORM_TYPE  STATUS_DAT  SORT_NAME  PSWD_GEN  _NullFlags
0             1                              AEP Generating Company                   A          0  1990-01-01                          b'\x00'
1             2                               ALABAMA POWER COMPANY                   A          0  2000-05-03                          b'\x00'
2             3             Alaska Electric Light and Power Company                   A          0  1990-01-01                          b'\x00'
3             4                         Alcoa Power Generating Inc.                   A          0  1990-01-01                          b'\x00'
4             5                    THE ALLEGHENY GENERATING COMPANY                   A          0  1990-01-01                          b'\x00'
..          ...                                                 ...         ...     ...        ...         ...        ...       ...         ...
389         538                                     DesertLink, LLC                   A         -1  2020-11-17                          b'\x00'
390         539   NextEra Energy Transmission MidAtlantic Indian...                   A         -1  2020-12-03                          b'\x00'
391         540                       Wilderness Line Holdings, LLC                   A         -1  2020-12-15                          b'\x00'
392         541                 McKenzie Electric Cooperative, Inc.                   A         -1  2021-04-19                          b'\x00'
393         542                LS Power Grid New York Corporation I                   A          0  2021-08-27                          b'\x00'

[394 rows x 9 columns]
```
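If the extra `_NullFlags` column is not wanted, or the text fields need an explicit encoding, the same approach can be wrapped in a small helper. This is only a sketch and not part of the original answer: the function names `records_to_dataframe` and `read_dbf` and the `drop_columns` default are hypothetical. `encoding` and `char_decode_errors` are existing keyword arguments of `dbfread.DBF`; using `"replace"` here is an assumption that substituting undecodable bytes is acceptable for your data.

```python
from pandas import DataFrame


def records_to_dataframe(records, drop_columns=("_NullFlags",)):
    """Build a DataFrame from an iterable of dict-like records.

    Columns named in drop_columns (a hypothetical default) are removed
    only if they are actually present.
    """
    df = DataFrame(list(records))
    present = [col for col in drop_columns if col in df.columns]
    return df.drop(columns=present)


def read_dbf(path, encoding=None, drop_columns=("_NullFlags",)):
    """Read a DBF file with dbfread and return it as a DataFrame."""
    from dbfread import DBF  # pip install dbfread

    # encoding=None lets dbfread pick the encoding from the file header;
    # char_decode_errors="replace" substitutes bytes that cannot be decoded.
    table = DBF(path, encoding=encoding, char_decode_errors="replace")
    return records_to_dataframe(table, drop_columns)


# Usage (assumes F1_1.DBF is in the working directory):
# df = read_dbf("F1_1.DBF")
```

Keeping the record-to-DataFrame step separate means it can be tried on plain dictionaries without a DBF file at hand.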