How to read a dbf file in Python and convert it to a dataframe


Question

I am trying to read a dbf file with the `simpledbf` library and convert it to a dataframe for further processing.

```python
from simpledbf import Dbf5

dbf = Dbf5(r"C:\Users\Prashant.kumar\Downloads\dbf\F1_1.DBF")
df1 = dbf.to_dataframe()
```

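Unfortunately, I am getting the following error (screenshot: https://i.stack.imgur.com/DGaOj.png).

I looked for a solution but could not find one, nor could I find an alternative way to convert the dbf file to a dataframe for post-processing.

Here is the file:
https://mega.nz/folder/gKIBUKIa#rE7TmE5FToLzCblMhLLFbw

Is there a way to read this dbf file into Python as a dataframe?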




# Answer 1
**Score**: 1

Use `dbfread` instead of `simpledbf`:

```python
# pip install dbfread
from dbfread import DBF
from pandas import DataFrame

# DBF yields one dict-like record per row when iterated
dbf = DBF('F1_1.DBF')
df = DataFrame(iter(dbf))
```

Output:

```
>>> df
     RESPONDENT                                         RESPONDEN2 RESPONDEN3 STATUS  FORM_TYPE  STATUS_DAT SORT_NAME PSWD_GEN _NullFlags
0             1                             AEP Generating Company                 A          0  1990-01-01                       b'\x00'
1             2                              ALABAMA POWER COMPANY                 A          0  2000-05-03                       b'\x00'
2             3            Alaska Electric Light and Power Company                 A          0  1990-01-01                       b'\x00'
3             4                        Alcoa Power Generating Inc.                 A          0  1990-01-01                       b'\x00'
4             5                   THE ALLEGHENY GENERATING COMPANY                 A          0  1990-01-01                       b'\x00'
..          ...                                                ...        ...    ...        ...         ...       ...      ...        ...
389         538                                    DesertLink, LLC                 A         -1  2020-11-17                       b'\x00'
390         539  NextEra Energy Transmission MidAtlantic Indian...                 A         -1  2020-12-03                       b'\x00'
391         540                      Wilderness Line Holdings, LLC                 A         -1  2020-12-15                       b'\x00'
392         541                McKenzie Electric Cooperative, Inc.                 A         -1  2021-04-19                       b'\x00'
393         542               LS Power Grid New York Corporation I                 A          0  2021-08-27                       b'\x00'

[394 rows x 9 columns]
```
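
The `_NullFlags` column in the output is, as far as I know, a hidden system field that Visual FoxPro adds to tables with nullable columns, so it usually carries nothing you need in the dataframe. If you want to leave it out, one option is to filter each record before handing it to pandas. This is only a sketch: `drop_fields` and `dbf_to_dataframe` are hypothetical helper names, not part of `dbfread` or pandas, and the sketch assumes each record behaves like a dict (which is what iterating a `DBF` object yields by default).

```python
def drop_fields(records, unwanted=("_NullFlags",)):
    """Yield a copy of each record without the fields named in `unwanted`."""
    for record in records:
        yield {name: value for name, value in record.items() if name not in unwanted}


def dbf_to_dataframe(path, unwanted=("_NullFlags",)):
    """Read a DBF file into a DataFrame, skipping the unwanted fields."""
    # Imported here so drop_fields can be tried without dbfread or pandas installed.
    from dbfread import DBF
    from pandas import DataFrame

    return DataFrame(list(drop_fields(DBF(path), unwanted)))


# Usage, assuming F1_1.DBF is in the working directory:
# df = dbf_to_dataframe("F1_1.DBF")
```

If you already have the dataframe from the answer, `df.drop(columns="_NullFlags")` should give the same result; filtering the records first just avoids building the column at all.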

huangapple
  • Posted on February 14, 2023, 21:00:27
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75448200.html