Pandas reading CSV with ^G as separator
Question
The CSV file uses `^G` as its delimiter. I am using pandas, and the separator I currently pass is a comma. I now have a new requirement to read a `^G`-separated CSV. Is there a library or pandas option that supports this? Also, every column is enclosed in double quotes.

Sample CSV data:

```
"2198"^G"data"^G"x"
"2199"^G"data2"^G"y"
"2198"^G"data3"^G"z"
```

Based on a suggestion, I tried the command below:

```python
df = pd.read_csv(f, engine="python", sep=r"\^G", header=None, names=columns, quoting=csv.QUOTE_NONE)
```

I get the output below:

```
{"col1":"\"2198\"","col2":"\"data\"","col3":"\"x\""}
```

How do I remove the quote marks and backslashes from the data in the final output?
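A note on the slashes: they are probably not stored in the DataFrame itself. Assuming the output above was produced by serializing to JSON (for example with `df.to_json`, which the question does not state), the backslashes are just JSON escapes for the literal `"` characters that `quoting=csv.QUOTE_NONE` leaves in each value. A minimal standard-library sketch of that effect, using a hypothetical one-cell record:

```python
import json

# Hypothetical cell value as parsed with QUOTE_NONE: the quotes are part of the string.
cell = '"2198"'

# Serializing to JSON escapes each embedded double quote with a backslash.
encoded = json.dumps({"col1": cell})
print(encoded)

# Decoding shows that the stored value contains quotes but no backslashes.
decoded = json.loads(encoded)["col1"]
print(decoded)

# Stripping the quotes before serializing removes the escapes as well.
cleaned = json.dumps({"col1": cell.strip('"')})
print(cleaned)
```

If that assumption holds, removing the quotes from the values is enough; the slashes disappear from the JSON output with them.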
Answer 1
Score: 1
Use [`read_csv`](http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html) with `engine='python'` and escape the `^`, because it is a special character in regular expressions:

```python
import pandas as pd

# `file` is the path (or file object) of the ^G-delimited CSV
df = pd.read_csv(file, sep=r"\^G", engine='python')
```

EDIT: You can pass a converter that calls `strip` to remove the `"` characters:

```python
columns = list('abc')
df = pd.read_csv(file,
                 engine="python",
                 sep=r"\^G",
                 header=None,
                 names=columns,
                 # strip the surrounding double quotes from every column
                 converters=dict.fromkeys(columns, lambda x: x.strip('"')))
print(df)
```

```
      a      b  c
0  2198   data  x
1  2199  data2  y
2  2198  data3  z
```
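To see why the `^` needs escaping, the sketch below splits one sample line with the standard `re` module. It assumes the delimiter is the literal two-character sequence `^G`, as in the sample data. It only approximates what a regex separator does and is not pandas' actual implementation; the variable names are illustrative.

```python
import re

line = '"2198"^G"data"^G"x"'

# Unescaped, ^ is an anchor: the pattern means "G at the start of the string",
# which never matches here, so the line is returned unsplit.
unescaped = re.split("^G", line)
print(unescaped)

# Escaped, \^ matches a literal caret, so the line splits on "^G".
escaped = re.split(r"\^G", line)
print(escaped)

# The surrounding quotes survive the split, hence the strip('"') step.
fields = [part.strip('"') for part in escaped]
print(fields)
```

The last step mirrors the converter in the answer: the regex split leaves the quotes attached to each field, so they have to be stripped separately.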