Why am I losing information with .str.split(expand=True)?
Question
I'm trying to expand a column of a dataframe which is made up of strings, something like this:
ATTGG
CATGC
GTGCC
into several columns in a new dataframe.
The command I used is:
newdf = pd.DataFrame(df['col'].str.split("", expand=True))
When printing, I found that the first column and the first row are actually the index:
0 1 2 3 4 5
1 C A T G C
2 G T G C C
and that my first row is cut off, presumably because of the presence of the index.
Why is my first row cut off? What can I do to fix this?
Answer 1
Score: 1
Convert each string to a list of characters before creating the dataframe:
newdf = pd.DataFrame.from_records(df['col'].map(list))
print(newdf)
# Output
0 1 2 3 4
0 A T T G G
1 C A T G C
2 G T G C C
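For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It assumes the column is named `col` and holds the three sample strings from the question (the sample frame is made up for illustration). It passes a list of lists to the plain `pd.DataFrame` constructor instead of calling `from_records`, which is a substitution in this sketch and not the exact call used above:

```python
import pandas as pd

# Hypothetical sample data mirroring the strings in the question
df = pd.DataFrame({"col": ["ATTGG", "CATGC", "GTGCC"]})

# list("ATTGG") yields ['A', 'T', 'T', 'G', 'G'], one element per character,
# so no empty leading or trailing pieces are produced.
rows = df["col"].map(list).tolist()

# Build the new frame from the list of lists, keeping the original row index
newdf = pd.DataFrame(rows, index=df.index)

print(newdf)
```

Because each string is turned into a list directly, nothing depends on how an empty separator is treated by `str.split`. This sketch assumes every string has the same length; with ragged input the shorter rows would be padded with missing values.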