Dictionary to Dataframe with list as value

Question

import pandas as pd

df = pd.DataFrame.from_dict({'pl': pl, 'pr': pr}, orient='index').T

I am trying to create a dataframe from two dictionaries with matching keys with the values in neighboring columns.

I have two dictionaries:

pl = {'seq1' : ['actgcta', 'cggctatcg'], 'seq2': ['cgatcgatca'], 'seq3': ['cgatcagt', 'cgataataat']}

pr = {'seq1' : ['cagtatacga', 'attacgat', 'atcgactagt'], 'seq2': ['cgatcgatca'], 'seq3': ['cgatcagt']}

I am trying to create a dataframe that looks like this (please forgive the crude figure):

seq1 |['actgcta','cggctatcg']    |['cagtatacga', 'attacgat', 'atcgactagt']
--------------------------------------------------------------------------
seq2 |['cgatcgatca']             |['cgatcgatca']
--------------------------------------------------------------------------
seq3 |['cgatcagt', 'cgataataat'] |['cgatcagt']

I have tried pd.DataFrame and pd.DataFrame.from_dict, and played with various orient arguments, but have had no success.

Answer 1

Score: 1


One way using pandas.concat:

df = pd.concat(map(pd.Series, [pl, pr]), axis=1)

Output:

                           0                                   1
seq1    [actgcta, cggctatcg]  [cagtatacga, attacgat, atcgactagt]
seq2            [cgatcgatca]                        [cgatcgatca]
seq3  [cgatcagt, cgataataat]                          [cgatcagt]
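
If named columns are preferred over the default 0 and 1, a minimal sketch along the same lines is to pass pd.concat a dict of Series. The column labels "pl" and "pr" below are illustrative choices, not part of the original answer:

```python
import pandas as pd

pl = {'seq1': ['actgcta', 'cggctatcg'], 'seq2': ['cgatcgatca'], 'seq3': ['cgatcagt', 'cgataataat']}
pr = {'seq1': ['cagtatacga', 'attacgat', 'atcgactagt'], 'seq2': ['cgatcgatca'], 'seq3': ['cgatcagt']}

# Each dict becomes a Series whose index is the dict keys and whose
# values are the lists themselves (object dtype).
# With axis=1, the keys of the outer dict ("pl", "pr") become the column
# labels; these names are an assumption made for illustration.
df = pd.concat({"pl": pd.Series(pl), "pr": pd.Series(pr)}, axis=1)
print(df)
```

Because the Series are aligned on their index, a key present in only one dictionary should end up as NaN in the other column rather than raising an error.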

Answer 2

Score: 0


Unfortunately, pd.DataFrame cannot take these dictionaries directly: it treats each list as a column, and the columns must all be the same length, which they are not. So we must first wrap each list in an outer list, so that every key holds a single list-valued cell. Then we can combine the resulting dataframes with a simple concatenation:

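import pandas as pd

pl = {'seq1' : ['actgcta', 'cggctatcg'], 'seq2': ['cgatcgatca'], 'seq3': ['cgatcagt', 'cgataataat']}
pr = {'seq1' : ['cagtatacga', 'attacgat', 'atcgactagt'], 'seq2': ['cgatcgatca'], 'seq3': ['cgatcagt']}

# Wrap each list in an outer list so every key maps to exactly one cell.
# Building new dicts leaves the original pl and pr unmodified.
df = pd.DataFrame({key: [value] for key, value in pl.items()})
df1 = pd.DataFrame({key: [value] for key, value in pr.items()})
# Stack the two one-row frames; ignore_index=True labels the rows 0 and 1
# instead of giving both the label 0.
df2 = pd.concat([df, df1], ignore_index=True)
# Transpose so the seq keys become the row index.
print(df2.transpose())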
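
The length restriction described above can be seen directly. The following is a small sketch, not part of the original answer, showing that the plain constructor rejects the ragged lists, while wrapping each dictionary in a pd.Series keeps every list as one cell. The variable direct_ok and the column labels "pl" and "pr" are names chosen here for illustration:

```python
import pandas as pd

pl = {'seq1': ['actgcta', 'cggctatcg'], 'seq2': ['cgatcgatca'], 'seq3': ['cgatcagt', 'cgataataat']}
pr = {'seq1': ['cagtatacga', 'attacgat', 'atcgactagt'], 'seq2': ['cgatcgatca'], 'seq3': ['cgatcagt']}

# Passing the dict directly treats each list as a column, so lists of
# different lengths make the constructor raise ValueError.
try:
    pd.DataFrame(pl)
    direct_ok = True
except ValueError:
    direct_ok = False
print(direct_ok)

# A Series built from the dict stores each list as a single object cell,
# so no length check applies. "pl" and "pr" are illustrative labels.
df = pd.DataFrame({"pl": pd.Series(pl), "pr": pd.Series(pr)})
print(df)
```

This avoids the wrap-and-transpose step, at the cost of relying on object-dtype columns that hold Python lists.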

huangapple
  • Published 2023-02-08 10:17:18
  • Original link: https://go.coder-hub.com/75380794.html