Create dictionaries from two columns of a pandas DataFrame, grouped by the values of another column

Question

I would like to create dictionaries from two columns of a pandas DataFrame, with one dictionary per distinct value of a third column.

I have a DataFrame like this:

       a    b  c
0  brand    F  0
1  brand    G  1
2   seat  yes  1
3   seat   no  0

I would like to create dictionaries like:

brand = {'F':0,'G':1}
seat = {'no':0,'yes':1}

The following attempt does not give me the result I want:

dic = {}
for x, y in zip(df['b'].values, df['c'].values):
    dic.setdefault(y, []).append(x)

Thank you!

Answer 1

Score: 2

A modification of your approach would be:

dic = {}
for a, b, c in zip(df['a'], df['b'], df['c']):
    dic.setdefault(a, {})[b] = c

The pandas equivalent:

dic = (df.set_index('b').groupby('a')['c']
         .agg(lambda g: g.to_dict())
         .to_dict()
      )

Contents of dic:

{'brand': {'F': 0, 'G': 1},
 'seat': {'yes': 1, 'no': 0}}
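
To see what changes between the original attempt and the nested setdefault version, here is a minimal sketch that uses plain Python lists in place of the DataFrame columns. The names col_a, col_b, col_c, flat and nested are stand-ins introduced for illustration only; they are not from the question.

```python
# Plain lists standing in for df['a'], df['b'], df['c'] (hypothetical names).
col_a = ['brand', 'brand', 'seat', 'seat']
col_b = ['F', 'G', 'yes', 'no']
col_c = [0, 1, 1, 0]

# Original attempt: the keys are the values of c, and column a is never used.
flat = {}
for x, y in zip(col_b, col_c):
    flat.setdefault(y, []).append(x)
print(flat)  # {0: ['F', 'no'], 1: ['G', 'yes']}

# Nested version: the outer key is a, the inner mapping is b -> c.
nested = {}
for a, b, c in zip(col_a, col_b, col_c):
    nested.setdefault(a, {})[b] = c
print(nested)  # {'brand': {'F': 0, 'G': 1}, 'seat': {'yes': 1, 'no': 0}}

# The two dictionaries asked for in the question are then plain lookups.
brand = nested['brand']
seat = nested['seat']
```

The original loop groups the b labels by their c value, which is why 'F' and 'no' end up together. Keying the outer dictionary on a keeps the groups separate.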

Answer 2

Score: 1

Python has no clean way to create variables whose names come from the values in a column. An easy solution is a dictionary whose values are themselves dictionaries.

import pandas as pd

df = pd.DataFrame({
  'a': ['brand', 'brand', 'seat', 'seat'],
  'b': ['F', 'G', 'yes', 'no'],
  'c': [0, 1, 1, 0]
})

dic = {}
for k, g in df.groupby('a'):
    dic[k] = dict(zip(*g[['b','c']].values.T))

dic
# returns:
{'brand': {'F': 0, 'G': 1}, 'seat': {'yes': 1, 'no': 0}}
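
As a possible variation on the loop above, the same grouping can be written as a dict comprehension that zips the two columns directly, without the transpose through .values.T. This is only a sketch against the sample data from the question; it assumes each b label appears at most once within a group.

```python
import pandas as pd

df = pd.DataFrame({
    'a': ['brand', 'brand', 'seat', 'seat'],
    'b': ['F', 'G', 'yes', 'no'],
    'c': [0, 1, 1, 0],
})

# One inner dict per group of 'a', pairing each 'b' label with its 'c' value.
dic = {k: dict(zip(g['b'], g['c'])) for k, g in df.groupby('a')}

# If a 'b' label repeated within a group, the last 'c' value would win.
brand, seat = dic['brand'], dic['seat']
```

Here brand and seat are ordinary names bound by hand. That is the practical substitute for generating variable names from the column values.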

huangapple
  • Posted on April 13, 2023 at 23:29:54
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76007263.html