Dictionary Comprehension within a pandas DataFrame column

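# Question

I'm trying to match a dictionary item with a string value from another column. Sample data:

    df =     A    B
         0  'a'  {'a': '2', 'b': '5'}
         1  'c'  {'a': '2', 'b': '16', 'c': '32'}
         2  'a'  {'a': '6', 'd': '23'}
         3  'd'  {'b': '4', 'd': '76'}

I'm trying to get the following output:

    df =     A    B
         0  'a'  {'a': '2'}
         1  'c'  {'c': '32'}
         2  'a'  {'a': '6'}
         3  'd'  {'d': '76'}

I got this far outside a DataFrame, for a single dictionary:

    d = {k: v for k, v in my_dict.items() if k == 'a'}

I couldn't get the equivalent to work on the column. To be fair, I didn't expect it to work directly, but I was hoping I was close:

    test_df['B'] = {k: v for k, v in test_df['B'].items() if k == test_df['A']}

I get the following error:

    ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

What do I need to do to get this to work, or is there a better, more efficient way?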
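
For context, the error in the attempt above can be reproduced in isolation. The following is a minimal sketch, assuming a two-row `test_df` built here purely for illustration (the names `pairs`, `comparison`, and `error_message` are not from the original post). It shows that `Series.items()` yields `(index label, value)` pairs rather than the key/value pairs of each dictionary, and that comparing a scalar with the whole column `test_df['A']` produces a boolean Series, which cannot be used as a single `if` condition.

```python
import pandas as pd

# Illustrative data only: a cut-down version of the sample frame.
test_df = pd.DataFrame({
    "A": ["a", "c"],
    "B": [{"a": "2", "b": "5"}, {"a": "2", "b": "16", "c": "32"}],
})

# Series.items() yields (index label, value) pairs, so k would be 0, 1, ...
# and v would be the whole dictionary stored in that row.
pairs = list(test_df["B"].items())

# Comparing a scalar with a Series is element-wise and returns a boolean Series.
comparison = 0 == test_df["A"]

# Using that Series as an `if` condition asks for one truth value,
# which pandas refuses to guess.
try:
    bool(comparison)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

This suggests the comprehension needs to pair each row's key with that row's dictionary, rather than iterating over the column as if it were one dictionary.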


# Answer 1
**Score**: 2

You can use a list comprehension with [`zip`](https://docs.python.org/3/library/functions.html#zip):
```python
df['B'] = [{x: d[x]} for x, d in zip(df['A'], df['B'])]
```

Output:

       A            B
    0  a   {'a': '2'}
    1  c  {'c': '32'}
    2  a   {'a': '6'}
    3  d  {'d': '76'}

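# Answer 2
**Score**: 1

You can do this simply within pandas itself using `apply` with `axis=1`, looking up the columns by label:

    df['B'] = df.apply(lambda row: {row['A']: row['B'][row['A']]}, axis=1)

Output:

       A            B
    0  a   {'a': '2'}
    1  c  {'c': '32'}
    2  a   {'a': '6'}
    3  d  {'d': '76'}

Note that there is no error checking: if the value in `A` is not a key of the dictionary in `B`, the lookup raises a `KeyError`.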
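
If missing keys are possible, one option is to guard the lookup. The following is a minimal sketch, assuming that an empty dictionary is an acceptable result for a row whose key is absent; the helper name `pick_key` and the sample row with key `'z'` are made up for illustration and are not part of the original answers.

```python
import pandas as pd

# Illustrative data: the last row's key 'z' is not in its dictionary.
df = pd.DataFrame({
    "A": ["a", "c", "z"],
    "B": [{"a": "2", "b": "5"}, {"a": "2", "c": "32"}, {"b": "4"}],
})


def pick_key(key, mapping):
    """Return {key: mapping[key]}, or an empty dict when the key is absent."""
    return {key: mapping[key]} if key in mapping else {}


# Pair each row's key with that row's dictionary, as in the zip-based answer.
df["B"] = [pick_key(k, d) for k, d in zip(df["A"], df["B"])]
```

Returning `{}` is only one possible policy; `None` or raising a more descriptive error may suit other cases better.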


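# Answer 3
**Score**: 0

    class MyDict:
        """Wrap a dict so that `-` can be used to select keys from it."""

        def __init__(self, d: dict) -> None:
            self.dict = d

        def __sub__(self, other):
            # `other` is iterated, so a string such as 'a' is read character by character
            return {x: self.dict[x] for x in other}

    # Wrap every dict in B, then "subtract" column A element-wise
    df.B.map(MyDict) - df.A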

huangapple
  • Posted on June 1, 2023 at 22:58:22
  • Please keep this link when reposting: https://go.coder-hub.com/76383242.html