Evaluating Two Conditions in Pandas Dataframe with Separate Assignments

Question

After trying lots of different combinations and researching, I've come up with this solution, but I'm still getting a ValueError. I need to assign a "1" or "0" to each row depending on whether one column is above or below a threshold held in another column. For example, suppose my data looks like this:

df:

   avg  var1
0   30    60
1   40    50
2   45    20
3   50    10
4   50    74

df_final needs to look like this:

   avg  var1  condition
0   30    60          1
1   40    50          1
2   45    20          0
3   50    10          0
4   50    74          1

I have tried using "|" as the "or" operator, and I've also tried np.where with the condition below; that gives an answer, but the answer is incorrect.

df['condition'] = df[(df.var1 > df.avg == 1) | (df.var1 < df.avg == 0)]

but I get this ValueError:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Thank you. I think I'm close but just off by a little.

Answer 1

Score: 2

Just convert the boolean mask (the evaluated condition) to an integer type:

df['condition'] = (df.var1 > df.avg).astype(int)

   avg  var1  condition
0   30    60          1
1   40    50          1
2   45    20          0
3   50    10          0
4   50    74          1
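
For anyone who wants to run this end to end, here is a minimal sketch, assuming the sample data from the question. It also shows where the ValueError most likely comes from: Python reads a chained comparison such as `a > b == 1` as `(a > b) and (b == 1)`, and `and` asks a whole Series for a single True/False. The names `chained_error` and `condition_where` are made up for illustration and are not part of the original post.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"avg": [30, 40, 45, 50, 50], "var1": [60, 50, 20, 10, 74]})

# The expression in the question chains two comparisons. Python evaluates
# `a > b == 1` as `(a > b) and (b == 1)`; the `and` needs one truth value
# from an entire Series, which pandas refuses to guess.
try:
    df["var1"] > df["avg"] == 1
    chained_error = None
except ValueError as exc:
    chained_error = str(exc)

# Fix from the answer: cast the boolean mask to integers (True -> 1, False -> 0).
df["condition"] = (df["var1"] > df["avg"]).astype(int)

# Equivalent here, using np.where as mentioned in the question:
# 1 where the mask is True, 0 otherwise.
df["condition_where"] = np.where(df["var1"] > df["avg"], 1, 0)

print(df)
```

Note that with a strict `>` a row where var1 equals avg gets 0; if ties should count as 1, `>=` would be the comparison to use.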

huangapple
  • Posted on February 24, 2023, 02:12:47
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75548757.html