Evaluating Two Conditions in Pandas Dataframe with Separate Assignments
Question
After trying lots of different combinations and researching, I've come up with the attempt below, but I'm still getting a ValueError. I need to assign a "1" or "0" based on a comparison of two columns: 1 when var1 is above the avg threshold and 0 when it is below. For example, suppose my data looks like this:
df:
   avg  var1
0   30    60
1   40    50
2   45    20
3   50    10
4   50    74
df_final needs to look like this:
   avg  var1  condition
0   30    60          1
1   40    50          1
2   45    20          0
3   50    10          0
4   50    74          1
I have tried this using "|" as the "or" operator, and I've also tried np.where with the condition below. That gives an answer, but the answer is incorrect.
df['condition'] = df[(df.var1 > df.avg == 1) | (df.var1 < df.avg == 0)]
but I get this ValueError:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Thank you. I think I'm close but just off by a little.
Answer 1
Score: 2
Just convert the boolean mask (the evaluated condition) to an integer type:
df['condition'] = (df.var1 > df.avg).astype(int)
   avg  var1  condition
0   30    60          1
1   40    50          1
2   45    20          0
3   50    10          0
4   50    74          1
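For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the cast. It also shows where the original ValueError most likely comes from: Python reads the chained comparison `df.var1 > df.avg == 1` as `(df.var1 > df.avg) and (df.avg == 1)`, and `and` needs a single True/False from a whole Series. The `condition_where` column and the `chained_error` variable are names made up for this illustration, and the np.where line is just an equivalent alternative, not part of the answer above.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"avg": [30, 40, 45, 50, 50], "var1": [60, 50, 20, 10, 74]})

# The chained comparison from the question: Python expands `a > b == 1`
# to `(a > b) and (b == 1)`, and `and` asks a Series for one truth value.
try:
    df.var1 > df.avg == 1
    chained_error = None
except ValueError as exc:
    chained_error = str(exc)  # "The truth value of a Series is ambiguous..."

# The answer's approach: compare the columns once, then cast True/False to 1/0.
df["condition"] = (df["var1"] > df["avg"]).astype(int)

# Equivalent alternative with np.where (hypothetical column name, for comparison only).
df["condition_where"] = np.where(df["var1"] > df["avg"], 1, 0)

print(df)
```

Since the two outcomes are complementary, a single comparison is enough and no "|" is needed. If several independent conditions did have to be combined, each one would go in its own parentheses, for example `(a > b) | (c < d)`.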