How to use when and Otherwise statement for a Spark dataframe by boolean columns?
Question
I have a dataset with three columns: country (string), threshold_1 (boolean), and threshold_2 (boolean).
I am trying to create a new column with the logic below, but I get an error.
I am using a Palantir code workbook for this. Can anyone tell me what I am missing here?
df = df.withColumn("Threshold_Filter",
    when(df["country"]=="INDIA" & df["threshold_1"]==True | df["threshold_2 "]==True, "Ind_country"
    ).otherwise("Dif_country"))
Answer 1
Score: 2
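You just need to wrap each comparison in parentheses. In Python, the & and | operators bind more tightly than ==, so without parentheses the expression is not grouped the way it reads. Note that & also binds more tightly than |, so the condition below is true when (country is INDIA and threshold_1 is true) or when threshold_2 is true; add another pair of parentheses around the | part if you want the other grouping.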
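from pyspark.sql.functions import when

df = (
    df
    .withColumn(
        "Threshold_Filter",
        when(
            # Parenthesize each comparison; & and | bind tighter than ==.
            (df["country"] == "INDIA") &
            (df["threshold_1"] == True) |
            (df["threshold_2"] == True),
            "Ind_country")
        .otherwise("Dif_country"))
)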
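To see why the parentheses matter, here is a small sketch that uses Python's standard ast module to show how the unparenthesized condition is parsed. It does not need Spark. The names country, t1 and t2 are placeholders I made up to stand in for the column expressions; they are not from the original post.

```python
import ast

# The unparenthesized condition from the question, with plain names standing in
# for the column expressions (country, t1, t2 are illustrative placeholders).
source = 'country == "INDIA" & t1 == True | t2 == True'
tree = ast.parse(source, mode="eval").body

# Python parses this as ONE chained comparison with three == operators:
#   country == ("INDIA" & t1) == (True | t2) == True
is_chained = isinstance(tree, ast.Compare) and len(tree.ops) == 3

# The first operand compared against `country` is the bitwise AND of the
# string literal and t1, because & binds tighter than ==.
first_operand = tree.comparators[0]
and_binds_first = (
    isinstance(first_operand, ast.BinOp)
    and isinstance(first_operand.op, ast.BitAnd)
)

# With each comparison parenthesized, the top-level node is a bitwise OR
# whose left side is the AND of the first two comparisons: (A & B) | C.
fixed_source = '(country == "INDIA") & (t1 == True) | (t2 == True)'
fixed = ast.parse(fixed_source, mode="eval").body
top_is_or = isinstance(fixed, ast.BinOp) and isinstance(fixed.op, ast.BitOr)
left_is_and = (
    isinstance(fixed.left, ast.BinOp)
    and isinstance(fixed.left.op, ast.BitAnd)
)

print(is_chained, and_binds_first, top_is_or, left_is_and)
```

All four checks come out true. In other words, the original expression asks Python to evaluate "INDIA" & column first and then chain the results through ==, which is most likely what triggers the error in PySpark, while the parenthesized version builds the intended (A & B) | C condition.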