Remove values on the basis of another column
Question
I have two columns in a data frame: one holds the total score and the other the expected score. I want to remove the values in the expected score column where the expected score is greater than the total score.
df <- data.frame(total_score=c(4.5,12.2,4.6,9.2,12.2,36.4,4.5,12.2,4.6,9.2,12.2,36.4),
                 expected_score=c(4.5,12.1,NA,10,12.2,NA,5,12.5,NA,9.2,16,NA),
                 Region1=c("All region",NA,NA,"All region","All region",NA,"All region",NA,NA,"All region","All region",NA),
                 Region2=c("EAST","EAST","EAST","EAST","EAST",NA,"EAST","EAST","EAST","EAST","EAST",NA),
                 Region3=c("West",NA,"West","West","West","West","West",NA,"West","West","West","West"))
Answer 1
Score: 1
Try this first option, which uses dplyr:
library(dplyr)
df %>%
  mutate(expected_score = ifelse(expected_score > total_score,
                                 NA, expected_score))
   total_score expected_score    Region1 Region2 Region3
1          4.5            4.5 All region    EAST    West
2         12.2           12.1       <NA>    EAST    <NA>
3          4.6             NA       <NA>    EAST    West
4          9.2             NA All region    EAST    West
5         12.2           12.2 All region    EAST    West
6         36.4             NA       <NA>    <NA>    West
7          4.5             NA All region    EAST    West
8         12.2             NA       <NA>    EAST    <NA>
9          4.6             NA       <NA>    EAST    West
10         9.2            9.2 All region    EAST    West
11        12.2             NA All region    EAST    West
12        36.4             NA       <NA>    <NA>    West
Using data.table you can do:
library(data.table)
setDT(df)  # convert df to a data.table by reference
df[expected_score > total_score, expected_score := NA]
On a large data.frame, the latter option using data.table should typically be much faster, because := updates the column by reference instead of copying the data.
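For comparison, the same replacement can be sketched in base R without loading any package. This is not part of the original answer; it assumes the question's df, shortened here to the two score columns, and introduces a helper vector named too_high purely for illustration. which() is used so that rows where the comparison is NA are left untouched.

```r
# Assumption: only the two score columns from the question's df are needed here
df <- data.frame(
  total_score    = c(4.5, 12.2, 4.6, 9.2, 12.2, 36.4, 4.5, 12.2, 4.6, 9.2, 12.2, 36.4),
  expected_score = c(4.5, 12.1, NA, 10, 12.2, NA, 5, 12.5, NA, 9.2, 16, NA)
)

# Row positions where expected_score exceeds total_score;
# which() drops the positions where the comparison is NA
too_high <- which(df$expected_score > df$total_score)

# Replace only those values with NA
df$expected_score[too_high] <- NA

print(too_high)
print(df$expected_score)
```

Rows whose expected_score equals total_score are kept, matching the strict > comparison used in the dplyr and data.table versions.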
Answer 2
Score: 1
One-line solution that selects the rows where the expected score is greater than the total score:
df[which(df[, 2] > df[, 1]), ]
# here 2 is your expected score and 1 is your total score;
# which() skips the rows where the comparison is NA
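Because expected_score contains NA, it matters how the comparison is used as a row index. A short sketch, using a hypothetical three-row data frame named scores (not from the question), of how a bare logical index differs from one wrapped in which():

```r
# Hypothetical toy data: the second row has a missing expected_score
scores <- data.frame(total_score    = c(9.2, 4.6, 12.2),
                     expected_score = c(10, NA, 12.1))

# The comparison yields TRUE, NA, FALSE
cond <- scores$expected_score > scores$total_score

# A bare logical index returns an all-NA row for every NA in cond
with_na_rows <- scores[cond, ]

# which() keeps only the positions that are TRUE
clean_rows <- scores[which(cond), ]

print(with_na_rows)
print(clean_rows)
```

Wrapping the condition in which() is one way to avoid the extra NA rows; filtering with !is.na() on the condition itself would be another.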