2023-05-22 19:12:14

Remove values on the basis of another column

Question

I have a data frame with two columns: one holds the total score and the other holds the expected score. I want to get the values from the expected score column where the expected score is greater than the total score.

df <- data.frame(total_score=c(4.5,12.2,4.6,9.2,12.2,36.4,4.5,12.2,4.6,9.2,12.2,36.4),
                 expected_score=c(4.5,12.1,NA,10,12.2,NA,5,12.5,NA,9.2,16,NA),
                 Region1=c("All region",NA,NA,"All region","All region",NA,"All region",NA,NA,"All region","All region",NA),
                 Region2=c("EAST","EAST","EAST","EAST","EAST",NA,"EAST","EAST","EAST","EAST","EAST",NA),
                 Region3=c("West",NA,"West","West","West","West","West",NA,"West","West","West","West"))

Answer 1

Score: 1

Try this first option, which uses dplyr:

library(dplyr)
df %>%
    mutate(expected_score = ifelse(expected_score > total_score,
                                   NA, expected_score))

   total_score expected_score    Region1 Region2 Region3
1          4.5            4.5 All region    EAST    West
2         12.2           12.1       <NA>    EAST    <NA>
3          4.6             NA       <NA>    EAST    West
4          9.2             NA All region    EAST    West
5         12.2           12.2 All region    EAST    West
6         36.4             NA       <NA>    <NA>    West
7          4.5             NA All region    EAST    West
8         12.2             NA       <NA>    EAST    <NA>
9          4.6             NA       <NA>    EAST    West
10         9.2            9.2 All region    EAST    West
11        12.2             NA All region    EAST    West
12        36.4             NA       <NA>    <NA>    West

Using data.table you can do:

library(data.table)
setDT(df)
df[expected_score > total_score, expected_score := NA]

On a large data.frame, the data.table option should be faster, because := updates the column by reference instead of copying the data.
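If neither package is available, the same replacement can be sketched in base R. This is only an illustration under the assumption that the two score columns are named as in the question; the helper variable `too_high` is introduced here and is not part of the original answer.

```r
# Same two score columns as in the question (region columns omitted for brevity).
df <- data.frame(
  total_score    = c(4.5, 12.2, 4.6, 9.2, 12.2, 36.4, 4.5, 12.2, 4.6, 9.2, 12.2, 36.4),
  expected_score = c(4.5, 12.1, NA, 10, 12.2, NA, 5, 12.5, NA, 9.2, 16, NA)
)

# which() drops comparisons that evaluate to NA, so only rows where
# expected_score is known and strictly greater than total_score are selected.
too_high <- which(df$expected_score > df$total_score)

# Overwrite those entries with a numeric NA; all other rows are left untouched.
df$expected_score[too_high] <- NA_real_

print(df)
```

Wrapping the comparison in which() keeps rows whose expected score is already NA out of the index, so they pass through unchanged.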

Answer 2

Score: 1

One-line solution:

df[which(df[, 2] > df[, 1]), ]

# Here 2 is your expected score column and 1 is your total score column.
# which() drops the comparisons that evaluate to NA, so rows with a missing expected score are excluded.
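To pull out just the exceeding values as a vector, rather than whole rows, something like the following might work. It is a sketch that assumes the column names from the question; `keep` and `exceeding` are names made up for this example.

```r
# Same two score columns as in the question (region columns omitted for brevity).
df <- data.frame(
  total_score    = c(4.5, 12.2, 4.6, 9.2, 12.2, 36.4, 4.5, 12.2, 4.6, 9.2, 12.2, 36.4),
  expected_score = c(4.5, 12.1, NA, 10, 12.2, NA, 5, 12.5, NA, 9.2, 16, NA)
)

# TRUE only where expected_score is present and strictly greater than total_score.
keep <- !is.na(df$expected_score) & df$expected_score > df$total_score

# The expected scores that exceed the total score: 10, 5, 12.5 and 16.
exceeding <- df$expected_score[keep]
print(exceeding)

# subset() treats an NA condition as FALSE, so this returns the same rows
# together with their total_score values.
exceeding_rows <- subset(df, expected_score > total_score)
print(exceeding_rows)
```

Selecting by column name instead of position keeps the code working if the column order of the data frame changes.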
