2023-08-04 23:04:09

How to remove duplicate rows in R based on condition?

Question

I have the following data:

df <- data.frame(id = c("001", "001", "001", "002", "002", "003", "003"),
                 x = c(0, 0, 0, 0, 1, 0, 1))

 id x
001 0
001 0
001 0
002 0
002 1
003 0
003 1

Some ids have only x = 0 rows. When x = 1 does occur for an id, it occurs exactly once, and always in the last row for that id. I want to keep one row per id: the x = 1 row if the id has one, otherwise a single x = 0 row.

The desired output:

 id x
001 0
002 1
003 1

A tidyverse solution is preferable. Thanks!

Answer 1

Score: 5

In base R, you could use the aggregate function:

aggregate(x ~ id, df, max)
   id x
1 001 0
2 002 1
3 003 1
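
The aggregate call works here because x is the only non-id column and the per-id maximum is 1 exactly when the id has an x = 1 row. As a minimal sketch of an alternative, assuming the data frame carries additional columns (the `note` column below is hypothetical and not part of the question), one could instead keep the last row per id, relying on the question's guarantee that x = 1 only ever appears in the last row:

```r
# Same id and x values as in the question; `note` is a hypothetical extra column
df <- data.frame(id = c("001", "001", "001", "002", "002", "003", "003"),
                 x = c(0, 0, 0, 0, 1, 0, 1),
                 note = letters[1:7])

# duplicated(..., fromLast = TRUE) flags every row except the last one per id,
# so negating it keeps exactly the last row of each id
last_rows <- df[!duplicated(df$id, fromLast = TRUE), ]
rownames(last_rows) <- NULL  # reset row names for tidier printing
last_rows
```

This keeps whole rows rather than aggregating column by column, so it is only appropriate when the rows are already ordered as the question describes.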

Answer 2

Score: 4

Probably slice_max:

df %>%
    slice_max(x, by = id) %>%
    distinct()

or (as suggested in a comment by @r2evans)

df %>%
    slice_max(x, by = id, with_ties = FALSE)

which gives

   id x
1 001 0
2 002 1
3 003 1
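
The two variants agree on the question's data but can diverge once other columns are present. The following is a rough sketch, assuming dplyr 1.1.0 or later (where slice_max accepts `by`) and a hypothetical `note` column that is not in the original data: distinct() only collapses rows that are identical in every column, whereas with_ties = FALSE always keeps a single row per id.

```r
library(dplyr)

# Data from the question
df <- data.frame(id = c("001", "001", "001", "002", "002", "003", "003"),
                 x = c(0, 0, 0, 0, 1, 0, 1))

res_distinct <- df %>% slice_max(x, by = id) %>% distinct()
res_no_ties  <- df %>% slice_max(x, by = id, with_ties = FALSE)

# Hypothetical extra column, so the tied x = 0 rows of id "001" are no longer identical
df2 <- mutate(df, note = letters[1:7])

n_distinct_rows <- df2 %>% slice_max(x, by = id) %>% distinct() %>% nrow()
n_no_ties_rows  <- df2 %>% slice_max(x, by = id, with_ties = FALSE) %>% nrow()

n_distinct_rows  # 5: all three tied rows of id "001" survive distinct()
n_no_ties_rows   # 3: one row per id
```

If exactly one row per id is required regardless of other columns, the with_ties = FALSE form is the safer choice of the two.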
