Retain Value if it's the Only One in Group, Otherwise Filter Out

Question

Let's say I have a dataframe that looks like:

dat <- data.frame(account = c(1, 2, 3, 4, 5, 6, 7, 8, 9),
                  group = c("a", "a", "b", "b", "c", "c", "c", "d", "d"),
                  status = c("active", "inactive", "inactive", "inactive", "active", "open", "inactive", "active", "active"))

I want to filter so that each group (a, b, c, or d) still has at least one account. Only accounts whose status is active or open should be kept, unless every account in a group is inactive, in which case one inactive account should be kept.

How would I go about filtering that?

I think I could do it in multiple calls, but is there a way to do it in one line?

library(dplyr)

# Step 1: keep every account that is active or open
active_open <- dat %>%
    group_by(group) %>%
    filter(status == "active" | status == "open") %>%
    ungroup()

# Step 2: for groups whose only status is inactive, keep the first account
inactive <- dat %>%
    group_by(group) %>%
    filter(n_distinct(status) == 1 & status == "inactive") %>%
    slice_head(n = 1) %>%
    ungroup()

# Step 3: combine the two pieces
dat_clean <- bind_rows(active_open, inactive)

This works, but I'm wondering if there's a cleaner way.

Answer 1

Score: 1

You can try filter() with an if/else condition evaluated per group, as below (the .by argument requires dplyr 1.1.0 or later):

dat %>%
    filter(
        if (all(status == "inactive")) {
            # group is entirely inactive: keep only its first row
            !duplicated(status)
        } else {
            # otherwise drop the inactive rows
            status != "inactive"
        }, .by = group)

which gives

  account group   status
1       1     a   active
2       3     b inactive
3       5     c   active
4       6     c     open
5       8     d   active
6       9     d   active
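
The same rule can also be written as a single logical condition instead of an if/else. The following is only a sketch and not part of the original answer: it assumes dplyr 1.1.0 or later (for .by), rebuilds dat from the question so that it runs on its own, and uses row_number() == 1 to pick the first row of a group that is entirely inactive.

```r
library(dplyr)

# Data frame from the question
dat <- data.frame(
  account = c(1, 2, 3, 4, 5, 6, 7, 8, 9),
  group = c("a", "a", "b", "b", "c", "c", "c", "d", "d"),
  status = c("active", "inactive", "inactive", "inactive", "active",
             "open", "inactive", "active", "active")
)

# Keep every row that is not inactive; in a group where all rows are
# inactive, keep only the first row so the group is still represented.
dat_clean <- dat %>%
  filter(
    status != "inactive" | (all(status == "inactive") & row_number() == 1),
    .by = group
  )

dat_clean$account
# accounts kept: 1, 3, 5, 6, 8, 9
```

With an older dplyr that lacks .by, calling group_by(group) before filter() and ungroup() afterwards should give the same rows.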

Posted by huangapple on 2023-08-05 02:57:22
Link: https://go.coder-hub.com/76838539.html