Check if the previous value is present in the dataset with a logical operation [R]

Question

I have this dataset:

    df <- structure(list(N = c("a", "b", "c", "a", "b", "a", "b", "c",
        "a", "c"), S = c("4", "4", "4", "3", "3", "2", "2", "2", "1",
        "1")), class = "data.frame", row.names = c(NA, -10L))

For each group in 'N', I would like to check whether the previous value of 'S' (the current value minus one) is also present in that group, and store the result as a logical column:

    library(tidyverse)
    df %>% group_by(N) %>% arrange(desc(S)) %>% mutate(L = ifelse(****))

The output should look like this:

N S P
a 4 TRUE
b 4 TRUE
c 4 FALSE
a 3 TRUE
b 3 TRUE
a 2 TRUE
b 2 FALSE
c 2 TRUE
a 1 FALSE
c 1 FALSE

Answer 1

Score: 2

How about this: it sorts by S within each group, marks the first row of each group (the smallest value of S) as FALSE, and marks every other row as TRUE if its S equals the lag of S plus one.

    library(dplyr)
    dat <- structure(list(N = c("a", "b", "a", "b", "c", "a", "c"),
      S = c("4", "3", "2", "2", "2", "1", "1")), class = "data.frame", row.names = c(NA, -7L))
    dat %>%
      mutate(S = as.numeric(S)) %>%          # S is stored as character
      group_by(N) %>%
      arrange(S, .by_group = TRUE) %>%       # ascending S within each group
      mutate(P = S == (lag(S) + 1),          # NA on the first row of each group
             P = ifelse(is.na(P), FALSE, P)) # turn that NA into FALSE
    #> # A tibble: 7 × 3
    #> # Groups:   N [3]
    #>   N         S P
    #>   <chr> <dbl> <lgl>
    #> 1 a         1 FALSE
    #> 2 a         2 TRUE
    #> 3 a         4 FALSE
    #> 4 b         2 FALSE
    #> 5 b         3 TRUE
    #> 6 c         1 FALSE
    #> 7 c         2 TRUE

Created on 2023-03-09 with reprex v2.0.2
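To make the lag step above concrete, here is a minimal sketch on a single group's values. The vector `s` and the names `prev`, `raw`, and `p` are made up for illustration, and `coalesce()` is swapped in for the `ifelse(is.na(P), FALSE, P)` step as an equivalent way to replace the leading NA; it is not part of the original answer.

```r
library(dplyr)

# Hypothetical values of S for one group, already sorted ascending
s <- c(1, 2, 4)

# dplyr::lag() shifts the vector by one position and pads with NA
prev <- dplyr::lag(s)        # NA 1 2

# Comparing against NA yields NA, so the first element is NA, not FALSE
raw <- s == prev + 1         # NA TRUE FALSE

# coalesce() replaces the NA with FALSE (assumed alternative to ifelse())
p <- coalesce(raw, FALSE)    # FALSE TRUE FALSE
print(p)
```

The value 4 comes out FALSE because its predecessor in the sorted group is 2, not 3.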

Answer 2

Score: 1

We can use lead() to take the difference between each value of S and the next one within each group of N, then compare that difference to 1 (==) to get a logical vector.

    library(dplyr) # version >= 1.1.0
    df %>%
      type.convert(as.is = TRUE) %>%
      mutate(P = (S - lead(S, default = last(S)) == 1), .by = N)

Output:

       N S     P
    1  a 4  TRUE
    2  b 4  TRUE
    3  c 4 FALSE
    4  a 3  TRUE
    5  b 3  TRUE
    6  a 2  TRUE
    7  b 2 FALSE
    8  c 2  TRUE
    9  a 1 FALSE
    10 c 1 FALSE
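Both answers compare neighbouring rows, so they depend on the rows being sorted by S within each group. If that ordering cannot be guaranteed, one possible variant is a membership test with `%in%`. This is only a sketch, not taken from either answer, and it assumes "previous value" means `S - 1` occurring anywhere in the same group; the name `res` is made up for illustration.

```r
library(dplyr)  # .by requires dplyr >= 1.1.0

# The dataset from the question
df <- structure(list(N = c("a", "b", "c", "a", "b", "a", "b", "c", "a", "c"),
                     S = c("4", "4", "4", "3", "3", "2", "2", "2", "1", "1")),
                class = "data.frame", row.names = c(NA, -10L))

res <- df %>%
  mutate(S = as.numeric(S)) %>%          # S is stored as character
  mutate(P = (S - 1) %in% S, .by = N)    # is S - 1 present in the same group?

print(res)
```

Because `%in%` looks at the whole group rather than the adjacent row, the result does not change if the rows are shuffled, and `.by` keeps the original row order. On the question's data it reproduces the expected P column.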

huangapple
  • Posted on March 10, 2023, 01:13:54
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75687915.html