Check if the previous value is present in the dataset with a logical operation [R]

Question

I have this dataset

structure(list(N = c("a", "b", "c", "a", "b", "a", "b", "c", 
"a", "c"), S = c("4", "4", "4", "3", "3", "2", "2", "2", "1", 
"1")), class = "data.frame", row.names = c(NA, -10L))

For each group in 'N', I would like to check whether every row has a previous observation of 'S' in the same group, and store the result as a logical value.

library(tidyverse)
df %>% group_by(N) %>% arrange(desc(S)) %>% mutate(L = ifelse(****))

The output should look like this:

N S P
a 4 TRUE
b 4 TRUE
c 4 FALSE
a 3 TRUE
b 3 TRUE
a 2 TRUE
b 2 FALSE
c 2 TRUE
a 1 FALSE
c 1 FALSE
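
One detail in the data above is that S is stored as character strings, so sorting or subtracting it directly would operate on text. The following is a minimal sketch of why converting it first matters; the toy vector `s_chr` is made up for illustration and is not part of the question's data.

```r
# Toy values chosen for illustration (not from the question's data).
s_chr <- c("9", "10", "2")

# A character sort compares text, so "10" comes before "2".
chr_sorted <- sort(s_chr, method = "radix")

# After conversion, sorting works on magnitudes.
num_sorted <- sort(as.numeric(s_chr))

# The question's data, rebuilt with data.frame() for readability.
df <- data.frame(
  N = c("a", "b", "c", "a", "b", "a", "b", "c", "a", "c"),
  S = c("4", "4", "4", "3", "3", "2", "2", "2", "1", "1"),
  stringsAsFactors = FALSE
)

# type.convert() guesses a suitable type per column;
# as.is = TRUE keeps N as character instead of turning it into a factor.
df_num <- type.convert(df, as.is = TRUE)
str(df_num)
```

With single-digit values the two orderings happen to coincide, but arithmetic such as `S - 1` still needs a numeric column.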

Answer 1

Score: 2

How about this: it sorts by S within each group, marks the first row (the smallest value of S) as FALSE, and marks every other row as TRUE if its S equals the lag of S plus one.

library(dplyr)
dat <- structure(list(N = c("a", "b", "a", "b", "c", "a", "c"), 
                      S = c("4", "3", "2", "2", "2", "1", "1")), class = "data.frame", row.names = c(NA, -7L))

dat %>% 
  mutate(S = as.numeric(S)) %>%            # S is stored as character
  group_by(N) %>% 
  arrange(S, .by_group = TRUE) %>%         # sort ascending within each group
  mutate(P = S == (lag(S) + 1),            # TRUE when the previous S is exactly one less
         P = ifelse(is.na(P), FALSE, P))   # the first row of each group has no lag
#> # A tibble: 7 × 3
#> # Groups:   N [3]
#>   N         S P    
#>   <chr> <dbl> <lgl>
#> 1 a         1 FALSE
#> 2 a         2 TRUE 
#> 3 a         4 FALSE
#> 4 b         2 FALSE
#> 5 b         3 TRUE 
#> 6 c         1 FALSE
#> 7 c         2 TRUE

Created on 2023-03-09 with reprex v2.0.2
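
The example above uses a reduced 7-row dataset and returns the rows sorted by group. As a sketch, the same lag-based check could be run on the 10-row data from the question, with a temporary row index to restore the original row order afterwards. The `row` column is a helper name introduced here, not part of the original answer, and `coalesce()` stands in for the `ifelse(is.na(P), ...)` step.

```r
library(dplyr)

# The question's data, rebuilt with data.frame() for readability.
df <- data.frame(
  N = c("a", "b", "c", "a", "b", "a", "b", "c", "a", "c"),
  S = c("4", "4", "4", "3", "3", "2", "2", "2", "1", "1"),
  stringsAsFactors = FALSE
)

res <- df %>%
  mutate(S = as.numeric(S),
         row = row_number()) %>%                    # remember the original order
  group_by(N) %>%
  arrange(S, .by_group = TRUE) %>%                  # ascending S within each group
  mutate(P = coalesce(S == lag(S) + 1, FALSE)) %>%  # NA (no lag) becomes FALSE
  ungroup() %>%
  arrange(row) %>%                                  # back to the original order
  select(-row)

res
```

Under these assumptions, the P column comes out in the same order as the expected output shown in the question.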

Answer 2

Score: 1

We can use lead() to take the difference between each S and the next value in its group, then turn that difference into a logical vector with ==, grouping by N.

library(dplyr) # version >= 1.1.0
df %>%
  type.convert(as.is = TRUE) %>%
  mutate(P = (S - lead(S, default = last(S)) == 1), .by = N)

-output

    N S     P
1  a 4  TRUE
2  b 4  TRUE
3  c 4 FALSE
4  a 3  TRUE
5  b 3  TRUE
6  a 2  TRUE
7  b 2 FALSE
8  c 2  TRUE
9  a 1 FALSE
10 c 1 FALSE
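
The lead()-based version relies on the rows already being in descending order of S within each group, as they are in the question's data. A possible order-independent variant, which is a different technique from the answer above, checks directly whether the value S - 1 occurs anywhere in the same group using %in%. The function name `flag_prev` is a hypothetical helper introduced only for this sketch.

```r
library(dplyr)

# The question's data, rebuilt with data.frame() for readability.
df <- data.frame(
  N = c("a", "b", "c", "a", "b", "a", "b", "c", "a", "c"),
  S = c("4", "4", "4", "3", "3", "2", "2", "2", "1", "1"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: flag rows whose group also contains the value S - 1.
flag_prev <- function(data) {
  data %>%
    mutate(S = as.numeric(S)) %>%
    group_by(N) %>%
    mutate(P = (S - 1) %in% S) %>%  # membership test within the group
    ungroup()
}

res <- flag_prev(df)

# Same check on the rows in reverse order: each row keeps its own flag.
res_rev <- flag_prev(df[nrow(df):1, ])

res
```

Because the test is a membership check within each group, it does not need arrange() and is unaffected by how the rows are ordered.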

huangapple
  • Posted on 2023-03-10 01:13:54
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75687915.html