Check if the previous value is present in the dataset with a logical operation [R]
Question
I have this dataset:
df <- structure(list(N = c("a", "b", "c", "a", "b", "a", "b", "c",
"a", "c"), S = c("4", "4", "4", "3", "3", "2", "2", "2", "1",
"1")), class = "data.frame", row.names = c(NA, -10L))
For each group in 'N', I would like to check whether the previous value of 'S' (that is, S - 1) is present in that group, and store the result as a logical column.
library(tidyverse)
df %>% group_by(N) %>% arrange(desc(S)) %>% mutate(L = ifelse(****))
The output should look like this:
N | S | P |
---|---|---|
a | 4 | TRUE |
b | 4 | TRUE |
c | 4 | FALSE |
a | 3 | TRUE |
b | 3 | TRUE |
a | 2 | TRUE |
b | 2 | FALSE |
c | 2 | TRUE |
a | 1 | FALSE |
c | 1 | FALSE |
Answer 1
Score: 2
How about this: it sorts by `S` within each group, then marks the first row (the smallest value of `S`) as `FALSE` and the others as `TRUE` if they equal the lag of `S` plus one.
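library(dplyr)

dat <- structure(list(N = c("a", "b", "a", "b", "c", "a", "c"),
                      S = c("4", "3", "2", "2", "2", "1", "1")),
                 class = "data.frame", row.names = c(NA, -7L))

dat %>%
  mutate(S = as.numeric(S)) %>%           # S is stored as character
  group_by(N) %>%
  arrange(S, .by_group = TRUE) %>%        # sort ascending within each group
  mutate(P = S == (lag(S) + 1),           # TRUE if S is one more than the previous row's S
         P = ifelse(is.na(P), FALSE, P))  # the first row of each group has no lag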
#> # A tibble: 7 × 3
#> # Groups: N [3]
#> N S P
#> <chr> <dbl> <lgl>
#> 1 a 1 FALSE
#> 2 a 2 TRUE
#> 3 a 4 FALSE
#> 4 b 2 FALSE
#> 5 b 3 TRUE
#> 6 c 1 FALSE
#> 7 c 2 TRUE
<sup>Created on 2023-03-09 with reprex v2.0.2</sup>
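The lag-based check compares each value only with the next smaller value in its group, so it depends on the sort order. As a possible alternative, here is a minimal sketch that tests membership directly with `%in%`, using the 10-row data from the question. The names `df` and `res` are chosen here for illustration, and the sketch assumes the values of `S` are whole numbers stored as strings.

```r
library(dplyr)

# Data from the question; S is stored as character
df <- structure(list(N = c("a", "b", "c", "a", "b", "a", "b", "c", "a", "c"),
                     S = c("4", "4", "4", "3", "3", "2", "2", "2", "1", "1")),
                class = "data.frame", row.names = c(NA, -10L))

res <- df %>%
  mutate(S = as.numeric(S)) %>%
  group_by(N) %>%
  mutate(P = (S - 1) %in% S) %>%  # is S - 1 among this group's S values?
  ungroup()

res$P
# TRUE TRUE FALSE TRUE TRUE TRUE FALSE TRUE FALSE FALSE
```

Because `%in%` looks at the whole group, the original row order is preserved and no sorting step is needed.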
Answer 2
Score: 1
We can use `lead()` to take the difference between adjacent elements within each group of `N`, then compare that difference with 1 (`==`) to get a logical vector.
library(dplyr)  # version >= 1.1.0 (needed for the .by argument)
df %>%
  type.convert(as.is = TRUE) %>%
  mutate(P = (S - lead(S, default = last(S)) == 1), .by = N)
Output:
N S P
1 a 4 TRUE
2 b 4 TRUE
3 c 4 FALSE
4 a 3 TRUE
5 b 3 TRUE
6 a 2 TRUE
7 b 2 FALSE
8 c 2 TRUE
9 a 1 FALSE
10 c 1 FALSE
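The `lead()` version relies on the rows already being sorted by descending `S` within each group, as they are in the question's data. For cases where that ordering is not guaranteed, or dplyr is not available, a base R sketch of a per-group check could look like the following. The names `df`, `S_num` and `P` are illustrative, and the sketch assumes `S` holds whole numbers stored as strings.

```r
# Data from the question; S is stored as character
df <- structure(list(N = c("a", "b", "c", "a", "b", "a", "b", "c", "a", "c"),
                     S = c("4", "4", "4", "3", "3", "2", "2", "2", "1", "1")),
                class = "data.frame", row.names = c(NA, -10L))

S_num <- as.numeric(df$S)

# ave() applies FUN within each group of N and returns the results in the
# original row order. Because S_num is numeric, the logical result is stored
# as 0/1, so it is converted back with as.logical().
P <- as.logical(ave(S_num, df$N, FUN = function(s) (s - 1) %in% s))

df$P <- P
df
```

This keeps the original row order and gives the same `P` column as the expected output in the question.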