How to filter participants that have specific values in R?

# Question

This is a follow-up to [this question][1]. How can I filter all participants (in this case, schools) that match ALL the factor levels in my desired subset, *not just one or another*?

* Edit: the subset is up to level 04 (levels 1 to 4), not level 05.
* My data frame looks like this:

```R
quest40_2[1:10,]
# A tibble: 10 x 4
   SCHOOL  Q9    Q11     Q40
   <glue>  <fct> <fct>   <fct>
 1 School1 typeB level0  NA
 2 School1 typeB level01 NA
 3 School1 typeB level02 NA
 4 School1 typeB level03 NA
 5 School1 typeB NA      plan1_level0upto02
 6 School1 typeB NA      plan2_level_03
 7 School1 typeB NA      plan3_level_04
 8 School2 typeB level01 NA
 9 School2 typeB level02 NA
10 School2 typeB level03 NA
```

* Edit for clarification, the desired output:

```R
# A tibble: 12 x 4
   SCHOOL  Q9    Q11     Q40
   <glue>  <fct> <fct>   <fct>
 1 School2 typeB level01 NA
 2 School2 typeB level02 NA
 3 School2 typeB level03 NA
 4 School2 typeB level04 NA
 5 School3 typeB level01 NA
 6 School3 typeB level02 NA
 7 School3 typeB level03 NA
 8 School3 typeB level04 NA
 9 School5 typeC level01 NA
10 School5 typeC level02 NA
11 School5 typeC level03 NA
12 School5 typeC level04 NA
```

* **Question:** I want to subset all `SCHOOL`s that have *level01*, *level02*, *level03* **AND** *level04* among their `Q11` levels (like the `&&` operator, for example). Hence, I cannot keep schools that offer some of them but not all. Any ideas? (Preferably `tidyverse` ones.)
* The data is in the linked post above.

[1]: https://stackoverflow.com/questions/75488236/how-to-get-right-proportions-based-on-a-subset-of-the-data-set-in-r?noredirect=1#comment133191892_75488236
# Answer 1

**Score**: 1
We can use `filter` to keep the 'SCHOOL's that have all the custom levels in 'Q11'. With `dplyr` 1.1.0 or later, we can use `.by` in `filter`:

```R
library(dplyr) # version >= 1.1.0

quest40_2 %>%
  # keep a school only if every custom level occurs in its Q11 values
  filter(all(c("level01", "level02", "level04", "level05") %in% Q11),
         .by = "SCHOOL")
```

Output:

```R
# A tibble: 96 × 4
   SCHOOL  Q9    Q11     Q40
   <glue>  <fct> <fct>   <fct>
 1 School3 typeB level01 <NA>
 2 School3 typeB level02 <NA>
 3 School3 typeB level03 <NA>
 4 School3 typeB level04 <NA>
 5 School3 typeB level05 <NA>
 6 School3 typeB <NA>    plan1_level0upto02
 7 School3 typeB <NA>    plan2_level_03
 8 School3 typeB <NA>    plan3_level_04
 9 School3 typeB <NA>    plan4_level_05
10 School5 typeC level01 <NA>
# … with 86 more rows
```
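To see why this condition keeps or drops whole schools, it may help to evaluate `all(... %in% ...)` on its own. The following is a minimal sketch with made-up vectors (`wanted`, `school_x` and `school_y` are hypothetical names, not part of `quest40_2`):

```R
# Hypothetical Q11 values for two schools (not taken from quest40_2)
wanted   <- c("level01", "level02", "level04", "level05")
school_x <- factor(c("level01", "level02", "level03", "level04", "level05", NA))
school_y <- factor(c("level01", "level02", "level04"))

# %in% returns one TRUE/FALSE per wanted level; all() collapses them to a single value
has_all_x <- all(wanted %in% school_x)  # every wanted level is present
has_all_y <- all(wanted %in% school_y)  # level05 is missing

print(c(school_x = has_all_x, school_y = has_all_y))
```

Inside a grouped `filter`, that single logical value is recycled across all rows of the group, so a school is either kept entirely or dropped entirely.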
If we want to further filter so that only the rows with those levels remain:

```R
quest40_2 %>%
  filter(all(c("level01", "level02", "level04", "level05") %in% Q11),
         Q11 %in% c("level01", "level02", "level04", "level05"),
         .by = "SCHOOL")
```

Output:

```R
# A tibble: 44 × 4
   SCHOOL   Q9    Q11     Q40
   <glue>   <fct> <fct>   <fct>
 1 School3  typeB level01 <NA>
 2 School3  typeB level02 <NA>
 3 School3  typeB level04 <NA>
 4 School3  typeB level05 <NA>
 5 School5  typeC level01 <NA>
 6 School5  typeC level02 <NA>
 7 School5  typeC level04 <NA>
 8 School5  typeC level05 <NA>
 9 School13 typeD level01 <NA>
10 School13 typeD level02 <NA>
# … with 34 more rows
```
For earlier versions, use `group_by`:

```R
quest40_2 %>%
  group_by(SCHOOL) %>%
  filter(all(c("level01", "level02", "level04", "level05") %in% Q11)) %>%
  ungroup()
```

Output:

```R
# A tibble: 96 × 4
   SCHOOL  Q9    Q11     Q40
   <glue>  <fct> <fct>   <fct>
 1 School3 typeB level01 <NA>
 2 School3 typeB level02 <NA>
 3 School3 typeB level03 <NA>
 4 School3 typeB level04 <NA>
 5 School3 typeB level05 <NA>
 6 School3 typeB <NA>    plan1_level0upto02
 7 School3 typeB <NA>    plan2_level_03
 8 School3 typeB <NA>    plan3_level_04
 9 School3 typeB <NA>    plan4_level_05
10 School5 typeC level01 <NA>
# … with 86 more rows
```
If, instead of filtering the 'SCHOOL's, we only need to keep the rows with the custom levels, just do:

```R
quest40_2 %>%
  filter(Q11 %in% c("level01", "level02", "level04", "level05"))
```

Output:

```R
# A tibble: 115 × 4
   SCHOOL  Q9    Q11     Q40
   <glue>  <fct> <fct>   <fct>
 1 School1 typeB level01 <NA>
 2 School1 typeB level02 <NA>
 3 School2 typeB level01 <NA>
 4 School2 typeB level02 <NA>
 5 School2 typeB level04 <NA>
 6 School3 typeB level01 <NA>
 7 School3 typeB level02 <NA>
 8 School3 typeB level04 <NA>
 9 School3 typeB level05 <NA>
10 School4 typeB level02 <NA>
# … with 105 more rows
```
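The code in this answer uses the levels level01, level02, level04 and level05, whereas the question's edit asks for level01 through level04. Adapting it should only require changing the vector of levels. Here is a minimal sketch, assuming a small made-up data frame `toy` (hypothetical, not the asker's `quest40_2`) and an illustrative vector name `wanted`:

```R
library(dplyr)

# Hypothetical toy data, not the asker's quest40_2:
# SchoolA offers all four levels, SchoolB lacks level04,
# SchoolC offers all four plus an extra level05.
toy <- data.frame(
  SCHOOL = c(rep("SchoolA", 4), rep("SchoolB", 3), rep("SchoolC", 5)),
  Q11    = factor(c(
    "level01", "level02", "level03", "level04",
    "level01", "level02", "level03",
    "level01", "level02", "level03", "level04", "level05"
  ))
)

wanted <- c("level01", "level02", "level03", "level04")

result <- toy %>%
  group_by(SCHOOL) %>%
  filter(all(wanted %in% Q11)) %>%  # keep schools offering every wanted level
  ungroup() %>%
  filter(Q11 %in% wanted)           # then keep only the rows with wanted levels

print(result)
```

In this toy data SchoolB is dropped because it lacks level04, and the level05 row of SchoolC is removed by the second `filter`. The sketch uses `group_by` so that it does not depend on `dplyr` 1.1.0; with a newer version the `.by` form shown above should behave the same way.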
# Answer 2

**Score**: 1

We can make a vector of the `Q11` values to keep and use it in the `filter()` function from `dplyr`.

```R
library(dplyr)

# Q11 levels to keep
tokeep <- c("level01", "level02", "level04", "level05")

quest40_2 %>%
  group_by(SCHOOL) %>%
  filter(Q11 %in% tokeep) %>%
  ungroup()
```

The first few rows of the output:

```R
# A tibble: 115 × 4
   SCHOOL  Q9    Q11     Q40
   <glue>  <fct> <fct>   <fct>
 1 School1 typeB level01 NA
 2 School1 typeB level02 NA
 3 School2 typeB level01 NA
 4 School2 typeB level02 NA
 5 School2 typeB level04 NA
 6 School3 typeB level01 NA
 7 School3 typeB level02 NA
 8 School3 typeB level04 NA
 9 School3 typeB level05 NA
10 School4 typeB level02 NA
# … with 105 more rows
```
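Note that `Q11 %in% tokeep` is evaluated row by row, so the `group_by(SCHOOL)` step does not change which rows are kept, and schools that offer only some of the levels remain in the result. The sketch below contrasts the two conditions on a made-up data frame (`toy`, `rowwise_only` and `all_levels` are hypothetical names used only for illustration):

```R
library(dplyr)

# Hypothetical toy data: SchoolA offers both levels, SchoolB only one of them
toy <- data.frame(
  SCHOOL = c("SchoolA", "SchoolA", "SchoolB"),
  Q11    = c("level01", "level02", "level01")
)
tokeep <- c("level01", "level02")

# Row-wise condition: each row is tested on its own, grouping has no effect
rowwise_only <- toy %>%
  group_by(SCHOOL) %>%
  filter(Q11 %in% tokeep) %>%
  ungroup()

# Group-wise condition: a school is kept only if it offers every level in tokeep
all_levels <- toy %>%
  group_by(SCHOOL) %>%
  filter(all(tokeep %in% Q11)) %>%
  ungroup()

print(unique(rowwise_only$SCHOOL))  # SchoolB is still included
print(unique(all_levels$SCHOOL))    # only SchoolA remains
```

If the goal is the one stated in the question (schools that offer all of the listed levels), the group-wise `all(...)` condition is the one that appears to match it.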