How to filter participants that have specific values in R?

Question

This is a follow-up to [this question][1]. How can I filter all participants (in this case, schools) that match ALL of the factor levels in my desired subset, *not one or another*?

* Edit: the subset goes up to level04 (levels 01 to 04), not level05.
* My data frame looks like this:

```R
quest40_2[1:10,]
# A tibble: 10 x 4
   SCHOOL  Q9    Q11     Q40
   <glue>  <fct> <fct>   <fct>
 1 School1 typeB level0  NA
 2 School1 typeB level01 NA
 3 School1 typeB level02 NA
 4 School1 typeB level03 NA
 5 School1 typeB NA      plan1_level0upto02
 6 School1 typeB NA      plan2_level_03
 7 School1 typeB NA      plan3_level_04
 8 School2 typeB level01 NA
 9 School2 typeB level02 NA
10 School2 typeB level03 NA
```

* Edit for clarification, the desired output:

```
# A tibble: 12 x 4
   SCHOOL  Q9    Q11     Q40
   <glue>  <fct> <fct>   <fct>
 1 School2 typeB level01 NA
 2 School2 typeB level02 NA
 3 School2 typeB level03 NA
 4 School2 typeB level04 NA
 5 School3 typeB level01 NA
 6 School3 typeB level02 NA
 7 School3 typeB level03 NA
 8 School3 typeB level04 NA
 9 School5 typeC level01 NA
10 School5 typeC level02 NA
11 School5 typeC level03 NA
12 School5 typeC level04 NA
```

* **Question:** I want to subset all `SCHOOL`s that have *level01*, *level02*, *level03* **AND** *level04* among their `Q11` levels (in the spirit of the `&&` operator). Hence, I cannot keep schools that offer some of these levels but not all of them. Any ideas? (Preferably `tidyverse` ones.)
* The data is in the linked post above.

[1]: https://stackoverflow.com/questions/75488236/how-to-get-right-proportions-based-on-a-subset-of-the-data-set-in-r?noredirect=1#comment133191892_75488236

Answer 1

Score: 1

We can filter for the `SCHOOL`s that have all of the custom levels in `Q11`. With dplyr 1.1.0 or later, `filter()` accepts a `.by` argument:

```R
library(dplyr) # version >= 1.1.0

# Keep every row of each SCHOOL whose Q11 values include all four levels
quest40_2 %>%
  filter(all(c("level01", "level02", "level04", "level05") %in% Q11),
         .by = "SCHOOL")
```

Output:

```
# A tibble: 96 × 4
   SCHOOL  Q9    Q11     Q40
   <glue>  <fct> <fct>   <fct>
 1 School3 typeB level01 <NA>
 2 School3 typeB level02 <NA>
 3 School3 typeB level03 <NA>
 4 School3 typeB level04 <NA>
 5 School3 typeB level05 <NA>
 6 School3 typeB <NA>    plan1_level0upto02
 7 School3 typeB <NA>    plan2_level_03
 8 School3 typeB <NA>    plan3_level_04
 9 School3 typeB <NA>    plan4_level_05
10 School5 typeC level01 <NA>
# … with 86 more rows
```
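To make the grouped test easier to see, here is a minimal sketch on made-up data. The tibble `toy`, the school names, and the vector `wanted` are hypothetical and not taken from `quest40_2`; the levels follow the asker's edit (level01 to level04). It assumes dplyr 1.1.0 or later for `.by`.

```r
library(dplyr) # .by needs dplyr >= 1.1.0

# Hypothetical toy data: only SchoolB offers all four target levels
toy <- tibble(
  SCHOOL = c(rep("SchoolA", 3), rep("SchoolB", 5)),
  Q11    = c("level01", "level02", "level03",
             "level01", "level02", "level03", "level04", NA)
)

wanted <- c("level01", "level02", "level03", "level04")

# One logical per school: does its Q11 column contain every wanted level?
per_school <- toy %>%
  summarise(has_all = all(wanted %in% Q11), .by = SCHOOL)
per_school$has_all
# FALSE TRUE

# Inside filter(), that single logical is recycled to every row of the
# school, so each school is kept or dropped as a whole
complete_schools <- toy %>%
  filter(all(wanted %in% Q11), .by = SCHOOL)
unique(complete_schools$SCHOOL)
# "SchoolB"
```

Because the condition is a single value per group, rows with `NA` in `Q11` are kept too as long as their school passes the test.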

If we want to further restrict the result to only the rows with those levels:

```R
# Keep only the complete schools and, within them, only the listed levels
quest40_2 %>%
  filter(all(c("level01", "level02", "level04", "level05") %in% Q11),
         Q11 %in% c("level01", "level02", "level04", "level05"),
         .by = "SCHOOL")
```

Output:

```
# A tibble: 44 × 4
   SCHOOL   Q9    Q11     Q40
   <glue>   <fct> <fct>   <fct>
 1 School3  typeB level01 <NA>
 2 School3  typeB level02 <NA>
 3 School3  typeB level04 <NA>
 4 School3  typeB level05 <NA>
 5 School5  typeC level01 <NA>
 6 School5  typeC level02 <NA>
 7 School5  typeC level04 <NA>
 8 School5  typeC level05 <NA>
 9 School13 typeD level01 <NA>
10 School13 typeD level02 <NA>
# … with 34 more rows
```
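The same two-condition pattern can be adapted to the levels named in the question's edit (level01 to level04). The sketch below is only an illustration on made-up data: `toy` and `wanted` are hypothetical names, and `Q11` is stored as a factor to mirror the `<fct>` column in the original tibble.

```r
library(dplyr) # .by needs dplyr >= 1.1.0

# Hypothetical toy data; Q11 is a factor, as in the original tibble
toy <- tibble(
  SCHOOL = c(rep("SchoolA", 3), rep("SchoolB", 6)),
  Q11    = factor(c("level01", "level02", "level03",
                    "level01", "level02", "level03",
                    "level04", "level05", NA))
)

wanted <- c("level01", "level02", "level03", "level04")

result <- toy %>%
  filter(all(wanted %in% Q11), # school-level test: every wanted level present
         Q11 %in% wanted,      # row-level test: drop level05 and NA rows
         .by = SCHOOL)

as.character(result$Q11)
# "level01" "level02" "level03" "level04"
```

Storing the levels once in a vector avoids repeating the same `c(...)` call in both conditions, so changing the subset later needs a single edit.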

For earlier versions of dplyr, use `group_by()`:

```R
quest40_2 %>%
  group_by(SCHOOL) %>%
  filter(all(c("level01", "level02", "level04", "level05") %in% Q11)) %>%
  ungroup()
```

Output:

```
# A tibble: 96 × 4
   SCHOOL  Q9    Q11     Q40
   <glue>  <fct> <fct>   <fct>
 1 School3 typeB level01 <NA>
 2 School3 typeB level02 <NA>
 3 School3 typeB level03 <NA>
 4 School3 typeB level04 <NA>
 5 School3 typeB level05 <NA>
 6 School3 typeB <NA>    plan1_level0upto02
 7 School3 typeB <NA>    plan2_level_03
 8 School3 typeB <NA>    plan3_level_04
 9 School3 typeB <NA>    plan4_level_05
10 School5 typeC level01 <NA>
# … with 86 more rows
```

If, instead of filtering the `SCHOOL`s, we only need to keep the rows with the custom levels, just do:

```R
quest40_2 %>%
  filter(Q11 %in% c("level01", "level02", "level04", "level05"))
```

Output:

```
# A tibble: 115 × 4
   SCHOOL  Q9    Q11     Q40
   <glue>  <fct> <fct>   <fct>
 1 School1 typeB level01 <NA>
 2 School1 typeB level02 <NA>
 3 School2 typeB level01 <NA>
 4 School2 typeB level02 <NA>
 5 School2 typeB level04 <NA>
 6 School3 typeB level01 <NA>
 7 School3 typeB level02 <NA>
 8 School3 typeB level04 <NA>
 9 School3 typeB level05 <NA>
10 School4 typeB level02 <NA>
# … with 105 more rows
```

Answer 2

Score: 1


We can make a vector of the `Q11` values to keep and use it in the `filter()` function from dplyr:

```R
library(dplyr)
tokeep <- c("level01", "level02", "level04", "level05")
quest40_2 %>%
  group_by(SCHOOL) %>%
  filter(Q11 %in% tokeep) %>%
  ungroup()
```

The first few rows of the output:

```
# A tibble: 115 × 4
   SCHOOL  Q9    Q11     Q40
   <glue>  <fct> <fct>   <fct>
 1 School1 typeB level01 NA
 2 School1 typeB level02 NA
 3 School2 typeB level01 NA
 4 School2 typeB level02 NA
 5 School2 typeB level04 NA
 6 School3 typeB level01 NA
 7 School3 typeB level02 NA
 8 School3 typeB level04 NA
 9 School3 typeB level05 NA
10 School4 typeB level02 NA
# … with 105 more rows
```
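One point worth checking with this approach is that `Q11 %in% tokeep` is a row-wise test, so grouping by `SCHOOL` beforehand should not change which rows pass. The sketch below illustrates this on made-up data (`toy` and its school names are hypothetical, not taken from `quest40_2`): a school that offers only some of the levels still appears in the result.

```r
library(dplyr)

# Hypothetical toy data: SchoolA offers only one of the four levels in tokeep
toy <- tibble(
  SCHOOL = c(rep("SchoolA", 2), rep("SchoolB", 4)),
  Q11    = c("level01", "level03",
             "level01", "level02", "level04", "level05")
)

tokeep <- c("level01", "level02", "level04", "level05")

# Row-wise filter without grouping
rows_only <- toy %>%
  filter(Q11 %in% tokeep)

# Same filter after group_by(): each row is still tested on its own
grouped <- toy %>%
  group_by(SCHOOL) %>%
  filter(Q11 %in% tokeep) %>%
  ungroup()

identical(rows_only$SCHOOL, grouped$SCHOOL)
# TRUE

"SchoolA" %in% grouped$SCHOOL
# TRUE
```

If the goal is to drop schools that lack any of the levels, the condition needs a per-group summary such as `all(tokeep %in% Q11)`, as in the first answer.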

huangapple
  • Posted on 2023-02-18 07:05:37
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75489985.html