Summarising the percentage of the total of a factor variable with purrr::map

Question

I'm trying to map a function over each variable in a list. For each one, I want to group_by some categories and show the percentage of the total for each level of a factor variable.

For instance, I have this list:

    mtcars_list <- c("am", "gear", "carb")

I have the variable I want to group by, "cyl", and the variable I want to summarise. In this case I will convert the variable "vs" of the mtcars dataset to a factor:

    mtcars$vs <- factor(mtcars$vs, levels = c('0', '1'))

I then run this purrr::map call, which gives me an error when I use count, prop.table or similar inside summarise:

    purrr::map(mtcars_list, ~ mtcars %>%
      group_by(cyl, .data[[.x]]) %>%
      summarise(count(vs), .groups = "drop") * 100)

When I run this it says:

no applicable method for 'count' applied to an object of class "c('double', 'numeric')"

The result would be something like this:

First category

           0      1
    A  17.7%  83.3%
    B   5.0%  95.5%

Second category

           0      1
    A    2.0   98.0
    B    4.0   96.0

Thank you!!!

Answer 1

Score: 1

Please check if this is the expected output:

    library(dplyr)
    library(tidyr)

    df1 <- purrr::map(mtcars_list, ~ mtcars %>%
      select(cyl, vs, !!sym(.x)) %>%
      # n: rows per (cyl, .x, vs); n2: rows per (cyl, .x)
      mutate(n = n(), .by = c(cyl, .data[[.x]], vs)) %>%
      mutate(n2 = n(), .by = c(cyl, .data[[.x]])) %>%
      group_by(cyl, .data[[.x]], vs) %>%
      slice_tail(n = 1) %>%  # keep one row per combination
      mutate(perc = (n / n2) * 100) %>%
      pivot_wider(id_cols = c(cyl, .data[[.x]]), names_from = vs, values_from = perc)
    )
    df1

    [[1]]
    # A tibble: 6 × 4
    # Groups:   cyl, am [6]
        cyl    am   `1`   `0`
      <dbl> <dbl> <dbl> <dbl>
    1     4     0 100    NA
    2     4     1  87.5  12.5
    3     6     0 100    NA
    4     6     1  NA   100
    5     8     0  NA   100
    6     8     1  NA   100

    [[2]]
    # A tibble: 8 × 4
    # Groups:   cyl, gear [8]
        cyl  gear   `1`   `0`
      <dbl> <dbl> <dbl> <dbl>
    1     4     3   100    NA
    2     4     4   100    NA
    3     4     5    50    50
    4     6     3   100    NA
    5     6     4    50    50
    6     6     5    NA   100
    7     8     3    NA   100
    8     8     5    NA   100

    [[3]]
    # A tibble: 9 × 4
    # Groups:   cyl, carb [9]
        cyl  carb   `1`   `0`
      <dbl> <dbl> <dbl> <dbl>
    1     4     1 100    NA
    2     4     2  83.3  16.7
    3     6     1 100    NA
    4     6     4  50    50
    5     6     6  NA   100
    6     8     2  NA   100
    7     8     3  NA   100
    8     8     4  NA   100
    9     8     8  NA   100
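
For comparison, here is a shorter sketch of the same idea, not the answerer's code. It assumes dplyr, tidyr and purrr are installed and that R is 4.1 or later (for `\(x)` and `|>`). The point it illustrates is that `count()` is meant to be called on a data frame rather than on a column inside `summarise()`, which is most likely why the call in the question fails. The names `cars` and `pct_by_group` are made up for this example, and `values_fill = 0` is a choice to show 0 instead of NA for combinations that do not occur.

```r
library(dplyr)
library(tidyr)
library(purrr)

mtcars_list <- c("am", "gear", "carb")

# Hypothetical working copy, so the built-in dataset is left untouched.
cars <- mtcars
cars$vs <- factor(cars$vs, levels = c("0", "1"))

# pct_by_group is an illustrative name: one wide table per variable in mtcars_list.
pct_by_group <- map(mtcars_list, \(x) {
  cars |>
    # count() works on the data frame: one row per (cyl, x, vs) with its size n
    count(cyl, .data[[x]], vs) |>
    # percentage of each vs level within its (cyl, x) group
    group_by(cyl, .data[[x]]) |>
    mutate(perc = n / sum(n) * 100) |>
    ungroup() |>
    # one column per vs level; absent combinations become 0 (an assumption)
    pivot_wider(
      id_cols = c(cyl, all_of(x)),
      names_from = vs,
      values_from = perc,
      values_fill = 0
    )
})
names(pct_by_group) <- mtcars_list
```

`pct_by_group$am` should then hold the same percentages as the first table above, with 0 in place of NA.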

Answer 2

Score: 1

Your dplyr code doesn't work. For the output format you want, the base R functions table and prop.table get you there faster:

    purrr::map(
      mtcars_list, \(x)
      (table(mtcars$cyl, mtcars[[x]]) |> prop.table(margin = 1)) * 100
    )
    # [[1]]
    #            0        1
    #   4 27.27273 72.72727
    #   6 57.14286 42.85714
    #   8 85.71429 14.28571
    #
    # [[2]]
    #             3         4         5
    #   4  9.090909 72.727273 18.181818
    #   6 28.571429 57.142857 14.285714
    #   8 85.714286  0.000000 14.285714
    #
    # [[3]]
    #             1         2         3         4         6         8
    #   4 45.454545 54.545455  0.000000  0.000000  0.000000  0.000000
    #   6 28.571429  0.000000  0.000000 57.142857 14.285714  0.000000
    #   8  0.000000 28.571429 21.428571 42.857143  0.000000  7.142857
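
Note that the call above tabulates cyl against each listed variable and does not involve vs. If the goal is the share of each vs level within each category, as the expected output in the question suggests, a variation along these lines might be closer. This is only a sketch under that reading of the question; `vs_pct` is a name invented here, and rounding to one decimal is an arbitrary choice.

```r
mtcars_list <- c("am", "gear", "carb")

# vs_pct is an illustrative name: one row-percentage table per variable.
vs_pct <- lapply(mtcars_list, \(x) {
  # rows: levels of the listed variable, columns: levels of vs
  tab <- table(mtcars[[x]], mtcars$vs, dnn = c(x, "vs"))
  # margin = 1 makes each row sum to 100 (before rounding)
  round(prop.table(tab, margin = 1) * 100, 1)
})
names(vs_pct) <- mtcars_list
```

With the built-in mtcars data, `vs_pct$am` gives 63.2 and 36.8 for am = 0, and 46.2 and 53.8 for am = 1.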

huangapple
  • Published on July 13, 2023, 23:26:12
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76681063.html