Summarising the percentage of the total of a factor variable with purrr::map
Question
I'm trying to apply a function with purrr::map to each variable in a list. For each one, I want to group by some categories and show the percentage of the total for each level of a factor variable.
For instance, I have this list:
mtcars_list <- c("am", "gear", "carb")
The variable I want to group by is "cyl", and there is a second variable I want to summarise. In this case I convert the variable "vs" of the mtcars dataset to a factor:
mtcars$vs <- factor(mtcars$vs, levels = c('0', '1'))
I then run this purrr::map call, which gives me an error when I use count, prop.table or similar inside summarise:
purrr::map(mtcars_list, ~ mtcars %>%
  group_by(cyl, .data[[.x]]) %>%
  summarise(count(vs), .groups = "drop")*100)
When I run this it says:
no applicable method for 'count' applied to an object of class "c('double', 'numeric')"
The result I'm after would look something like this:
First category
       0      1
A  17.7%  83.3%
B   5.0%  95.5%
Second category
      0     1
A   2.0  98.0
B   4.0  96.0
Thank you!!!
Answer 1
Score: 1
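Please check whether this is the expected output:
library(dplyr)
library(tidyr)

df1 <- purrr::map(mtcars_list, ~ mtcars %>%
  select(cyl, vs, !!sym(.x)) %>%
  # n: number of rows per (cyl, .x, vs) combination
  mutate(n = n(), .by = c(cyl, .data[[.x]], vs)) %>%
  # n2: number of rows per (cyl, .x) group, used as the denominator
  mutate(n2 = n(), .by = c(cyl, .data[[.x]])) %>%
  group_by(cyl, .data[[.x]], vs) %>%
  # keep a single row per combination
  slice_tail(n = 1) %>%
  mutate(perc = (n / n2) * 100) %>%
  pivot_wider(id_cols = c(cyl, .data[[.x]]), names_from = vs, values_from = perc)
)
df1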
[[1]]
# A tibble: 6 × 4
# Groups:   cyl, am [6]
    cyl    am   `1`   `0`
  <dbl> <dbl> <dbl> <dbl>
1     4     0 100    NA
2     4     1  87.5  12.5
3     6     0 100    NA
4     6     1  NA   100
5     8     0  NA   100
6     8     1  NA   100

[[2]]
# A tibble: 8 × 4
# Groups:   cyl, gear [8]
    cyl  gear   `1`   `0`
  <dbl> <dbl> <dbl> <dbl>
1     4     3   100    NA
2     4     4   100    NA
3     4     5    50    50
4     6     3   100    NA
5     6     4    50    50
6     6     5    NA   100
7     8     3    NA   100
8     8     5    NA   100

[[3]]
# A tibble: 9 × 4
# Groups:   cyl, carb [9]
    cyl  carb   `1`   `0`
  <dbl> <dbl> <dbl> <dbl>
1     4     1 100    NA
2     4     2  83.3  16.7
3     6     1 100    NA
4     6     4  50    50
5     6     6  NA   100
6     8     2  NA   100
7     8     3  NA   100
8     8     4  NA   100
9     8     8  NA   100
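For comparison, here is a shorter sketch of what I believe is the same calculation. The original attempt fails because count() is a data-frame verb, so it cannot be called on a single column inside summarise(); calling it on the data frame first and then dividing by the group total avoids that. This assumes dplyr 1.1.0 or later (for .by) and R 4.1 or later (for |> and \(v)). The name pct_tables and the choice to fill missing combinations with 0 are my own assumptions, not part of the question.

```r
library(dplyr)
library(tidyr)
library(purrr)

mtcars_list <- c("am", "gear", "carb")

# `pct_tables` is a hypothetical name used only for this sketch.
pct_tables <- map(mtcars_list, \(v) {
  mtcars |>
    # count() is applied to the data frame, not to a column
    count(cyl, .data[[v]], vs) |>
    # share of each vs level within its (cyl, v) group
    mutate(perc = n / sum(n) * 100, .by = c(cyl, all_of(v))) |>
    pivot_wider(
      id_cols = c(cyl, all_of(v)),
      names_from = vs,
      values_from = perc,
      values_fill = 0  # assumption: absent combinations shown as 0, not NA
    )
})
names(pct_tables) <- mtcars_list
pct_tables$am
```

Dropping values_fill should reproduce the NA cells shown in the output above.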
Answer 2
Score: 1
Your dplyr code doesn't work. For the output format you want, the base R functions table and prop.table get you there faster:
purrr::map(
  mtcars_list, \(x)
  (table(mtcars$cyl, mtcars[[x]]) |> prop.table(margin = 1)) * 100
)
# [[1]]
#            0        1
#   4 27.27273 72.72727
#   6 57.14286 42.85714
#   8 85.71429 14.28571
#
# [[2]]
#             3         4         5
#   4  9.090909 72.727273 18.181818
#   6 28.571429 57.142857 14.285714
#   8 85.714286  0.000000 14.285714
#
# [[3]]
#             1         2         3         4         6         8
#   4 45.454545 54.545455  0.000000  0.000000  0.000000  0.000000
#   6 28.571429  0.000000  0.000000 57.142857 14.285714  0.000000
#   8  0.000000 28.571429 21.428571 42.857143  0.000000  7.142857
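Note that the tables above show how each listed variable is distributed within cyl and do not involve vs. If the goal is instead the share of each vs level within each category, as the layout sketched in the question suggests, the same base R idea might be adapted as below. This is only a sketch under that reading of the question; the name vs_pct and the rounding to one decimal are my own choices.

```r
mtcars_list <- c("am", "gear", "carb")

# `vs_pct` is a hypothetical name used only for this sketch.
vs_pct <- lapply(mtcars_list, function(v) {
  # rows: levels of the listed variable; columns: levels of vs
  tab <- table(mtcars[[v]], mtcars$vs, dnn = c(v, "vs"))
  # margin = 1 turns counts into row proportions, so each row sums to 100
  round(prop.table(tab, margin = 1) * 100, 1)
})
names(vs_pct) <- mtcars_list
vs_pct$am
```

Passing cyl as a third argument to table() would give one such table per cylinder count, if the breakdown by cyl is also needed.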