Add a total row in dplyr code, but only under a specific column

Question

I have this data set

structure(list(col2 = c(1, 1, 2, 3, 1, 2, 2, 3, 1, 2), col1 = c("R", 
"R", "R", "R", "R", "L", "R", "R", "R", "R")), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -10L), groups = structure(list(
    col1 = c("L", "R", "R", "R"), col2 = c(2, 1, 2, 3), .rows = structure(list(
        6L, c(1L, 2L, 5L, 9L), c(3L, 7L, 10L), c(4L, 8L)), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -4L), .drop = TRUE))

What I would like to do is add a single final row with the column total under the 'n' column only (the new column created with dplyr). I have tried the code below, but I get a total for every column.

library(dplyr)
library(janitor)

data %>%
  group_by(col1, col2) %>%
  mutate(col1 = recode(col1, 'R (change file name)' = 'R',
                       'L (change name e data EEG file)' = 'L')) %>%
  summarise(n = n()) %>%
  adorn_totals("row")

I would like to learn how to fix this, or another strategy that achieves the same result.

Thanks

Answer 1

Score: 3

You can use `bind_rows` to append a row built with `summarise` that holds the sum of `n` and the label "Total" in `col1`. Your first `summarise` has been replaced with `reframe` so that the result is an ungrouped data frame:

library(dplyr)
data %>%
  group_by(col1, col2) %>%
  mutate(col1 = recode(col1, 'R (change file name)' = 'R',
                       'L (change name e data EEG file)' = 'L')) %>%
  reframe(n = n()) %>%
  bind_rows(summarise(., across(n, sum), across(col1, ~ "Total")))
#> # A tibble: 5 × 3
#>   col1   col2     n
#>   <chr> <dbl> <int>
#> 1 L         2     1
#> 2 R         1     4
#> 3 R         2     3
#> 4 R         3     2
#> 5 Total    NA    10
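
For readers who want to try the idea without the grouped `dput` output, here is a minimal self-contained sketch. It is a variation, not the answer's exact code: it assumes the question's data rebuilt as a plain tibble, skips the `recode` step (the sample values are already "R" and "L"), uses `count` in place of `group_by` plus `reframe`, and builds the total row with `tibble`. The names `counts` and `with_total` are made up for illustration.

```r
library(dplyr)

# Data from the question, rebuilt as a plain (ungrouped) tibble
data <- tibble(
  col2 = c(1, 1, 2, 3, 1, 2, 2, 3, 1, 2),
  col1 = c("R", "R", "R", "R", "R", "L", "R", "R", "R", "R")
)

# count() tallies rows per col1/col2 combination into a column named n
# and returns an ungrouped result when the input is ungrouped
counts <- data %>% count(col1, col2)

# One-row tibble holding only the label and the sum of n;
# bind_rows() matches columns by name and fills the missing col2 with NA
with_total <- counts %>%
  bind_rows(tibble(col1 = "Total", n = sum(counts$n)))

print(with_total)
```

Because the extra row only defines `col1` and `n`, every other column is left as `NA` in the total row, which is what restricts the total to the `n` column.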

Old answer with different dataset from OP:

You could use the adorn_totals function from the janitor package like this:

library(dplyr)
library(janitor)

data %>%
  group_by(col1, col2) %>%
  mutate(col1 = recode(col2, 'R (change file name)' = 'R',
                       'L (change name e data EEG file)' = 'L')) %>%
  summarise(n = n()) %>%
  adorn_totals("row")
#> `summarise()` has grouped output by 'col1'. You can override using the
#> `.groups` argument.
#>   col1                            col2  n
#>      L L (change name e data EEG file)  1
#>      R                               R  8
#>      R            R (change file name)  1
#>  Total                               - 10

Created on 2023-03-09 with reprex v2.0.2

Answer 2

Score: 2

Here is how to do it with `adorn_totals`:

`adorn_totals` has a `...` argument for selecting the columns to total. Using `...` requires supplying the arguments that come before it (`fill`, `na.rm`, and `name`), even if they are left empty, hence the `,,,,` below, which accepts the default values for those arguments. See the original answer by @Sam Firke: <https://stackoverflow.com/questions/69745242/calculating-and-appending-column-totals-of-select-columns-in-a-data-frame-in-r>

library(dplyr)
library(janitor)

df %>%
  group_by(col1, col2) %>%
  mutate(col1 = recode(col1, 'R (change file name)' = 'R',
                       'L (change name e data EEG file)' = 'L')) %>%
  summarise(n = n()) %>%
  adorn_totals("row",,,,n)

  col1 col2  n
     L    2  1
     R    1  4
     R    2  3
     R    3  2
 Total    - 10
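
The run of empty commas can be hard to read, so the sketch below spells the same call out with named arguments. It assumes the signature `adorn_totals(dat, where, fill, na.rm, name, ...)` with defaults `"-"`, `TRUE`, and `"Total"`; check `?adorn_totals` for your installed janitor version. It also assumes the question's data rebuilt as a plain tibble, skips the `recode` step, and uses `count` in place of `group_by` plus `summarise`. The name `totals` is made up for illustration.

```r
library(dplyr)
library(janitor)

# Data from the question, rebuilt as a plain (ungrouped) tibble
df <- tibble(
  col2 = c(1, 1, 2, 3, 1, 2, 2, 3, 1, 2),
  col1 = c("R", "R", "R", "R", "R", "L", "R", "R", "R", "R")
)

totals <- df %>%
  count(col1, col2) %>%
  # Named equivalents of the empty positional slots (assumed defaults),
  # followed by the column to total, which is passed through `...`
  adorn_totals(where = "row", fill = "-", na.rm = TRUE, name = "Total", n)

print(totals)
```

Naming the arguments makes it explicit that only `n` reaches `...`, so `col2` is not summed even though it is numeric.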

huangapple
  • Published on March 9, 2023, 16:56:57
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75682329.html