Add a total row in dplyr code, but only under a specific column

Question

I have this data set:

    data <- structure(list(col2 = c(1, 1, 2, 3, 1, 2, 2, 3, 1, 2), col1 = c("R",
    "R", "R", "R", "R", "L", "R", "R", "R", "R")), class = c("grouped_df",
    "tbl_df", "tbl", "data.frame"), row.names = c(NA, -10L), groups = structure(list(
        col1 = c("L", "R", "R", "R"), col2 = c(2, 1, 2, 3), .rows = structure(list(
            6L, c(1L, 2L, 5L, 9L), c(3L, 7L, 10L), c(4L, 8L)), ptype = integer(0), class = c("vctrs_list_of",
        "vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
    ), row.names = c(NA, -4L), .drop = TRUE))

What I would like to do is add a single final row that holds the total of the 'n' column only (the new column created with dplyr). I tried the code below, but I get a total for every column.

    library(dplyr)
    library(janitor)

    data %>%
      group_by(col1, col2) %>%
      mutate(col1 = recode(col1, 'R (change file name)' = 'R',
                           'L (change name e data EEG file)' = 'L')) %>%
      summarise(n = n()) %>%
      adorn_totals("row")

I would like to learn how to fix this, or another strategy that achieves the same result.

Thanks

Answer 1

Score: 3

You could use bind_rows to append a row built with summarise, which holds the sum of n and the label "Total" in col1. I changed your first summarise to reframe so that the result is an ungrouped data frame:

    library(dplyr)

    data %>%
      group_by(col1, col2) %>%
      mutate(col1 = recode(col1, 'R (change file name)' = 'R',
                           'L (change name e data EEG file)' = 'L')) %>%
      reframe(n = n()) %>%
      bind_rows(summarise(., across(n, sum), across(col1, ~ "Total")))
    #> # A tibble: 5 × 3
    #>   col1   col2     n
    #>   <chr> <dbl> <int>
    #> 1 L         2     1
    #> 2 R         1     4
    #> 3 R         2     3
    #> 4 R         3     2
    #> 5 Total    NA    10
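
reframe was added in dplyr 1.1.0. On older versions, a similar result can probably be reached by dropping the groups in summarise and binding a one-row tibble. This is only a sketch under assumptions: `data` is rebuilt here as a plain tibble from the values in the question, and `counts` and `result` are illustrative names that do not appear in the original code.

```r
library(dplyr)

# Plain, ungrouped copy of the question's data (rebuilt here for illustration).
data <- tibble(
  col2 = c(1, 1, 2, 3, 1, 2, 2, 3, 1, 2),
  col1 = c("R", "R", "R", "R", "R", "L", "R", "R", "R", "R")
)

# Count rows per (col1, col2) pair and drop the grouping afterwards.
counts <- data %>%
  group_by(col1, col2) %>%
  summarise(n = n(), .groups = "drop")

# Append one extra row: a label in col1 and the sum of n.
# col2 is not supplied, so bind_rows() fills it with NA.
result <- bind_rows(counts, tibble(col1 = "Total", n = sum(counts$n)))
```

Because the extra row only supplies col1 and n, no total is computed for col2, which is the behaviour the question asks for.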

Old answer, written for a different dataset than the OP's:

You could use the adorn_totals function from the janitor package like this:

    library(dplyr)
    library(janitor)

    data %>%
      group_by(col1, col2) %>%
      mutate(col1 = recode(col2, 'R (change file name)' = 'R',
                           'L (change name e data EEG file)' = 'L')) %>%
      summarise(n = n()) %>%
      adorn_totals("row")
    #> `summarise()` has grouped output by 'col1'. You can override using the
    #> `.groups` argument.
    #>   col1                            col2  n
    #>      L L (change name e data EEG file)  1
    #>      R                               R  8
    #>      R            R (change file name)  1
    #>  Total                               - 10

Created on 2023-03-09 with reprex v2.0.2

Answer 2

Score: 2

Here is how we could do it with adorn_totals:

adorn_totals has a ... argument for selecting the columns to total. Using ... requires specifying values for the other arguments first, even if they are left empty, hence the ,,,, below, which accepts the default values for those arguments. See the original answer by @Sam Firke: <https://stackoverflow.com/questions/69745242/calculating-and-appending-column-totals-of-select-columns-in-a-data-frame-in-r>

    library(dplyr)
    library(janitor)

    df %>%
      group_by(col1, col2) %>%
      mutate(col1 = recode(col1, 'R (change file name)' = 'R',
                           'L (change name e data EEG file)' = 'L')) %>%
      summarise(n = n()) %>%
      adorn_totals("row",,,,n)
    #>   col1 col2  n
    #>      L    2  1
    #>      R    1  4
    #>      R    2  3
    #>      R    3  2
    #>  Total    - 10
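
The empty argument slots can also be written out by name, which some readers may find easier to follow than a run of commas. A minimal sketch, assuming janitor's documented defaults for adorn_totals (where = "row", fill = "-", na.rm = TRUE, name = "Total"); `data` is rebuilt here as a plain tibble from the question's values, and `result` is an illustrative name.

```r
library(dplyr)
library(janitor)

# Plain, ungrouped copy of the question's data (rebuilt here for illustration).
data <- tibble(
  col2 = c(1, 1, 2, 3, 1, 2, 2, 3, 1, 2),
  col1 = c("R", "R", "R", "R", "R", "L", "R", "R", "R", "R")
)

result <- data %>%
  group_by(col1, col2) %>%
  summarise(n = n(), .groups = "drop") %>%
  # Every argument before `...` is named, so the bare `n` is passed
  # through `...` as the only column to total.
  adorn_totals(where = "row", fill = "-", na.rm = TRUE, name = "Total", n)

print(result)
```

Naming the arguments should behave the same as the ,,,, form, since both leave the defaults in place and pass only n as a column to total.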

huangapple
  • Posted on March 9, 2023, 16:56:57
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75682329.html