对最近三行进行列求和,并将它们除以相邻的单元格。

huangapple go评论86阅读模式
英文:

Sum up columns and divide them by adjacent cell FOR THE MOST RECENT THREE ROWS

问题

I understand that you want the code part to remain untranslated. Here's the translated text for the content you provided:

我之前提过类似的问题,但想要进行更新。我要如何将列求和并将它们除以相邻的单元格,仅针对最近的三行?例如,如果我有以下数据集:
我有以下表格:

日期 1 2 3 4 5 6 7
2021年07月10日 20 25 25 26 26 25 31
2021年08月10日 26 25 25 26 25 31
2021年09月10日 29 29 25 26 39
2021年10月10日 20 29 25 31
2021年11月10日 20 30 30
2021年12月10日 30 30
2022年01月10日 39

我尝试将列相加,但只加到具有相邻单元格的列,然后仅对每个相应列的底部三个值相加。然后,我尝试将每个总和除以前一个总和。例如,列1和2将一直加到2021年10月10日,然后将列2除以列1。输出将是该表格以及每列下方的总值行,如下所示:

日期 1 2 3 4 5 6 7
2021年07月10日 20 25 25 26 26 25 31
2021年08月10日 26 25 25 26 25 31
2021年09月10日 29 29 25 26 39
2021年10月10日 20 29 25 31
2021年11月10日 20 30 30
2021年12月10日 30 30
2022年01月10日 39
总计 1 1.27 0.90 1.18 1.15 1.09 1.24

我正在尝试修改此代码以实现这一目标:

  1. (d[,8:3] |> apply(2, cumsum)/
  2. d[,7:2] |> apply(2, cumsum)
  3. ) |>
  4. diag() |>
  5. rev()
英文:

I asked a similar question a few days back, but wanted to update it. How would I sum up columns and divide them by adjacent cell for the most recent three rows? For example, if I had the dataset:
I have the following table:

Date 1 2 3 4 5 6 7
07/10/2021 20 25 25 26 26 25 31
08/10/2021 26 25 25 26 25 31
09/10/2021 29 29 25 26 39
10/10/2021 20 29 25 31
11/10/2021 20 30 30
12/10/2021 30 30
1/10/2022 39

<!-- begin snippet: js hide: false console: true babel: false -->

<!-- language: lang-html -->

  1. ## example data d:
  2. d &lt;-
  3. structure(list(Date = c(&quot;07.10.2021&quot;, &quot;08.10.2021&quot;, &quot;09.10.2021&quot;,
  4. &quot;10.10.2021&quot;, &quot;11.10.2021&quot;, &quot;12.10.2021&quot;, &quot;01.10.2022&quot;), X1 = c(20L,
  5. 26L, 29L, 20L, 20L, 30L, 39L), X2 = c(25L, 25L, 29L, 29L, 30L,
  6. 30L, NA), X3 = c(25L, 25L, 25L, 25L, 30L, NA, NA), X4 = c(26L,
  7. 26L, 26L, 31L, NA, NA, NA), X5 = c(26L, 25L, 39L, NA, NA, NA,
  8. NA), X6 = c(25L, 31L, NA, NA, NA, NA, NA), X7 = c(31L, NA, NA,
  9. NA, NA, NA, NA)), class = &quot;data.frame&quot;, row.names = c(NA, 7L))

<!-- end snippet -->

I'm trying to add up the columns, but only up to the column that has an adjacent cell, and then ONLY UP TO THREE VALUES FROM THE BOTTOM CELL FOR EACH RESPECTIVE COLUMN. I then am trying to divide each sum by the previous sum. For example, column 1 and 2 would add down to 12/10/2021 and then column 2 would be divided by column 1. The output would be the table plus a row with the total values under each of the columns, as shown below:

Date 1 2 3 4 5 6 7
07/10/2021 20 25 25 26 26 25 31
08/10/2021 26 25 25 26 25 31
09/10/2021 29 29 25 26 39
10/10/2021 20 29 25 31
11/10/2021 20 30 30
12/10/2021 30 30
1/10/2022 39
TOTAL 1 1.27 0.90 1.18 1.15 1.09 1.24

I am trying to modify this code to do so:

<!-- begin snippet: js hide: false console: true babel: false -->

<!-- language: lang-html -->

  1. (d[,8:3] |&gt; apply(2, cumsum)/
  2. d[,7:2] |&gt; apply(2, cumsum)
  3. ) |&gt;
  4. diag() |&gt;
  5. rev()

<!-- end snippet -->

答案1

得分: 1

  1. 库(tidyverse)
  2. d <-
  3. 结构(list(日期 = c(
  4. "07.10.2021", "08.10.2021", "09.10.2021",
  5. "10.10.2021", "11.10.2021", "12.10.2021", "01.10.2022"
  6. ), X1 = c(
  7. 20L,
  8. 26L, 29L, 20L, 20L, 30L, 39L
  9. ), X2 = c(
  10. 25L, 25L, 29L, 29L, 30L,
  11. 30L, NA
  12. ), X3 = c(25L, 25L, 25L, 25L, 30L, NA, NA), X4 = c(
  13. 26L,
  14. 26L, 26L, 31L, NA, NA, NA
  15. ), X5 = c(
  16. 26L, 25L, 39L, NA, NA, NA,
  17. NA
  18. ), X6 = c(25L, 31L, NA, NA, NA, NA, NA), X7 = c(
  19. 31L, NA, NA,
  20. NA, NA, NA, NA
  21. )), = "数据框", 行名 = c(NA, 7L))
  22. (nd <- 列名(d))
  23. (ndx <- nd[startsWith(nd, "X")])
  24. (ndx1 <- c(ndx[1], na.omit(dplyr::lag(ndx, 1))))
  25. (结果行 <- map2_dbl(ndx, ndx1, \(calc_col, prev_col){
  26. lc <- length(na.omit(d[[calc_col]]))
  27. prev_3 <- sum(tail(head(d[[prev_col]], lc), n = 3))
  28. calc_3 <- sum(tail(head(d[[calc_col]], lc), n = 3))
  29. calc_3 / prev_3
  30. }))
  31. 列名(结果行) <- ndx
  32. bind_cols(
  33. 数据框(日期 = "总计"),
  34. enframe(结果行) |> pivot_wider()
  35. )
  36. d |> bind_rows(
  37. bind_cols(
  38. 数据框(日期 = "总计"),
  39. enframe(结果行) |> pivot_wider()
  40. )
  41. )
  1. 日期 X1 X2 X3 X4 X5 X6 X7
  2. 1 07.10.2021 20 25.000000 25.0000000 26.000000 26.000000 25.000000 31.00
  3. 2 08.10.2021 26 25.000000 25.0000000 26.000000 25.000000 31.000000 NA
  4. 3 09.10.2021 29 29.000000 25.0000000 26.000000 39.000000 NA NA
  5. 4 10.10.2021 20 29.000000 25.0000000 31.000000 NA NA NA
  6. 5 11.10.2021 20 30.000000 30.0000000 NA NA NA NA
  7. 6 12.10.2021 30 30.000000 NA NA NA NA NA
  8. 7 01.10.2022 39 NA NA NA NA NA NA
  9. 8 总计 1 1.271429 0.9090909 1.106667 1.153846 1.098039 1.24
英文:
  1. library(tidyverse)
  2. d &lt;-
  3. structure(list(Date = c(
  4. &quot;07.10.2021&quot;, &quot;08.10.2021&quot;, &quot;09.10.2021&quot;,
  5. &quot;10.10.2021&quot;, &quot;11.10.2021&quot;, &quot;12.10.2021&quot;, &quot;01.10.2022&quot;
  6. ), X1 = c(
  7. 20L,
  8. 26L, 29L, 20L, 20L, 30L, 39L
  9. ), X2 = c(
  10. 25L, 25L, 29L, 29L, 30L,
  11. 30L, NA
  12. ), X3 = c(25L, 25L, 25L, 25L, 30L, NA, NA), X4 = c(
  13. 26L,
  14. 26L, 26L, 31L, NA, NA, NA
  15. ), X5 = c(
  16. 26L, 25L, 39L, NA, NA, NA,
  17. NA
  18. ), X6 = c(25L, 31L, NA, NA, NA, NA, NA), X7 = c(
  19. 31L, NA, NA,
  20. NA, NA, NA, NA
  21. )), class = &quot;data.frame&quot;, row.names = c(NA, 7L))
  22. (nd &lt;- names(d))
  23. (ndx &lt;- nd[startsWith(nd, &quot;X&quot;)])
  24. (ndx1 &lt;- c(ndx[1], na.omit(dplyr::lag(ndx, 1))))
  25. (result_row &lt;- map2_dbl(ndx, ndx1, \(calc_col, prev_col){
  26. lc &lt;- length(na.omit(d[[calc_col]]))
  27. prev_3 &lt;- sum(tail(head(d[[prev_col]], lc), n = 3))
  28. calc_3 &lt;- sum(tail(head(d[[calc_col]], lc), n = 3))
  29. calc_3 / prev_3
  30. }))
  31. names(result_row) &lt;- ndx
  32. bind_cols(
  33. tibble(Date = &quot;Total&quot;),
  34. enframe(result_row) |&gt; pivot_wider()
  35. )
  36. d |&gt; bind_rows(
  37. bind_cols(
  38. tibble(Date = &quot;Total&quot;),
  39. enframe(result_row) |&gt; pivot_wider()
  40. )
  41. )
  1. Date X1 X2 X3 X4 X5 X6 X7
  2. 1 07.10.2021 20 25.000000 25.0000000 26.000000 26.000000 25.000000 31.00
  3. 2 08.10.2021 26 25.000000 25.0000000 26.000000 25.000000 31.000000 NA
  4. 3 09.10.2021 29 29.000000 25.0000000 26.000000 39.000000 NA NA
  5. 4 10.10.2021 20 29.000000 25.0000000 31.000000 NA NA NA
  6. 5 11.10.2021 20 30.000000 30.0000000 NA NA NA NA
  7. 6 12.10.2021 30 30.000000 NA NA NA NA NA
  8. 7 01.10.2022 39 NA NA NA NA NA NA
  9. 8 Total 1 1.271429 0.9090909 1.106667 1.153846 1.098039 1.24
  10. </details>
  11. # 答案2
  12. **得分**: 0
  13. 我们可以使用这个简单的`sapply`
  14. 首先,我认为你的`1.18`应该改为
  15. ```r
  16. (26+26+31)/(25+25+25)
  17. # [1] 1.106667

代码:

  1. tmp &lt;- as.data.frame(matrix(c(1, sapply(3:ncol(d), function(ci) {
  2. ri &lt;- tail(which(!is.na(d[[ci]])), n = 3)
  3. sum(d[[ci]][ri]) / sum(d[[ci-1]][ri])
  4. })), nrow = 1))
  5. rbind(d, cbind(data.frame(Date=&quot;Total&quot;), setNames(tmp, names(d)[-1])))
  6. # Date X1 X2 X3 X4 X5 X6 X7
  7. # 1 07.10.2021 20 25.000000 25.0000000 26.000000 26.000000 25.000000 31.00
  8. # 2 08.10.2021 26 25.000000 25.0000000 26.000000 25.000000 31.000000 NA
  9. # 3 09.10.2021 29 29.000000 25.0000000 26.000000 39.000000 NA NA
  10. # 4 10.10.2021 20 29.000000 25.0000000 31.000000 NA NA NA
  11. # 5 11.10.2021 20 30.000000 30.0000000 NA NA NA NA
  12. # 6 12.10.2021 30 30.000000 NA NA NA NA NA
  13. # 7 01.10.2022 39 NA NA NA NA NA NA
  14. # 8 Total 1 1.271429 0.9090909 1.106667 1.153846 1.098039 1.24

用原生管道的说法,它可以(有点)展开为:

  1. fun &lt;- function(ci) {
  2. ri &lt;- tail(which(!is.na(d[[ci]])), n = 3)
  3. sum(d[[ci]][ri]) / sum(d[[ci-1]][ri])
  4. }
  5. tmp &lt;- c(1, sapply(3:ncol(d), fun)) |&gt;
  6. matrix(nrow = 1) |&gt;
  7. `colnames&lt;-`(names(d)[-1])
  8. tmp
  9. # X1 X2 X3 X4 X5 X6 X7
  10. # [1,] 1 1.271429 0.9090909 1.106667 1.153846 1.098039 1.24
  11. out &lt;- data.frame(Date = &quot;Total&quot;) |&gt;
  12. cbind(tmp) |&gt;
  13. (\(z) rbind(d, z))()
  14. out
  15. # Date X1 X2 X3 X4 X5 X6 X7
  16. # 1 07.10.2021 20 25.000000 25.0000000 26.000000 26.000000 25.000000 31.00
  17. # 2 08.10.2021 26 25.000000 25.0000000 26.000000 25.000000 31.000000 NA
  18. # 3 09.10.2021 29 29.000000 25.0000000 26.000000 39.000000 NA NA
  19. # 4 10.10.2021 20 29.000000 25.0000000 31.000000 NA NA NA
  20. # 5 11.10.2021 20 30.000000 30.0000000 NA NA NA NA
  21. # 6 12.10.2021 30 30.000000 NA NA NA NA NA
  22. # 7 01.10.2022 39 NA NA NA NA NA NA
  23. # 8 Total 1 1.271429 0.9090909 1.106667 1.153846 1.098039 1.24
英文:

We can use this simple sapply.

Up front, I think your 1.18 should instead be

  1. (26+26+31)/(25+25+25)
  2. # [1] 1.106667

Code:

  1. tmp &lt;- as.data.frame(matrix(c(1, sapply(3:ncol(d), function(ci) {
  2. ri &lt;- tail(which(!is.na(d[[ci]])), n = 3)
  3. sum(d[[ci]][ri]) / sum(d[[ci-1]][ri])
  4. })), nrow = 1))
  5. rbind(d, cbind(data.frame(Date=&quot;Total&quot;), setNames(tmp, names(d)[-1])))
  6. # Date X1 X2 X3 X4 X5 X6 X7
  7. # 1 07.10.2021 20 25.000000 25.0000000 26.000000 26.000000 25.000000 31.00
  8. # 2 08.10.2021 26 25.000000 25.0000000 26.000000 25.000000 31.000000 NA
  9. # 3 09.10.2021 29 29.000000 25.0000000 26.000000 39.000000 NA NA
  10. # 4 10.10.2021 20 29.000000 25.0000000 31.000000 NA NA NA
  11. # 5 11.10.2021 20 30.000000 30.0000000 NA NA NA NA
  12. # 6 12.10.2021 30 30.000000 NA NA NA NA NA
  13. # 7 01.10.2022 39 NA NA NA NA NA NA
  14. # 8 Total 1 1.271429 0.9090909 1.106667 1.153846 1.098039 1.24

In native-pipe speak, it can be (somewhat) spread out as:

  1. fun &lt;- function(ci) {
  2. ri &lt;- tail(which(!is.na(d[[ci]])), n = 3)
  3. sum(d[[ci]][ri]) / sum(d[[ci-1]][ri])
  4. }
  5. tmp &lt;- c(1, sapply(3:ncol(d), fun)) |&gt;
  6. matrix(nrow = 1) |&gt;
  7. `colnames&lt;-`(names(d)[-1])
  8. tmp
  9. # X1 X2 X3 X4 X5 X6 X7
  10. # [1,] 1 1.271429 0.9090909 1.106667 1.153846 1.098039 1.24
  11. out &lt;- data.frame(Date = &quot;Total&quot;) |&gt;
  12. cbind(tmp) |&gt;
  13. (\(z) rbind(d, z))()
  14. out
  15. # Date X1 X2 X3 X4 X5 X6 X7
  16. # 1 07.10.2021 20 25.000000 25.0000000 26.000000 26.000000 25.000000 31.00
  17. # 2 08.10.2021 26 25.000000 25.0000000 26.000000 25.000000 31.000000 NA
  18. # 3 09.10.2021 29 29.000000 25.0000000 26.000000 39.000000 NA NA
  19. # 4 10.10.2021 20 29.000000 25.0000000 31.000000 NA NA NA
  20. # 5 11.10.2021 20 30.000000 30.0000000 NA NA NA NA
  21. # 6 12.10.2021 30 30.000000 NA NA NA NA NA
  22. # 7 01.10.2022 39 NA NA NA NA NA NA
  23. # 8 Total 1 1.271429 0.9090909 1.106667 1.153846 1.098039 1.24

huangapple
  • 本文由 发表于 2023年6月12日 23:09:38
  • 转载请务必保留本文链接:https://go.coder-hub.com/76457996.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定