对最近三行进行列求和,并将它们除以相邻的单元格。

huangapple go评论60阅读模式
英文:

Sum up columns and divide them by adjacent cell FOR THE MOST RECENT THREE ROWS

问题

I understand that you want the code part to remain untranslated. Here's the translated text for the content you provided:

我之前提过类似的问题,但想要进行更新。我要如何将列求和并将它们除以相邻的单元格,仅针对最近的三行?例如,如果我有以下数据集:
我有以下表格:

日期 1 2 3 4 5 6 7
2021年07月10日 20 25 25 26 26 25 31
2021年08月10日 26 25 25 26 25 31
2021年09月10日 29 29 25 26 39
2021年10月10日 20 29 25 31
2021年11月10日 20 30 30
2021年12月10日 30 30
2022年01月10日 39

我尝试将列相加,但只加到具有相邻单元格的列,然后仅对每个相应列的底部三个值相加。然后,我尝试将每个总和除以前一个总和。例如,列1和2将一直加到2021年10月10日,然后将列2除以列1。输出将是该表格以及每列下方的总值行,如下所示:

日期 1 2 3 4 5 6 7
2021年07月10日 20 25 25 26 26 25 31
2021年08月10日 26 25 25 26 25 31
2021年09月10日 29 29 25 26 39
2021年10月10日 20 29 25 31
2021年11月10日 20 30 30
2021年12月10日 30 30
2022年01月10日 39
总计 1 1.27 0.90 1.18 1.15 1.09 1.24

我正在尝试修改此代码以实现这一目标:

(d[,8:3] |> apply(2, cumsum)/
 d[,7:2] |> apply(2, cumsum)
) |>
  diag() |>
  rev()
英文:

I asked a similar question a few days back, but wanted to update it. How would I sum up columns and divide them by adjacent cell for the most recent three rows? For example, if I had the dataset:
I have the following table:

Date 1 2 3 4 5 6 7
07/10/2021 20 25 25 26 26 25 31
08/10/2021 26 25 25 26 25 31
09/10/2021 29 29 25 26 39
10/10/2021 20 29 25 31
11/10/2021 20 30 30
12/10/2021 30 30
1/10/2022 39

<!-- begin snippet: js hide: false console: true babel: false -->

<!-- language: lang-html -->

## example data d:
d &lt;- 
structure(list(Date = c(&quot;07.10.2021&quot;, &quot;08.10.2021&quot;, &quot;09.10.2021&quot;, 
&quot;10.10.2021&quot;, &quot;11.10.2021&quot;, &quot;12.10.2021&quot;, &quot;01.10.2022&quot;), X1 = c(20L, 
26L, 29L, 20L, 20L, 30L, 39L), X2 = c(25L, 25L, 29L, 29L, 30L, 
30L, NA), X3 = c(25L, 25L, 25L, 25L, 30L, NA, NA), X4 = c(26L, 
26L, 26L, 31L, NA, NA, NA), X5 = c(26L, 25L, 39L, NA, NA, NA, 
NA), X6 = c(25L, 31L, NA, NA, NA, NA, NA), X7 = c(31L, NA, NA, 
NA, NA, NA, NA)), class = &quot;data.frame&quot;, row.names = c(NA, 7L))

<!-- end snippet -->

I'm trying to add up the columns, but only up to the column that has an adjacent cell, and then ONLY UP TO THREE VALUES FROM THE BOTTOM CELL FOR EACH RESPECTIVE COLUMN. I then am trying to divide each sum by the previous sum. For example, column 1 and 2 would add down to 12/10/2021 and then column 2 would be divided by column 1. The output would be the table plus a row with the total values under each of the columns, as shown below:

Date 1 2 3 4 5 6 7
07/10/2021 20 25 25 26 26 25 31
08/10/2021 26 25 25 26 25 31
09/10/2021 29 29 25 26 39
10/10/2021 20 29 25 31
11/10/2021 20 30 30
12/10/2021 30 30
1/10/2022 39
TOTAL 1 1.27 0.90 1.18 1.15 1.09 1.24

I am trying to modify this code to do so:

<!-- begin snippet: js hide: false console: true babel: false -->

<!-- language: lang-html -->

(d[,8:3] |&gt; apply(2, cumsum)/
 d[,7:2] |&gt; apply(2, cumsum)
) |&gt;
  diag() |&gt;
  rev()

<!-- end snippet -->

答案1

得分: 1

库(tidyverse)

d <-
  结构(list(日期 = c(
    "07.10.2021", "08.10.2021", "09.10.2021",
    "10.10.2021", "11.10.2021", "12.10.2021", "01.10.2022"
  ), X1 = c(
    20L,
    26L, 29L, 20L, 20L, 30L, 39L
  ), X2 = c(
    25L, 25L, 29L, 29L, 30L,
    30L, NA
  ), X3 = c(25L, 25L, 25L, 25L, 30L, NA, NA), X4 = c(
    26L,
    26L, 26L, 31L, NA, NA, NA
  ), X5 = c(
    26L, 25L, 39L, NA, NA, NA,
    NA
  ), X6 = c(25L, 31L, NA, NA, NA, NA, NA), X7 = c(
    31L, NA, NA,
    NA, NA, NA, NA
  )), 类 = "数据框", 行名 = c(NA, 7L))


(nd <- 列名(d))
(ndx <- nd[startsWith(nd, "X")])
(ndx1 <- c(ndx[1], na.omit(dplyr::lag(ndx, 1))))

(结果行 <- map2_dbl(ndx, ndx1, \(calc_col, prev_col){
  lc <- length(na.omit(d[[calc_col]]))
  prev_3 <- sum(tail(head(d[[prev_col]], lc), n = 3))
  calc_3 <- sum(tail(head(d[[calc_col]], lc), n = 3))
  calc_3 / prev_3
}))

列名(结果行) <- ndx
bind_cols(
  数据框(日期 = "总计"),
  enframe(结果行) |> pivot_wider()
)
d |> bind_rows(
  bind_cols(
    数据框(日期 = "总计"),
    enframe(结果行) |> pivot_wider()
  )
)
        日期 X1        X2         X3        X4        X5        X6    X7
1 07.10.2021 20 25.000000 25.0000000 26.000000 26.000000 25.000000 31.00
2 08.10.2021 26 25.000000 25.0000000 26.000000 25.000000 31.000000    NA
3 09.10.2021 29 29.000000 25.0000000 26.000000 39.000000        NA    NA
4 10.10.2021 20 29.000000 25.0000000 31.000000        NA        NA    NA
5 11.10.2021 20 30.000000 30.0000000        NA        NA        NA    NA
6 12.10.2021 30 30.000000         NA        NA        NA        NA    NA
7 01.10.2022 39        NA         NA        NA        NA        NA    NA
8       总计  1  1.271429  0.9090909  1.106667  1.153846  1.098039  1.24
英文:
library(tidyverse)

d &lt;-
  structure(list(Date = c(
    &quot;07.10.2021&quot;, &quot;08.10.2021&quot;, &quot;09.10.2021&quot;,
    &quot;10.10.2021&quot;, &quot;11.10.2021&quot;, &quot;12.10.2021&quot;, &quot;01.10.2022&quot;
  ), X1 = c(
    20L,
    26L, 29L, 20L, 20L, 30L, 39L
  ), X2 = c(
    25L, 25L, 29L, 29L, 30L,
    30L, NA
  ), X3 = c(25L, 25L, 25L, 25L, 30L, NA, NA), X4 = c(
    26L,
    26L, 26L, 31L, NA, NA, NA
  ), X5 = c(
    26L, 25L, 39L, NA, NA, NA,
    NA
  ), X6 = c(25L, 31L, NA, NA, NA, NA, NA), X7 = c(
    31L, NA, NA,
    NA, NA, NA, NA
  )), class = &quot;data.frame&quot;, row.names = c(NA, 7L))


(nd &lt;- names(d))
(ndx &lt;- nd[startsWith(nd, &quot;X&quot;)])
(ndx1 &lt;- c(ndx[1], na.omit(dplyr::lag(ndx, 1))))

(result_row &lt;- map2_dbl(ndx, ndx1, \(calc_col, prev_col){
  lc &lt;- length(na.omit(d[[calc_col]]))
  prev_3 &lt;- sum(tail(head(d[[prev_col]], lc), n = 3))
  calc_3 &lt;- sum(tail(head(d[[calc_col]], lc), n = 3))
  calc_3 / prev_3
}))

names(result_row) &lt;- ndx
bind_cols(
  tibble(Date = &quot;Total&quot;),
  enframe(result_row) |&gt; pivot_wider()
)
d |&gt; bind_rows(
  bind_cols(
    tibble(Date = &quot;Total&quot;),
    enframe(result_row) |&gt; pivot_wider()
  )
)
        Date X1        X2         X3        X4        X5        X6    X7
1 07.10.2021 20 25.000000 25.0000000 26.000000 26.000000 25.000000 31.00
2 08.10.2021 26 25.000000 25.0000000 26.000000 25.000000 31.000000    NA
3 09.10.2021 29 29.000000 25.0000000 26.000000 39.000000        NA    NA
4 10.10.2021 20 29.000000 25.0000000 31.000000        NA        NA    NA
5 11.10.2021 20 30.000000 30.0000000        NA        NA        NA    NA
6 12.10.2021 30 30.000000         NA        NA        NA        NA    NA
7 01.10.2022 39        NA         NA        NA        NA        NA    NA
8      Total  1  1.271429  0.9090909  1.106667  1.153846  1.098039  1.24

</details>



# 答案2
**得分**: 0

我们可以使用这个简单的`sapply`。

首先,我认为你的`1.18`应该改为

```r
(26+26+31)/(25+25+25)
# [1] 1.106667

代码:

tmp &lt;- as.data.frame(matrix(c(1, sapply(3:ncol(d), function(ci) {
  ri &lt;- tail(which(!is.na(d[[ci]])), n = 3)
  sum(d[[ci]][ri]) / sum(d[[ci-1]][ri])
})), nrow = 1))
rbind(d, cbind(data.frame(Date=&quot;Total&quot;), setNames(tmp, names(d)[-1])))
#         Date X1        X2         X3        X4        X5        X6    X7
# 1 07.10.2021 20 25.000000 25.0000000 26.000000 26.000000 25.000000 31.00
# 2 08.10.2021 26 25.000000 25.0000000 26.000000 25.000000 31.000000    NA
# 3 09.10.2021 29 29.000000 25.0000000 26.000000 39.000000        NA    NA
# 4 10.10.2021 20 29.000000 25.0000000 31.000000        NA        NA    NA
# 5 11.10.2021 20 30.000000 30.0000000        NA        NA        NA    NA
# 6 12.10.2021 30 30.000000         NA        NA        NA        NA    NA
# 7 01.10.2022 39        NA         NA        NA        NA        NA    NA
# 8      Total  1  1.271429  0.9090909  1.106667  1.153846  1.098039  1.24

用原生管道的说法,它可以(有点)展开为:

fun &lt;- function(ci) {
  ri &lt;- tail(which(!is.na(d[[ci]])), n = 3)
  sum(d[[ci]][ri]) / sum(d[[ci-1]][ri])
}
tmp &lt;- c(1, sapply(3:ncol(d), fun)) |&gt;
  matrix(nrow = 1) |&gt;
  `colnames&lt;-`(names(d)[-1])
tmp
#      X1       X2        X3       X4       X5       X6   X7
# [1,]  1 1.271429 0.9090909 1.106667 1.153846 1.098039 1.24
out &lt;- data.frame(Date = &quot;Total&quot;) |&gt;
  cbind(tmp) |&gt;
  (\(z) rbind(d, z))()
out
#         Date X1        X2         X3        X4        X5        X6    X7
# 1 07.10.2021 20 25.000000 25.0000000 26.000000 26.000000 25.000000 31.00
# 2 08.10.2021 26 25.000000 25.0000000 26.000000 25.000000 31.000000    NA
# 3 09.10.2021 29 29.000000 25.0000000 26.000000 39.000000        NA    NA
# 4 10.10.2021 20 29.000000 25.0000000 31.000000        NA        NA    NA
# 5 11.10.2021 20 30.000000 30.0000000        NA        NA        NA    NA
# 6 12.10.2021 30 30.000000         NA        NA        NA        NA    NA
# 7 01.10.2022 39        NA         NA        NA        NA        NA    NA
# 8      Total  1  1.271429  0.9090909  1.106667  1.153846  1.098039  1.24
英文:

We can use this simple sapply.

Up front, I think your 1.18 should instead be

(26+26+31)/(25+25+25)
# [1] 1.106667

Code:

tmp &lt;- as.data.frame(matrix(c(1, sapply(3:ncol(d), function(ci) {
  ri &lt;- tail(which(!is.na(d[[ci]])), n = 3)
  sum(d[[ci]][ri]) / sum(d[[ci-1]][ri])
})), nrow = 1))
rbind(d, cbind(data.frame(Date=&quot;Total&quot;), setNames(tmp, names(d)[-1])))
#         Date X1        X2         X3        X4        X5        X6    X7
# 1 07.10.2021 20 25.000000 25.0000000 26.000000 26.000000 25.000000 31.00
# 2 08.10.2021 26 25.000000 25.0000000 26.000000 25.000000 31.000000    NA
# 3 09.10.2021 29 29.000000 25.0000000 26.000000 39.000000        NA    NA
# 4 10.10.2021 20 29.000000 25.0000000 31.000000        NA        NA    NA
# 5 11.10.2021 20 30.000000 30.0000000        NA        NA        NA    NA
# 6 12.10.2021 30 30.000000         NA        NA        NA        NA    NA
# 7 01.10.2022 39        NA         NA        NA        NA        NA    NA
# 8      Total  1  1.271429  0.9090909  1.106667  1.153846  1.098039  1.24

In native-pipe speak, it can be (somewhat) spread out as:

fun &lt;- function(ci) {
  ri &lt;- tail(which(!is.na(d[[ci]])), n = 3)
  sum(d[[ci]][ri]) / sum(d[[ci-1]][ri])
}
tmp &lt;- c(1, sapply(3:ncol(d), fun)) |&gt;
  matrix(nrow = 1) |&gt;
  `colnames&lt;-`(names(d)[-1])
tmp
#      X1       X2        X3       X4       X5       X6   X7
# [1,]  1 1.271429 0.9090909 1.106667 1.153846 1.098039 1.24
out &lt;- data.frame(Date = &quot;Total&quot;) |&gt;
  cbind(tmp) |&gt;
  (\(z) rbind(d, z))()
out
#         Date X1        X2         X3        X4        X5        X6    X7
# 1 07.10.2021 20 25.000000 25.0000000 26.000000 26.000000 25.000000 31.00
# 2 08.10.2021 26 25.000000 25.0000000 26.000000 25.000000 31.000000    NA
# 3 09.10.2021 29 29.000000 25.0000000 26.000000 39.000000        NA    NA
# 4 10.10.2021 20 29.000000 25.0000000 31.000000        NA        NA    NA
# 5 11.10.2021 20 30.000000 30.0000000        NA        NA        NA    NA
# 6 12.10.2021 30 30.000000         NA        NA        NA        NA    NA
# 7 01.10.2022 39        NA         NA        NA        NA        NA    NA
# 8      Total  1  1.271429  0.9090909  1.106667  1.153846  1.098039  1.24

huangapple
  • 本文由 发表于 2023年6月12日 23:09:38
  • 转载请务必保留本文链接:https://go.coder-hub.com/76457996.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定