英文:
Sum up columns and divide them by adjacent cell FOR THE MOST RECENT THREE ROWS
问题
I understand that you want the code part to remain untranslated. Here's the translated text for the content you provided:
我之前提过类似的问题,但想要进行更新。我要如何将列求和并将它们除以相邻的单元格,仅针对最近的三行?例如,如果我有以下数据集:
我有以下表格:
日期 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|
2021年07月10日 | 20 | 25 | 25 | 26 | 26 | 25 | 31 |
2021年08月10日 | 26 | 25 | 25 | 26 | 25 | 31 | |
2021年09月10日 | 29 | 29 | 25 | 26 | 39 | ||
2021年10月10日 | 20 | 29 | 25 | 31 | |||
2021年11月10日 | 20 | 30 | 30 | ||||
2021年12月10日 | 30 | 30 | |||||
2022年01月10日 | 39 |
我尝试将列相加,但只加到具有相邻单元格的列,然后仅对每个相应列的底部三个值相加。然后,我尝试将每个总和除以前一个总和。例如,列1和2将一直加到2021年10月10日,然后将列2除以列1。输出将是该表格以及每列下方的总值行,如下所示:
日期 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|
2021年07月10日 | 20 | 25 | 25 | 26 | 26 | 25 | 31 |
2021年08月10日 | 26 | 25 | 25 | 26 | 25 | 31 | |
2021年09月10日 | 29 | 29 | 25 | 26 | 39 | ||
2021年10月10日 | 20 | 29 | 25 | 31 | |||
2021年11月10日 | 20 | 30 | 30 | ||||
2021年12月10日 | 30 | 30 | |||||
2022年01月10日 | 39 | ||||||
总计 | 1 | 1.27 | 0.90 | 1.18 | 1.15 | 1.09 | 1.24 |
我正在尝试修改此代码以实现这一目标:
(d[,8:3] |> apply(2, cumsum)/
d[,7:2] |> apply(2, cumsum)
) |>
diag() |>
rev()
英文:
I asked a similar question a few days back, but wanted to update it. How would I sum up columns and divide them by adjacent cell for the most recent three rows? For example, if I had the dataset:
I have the following table:
Date | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|
07/10/2021 | 20 | 25 | 25 | 26 | 26 | 25 | 31 |
08/10/2021 | 26 | 25 | 25 | 26 | 25 | 31 | |
09/10/2021 | 29 | 29 | 25 | 26 | 39 | ||
10/10/2021 | 20 | 29 | 25 | 31 | |||
11/10/2021 | 20 | 30 | 30 | ||||
12/10/2021 | 30 | 30 | |||||
1/10/2022 | 39 |
<!-- begin snippet: js hide: false console: true babel: false -->
<!-- language: lang-html -->
## example data d:
d <-
structure(list(Date = c("07.10.2021", "08.10.2021", "09.10.2021",
"10.10.2021", "11.10.2021", "12.10.2021", "01.10.2022"), X1 = c(20L,
26L, 29L, 20L, 20L, 30L, 39L), X2 = c(25L, 25L, 29L, 29L, 30L,
30L, NA), X3 = c(25L, 25L, 25L, 25L, 30L, NA, NA), X4 = c(26L,
26L, 26L, 31L, NA, NA, NA), X5 = c(26L, 25L, 39L, NA, NA, NA,
NA), X6 = c(25L, 31L, NA, NA, NA, NA, NA), X7 = c(31L, NA, NA,
NA, NA, NA, NA)), class = "data.frame", row.names = c(NA, 7L))
<!-- end snippet -->
I'm trying to add up the columns, but only up to the column that has an adjacent cell, and then ONLY UP TO THREE VALUES FROM THE BOTTOM CELL FOR EACH RESPECTIVE COLUMN. I then am trying to divide each sum by the previous sum. For example, column 1 and 2 would add down to 12/10/2021 and then column 2 would be divided by column 1. The output would be the table plus a row with the total values under each of the columns, as shown below:
Date | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|
07/10/2021 | 20 | 25 | 25 | 26 | 26 | 25 | 31 |
08/10/2021 | 26 | 25 | 25 | 26 | 25 | 31 | |
09/10/2021 | 29 | 29 | 25 | 26 | 39 | ||
10/10/2021 | 20 | 29 | 25 | 31 | |||
11/10/2021 | 20 | 30 | 30 | ||||
12/10/2021 | 30 | 30 | |||||
1/10/2022 | 39 | ||||||
TOTAL | 1 | 1.27 | 0.90 | 1.18 | 1.15 | 1.09 | 1.24 |
I am trying to modify this code to do so:
<!-- begin snippet: js hide: false console: true babel: false -->
<!-- language: lang-html -->
(d[,8:3] |> apply(2, cumsum)/
d[,7:2] |> apply(2, cumsum)
) |>
diag() |>
rev()
<!-- end snippet -->
答案1
得分: 1
库(tidyverse)
d <-
结构(list(日期 = c(
"07.10.2021", "08.10.2021", "09.10.2021",
"10.10.2021", "11.10.2021", "12.10.2021", "01.10.2022"
), X1 = c(
20L,
26L, 29L, 20L, 20L, 30L, 39L
), X2 = c(
25L, 25L, 29L, 29L, 30L,
30L, NA
), X3 = c(25L, 25L, 25L, 25L, 30L, NA, NA), X4 = c(
26L,
26L, 26L, 31L, NA, NA, NA
), X5 = c(
26L, 25L, 39L, NA, NA, NA,
NA
), X6 = c(25L, 31L, NA, NA, NA, NA, NA), X7 = c(
31L, NA, NA,
NA, NA, NA, NA
)), 类 = "数据框", 行名 = c(NA, 7L))
(nd <- 列名(d))
(ndx <- nd[startsWith(nd, "X")])
(ndx1 <- c(ndx[1], na.omit(dplyr::lag(ndx, 1))))
(结果行 <- map2_dbl(ndx, ndx1, \(calc_col, prev_col){
lc <- length(na.omit(d[[calc_col]]))
prev_3 <- sum(tail(head(d[[prev_col]], lc), n = 3))
calc_3 <- sum(tail(head(d[[calc_col]], lc), n = 3))
calc_3 / prev_3
}))
列名(结果行) <- ndx
bind_cols(
数据框(日期 = "总计"),
enframe(结果行) |> pivot_wider()
)
d |> bind_rows(
bind_cols(
数据框(日期 = "总计"),
enframe(结果行) |> pivot_wider()
)
)
日期 X1 X2 X3 X4 X5 X6 X7
1 07.10.2021 20 25.000000 25.0000000 26.000000 26.000000 25.000000 31.00
2 08.10.2021 26 25.000000 25.0000000 26.000000 25.000000 31.000000 NA
3 09.10.2021 29 29.000000 25.0000000 26.000000 39.000000 NA NA
4 10.10.2021 20 29.000000 25.0000000 31.000000 NA NA NA
5 11.10.2021 20 30.000000 30.0000000 NA NA NA NA
6 12.10.2021 30 30.000000 NA NA NA NA NA
7 01.10.2022 39 NA NA NA NA NA NA
8 总计 1 1.271429 0.9090909 1.106667 1.153846 1.098039 1.24
英文:
library(tidyverse)
d <-
structure(list(Date = c(
"07.10.2021", "08.10.2021", "09.10.2021",
"10.10.2021", "11.10.2021", "12.10.2021", "01.10.2022"
), X1 = c(
20L,
26L, 29L, 20L, 20L, 30L, 39L
), X2 = c(
25L, 25L, 29L, 29L, 30L,
30L, NA
), X3 = c(25L, 25L, 25L, 25L, 30L, NA, NA), X4 = c(
26L,
26L, 26L, 31L, NA, NA, NA
), X5 = c(
26L, 25L, 39L, NA, NA, NA,
NA
), X6 = c(25L, 31L, NA, NA, NA, NA, NA), X7 = c(
31L, NA, NA,
NA, NA, NA, NA
)), class = "data.frame", row.names = c(NA, 7L))
(nd <- names(d))
(ndx <- nd[startsWith(nd, "X")])
(ndx1 <- c(ndx[1], na.omit(dplyr::lag(ndx, 1))))
(result_row <- map2_dbl(ndx, ndx1, \(calc_col, prev_col){
lc <- length(na.omit(d[[calc_col]]))
prev_3 <- sum(tail(head(d[[prev_col]], lc), n = 3))
calc_3 <- sum(tail(head(d[[calc_col]], lc), n = 3))
calc_3 / prev_3
}))
names(result_row) <- ndx
bind_cols(
tibble(Date = "Total"),
enframe(result_row) |> pivot_wider()
)
d |> bind_rows(
bind_cols(
tibble(Date = "Total"),
enframe(result_row) |> pivot_wider()
)
)
Date X1 X2 X3 X4 X5 X6 X7
1 07.10.2021 20 25.000000 25.0000000 26.000000 26.000000 25.000000 31.00
2 08.10.2021 26 25.000000 25.0000000 26.000000 25.000000 31.000000 NA
3 09.10.2021 29 29.000000 25.0000000 26.000000 39.000000 NA NA
4 10.10.2021 20 29.000000 25.0000000 31.000000 NA NA NA
5 11.10.2021 20 30.000000 30.0000000 NA NA NA NA
6 12.10.2021 30 30.000000 NA NA NA NA NA
7 01.10.2022 39 NA NA NA NA NA NA
8 Total 1 1.271429 0.9090909 1.106667 1.153846 1.098039 1.24
</details>
# 答案2
**得分**: 0
我们可以使用这个简单的`sapply`。
首先,我认为你的`1.18`应该改为
```r
(26+26+31)/(25+25+25)
# [1] 1.106667
代码:
tmp <- as.data.frame(matrix(c(1, sapply(3:ncol(d), function(ci) {
ri <- tail(which(!is.na(d[[ci]])), n = 3)
sum(d[[ci]][ri]) / sum(d[[ci-1]][ri])
})), nrow = 1))
rbind(d, cbind(data.frame(Date="Total"), setNames(tmp, names(d)[-1])))
# Date X1 X2 X3 X4 X5 X6 X7
# 1 07.10.2021 20 25.000000 25.0000000 26.000000 26.000000 25.000000 31.00
# 2 08.10.2021 26 25.000000 25.0000000 26.000000 25.000000 31.000000 NA
# 3 09.10.2021 29 29.000000 25.0000000 26.000000 39.000000 NA NA
# 4 10.10.2021 20 29.000000 25.0000000 31.000000 NA NA NA
# 5 11.10.2021 20 30.000000 30.0000000 NA NA NA NA
# 6 12.10.2021 30 30.000000 NA NA NA NA NA
# 7 01.10.2022 39 NA NA NA NA NA NA
# 8 Total 1 1.271429 0.9090909 1.106667 1.153846 1.098039 1.24
用原生管道的说法,它可以(有点)展开为:
fun <- function(ci) {
ri <- tail(which(!is.na(d[[ci]])), n = 3)
sum(d[[ci]][ri]) / sum(d[[ci-1]][ri])
}
tmp <- c(1, sapply(3:ncol(d), fun)) |>
matrix(nrow = 1) |>
`colnames<-`(names(d)[-1])
tmp
# X1 X2 X3 X4 X5 X6 X7
# [1,] 1 1.271429 0.9090909 1.106667 1.153846 1.098039 1.24
out <- data.frame(Date = "Total") |>
cbind(tmp) |>
(\(z) rbind(d, z))()
out
# Date X1 X2 X3 X4 X5 X6 X7
# 1 07.10.2021 20 25.000000 25.0000000 26.000000 26.000000 25.000000 31.00
# 2 08.10.2021 26 25.000000 25.0000000 26.000000 25.000000 31.000000 NA
# 3 09.10.2021 29 29.000000 25.0000000 26.000000 39.000000 NA NA
# 4 10.10.2021 20 29.000000 25.0000000 31.000000 NA NA NA
# 5 11.10.2021 20 30.000000 30.0000000 NA NA NA NA
# 6 12.10.2021 30 30.000000 NA NA NA NA NA
# 7 01.10.2022 39 NA NA NA NA NA NA
# 8 Total 1 1.271429 0.9090909 1.106667 1.153846 1.098039 1.24
英文:
We can use this simple sapply
.
Up front, I think your 1.18
should instead be
(26+26+31)/(25+25+25)
# [1] 1.106667
Code:
tmp <- as.data.frame(matrix(c(1, sapply(3:ncol(d), function(ci) {
ri <- tail(which(!is.na(d[[ci]])), n = 3)
sum(d[[ci]][ri]) / sum(d[[ci-1]][ri])
})), nrow = 1))
rbind(d, cbind(data.frame(Date="Total"), setNames(tmp, names(d)[-1])))
# Date X1 X2 X3 X4 X5 X6 X7
# 1 07.10.2021 20 25.000000 25.0000000 26.000000 26.000000 25.000000 31.00
# 2 08.10.2021 26 25.000000 25.0000000 26.000000 25.000000 31.000000 NA
# 3 09.10.2021 29 29.000000 25.0000000 26.000000 39.000000 NA NA
# 4 10.10.2021 20 29.000000 25.0000000 31.000000 NA NA NA
# 5 11.10.2021 20 30.000000 30.0000000 NA NA NA NA
# 6 12.10.2021 30 30.000000 NA NA NA NA NA
# 7 01.10.2022 39 NA NA NA NA NA NA
# 8 Total 1 1.271429 0.9090909 1.106667 1.153846 1.098039 1.24
In native-pipe speak, it can be (somewhat) spread out as:
fun <- function(ci) {
ri <- tail(which(!is.na(d[[ci]])), n = 3)
sum(d[[ci]][ri]) / sum(d[[ci-1]][ri])
}
tmp <- c(1, sapply(3:ncol(d), fun)) |>
matrix(nrow = 1) |>
`colnames<-`(names(d)[-1])
tmp
# X1 X2 X3 X4 X5 X6 X7
# [1,] 1 1.271429 0.9090909 1.106667 1.153846 1.098039 1.24
out <- data.frame(Date = "Total") |>
cbind(tmp) |>
(\(z) rbind(d, z))()
out
# Date X1 X2 X3 X4 X5 X6 X7
# 1 07.10.2021 20 25.000000 25.0000000 26.000000 26.000000 25.000000 31.00
# 2 08.10.2021 26 25.000000 25.0000000 26.000000 25.000000 31.000000 NA
# 3 09.10.2021 29 29.000000 25.0000000 26.000000 39.000000 NA NA
# 4 10.10.2021 20 29.000000 25.0000000 31.000000 NA NA NA
# 5 11.10.2021 20 30.000000 30.0000000 NA NA NA NA
# 6 12.10.2021 30 30.000000 NA NA NA NA NA
# 7 01.10.2022 39 NA NA NA NA NA NA
# 8 Total 1 1.271429 0.9090909 1.106667 1.153846 1.098039 1.24
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论