Sum up columns until the cell in adjacent column is NA and then divide each of them by the preceding column[SOLVED]

Question

I'm trying to add up columns in a data table until the adjacent column has a value of NA. For example, if I had the table:

DATE        1   2   3
07/01/2022  2  12  12
08/01/2022  1   1  NA
09/01/2022  3  NA  NA

In column 1 I would only add up to row 2, because row 3 in column 2 is NA; in column 2 I would only add row 1, because row 2 in column 3 is NA. I would then divide each of those columns by the preceding column. I'd also want to ignore the date. Is there a way to do this without hard-coding it? This would be the desired end row:

TOTAL 3 12 12

I tried using "adorn_table" and excluding NAs, but that didn't seem to do it.

Answer 1

Score: 2

For the original question, try:

library(tidyverse)

# create made-up data
df <- tibble(
  x = c(2, 1, 3),
  y = c(12, 1, NA)
)

df |>
  filter(cumsum(is.na(y)) == 0) |>  # drop every row from the first NA in y onward
  summarize(                        # add up the remaining values of x
    s = sum(x)
  ) |>
  pull(s)
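
To see why the filter works, it can help to look at the intermediate values. This is a small sketch on a made-up vector (the trailing 5 is added here only for illustration and is not part of the question's data); it shows that every position from the first NA onward is excluded, even if later values are not NA.

```r
# Made-up vector: an NA in position 3, followed by a non-NA value
y <- c(12, 1, NA, 5)

is.na(y)            # FALSE FALSE  TRUE FALSE
cumsum(is.na(y))    # 0 0 1 1  (running count of NAs seen so far)

# TRUE only while no NA has been seen yet
keep <- cumsum(is.na(y)) == 0
keep                # TRUE  TRUE FALSE FALSE
```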

For the clarified question, I'm assuming the number of columns is arbitrary.

library(tidyverse)

# create a function to do the work
f <- function(.x, .y) {
  keep_rows <- cumsum(is.na(.y)) == 0
  sum(.x[keep_rows])
}

# create made-up data
df <- tibble(
  date = 1:3, # dates in your case, I'm keeping it simple
  x = c(2, 1, 3),
  y = c(12, 1, NA),
  z = c(12, NA, NA)
)

total_row <- map2(
  df |> select(x:z),                # select the columns being summed
  df |> select(y, z1 = z, z2 = z),  # if you want the last column to depend on itself, you'll need to do something like this
  f
)

bind_rows(
  df |> mutate(date = as.character(date)),
  c(date = "Total", total_row)
)
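
If the column names are not known in advance, the pairing of each column with its right-hand neighbour can probably be built from the names instead of being typed out. The following is a sketch in base R, assuming the first column is called date and every other column should be totalled; value_cols and next_cols are names introduced here for illustration, and Map stands in for map2.

```r
# Same helper as above: sum .x over the rows before the first NA in .y
f <- function(.x, .y) {
  keep_rows <- cumsum(is.na(.y)) == 0
  sum(.x[keep_rows])
}

# Made-up data, as a plain data frame
df <- data.frame(
  date = 1:3,
  x = c(2, 1, 3),
  y = c(12, 1, NA),
  z = c(12, NA, NA)
)

# Columns to total: everything except the (assumed) date column
value_cols <- setdiff(names(df), "date")

# Right-hand neighbour of each column; the last column is paired with itself
next_cols <- c(value_cols[-1], tail(value_cols, 1))

# Named list of totals, one per value column
total_row <- Map(f, df[value_cols], df[next_cols])
unlist(total_row)
```

With the data above this should give totals of 3, 12 and 12 for x, y and z, matching the desired row in the question.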

Answer 2

Score: 0

1) Define a function sumToNA which sums x up to the position just before the first NA in y. If y has no NAs, it sums all of x.

Then use mapply to apply that function to DF without its first column, paired with DF without its first two columns and with the last column repeated. No packages are used.

sumToNA <- function(x, y) sum(x[!cummax(is.na(y))], na.rm = TRUE)
n <- ncol(DF)
rbind(DF, c("Total", mapply(sumToNA, DF[2:n], DF[c(3:n, n)])))
##         DATE 1    2    3
## 1 07/01/2022 2   12   12
## 2 08/01/2022 1    1 <NA>
## 3 09/01/2022 3 <NA> <NA>
## 4      Total 3   12   12

2) Define a function NAfwd which takes a vector and returns it with every element from the first NA onward set to NA and all remaining (non-NA) elements set to 0. Apply NAfwd to each column of DF without its first two columns and with the last column repeated, then add the result to DF[-1]. Then use colSums with na.rm = TRUE to sum the columns of that and insert the result as a new row.

NAfwd <- \(z) 0 * Reduce(\(x, y) if (is.na(x)) x else y, z, acc = TRUE)

n <- ncol(DF)
rbind(DF, c("Total", colSums(DF[-1] + apply(DF[c(3:n, n)], 2, NAfwd), na.rm = TRUE)))
##         DATE 1    2    3
## 1 07/01/2022 2   12   12
## 2 08/01/2022 1    1 <NA>
## 3 09/01/2022 3 <NA> <NA>
## 4      Total 3   12   12
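
As a quick check of what NAfwd does on its own, here is a small sketch on a made-up vector (the trailing 5 is not in the question's data and is only there to show that values after the first NA are also masked). Note that the \(z) lambda shorthand needs R 4.1 or later.

```r
# NAfwd as defined above: carry the first NA forward, then zero out the rest
NAfwd <- \(z) 0 * Reduce(\(x, y) if (is.na(x)) x else y, z, acc = TRUE)

v <- c(12, 1, NA, 5)
masked <- NAfwd(v)
masked   # 0 0 NA NA
```

Adding such a mask to the column on its left leaves the rows before the first NA unchanged and turns the rest into NA, which is why colSums with na.rm = TRUE then yields the truncated sum.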

Note

The input DF used above, in reproducible form:

DF <- data.frame(
  DATE = c("07/01/2022", "08/01/2022", "09/01/2022"),
  `1` = c(2L, 1L, 3L),
  `2` = c(12L, 1L, NA),
  `3` = c(12L, NA, NA),
  check.names = FALSE
)
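
The question also asks to divide each column by the preceding one, which the code above does not cover. A possible sketch, under the assumption that this means dividing each column's total by the total of the column to its left (the question does not spell this out, and the name ratios is introduced here for illustration):

```r
DF <- data.frame(
  DATE = c("07/01/2022", "08/01/2022", "09/01/2022"),
  `1` = c(2L, 1L, 3L),
  `2` = c(12L, 1L, NA),
  `3` = c(12L, NA, NA),
  check.names = FALSE
)

# Totals computed as in 1) above
sumToNA <- function(x, y) sum(x[!cummax(is.na(y))], na.rm = TRUE)
n <- ncol(DF)
totals <- mapply(sumToNA, DF[2:n], DF[c(3:n, n)])

# Assumed reading: each total divided by the total of the preceding column.
# The first column has no predecessor, so it gets no ratio.
ratios <- totals[-1] / totals[-length(totals)]
ratios   # column 2: 12 / 3 = 4, column 3: 12 / 12 = 1
```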

huangapple
  • Posted on 2023-06-06 02:16:28
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76409018.html