June 6, 2023 02:16:28

Sum up columns until the cell in adjacent column is NA and then divide each of them by the preceding column[SOLVED]

Question

I'm trying to add up columns in a data table until the adjacent column has a value of NA. For example, if I had the table:

DATE	1	2	3
07/01/2022	2	12	12
08/01/2022	1	1	NA
09/01/2022	3	NA	NA

I would only add up to row 2 in column 1, because row 3 in column 2 is NA, and only row 1 in column 2, because row 2 in column 3 is NA. I would then divide each of those columns by the preceding column. I'd also want to ignore the date. Is there a way to do this without hard coding it? This would be the desired end row:

TOTAL	3	12	12

I tried using "adorn_table" and excluding NAs, but that didn't seem to do it.

Answer 1

Score: 2

For the original question, try:

library(tidyverse)
# create made-up data
df <- tibble(
  x = c(2, 1, 3),
  y = c(12, 1, NA)
)
df |>
  filter(cumsum(is.na(y)) == 0) |>  # drop every row from the first NA in y onward
  summarize(                        # add up the remaining values of x
    s = sum(x)
  ) |>
  pull(s)
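
To see why the filter condition works, it can help to look at the intermediate vectors. The following is only an illustrative sketch in base R using the same made-up values; the names `na_seen`, `keep`, `y2` and `keep2` are not part of the answer.

```r
x <- c(2, 1, 3)
y <- c(12, 1, NA)

na_seen <- cumsum(is.na(y))  # running count of NAs seen so far in y: 0 0 1
keep <- na_seen == 0         # TRUE only for rows before the first NA

print(keep)          # TRUE TRUE FALSE
print(sum(x[keep]))  # 3

# A non-NA value after the first NA does not switch the rows back on,
# because the running count never returns to zero.
y2 <- c(1, NA, 5)
keep2 <- cumsum(is.na(y2)) == 0
print(keep2)         # TRUE FALSE FALSE
```

The running count is what makes the cut-off permanent: a plain `!is.na(y)` would let later non-NA rows back into the sum.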

For the clarified question, I'm assuming the number of columns is arbitrary.

library(tidyverse)
# create a function to do the work
f <- function(.x, .y) {
  keep_rows <- cumsum(is.na(.y)) == 0
  sum(.x[keep_rows])
}
# create made-up data
df <- tibble(
  date = 1:3, # dates in your case, I'm keeping it simple
  x = c(2, 1, 3),
  y = c(12, 1, NA),
  z = c(12, NA, NA)
)
total_row <- map2(
  df |> select(x:z),  # select the columns being summed
  df |> select(y, z1 = z, z2 = z), # If you want the last column to depend on itself, you'll need to do something like this
  f
)
bind_rows(
  df |> mutate(date = as.character(date)),
  c(date = "Total", total_row)
)

Answer 2

Score: 0

1) Define a function sumToNA which sums x to the position prior to the first NA in y. If there are no NAs it sums x entirely.

Then mapply that function over DF without its first column and over DF without its first two columns but with the last column repeated, so that the last column is paired with itself. No packages are used.

sumToNA <- function(x, y) sum(x[!cummax(is.na(y))], na.rm = TRUE)
n <- ncol(DF)
rbind(DF, c("Total", mapply(sumToNA, DF[2:n], DF[c(3:n, n)])))
##         DATE 1    2    3
## 1 07/01/2022 2   12   12
## 2 08/01/2022 1    1 <NA>
## 3 09/01/2022 3 <NA> <NA>
## 4      Total 3   12   12

2) Define a function NAfwd which takes a vector and returns a vector of the same length in which every element from the first NA onward is NA and every earlier element is 0. Apply NAfwd to each column of DF without the first two columns but with the last column repeated, and add the result to DF[-1]; this turns every value that should be excluded into NA and leaves the rest unchanged. Then use colSums with na.rm = TRUE to sum each column and insert the result as a new row.

NAfwd <- \(z) 0 * Reduce(\(x, y) if (is.na(x)) x else y, z, accumulate = TRUE)
n <- ncol(DF)
rbind(DF, c("Total", colSums(DF[-1] + apply(DF[c(3:n, n)], 2, NAfwd), na.rm = TRUE)))
##         DATE 1    2    3
## 1 07/01/2022 2   12   12
## 2 08/01/2022 1    1 <NA>
## 3 09/01/2022 3 <NA> <NA>
## 4      Total 3   12   12

Note

DF <- data.frame(
  DATE = c("07/01/2022", "08/01/2022", "09/01/2022"),
  `1` = c(2L, 1L, 3L),
  `2` = c(12L, 1L, NA),
  `3` = c(12L, NA, NA),
  check.names = FALSE
)
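
The question also asks to divide by the preceding column, which the code in this answer does not cover. The following is only a sketch of one possible reading, assuming that "divide" means each column total divided by the total of the column to its left; the names `totals` and `ratios` are illustrative. It reuses sumToNA and DF as defined in this answer and keeps the totals numeric rather than pasting them into a character row.

```r
DF <- data.frame(
  DATE = c("07/01/2022", "08/01/2022", "09/01/2022"),
  `1` = c(2L, 1L, 3L),
  `2` = c(12L, 1L, NA),
  `3` = c(12L, NA, NA),
  check.names = FALSE
)

sumToNA <- function(x, y) sum(x[!cummax(is.na(y))], na.rm = TRUE)
n <- ncol(DF)

# Named numeric vector of totals, one per value column (DATE is skipped)
totals <- mapply(sumToNA, DF[2:n], DF[c(3:n, n)])

# Assumed interpretation: each total divided by the total of the column
# to its left. The first value column has no predecessor, so it is dropped.
ratios <- totals[-1] / totals[-length(totals)]

print(totals)
print(ratios)
```

With the sample data the totals are 3, 12 and 12, so the ratios for columns 2 and 3 are 4 and 1. If the intended division is row by row rather than on the totals, this sketch does not apply as written.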
