How to do conditional summation in R
Question
I am trying to do a conditional summation based on a table with columns such as program, location, date, and value (reproducible data is in the EDIT below).
I want to sum the column "value", grouped by location. Normally I would just do this:
library(dplyr)

data <- file %>%
  group_by(location, date) %>%
  summarize(value = sum(value))
However, only for the Location "Central" I would like to exclude the Program "B". So I tried this way, but it did not work:
data <- file %>%
  group_by(location, date) %>%
  summarize(value =
    case_when(location == "Central" ~ filter(program != "B")),
    TRUE ~ sum(value)
  )
If someone could please help me with the code above, I would much appreciate that.
Thank you
EDIT:
Here is the reproducible data using dput:
structure(list(pid = c(123, 123, 123, 123, 123, 123, 123,
123, 123, 123), program = c("A", "A",
"A", "A", "A",
"A", "A", "A",
"A", "A"), location = c("Central",
"Central", "Central", "Central", "Central", "Central", "Central",
"Central", "Central", "Central"), locationid = c("123-Central",
"123-Central", "123-Central", "123-Central", "123-Central",
"123-Central", "123-Central", "123-Central", "123-Central",
"123-Central"), date = structure(c(1302480000, 1305072000, 1307750400,
1310342400, 1313020800, 1315699200, 1318291200, 1323561600, 1326240000,
1328918400), tzone = "UTC", class = c("POSIXct", "POSIXt")),
value = c(37207.43, -56936.95, -52871, 6980.05, 10703.16,
4006.1, 6505.3, 9661.29, 6897.26, 7212.87)), row.names = c(NA,
-10L), class = c("tbl_df", "tbl", "data.frame"))
Answer 1
Score: 2
filter() works on a whole data frame. You can filter first:
file |>
  filter(!(program == "B" & location == "Central")) |>
  group_by(location, date) |>
  summarize(value = sum(value))
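As a quick check of the filter-first approach, here is a minimal sketch on made-up data. The sales tibble and its values are hypothetical, not taken from the question; they only serve to include a program "B" row in more than one location.

```r
library(dplyr)

# Hypothetical sample data (not from the question): two locations, two programs
sales <- tibble(
  location = c("Central", "Central", "Central", "North", "North"),
  program  = c("A", "B", "A", "A", "B"),
  date     = as.Date(c("2011-04-11", "2011-04-11", "2011-05-11",
                       "2011-04-11", "2011-04-11")),
  value    = c(10, 5, 20, 1, 2)
)

# Drop only the rows that are both Central and program B, then sum as usual
totals <- sales %>%
  filter(!(location == "Central" & program == "B")) %>%
  group_by(location, date) %>%
  summarize(value = sum(value), .groups = "drop")

# Central 2011-04-11 sums to 10 (the B row of 5 is excluded);
# North 2011-04-11 sums to 3 (its B row is kept)
print(totals)
```

One thing to be aware of: a Central group made up entirely of program "B" rows is filtered away and will not appear in the result at all.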
Or you can use a vector subsetting function like [ inside sum(), like this:
data <- file |>
  group_by(location, date) |>
  summarize(value =
    case_when(
      # first() keeps the condition length 1, so each group yields one row
      first(location) == "Central" ~ sum(value[program != "B"]),
      TRUE ~ sum(value)
    )
  )
But you can't call filter() on a vector/column. Nor can you use it as a result, as in location == "Central" ~ filter(program != "B"), when you want the result to be a sum.
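The subsetting idea can also be written as a single logical mask inside sum(), without any branching. This is only a sketch under assumptions: the sales tibble below is made-up data, not the question's, and it deliberately includes a Central group that contains nothing but program "B".

```r
library(dplyr)

# Hypothetical sample data (not from the question)
sales <- tibble(
  location = c("Central", "Central", "North", "North"),
  program  = c("B", "B", "A", "B"),
  date     = as.Date(rep("2011-04-11", 4)),
  value    = c(5, 7, 1, 2)
)

totals <- sales %>%
  group_by(location, date) %>%
  summarize(
    # Keep every row except Central rows of program B
    value = sum(value[!(location == "Central" & program == "B")]),
    .groups = "drop"
  )

# Central stays in the result with a total of 0; North sums to 3
print(totals)
```

Unlike filtering first, this keeps a Central group whose rows are all program "B", reporting it with a total of 0 because sum() of an empty vector is 0.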