Get mean of every n rows for multiple columns in R
Question
I feel like this should be straightforward, but I can't find an existing answer to my question. I have a data frame df:
df <- data.frame(ID = c('a', 'b', 'c', 'c1', 'd', 'e', 'f', 'g', 'h', 'h1'),
                 var2 = c(7, 9, 2, 4, 3, 6, 8, 2, 1, 2),
                 var3 = c(21, 50, 40, 30, 29, 45, 33, 51, 70, 46))
I'd like to get the mean of every n rows (n = 2 in this example) for columns var2 and var3 separately, so that the output looks like this:
  var2 var3
1  8.0 35.5
2  3.0 35.0
3  4.5 37.0
4  5.0 42.0
5  1.5 58.0
It would be a bonus if I could keep the first ID of each pair of rows, e.g.:
  ID var2 var3
1  a  8.0 35.5
2  c  3.0 35.0
3  d  4.5 37.0
4  f  5.0 42.0
5  h  1.5 58.0
Thanks in advance.
Answer 1
Score: 0
We need to add a grouping column, and then this is a standard grouped mean:
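library(dplyr)  # the .by argument of summarize() requires dplyr >= 1.1.0
n <- 2          # number of rows per group
df |>
  # rows 1..n get group 1, rows n+1..2n get group 2, and so on
  mutate(group = ((row_number() - 1) %/% n) + 1) |>
  summarize(
    first_id = first(ID),
    across(starts_with("var"), mean),
    .by = group
  )
#   group first_id var2 var3
# 1     1        a  8.0 35.5
# 2     2        c  3.0 35.0
# 3     3        d  4.5 37.0
# 4     4        f  5.0 42.0
# 5     5        h  1.5 58.0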
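For anyone who would rather avoid the dplyr dependency, the same grouping idea can be sketched in base R. This is only an illustration, not part of the answer above: the names group, means, and result are made up here, and the sketch assumes the rows are already in the order you want to group them in.

```r
# Same data as in the question.
df <- data.frame(
  ID   = c("a", "b", "c", "c1", "d", "e", "f", "g", "h", "h1"),
  var2 = c(7, 9, 2, 4, 3, 6, 8, 2, 1, 2),
  var3 = c(21, 50, 40, 30, 29, 45, 33, 51, 70, 46)
)
n <- 2  # number of rows per group

# Integer division maps rows 1..n to group 1, rows n+1..2n to group 2, etc.
group <- (seq_len(nrow(df)) - 1) %/% n + 1

# Mean of each numeric column within each group (result is sorted by group).
means <- aggregate(df[c("var2", "var3")], by = list(group = group), FUN = mean)

# The first row of each group is the one where the group value is not a repeat.
result <- data.frame(
  ID = df$ID[!duplicated(group)],
  means[c("var2", "var3")]
)

print(result)
```

If nrow(df) is not a multiple of n, the last group simply contains fewer rows and its mean is taken over whatever rows remain.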