Mean by group, excluding selected rows
Question
I'll take this old post as a reference. The modified dataset looks like this:
df <- data.frame(
  dive = factor(sample(c("dive1", "dive2", "dive3", "dive4"), 14, replace = TRUE)),
  speed = runif(14)
)
> df
dive speed
1 dive1 0.627296799
2 dive1 0.288594538
3 dive4 0.598177856
4 dive2 0.371158436
5 dive2 0.827468739
6 dive3 0.485977449
7 dive2 0.151295215
8 dive4 0.773988372
9 dive2 0.567155356
10 dive1 0.008585884
11 dive4 0.433648371
12 dive2 0.759196515
13 dive2 0.641193241
14 dive3 0.089451537
I would like to modify the column speed so that, for dive1 and dive2, it contains the group mean (the same entry for every row of the group), while the rows of the other two groups are left as they are in df.
I tried with if (and, of course, group_by and summarise), but that does not give what I want: I receive a warning message and only 4 results...
df2 <- if(!(df$dive %in% c("dive3", "dive4"))){
summarise(group_by(df, dive), speed = mean(speed))
}
Warning message:
In if (!(df$dive %in% c("dive3", "dive4"))) { :
the condition has length > 1 and only the first element will be used
> df2
# A tibble: 4 x 2
dive speed
<fct> <dbl>
1 dive1 0.860
2 dive2 0.460
3 dive3 0.277
4 dive4 0.330
Answer 1
Score: 4
library(dplyr)
df %>%
  group_by(dive) %>%
  mutate(speed = if (first(dive) %in% c("dive1", "dive2")) mean(speed) else speed) %>%
  ungroup()
# # A tibble: 14 × 2
# dive speed
# <fct> <dbl>
# 1 dive4 0.548
# 2 dive3 0.156
# 3 dive4 0.207
# 4 dive3 0.148
# 5 dive4 0.886
# 6 dive1 0.498
# 7 dive3 0.690
# 8 dive1 0.498
# 9 dive4 0.0968
# 10 dive3 0.596
# 11 dive2 0.447
# 12 dive2 0.447
# 13 dive3 0.859
# 14 dive3 0.663
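The `if` works here, although it failed in the question, because a grouped mutate() evaluates the expression once per group, so the condition is a single TRUE or FALSE rather than a 14-element vector. (In R 4.2.0 and later, a condition of length greater than one is an error rather than a warning.) A minimal sketch of that difference, using the 14 rows printed in the question typed in by hand; the name `target` is a helper introduced only for this illustration:

```r
library(dplyr)

# The 14 rows printed in the question, entered by hand for reproducibility.
df <- data.frame(
  dive = factor(c("dive1", "dive1", "dive4", "dive2", "dive2", "dive3", "dive2",
                  "dive4", "dive2", "dive1", "dive4", "dive2", "dive2", "dive3")),
  speed = c(0.627296799, 0.288594538, 0.598177856, 0.371158436, 0.827468739,
            0.485977449, 0.151295215, 0.773988372, 0.567155356, 0.008585884,
            0.433648371, 0.759196515, 0.641193241, 0.089451537)
)

target <- c("dive1", "dive2")  # hypothetical helper name, not in the original

# Ungrouped, the test yields one logical value per row,
# which `if` cannot use as a single condition.
length(df$dive %in% target)

# Grouped, first(dive) %in% target is a single value per group,
# so `if` can pick between the group mean and the untouched column.
res <- df %>%
  group_by(dive) %>%
  mutate(speed = if (first(dive) %in% target) mean(speed) else speed) %>%
  ungroup()

res
```

Every dive1 and dive2 row of `res` should carry its group mean, while the dive3 and dive4 rows keep their original speed.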
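or, perhaps a little shorter, using the .by argument of mutate() (available in dplyr 1.1.0 and later), which groups for this one call only, so no ungroup() is needed:
df %>%
  mutate(speed = if (first(dive) %in% c("dive1", "dive2")) mean(speed) else speed,
         .by = dive)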
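As a quick check that the two spellings agree, the sketch below runs both on the same random data and compares the resulting speed columns. The seed value and the names `grouped` and `by_arg` are arbitrary choices made for this illustration, and `.by` is assumed to be available (dplyr 1.1.0 or later):

```r
library(dplyr)  # .by requires dplyr >= 1.1.0

set.seed(42)  # arbitrary seed, only so the run is repeatable
df <- data.frame(
  dive  = factor(sample(c("dive1", "dive2", "dive3", "dive4"), 14, replace = TRUE)),
  speed = runif(14)
)

# Persistent grouping, removed again with ungroup().
grouped <- df %>%
  group_by(dive) %>%
  mutate(speed = if (first(dive) %in% c("dive1", "dive2")) mean(speed) else speed) %>%
  ungroup()

# Per-call grouping through .by; the result is never a grouped data frame.
by_arg <- df %>%
  mutate(speed = if (first(dive) %in% c("dive1", "dive2")) mean(speed) else speed,
         .by = dive)

# Both keep the original row order, so the columns can be compared directly.
identical(grouped$speed, by_arg$speed)
```

One visible difference is the return type: the group_by() version comes back as a tibble, while the `.by` version keeps the class of the input data frame.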
If I misunderstood, and you instead want to reduce each of the two groups to a single row while keeping the other groups as-is (not reduced), then perhaps:
df %>%
  filter(dive %in% c("dive1", "dive2")) %>%
  summarize(speed = mean(speed), .by = dive) %>%
  bind_rows(filter(df, !dive %in% c("dive1", "dive2")))
# dive speed
# 1 dive1 0.4983562
# 2 dive2 0.4470575
# 3 dive4 0.5477776
# 4 dive3 0.1558491
# 5 dive4 0.2068528
# 6 dive3 0.1479428
# 7 dive4 0.8858552
# 8 dive3 0.6896862
# 9 dive4 0.0967569
# 10 dive3 0.5961494
# 11 dive3 0.8593978
# 12 dive3 0.6634452
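A rough way to sanity-check the reduced result is to count rows: there should be one row per selected group that actually occurs in the data, plus every row of the untouched groups. The sketch below assumes dplyr 1.1.0 or later for `.by`; the seed and the names `target` and `reduced` are introduced only for this illustration:

```r
library(dplyr)  # .by requires dplyr >= 1.1.0

set.seed(1)  # arbitrary seed, only so the run is repeatable
df <- data.frame(
  dive  = factor(sample(c("dive1", "dive2", "dive3", "dive4"), 14, replace = TRUE)),
  speed = runif(14)
)

target <- c("dive1", "dive2")  # hypothetical helper name, not in the original

# Collapse the selected groups to their means, then append the other rows unchanged.
reduced <- df %>%
  filter(dive %in% target) %>%
  summarize(speed = mean(speed), .by = dive) %>%
  bind_rows(filter(df, !dive %in% target))

# Expected size: one row per selected group present, plus all untouched rows.
expected_rows <- n_distinct(df$dive[df$dive %in% target]) + sum(!df$dive %in% target)
nrow(reduced) == expected_rows
```

Note that the summarised rows come first, so the original row order of df is not preserved in this variant.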