mean by group, excluding selected rows
Question
I'll take this old post as a reference. The modified dataset looks like this:
df <- data.frame(dive = factor(sample(c("dive1","dive2","dive3","dive4"), 14, replace=TRUE)),
                 speed = runif(14)
)
> df
dive speed
1 dive1 0.627296799
2 dive1 0.288594538
3 dive4 0.598177856
4 dive2 0.371158436
5 dive2 0.827468739
6 dive3 0.485977449
7 dive2 0.151295215
8 dive4 0.773988372
9 dive2 0.567155356
10 dive1 0.008585884
11 dive4 0.433648371
12 dive2 0.759196515
13 dive2 0.641193241
14 dive3 0.089451537
I would like to modify the column speed so that it contains the mean per group (the same entry for every row of the group) for dive1 and dive2, and leave the other two groups unchanged (keep df as it is).
I tried with if (and, of course, group_by and summarise), but that's not what I want: I get a warning message and only 4 results.
df2 <- if(!(df$dive %in% c("dive3", "dive4"))){
  summarise(group_by(df, dive), speed = mean(speed))
}
Warning message:
In if (!(df$dive %in% c("dive3", "dive4"))) { :
the condition has length > 1 and only the first element will be used
> df2
# A tibble: 4 x 2
dive speed
<fct> <dbl>
1 dive1 0.860
2 dive2 0.460
3 dive3 0.277
4 dive4 0.330
Answer 1
Score: 4
library(dplyr)

df %>%
  group_by(dive) %>%
  mutate(speed = if (first(dive) %in% c("dive1", "dive2")) mean(speed) else speed) %>%
  ungroup()
# # A tibble: 14 × 2
# dive speed
# <fct> <dbl>
# 1 dive4 0.548
# 2 dive3 0.156
# 3 dive4 0.207
# 4 dive3 0.148
# 5 dive4 0.886
# 6 dive1 0.498
# 7 dive3 0.690
# 8 dive1 0.498
# 9 dive4 0.0968
# 10 dive3 0.596
# 11 dive2 0.447
# 12 dive2 0.447
# 13 dive3 0.859
# 14 dive3 0.663
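The if works here because, inside a grouped mutate(), the expression is evaluated once per group, so first(dive) is a single value and the condition has length 1. The attempt in the question tested the whole dive column at once, which is what triggered the warning (in R 4.2.0 and later it is an error). Below is a minimal sketch to check the behaviour on a small hand-made data frame. The name toy and its values are made up for illustration and are not the data from the question.

```r
library(dplyr)

# Hypothetical toy data, not the data from the question
toy <- data.frame(
  dive  = factor(c("dive1", "dive1", "dive2", "dive3", "dive3")),
  speed = c(1, 3, 5, 7, 9)
)

res <- toy %>%
  group_by(dive) %>%
  # first(dive) has length 1 within each group, so a plain if/else is safe
  mutate(speed = if (first(dive) %in% c("dive1", "dive2")) mean(speed) else speed) %>%
  ungroup()

res$speed
# [1] 2 2 5 7 9
```

The dive1 rows both become 2 (the mean of 1 and 3), the single dive2 row stays at 5, and the dive3 rows are untouched.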
Or, perhaps a little shorter, using the .by argument of mutate() (available in dplyr 1.1.0 and later):
df %>%
  mutate(speed = if (first(dive) %in% c("dive1", "dive2")) mean(speed) else speed,
         .by = dive)
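For comparison, the same result can probably be had without dplyr. This is not part of the answer above, just a base R sketch that swaps in ave(), which returns per-group values in a vector the same length as its input. As before, toy is hypothetical data.

```r
# Hypothetical toy data, not the data from the question
toy <- data.frame(
  dive  = factor(c("dive1", "dive1", "dive2", "dive3", "dive3")),
  speed = c(1, 3, 5, 7, 9)
)

# Rows belonging to the groups whose values should be replaced by the mean
sel <- toy$dive %in% c("dive1", "dive2")

# ave() computes the group mean for every row; ifelse() keeps the
# original value for rows outside the selected groups
toy$speed <- ifelse(sel, ave(toy$speed, toy$dive, FUN = mean), toy$speed)

toy$speed
# [1] 2 2 5 7 9
```

Here ifelse() is the vectorised counterpart of if, so testing the whole column at once is fine.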
If I misunderstood, and instead you want to reduce the two groups to a single row while keeping other groups as-is (not reduced), then perhaps:
df %>%
  filter(dive %in% c("dive1", "dive2")) %>%
  summarize(speed = mean(speed), .by = dive) %>%
  bind_rows(filter(df, !dive %in% c("dive1", "dive2")))
# dive speed
# 1 dive1 0.4983562
# 2 dive2 0.4470575
# 3 dive4 0.5477776
# 4 dive3 0.1558491
# 5 dive4 0.2068528
# 6 dive3 0.1479428
# 7 dive4 0.8858552
# 8 dive3 0.6896862
# 9 dive4 0.0967569
# 10 dive3 0.5961494
# 11 dive3 0.8593978
# 12 dive3 0.6634452
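Note that in this variant the summarized rows come first and the untouched rows follow, so the original row order is not preserved. A small sketch with hypothetical data (toy and keep are names invented for this example) makes that visible:

```r
library(dplyr)

# Hypothetical toy data; a dive3 row deliberately comes first
toy <- data.frame(
  dive  = factor(c("dive3", "dive1", "dive1", "dive2", "dive3")),
  speed = c(7, 1, 3, 5, 9)
)

keep <- c("dive1", "dive2")

res <- toy %>%
  filter(dive %in% keep) %>%
  summarize(speed = mean(speed), .by = dive) %>%  # .by needs dplyr >= 1.1.0
  bind_rows(filter(toy, !dive %in% keep))         # untouched rows are appended

res$speed
# [1] 2 5 7 9
```

If the original order matters, one option is to add a row index before splitting and sort by it afterwards.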