英文:
Calculate mean of group without 0
问题
我明白了,你想要计算每列中每个组的均值,但不包括0值。这是你所期望的结果:
W X Y Z
A_rep1 0 15 10 2
A_rep2 2 14 4 8
A_rep3 15 0 8 22
A_rep4 18 3 0 9
A_rep5 0 12 0 22
B_rep1 13 0 12 80
B_rep2 14 2 10 45
B_rep3 15 10 12 36
B_rep4 89 18 22 78
B_rep5 0 22 16 0
C_rep1 15 0 12 0
C_rep2 12 0 17 5
C_rep3 22 4 50 47
C_rep4 0 14 0 0
C_rep5 0 5 4 12
Mean_A 11.66
Mean_B
Mean_C
如果你需要更多的帮助或有其他问题,请告诉我。
英文:
I have data set like this :
W X Y Z
A_rep1 0 15 10 2
A_rep2 2 14 4 8
A_rep3 15 0 8 22
A_rep4 18 3 0 9
A_rep5 0 12 0 22
B_rep1 13 0 12 80
B_rep2 14 2 10 45
B_rep3 15 10 12 36
B_rep4 89 18 22 78
B_rep5 0 22 16 0
C_rep1 15 0 12 0
C_rep2 12 0 17 5
C_rep3 22 4 50 47
C_rep4 0 14 0 0
C_rep5 0 5 4 12
And i want to calculate the mean of each group in every column but without the 0 values
for example for W column group A i want to calculate the mean of A_rep2 A_rep3 A_rep4 (because i have 0 values in A_rep1 and A_rep5
so the result i want is new rows that contain the mean of each group like this
W X Y Z
A_rep1 0 15 10 2
A_rep2 2 14 4 8
A_rep3 15 0 8 22
A_rep4 18 3 0 9
A_rep5 0 12 0 22
B_rep1 13 0 12 80
B_rep2 14 2 10 45
B_rep3 15 10 12 36
B_rep4 89 18 22 78
B_rep5 0 22 16 0
C_rep1 15 0 12 0
C_rep2 12 0 17 5
C_rep3 22 4 50 47
C_rep4 0 14 0 0
C_rep5 0 5 4 12
Mean_A 11.66
Mean_B
Mean_C
Please tell me if you need any other clarification of explanations
Thank you
答案1
得分: 2
library(tidyverse)
df %>%
rownames_to_column('groups') %>%
group_by(groups = sub('_.*', '', groups)) %>%
summarise(across(everything(), ~mean(.x[.x != 0])))
# A tibble: 3 × 5
groups W X Y Z
<chr> <dbl> <dbl> <dbl> <dbl>
1 A 11.7 11 7.33 12.6
2 B 32.8 13 14.4 59.8
3 C 16.3 7.67 20.8 21.3
英文:
Using dplyr
,
library(tidyverse)
df %>%
rownames_to_column('groups') %>%
group_by(groups = sub('_.*', '', groups)) %>%
summarise(across(everything(), ~mean(.x[.x != 0])))
# A tibble: 3 × 5
groups W X Y Z
<chr> <dbl> <dbl> <dbl> <dbl>
1 A 11.7 11 7.33 12.6
2 B 32.8 13 14.4 59.8
3 C 16.3 7.67 20.8 21.3
答案2
得分: 1
以下是翻译好的部分:
> df$group=unlist(lapply(strsplit(rownames(df),"_"),"[[",1))
> aggregate(.~group,data=df,FUN=function(x){mean(x[x!=0])})
group W X Y Z
1 A 11.66667 11.000000 7.333333 12.60000
2 B 32.75000 13.000000 14.400000 59.75000
3 C 16.33333 7.666667 20.750000 21.33333
英文:
Your output is not quite clear, so here is an attempt
> df$group=unlist(lapply(strsplit(rownames(df),"_"),"[[",1))
> aggregate(.~group,data=df,FUN=function(x){mean(x[x!=0])})
group W X Y Z
1 A 11.66667 11.000000 7.333333 12.60000
2 B 32.75000 13.000000 14.400000 59.75000
3 C 16.33333 7.666667 20.750000 21.33333
答案3
得分: 0
这是一个使用dplyr
的解决方案。您可以使用separate_wider_delim()
将组字母与观察标识的其余部分分开。然后,通过组字母对数据进行分组,可以对所有变量进行汇总,仅取不等于零的变量值的平均值。接下来,您可以更改组值以在其前附加"Mean "。最后,您可以将汇总数据附加到原始数据上。
library(dplyr)
library(tidyr)
tab <- read.table(header=TRUE, textConnection("
W X Y Z
A_rep1 0 15 10 2
A_rep2 2 14 4 8
A_rep3 15 0 8 22
A_rep4 18 3 0 9
A_rep5 0 12 0 22
B_rep1 13 0 12 80
B_rep2 14 2 10 45
B_rep3 15 10 12 36
B_rep4 89 18 22 78
B_rep5 0 22 16 0
C_rep1 15 0 12 0
C_rep2 12 0 17 5
C_rep3 22 4 50 47
C_rep4 0 14 0 0
C_rep5 0 5 4 12"))
tab <- tab %>%
as_tibble(rownames = "obs") %>%
separate_wider_delim(obs, names = c("group", "rep"), delim="_")
tab %>%
group_by(group) %>%
summarise(across(W:Z, function(x)mean(x[which(x != 0)]))) %>%
mutate(group = paste("Mean ", group, sep="")) %>%
bind_rows(tab, .)
#> # A tibble: 18 × 6
#> group rep W X Y Z
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 A rep1 0 15 10 2
#> 2 A rep2 2 14 4 8
#> 3 A rep3 15 0 8 22
#> 4 A rep4 18 3 0 9
#> 5 A rep5 0 12 0 22
#> 6 B rep1 13 0 12 80
#> 7 B rep2 14 2 10 45
#> 8 B rep3 15 10 12 36
#> 9 B rep4 89 18 22 78
#> 10 B rep5 0 22 16 0
#> 11 C rep1 15 0 12 0
#> 12 C rep2 12 0 17 5
#> 13 C rep3 22 4 50 47
#> 14 C rep4 0 14 0 0
#> 15 C rep5 0 5 4 12
#> 16 Mean A <NA> 11.7 11 7.33 12.6
#> 17 Mean B <NA> 32.8 13 14.4 59.8
#> 18 Mean C <NA> 16.3 7.67 20.8 21.3
创建于2023-06-29,使用reprex v2.0.2
英文:
Here's a dplyr
solution. You can use separate_wider_delim()
to separate the group letter from the rest of the observation identifier. Then, grouping by the group letter, you can summarize across all variables, taking the mean of only those values of the variable not equal to zero. Next, you can change the group values to append them with "Mean ". Finally, you can append the summarized data to the original data.
library(dplyr)
library(tidyr)
tab <- read.table(header=TRUE, textConnection("
W X Y Z
A_rep1 0 15 10 2
A_rep2 2 14 4 8
A_rep3 15 0 8 22
A_rep4 18 3 0 9
A_rep5 0 12 0 22
B_rep1 13 0 12 80
B_rep2 14 2 10 45
B_rep3 15 10 12 36
B_rep4 89 18 22 78
B_rep5 0 22 16 0
C_rep1 15 0 12 0
C_rep2 12 0 17 5
C_rep3 22 4 50 47
C_rep4 0 14 0 0
C_rep5 0 5 4 12"))
tab <- tab %>%
as_tibble(rownames = "obs") %>%
separate_wider_delim(obs, names = c("group", "rep"), delim="_")
tab %>%
group_by(group) %>%
summarise(across(W:Z, function(x)mean(x[which(x != 0)]))) %>%
mutate(group = paste("Mean ", group, sep="")) %>%
bind_rows(tab, .)
#> # A tibble: 18 × 6
#> group rep W X Y Z
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 A rep1 0 15 10 2
#> 2 A rep2 2 14 4 8
#> 3 A rep3 15 0 8 22
#> 4 A rep4 18 3 0 9
#> 5 A rep5 0 12 0 22
#> 6 B rep1 13 0 12 80
#> 7 B rep2 14 2 10 45
#> 8 B rep3 15 10 12 36
#> 9 B rep4 89 18 22 78
#> 10 B rep5 0 22 16 0
#> 11 C rep1 15 0 12 0
#> 12 C rep2 12 0 17 5
#> 13 C rep3 22 4 50 47
#> 14 C rep4 0 14 0 0
#> 15 C rep5 0 5 4 12
#> 16 Mean A <NA> 11.7 11 7.33 12.6
#> 17 Mean B <NA> 32.8 13 14.4 59.8
#> 18 Mean C <NA> 16.3 7.67 20.8 21.3
<sup>Created on 2023-06-29 with reprex v2.0.2</sup>
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论