Calculate mean of group without 0.

huangapple go评论74阅读模式
英文:

Calculate mean of group without 0

问题

我明白了,你想要计算每列中每个组的均值,但不包括0值。这是你所期望的结果:

             W          X    Y    Z
A_rep1       0         15   10   2   
A_rep2       2         14    4   8
A_rep3      15          0    8  22
A_rep4      18          3    0   9
A_rep5       0         12    0  22    
B_rep1      13          0   12  80 
B_rep2      14          2   10  45
B_rep3      15         10   12  36 
B_rep4      89         18   22  78 
B_rep5       0         22   16   0
C_rep1      15          0   12   0
C_rep2      12          0   17   5
C_rep3      22          4   50  47
C_rep4       0         14    0   0 
C_rep5       0          5    4  12 
Mean_A      11.66  
Mean_B
Mean_C

如果你需要更多的帮助或有其他问题,请告诉我。

英文:

I have data set like this :

          W   X    Y    Z
A_rep1     0  15   10   2   
A_rep2     2  14    4   8
A_rep3    15   0    8  22
A_rep4    18   3    0   9
A_rep5     0  12    0  22    
B_rep1    13   0   12  80 
B_rep2    14   2   10  45
B_rep3    15  10   12  36 
B_rep4    89  18   22  78 
B_rep5     0  22   16   0
C_rep1    15   0   12   0
C_rep2    12   0   17   5
C_rep3    22   4   50  47
C_rep4     0  14    0   0 
C_rep5     0   5    4  12 

And i want to calculate the mean of each group in every column but without the 0 values
for example for W column group A i want to calculate the mean of A_rep2 A_rep3 A_rep4 (because i have 0 values in A_rep1 and A_rep5
so the result i want is new rows that contain the mean of each group like this

             W          X    Y    Z
A_rep1       0         15   10   2   
A_rep2       2         14    4   8
A_rep3      15          0    8  22
A_rep4      18          3    0   9
A_rep5       0         12    0  22    
B_rep1      13          0   12  80 
B_rep2      14          2   10  45
B_rep3      15         10   12  36 
B_rep4      89         18   22  78 
B_rep5       0         22   16   0
C_rep1      15          0   12   0
C_rep2      12          0   17   5
C_rep3      22          4   50  47
C_rep4       0         14    0   0 
C_rep5       0          5    4  12 
Mean_A      11.66  
Mean_B
Mean_C  

Please tell me if you need any other clarification of explanations

Thank you

答案1

得分: 2

library(tidyverse)

df %>%
  rownames_to_column('groups') %>%
  group_by(groups = sub('_.*', '', groups)) %>%
  summarise(across(everything(), ~mean(.x[.x != 0])))

# A tibble: 3 × 5
  groups     W     X     Y     Z
  <chr>  <dbl> <dbl> <dbl> <dbl>
1 A       11.7    11  7.33  12.6
2 B       32.8    13 14.4   59.8
3 C       16.3   7.67 20.8   21.3
英文:

Using dplyr,

library(tidyverse)

df %&gt;% 
 rownames_to_column(&#39;groups&#39;) %&gt;% 
 group_by(groups = sub(&#39;_.*&#39;, &#39;&#39;, groups)) %&gt;% 
 summarise(across(everything(), ~mean(.x[.x != 0])))

# A tibble: 3 &#215; 5
  groups     W     X     Y     Z
  &lt;chr&gt;  &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;
1 A       11.7 11     7.33  12.6
2 B       32.8 13    14.4   59.8
3 C       16.3  7.67 20.8   21.3

答案2

得分: 1

以下是翻译好的部分:

> df$group=unlist(lapply(strsplit(rownames(df),"_"),"[[",1))
> aggregate(.~group,data=df,FUN=function(x){mean(x[x!=0])})

  group        W         X         Y        Z
1     A 11.66667 11.000000  7.333333 12.60000
2     B 32.75000 13.000000 14.400000 59.75000
3     C 16.33333  7.666667 20.750000 21.33333
英文:

Your output is not quite clear, so here is an attempt

&gt; df$group=unlist(lapply(strsplit(rownames(df),&quot;_&quot;),&quot;[[&quot;,1))
&gt; aggregate(.~group,data=df,FUN=function(x){mean(x[x!=0])})

  group        W         X         Y        Z
1     A 11.66667 11.000000  7.333333 12.60000
2     B 32.75000 13.000000 14.400000 59.75000
3     C 16.33333  7.666667 20.750000 21.33333

答案3

得分: 0

这是一个使用dplyr的解决方案。您可以使用separate_wider_delim()将组字母与观察标识的其余部分分开。然后,通过组字母对数据进行分组,可以对所有变量进行汇总,仅取不等于零的变量值的平均值。接下来,您可以更改组值以在其前附加"Mean "。最后,您可以将汇总数据附加到原始数据上。

library(dplyr)
library(tidyr)
tab <- read.table(header=TRUE, textConnection("
 W   X    Y    Z
A_rep1     0  15   10   2   
A_rep2     2  14    4   8
A_rep3    15   0    8  22
A_rep4    18   3    0   9
A_rep5     0  12    0  22    
B_rep1    13   0   12  80 
B_rep2    14   2   10  45
B_rep3    15  10   12  36 
B_rep4    89  18   22  78 
B_rep5     0  22   16   0
C_rep1    15   0   12   0
C_rep2    12   0   17   5
C_rep3    22   4   50  47
C_rep4     0  14    0   0 
C_rep5     0   5    4  12")) 


tab <- tab %>%
  as_tibble(rownames = "obs") %>%
  separate_wider_delim(obs, names = c("group", "rep"), delim="_") 
tab %>%
  group_by(group) %>%
  summarise(across(W:Z, function(x)mean(x[which(x != 0)]))) %>%
  mutate(group = paste("Mean ", group, sep="")) %>%
  bind_rows(tab, .)
#> # A tibble: 18 × 6
#>    group  rep       W     X     Y     Z
#>    <chr>  <chr> <dbl> <dbl> <dbl> <dbl>
#>  1 A      rep1      0    15    10     2  
#>  2 A      rep2      2    14     4     8  
#>  3 A      rep3     15     0     8    22  
#>  4 A      rep4     18     3     0     9  
#>  5 A      rep5      0    12     0    22  
#>  6 B      rep1     13     0    12    80  
#>  7 B      rep2     14     2    10    45  
#>  8 B      rep3     15    10    12    36  
#>  9 B      rep4     89    18    22    78  
#> 10 B      rep5      0    22    16     0  
#> 11 C      rep1     15     0    12     0  
#> 12 C      rep2     12     0    17     5  
#> 13 C      rep3     22     4    50    47  
#> 14 C      rep4      0    14     0     0  
#> 15 C      rep5      0     5     4    12  
#> 16 Mean A <NA>    11.7  11     7.33  12.6
#> 17 Mean B <NA>    32.8  13    14.4   59.8
#> 18 Mean C <NA>    16.3   7.67 20.8   21.3

创建于2023-06-29,使用reprex v2.0.2

英文:

Here's a dplyr solution. You can use separate_wider_delim() to separate the group letter from the rest of the observation identifier. Then, grouping by the group letter, you can summarize across all variables, taking the mean of only those values of the variable not equal to zero. Next, you can change the group values to append them with "Mean ". Finally, you can append the summarized data to the original data.

library(dplyr)
library(tidyr)
tab &lt;- read.table(header=TRUE, textConnection(&quot;
 W   X    Y    Z
A_rep1     0  15   10   2   
A_rep2     2  14    4   8
A_rep3    15   0    8  22
A_rep4    18   3    0   9
A_rep5     0  12    0  22    
B_rep1    13   0   12  80 
B_rep2    14   2   10  45
B_rep3    15  10   12  36 
B_rep4    89  18   22  78 
B_rep5     0  22   16   0
C_rep1    15   0   12   0
C_rep2    12   0   17   5
C_rep3    22   4   50  47
C_rep4     0  14    0   0 
C_rep5     0   5    4  12&quot;)) 


tab &lt;- tab %&gt;% 
  as_tibble(rownames = &quot;obs&quot;) %&gt;% 
  separate_wider_delim(obs, names = c(&quot;group&quot;, &quot;rep&quot;), delim=&quot;_&quot;) 
tab %&gt;% 
  group_by(group) %&gt;% 
  summarise(across(W:Z, function(x)mean(x[which(x != 0)]))) %&gt;% 
  mutate(group = paste(&quot;Mean &quot;, group, sep=&quot;&quot;)) %&gt;% 
  bind_rows(tab, .)
#&gt; # A tibble: 18 &#215; 6
#&gt;    group  rep       W     X     Y     Z
#&gt;    &lt;chr&gt;  &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;
#&gt;  1 A      rep1    0   15    10      2  
#&gt;  2 A      rep2    2   14     4      8  
#&gt;  3 A      rep3   15    0     8     22  
#&gt;  4 A      rep4   18    3     0      9  
#&gt;  5 A      rep5    0   12     0     22  
#&gt;  6 B      rep1   13    0    12     80  
#&gt;  7 B      rep2   14    2    10     45  
#&gt;  8 B      rep3   15   10    12     36  
#&gt;  9 B      rep4   89   18    22     78  
#&gt; 10 B      rep5    0   22    16      0  
#&gt; 11 C      rep1   15    0    12      0  
#&gt; 12 C      rep2   12    0    17      5  
#&gt; 13 C      rep3   22    4    50     47  
#&gt; 14 C      rep4    0   14     0      0  
#&gt; 15 C      rep5    0    5     4     12  
#&gt; 16 Mean A &lt;NA&gt;   11.7 11     7.33  12.6
#&gt; 17 Mean B &lt;NA&gt;   32.8 13    14.4   59.8
#&gt; 18 Mean C &lt;NA&gt;   16.3  7.67 20.8   21.3

<sup>Created on 2023-06-29 with reprex v2.0.2</sup>

huangapple
  • 本文由 发表于 2023年6月29日 21:38:51
  • 转载请务必保留本文链接:https://go.coder-hub.com/76581587.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定