2023年6月29日 21:38:51go评论94阅读模式

英文:

Calculate mean of group without 0

问题

我明白了，你想要计算每列中每个组的均值，但不包括0值。这是你所期望的结果：

             W          X    Y    Z
A_rep1       0         15   10   2   
A_rep2       2         14    4   8
A_rep3      15          0    8  22
A_rep4      18          3    0   9
A_rep5       0         12    0  22    
B_rep1      13          0   12  80 
B_rep2      14          2   10  45
B_rep3      15         10   12  36 
B_rep4      89         18   22  78 
B_rep5       0         22   16   0
C_rep1      15          0   12   0
C_rep2      12          0   17   5
C_rep3      22          4   50  47
C_rep4       0         14    0   0 
C_rep5       0          5    4  12 
Mean_A      11.66  
Mean_B
Mean_C

如果你需要更多的帮助或有其他问题，请告诉我。

英文:

I have data set like this :

          W   X    Y    Z
A_rep1     0  15   10   2   
A_rep2     2  14    4   8
A_rep3    15   0    8  22
A_rep4    18   3    0   9
A_rep5     0  12    0  22    
B_rep1    13   0   12  80 
B_rep2    14   2   10  45
B_rep3    15  10   12  36 
B_rep4    89  18   22  78 
B_rep5     0  22   16   0
C_rep1    15   0   12   0
C_rep2    12   0   17   5
C_rep3    22   4   50  47
C_rep4     0  14    0   0 
C_rep5     0   5    4  12

And i want to calculate the mean of each group in every column but without the 0 values
for example for W column group A i want to calculate the mean of A_rep2 A_rep3 A_rep4 (because i have 0 values in A_rep1 and A_rep5
so the result i want is new rows that contain the mean of each group like this

             W          X    Y    Z
A_rep1       0         15   10   2   
A_rep2       2         14    4   8
A_rep3      15          0    8  22
A_rep4      18          3    0   9
A_rep5       0         12    0  22    
B_rep1      13          0   12  80 
B_rep2      14          2   10  45
B_rep3      15         10   12  36 
B_rep4      89         18   22  78 
B_rep5       0         22   16   0
C_rep1      15          0   12   0
C_rep2      12          0   17   5
C_rep3      22          4   50  47
C_rep4       0         14    0   0 
C_rep5       0          5    4  12 
Mean_A      11.66  
Mean_B
Mean_C

Please tell me if you need any other clarification of explanations

Thank you

答案1

得分: 2

library(tidyverse)

df %>%
  rownames_to_column('groups') %>%
  group_by(groups = sub('_.*', '', groups)) %>%
  summarise(across(everything(), ~mean(.x[.x != 0])))

# A tibble: 3 × 5
  groups     W     X     Y     Z
  <chr>  <dbl> <dbl> <dbl> <dbl>
1 A       11.7    11  7.33  12.6
2 B       32.8    13 14.4   59.8
3 C       16.3   7.67 20.8   21.3

英文:

Using dplyr,

library(tidyverse)

df %&gt;% 
 rownames_to_column(&#39;groups&#39;) %&gt;% 
 group_by(groups = sub(&#39;_.*&#39;, &#39;&#39;, groups)) %&gt;% 
 summarise(across(everything(), ~mean(.x[.x != 0])))

# A tibble: 3 &#215; 5
  groups     W     X     Y     Z
  &lt;chr&gt;  &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;
1 A       11.7 11     7.33  12.6
2 B       32.8 13    14.4   59.8
3 C       16.3  7.67 20.8   21.3

答案2

得分: 1

以下是翻译好的部分：

> df$group=unlist(lapply(strsplit(rownames(df),"_"),"[[",1))
> aggregate(.~group,data=df,FUN=function(x){mean(x[x!=0])})

  group        W         X         Y        Z
1     A 11.66667 11.000000  7.333333 12.60000
2     B 32.75000 13.000000 14.400000 59.75000
3     C 16.33333  7.666667 20.750000 21.33333

英文:

Your output is not quite clear, so here is an attempt

&gt; df$group=unlist(lapply(strsplit(rownames(df),&quot;_&quot;),&quot;[[&quot;,1))
&gt; aggregate(.~group,data=df,FUN=function(x){mean(x[x!=0])})

  group        W         X         Y        Z
1     A 11.66667 11.000000  7.333333 12.60000
2     B 32.75000 13.000000 14.400000 59.75000
3     C 16.33333  7.666667 20.750000 21.33333

答案3

得分: 0

这是一个使用dplyr的解决方案。您可以使用separate_wider_delim()将组字母与观察标识的其余部分分开。然后，通过组字母对数据进行分组，可以对所有变量进行汇总，仅取不等于零的变量值的平均值。接下来，您可以更改组值以在其前附加"Mean "。最后，您可以将汇总数据附加到原始数据上。

library(dplyr)
library(tidyr)
tab <- read.table(header=TRUE, textConnection("
 W   X    Y    Z
A_rep1     0  15   10   2   
A_rep2     2  14    4   8
A_rep3    15   0    8  22
A_rep4    18   3    0   9
A_rep5     0  12    0  22    
B_rep1    13   0   12  80 
B_rep2    14   2   10  45
B_rep3    15  10   12  36 
B_rep4    89  18   22  78 
B_rep5     0  22   16   0
C_rep1    15   0   12   0
C_rep2    12   0   17   5
C_rep3    22   4   50  47
C_rep4     0  14    0   0 
C_rep5     0   5    4  12")) 


tab <- tab %>%
  as_tibble(rownames = "obs") %>%
  separate_wider_delim(obs, names = c("group", "rep"), delim="_") 
tab %>%
  group_by(group) %>%
  summarise(across(W:Z, function(x)mean(x[which(x != 0)]))) %>%
  mutate(group = paste("Mean ", group, sep="")) %>%
  bind_rows(tab, .)
#> # A tibble: 18 × 6
#>    group  rep       W     X     Y     Z
#>    <chr>  <chr> <dbl> <dbl> <dbl> <dbl>
#>  1 A      rep1      0    15    10     2  
#>  2 A      rep2      2    14     4     8  
#>  3 A      rep3     15     0     8    22  
#>  4 A      rep4     18     3     0     9  
#>  5 A      rep5      0    12     0    22  
#>  6 B      rep1     13     0    12    80  
#>  7 B      rep2     14     2    10    45  
#>  8 B      rep3     15    10    12    36  
#>  9 B      rep4     89    18    22    78  
#> 10 B      rep5      0    22    16     0  
#> 11 C      rep1     15     0    12     0  
#> 12 C      rep2     12     0    17     5  
#> 13 C      rep3     22     4    50    47  
#> 14 C      rep4      0    14     0     0  
#> 15 C      rep5      0     5     4    12  
#> 16 Mean A <NA>    11.7  11     7.33  12.6
#> 17 Mean B <NA>    32.8  13    14.4   59.8
#> 18 Mean C <NA>    16.3   7.67 20.8   21.3

^{创建于2023-06-29，使用reprex v2.0.2}

英文:

Here's a dplyr solution. You can use separate_wider_delim() to separate the group letter from the rest of the observation identifier. Then, grouping by the group letter, you can summarize across all variables, taking the mean of only those values of the variable not equal to zero. Next, you can change the group values to append them with "Mean ". Finally, you can append the summarized data to the original data.

library(dplyr)
library(tidyr)
tab &lt;- read.table(header=TRUE, textConnection(&quot;
 W   X    Y    Z
A_rep1     0  15   10   2   
A_rep2     2  14    4   8
A_rep3    15   0    8  22
A_rep4    18   3    0   9
A_rep5     0  12    0  22    
B_rep1    13   0   12  80 
B_rep2    14   2   10  45
B_rep3    15  10   12  36 
B_rep4    89  18   22  78 
B_rep5     0  22   16   0
C_rep1    15   0   12   0
C_rep2    12   0   17   5
C_rep3    22   4   50  47
C_rep4     0  14    0   0 
C_rep5     0   5    4  12&quot;)) 


tab &lt;- tab %&gt;% 
  as_tibble(rownames = &quot;obs&quot;) %&gt;% 
  separate_wider_delim(obs, names = c(&quot;group&quot;, &quot;rep&quot;), delim=&quot;_&quot;) 
tab %&gt;% 
  group_by(group) %&gt;% 
  summarise(across(W:Z, function(x)mean(x[which(x != 0)]))) %&gt;% 
  mutate(group = paste(&quot;Mean &quot;, group, sep=&quot;&quot;)) %&gt;% 
  bind_rows(tab, .)
#&gt; # A tibble: 18 &#215; 6
#&gt;    group  rep       W     X     Y     Z
#&gt;    &lt;chr&gt;  &lt;chr&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;
#&gt;  1 A      rep1    0   15    10      2  
#&gt;  2 A      rep2    2   14     4      8  
#&gt;  3 A      rep3   15    0     8     22  
#&gt;  4 A      rep4   18    3     0      9  
#&gt;  5 A      rep5    0   12     0     22  
#&gt;  6 B      rep1   13    0    12     80  
#&gt;  7 B      rep2   14    2    10     45  
#&gt;  8 B      rep3   15   10    12     36  
#&gt;  9 B      rep4   89   18    22     78  
#&gt; 10 B      rep5    0   22    16      0  
#&gt; 11 C      rep1   15    0    12      0  
#&gt; 12 C      rep2   12    0    17      5  
#&gt; 13 C      rep3   22    4    50     47  
#&gt; 14 C      rep4    0   14     0      0  
#&gt; 15 C      rep5    0    5     4     12  
#&gt; 16 Mean A &lt;NA&gt;   11.7 11     7.33  12.6
#&gt; 17 Mean B &lt;NA&gt;   32.8 13    14.4   59.8
#&gt; 18 Mean C &lt;NA&gt;   16.3  7.67 20.8   21.3

<sup>Created on 2023-06-29 with reprex v2.0.2</sup>

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

Calculate mean of group without 0.

问题

答案1

答案2

答案3

Pandas Yahoo Finance: AttributeError: ‘Index’ 对象没有属性 ‘tz_localize’

“R CMD Check依赖错误“条件长度大于1””

如何合并两个数据框并进行数据透视。

创建一个基于匹配字符串的新列。

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

发表评论