Fisher's exact test using a tidyr or dplyr approach


Question

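I have a data frame with two groups, Tumor and Normal. For each site (row), I want to run Fisher's exact test comparing the Methy and UnMethy counts between Tumor and Normal.

I'm looking for a way to transform the data so that the test can be computed for each site using a dplyr approach.

    methyl_dat <- data.frame(loci = c("site1", "site2", "site3", "site4"),
               Methy.tumor = c(50, 5, 60, 12),
               UnMethy.tumor = c(60, 0, 65, 5),
               Methy.Normal = c(13, 5, 22, 3),
               UnMethy.Normal = c(86, 0, 35, 3))

Here is the intended layout of the Fisher's exact test for site 1:

                  Tumor  Normal
      Methy          50      13
      UnMethy        60      86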
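To make the layout above concrete, here is a minimal sketch that builds the site 1 table by hand and passes it to `fisher.test()`. The object names `site1` and `site1_test` and the dimension labels are illustrative choices, not part of the original question.

```r
# 2x2 contingency table for site1, taken from the first row of methyl_dat.
# matrix() fills column by column, so the first two values form the
# Tumor column and the last two form the Normal column.
site1 <- matrix(
  c(50, 60, 13, 86),
  nrow = 2,
  dimnames = list(
    status = c("Methy", "UnMethy"),
    group  = c("Tumor", "Normal")
  )
)

# Two-sided Fisher's exact test on the table
site1_test <- fisher.test(site1)
site1_test$p.value
```

The same column-wise fill is what makes `matrix(x, 2)` work when `x` is a whole row of counts in the order Methy.tumor, UnMethy.tumor, Methy.Normal, UnMethy.Normal.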


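# Answer 1
**Score**: 5

Consider applying `fisher.test()` to each row with `apply()`. `matrix(x, 2)` reshapes the four counts of a row into a 2x2 table, filled column by column. Both the `\(x)` lambda syntax and the `simplify` argument of `apply()` require R 4.1.0 or later:

    apply(methyl_dat[-1], 1, \(x) fisher.test(matrix(x, 2)), simplify = FALSE)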
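The `apply()` call returns a list of full test objects rather than a table of p-values. As a hedged sketch of one way to collect them, the following pulls the p-value out of each result with `vapply()`; the names `tests` and `results` are assumptions made for illustration.

```r
methyl_dat <- data.frame(
  loci = c("site1", "site2", "site3", "site4"),
  Methy.tumor = c(50, 5, 60, 12),
  UnMethy.tumor = c(60, 0, 65, 5),
  Methy.Normal = c(13, 5, 22, 3),
  UnMethy.Normal = c(86, 0, 35, 3)
)

# One fisher.test() result (class "htest") per row of counts
tests <- apply(methyl_dat[-1], 1, \(x) fisher.test(matrix(x, 2)), simplify = FALSE)

# Extract the p-value from each test and pair it with its locus
results <- data.frame(
  loci  = methyl_dat$loci,
  p_val = vapply(tests, \(t) t$p.value, numeric(1))
)
results
```

`vapply()` with `numeric(1)` is used instead of `sapply()` so that an unexpected result shape raises an error rather than silently changing the output type.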



# Answer 2
**Score**: 2

Adding a `dplyr` solution:
```r
library(dplyr, warn.conflicts = FALSE)
data.frame(loci = c("site1", "site2", "site3", "site4"), 
           Methy.tumor = c(50, 5, 60, 12), 
           UnMethy.tumor = c(60, 0, 65, 5), 
           Methy.Normal = c(13, 5, 22, 3),
           UnMethy.Normal = c(86, 0, 35, 3) ) %>%
  group_by(loci) %>%  # each locus is a single row, so one test per group
  summarise(
    # c_across() collects the four counts; matrix(., 2) fills column by column
    p_val = fisher.test(matrix(c_across(everything()), 2))$p.value
  )
#> # A tibble: 4 × 2
#>   loci        p_val
#>   <chr>       <dbl>
#> 1 site1 0.000000392
#> 2 site2 1          
#> 3 site3 0.263      
#> 4 site4 0.621
```

<sup>Created on 2023-06-01 with reprex v2.0.2</sup>
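Grouping by `loci` works here only because every locus occupies exactly one row. A possible variant, sketched under the assumption that a reasonably recent dplyr (1.0.0 or later) is installed, uses `rowwise()` so the one-test-per-row intent is explicit. The name `res` is illustrative.

```r
library(dplyr, warn.conflicts = FALSE)

methyl_dat <- data.frame(
  loci = c("site1", "site2", "site3", "site4"),
  Methy.tumor = c(50, 5, 60, 12),
  UnMethy.tumor = c(60, 0, 65, 5),
  Methy.Normal = c(13, 5, 22, 3),
  UnMethy.Normal = c(86, 0, 35, 3)
)

res <- methyl_dat %>%
  rowwise(loci) %>%  # treat each row as its own group; loci is kept as an id
  summarise(
    # everything() skips the id column, leaving the four counts
    p_val = fisher.test(matrix(c_across(everything()), 2))$p.value,
    .groups = "drop"
  )
res
```

Passing `loci` to `rowwise()` keeps it in the summarised output and excludes it from `c_across(everything())`, so only the count columns reach `matrix()`.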


huangapple
  • Published on 2023-06-02 09:26:55
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76386620.html