Fisher exact test using a tidyr or dplyr approach
Question
I have a data frame with two groups, tumor and normal. For each site (row), I want to run Fisher's exact test on the methylated and unmethylated counts, comparing tumor against normal.
I'm looking for a way to reshape the data so that the test can be computed for each site with a dplyr approach.

    methyl_dat <- data.frame(loci = c("site1", "site2", "site3", "site4"),
                             Methy.tumor = c(50, 5, 60, 12),
                             UnMethy.tumor = c(60, 0, 65, 5),
                             Methy.Normal = c(13, 5, 22, 3),
                             UnMethy.Normal = c(86, 0, 35, 3))

Here is the 2×2 table I want to test, shown for site 1:

                Tumor  Normal
      Methy        50      13
      UnMethy      60      86
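The table for site 1 can be built by hand to check what each per-row test should compute. This is a minimal sketch, not part of the original question: the names `site1` and `res` and the `dimnames` labels are illustrative assumptions, while the counts come from the first row of `methyl_dat`.

```r
# Counts for site1, copied from the first row of methyl_dat in the question.
# matrix() fills column-wise, so the tumor column comes first, then normal.
site1 <- matrix(
  c(50, 60,   # tumor:  Methy, UnMethy
    13, 86),  # normal: Methy, UnMethy
  nrow = 2,
  dimnames = list(
    status = c("Methy", "UnMethy"),  # illustrative labels
    group  = c("Tumor", "Normal")
  )
)

# Fisher's exact test on the 2x2 table
res <- fisher.test(site1)
res$p.value
```

Because `matrix(x, 2)` applied to a row of the four count columns fills the table in this same order, the value here should agree with whatever a per-row solution reports for `site1`.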
# Answer 1
**Score**: 5
Consider doing:

    apply(methyl_dat[-1], 1, \(x) fisher.test(matrix(x, 2)), simplify = FALSE)
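The `apply()` call above returns a list of `htest` objects, one per row. If only the p-values are needed, one possible variant is to extract `$p.value` inside the function so that `apply()` returns a plain numeric vector. This is a sketch under assumptions: `p_vals` and the new `p_val` column are names chosen here for illustration, and `function(x)` is used in place of `\(x)` so the code also runs on R versions older than 4.1.

```r
methyl_dat <- data.frame(loci = c("site1", "site2", "site3", "site4"),
                         Methy.tumor = c(50, 5, 60, 12),
                         UnMethy.tumor = c(60, 0, 65, 5),
                         Methy.Normal = c(13, 5, 22, 3),
                         UnMethy.Normal = c(86, 0, 35, 3))

# For each row, drop the loci column, reshape the four counts into a
# 2x2 table (filled column-wise) and keep only the test's p-value.
p_vals <- apply(methyl_dat[-1], 1, function(x) {
  fisher.test(matrix(x, 2))$p.value
})

# Attach the result as a new column next to the site names.
methyl_dat$p_val <- unname(p_vals)
methyl_dat[c("loci", "p_val")]
```

Keeping the full `htest` list, as in the answer, is the better choice when the odds ratio or confidence interval is also of interest.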
# Answer 2
**Score**: 2
Adding a `dplyr` solution:
```r
library(dplyr, warn.conflicts = FALSE)

data.frame(loci = c("site1", "site2", "site3", "site4"), 
           Methy.tumor = c(50, 5, 60, 12), 
           UnMethy.tumor = c(60, 0, 65, 5), 
           Methy.Normal = c(13, 5, 22, 3),
           UnMethy.Normal = c(86, 0, 35, 3)) %>%
  # every loci value is unique, so each group is a single row
  group_by(loci) %>%
  summarise(
    # c_across() gathers the row's four counts; matrix(., 2) fills the 2x2 table column-wise
    p_val = fisher.test(matrix(c_across(everything()), 2))$p.value
  )
#> # A tibble: 4 × 2
#>   loci        p_val
#>   <chr>       <dbl>
#> 1 site1 0.000000392
#> 2 site2 1          
#> 3 site3 0.263      
#> 4 site4 0.621
```
<sup>Created on 2023-06-01 with reprex v2.0.2</sup>
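The grouped version above works because every `loci` value appears once. As a hedged alternative, `c_across()` is documented for use with `rowwise()`, which does not depend on the site names being unique. The sketch below assumes the same `methyl_dat` data frame as in the question; the result name `res` is illustrative.

```r
library(dplyr, warn.conflicts = FALSE)

methyl_dat <- data.frame(loci = c("site1", "site2", "site3", "site4"),
                         Methy.tumor = c(50, 5, 60, 12),
                         UnMethy.tumor = c(60, 0, 65, 5),
                         Methy.Normal = c(13, 5, 22, 3),
                         UnMethy.Normal = c(86, 0, 35, 3))

res <- methyl_dat %>%
  rowwise() %>%                       # treat each row as its own group
  mutate(
    # select the four count columns explicitly so loci is never included
    p_val = fisher.test(
      matrix(c_across(Methy.tumor:UnMethy.Normal), 2)
    )$p.value
  ) %>%
  ungroup()                           # drop the rowwise grouping afterwards

res
```

Using `mutate()` instead of `summarise()` keeps the original count columns alongside the new `p_val` column.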