2023-06-08 01:24:43

Fisher exact test for 2 consecutive rows in a data frame in R

Question

I have a data frame in which each site has count data for tumor and normal samples. I want to run a Fisher exact test for each position (chromosome, start, end), using count_unmethylated and count_methylated for tumor and normal.

So for the first position:

chromosome start   end
1          10469   10469

I want to run the Fisher exact test on this table:

        count_unmethylated  count_methylated
  norm  0                   2
  tum   1                   3

and do the same for the rest of the loci (chromosome, start, end).
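
For reference, the test described above for the first position can be sketched in base R as follows. This is only an illustration built from the question's numbers; the object names `tab` and `res` are made up for the example.

```r
# 2x2 table for chromosome 1, position 10469 (counts from the question's data).
# matrix() fills column by column: unmethylated counts first, then methylated.
tab <- matrix(c(0, 1, 2, 3), nrow = 2,
              dimnames = list(group = c("norm", "tum"),
                              count = c("count_unmethylated", "count_methylated")))

# fisher.test() from the stats package accepts a 2x2 count matrix directly
res <- fisher.test(tab)
res$p.value
```

The remaining positions would need the same table built from their own two rows.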

I tried adapting the solution from a previous question, but it didn't work:
https://stackoverflow.com/questions/66216780/row-wise-fisher-exact-test-grouped-by-samples-in-r

head(tumNorm_dt_merged_long) %>%
  group_by(chromosome, start, end) %>%
  summarise(data = list(row_wise_fisher_test(as.matrix(select(cur_data(),
                        starts_with('count_'))), p.adjust.method = "BH"), ncol=2)) %>%
  unnest_wider(data) %>%
  unnest(c(group:p.adj.signif)) -> Fisher_result

My data looks like this:

dput(head(tumNorm_dt_merged_long))
structure(list(chromosome = c("1", "1", "1", "1", "1", "1"), 
    start = c(10469L, 10469L, 10470L, 10470L, 10471L, 10471L), 
    end = c(10469L, 10469L, 10470L, 10470L, 10471L, 10471L), 
    group = c("norm", "tum", "norm", "tum", "norm", "tum"), count_methylated = c(2, 
    3, 3, 2, 1, 2), count_unmethylated = c(0, 1, 0, 0, 1, 2), 
    methylation_percentage = c(100, 75, 100, 100, 50, 50)), row.names = c(NA, 
-6L), class = c("data.table", "data.frame"), .internal.selfref = <pointer: 0x130baa0>, sorted = c("chromosome", 
"start", "end", "group"))

Answer 1

Score: 1

Here is a solution using base R. Split the data frame on the start column (this assumes exactly 2 rows per unique start value), then use lapply to run Fisher's exact test on columns 5 and 6 (count_methylated and count_unmethylated) of each piece.

# Example data from the question (the .internal.selfref pointer is dropped so the call can be run)
tumNorm_dt_merged_long <- structure(list(chromosome = c("1", "1", "1", "1", "1", "1"), 
               start = c(10469L, 10469L, 10470L, 10470L, 10471L, 10471L), 
               end = c(10469L, 10469L, 10470L, 10470L, 10471L, 10471L), 
               group = c("norm", "tum", "norm", "tum", "norm", "tum"), 
               count_methylated = c(2, 3, 3, 2, 1, 2), 
               count_unmethylated = c(0, 1, 0, 0, 1, 2), 
               methylation_percentage = c(100, 75, 100, 100, 50, 50)), 
          row.names = c(NA, -6L), class = c("data.table", "data.frame"), sorted = c("chromosome", "start", "end", "group"))

# One two-row data frame (norm, tum) per unique start value
dflist <- split(tumNorm_dt_merged_long, tumNorm_dt_merged_long$start)

# Run Fisher's exact test on the two count columns of each piece
output <- lapply(dflist, function(x){
   print(x)
   results <- fisher.test(x[, c(5, 6)])
   print(results)
   results   # the htest object is kept in the returned list
})
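
If a single table of p-values per locus is wanted, the same idea could be extended along these lines. This is a rough sketch and not part of the answer above: it splits on all three position columns rather than on start alone, selects the count columns by name, and adds a Benjamini-Hochberg adjustment with `p.adjust` because the question's attempt asked for "BH". The names `dat`, `by_locus` and `fisher_by_locus` are made up for the example, and the data is rebuilt as a plain data.frame.

```r
# Same example data as above, rebuilt as a plain data.frame
dat <- data.frame(
  chromosome = rep("1", 6),
  start = c(10469L, 10469L, 10470L, 10470L, 10471L, 10471L),
  end   = c(10469L, 10469L, 10470L, 10470L, 10471L, 10471L),
  group = rep(c("norm", "tum"), 3),
  count_methylated   = c(2, 3, 3, 2, 1, 2),
  count_unmethylated = c(0, 1, 0, 0, 1, 2)
)

# Split on chromosome, start and end; drop = TRUE skips empty combinations
by_locus <- split(dat, list(dat$chromosome, dat$start, dat$end), drop = TRUE)

fisher_by_locus <- do.call(rbind, lapply(by_locus, function(x) {
  stopifnot(nrow(x) == 2)  # expects exactly one norm row and one tum row
  tab <- as.matrix(x[, c("count_unmethylated", "count_methylated")])
  rownames(tab) <- x$group
  data.frame(chromosome = x$chromosome[1], start = x$start[1], end = x$end[1],
             p.value = fisher.test(tab)$p.value)
}))
rownames(fisher_by_locus) <- NULL

# Multiple-testing correction across loci (Benjamini-Hochberg)
fisher_by_locus$p.adj <- p.adjust(fisher_by_locus$p.value, method = "BH")
fisher_by_locus
```

Splitting on all three columns avoids merging loci that share a start value on different chromosomes, which splitting on start alone would do.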
