Find Top5 min and max values per group in dataframe (determine min and max differentially expressed genes)


Question

In a data frame of differential expression results, find the 5 genes per group that are most strongly up- and down-regulated (= highest and lowest log fold change).
My attempt with mutate() and slice_max()/slice_min() does not work.

example data

set.seed(47)
gene_creator <- paste("gene", 1:100, sep = "")
genes <- sample(gene_creator, 8)

dex_A <- data.frame(
  gene = genes,
  group = "group_A",
  logFC = sample(c(-5:5), replace = TRUE, size = 8),
  FDR = sample(c(0.01, 1), replace = TRUE, size = 8)
)

dex_B <- data.frame(
  gene = genes,
  group = "group_B",
  logFC = sample(c(-5:5), replace = TRUE, size = 8),
  FDR = sample(c(0.01, 1), replace = TRUE, size = 8)
)

dex_df <- rbind(dex_A, dex_B)

solution (not working)

library(tidyverse)
dex_df %>%
  filter(FDR < 0.05) %>%
  group_by(group) %>%
  mutate(
    top5 = slice_max(logFC, n = 5),
    min5 = slice_min(logFC, n = 5))

Answer 1

Score: 3

You cannot call slice_max() or slice_min() inside mutate(): both take a data frame as their first argument and return a subset of its rows, whereas mutate() expects each expression to yield a column vector. Use them separately and row-bind the results, i.e.

bind_rows(dex_df %>%
            filter(FDR < 0.05) %>%
            group_by(group) %>%
            slice_max(logFC, n = 5) %>%
            mutate(rank = "top5"),
          dex_df %>%
            filter(FDR < 0.05) %>%
            group_by(group) %>%
            slice_min(logFC, n = 5) %>%
            mutate(rank = "min5"))
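
As a variation on the above, here is a minimal sketch that filters and groups only once and then reuses the result for both ends. It is not part of the original answer. The names make_dex, sig, top_and_min and flagged are made up for illustration, and the data only mimics the shape of the question's example. Note that slice_max() and slice_min() default to with_ties = TRUE, which can return more than n rows per group when logFC values tie; the sketch sets with_ties = FALSE on the assumption that you want at most 5 rows. The second part shows a mutate()-based alternative, closer to the original attempt, that flags rows with logical columns instead of subsetting them.

```r
library(dplyr)

# Example data with the same shape as in the question
set.seed(47)
genes <- sample(paste0("gene", 1:100), 8)

# Hypothetical helper: one data frame of results per group
make_dex <- function(group_name) {
  data.frame(
    gene  = genes,
    group = group_name,
    logFC = sample(-5:5, size = 8, replace = TRUE),
    FDR   = sample(c(0.01, 1), size = 8, replace = TRUE)
  )
}
dex_df <- rbind(make_dex("group_A"), make_dex("group_B"))

# Filter and group once, then reuse for both ends
sig <- dex_df %>%
  filter(FDR < 0.05) %>%
  group_by(group)

# Variant 1: subset rows, label them via the .id column
top_and_min <- bind_rows(
  top5 = slice_max(sig, logFC, n = 5, with_ties = FALSE),
  min5 = slice_min(sig, logFC, n = 5, with_ties = FALSE),
  .id = "rank"
) %>%
  ungroup()

# Variant 2: keep all significant rows, flag them inside mutate()
# row_number() ranks within each group and breaks ties by order of appearance
flagged <- sig %>%
  mutate(
    top5 = row_number(desc(logFC)) <= 5,
    min5 = row_number(logFC) <= 5
  ) %>%
  ungroup()
```

With only 8 genes per group, fewer than 5 may pass the FDR filter; in that case the same gene shows up under both labels, which is expected rather than a bug.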

huangapple
  • Posted on April 17, 2023, 16:56:35
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76033360.html