Is there an R function for collapsing characters into one cell if they have a matching character in another cell?


Question

I have a dataframe with two columns of characters that looks like this:

name gene
GO:00001 Gene_1
GO:00001 Gene_2
GO:00002 Gene_3
GO:00002 Gene_4
GO:00002 Gene_5

But I need to collapse the columns so that the "name" column isn't repetitive and the "gene" column contains each gene that matches to the same "name", separated by a comma and a space, like so:

name gene
GO:00001 Gene_1, Gene_2
GO:00002 Gene_3, Gene_4, Gene_5

I have looked into the documentation for melt, collapse, and summarize, but I can't figure out how to do this with characters. Any help is much appreciated!!

Answer 1

Score: 0

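Using dplyr (the separator is ", " to match the comma-plus-space output asked for in the question):

> library(dplyr)  # provides %>%, group_by(), summarise()
> df %>%
    group_by(name) %>%
    summarise(gene = paste(gene, collapse = ", "))
# A tibble: 2 × 2
  name     gene                  
  <chr>    <chr>                 
1 GO:00001 Gene_1, Gene_2        
2 GO:00002 Gene_3, Gene_4, Gene_5

Using base R (collapse is passed through to paste; without it the genes in a group are not joined into one string):

aggregate(gene ~ name, data = df, FUN = paste, collapse = ", ")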
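For anyone who wants to try the base R route end to end, here is a minimal sketch. The data frame is rebuilt by hand from the example in the question, so how `df` was originally created is an assumption. The sketch uses `toString()`, which joins a vector's elements with ", ", in place of an explicit `collapse` argument. Whichever paste-style function is used, it has to collapse each group into a single string; otherwise the genes stay as separate elements.

```r
# Assumed reconstruction of the question's data frame (not the asker's real data).
df <- data.frame(
  name = c("GO:00001", "GO:00001", "GO:00002", "GO:00002", "GO:00002"),
  gene = c("Gene_1", "Gene_2", "Gene_3", "Gene_4", "Gene_5"),
  stringsAsFactors = FALSE
)

# Group by name and join each group's genes into one comma-separated string.
# toString(x) behaves like paste(x, collapse = ", ").
collapsed <- aggregate(gene ~ name, data = df, FUN = toString)

print(collapsed)
```

The result should have one row per distinct `name`, with the `gene` column holding a single string per row, which can be written out with `write.csv()` or similar if needed.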

huangapple
  • Posted on February 7, 2023, 05:02:52
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75366549.html