R function for transforming comma-separated values in a cell into multiple rows with same row name?
Question
I have a two-column dataframe in R: the first column is a broad category, and the second column contains comma-separated items within the broad category. This is what it looks like:
Orthogroup | Sequences |
---|---|
0 | Seq1, Seq2, Seq3 |
1 | Seq4 |
And this is what I would like it to look like:
Orthogroup | Sequence |
---|---|
0 | Seq1 |
0 | Seq2 |
0 | Seq3 |
1 | Seq4 |
To be honest I'm not even really sure where to start... any help is much appreciated!
Answer 1
Score: 2
You can accomplish this with separate_rows() from the tidyr package.
library(tidyverse)
Orthogroup <- c(0, 1)
Sequences <- c("Seq1, Seq2, Seq3", "Seq4")
df <- data.frame(Orthogroup, Sequences)
df %>%
  separate_rows(Sequences, sep = ", ")
#> # A tibble: 4 × 2
#>   Orthogroup Sequences
#>        <dbl> <chr>
#> 1          0 Seq1
#> 2          0 Seq2
#> 3          0 Seq3
#> 4          1 Seq4
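If you would rather not load a package, the same reshaping can be sketched in base R. This is only an illustration on the example data from the question, not part of the original answer: the names parts and long are made up here, and the split pattern assumes items are separated by a comma with optional whitespace after it.

```r
# Same example data as in the question
df <- data.frame(
  Orthogroup = c(0, 1),
  Sequences  = c("Seq1, Seq2, Seq3", "Seq4"),
  stringsAsFactors = FALSE
)

# Split each cell on a comma followed by optional whitespace;
# the result is a list with one character vector per original row
parts <- strsplit(df$Sequences, ",\\s*")

# Repeat each Orthogroup once per item in its row, then flatten the items
long <- data.frame(
  Orthogroup = rep(df$Orthogroup, times = lengths(parts)),
  Sequence   = unlist(parts),
  stringsAsFactors = FALSE
)

print(long)
```

The key step is rep() with times = lengths(parts), which keeps every sequence lined up with the Orthogroup it came from.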