Split a column into two, using parenthesis as separator in R
Question
I have an unusual data format and need to split one column into two.
col = c("142343-2344343(+)", "546354-4775458(-)", "374637463")
I want to split col into col1 and col2, using the first parenthesis as the separator.
I want something like this:
col1 col2
142343-2344343 +
546354-4775458 -
374637463 NA
I'd love your help!
Answer 1
Score: 3
Try separate:
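library(tidyverse)
data.frame(col) %>%
  separate(col,
           into = c("col1", "col2"),
           sep = "\\(|\\)",
           extra = "drop",   # discard the empty piece after the closing ")"
           fill = "right")   # rows without parentheses get NA in col2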
The result:
col1 col2
1 142343-2344343 +
2 546354-4775458 -
3 374637463 <NA>
Answer 2
Score: 2
We may use base R with read.csv:
read.csv(text = sub("(.*)([+-])$", "\\1,\\2",
                    gsub("\\(|\\)", "", col)), header = FALSE, na.strings = "",
         col.names = c("col1", "col2"))
Output:
col1 col2
1 142343-2344343 +
2 546354-4775458 -
3 374637463 <NA>
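The base R approach rewrites each string into a comma-separated line before read.csv parses it. A minimal sketch of those two intermediate steps, assuming the replacement string is meant to be the backreferences \\1,\\2 (the names no_parens and csv_lines are only illustrative):

```r
col <- c("142343-2344343(+)", "546354-4775458(-)", "374637463")

# Step 1: remove every "(" and ")" from each element
no_parens <- gsub("\\(|\\)", "", col)

# Step 2: put a comma before a trailing "+" or "-";
# elements that do not end in "+" or "-" are left unchanged
csv_lines <- sub("(.*)([+-])$", "\\1,\\2", no_parens)

print(no_parens)
print(csv_lines)
```

Note that once the parentheses are gone, any value that happens to end in "+" or "-" is split, whether or not it was originally parenthesised.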
With tidyr, an option is:
library(tidyr)
library(dplyr)
library(tibble)
tibble(col) %>%
  separate_wider_regex(col, c(col1 = ".*", "\\(", col2 = "[^)]",
                              "\\)"), too_few = "align_start")
Output:
# A tibble: 3 × 2
col1 col2
<chr> <chr>
1 142343-2344343 +
2 546354-4775458 -
3 374637463 <NA>