Find the first occurrence of a character in a column of a data frame in R
Question
Struggling with string handling in R...
I've got a column of strings in an R data frame. Each one contains the "=" character once and only once. I'd like to know the position of the "=" character in each element of the column, as a step to splitting the column into two separate columns (one for the bit before the "=" and one for the bit after the "="). Can anyone help please? I'm sure it's simple but I'm struggling to find the answer.
For example, if I have:
x <- data.frame(string = c("aa=1", "aa=2", "aa=3", "b=1", "b=2", "abc=5"))
I'd like a bit of code to return:
(3, 3, 3, 2, 2, 4)
Thank you.
Answer 1
Score: 1
Here's one way to do it:
library(stringr)
str_locate(x$string, "=")[,1]
Answer 2
Score: 1
In base R you can do:
as.numeric(lapply(strsplit(as.character(x$string), ""), function(x) which(x == "=")))
[1] 3 3 3 2 2 4
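The one-liner above relies on every string holding exactly one "=". As a rough sketch of the same character-by-character idea that also copes with strings containing no "=" or several, something like the following might work. The helper name first_eq and the extra test strings are made up for illustration and are not part of the question.

```r
# Hypothetical helper: position of the first "=" in each string, NA if absent
first_eq <- function(s) {
  # Split every string into single characters, then look for "=" in each
  vapply(strsplit(as.character(s), ""), function(chars) {
    hits <- which(chars == "=")
    if (length(hits) == 0L) NA_integer_ else hits[1L]
  }, integer(1))
}

result <- first_eq(c("aa=1", "b=1", "abc=5", "a=b=c", "none"))
result
# [1]  3  2  4  2 NA
```

vapply() is used instead of lapply() plus as.numeric() so that the result is guaranteed to be an integer vector with one value per input string.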
Answer 3
Score: 1
You can use gregexpr:
unlist(lapply(gregexpr(pattern = '=', x$string), min))
[1] 3 3 3 2 2 4
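For the strings in the question, which each contain a single "=", regexpr and gregexpr report the same positions. Here is a small sketch of how the two differ once a string has more than one "=" or none at all. The vector y is an invented example, not data from the question.

```r
# Hypothetical input: the second element has two "=" signs, the third has none
y <- c("aa=1", "k=v=w", "none")

# regexpr() reports only the first match per element; -1 means no match
first_pos <- as.integer(regexpr("=", y, fixed = TRUE))
first_pos
# [1]  3  2 -1

# gregexpr() reports every match per element, returned as a list
all_pos <- lapply(gregexpr("=", y, fixed = TRUE), as.integer)
all_pos[[2]]
# [1] 2 4
```

Wrapping the result in as.integer() drops the match.length and related attributes, leaving plain positions.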
Answer 4
Score: 1
To get the position of "=" you can use the regexpr function:
regexpr("=", x$string)
#[1] 3 3 3 2 2 4
#attr(,"match.length")
#[1] 1 1 1 1 1 1
#attr(,"useBytes")
#[1] TRUE
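If you do want to go via the positions, one possible way to finish the job is to feed them to substr. This is only a sketch: the column names key and value are my own choice, and as.character() is there in case the column is a factor (the data.frame default before R 4.0).

```r
x <- data.frame(string = c("aa=1", "aa=2", "aa=3", "b=1", "b=2", "abc=5"))

# Guard against the column being a factor
s <- as.character(x$string)

# Position of the first "=" in each string, with the match attributes dropped
pos <- as.integer(regexpr("=", s, fixed = TRUE))

# Everything before the "=" and everything after it (illustrative column names)
x$key   <- substr(s, 1, pos - 1)
x$value <- substr(s, pos + 1, nchar(s))

x$key
# [1] "aa"  "aa"  "aa"  "b"   "b"   "abc"
x$value
# [1] "1" "2" "3" "1" "2" "5"
```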
However, as @Michael stated, if your goal is to split the string, you can use strsplit:
strsplit(x$string, "=")
#[[1]]
#[1] "aa" "1"
#
#[[2]]
#[1] "aa" "2"
#
#[[3]]
#[1] "aa" "3"
#
#[[4]]
#[1] "b" "1"
#
#[[5]]
#[1] "b" "2"
#
#[[6]]
#[1] "abc" "5"
Or combine it with do.call and rbind to stack the pieces into a two-column character matrix (wrap the result in as.data.frame() if you need a data frame):
do.call(rbind, strsplit(x$string, "="))
# [,1] [,2]
#[1,] "aa" "1"
#[2,] "aa" "2"
#[3,] "aa" "3"
#[4,] "b" "1"
#[5,] "b" "2"
#[6,] "abc" "5"
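As a rough sketch of the last step, the matrix can be turned into a data frame with named, typed columns. The names key and value are placeholders I picked, and converting the second column with as.numeric() assumes everything after the "=" is a number, as in the example data.

```r
x <- data.frame(string = c("aa=1", "aa=2", "aa=3", "b=1", "b=2", "abc=5"))

# Split each string at "=" and stack the pieces into a character matrix
parts <- do.call(rbind, strsplit(as.character(x$string), "=", fixed = TRUE))

# Build a data frame; the column names are illustrative only
out <- data.frame(key = parts[, 1],
                  value = as.numeric(parts[, 2]),
                  stringsAsFactors = FALSE)
str(out)
```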
Answer 5
Score: 1
Here is another solution that gives a two-column result, the first column containing the characters before = and the second containing the characters after =. You can do that without obtaining the positions of the = character. Note that t() returns a character matrix rather than a data frame.
library(stringr)
t(as.data.frame(strsplit(x$string, "=")))
# [,1] [,2]
#c..aa....1.. "aa" "1"
#c..aa....2.. "aa" "2"
#c..aa....3.. "aa" "3"
#c..b....1.. "b" "1"
#c..b....2.. "b" "2"
#c..abc....5.. "abc" "5"
Answer 6
Score: 0
Some may find this more readable:
library(tidyverse)
x %>%
  mutate(
    number = string %>% str_extract('[:digit:]+'),
    text = string %>% str_extract('[:alpha:]+')
  ) %>%
  as_tibble()
# A tibble: 6 x 3
  string number text
  <fct>  <chr>  <chr>
1 aa=1   1      aa
2 aa=2   2      aa
3 aa=3   3      aa
4 b=1    1      b
5 b=2    2      b
6 abc=5  5      abc
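If you'd rather not load any packages, a similar pattern-based extraction can be sketched in base R with sub(). This is my own variant rather than part of the answer above; it reuses the column names text and number, and it cuts at the first "=" instead of matching letters and digits.

```r
x <- data.frame(string = c("aa=1", "aa=2", "aa=3", "b=1", "b=2", "abc=5"))
s <- as.character(x$string)

# Remove the first "=" and everything after it, keeping the text part
x$text <- sub("=.*$", "", s)

# Remove everything up to and including the first "=", keeping the number part
x$number <- sub("^[^=]*=", "", s)

x
```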