R: save a regex match to a new variable while removing the regex match from the existing variable using `str_extract()`
Question
I want to save a regex match to a new variable and remove the regex match from the existing variable using one function.
So in the following example, I want to save "Speaker. " to `new_variable` while also removing "Speaker. " from `sentence`.
I tried to accomplish this with `str_extract()`. I am able to match the desired word, but the word is not removed from `sentence`. (This may not be what `str_extract()` was designed to do, but then what is the difference between `str_extract()` and `str_match()`?)
library(stringr)
sentence <- "Speaker. I am not sure why this does not work."
for (line in sentence) {
  new_variable <- str_extract(line, "^[[:alpha:]]+\\. ")
}
I know I can use `str_replace()` to remove the regex match from `sentence`, but I would prefer to do this with one function if possible.
library(stringr)
sentence <- "Speaker. I am not sure why this does not work."
for (line in sentence) {
  # Save the leading "Word. " prefix, then strip the same pattern from the sentence.
  new_variable <- str_extract(line, "^[[:alpha:]]+\\. ")
  sentence <- str_replace(sentence, "^[[:alpha:]]+\\. ", "")
}
Answer 1
Score: 1
You can use `tidyr::separate_wider_regex()`, which splits one column into several according to a sequence of named regex patterns:
library(tidyr)
df <- data.frame(sentence = "Speaker. I am not sure why this does not work.")
separate_wider_regex(df, sentence, c(new_variable = "^[[:alpha:]]+\\. ", sentence = ".*"))
# A tibble: 1 × 2
  new_variable sentence
  <chr>        <chr>
1 "Speaker. "  I am not sure why this does not work.
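On the `str_extract()` versus `str_match()` part of the question: `str_extract()` returns only the matched text, while `str_match()` returns a character matrix with the full match in the first column and each capture group in the following columns. Neither function modifies its input. If you would rather stay with a plain character vector than a data frame, a minimal sketch along these lines may work; the two capture groups and the `parts` variable are my own additions for illustration, not part of the original code:

```r
library(stringr)

sentence <- "Speaker. I am not sure why this does not work."

# One call captures both pieces: group 1 is the leading "Word. " prefix,
# group 2 is the remainder of the string.
parts <- str_match(sentence, "^([[:alpha:]]+\\. )(.*)$")

new_variable <- parts[, 2]  # first capture group
sentence     <- parts[, 3]  # second capture group
```

Note that a string without the prefix does not match at all, so both columns would be `NA` for that element; you may want to handle that case before overwriting `sentence`.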