How to detect which strings in a list contain words from a list of keywords in R
Question
I'm very new to R and hoping for help.
I have a list of 1000 product names and a list of 80 keywords or phrases. I need to determine how many of the 1000 product names contain one or more of those keywords or phrases.
Example: if one of the 1000+ product names is "honey bunches of oats" and one of the 80+ keywords is "honey", I need TRUE to show up in a new column next to "honey bunches of oats".
I uploaded both lists as CSV files, made a vector for each list, and tried the following:
str_detect(products, regex(".keywords.", ignore_case = TRUE))
This came back with all FALSE results. I also tried grepl(keywords, products), which returned zero results as well.
I am confident there should be instances where the keywords are contained within these strings. Is it looking for exact matches? I need it to show partial matches.
Answer 1
Score: 0
Try:
library(stringr)
products <- c('apple hello', 'banana', 'peach', 'a')
keywords <- c('apple', 'each')
# Join the keywords with "|" so the pattern matches if any one of them occurs
str_detect(products, paste0(keywords, collapse = '|'))
# [1] TRUE FALSE TRUE FALSE
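To get the TRUE/FALSE values into a new column and count the matches, something along these lines may work. This is only a sketch: the data frame, the column names product_name and has_keyword, and the sample values are made up for illustration and stand in for whatever the two CSV files actually contain.

```r
library(stringr)

# Hypothetical data standing in for the two CSV files
products <- data.frame(
  product_name = c("Honey Bunches of Oats", "Corn Flakes", "Peach Rings", "Oat Milk"),
  stringsAsFactors = FALSE
)
keywords <- c("honey", "oat", "maple syrup")

# One regex that matches if any keyword occurs anywhere in the string
pattern <- regex(paste(keywords, collapse = "|"), ignore_case = TRUE)

# New logical column next to each product name
products$has_keyword <- str_detect(products$product_name, pattern)

# Number of product names containing at least one keyword
n_matches <- sum(products$has_keyword)

# Base R alternative that treats each keyword as literal text
# (useful if keywords contain regex characters such as "+" or "(")
lower_names <- tolower(products$product_name)
literal_hits <- Reduce(`|`, lapply(tolower(keywords), function(k) {
  grepl(k, lower_names, fixed = TRUE)
}))

print(products)
print(n_matches)
```

As for why the original attempts returned nothing: the quoted string ".keywords." is a literal pattern (any character, then the text "keywords", then any character), not a reference to the keywords vector, and grepl() expects a single pattern rather than a vector of them. Matching is partial by default in both functions, so no extra wildcards are needed.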