Vectorising stringr::str_count
Question
I'm trying to vectorize the 'pattern' argument of stringr::str_count in R as follows:
library(stringr)
# Define the patterns you want to count
patterns <- c("apple", "banana", "orange")
# Create a vectorized version of str_count
vectorized_str_count <- Vectorize(str_count, vectorize.args = "pattern")
# Input string
string <- "I like apples, bananas, and oranges. Apples are my favorite."
# Count the occurrences of patterns in the string
counts <- vectorized_str_count(string, pattern = patterns)
# Print the counts
print(counts)
I am expecting to get 2, 1, 1 as outputs since there are two occurrences of 'apple' in the original string. However, it returns 1, 1, 1.
How can I amend the code to get what I'm after? I know I could do this by constructing a regex search term, but I'd rather avoid that because it causes other problems. Many thanks.
Answer 1
Score: 1
str_count() is already vectorised over the pattern argument, so wrapping it in Vectorize() is unnecessary.
You get only one match for the first pattern because the pattern is case-sensitive: apple does not match Apple. Add (?i) or use regex() to make a pattern case-insensitive:
library(stringr)
x <- "I like apples, bananas, and oranges. Apples are my favorite."
str_count(x, c("(?i)apple", "banana", "orange"))
#> [1] 2 1 1
str_count(x, regex(c("apple", "banana", "orange"), ignore_case = TRUE))
#> [1] 2 1 1
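Since the question mentions wanting to avoid building a regex, here is a minimal sketch of a literal-matching variant. It assumes stringr's fixed() modifier with ignore_case = TRUE fits the use case; the variable names counts_fixed and counts_lower are only illustrative. The second call shows a possible alternative, lower-casing the input first, which should behave the same for plain ASCII text like this.

```r
library(stringr)

x <- "I like apples, bananas, and oranges. Apples are my favorite."
patterns <- c("apple", "banana", "orange")

# Treat each pattern as a literal string (no regex metacharacters),
# ignoring case. str_count() recycles x over the three patterns.
counts_fixed <- str_count(x, fixed(patterns, ignore_case = TRUE))

# Alternative sketch: lower-case the input, then match literally.
counts_lower <- str_count(str_to_lower(x), fixed(patterns))

# Label the counts with the pattern they belong to.
print(setNames(counts_fixed, patterns))
#>  apple banana orange
#>      2      1      1
```

fixed() compares bytes rather than applying locale rules, so for text with accented or non-ASCII characters coll() may be the safer modifier.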