Trying to find a way to use adist() for words instead of characters in R
Question
I'd like the adist function to work on words the same way it works on characters. That is, I'd like each deletion, substitution, or insertion to apply to a whole word instead of a single character. For example, I want "Alert 12 went off at 3am" and "Alert 17 was heard at 3am" to have a Levenshtein distance of 3, because three word substitutions are needed to get from one string to the other. Thanks.
Answer 1
Score: 0
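You can try the following code, which counts the words in s1 that have no match in s2 (s1 and s2 are defined under Data):
library(vecsets)
# Split each string on spaces, then take the multiset difference of the two word vectors
d <- length(vsetdiff(unlist(strsplit(s1, " ")), unlist(strsplit(s2, " "))))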
such that
> d
[1] 3
Data
s1 <- "Alert 12 went off at 3am"
s2 <- "Alert 17 was heard at 3am"
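Note that vsetdiff ignores word order, so it matches the requested distance for this pair but is not an edit distance in general: for "a b" and "b a" the multiset difference is empty, while two word substitutions are needed. As a rough sketch of an order-aware alternative, the standard Levenshtein dynamic-programming recurrence can be applied to word vectors instead of characters. The function name word_levenshtein is hypothetical (it is not part of base R or vecsets), and the sketch assumes that words are separated by whitespace and that every operation costs 1.

```r
# Word-level Levenshtein distance: insertions, deletions and substitutions
# operate on whole words. word_levenshtein is a hypothetical helper name.
word_levenshtein <- function(a, b) {
  # Assumption: words are delimited by runs of whitespace
  x <- strsplit(trimws(a), "\\s+")[[1]]
  y <- strsplit(trimws(b), "\\s+")[[1]]
  n <- length(x)
  m <- length(y)
  # d[i + 1, j + 1] holds the distance between the first i words of x
  # and the first j words of y
  d <- matrix(0L, n + 1L, m + 1L)
  d[, 1] <- 0:n
  d[1, ] <- 0:m
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      cost <- as.integer(x[i] != y[j])
      d[i + 1, j + 1] <- min(d[i, j + 1] + 1L,  # delete x[i]
                             d[i + 1, j] + 1L,  # insert y[j]
                             d[i, j] + cost)    # substitute, or match at no cost
    }
  }
  d[n + 1, m + 1]
}

s1 <- "Alert 12 went off at 3am"
s2 <- "Alert 17 was heard at 3am"
word_levenshtein(s1, s2)  # 3
```

The double loop takes time proportional to the product of the two word counts, which should be fine for short alert messages like these; comparing many long strings would call for a vectorised or compiled approach.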