Check if a string contains an English word in NodeJS
Question
A rough first attempt at a NodeJS function that returns true when a string contains an English word longer than 3 letters:
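function containsLongEnglishWord(inputString) {
  // Split on runs of non-word characters (digits and underscores stay inside tokens)
  const words = inputString.split(/\W+/);
  for (const word of words) {
    // True for any token longer than 3 characters that contains an ASCII letter
    if (word.length > 3 && /[a-zA-Z]/.test(word)) {
      return true;
    }
  }
  return false;
}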
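This function splits the input string into tokens and checks whether any token is longer than 3 characters and contains at least one ASCII letter. It returns true as soon as it finds such a token and false otherwise. It runs in linear time and needs no dictionary file, but it never checks that a token is an actual English word, so it also returns true for strings such as y89nsdadhasa98qwoi that should return false.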
Original question:
What is the best way in NodeJS to create a function that returns true iff a string contains an English word longer than 3 letters?
The code will be placed in a lambda, so I'm looking for the most efficient solution. The best solution I've got so far is to use dictionary-en and iterate over every word, calling .includes(word) on the source string. Can you think of a better approach to implement this?
Some examples of strings which should return true:
- y89nsdadhomea98qwoi
- :_5678aSD.boTTleads.
- yfugdnuagybdasglassesmidwqihhniwqnhi
Some examples of strings which should return false:
- y89nsdadhasa98qwoi
- :_5678aSD.b0TTle4ds.
- yfugdnuagybdasmidwqihhniwqnhi
Answer 1
Score: 1
Looping over a dictionary (which may contain hundreds of thousands of words) is not a good idea. As your string ranges from 10 to 200 characters, iterating over every character in the source string gives better time complexity. And if you don't care about space complexity, there's a better approach:
- Build a special dictionary hashmap ahead of time, keyed by the first 3 letters of each word, with the remaining letters as the value. This costs you O(m), where m is the number of words in your dictionary. The hashmap will be something like:
// Objects are hashmap-like in JavaScript
const dictionaryMap = {
  'hom': 'e',
  'cat': '', // exactly 3 letters, so there is no suffix
  'bot': 'tle',
  'gla': ['ss', 'cier'], // contains 'glass' and 'glacier'
};
- Iterate over every character in the source string and look up the 3-character window that starts there. That way you have O(n) time for the iteration and O(1) for each lookup:
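// hasKnownPrefix is just a sample name; n is the source string
function hasKnownPrefix(n) {
  for (let i = 0; i + 3 <= n.length; i++) {
    const lookupStr = n.slice(i, i + 3); // same as n[i] + n[i+1] + n[i+2]
    if (dictionaryMap.hasOwnProperty(lookupStr)) {
      console.log(lookupStr, dictionaryMap[lookupStr]);
      return true; // a known prefix was found; the suffix is not checked yet
    }
  }
  return false;
}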
- At this point you know the source string has a high chance of containing an English word longer than 3 letters. If you want to confirm an exact word, you can apply dynamic programming, build a tree, or change the dictionaryMap and repeat step 2 recursively:
const dictionaryMap = {
  'gla': 'ss|cier'
};
// Apply dynamic programming or memoization to find the longest common contiguous substring...
OR
// Change the map to be a tree-like structure
const dictionaryMap = {
  'gla': {
    's': {'s': ''},
    'c': {'i': {'e': {'r': ''}}}
  }
};
// Continue doing Step 2...
// Or build a tree yourself and search for the exact word
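The steps above can be put together in a minimal sketch. It assumes a tiny hypothetical word list (WORDS) in place of a real dictionary, case-insensitive matching (suggested by the boTTle example in the question), and helper names (buildPrefixMap, containsEnglishWord) that are made up for illustration. Unlike the step 2 snippet, it also checks the suffix, so a bare prefix hit such as 'hom' is not enough.

```javascript
// Hypothetical word list; a real dictionary would be loaded here instead.
const WORDS = ['home', 'bottle', 'glass', 'glasses', 'glacier', 'cat'];
const PREFIX_LEN = 3;

// Step 1: map each 3-letter prefix to the suffixes of the words that start with it.
function buildPrefixMap(words) {
  const map = new Map();
  for (const word of words) {
    const w = word.toLowerCase();
    if (w.length <= PREFIX_LEN) continue; // only words longer than 3 letters count
    const prefix = w.slice(0, PREFIX_LEN);
    if (!map.has(prefix)) map.set(prefix, []);
    map.get(prefix).push(w.slice(PREFIX_LEN));
  }
  return map;
}

// Steps 2 and 3: slide a 3-character window over the string and,
// on a prefix hit, confirm that one of the stored suffixes follows it.
function containsEnglishWord(source, prefixMap) {
  const s = source.toLowerCase();
  for (let i = 0; i + PREFIX_LEN < s.length; i++) {
    const suffixes = prefixMap.get(s.slice(i, i + PREFIX_LEN));
    if (suffixes === undefined) continue;
    for (const suffix of suffixes) {
      if (s.startsWith(suffix, i + PREFIX_LEN)) return true;
    }
  }
  return false;
}

const prefixMap = buildPrefixMap(WORDS);
console.log(containsEnglishWord('y89nsdadhomea98qwoi', prefixMap)); // true ("home")
console.log(containsEnglishWord(':_5678aSD.b0TTle4ds.', prefixMap)); // false
```

In a lambda, the map would typically be built once outside the handler, so that each invocation only pays for the scan.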
=> Total: O(m) + O(n) + O(1) = O(m + n) time complexity, which is effectively O(m) because the dictionary is far larger than the string, and O(m) or O(m*longestWordCharacters) space complexity.
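The O(m*longestWordCharacters) space figure corresponds to storing every character of every word, which is what a full trie does. The following is a rough sketch of that variant, not necessarily the exact structure the answer has in mind. The word list (SAMPLE_WORDS), the END marker, and the function names are assumptions made for illustration, and matching is assumed to be case-insensitive.

```javascript
// Hypothetical word list standing in for a real dictionary.
const SAMPLE_WORDS = ['home', 'bottle', 'glass', 'glacier'];
const MIN_LENGTH = 4; // "longer than 3 letters"
const END = Symbol('end'); // marks a node that completes a word

// Build a trie where each node is a Map from a character to the next node.
function buildTrie(words) {
  const root = new Map();
  for (const word of words) {
    if (word.length < MIN_LENGTH) continue; // skip words that are too short
    const w = word.toLowerCase();
    let node = root;
    for (let i = 0; i < w.length; i++) {
      if (!node.has(w[i])) node.set(w[i], new Map());
      node = node.get(w[i]);
    }
    node.set(END, true);
  }
  return root;
}

// From every start position, walk the trie until it dead-ends or a word completes.
function containsWordViaTrie(source, root) {
  const s = source.toLowerCase();
  for (let start = 0; start < s.length; start++) {
    let node = root;
    for (let i = start; i < s.length; i++) {
      node = node.get(s[i]);
      if (node === undefined) break; // no dictionary word continues this way
      if (node.has(END)) return true; // a complete word ends here
    }
  }
  return false;
}

const trie = buildTrie(SAMPLE_WORDS);
console.log(containsWordViaTrie('yfugdnuagybdasglassesmidwqihhniwqnhi', trie)); // true ("glass")
console.log(containsWordViaTrie('yfugdnuagybdasmidwqihhniwqnhi', trie)); // false
```

Each start position walks at most as many steps as the longest word, so the scan costs O(n*longestWordCharacters) in the worst case and does not depend on the number of dictionary words.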
Answer 2
Score: -3
Huh? This is not a job for node.js but JavaScript!
What is node.js...
https://en.wikipedia.org/wiki/Node.js
Onto the problem at hand...
An English word greater than 3 letters... but there are tens of THOUSANDS of them!
Do you have a text file with all these words included so that we can load them into an array to operate?
No? OK, in the meantime here's the best we've got...
const Valid = ['home', 'boTTle', 'glasses', 'GodKnows'];
var S = 'y89nsdadhomea98qwoi'.toLowerCase(), Ok = false;
for (var i = 0; i < Valid.length; i++) {
  if (S.includes(Valid[i].toLowerCase())) { Ok = true; break; }
}
if (Ok) {
  // yeah the string is ok, what now?
}
Sorry I’ve had a few scotches and LMAO after reading your post.