2023-06-16 05:22:42

Check if a string contains an English word in NodeJS

Question

A first attempt at a NodeJS function that returns true when a string contains an English word longer than 3 letters:

function containsLongEnglishWord(inputString) {
  const words = inputString.split(/\W+/); // Split the string into words
  for (const word of words) {
    if (word.length > 3 && /[a-zA-Z]/.test(word)) {
      return true;
    }
  }
  return false;
}

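This function splits the input on non-word characters and returns true as soon as it finds a token longer than 3 characters that contains at least one ASCII letter. It needs no dictionary file, but for the same reason it cannot tell an English word from an arbitrary run of letters: it returns true for every example string below, including the three that should return false.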

What is the best way in NodeJS to create a function that returns true iff a string contains an English word longer than 3 letters?

The code will be placed in a lambda, so I'm looking for the most efficient solution. The best solution I've got so far is to use dictionary-en and iterate over every word, calling .includes(word) on the source string. I was wondering if you can think of a better approach to implement this.

Some examples of strings which should return true:

y89nsdadhomea98qwoi
:_5678aSD.boTTleads.
yfugdnuagybdasglassesmidwqihhniwqnhi

Some examples of strings which should return false:

y89nsdadhasa98qwoi
:_5678aSD.b0TTle4ds.
yfugdnuagybdasmidwqihhniwqnhi

Answer 1

Score: 1

Looping over a dictionary (which may contain hundreds of thousands of words) is not a good idea. As your string is only 10-200 characters long, iterating over the characters of the source string gives better time complexity. And if you don't care about space complexity, there's a better approach:

Build a special dictionary hashmap ahead of time; this costs O(m), where m is the number of words in your dictionary. The hashmap will look something like:

// As object is hashmap-like in javascript
const dictionaryMap = {
   'hom': 'e',
   'cat': '',
   'bot': 'tle',
   'gla': ['ss', 'cier'], // contains 'glass' and 'glacier'
};

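Iterate over the characters of the source string and look up each 3-character window in the map. That gives O(n) time for the iteration and O(1) for each lookup:

// source is the input string; dictionaryMap is the prefix map from step 1
function findCandidate(source) {
   for (let i = 0; i + 3 <= source.length; i++) {
      const lookupStr = source.slice(i, i + 3); // I know it's dumb, just a sample :)))
      if (Object.prototype.hasOwnProperty.call(dictionaryMap, lookupStr)) {
         console.log(lookupStr, dictionaryMap[lookupStr]);
         return true; // a known 3-letter prefix occurs in the string
      }
   }
   return false;
}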
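The two steps above can be sketched end to end as follows. This is a minimal sketch, not the answer author's exact code: the word list `WORDS` and the function names `buildPrefixMap` and `containsWordByPrefix` are hypothetical, a `Map` with arrays of suffixes stands in for the plain object, and the scan also checks that a stored suffix actually follows the prefix so that a prefix hit alone does not count as a match.

```javascript
// Hypothetical word list; a real one would be loaded from a dictionary file.
const WORDS = ['home', 'bottle', 'glass', 'glacier', 'cat'];

// Step 1: map each 3-letter prefix to the suffixes that complete a word.
// Words of 3 letters or fewer are skipped because only longer words count.
function buildPrefixMap(words) {
  const map = new Map();
  for (const raw of words) {
    const word = raw.toLowerCase();
    if (word.length <= 3) continue;
    const prefix = word.slice(0, 3);
    if (!map.has(prefix)) map.set(prefix, []);
    map.get(prefix).push(word.slice(3));
  }
  return map;
}

// Step 2: slide a 3-character window over the string; on a prefix hit,
// confirm that one of the stored suffixes follows immediately.
function containsWordByPrefix(source, prefixMap) {
  const s = source.toLowerCase();
  for (let i = 0; i + 3 <= s.length; i++) {
    const suffixes = prefixMap.get(s.slice(i, i + 3));
    if (suffixes && suffixes.some((suffix) => s.startsWith(suffix, i + 3))) {
      return true;
    }
  }
  return false;
}

const prefixMap = buildPrefixMap(WORDS);
console.log(containsWordByPrefix('y89nsdadhomea98qwoi', prefixMap));  // true
console.log(containsWordByPrefix(':_5678aSD.b0TTle4ds.', prefixMap)); // false
```

In a lambda, the map would presumably be built once outside the handler so that repeated invocations only pay for the scan.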

Now that you know the source string very likely contains an English word longer than 3 letters, you can confirm an exact match: apply dynamic programming, build a tree, or change the dictionaryMap and repeat step 2 recursively:

const dictionaryMap = {
   'gla': 'ss|cier'
};
// Apply dynamic programming or memoization to find the longest common contiguous substring...

// Or change the map to be a tree-like structure
const dictionaryMap = {
   'gla': {
      's': {'s': ''},
      'c': {'i': {'e': {'r': ''}}}
   }
};
// Continue doing Step 2...
// Or build a tree yourself and search for the exact word
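The "build a tree yourself" option could look roughly like the trie below. This is a sketch under assumptions rather than the answer author's code: the names `buildTrie`, `containsWordByTrie`, and the end-of-word marker `END` are made up for illustration, and the word list passed in is a tiny stand-in for a real dictionary. From every start position the scan walks the trie one character at a time and stops as soon as it reaches a node marked as the end of a word.

```javascript
// Marker key meaning "a word ends at this node". Hypothetical choice: any key
// longer than one character works, since the scan only looks up single characters.
const END = '$end';

// Build a trie of nested Maps from the words longer than 3 letters.
function buildTrie(words) {
  const root = new Map();
  for (const raw of words) {
    const word = raw.toLowerCase();
    if (word.length <= 3) continue;
    let node = root;
    for (let i = 0; i < word.length; i++) {
      const ch = word[i];
      if (!node.has(ch)) node.set(ch, new Map());
      node = node.get(ch);
    }
    node.set(END, true);
  }
  return root;
}

// Try every start position; follow the trie until it dead-ends or a word completes.
function containsWordByTrie(source, trie) {
  const s = source.toLowerCase();
  for (let start = 0; start < s.length; start++) {
    let node = trie;
    for (let i = start; i < s.length; i++) {
      node = node.get(s[i]);
      if (!node) break;                 // no dictionary word continues this way
      if (node.has(END)) return true;   // a complete word was matched
    }
  }
  return false;
}

const trie = buildTrie(['home', 'bottle', 'glass', 'glacier']);
console.log(containsWordByTrie('yfugdnuagybdasglassesmidwqihhniwqnhi', trie)); // true
console.log(containsWordByTrie('yfugdnuagybdasmidwqihhniwqnhi', trie));        // false
```

The inner walk can never go deeper than the longest dictionary word, so the scan costs at most n times that length for a string of n characters.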

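=> Total: O(m) to build the map + O(n) for the scan with O(1) lookups = O(m + n) time, which is O(m) when the dictionary is much larger than the string, and O(m) or O(m * longestWordCharacters) space complexity.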

Answer 2

Score: -3

Huh? This is not a job for node.js but JavaScript!

What is node.js...

https://en.wikipedia.org/wiki/Node.js

Onto the problem at hand...

An English word greater than 3 letters... but there are tens of THOUSANDS of them!

Do you have a text file with all these words included so that we can load them into an array to operate?

No? OK, in the meantime here's the best we've got...

const Valid = ['home', 'boTTle', 'glasses', 'GodKnows'];
const S = 'y89nsdadhomea98qwoi'.toLowerCase();
let Ok = false;
for (let i = 0; i < Valid.length; i++) {
  if (S.includes(Valid[i].toLowerCase())) { Ok = true; break; }
}
if (Ok) {
  // yeah the string is ok, what now?
}

Sorry I’ve had a few scotches and LMAO after reading your post.
