August 7, 2020, 03:45:30

Compare two sentences and check if they have a similar word

Question

I'm trying to take two sentences and see whether they have any words in common. Example:
A- "Hello world this is a test"
B- "Test to create things"

The common word here is "test".

I tried using .contains(), but it doesn't work because I can only search for one word at a time:

text1.toLowerCase().contains(sentence1.toLowerCase())

Answer 1

Score: 2

You can create HashSets from the words of both sentences after splitting on whitespace. You can then use Set#retainAll to find the intersection (the common words).

final String a = "Hello world this is a test", b = "Test to create things";
final Set<String> words = new HashSet<>(Arrays.asList(a.toLowerCase().split("\\s+")));
final Set<String> words2 = new HashSet<>(Arrays.asList(b.toLowerCase().split("\\s+")));
words.retainAll(words2); // keeps only the words that are also in words2
System.out.println(words); // [test]
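
Splitting on whitespace alone leaves punctuation attached to words, so "test." and "test" would not match. The following is a minimal sketch of the same retainAll idea, under the assumption that a word is any run of letters and digits. The class name CommonWords and its method names are hypothetical, chosen only for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class CommonWords {

    // Lowercases the sentence and splits it on runs of characters that are
    // neither letters nor digits (an assumed definition of a word boundary),
    // so "test." and "Test" both become "test".
    static Set<String> wordsOf(String sentence) {
        Set<String> words = new HashSet<>(
                Arrays.asList(sentence.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")));
        words.remove(""); // split yields "" for empty input or a leading separator
        return words;
    }

    // Returns the words that occur in both sentences.
    static Set<String> commonWords(String a, String b) {
        Set<String> common = wordsOf(a);
        common.retainAll(wordsOf(b)); // in-place intersection
        return common;
    }

    public static void main(String[] args) {
        System.out.println(commonWords("Hello world, this is a test.", "Test to create things")); // [test]
    }
}
```

One consequence of this word definition is that contractions such as "I'm" are split into "i" and "m", which may or may not be what you want.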

Answer 2

Score: 0

You can split each sentence on spaces, collect the words, and then look up each word of one sentence among the words of the other, collecting the ones that match.

Here is an example using the Java Stream API. The words of the first sentence are collected into a Set so that each lookup is fast (O(1) on average).

String a = "Hello world this is a test";
String b = "Test to create things";
Set<String> aWords = Arrays.stream(a.toLowerCase().split(" "))
                            .collect(Collectors.toSet());
List<String> commonWords = Arrays.stream(b.toLowerCase().split(" "))
                                 .filter(bw -> aWords.contains(bw))
                                 .collect(Collectors.toList());
System.out.println(commonWords);

Output: [test]
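
Because the result is a List, a word that appears more than once in the second sentence would be reported more than once. The following is a small sketch of a variant that adds distinct() and splits on any run of whitespace. The class name CommonWordsStream is hypothetical, and the second input sentence is made up to show the repeated-word case.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class CommonWordsStream {

    // Returns the words of b that also occur in a, each once,
    // in the order in which they first appear in b.
    static List<String> commonWords(String a, String b) {
        Set<String> aWords = Arrays.stream(a.toLowerCase().split("\\s+"))
                .filter(w -> !w.isEmpty()) // drop the "" produced by leading whitespace
                .collect(Collectors.toSet());
        return Arrays.stream(b.toLowerCase().split("\\s+"))
                .filter(aWords::contains)
                .distinct() // report each common word only once
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(commonWords("Hello world this is a test", "Test this test")); // [test, this]
    }
}
```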

Answer 3

Score: 0

Split the two sentences on spaces and add each word from the first string to a Set. Then loop over the words of the second string and check whether the set already contains each one; if it does, it is a common word. (Checking the return value of add instead also works for this example, but it inserts the second sentence's words into the set, so a word repeated within the second sentence is reported as common even when the first sentence does not contain it.)

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class Sample {
    public static void main(String[] args) {
        String str1 = "Hello world this is a test";
        String str2 = "Test to create things";
        // Lowercase both sentences so the comparison ignores case.
        String[] str1words = str1.toLowerCase().split(" ");
        String[] str2words = str2.toLowerCase().split(" ");
        // Words of the first sentence, stored in a set for fast lookup.
        Set<String> set = new HashSet<>(Arrays.asList(str1words));
        for (String word : str2words) {
            // contains() leaves the set unchanged, so repeats in str2 are not misreported.
            if (set.contains(word)) {
                System.out.println(word + " is common word");
            }
        }
    }
}

Answer 4

Score: 0

Here's one approach:

    // extract the words from the sentences by splitting on white space
    String[] sentence1Words = sentence1.toLowerCase().split("\\s+");
    String[] sentence2Words = sentence2.toLowerCase().split("\\s+");

    // make sets from the two word arrays
    Set<String> sentence1WordSet = new HashSet<String>(Arrays.asList(sentence1Words));
    Set<String> sentence2WordSet = new HashSet<String>(Arrays.asList(sentence2Words));

    // get the intersection of the two word sets
    Set<String> commonWords = new HashSet<String>(sentence1WordSet);
    commonWords.retainAll(sentence2WordSet);

This will yield a Set containing lower case versions of the common words between the two sentences. If it is empty there is no similarity. If you don't care about some words like prepositions you can filter those out of the final similarity set or, better yet, preprocess your sentences to remove those words first.
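
The preprocessing idea can be sketched roughly as follows: remove the unwanted words from each sentence's word set before intersecting. The stop-word list here is a hypothetical, deliberately tiny sample, and the class and method names are illustrative only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class StopWordFilter {

    // Hypothetical, deliberately tiny stop-word list; a real one would be much longer.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "the", "is", "to", "of", "in", "on"));

    // Lowercased words of the sentence with stop words removed.
    static Set<String> significantWords(String sentence) {
        Set<String> words = new HashSet<>(Arrays.asList(sentence.toLowerCase().split("\\s+")));
        words.removeAll(STOP_WORDS);
        words.remove(""); // split yields "" for empty input or leading whitespace
        return words;
    }

    // Words shared by both sentences, ignoring stop words.
    static Set<String> commonWords(String sentence1, String sentence2) {
        Set<String> common = significantWords(sentence1);
        common.retainAll(significantWords(sentence2));
        return common;
    }

    public static void main(String[] args) {
        // "is" and "a" occur in both sentences but are filtered out as stop words.
        System.out.println(commonWords("This is a test", "Is this a test"));
    }
}
```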

Note that a real-world (i.e. useful) implementation of similarity checking is usually far more complex, as you usually want to match words that are similar but differ slightly. Useful starting points for this type of string similarity checking are Levenshtein distance and Metaphone.
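
As a rough illustration of the first of those starting points, here is a sketch of the standard dynamic-programming Levenshtein distance. The similar helper and its maxEdits threshold are assumptions added for the example, not part of the algorithm itself.

```java
public class Levenshtein {

    // Edit distance (insertions, deletions, substitutions) computed with two rows.
    static int distance(String s, String t) {
        int[] prev = new int[t.length() + 1];
        int[] curr = new int[t.length() + 1];
        for (int j = 0; j <= t.length(); j++) {
            prev[j] = j; // turning "" into the first j chars of t takes j insertions
        }
        for (int i = 1; i <= s.length(); i++) {
            curr[0] = i; // turning the first i chars of s into "" takes i deletions
            for (int j = 1; j <= t.length(); j++) {
                int cost = s.charAt(i - 1) == t.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; // swap rows for the next iteration
            prev = curr;
            curr = tmp;
        }
        return prev[t.length()];
    }

    // Treats two words as similar when they differ by at most maxEdits edits (assumed threshold).
    static boolean similar(String a, String b, int maxEdits) {
        return distance(a.toLowerCase(), b.toLowerCase()) <= maxEdits;
    }

    public static void main(String[] args) {
        System.out.println(distance("kitten", "sitting")); // 3
        System.out.println(similar("test", "tests", 1));   // true
    }
}
```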

Note that the line creating the commonWords set makes a redundant copy of sentence1WordSet. Because retainAll performs the intersection in place, you could improve performance by calling it directly on sentence1WordSet, but I have favoured code clarity over performance.

Answer 5

Score: 0

Try this.

static boolean contains(String text1, String text2) {
    String text1LowerCase = text1.toLowerCase();
    return Arrays.stream(text2.toLowerCase().split("\\s+"))
        .anyMatch(word -> text1LowerCase.contains(word));
}

and

String text1 = "Hello world this is a test";
String text2 = "Test to create things";
System.out.println(contains(text1, text2));

output:

true

Note that String.contains matches substrings, so this also returns true when a word of text2 merely occurs inside a longer word of text1 (for example, "is" inside "this").
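
A minimal sketch of a whole-word variant, assuming words are separated by whitespace: it builds a word set for each text and uses Collections.disjoint, so a short word no longer matches inside a longer one. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class SharedWordCheck {

    // Lowercased, whitespace-separated words of the text.
    static Set<String> wordsOf(String text) {
        Set<String> words = new HashSet<>(Arrays.asList(text.toLowerCase().split("\\s+")));
        words.remove(""); // split yields "" for empty input or leading whitespace
        return words;
    }

    // True when the two texts share at least one whole word.
    static boolean haveCommonWord(String text1, String text2) {
        return !Collections.disjoint(wordsOf(text1), wordsOf(text2));
    }

    public static void main(String[] args) {
        System.out.println(haveCommonWord("Hello world this is a test", "Test to create things")); // true
        System.out.println(haveCommonWord("this thing", "is in")); // false: "is" only occurs inside "this"
    }
}
```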
