April 7, 2020, 18:21:26

Count frequency of each word from list of Strings using Java8

Question


I have two lists of strings. I need to build a map that counts, for each string in the first list, how many strings in the second list contain it. If a string appears more than once within a single string of the second list, it should be counted as one occurrence.

For example:

String[] listA = {"the", "you", "how"};
String[] listB = {"the dog ate the food", "how is the weather", "how are you"};

The Map<String, Integer> map takes the strings from listA as keys and their occurrence counts as values. So the map will hold the key-value pairs ("the", 2), ("you", 1), ("how", 2).

Note: Although "the" appears twice in "the dog ate the food", it is counted as only one occurrence because both are in the same string.

How do I write this using Java streams (java-stream)? I tried this approach, but it does not work:

Set<String> sentenceSet = Stream.of(listB).collect(Collectors.toSet());

Map<String, Long> frequency1 = Stream.of(listA)
    .filter(e -> sentenceSet.contains(e))
    .collect(Collectors.groupingBy(t -> t, Collectors.counting()));

Answer 1

Score: 2

You need to extract all the words from listB and keep only those that are also listed in listA. Then you simply collect the word -> count pairs into a Map<String, Long>:

import java.util.*;
import java.util.function.Function;
import java.util.stream.Collectors;

String[] listA = {"the", "you", "how"};
String[] listB = {"the dog ate the food", "how is the weather", "how are you"};
Set<String> qualified = new HashSet<>(Arrays.asList(listA));   // make searching easier
Map<String, Long> map = Arrays.stream(listB)   // stream the sentences
    .map(sentence -> sentence.split("\\s+"))   // split into words: Stream<String[]>
    .flatMap(words -> Arrays.stream(words)     // flatten to Stream<String>
                            .distinct())       // ... keeping distinct words per sentence
    .filter(qualified::contains)               // keep only the qualified words
    .collect(Collectors.groupingBy(            // collect to the Map
        Function.identity(),                   // ... the key is the word itself
        Collectors.counting()));               // ... the value is its frequency

Output:

> {the=2, how=2, you=1}
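
For anyone who wants to run this directly, here is a minimal self-contained sketch of the same pipeline. The class name SentenceOccurrences and the helper countSentenceOccurrences are names made up for this illustration, not part of the answer above. It also collects into a TreeMap, which is my own substitution so that the printed key order is deterministic.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

class SentenceOccurrences {

    // Hypothetical helper: for each entry of `words`, counts how many
    // sentences contain it at least once.
    static Map<String, Long> countSentenceOccurrences(String[] words, String[] sentences) {
        Set<String> qualified = new HashSet<>(Arrays.asList(words));
        return Arrays.stream(sentences)
            // split each sentence on whitespace and drop repeats within that sentence
            .flatMap(sentence -> Arrays.stream(sentence.split("\\s+")).distinct())
            // keep only the words we were asked about
            .filter(qualified::contains)
            // TreeMap (an assumption of this sketch) keeps the keys sorted
            .collect(Collectors.groupingBy(
                Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        String[] listA = {"the", "you", "how"};
        String[] listB = {"the dog ate the food", "how is the weather", "how are you"};
        System.out.println(countSentenceOccurrences(listA, listB));
        // prints {how=2, the=2, you=1}
    }
}
```

Note that a word from listA that appears in no sentence is simply absent from the resulting map rather than mapped to 0, and that the values are Long, not Integer as asked in the question.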

Answer 2

Score: 0

I suggest creating a hash table of the items in the first list. Then loop through the items in the second list, checking whether each word is in the hash table. When adding the elements of the first list, test whether each one is already there and decide whether you want to keep a count. You could, for instance, store which sentences a word appears in as the value for its key.
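
One possible reading of this suggestion, as a rough loop-based sketch: the class name SentenceIndexLookup and the helper indexSentences are hypothetical names chosen for this example, and storing sentence indices in a Set is just one way to interpret "store which sentence a word is in". The size of each set is then the occurrence count the question asks for.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class SentenceIndexLookup {

    // Hypothetical helper: maps each entry of `words` to the indices of the
    // sentences in which it appears.
    static Map<String, Set<Integer>> indexSentences(String[] words, String[] sentences) {
        Map<String, Set<Integer>> table = new LinkedHashMap<>();
        for (String word : words) {
            // putIfAbsent ignores a word that was already added from the first list
            table.putIfAbsent(word, new TreeSet<>());
        }
        for (int i = 0; i < sentences.length; i++) {
            for (String token : sentences[i].split("\\s+")) {
                Set<Integer> hits = table.get(token);
                if (hits != null) {
                    // a Set records the index once, even if the word repeats in the sentence
                    hits.add(i);
                }
            }
        }
        return table;
    }

    public static void main(String[] args) {
        String[] listA = {"the", "you", "how"};
        String[] listB = {"the dog ate the food", "how is the weather", "how are you"};
        indexSentences(listA, listB).forEach((word, hits) ->
            System.out.println(word + " -> " + hits.size() + " " + hits));
        // prints:
        // the -> 2 [0, 1]
        // you -> 1 [2]
        // how -> 2 [1, 2]
    }
}
```

Unlike the stream version, this keeps every word from the first list in the table, so a word that occurs in no sentence shows up with an empty set (a count of 0).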
