Taking the 10 Strings with the highest values from a HashMap

# Question

I want to save all the words from the titles on a site to a file. Then I want to take the 10 most frequent words and save them to another file.
Saving to the file already works.
But I'm stuck on finding those 10 words. My code only finds the single most frequent word and that's it. There are surely better ways to do this than the one I've written. I'd be really grateful for some tips. I've gone through the most popular topics here, but all of them are about finding the one most frequent word.

```java
List<String> mostRepeatedWords = new ArrayList<>();
Set<Map.Entry<String, Integer>> entrySet = wordsMap.entrySet();
int max = 0;
for (int i = 0; i < entrySet.size(); i++) {
    for (Map.Entry<String, Integer> entry : entrySet) {   // here I'm looking for the word with the highest value in the map
        if (entry.getValue() > max) {
            max = entry.getValue();
        }
    }
    for (Object o : wordsMap.keySet()) {     // here I write this word to a list
        if (wordsMap.get(o).equals(max)) {
            mostRepeatedWords.add(o.toString());
        }
    }
}
```

Edit: here's how I count the words:

```java
while (currentLine != null) {
    String[] words = currentLine.toLowerCase().split(" ");

    for (String word : words) {
        if (!wordsMap.containsKey(word) && word.length() > 3) {
            wordsMap.put(word, 1);
        } else if (word.length() > 3) {
            int value = wordsMap.get(word);
            value++;
            wordsMap.replace(word, value);
        }
    }
    currentLine = reader.readLine();
}
```
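
As a side note on the counting loop above, the `containsKey` / `put` / `replace` sequence can probably be collapsed into a single `Map.merge` call. The following is only a sketch: the class name `WordCounter`, the helper `countWords`, the sample titles, and the choice to split on runs of whitespace are assumptions made for illustration. It keeps the rule of ignoring words of three characters or fewer.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class WordCounter {

    // Hypothetical helper: counts every word longer than 3 characters.
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> wordsMap = new HashMap<>();
        for (String line : lines) {
            // Assumption: split on any run of whitespace instead of a single space.
            for (String word : line.toLowerCase().split("\\s+")) {
                if (word.length() > 3) {
                    // Stores 1 for a new word, otherwise adds 1 to the existing count.
                    wordsMap.merge(word, 1, Integer::sum);
                }
            }
        }
        return wordsMap;
    }

    public static void main(String[] args) {
        // Made-up sample titles, only for demonstration.
        List<String> titles = List.of("Java streams in practice", "Practice with Java maps");
        System.out.println(countWords(titles));
    }
}
```

Because a `HashMap` does not guarantee iteration order, the printed order of the entries may vary.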

# Answer 1
**Score**: 3

Does this do it for you?

First, sort the words (i.e. the keys) of the map by frequency of occurrence in reverse order and keep the first ten.

```java
// needs java.util.*, java.util.Map.Entry and java.util.stream.Collectors
List<String> words = mapOfWords.entrySet().stream()
        .sorted(Entry.comparingByValue(Comparator.reverseOrder()))
        .limit(10)
        .map(Entry::getKey)
        .collect(Collectors.toList());
```

Then use those keys to print the first 10 words in decreasing order of frequency.

```java
for (String word : words) {
    System.out.println(word + " " + mapOfWords.get(word));
}
```
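
Put together with its imports, the stream approach above might look like the sketch below. The class name `TopWordsDemo` and the helper `topN` are hypothetical names chosen for this example, and the sample map is made up; the limit is passed as a parameter instead of being fixed at 10.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.stream.Collectors;

class TopWordsDemo {

    // Hypothetical helper: the n keys with the highest counts, most frequent first.
    static List<String> topN(Map<String, Integer> mapOfWords, int n) {
        return mapOfWords.entrySet().stream()
                .sorted(Entry.comparingByValue(Comparator.reverseOrder()))
                .limit(n)
                .map(Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up counts with distinct values, so the resulting order is unambiguous.
        Map<String, Integer> mapOfWords = Map.of("A", 10, "B", 3, "C", 8, "D", 9);
        System.out.println(topN(mapOfWords, 3)); // prints [A, D, C]
    }
}
```

Note that when several words share the same count, their relative order depends on the map's iteration order, so a secondary comparison on the key would be needed for a fully deterministic result.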

Another, more traditional approach that does not use streams is the following.

Test data:

```java
Map<String, Integer> mapOfWords =
        Map.of("A", 10, "B", 3, "C", 8, "D", 9);
```

Create a list of the map entries:

```java
List<Entry<String, Integer>> mapEntries =
        new ArrayList<>(mapOfWords.entrySet());
```

Define a Comparator to sort the entries by frequency:

```java
Comparator<Entry<String, Integer>> comp = new Comparator<>() {
    @Override
    public int compare(Entry<String, Integer> e1,
            Entry<String, Integer> e2) {
        Objects.requireNonNull(e1);
        Objects.requireNonNull(e2);
        // notice that the order of e2 and e1 is reversed to sort in descending order
        return Integer.compare(e2.getValue(), e1.getValue());
    }
};
```

The above is equivalent to the following comparator, which is built from a method defined in the Map.Entry interface:

```java
Comparator<Entry<String, Integer>> comp =
        Entry.comparingByValue(Comparator.reverseOrder());
```

Now sort the list with either comparator.

```java
mapEntries.sort(comp);
```

Now just print the list of entries. If there are more than 10, you will need to add a limiting counter or use mapEntries.subList(0, 10) as the target of the for loop.

```java
for (Entry<?, ?> e : mapEntries) {
    System.out.println(e);
}
```
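
One way the subList limit mentioned above could be applied is sketched here. The class name `TopEntriesDemo` and the helper `topEntries` are hypothetical; the `Math.min` guard is an addition of this sketch, since `subList(0, 10)` throws an `IndexOutOfBoundsException` when the list holds fewer than 10 entries.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;

class TopEntriesDemo {

    // Hypothetical helper: at most `limit` entries, highest count first.
    static List<Entry<String, Integer>> topEntries(Map<String, Integer> mapOfWords, int limit) {
        List<Entry<String, Integer>> mapEntries = new ArrayList<>(mapOfWords.entrySet());
        Comparator<Entry<String, Integer>> comp =
                Entry.comparingByValue(Comparator.reverseOrder());
        mapEntries.sort(comp);
        // Never ask for more elements than the list actually contains.
        return mapEntries.subList(0, Math.min(limit, mapEntries.size()));
    }

    public static void main(String[] args) {
        Map<String, Integer> mapOfWords = Map.of("A", 10, "B", 3, "C", 8, "D", 9);
        for (Entry<String, Integer> e : topEntries(mapOfWords, 10)) {
            System.out.println(e); // four lines here, because the map has only four entries
        }
    }
}
```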


# Answer 2
**Score**: 0

You could save the most frequent word to an array, then search for the next most frequent word that is not already in that array, and repeat until you have ten words.
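
A minimal sketch of that idea, under a few assumptions: a `List` is used in place of an array, the names `RepeatedMaxDemo` and `topN` are hypothetical, and the sample counts are made up. Each pass scans the whole map for the highest count among the words that have not been picked yet.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class RepeatedMaxDemo {

    // Hypothetical helper: repeatedly picks the most frequent word not chosen so far.
    static List<String> topN(Map<String, Integer> wordsMap, int n) {
        List<String> mostRepeatedWords = new ArrayList<>();
        // Stop after n words, or earlier if the map has fewer than n entries.
        while (mostRepeatedWords.size() < n && mostRepeatedWords.size() < wordsMap.size()) {
            String best = null;
            int max = 0; // reset on every pass, so earlier winners do not block later ones
            for (Map.Entry<String, Integer> entry : wordsMap.entrySet()) {
                boolean alreadyPicked = mostRepeatedWords.contains(entry.getKey());
                if (!alreadyPicked && (best == null || entry.getValue() > max)) {
                    best = entry.getKey();
                    max = entry.getValue();
                }
            }
            mostRepeatedWords.add(best);
        }
        return mostRepeatedWords;
    }

    public static void main(String[] args) {
        Map<String, Integer> wordsMap = Map.of("foo", 2, "bar", 7, "baz", 5,
                                               "doo", 9, "tot", 2, "gee", 12);
        System.out.println(topN(wordsMap, 3)); // prints [gee, doo, bar]
    }
}
```

This rescans the map once per requested word, so it is simple but slower than sorting once when the map is large.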



# Answer 3
**Score**: 0

Assuming you already have your frequency map, which might look something like this:

```java
Map<String, Integer> wordsMap = Map.of("foo", 2,
                                       "bar", 7,
                                       "baz", 5,
                                       "doo", 9,
                                       "tot", 2,
                                       "gee", 12);
```

You could create another map, i.e. a top-ten map (top three in my demo below), by sorting your map by value in reverse order and limiting it to the first ten entries:

```java
Map<String, Integer> topThree = wordsMap.entrySet()
                                        .stream()
                                        .sorted(Collections.reverseOrder(Map.Entry.comparingByValue()))
                                        .limit(3)
                                        .collect(Collectors.toMap(
                                           Map.Entry::getKey, Map.Entry::getValue, (e1, e2) -> e2, LinkedHashMap::new));

System.out.println(topThree);

// {gee=12, doo=9, bar=7}
```
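
Since the question also asks to save the result to another file, an ordered map like `topThree` could be written out roughly as follows. This is a sketch under assumptions: the class `TopWordsWriter`, the helpers `toLines` and `write`, the "word count" line format, and the temporary target file are all invented for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class TopWordsWriter {

    // Hypothetical helper: one "word count" line per entry, in the map's iteration order.
    static List<String> toLines(Map<String, Integer> topWords) {
        return topWords.entrySet().stream()
                .map(e -> e.getKey() + " " + e.getValue())
                .collect(Collectors.toList());
    }

    // Hypothetical helper: writes the lines to the target file (UTF-8 by default).
    static void write(Map<String, Integer> topWords, Path target) throws IOException {
        Files.write(target, toLines(topWords));
    }

    public static void main(String[] args) throws IOException {
        // LinkedHashMap keeps insertion order, like the collected topThree map above.
        Map<String, Integer> topThree = new LinkedHashMap<>();
        topThree.put("gee", 12);
        topThree.put("doo", 9);
        topThree.put("bar", 7);

        // Assumption: a temporary file stands in for the real output file.
        Path target = Files.createTempFile("top-words", ".txt");
        write(topThree, target);
        System.out.println("Wrote " + target);
    }
}
```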

Posted by huangapple on 2020-09-30 22:01:25
Link to this article: https://go.coder-hub.com/64139231.html