Find the most frequent character in text

Question

I need to implement a package with an interface whose methods take a text file and analyze it: count the total number of characters and find the most frequent symbol and word. To find the most frequent character, I loop over each rune in the text, convert it to a string, and use it as a key in a map. The value is a counter that tracks how often that character occurs in the text. I'm stuck on one problem: I can't figure out how to get the key with the highest value in my map. Here's the code:

package textscanner

import (
    "fmt"
    "log"
    "io/ioutil"
    "unicode/utf8"
    "strconv"
)

// Initializing my scanner
type Scanner interface {
    countChar(text string) int

    frequentSym(text string) // Return value is not yet implemented

    Scan()

    Run()
}

/* method counting characters */
func countChar(sc Scanner, text string) int { ... }

func frequentSym(sc Scanner, text string) {
    // Make a map with string keys and integer counts
    symbols := make(map[string]int)

    // Iterate through each rune in text
    for _, sym := range text {
        // Convert the rune to a quoted character literal, e.g. 'a'
        char := strconv.QuoteRune(sym)
        // Increment the counter for this key; a missing key reads as 0
        symbols[char]++
    }
}

So, basically, I need to find the pair with the highest int value and return its string key, which is the most frequent character in the text.

Answer 1

Score: 3

Just iterate over the map:

maxK := ""
maxV := 0
for k, v := range symbols {
    if v > maxV {
        maxV = v
        maxK = k
    }
}
// maxK is the key with the maximum value.
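
As a rough end-to-end sketch of the loop above, the counting and the max search can be combined in one function. The name `mostFrequentRune` is hypothetical, and this version assumes the map is keyed by `rune` rather than by the quoted strings used in the question, so treat it as one possible shape rather than the asker's actual method:

```go
package main

import "fmt"

// mostFrequentRune is a hypothetical helper (not from the original post).
// It counts every rune in text and returns the rune with the highest count
// together with that count. For empty text it returns (0, 0).
func mostFrequentRune(text string) (rune, int) {
	// Assumption: key the map by rune instead of by a quoted string.
	counts := make(map[rune]int)
	for _, r := range text {
		counts[r]++ // a missing key reads as 0, so no existence check is needed
	}

	var maxR rune
	maxN := 0
	for r, n := range counts {
		if n > maxN {
			maxR, maxN = r, n
		}
	}
	return maxR, maxN
}

func main() {
	r, n := mostFrequentRune("hello world")
	fmt.Printf("%q occurs %d times\n", r, n)
}
```

With the input "hello world" the letter l is the only rune that occurs three times, so the result does not depend on map iteration order for this input.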

Answer 2

Score: 1

Expanding on @Ainar-G's answer: if your map could contain multiple keys that occur the same number of times, then @Ainar-G's code could return different results on each run, because Go maps are inherently unordered. In other words, among keys tied for the highest value, whichever one the loop happens to visit first wins, and you cannot know in advance which key that will be. See this as an example.

For the code to be deterministic, you need to handle the case where two keys have the same value. A simple approach is to compare the keys as strings when the values are equal.

maxK := ""
maxV := 0
for k, v := range symbols {
    if v > maxV || (v == maxV && k < maxK) {
        maxV = v
        maxK = k
    }
}
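
To see the tie-break in action, here is a minimal sketch that wraps the loop above in a function and runs it on a text where two characters are tied. The helper names `countSymbols` and `maxKeyDeterministic` are hypothetical, and `countSymbols` assumes plain `string(r)` keys instead of the quoted keys from the question:

```go
package main

import "fmt"

// countSymbols is a hypothetical helper: it builds the symbols map,
// using string(r) as the key (an assumption; the question uses strconv.QuoteRune).
func countSymbols(text string) map[string]int {
	symbols := make(map[string]int)
	for _, r := range text {
		symbols[string(r)]++
	}
	return symbols
}

// maxKeyDeterministic returns the key with the highest count.
// When several keys share that count, the lexicographically smallest key wins,
// so the result no longer depends on map iteration order.
func maxKeyDeterministic(symbols map[string]int) string {
	maxK := ""
	maxV := 0
	for k, v := range symbols {
		if v > maxV || (v == maxV && k < maxK) {
			maxK, maxV = k, v
		}
	}
	return maxK
}

func main() {
	// "a" and "b" both occur twice; the tie-break should always pick "a".
	symbols := countSymbols("abba")
	for i := 0; i < 5; i++ {
		fmt.Println(maxKeyDeterministic(symbols))
	}
}
```

Picking the smallest key is only one possible rule. Any total order on the keys would make the result repeatable, for example preferring the character that appears first in the text.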

huangapple
  • Posted on March 17, 2017 at 20:23:53
  • When reposting, please keep the link to this article: https://go.coder-hub.com/42857479.html