Get Unicode category from rune

Question

I'm looking for a way to get the Unicode category (RangeTable) of a rune in Go. For example, the character "a" maps to the Ll category. The unicode package defines all of the categories (http://golang.org/pkg/unicode/#pkg-variables), but I don't see any way to look up the category for a given rune. Do I need to construct the RangeTable manually from the rune using the appropriate offsets?

Answer 1

Score: 8

The docs for the "unicode" package do not list a function that returns the range tables containing a given rune, but it is not very tricky to build one:

func cat(r rune) (names []string) {
    names = make([]string, 0)
    for name, table := range unicode.Categories {
        if unicode.Is(table, r) {
            names = append(names, name)
        }
    }
    return
}

Answer 2

Score: 0

Here is an alternative version, based on the accepted answer, that returns the two-letter Unicode category of the rune, or "Cn" (unassigned) if no table matches:

// UnicodeCategory returns the Unicode Character Category of the given rune.
func UnicodeCategory(r rune) string {
    for name, table := range unicode.Categories {
        if len(name) == 2 && unicode.Is(table, r) {
            return name
        }
    }
    return "Cn"
}
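
One caveat with filtering on len(name) == 2: it gives a unique answer only if every two-letter key in unicode.Categories is a true general category. Newer Go releases may also list the two-letter grouping "LC" (cased letters), which overlaps Lu, Ll and Lt. Combined with the unspecified map iteration order, a lowercase letter could then come back as either "Ll" or "LC". A minimal sketch that guards against this, assuming "LC" is the only overlapping two-letter key, is shown below; the name generalCategory is hypothetical.

```go
package main

import (
	"fmt"
	"unicode"
)

// generalCategory returns the two-letter general category of r,
// or "Cn" (unassigned) if no table matches.
// (Hypothetical helper name, chosen for this example.)
func generalCategory(r rune) string {
	for name, table := range unicode.Categories {
		// Skip one-letter groupings such as "L", and the "LC" grouping,
		// which overlaps Lu/Ll/Lt where it is present (assumption).
		if len(name) != 2 || name == "LC" {
			continue
		}
		if unicode.Is(table, r) {
			return name
		}
	}
	return "Cn"
}

func main() {
	for _, r := range "aA1 $" {
		fmt.Printf("%q: %s\n", r, generalCategory(r))
	}
}
```

Because the remaining two-letter categories are disjoint, the map iteration order no longer affects the result.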

Posted by huangapple on 2014-09-12 03:26:06
When reposting, please keep the link to this article: https://go.coder-hub.com/25795557.html