Ignore character accents when sorting strings

Question


I'm writing a Go program that takes a list of strings and sorts them into buckets by the first character of each string. However, I want accented characters to be grouped with the unaccented character they most resemble. So, if I have a bucket for the letter A, I want strings that start with Á to be included in it.

Does Go have anything built-in for determining this, or is my best bet to just have a large switch statement with all characters and their accented variations?

Answer 1

Score: 10


Looks like there are some add-on packages for this under golang.org/x/text. Here's an example that sorts with a collator set to ignore accents...

package main

import (
	"fmt"

	"golang.org/x/text/collate"
	"golang.org/x/text/language"
)

func main() {
	strs := []string{"abc", "áab", "aaa"}
	// collate.Loose makes the collator ignore diacritics, case, and width.
	cl := collate.New(language.English, collate.Loose)
	cl.SortStrings(strs) // sorts the slice in place
	fmt.Println(strs)
}

outputs:

[aaa áab abc]

Also, check out the following reference on text normalization:
http://blog.golang.org/normalization

Posted by huangapple on January 3, 2014, 06:27:28