Golang complex fold grüßen
Question
I'm trying to get case folding to be consistent between three languages (C++, Python and Golang) because I need to be able to check if a string matches the one saved no matter the language.
One problematic example is the German word "grüßen", which in uppercase is "GRÜSSEN" (note that 'ß' becomes the two characters 'SS').
- C++ works well using boost::locale (text conversion docs [1]).
- Python 3 also works through str.casefold() (casefold docs [2]).
- However, Golang doesn't seem to have a way to do proper case folding (Go playground example [3]).
Is there some way to do this that I'm missing, or does the bug noted at the end of the unicode package documentation [4] apply to all uses of text conversion in Golang? If so, what are my options for case folding other than writing it in cgo?
[1]: http://www.boost.org/doc/libs/1_63_0/libs/locale/doc/html/conversions.html "boost locale grüßen"
[2]: https://docs.python.org/3/library/stdtypes.html#str.casefold
[3]: https://play.golang.org/p/eYku0fCIpu
[4]: https://golang.org/pkg/unicode/#pkg-note-BUG
Answer 1
Score: 10
Advanced (Unicode-enabled) text processing is not part of the Go stdlib,¹
and exists in the form of a host of ("blessed") third-party packages
under the golang.org/x/text/ umbrella.
As Shawn figured out by himself, one can do
package main

import (
	"fmt"
	"golang.org/x/text/cases"
)

func main() {
	c := cases.Fold() // Caser that applies Unicode case folding
	fmt.Println(c.String("grüßen"))
}
to get "grüssen" back.
¹ That's because whatever ships in the stdlib is subject to the
Go 1 compatibility promise. When Go 1 was released, certain functionality
wasn't available yet, was incomplete, or had APIs still in flux,
so such bits were kept out of the core to let them mature.