Golang: How to correctly parse a UTF-8 string from C
Question
I'm new to Go, so maybe this is obvious.
I have a Go function that I'm exposing to C with go build -buildmode=c-shared and the corresponding //export funcName comment.
(You can see it here: https://github.com/udl/bmatch/blob/master/ext/levenshtein.go#L42)
My conversion currently works like this:

    func distance(s1in, s2in *C.char) int {
        s1 := C.GoString(s1in)
        s2 := C.GoString(s2in)
        // ...
    }

How would I handle UTF-8 input here?
I've seen there is a UTF-8 package, but I don't quite get how it works: https://golang.org/pkg/unicode/utf8/
Thank you!
Answer 1
Score: 6
You don't need to do anything special. C.GoString copies the bytes of the NUL-terminated C string into a Go string unchanged, and UTF-8 is Go's "native" character encoding, so you can use the functions from the utf8 package you mentioned, e.g. utf8.RuneCountInString, to get the number of Unicode runes in a string. Keep in mind that len(s) will still return the number of bytes in the string.
See this post in the official blog or this article for some details.