How to get a single Unicode character from string
Question
I wonder how I can get a Unicode character from a string. For example, if the string is "你好", how can I get the first character "你"?
I found one way elsewhere:
var str = "你好"
runes := []rune(str)
fmt.Println(string(runes[0]))
It does work, but I still have some questions:
- Is there another way to do it?
- Why in Go does str[0] not get a Unicode character from a string, but byte data instead?
Answer 1
Score: 44
First, you may want to read https://blog.golang.org/strings, which answers part of your question.
A string in Go can contain arbitrary bytes. When you write str[i], the result is a byte, and the index always counts bytes.
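To make the byte indexing concrete, here is a minimal sketch using the string from the question. The helper name firstByte is hypothetical and exists only for this illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstByte is a hypothetical helper for this example.
// It returns s[0], which is a single byte; s must be non-empty.
func firstByte(s string) byte {
	return s[0]
}

func main() {
	str := "你好"
	fmt.Println(len(str))                    // 6: len counts bytes, not characters
	fmt.Printf("%#x\n", firstByte(str))      // 0xe4: first byte of the UTF-8 encoding of '你'
	fmt.Println(utf8.RuneCountInString(str)) // 2: number of runes
}
```

Each of the two characters takes three bytes in UTF-8, so str[0] is only the first third of "你".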
Most of the time, strings are encoded in UTF-8 though. You have multiple ways to deal with UTF-8 encoding in a string.
For instance, you can use the for...range statement to iterate over a string rune by rune.
var first rune
for _, c := range str {
	first = c
	break
}
// first now contains the first rune of the string
You can also leverage the unicode/utf8 package. For instance:
r, size := utf8.DecodeRuneInString(str)
// r contains the first rune of the string
// size is the size of the rune in bytes
If the string is encoded in UTF-8, there is no direct way to access the nth rune of the string, because the size of the runes (in bytes) is not constant. If you need this feature, you can easily write your own helper function to do it (with for...range, or with the unicode/utf8 package).
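One possible shape for such a helper, built on for...range, is sketched below. The name nthRune and its (rune, bool) return signature are assumptions made for this example; nothing like it exists in the standard library.

```go
package main

import "fmt"

// nthRune is a hypothetical helper, not a standard library function.
// It returns the n-th rune of s (zero-based) and true, or 0 and false
// when n is negative or s contains fewer than n+1 runes.
func nthRune(s string, n int) (rune, bool) {
	if n < 0 {
		return 0, false
	}
	count := 0
	// range decodes one UTF-8 sequence per iteration; the first
	// loop variable (ignored here) is the byte offset, not the rune index.
	for _, r := range s {
		if count == n {
			return r, true
		}
		count++
	}
	return 0, false
}

func main() {
	r, ok := nthRune("你好", 1)
	fmt.Println(string(r), ok) // 好 true
}
```

The lookup walks the string from the start, so it takes time proportional to n. If you need many random accesses into the same string, converting once with []rune(str) is probably the simpler choice.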
Answer 2
Score: 2
You can use the utf8string package (https://pkg.go.dev/golang.org/x/exp/utf8string):
package main

import "golang.org/x/exp/utf8string"

func main() {
	s := utf8string.NewString("ÄÅàâäåçèéêëìîïü")
	// example 1
	r := s.At(1)
	println(r == 'Å')
	// example 2
	t := s.Slice(1, 3)
	println(t == "Åà")
}
Answer 3
Score: -2
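You can do this:
package main

import "fmt"

func main() {
	str := "cat"
	var s rune
	// Note: i is the byte offset of each rune, not its rune index.
	// The two coincide here only because "cat" is pure ASCII.
	for i, c := range str {
		if i == 1 {
			s = c
		}
	}
	fmt.Printf("%c\n", s) // prints a
}
s is now equal to 'a'.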