Does the conversion from string to rune slice make a copy?
Question
I'm teaching myself Go, coming from a C background.
The code below works as I expect: the first two Printf() calls access bytes, and the last two access code points.
What I'm not clear on is whether this involves any copying of data.
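package main

import "fmt"

var a string

func main() {
	a = "èe"
	fmt.Printf("%d\n", a[0]) // 195: first byte of the UTF-8 encoding of 'è'
	fmt.Printf("%d\n", a[1]) // 168: second byte of the UTF-8 encoding of 'è'
	fmt.Println("")
	fmt.Printf("%d\n", []rune(a)[0]) // 232: code point U+00E8 ('è')
	fmt.Printf("%d\n", []rune(a)[1]) // 101: code point U+0065 ('e')
}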
In other words: does []rune("string") create an array of runes and fill it with the runes corresponding to "string", or does the compiler just figure out how to get runes from the string bytes?
Answer 1
Score: 7
A string is a sequence of bytes (uint8), while a []rune is a slice of int32 values (rune is an alias for int32). The element sizes differ, so it is not possible to turn one into the other without allocating a new array.
Also, strings are immutable in Go but slices are not, so the conversions to []byte and []rune must both copy the string's contents in one way or another.
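To make the copy visible, here is a minimal sketch. The helper name convertAndMutate is made up for illustration and assumes a non-empty input. It converts a string to both slice types, writes to the slices, and returns the original string alongside the modified copies.

```go
package main

import "fmt"

// convertAndMutate is a hypothetical helper (not from the question).
// It converts s to []rune and []byte, mutates both slices, and returns
// the original string plus the two modified copies as strings.
// It assumes s is non-empty; an empty string would panic on indexing.
func convertAndMutate(s string) (orig, fromRunes, fromBytes string) {
	r := []rune(s) // new []rune, filled by decoding the UTF-8 bytes of s
	b := []byte(s) // new []byte holding a copy of the bytes of s

	r[0] = 'E'        // replace the first code point
	b[len(b)-1] = 'X' // replace the last byte

	return s, string(r), string(b)
}

func main() {
	orig, fromRunes, fromBytes := convertAndMutate("èe")
	fmt.Println(orig)      // èe (unchanged)
	fmt.Println(fromRunes) // Ee
	fmt.Println(fromBytes) // èX
}
```

Writing to either slice leaves the string untouched, which would not be the case if the slices shared the string's storage.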
Answer 2
Score: 6
It involves a copy because:
- strings are immutable; if the conversion []rune(s) didn't make a copy, you would be able to index the rune slice and change the string contents
- a string value is a "(possibly empty) sequence of bytes", where byte is an alias of uint8, whereas a rune is "an integer value identifying a Unicode code point" and an alias of int32. The types are not identical and even the lengths may not be the same:
a := "èe"
r := []rune(a)
fmt.Println(len(a)) // 3 (3 bytes)
fmt.Println(len(r)) // 2 (2 Unicode code points)