Does the conversion from string to rune slice make a copy?
Question
I'm teaching myself Go from a C background.
The code below works as I expect (the first two Printf() will access bytes, the last two Printf() will access codepoints).
What I am not clear on is whether this involves any copying of data.
package main
import "fmt"
var a string
func main() {
	a = "èe"
	fmt.Printf("%d\n", a[0])
	fmt.Printf("%d\n", a[1])
	fmt.Println("")
	fmt.Printf("%d\n", []rune(a)[0])
	fmt.Printf("%d\n", []rune(a)[1])
}
In other words:
> Does []rune("string") create an array of runes and fill it with the runes corresponding to "string", or does the compiler just figure out how to get runes from the string bytes?
Answer 1
Score: 7
It is not possible to turn a string (in effect a sequence of uint8 bytes) into a []rune (rune is an alias for int32) without allocating a new array, because the element types have different sizes.
Also, strings are immutable in Go but slices are not, so the conversion to both []byte and []rune must copy the string's bytes in some way or another.
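A minimal sketch of the point above: writing to the converted slice leaves the original string untouched, which is what you would expect if the conversion copies. The helper name mutateCopy is hypothetical and exists only for this illustration; the literal "\u00e8e" is the question's "èe" with è written as an escape.

```go
package main

import "fmt"

// mutateCopy (hypothetical helper) converts s to a rune slice, overwrites
// the first rune, and returns the original string alongside the modified
// slice converted back to a string.
func mutateCopy(s string) (string, string) {
	r := []rune(s) // new []rune holding the decoded code points of s
	if len(r) > 0 {
		r[0] = 'X' // changes the slice only, not s
	}
	return s, string(r)
}

func main() {
	orig, changed := mutateCopy("\u00e8e") // "èe"
	fmt.Println(orig)    // èe
	fmt.Println(changed) // Xe
}
```

The same experiment with []byte(s) behaves the same way: the slice can be modified freely while the string keeps its contents.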
Answer 2
Score: 6
It involves a copy because:
- strings are immutable; if the conversion []rune(s) didn't make a copy, you would be able to index the rune slice and change the string contents
- a string value is a "(possibly empty) sequence of bytes", where byte is an alias of uint8, whereas a rune is "an integer value identifying a Unicode code point" and an alias of int32. The types are not identical and even the lengths may not be the same:
    a = "èe"
    r := []rune(a)
    fmt.Println(len(a)) // 3 (3 bytes)
    fmt.Println(len(r)) // 2 (2 Unicode code points)
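If the goal is only to read code points, the string can be decoded in place instead of being converted to []rune first. The sketch below is one way to do that, using a range loop over the string and utf8.DecodeRuneInString from the standard library; the helper names firstRune and runeOffsets are hypothetical and not part of the original answers.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstRune (hypothetical helper) decodes the first code point of s
// directly from its bytes and reports how many bytes it occupies.
func firstRune(s string) (rune, int) {
	return utf8.DecodeRuneInString(s)
}

// runeOffsets (hypothetical helper) returns the byte offset at which
// each code point of s starts. Ranging over a string decodes UTF-8
// one code point at a time.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	a := "\u00e8e" // "èe": è takes 2 bytes in UTF-8, e takes 1

	r, size := firstRune(a)
	fmt.Println(r, size) // 232 2

	fmt.Println(runeOffsets(a)) // [0 2]

	fmt.Println(utf8.RuneCountInString(a)) // 2
}
```

Random access by code point index still requires either the []rune copy or a scan from the start of the string, since UTF-8 is a variable-width encoding.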