2014年1月8日 09:46:33go评论120阅读模式

英文:

Does Golang do any conversion when casting a byte slice to a string?

问题

Golang在将字节切片转换为字符串时是否进行任何转换或尝试解释字节？我刚刚尝试了一个包含空字节的字节切片，看起来它仍然保持字符串的原样。

var test []byte
test = append(test, 'a')
test = append(test, 'b')
test = append(test, 0)
test = append(test, 'd')
fmt.Println(test[2] == 0) // OK

但是对于包含无效Unicode点或UTF-8编码的字符串，转换会失败或数据会损坏吗？

英文:

Does Golang do any conversion or somehow try to interpret the bytes when casting a byte slice to a string? I've just tried with a byte slice containing a null byte and it looks like it still keep the string as it is.

var test []byte
test = append(test, &#39;a&#39;)
test = append(test, &#39;b&#39;)
test = append(test, 0)
test = append(test, &#39;d&#39;)
fmt.Println(test[2] == 0) // OK

But how about strings with invalid unicode points or UTF-8 encoding. Could the casting fail or the data be corrupted?

答案1

得分: 10

《Go编程语言规范》

字符串类型

字符串类型表示一组字符串值。字符串值是一个（可能为空的）字节序列。

转换

与字符串类型之间的转换

将字节切片转换为字符串类型会产生一个字符串，其连续的字节是切片的元素。

string([]byte{'h', 'e', 'l', 'l', '\xc3', '\xb8'})   // "hellø"
string([]byte{})                                     // ""
string([]byte(nil))                                  // ""
type MyBytes []byte
string(MyBytes{'h', 'e', 'l', 'l', '\xc3', '\xb8'})  // "hellø"

将字符串类型的值转换为字节切片类型会产生一个切片，其连续的元素是字符串的字节。

[]byte("hellø")   // []byte{'h', 'e', 'l', 'l', '\xc3', '\xb8'}
[]byte("")        // []byte{}
MyBytes("hellø")  // []byte{'h', 'e', 'l', 'l', '\xc3', '\xb8'}

字符串值是一个（可能为空的）字节序列。字符串值可能表示以UTF-8编码的Unicode字符，也可能不表示。在从字节切片到字符串的转换过程中，字节不会被解释，从字符串到字节切片的转换也是如此。因此，字节不会被更改，转换也不会失败。

参考链接：
1: http://golang.org/ref/spec
2: http://golang.org/ref/spec#String_types
3: http://golang.org/ref/spec#Conversions

英文:

> The Go Programming Language Specification
>
> String types
>
> A string type represents the set of string values. A string value is a
> (possibly empty) sequence of bytes.
>
> Conversions
>
> Conversions to and from a string type
>
> Converting a slice of bytes to a string type yields a string whose
> successive bytes are the elements of the slice.
>
> string([]byte{'h', 'e', 'l', 'l', '\xc3', '\xb8'}) // "hellø"
> string([]byte{}) // ""
> string([]byte(nil)) // ""
>
> type MyBytes []byte
> string(MyBytes{'h', 'e', 'l', 'l', '\xc3', '\xb8'}) // "hellø"
>
> Converting a value of a string type to a slice of bytes type yields a
> slice whose successive elements are the bytes of the string.
>
> []byte("hellø") // []byte{'h', 'e', 'l', 'l', '\xc3', '\xb8'}
> []byte("") // []byte{}
>
> MyBytes("hellø") // []byte{'h', 'e', 'l', 'l', '\xc3', '\xb8'}

A string value is a (possibly empty) sequence of bytes. A string value may or may not represent Unicode characters encoded in UTF-8. There is no interpretation of the bytes during the conversion from byte slice to string nor from string to byte slice. Therefore, the bytes will not be changed and the conversions will not fail.

答案2

得分: 5

不，类型转换不会失败。以下是一个示例，展示了这一点（在Go Playground中运行）：

b := []byte{0x80}
s := string(b)
fmt.Println(s)
fmt.Println([]byte(s))
for _, c := range s {
    fmt.Println(c)
}

这将打印出：

�
[128]
65533

请注意，根据Go规范，在无效的UTF-8序列上进行迭代是被明确定义的：

对于字符串值，"range"子句从字节索引0开始迭代字符串中的Unicode代码点。在后续的迭代中，索引值将是字符串中连续的UTF-8编码代码点的第一个字节的索引，第二个值（类型为rune）将是相应代码点的值。如果迭代遇到无效的UTF-8序列，则第二个值将为0xFFFD，即Unicode替换字符，并且下一次迭代将在字符串中前进一个字节。

英文:

No, the casting can't fail. Here's an example showing this (run in the Go Playground):

b := []byte{0x80}
s := string(b)
fmt.Println(s)
fmt.Println([]byte(s))
for _, c := range s {
	fmt.Println(c)
}

This prints:

�
[128]
65533

Note that ranging over invalid UTF-8 is well defined according to the Go spec:

> For a string value, the "range" clause iterates over the Unicode code
> points in the string starting at byte index 0. On successive
> iterations, the index value will be the index of the first byte of
> successive UTF-8-encoded code points in the string, and the second
> value, of type rune, will be the value of the corresponding code
> point. If the iteration encounters an invalid UTF-8 sequence, the
> second value will be 0xFFFD, the Unicode replacement character, and
> the next iteration will advance a single byte in the string.

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

Does Golang do any conversion when casting a byte slice to a string?

问题

答案1

答案2

在构建日期时，使用整数作为月份。

“Memory used”指标：Go工具pprof与docker stats的比较。

检测变量的状态变化

如何加速Google App Engine Go单元测试？

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。