How can I convert a null-terminated string in a byte buffer to a string in Go?

Question

This code:

    label := string([]byte{97, 98, 99, 0, 0, 0, 0})
    fmt.Printf("%s\n", label)

prints this (^@ is the null byte):

    $ go run test.go
    abc^@^@^@^@
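To see what the string really contains, it can help to print its length and its quoted form. The following is a minimal sketch; the helper name describe is hypothetical and not part of the original question.

```go
package main

import "fmt"

// describe (hypothetical helper) returns the byte length of s and its
// quoted form, so invisible bytes such as NUL show up as escape sequences.
func describe(s string) string {
	return fmt.Sprintf("%d %q", len(s), s)
}

func main() {
	label := string([]byte{97, 98, 99, 0, 0, 0, 0})
	fmt.Println(describe(label)) // 7 "abc\x00\x00\x00\x00"
}
```

The conversion string([]byte{...}) copies every byte, including the zeroes, so the length is 7 rather than 3.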

Answer 1

Score: 16

Go's syscall package contains an unexported function that finds the first null byte and returns its index, which is also the length of the string before it. I'm assuming it's called clen for "C length".

Sorry I'm a year late with this answer, but I think it's a lot simpler than the other two (no unnecessary imports, etc.).

    // clen returns the index of the first zero byte in n,
    // or len(n) if n contains no zero byte.
    func clen(n []byte) int {
        for i := 0; i < len(n); i++ {
            if n[i] == 0 {
                return i
            }
        }
        return len(n)
    }

So,

    label := []byte{97, 98, 99, 0, 0, 0, 0}
    s := label[:clen(label)]
    fmt.Println(string(s))

This sets s to the slice of label from the beginning up to, but not including, index clen(label).

The result is abc, with a length of 3.
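As a rough check of the edge cases, the sketch below runs the same clen logic (rewritten with range) over a few inputs. The input list is an assumption added for illustration, not part of the original answer.

```go
package main

import "fmt"

// clen returns the index of the first zero byte in n,
// or len(n) if n contains no zero byte.
func clen(n []byte) int {
	for i, b := range n {
		if b == 0 {
			return i
		}
	}
	return len(n)
}

func main() {
	inputs := [][]byte{
		{97, 98, 99, 0, 0, 0, 0},   // zero padding after the terminator
		{97, 98, 99, 0, 99, 99, 0}, // other bytes after the terminator
		{97, 98, 99},               // no terminator at all
		{},                         // empty buffer
	}
	for _, in := range inputs {
		s := string(in[:clen(in)])
		fmt.Printf("%d %q\n", len(s), s)
	}
}
```

Because clen falls back to len(n), a buffer without a terminator is returned whole instead of causing an out-of-range slice.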

Answer 2

Score: 12

Note that an approach that only trims trailing zero bytes works only when nothing but a run of zeroes follows the null terminator; a proper C-style null-terminated string ends at the first \0 even if it's followed by garbage. For example, []byte{97,98,99,0,99,99,0} should be parsed as abc, not abc^@cc.

To parse this properly, use strings.Index, as follows, to find the first \0 and use it to slice the original byte slice:

    package main

    import (
        "fmt"
        "os"
        "strings"
    )

    func main() {
        label := []byte{97, 98, 99, 0, 99, 99, 0}
        // Find the first null byte; everything after it is ignored.
        nullIndex := strings.Index(string(label), "\x00")
        if nullIndex < 0 {
            fmt.Println("Buffer did not hold a null-terminated string")
            os.Exit(1)
        }
        fmt.Println(string(label[:nullIndex]))
    }

EDIT: Was printing the shortened version as a []byte instead of as a string. Thanks to @serbaut for the catch.

EDIT 2: Was not handling the error case of a buffer without a null terminator. Thanks to @snap for the catch.
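The same idea can be packaged as a function that reports the missing-terminator case to the caller instead of exiting. This is only a sketch: the name parseCString is hypothetical, and it swaps in bytes.IndexByte so the whole buffer is not converted to a string first.

```go
package main

import (
	"bytes"
	"fmt"
)

// parseCString (hypothetical helper) returns the bytes before the first
// zero byte as a string. The boolean is false when buf has no zero byte.
func parseCString(buf []byte) (string, bool) {
	i := bytes.IndexByte(buf, 0)
	if i < 0 {
		return "", false
	}
	return string(buf[:i]), true
}

func main() {
	buffers := [][]byte{
		{97, 98, 99, 0, 99, 99, 0}, // terminated, followed by garbage
		{97, 98, 99},               // not terminated
	}
	for _, buf := range buffers {
		if s, ok := parseCString(buf); ok {
			fmt.Printf("%q\n", s)
		} else {
			fmt.Println("buffer did not hold a null-terminated string")
		}
	}
}
```

Returning a boolean leaves it to the caller to decide whether an unterminated buffer is an error.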

Answer 3

Score: 1

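You can use the golang.org/x/sys package (the windows subpackage shown here builds only on Windows; golang.org/x/sys/unix provides a function of the same name for Unix-like systems):

    package main

    import "golang.org/x/sys/windows"

    func main() {
        b := []byte{97, 98, 99, 0, 0, 0, 0}
        s := windows.ByteSliceToString(b)
        println(s == "abc")
    }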

Or you can just implement it yourself:

    package main

    import "bytes"

    func byteSliceToString(s []byte) string {
        n := bytes.IndexByte(s, 0)
        if n >= 0 {
            s = s[:n]
        }
        return string(s)
    }

    func main() {
        b := []byte{97, 98, 99, 0, 0, 0, 0}
        s := byteSliceToString(b)
        println(s == "abc")
    }
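Null-padded names often arrive as fixed-size array fields rather than slices. The sketch below assumes a hypothetical record struct with an 8-byte Name field and shows that slicing the array with [:] is enough to reuse the helper above.

```go
package main

import (
	"bytes"
	"fmt"
)

// record is a hypothetical struct with a fixed-size, NUL-padded name field.
type record struct {
	Name [8]byte
}

// byteSliceToString returns the bytes before the first zero byte,
// or the whole slice if it contains no zero byte.
func byteSliceToString(s []byte) string {
	if n := bytes.IndexByte(s, 0); n >= 0 {
		s = s[:n]
	}
	return string(s)
}

func main() {
	var r record
	copy(r.Name[:], "abc") // the remaining five bytes stay zero

	// An array must be sliced before it can be passed as []byte.
	fmt.Println(byteSliceToString(r.Name[:]) == "abc") // true
}
```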

Answer 4

Score: 1

1. strings.TrimRight (not strings.TrimSpace)

This trims trailing '\0' bytes, but it can't handle bytes like "abc\x00def\x00".

I can't edit @orelli's answer, so I'm writing it here:

    package main

    import (
        "fmt"
        "strings"
    )

    func main() {
        label := string([]byte{97, 98, 99, 0, 0, 0, 0})
        s1 := strings.TrimSpace(label)
        fmt.Println(len(s1), s1)
        s2 := strings.TrimRight(label, "\x00")
        fmt.Println(len(s2), s2)
    }

Output:

    7 abc????
    3 abc

Here ? stands for '\0', which can't be displayed.

So strings.TrimSpace can't trim '\0', but strings.TrimRight with "\x00" can.

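2. bytes.IndexByte

This searches for the first '\0'. It is safe for UTF-8 text, because the byte 0x00 never occurs inside a multi-byte UTF-8 sequence; it would not work for encodings such as UTF-16, where zero bytes appear inside ordinary characters.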

    package main

    import (
        "bytes"
        "fmt"
        "strings"
    )

    func main() {
        b_arr := []byte{97, 98, 99, 0, 100, 0, 0}
        label := string(b_arr)
        s1 := strings.TrimSpace(label)
        fmt.Println(len(s1), s1) // 7 abc?d??
        s2 := strings.TrimRight(label, "\x00")
        fmt.Println(len(s2), s2) // 5 abc?d
        n := bytes.IndexByte([]byte(label), 0)
        fmt.Println(n, label[:n]) // 3 abc
        s_arr := b_arr[:bytes.IndexByte(b_arr, 0)]
        fmt.Println(len(s_arr), string(s_arr)) // 3 abc
    }

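The following are equivalent ways to find the index, as long as b_arr contains a '\0' (if it doesn't, n1 and n2 are -1, while the loop leaves n3 at the last index):

    n1 := bytes.IndexByte(b_arr, 0)
    n2 := bytes.Index(b_arr, []byte{0})
    n3, c := 0, byte(0)
    for n3, c = range b_arr {
        if c == 0 {
            break
        }
    }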
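A hand-written loop can be made to agree with bytes.IndexByte in every case by returning -1 when no zero byte is found. The sketch below uses a hypothetical function name, indexNul, and a UTF-8 sample buffer chosen for illustration.

```go
package main

import (
	"bytes"
	"fmt"
)

// indexNul (hypothetical helper) mirrors bytes.IndexByte(b, 0),
// including the -1 result when b contains no zero byte.
func indexNul(b []byte) int {
	for i, c := range b {
		if c == 0 {
			return i
		}
	}
	return -1
}

func main() {
	// "é" and "ö" each take two bytes in UTF-8; neither byte is zero.
	buf := []byte("héllo\x00wörld\x00")
	n := bytes.IndexByte(buf, 0)
	fmt.Println(n == indexNul(buf), string(buf[:n])) // true héllo

	fmt.Println(indexNul([]byte("abc"))) // -1
}
```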

Answer 5

Score: 0

You can use bytes.SplitN and have it return the first subslice:

    import (
        "bytes"
    )

    func bytesToStr(in []byte) string {
        str := bytes.SplitN(in, []byte{0}, 2)[0]
        return string(str)
    }

In Go 1.18+, you can also use bytes.Cut:

    func bytesToStr(in []byte) string {
        str, _, _ := bytes.Cut(in, []byte{0})
        return string(str)
    }
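The third result of bytes.Cut reports whether the separator was found, so it can double as a check for a missing terminator. This is a minimal sketch assuming Go 1.18 or later; the name cutCString is hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
)

// cutCString (hypothetical helper) returns the bytes before the first zero
// byte as a string, plus whether a zero byte was present at all.
func cutCString(in []byte) (s string, terminated bool) {
	before, _, found := bytes.Cut(in, []byte{0})
	return string(before), found
}

func main() {
	fmt.Println(cutCString([]byte{97, 98, 99, 0, 99, 99, 0})) // abc true
	fmt.Println(cutCString([]byte{97, 98, 99}))               // abc false
}
```

When no zero byte is present, bytes.Cut returns the whole input as the first result, so the string is still usable even though terminated is false.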

huangapple
  • Posted on September 11, 2012, 05:30:56
  • When reposting, please keep the link to this article: https://go.coder-hub.com/12359777.html