Removing NUL characters from bytes

Question

To teach myself Go I'm building a simple server that takes some input, does some processing, and sends output back to the client (that includes the original input).

The input can vary in length from around 5 - 13 characters + endlines and whatever other guff the client sends.

The input is read into a byte array and then converted to a string for some processing. Another string is appended to this string and the whole thing is converted back into a byte array to get sent back to the client.

The problem is that the input is padded with a bunch of NUL characters, and I'm not sure how to get rid of them.

So I could loop through the array and when I come to a nul character, note the length (n), create a new byte array of that length, and copy the first n characters over to the new byte array and use that. Is that the best way, or is there something to make this easier for me?

Some stripped down code:

data := make([]byte, 16)
c.Read(data)

s := strings.Replace(string(data[:]), "\n", "", -1)
s = strings.Replace(s, "\r", "", -1)
s += "some other string"
response := []byte(s)
c.Write(response)
c.Close()

Also if I'm doing anything else obviously stupid here it would be nice to know.

Answer 1

Score: 75

In package "bytes", func Trim(s []byte, cutset string) []byte is your friend:

> Trim returns a subslice of s by slicing off all leading and trailing UTF-8-encoded Unicode code points contained in cutset.

// Remove leading and trailing NUL bytes from 'b'
b = bytes.Trim(b, "\x00")
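
For illustration, here is a minimal runnable sketch of that call. The helper name `trimNUL` and the sample buffers are my own, not part of the answer. It also shows one thing worth keeping in mind: `bytes.Trim` only strips NUL bytes at the two ends of the slice, so any NUL bytes in the middle are left in place.

```go
package main

import (
	"bytes"
	"fmt"
)

// trimNUL is a hypothetical helper: it strips leading and trailing NUL
// bytes and returns a subslice of b (no copy is made).
func trimNUL(b []byte) []byte {
	return bytes.Trim(b, "\x00")
}

func main() {
	// Simulate a 16-byte buffer that was only partly filled by a read.
	buf := make([]byte, 16)
	copy(buf, "hello\r\n")

	fmt.Printf("%q\n", trimNUL(buf)) // "hello\r\n"

	// Interior NUL bytes are left alone; only the ends are trimmed.
	fmt.Printf("%q\n", trimNUL([]byte("a\x00b\x00\x00"))) // "a\x00b"
}
```

If only the padding at the end matters, `bytes.TrimRight(b, "\x00")` would do the same job without touching the front of the slice.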

Answer 2

Score: 6

Your approach sounds basically right. Some remarks:

  1. When you have found the index of the first nul byte in data, you don't need to copy, just truncate the slice: data[:idx].

  2. bytes.Index should be able to find that index for you.

  3. There is also bytes.Replace so you don't need to convert to string.
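
The three remarks above could be put together roughly like this. It is only a sketch: `cutAtNUL` is a made-up helper name, and I used `bytes.IndexByte` (a single-byte variant of `bytes.Index`) since we are looking for exactly one byte.

```go
package main

import (
	"bytes"
	"fmt"
)

// cutAtNUL is a hypothetical helper: it returns data up to, but not
// including, the first NUL byte. The result shares memory with data.
func cutAtNUL(data []byte) []byte {
	if idx := bytes.IndexByte(data, 0); idx >= 0 {
		return data[:idx]
	}
	return data // no NUL byte found
}

func main() {
	// Simulate a 16-byte buffer that was only partly filled by a read.
	data := make([]byte, 16)
	copy(data, "hello\r\n")

	data = cutAtNUL(data)
	// Strip line endings on the byte slice directly, without converting
	// to a string and back.
	data = bytes.Replace(data, []byte("\r"), nil, -1)
	data = bytes.Replace(data, []byte("\n"), nil, -1)

	fmt.Printf("%q\n", data) // "hello"
}
```

Truncating at the first NUL byte assumes the client never sends a NUL byte as part of a real message.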

Answer 3

Score: 4

The io.Reader documentation says:

> Read reads up to len(p) bytes into p. It returns the number of bytes read (0 <= n <= len(p)) and any error encountered.

If the call to Read in the application does not read 16 bytes, then data will have trailing zero bytes. Use the number of bytes read to trim the zero bytes from the buffer.

data := make([]byte, 16)
n, err := c.Read(data)
if err != nil {
   // handle error
}
data = data[:n]

There's another issue. There's no guarantee that Read slurps up all of the "message" sent by the peer. The application may need to call Read more than once to get the complete message.

You mention endlines in the question. If the message from the client is terminated by a newline, then use bufio.Scanner to read lines from the connection:

 s := bufio.NewScanner(c)
 if s.Scan() {
     data = s.Bytes() // data is next line, not including end lines, etc.
 }
 if s.Err() != nil {
     // handle error
 } 

Answer 4

Score: 0

You could utilize the return value of Read:

package main
import "strings"

func main() {
   r, b := strings.NewReader("north east south west"), make([]byte, 16)
   n, e := r.Read(b)
   if e != nil {
      panic(e)
   }
   b = b[:n]
   println(string(b) == "north east south")
}

https://golang.org/pkg/io#Reader

huangapple
  • Posted on March 15, 2013 at 19:30:44
  • When reposting, please keep the link to this article: https://go.coder-hub.com/15431283.html