Golang: bytes.Contains() not working

Question

I have a strange problem with the function bytes.Contains(b, subslice []byte) bool. It doesn't find characters in a byte slice that was received via (c *IPConn) Read(b []byte) (int, error). The application is a simple server.

So I have a byte slice, received by the server into the variable buf:

buf := make([]byte, 1024)
Len, err := c.conn.Read(buf)
// below received content in buf
//{"abc":[{"b":5,"bca":14,"xyz":0}]}{"abc":[{"b":7,"hjk":14,"qwe":0}]}

Now I want to use the code below to find the }{ characters in buf:

if bytes.Contains(buf, []byte(`}{`) != false {
    fmt.Printf("I got you")
}

But the function always returns false. Why?

I did an experiment in my program, as below:

worker := []byte(`{"abc":[{"b":5,"bca":14,"xyz":0}]}{"abc":[{"b":7,"hjk":14,"qwe":0}]}`)

// try find }{

if bytes.Contains(worker, []byte(`}{`) != false {
    fmt.Printf("I got you")
}

This works correctly! I do not understand this, because it follows that the contents must differ between the data received via the server and the data written directly into the program.

Answer 1

Score: 10

Do you actually check err and Len after c.conn.Read(buf) finishes?

The chief flaw in your program (as presented) is that you search all of buf, while a read operation on your socket is free to return successfully after receiving any number of bytes between 1 and 1024, and to return with an error after receiving any number of bytes between 0 and 1024.

So, you must do two things:

  • Check for error;
  • To access the actual data available at the beginning of the buffer after the read operation ends you have to use the actual length of data, Len.

To do the latter, you usually construct a new slice:

data := buf[:Len]

And then use the data variable:

if bytes.Contains(data, []byte("}{")) {
   ...
}

If you don't do this, you might easily access stale data in your buffer — that is, the data left there from the previous call to c.conn.Read(buf).

If you think about the situation a bit more, you'll see that nothing guarantees that the next call to Read() on your socket will bring the }{ sequence into the buffer, so you have to be prepared to accumulate your data. That is:

  1. Each call to Read() should add its Len to the count of bytes in the buffer that your code considers.

    This means that if the Nth read operation failed to provide the data you're looking for, the (N+1)th operation must put its bytes right after the last byte of the previous read operation; in Go, this typically means constructing another slice for that next read operation.

  2. You should use the total current number of accumulated bytes to search for "}{".

Please consider starting with this book to grasp the basics of network programming (with Go specifics).


As you can see, properly dealing with this task looks complicated.
So why not let Go do buffering itself?

You could restate your algorithm like this:

  1. Read the input data until a } character is found. Accumulate this data.

  2. Once } is found, read the next character and if it's a {, we've found the spot we're interested in.

    Otherwise return to step (1).

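This is doable using bytes.Buffer and its methods (bufio.Reader, which can wrap the connection directly, provides the same three):

  • ReadBytes(delim byte): for reading up to a } byte.
  • ReadByte(): for reading a single byte to check if a { follows.
  • UnreadByte(): for putting that byte back into the buffer if it's not a { following }. It takes no argument and unreads the byte most recently read.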

Now let's look at your problem from a more general perspective.
The data you've presented in your example looks like a series of JSON objects to me. So why apply a low-tech approach to finding the boundaries between those objects, instead of just using a JSON decoder to decode your data right away, or at least to skip properly over objects in the stream?

Answer 2

Score: 2

Your code has some problems, such as missing brackets. This seems to work:

package main

import (
	"bytes"
	"fmt"
)

const data = `{"abc":[{"b":5,"bca":14,"xyz":0}]}{"abc":[{"b":7,"hjk":14,"qwe":0}]}`

func main() {

	buf := []byte(data)
	fmt.Printf("buf = %s\n", string(buf))

	if bytes.Contains(buf, []byte("}{")) {
		fmt.Printf("I got you\n")
	}

}

There may be an encoding problem when receiving the data in your connected application. That's a tricky one to show, and I have on occasion resorted to printing the hex values of the received data to really see what came across the wire.

EDIT:

Try printing out the received data like this:

for _, b := range buf {
	fmt.Printf("%X ", b)
}

Then compare it with the test data to see if there are differences. As you say, this is the only place it could go wrong.
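If comparing the hex output by eye gets tedious, a small helper can point at the first differing byte. This is only a sketch: firstDiff is a hypothetical name, and the "received" bytes below are made up to show what a stray NUL byte would look like:

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// firstDiff returns the index of the first byte at which a and b differ.
// If one is a prefix of the other, it returns the length of the shorter one.
// It returns -1 if the two slices are identical.
func firstDiff(a, b []byte) int {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	for i := 0; i < n; i++ {
		if a[i] != b[i] {
			return i
		}
	}
	if len(a) != len(b) {
		return n
	}
	return -1
}

func main() {
	want := []byte("}{")
	got := []byte("}\x00{") // made-up received bytes, not real captured data

	fmt.Print(hex.Dump(got))          // offset, hex and ASCII columns
	fmt.Printf("% x\n", got)          // prints "7d 00 7b"
	fmt.Println(firstDiff(got, want)) // prints "1"
}
```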

huangapple
  • Posted on 2015-09-22 19:02:18
  • When reposting, please keep the link to this article: https://go.coder-hub.com/32715166.html