How to save, and then serve again, data of type io.Reader?

Question

I would like to use gocal to parse, several times, data that I retrieve through an HTTP call. Since I want to avoid making the call for each parse, I would like to save this data and reuse it.

The Body I get from http.Get is of type io.ReadCloser. The gocal parser requires an io.Reader, so this works.

Since I can read Body only once, I can save it with body, _ := io.ReadAll(get.Body), but then I do not know how to serve the []byte back as an io.Reader (to the gocal parser, several times, to account for different parsing conditions).

Answer 1

Score: 2

As you have figured out, http.Response.Body is exposed as an io.ReadCloser, which can be passed anywhere an io.Reader is expected. This reader is not reusable because it is connected directly to the underlying connection* (typically TCP, or any other stream-like reader under the net package).
Once you have read the bytes out of the connection, they are gone: a later read only returns whatever new bytes arrive.

To save the response, you do need to drain the body first and store the result in a variable:

body, _ := io.ReadAll(get.Body)

To reuse that slice of bytes many times in Go, the standard library provides bytes.NewReader, which returns a *bytes.Reader: an io.Reader over an in-memory byte slice.

This reader offers the Reset([]byte) method, which points it at a byte slice and rewinds it to the start.

bytes.Reader.Reset is very useful for reading the same byte slice multiple times without creating a new reader. In comparison, bytes.NewReader returns a new *bytes.Reader every time it is called, which may cost an allocation.
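
To illustrate the rewind, here is a small sketch that reads the same slice twice through one bytes.Reader. The helper name readTwice is made up for this example; only bytes.NewReader, Reset, and io.ReadAll come from the standard library.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// readTwice reads the same byte slice two times through a single
// bytes.Reader, calling Reset between the two passes.
func readTwice(data []byte) (first, second string, err error) {
	r := bytes.NewReader(data)

	b, err := io.ReadAll(r)
	if err != nil {
		return "", "", err
	}
	first = string(b)

	// r is now exhausted; Reset makes it read data from the start again.
	r.Reset(data)

	b, err = io.ReadAll(r)
	if err != nil {
		return "", "", err
	}
	return first, string(b), nil
}

func main() {
	first, second, err := readTwice([]byte("BEGIN:VCALENDAR"))
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Println(first == second) // prints "true"
}
```

Since bytes.Reader also implements io.Seeker, calling r.Seek(0, io.SeekStart) should be an equivalent way to rewind when the slice itself does not change.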

Finally, between two consecutive calls to c.Parse, you should reset the reader with the byte slice you collected previously.

For example:

buf := bytes.NewReader(body)
// initialize the parser with buf, e.g. c := gocal.NewParser(buf)
c.Parse()
// process the result

// reset buf, re-initialize the parser with the new conditions, and parse again
buf.Reset(body)
c.Parse()

You can try this version on the playground (https://play.golang.org/p/YaVtCTZHZEP). It uses strings.NewReader instead of bytes.NewReader, but the interface and behavior are similar.

  • On the note marked * above: it is not obvious, but the general principle is that the transport reads the headers and leaves the body untouched unless you consume it. See also that.

huangapple
  • This article was posted on August 14, 2021, 23:02:57
  • When reposting, please keep the link to this article: https://go.coder-hub.com/68784480.html