What is the smallest number of times I can copy the data in order to return the contents of an io.Reader as a string pointer?


Question

A coworker implemented a function that makes an HTTP call and returns the response body as a string. Simplifying a bit for brevity (no, we're not really ignoring all the errors):

func getStuff(id string) string {
    response, _ := http.Get(fmt.Sprintf("/some/url/%s", id))
    body, _ := ioutil.ReadAll(response.Body)
    return string(body)
}

The response is typically fairly large, so I want to avoid unnecessary copying. As I understand it, as written, we are making three copies of the response data:

  1. io.ReadAll copies the data from the incoming HTTP connection to a byte slice.
  2. string(body) copies the byte slice into a string.
  3. return makes a new copy of the string for use in the calling function.

So, first of all, do I understand the current state correctly?

The easy first step is to return a pointer:

response, _ := http.Get(fmt.Sprintf("/some/url/%s", id))
body, _ := ioutil.ReadAll(response.Body)
result := string(body)
return &result

That avoids the third copy. Cool. But I'm still making two copies of the data, and I'd like to make just one.

I could have him change the return type to *[]byte, and then we can just return &body. But then all of the callers would need to convert the result to string themselves, and then all I've accomplished is to spread the logic that makes the second copy around to multiple other places instead of keeping it consolidated here.

I could use strings.Builder and io.Copy:

builder := new(strings.Builder)
_, _ = io.Copy(builder, response.Body)
result := builder.String()
return &result

And that might be a tiny bit more efficient (I don't really know; is it?), but I still end up with two copies of the data.

Is it possible to do this with just a single copy of the data?

I think it's not; just wondering if I'm wrong!

Answer 1

Score: 3

Copying a string copies only the string header, which is two words: a pointer to the array holding the string data, and the length. It does not copy the string contents. Returning a string from a function therefore does not copy the string data.

If you are passing that string to something like JSON unmarshaling, you can return the []byte, or even the reader from the body, and process that directly. If you need the result as a string, then two copies is the best you can do: one to read the data from the body, and a second to convert it into a string.

huangapple
  • Posted on April 22, 2022 at 03:44:53
  • When reposting, please keep the link to this article: https://go.coder-hub.com/71960114.html