Multipart form uploads + memory leaks in golang?

Question

I have the following server code:

package main

import (
  "fmt"
  "net/http"
)

func handler(w http.ResponseWriter, r *http.Request) {
  file, _, err := r.FormFile("file")
  if err != nil {
    fmt.Fprintln(w, err)
    return
  }
  defer file.Close()

  return
}

func main() {
  http.ListenAndServe(":8081", http.HandlerFunc(handler))
}

I run it and then call it with:

curl -i -F "file=@./large-file" --form hello=world http://localhost:8081/

With large-file at about 80 MB, this seems to produce some form of memory leak in Go 1.4.2 on darwin/amd64 and linux/amd64.

When I hook up pprof after calling the service a few times, I see bytes.makeSlice using 96 MB of memory (it is ultimately called from r.FormFile in my code above).

If I keep calling curl, the memory usage of the process grows slowly over time, eventually seeming to level off at around 300 MB on my machine.

Thoughts? I assume this isn't expected and that I'm doing something wrong?

Answer 1

Score: 13

If the memory usage levels off at a "maximum", I wouldn't really call that a memory leak. I would rather say that the GC is being lazy rather than eager, or that the runtime simply doesn't want to physically free memory that is frequently reallocated / needed. If it really were a memory leak, the used memory wouldn't stop at 300 MB.

r.FormFile("file") results in a call to Request.ParseMultipartForm(), with 32 MB used as the value of the maxMemory parameter (the value of the defaultMaxMemory constant defined in request.go). Since you upload a larger file (80 MB), a buffer of at least 32 MB will eventually be created (this is implemented in multipart.Reader.ReadForm()). Since a bytes.Buffer is used to read the content, reading starts with a small or empty buffer, which is reallocated whenever a bigger one is needed.

The strategy of buffer reallocations and the buffer sizes are implementation dependent (and also depend on the size of the chunks being read/decoded from the request), but just to have a rough picture, imagine it like this: 0 bytes, 4 KB, 16 KB, 64 KB, 256 KB, 1 MB, 4 MB, 16 MB, 64 MB. Again, this is just theoretical, but it illustrates that the sum of all allocations can grow beyond 100 MB just to read the first 32 MB of the file into memory, at which point the decision is made to move/store it in a file. See the implementation of multipart.Reader.ReadForm() for details. This reasonably explains the 96 MB allocation.
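The reallocation pattern can be observed directly. The following is a minimal sketch, not the code used by ReadForm: the helper name growthStats and the 32 KB chunk size are assumptions made for illustration, and the exact capacities depend on the Go version. It writes 32 MB into a bytes.Buffer and sums every capacity the buffer passes through, which approximates the total size of the backing arrays allocated along the way.

```go
package main

import (
	"bytes"
	"fmt"
)

// growthStats (hypothetical helper) writes total bytes into a bytes.Buffer in
// chunk-sized pieces. It returns every distinct capacity the buffer had and
// the sum of those capacities. A capacity change means a new backing array
// was allocated, so the sum approximates the total bytes allocated.
func growthStats(total, chunk int) (caps []int, sum int) {
	var buf bytes.Buffer
	piece := make([]byte, chunk)
	last := 0
	for written := 0; written < total; written += chunk {
		buf.Write(piece)
		if c := buf.Cap(); c != last {
			caps = append(caps, c)
			sum += c
			last = c
		}
	}
	return caps, sum
}

func main() {
	const total = 32 << 20 // 32 MB, the default maxMemory discussed above
	caps, sum := growthStats(total, 32<<10)
	for _, c := range caps {
		fmt.Printf("buffer capacity: %d KB\n", c>>10)
	}
	fmt.Printf("payload: %d MB, backing arrays allocated in total: %d MB\n",
		total>>20, sum>>20)
}
```

The total is larger than the payload because each earlier, smaller backing array becomes garbage only after its contents have been copied into the next one.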

Do this a couple of times, and without the GC releasing the allocated buffers immediately, you can easily end up with 300 MB. If there is enough free memory, there is no pressure on the GC to hurry with releasing memory. The reason you see the usage grow relatively large is that large buffers are used in the background. If you uploaded a 1 MB file instead, you would probably not see this.
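To check that such buffers are garbage rather than leaked, one option is to compare heap statistics before and after a forced collection. This is only a sketch under assumptions: the names held, heapAlloc and holdAndRelease are made up for the example, a single 64 MB slice stands in for the upload buffers, and forcing a collection with runtime.GC() is a diagnostic step, not something a handler would normally do.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// held keeps the buffer reachable so it cannot be collected early.
var held []byte

// heapAlloc returns the bytes of heap objects that are allocated and not yet freed.
func heapAlloc() uint64 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapAlloc
}

// holdAndRelease (hypothetical helper) allocates size bytes, measures the
// heap, drops the only reference, forces a collection and measures again.
func holdAndRelease(size int) (during, after uint64) {
	held = make([]byte, size)
	held[size-1] = 1 // touch the buffer so it is really used
	during = heapAlloc()
	held = nil
	runtime.GC()
	after = heapAlloc()
	return during, after
}

func main() {
	during, after := holdAndRelease(64 << 20)
	fmt.Printf("heap while buffer is held: %d MB\n", during>>20)
	fmt.Printf("heap after forced GC:      %d MB\n", after>>20)

	// Additionally ask the runtime to return freed memory to the OS.
	debug.FreeOSMemory()
}
```

If the heap figure drops after the forced collection, the memory was reclaimable all along; the process size reported by the operating system may still shrink only later, when the runtime returns the freed pages.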

If it is important to you, you can also call Request.ParseMultipartForm() manually with a smaller maxMemory value, e.g.

r.ParseMultipartForm(2 << 20) // 2 MB
file, _, err := r.FormFile("file")
// ... rest of your handler

With this, much smaller (and fewer) buffers are allocated in the background.

Posted by huangapple on June 5, 2015, 10:32:51
Source: https://go.coder-hub.com/30657454.html