How can I efficiently download a large file using Go?

Question

Is there a way to download a large file using Go that will store the content directly into a file instead of storing it all in memory before writing it to a file? Because the file is so big, storing it all in memory before writing it to a file is going to use up all the memory.

Answer 1

Score: 272

I'll assume you mean download via http (error checks omitted for brevity):

    import ("net/http"; "io"; "os")
    ...
    out, err := os.Create("output.txt")
    defer out.Close()
    ...
    resp, err := http.Get("http://example.com/")
    defer resp.Body.Close()
    ...
    n, err := io.Copy(out, resp.Body)

The http.Response's Body is an io.Reader, so you can pass it to any function that takes a Reader, for example to read a chunk at a time rather than all at once. In this specific case, io.Copy() does the grunt work for you.

Answer 2

Score: 87

A more descriptive version of Steve M's answer.

    import (
        "fmt"
        "io"
        "net/http"
        "os"
    )

    // downloadFile streams the body of url into the file at filepath
    // without buffering the whole response in memory.
    func downloadFile(filepath string, url string) (err error) {
        // Create the file
        out, err := os.Create(filepath)
        if err != nil {
            return err
        }
        defer out.Close()

        // Get the data
        resp, err := http.Get(url)
        if err != nil {
            return err
        }
        defer resp.Body.Close()

        // Check server response
        if resp.StatusCode != http.StatusOK {
            return fmt.Errorf("bad status: %s", resp.Status)
        }

        // Write the body to file
        _, err = io.Copy(out, resp.Body)
        if err != nil {
            return err
        }

        return nil
    }

Answer 3

Score: 17

The answer selected above using io.Copy is exactly what you need, but if you are interested in additional features like resuming broken downloads, auto-naming files, checksum validation, or monitoring the progress of multiple downloads, check out the grab package.

Answer 4

Score: 0

I also think it's nice to have a progress indicator, especially for larger files, so here are my two cents: a solution to this problem that implements a simple progress indicator.
(Most error handling is also omitted for brevity.)

    package main

    import (
        "fmt"
        "io"
        "log"
        "net/http"
        "os"
    )

    func main() {
        tempPath := ".tmp"
        req, _ := http.NewRequest("GET", "http://212.183.159.230/200MB.zip", nil)
        resp, _ := http.DefaultClient.Do(req)
        defer resp.Body.Close()

        // O_TRUNC discards leftovers from a previous, longer download.
        f, _ := os.OpenFile(tempPath, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, 0644)
        defer f.Close()

        buf := make([]byte, 32*1024)
        var downloaded int64
        for {
            n, err := resp.Body.Read(buf)
            // Handle the bytes first: Read may return n > 0 together with io.EOF.
            if n > 0 {
                f.Write(buf[:n])
                downloaded += int64(n)
                // ContentLength is -1 when the server does not report a size.
                fmt.Printf("\rDownloading... %.2f%%", float64(downloaded)/float64(resp.ContentLength)*100)
            }
            if err != nil {
                if err == io.EOF {
                    break
                }
                log.Fatalf("Error while downloading: %v", err)
            }
        }
        os.Rename(tempPath, "wordpress.zip")
    }

To use io.Copy, we can wrap the response body in our own io.Reader. This is probably the preferred approach in a real-world scenario, because it is reusable and easier to test. Here is the second version:

    package main

    import (
        "fmt"
        "io"
        "log"
        "net/http"
        "os"
        "time"
    )

    // ProgressReader wraps an io.Reader and prints progress on every read.
    type ProgressReader struct {
        Reader io.Reader
        Size   int64 // expected total; -1 if the server sent no Content-Length
        Pos    int64 // bytes read so far
    }

    func (pr *ProgressReader) Read(p []byte) (int, error) {
        n, err := pr.Reader.Read(p)
        // Count the bytes even when err != nil: Read may return data with io.EOF.
        if n > 0 {
            pr.Pos += int64(n)
            fmt.Printf("\rDownloading... %.2f%%", float64(pr.Pos)/float64(pr.Size)*100)
        }
        return n, err
    }

    func main() {
        start := time.Now()
        tempPath := ".tmp"
        outPath := "200MB.zip"
        req, _ := http.NewRequest("GET", "http://212.183.159.230/200MB.zip", nil)
        resp, _ := http.DefaultClient.Do(req)
        if resp.StatusCode != http.StatusOK {
            log.Fatalf("Error while downloading: %v", resp.StatusCode)
        }
        defer resp.Body.Close()

        // O_TRUNC discards leftovers from a previous, longer download.
        f, _ := os.OpenFile(tempPath, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, 0644)
        defer f.Close()

        progressReader := &ProgressReader{
            Reader: resp.Body,
            Size:   resp.ContentLength,
        }
        if _, err := io.Copy(f, progressReader); err != nil {
            log.Fatalf("Error while downloading: %v", err)
        }
        os.Rename(tempPath, outPath)
        fmt.Println(" - Download completed!")
        fmt.Printf("Took: %.2fs\n", time.Since(start).Seconds())
    }

Answer 5

Score: -6

  1. Here is a sample: https://github.com/thbar/golang-playground/blob/master/download-files.go

  2. Here is also some code that might help you.

Code:

    import (
        "fmt"
        "io"
        "log"
        "net/http"
        "os"
    )

    // HTTPDownload fetches uri and returns the whole body as a byte slice.
    // Note that io.ReadAll holds the entire response in memory.
    func HTTPDownload(uri string) ([]byte, error) {
        fmt.Printf("HTTPDownload From: %s.\n", uri)
        res, err := http.Get(uri)
        if err != nil {
            return nil, err
        }
        defer res.Body.Close()
        d, err := io.ReadAll(res.Body)
        if err != nil {
            return nil, err
        }
        fmt.Printf("ReadFile: Size of download: %d\n", len(d))
        return d, nil
    }

    // WriteFile saves d to dst as a read-only file (mode 0444).
    func WriteFile(dst string, d []byte) error {
        fmt.Printf("WriteFile: Size of download: %d\n", len(d))
        return os.WriteFile(dst, d, 0444)
    }

    // DownloadToFile downloads uri and saves it as dst, logging any failure.
    func DownloadToFile(uri string, dst string) {
        fmt.Printf("DownloadToFile From: %s.\n", uri)
        d, err := HTTPDownload(uri)
        if err != nil {
            log.Printf("download failed: %v", err)
            return
        }
        fmt.Printf("downloaded %s.\n", uri)
        if err := WriteFile(dst, d); err != nil {
            log.Printf("write failed: %v", err)
            return
        }
        fmt.Printf("saved %s as %s\n", uri, dst)
    }

huangapple
  • Published on July 28, 2012 at 01:38:52
  • When reposting, please keep the link to this article: https://go.coder-hub.com/11692860.html