Newbie: Properly sizing a []byte in Go (chunking)

Question

Go Newbie alert!

Not quite sure how to do this - I want to make a "file chunker" where I grab fixed slices out of a binary file for later upload as a learning project.

I currently have this:

    type (
    	fileChunk  []byte
    	fileChunks []fileChunk
    )

    func NumChunks(fi os.FileInfo, chunkSize int) int {
    	chunks := fi.Size() / int64(chunkSize)
    	if rem := fi.Size() % int64(chunkSize) != 0; rem {
    		chunks++
    	}
    	return int(chunks)
    }

    // left out err checks for brevity
    func chunker(filePtr *string) fileChunks {
    	f, err := os.Open(*filePtr)
    	defer f.Close()

    	// create the initial container to hold the slices
    	file_chunks := make(fileChunks, 0)

    	fi, err := f.Stat()
    	// show me how big the original file is
    	fmt.Printf("File Name: %s,  Size: %d\n", fi.Name(), fi.Size())

    	// let's partition it into 10000 byte pieces
    	chunkSize := 10000
    	chunks := NumChunks(fi, chunkSize)

    	fmt.Printf("Need %d chunks for this file", chunks)

    	for i := 0; i < chunks; i++ {
    		b := make(fileChunk, chunkSize) // allocate a chunk, 10000 bytes

    		n1, err := f.Read(b)
    		fmt.Printf("Chunk: %d, %d bytes read\n", i, n1)

    		// add chunk to "container"
    		file_chunks = append(file_chunks, b)
    	}

    	fmt.Println(len(file_chunks))

    	return file_chunks
    }

This mostly works, but consider a file size of 31234 bytes: I end up with three slices holding the first 30000 bytes of the file, and the final "chunk" consists of 1234 "file bytes" followed by "padding" up to the 10000-byte chunk size. I'd like the "remainder" fileChunk ([]byte) to be sized to 1234, not the full capacity. What would be the proper way to do this? On the receiving side I would then "stitch" all the pieces together to recreate the original file.

Answer 1

Score: 1

You need to re-slice the remainder chunk to be just the length of the last chunk read:

    n1, err := f.Read(b)
    fmt.Printf("Chunk: %d, %d bytes read\n", i, n1)
    b = b[:n1]

This re-slices every chunk, not just the remainder. Normally, n1 will be 10000 for all the non-remainder chunks, but there is no guarantee: the docs say "Read reads up to len(b) bytes from the File." So it's good to check n1 on every read.

  • Posted by huangapple on 2013-11-15 23:07:41
  • When reposting, please keep the link to this article: https://go.coder-hub.com/20004134.html