Fix memory consumption of a Go program with goroutines

Question

I am working on a problem that involves a producer-consumer pattern. I have one producer that produces tasks and 'n' consumers that consume them. A consumer's task is to read some data from a file and then upload that data to S3. One consumer can read up to xMB (8/16/32) of data and then upload it to S3. Keeping all the data in memory caused the program to consume more memory than expected, so I switched to reading the data from the file, writing it to a temporary file, and then uploading that file to S3. This performed better in terms of memory, but CPU usage took a hit. I wonder if there is any way to allocate a fixed amount of memory once and then reuse it across different goroutines?

What I would like is this: if I have 4 goroutines, I can allocate 4 separate arrays of xMB and reuse the same array on every invocation of a goroutine, so that a goroutine neither allocates memory each time nor depends on the garbage collector to free it.
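A minimal sketch of that per-goroutine buffer idea, with one buffer allocated up front per worker and reused for every task (the task channel, process, and the sizes are placeholders, not the real program):

package main

import (
    "fmt"
    "sync"
)

const bufSize = 16 << 20 // 16 MB: the reusable buffer size per worker

// process stands in for the real work: fill buf from the file, then upload.
func process(task int, buf []byte) {
    n := copy(buf, fmt.Sprintf("task %d", task))
    _ = buf[:n]
}

func main() {
    tasks := make(chan int)
    var wg sync.WaitGroup
    for i := 0; i < 4; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            buf := make([]byte, bufSize) // allocated once per goroutine
            for task := range tasks {
                process(task, buf) // same backing array reused on every task
            }
        }()
    }
    for t := 0; t < 16; t++ {
        tasks <- t
    }
    close(tasks)
    wg.Wait()
}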

Edit: Adding the crux of my code. My Go consumer looks like this:

type Block struct {
    offset int64
    size   int64
}

func consumer(blocks []Block) {
    var dataArr []byte
    for _, block := range blocks {
        // file.Read is shorthand for reading block.size bytes at
        // block.offset (e.g. via (*os.File).ReadAt).
        data := file.Read(block.offset, block.size)
        dataArr = append(dataArr, data...) // append the slice's bytes; note the ...
    }
    upload(dataArr)
}

I read the data from the file based on Blocks; a block slice can contain several small chunks limited by xMB in total, or one big chunk of xMB.

Edit 2: I tried sync.Pool based on the suggestions in the comments, but I did not see any improvement in memory consumption. Am I doing something wrong?

var pool *sync.Pool

func main() {
    pool = &sync.Pool{
        New: func() interface{} {
            return make([]byte, 16777216) // 16 MB
        },
    }
    for i := 0; i < 4; i++ {
        // blocks is a 2-d slice; each index holds one consumer's set of blocks.
        go consumer(blocks[i])
    }
}

func consumer(blocks []Block) {
    d := pool.Get().([]byte)
    for _, block := range blocks {
        // Read block.size bytes at block.offset into the pooled buffer.
        file.Read(block.offset, block.size, d[block.offset:block.offset+block.size])
    }
    upload(d)
    pool.Put(d)
}

Answer 1

Score: 1

Take a look at SA6002 of StaticCheck, about sync.Pool. You can also use the pprof tool.
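For reference, SA6002 warns that putting a non-pointer value such as a []byte directly into a sync.Pool forces an allocation on every Put, because boxing the slice header into interface{} allocates. Storing a pointer to the slice avoids that. A minimal sketch of the pointer-based pattern, with an illustrative 16 MB buffer (the names here are placeholders, not from the question):

package main

import "sync"

// The pool holds *[]byte instead of []byte: boxing a pointer into
// interface{} does not allocate, while boxing a slice header does,
// which is what StaticCheck's SA6002 warns about.
var bufPool = sync.Pool{
    New: func() interface{} {
        b := make([]byte, 16<<20) // 16 MB
        return &b
    },
}

func withBuffer(work func(buf []byte)) {
    bp := bufPool.Get().(*[]byte)
    defer bufPool.Put(bp) // return the same pointer, allocation-free
    work(*bp)
}

func main() {
    withBuffer(func(buf []byte) {
        copy(buf, "use the pooled buffer here")
    })
}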

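For the pprof suggestion, one common way to inspect live heap usage is the net/http/pprof handler (a sketch; the listen address is arbitrary):

package main

import (
    "log"
    "net/http"
    _ "net/http/pprof" // registers the /debug/pprof/ handlers on the default mux
)

func main() {
    // While the program runs, fetch a heap profile with:
    //   go tool pprof http://localhost:6060/debug/pprof/heap
    log.Println(http.ListenAndServe("localhost:6060", nil))
}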
