Question
I have a slice clientFiles, which I am iterating over sequentially, writing each element to S3 one at a time, as shown below:
for _, v := range clientFiles {
    err := writeToS3(v.FileContent, s3Connection, v.FileName, bucketName, v.FolderName)
    if err != nil {
        fmt.Println(err)
    }
}
The above code works fine, but I want to write to S3 in parallel to speed things up. Would a worker pool implementation work better here, or is there some other, better option?
I found the code below, which uses a wait group, but I am not sure whether it is the better option to work with here:
wg := sync.WaitGroup{}
for _, v := range clientFiles {
    wg.Add(1)
    go func(v ClientMapFile) {
        err := writeToS3(v.FileContent, s3Connection, v.FileName, bucketName, v.FolderName)
        if err != nil {
            fmt.Println(err)
        }
    }(v)
}
Answer 1
Score: 2
Yes, parallelising should help.
Your code should work well after a couple of changes to the WaitGroup usage: each goroutine needs to mark its work as done, and you need to wait for all goroutines to finish after the for loop.
var wg sync.WaitGroup
for _, v := range clientFiles {
    wg.Add(1)
    go func(v ClientMapFile) {
        defer wg.Done()
        err := writeToS3(v.FileContent, s3Connection, v.FileName, bucketName, v.FolderName)
        if err != nil {
            fmt.Println(err)
        }
    }(v)
}
wg.Wait()
Be aware that your solution creates N goroutines for N files, which can be suboptimal if the number of files is very large. In that case, use the worker-pool pattern (https://gobyexample.com/worker-pools) and try different numbers of workers to find which works best for you in terms of performance.