Golang order output of go routines
Question
I have 16 goroutines, each of which returns output, typically as a struct:
struct output{
index int,
description string,
}
All 16 goroutines run in parallel, and together they are expected to produce about a million output structs. I tried Go's basic sorting, but it is very expensive. Could someone suggest an approach for sorting the output by index? I need to write the "description" field to a file in index order.
For instance, if a goroutine gives the output {2, "Hello"}, {9, "Hey"}, {4, "Hola"}, my output file should contain:
Hello
Hola
Hey
All these goroutines run in parallel and I have no control over the order of execution, so I pass the index along in order to sort the output at the end.
Answer 1
Score: 6
One thing to consider before getting into the answer: your example code will not compile. To define a struct type in Go, you would need to change the syntax to:
type output struct {
	index       int
	description string
}
As for a potential solution: if you already reliably have unique indexes as well as the expected count of the result set, you should not need to sort at all. Instead, synchronize the goroutines over a channel and insert each output into a preallocated slice at its index. You can then iterate over that slice to write the contents to a file. For example:
ch := make(chan output)   // each goroutine will write to this channel
wg := new(sync.WaitGroup) // wait group to sync all goroutines
// start 16 goroutines
for i := 0; i < 16; i++ {
	wg.Add(1)
	go worker(ch, wg) // each worker func is expected to call wg.Done() when it completes its portion of work
}
// create a "quit" channel used to signal to the select statement below that all goroutines are done
quit := make(chan bool)
go func() {
	wg.Wait()
	quit <- true
}()
// initialize a slice with a length of 1 million, the expected result size mentioned in your question
sorted := make([]string, 1000000)
// use the for/select pattern to collect the results from the 16 goroutines and insert them into the sorted slice
for {
	select {
	case out := <-ch:
		// this is not robust - check the notes below the example
		sorted[out.index] = out.description
	case <-quit:
		// implement a function that takes the sorted slice and writes the results
		// e.g. writeToFile(sorted)
		return
	}
}
A couple of notes on this solution: it depends on you knowing the size of the expected result set. If you do not know the size of the result set, then in the select statement you will need to check whether the index read from ch exceeds the length of the sorted slice and allocate additional space before inserting, or your program will crash with an out-of-bounds error.
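The approach above can be put together as a complete program. The following is a minimal sketch under a few assumptions that are not in the original answer: the helper names `place` and `collect` are made up for illustration, the workers produce hypothetical strided indexes, the total is shrunk to 1000, and the results channel is closed after `wg.Wait()` instead of using a separate quit channel. It also assumes indexes are non-negative; any index that never arrives leaves an empty string in its slot.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"sync"
)

type output struct {
	index       int
	description string
}

// place stores desc at position idx, growing the slice first when idx is
// beyond its current length. Assumes idx is non-negative.
func place(sorted []string, idx int, desc string) []string {
	if idx >= len(sorted) {
		// append grows the backing array with amortized reallocation.
		sorted = append(sorted, make([]string, idx+1-len(sorted))...)
	}
	sorted[idx] = desc
	return sorted
}

// collect drains ch until it is closed and returns the descriptions
// positioned by index. sizeHint is the expected number of results.
func collect(ch <-chan output, sizeHint int) []string {
	sorted := make([]string, sizeHint)
	for out := range ch {
		sorted = place(sorted, out.index, out.description)
	}
	return sorted
}

func main() {
	const workers, total = 16, 1000 // small total for demonstration only

	ch := make(chan output)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			// Hypothetical work split: worker w produces w, w+16, w+32, ...
			for i := w; i < total; i += workers {
				ch <- output{index: i, description: fmt.Sprintf("item %d", i)}
			}
		}(w)
	}
	// Close the channel once every worker is done so collect's loop ends.
	go func() {
		wg.Wait()
		close(ch)
	}()

	sorted := collect(ch, total)

	bw := bufio.NewWriter(os.Stdout) // an *os.File could be used here instead
	defer bw.Flush()
	for _, d := range sorted {
		fmt.Fprintln(bw, d)
	}
}
```

Because each result is stored directly at its index, collection is linear in the number of results and needs no comparison sort. Replacing `os.Stdout` with a file returned by `os.Create` would write the descriptions to disk.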
Answer 2
Score: 0
You could use the module Ordered-concurrently to merge your inputs and then print them in order.
https://github.com/tejzpr/ordered-concurrently
Example - https://play.golang.org/p/hkcIuRHj63h
package main

import (
	concurrently "github.com/tejzpr/ordered-concurrently/v2"
	"log"
	"math/rand"
	"time"
)

type loadWorker int

// The work that needs to be performed
// The input type should implement the WorkFunction interface
func (w loadWorker) Run() interface{} {
	time.Sleep(time.Millisecond * time.Duration(rand.Intn(10)))
	return w
}

func main() {
	max := 10
	inputChan := make(chan concurrently.WorkFunction)
	output := concurrently.Process(inputChan, &concurrently.Options{PoolSize: 10, OutChannelBuffer: 10})
	go func() {
		for work := 0; work < max; work++ {
			inputChan <- loadWorker(work)
		}
		close(inputChan)
	}()
	for out := range output {
		log.Println(out.Value)
	}
}
Disclaimer: I'm the module creator