Golang worker pool implementation working unexpectedly
Question
I have implemented a Go worker pool, shown below, in which sem and work are channels. sem tracks the number of workers (goroutines) currently active. work passes functions to the active workers for execution. timeout makes any worker that has been idle for that duration return.
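package main

import (
	"time"
)

// Pool runs tasks on a bounded number of worker goroutines.
type Pool struct {
	sem     chan struct{} // one token per live worker; capacity is the worker limit
	work    chan func()   // buffered queue of pending tasks
	timeout time.Duration // idle time after which a worker exits
}

func NewPool(max, size, spawn int, timeout time.Duration) *Pool {
	if spawn <= 0 {
		panic("workpool spawn is <= 0")
	}
	if spawn > max {
		panic("workpool spawn > max workers")
	}
	p := &Pool{
		sem:     make(chan struct{}, max),
		work:    make(chan func(), size),
		timeout: timeout,
	}
	// Start the initial workers, each with a no-op first task.
	for i := 0; i < spawn; i++ {
		p.sem <- struct{}{}
		go p.worker(func() {})
	}
	return p
}

// AddTask either queues the task or, if the worker limit has not been
// reached, starts a new worker for it. It blocks when neither is possible.
func (p *Pool) AddTask(task func()) {
	select {
	case p.work <- task:
		return
	case p.sem <- struct{}{}:
		go p.worker(task)
		return
	}
}

// worker runs its first task, then keeps taking tasks from the queue
// until the timer fires, at which point it releases its sem token.
func (p *Pool) worker(task func()) {
	t := time.NewTimer(p.timeout)
	defer func() {
		t.Stop()
		<-p.sem
	}()
	task()
	for {
		select {
		case task := <-p.work:
			t.Reset(p.timeout)
			task()
		case <-t.C:
			return
		}
	}
}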
To test it, I print the value of i from a for loop by wrapping the print call in an anonymous function and passing that function to the pool:
package main

import (
	"fmt"
	"time"
)

func main() {
	fmt.Println("Hello, world!")
	p := NewPool(3, 10, 1, time.Duration(5)*time.Second)
	for i := 0; i < 30; i++ {
		p.AddTask(func() {
			fmt.Print(i, " ")
		})
	}
	time.Sleep(10 * time.Second)
	fmt.Println("End")
}
I expected the output to be the numbers 0 through 29, but the actual output is:
Hello, world!
12 12 12 12 12 12 12 12 12 12 12 12 13 25 25 25 25 25 25 25 25 25 25 25 26 25 30 30 30 30 End
I cannot understand why the output looks like this.
Answer 1
Score: 2
Your function closures all reference the same variable i. This creates a race condition: by the time a dispatched function actually runs, the loop has usually moved on and changed i, so each function prints whatever value i holds at that moment. Hence the unpredictable output you are seeing.
To ensure each closure gets its own value, declare a new variable inside the loop body. A simple trick is to shadow the loop variable with a declaration of the same name, i := i:
for i := 0; i < 30; i++ {
	i := i // <- add this
	p.AddTask(func() {
		fmt.Print(i, " ")
	})
}
https://go.dev/play/p/o0Nyx5A46tp
By the way, this technique is covered in the Effective Go documentation; see this section:
> It may seem odd to write
>
> req := req
>
> but it's legal and idiomatic in Go to do this. You get a fresh version
> of the variable with the same name, deliberately shadowing the loop
> variable locally but unique to each goroutine.