Golang worker pool implementation working unexpectedly

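Question

I have implemented a Golang worker pool as shown below, where sem and work are channels. sem tracks the number of workers (goroutines) currently active. work passes functions to the active workers for execution. timeout makes any worker that has been idle for that duration return.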

package main

import (
	"time"
)

type Pool struct {
	sem     chan struct{}
	work    chan func()
	timeout time.Duration
}

func NewPool(max, size, spawn int, timeout time.Duration) *Pool {
	if spawn <= 0 {
		panic("workpool spawn is <= 0")
	}
	if spawn > max {
		panic("workpool spawn > max workers")
	}
	p := &Pool{
		sem:     make(chan struct{}, max),
		work:    make(chan func(), size),
		timeout: timeout,
	}
	for i := 0; i < spawn; i++ {
		p.sem <- struct{}{}
		go p.worker(func() {})
	}
	return p

}

func (p *Pool) AddTask(task func()) {
	select {
	case p.work <- task:
		return
	case p.sem <- struct{}{}:
		go p.worker(task)
		return
	}
}

func (p *Pool) worker(task func()) {
	t := time.NewTimer(p.timeout)
	defer func() {
		t.Stop()
		<-p.sem
	}()

	task()

	for {
		select {
		case task := <-p.work:
			t.Reset(p.timeout)
			task()
		case <-t.C:
			return
		}
	}
}

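I am testing it by printing the value of i in a for loop, wrapping the print in an anonymous function and passing that function to the pool, as below: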

package main

import (
	"fmt"
	"time"
)

func main() {
	fmt.Println("Hello, world!")

	p := NewPool(3, 10, 1, time.Duration(5)*time.Second)

	for i := 0; i < 30; i++ {
		p.AddTask(func() {
			fmt.Print(i, " ")
		})
	}
	time.Sleep(10 * time.Second)
	fmt.Println("End")

}

The expected output is the numbers 0 to 29 in sequence, but the actual output is:

Hello, world!
12 12 12 12 12 12 12 12 12 12 12 12 13 25 25 25 25 25 25 25 25 25 25 25 26 25 30 30 30 30 End

I cannot understand why the output looks like this.


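Answer 1

Score: 2

Your function closures all reference the same variable i, not a copy of the value it held when each closure was created. This creates a race condition: by the time the dispatched functions run, they read a value that the loop is still changing, hence the unpredictable output you are seeing.

To ensure each closure gets its own value, declare a variable inside the loop. A simple trick is to shadow the loop variable under the same name with i := i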

for i := 0; i < 30; i++ {
	i := i // <- add this: a fresh copy of i for each iteration
	p.AddTask(func() {
		fmt.Print(i, " ")
	})
}

https://go.dev/play/p/o0Nyx5A46tp


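BTW this technique is covered in the Effective Go docs, see this section:

> It may seem odd to write
>
> req := req
>
> but it's legal and idiomatic in Go to do this. You get a fresh version
> of the variable with the same name, deliberately shadowing the loop
> variable locally but unique to each goroutine.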

huangapple
  • Posted on August 21, 2022, 20:09:00
  • When reposting, please keep the link to this article: https://go.coder-hub.com/73434264.html