2022-08-21 20:09:00 · go

Golang worker pool implementation working unexpectedly

Question

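I have implemented a Go worker pool, shown below, in which sem and work are channels. sem tracks the number of workers (goroutines) currently active. work passes functions to active workers for execution. Any worker that stays idle for the timeout duration returns.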

package main

import (
	"time"
)

type Pool struct {
	sem     chan struct{}
	work    chan func()
	timeout time.Duration
}

func NewPool(max, size, spawn int, timeout time.Duration) *Pool {
	if spawn <= 0 {
		panic("workpool spawn is <= 0")
	}
	if spawn > max {
		panic("workpool spawn > max workers")
	}
	p := &Pool{
		sem:     make(chan struct{}, max),
		work:    make(chan func(), size),
		timeout: timeout,
	}
	for i := 0; i < spawn; i++ {
		p.sem <- struct{}{}
		go p.worker(func() {})
	}
	return p

}

func (p *Pool) AddTask(task func()) {
	select {
	case p.work <- task:
		return
	case p.sem <- struct{}{}:
		go p.worker(task)
		return
	}
}

func (p *Pool) worker(task func()) {
	t := time.NewTimer(p.timeout)
	defer func() {
		t.Stop()
		<-p.sem
	}()

	task()

	for {
		select {
		case task := <-p.work:
			t.Reset(p.timeout)
			task()
		case <-t.C:
			return
		}
	}
}

I am testing it by printing the value of i in a for loop, wrapping the print in an anonymous function and passing that function to the pool:

package main

import (
	"fmt"
	"time"
)

func main() {
	fmt.Println("Hello, world!")

	p := NewPool(3, 10, 1, time.Duration(5)*time.Second)

	for i := 0; i < 30; i++ {
		p.AddTask(func() {
			fmt.Print(i, " ")
		})
	}
	time.Sleep(10 * time.Second)
	fmt.Println("End")

}

I expected the output to be the numbers from 0 to 29, but the actual output is:

Hello, world!
12 12 12 12 12 12 12 12 12 12 12 12 13 25 25 25 25 25 25 25 25 25 25 25 26 25 30 30 30 30 End

I cannot understand why the output looks like this.

Answer 1

Score: 2

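Your function closures all reference the same variable i. This creates a race condition: by the time a dispatched function runs, the loop has already changed i, so each function reads whatever value i holds at that moment. Hence the unpredictable output you are seeing.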

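To make sure each closure gets its own value, declare a variable inside the loop. A simple trick is to shadow the loop variable under the same name with i := i.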

for i := 0; i < 30; i++ {
	i := i // <- add this line: a fresh copy of i for each iteration
	p.AddTask(func() {
		fmt.Print(i, " ")
	})
}

https://go.dev/play/p/o0Nyx5A46tp

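BTW, this technique is covered in the Effective Go docs; see this section:

> It may seem odd to write
>
> req := req
>
> but it's legal and idiomatic in Go to do this. You get a fresh version of the variable with the same name, deliberately shadowing the loop variable locally but unique to each goroutine.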
