May 28, 2021, 15:47:48 · go

Why does a go func inside a Go function need a WaitGroup to exit correctly?

Question

Sorry, the title might be misleading. The full code is below:

package main
import (
	"fmt"
	"sync"
)
type Button struct {
	Clicked *sync.Cond
}
func main() {
	button := Button{
		Clicked: sync.NewCond(&sync.Mutex{}),
	}
	subscribe := func(c *sync.Cond, fn func()) {
		var wg sync.WaitGroup
		wg.Add(1)
		go func() {
			wg.Done()
			c.L.Lock()
			defer c.L.Unlock()
			c.Wait()
			fn()
		}()
		wg.Wait()
	}
	var clickRegistered sync.WaitGroup
	clickRegistered.Add(2)
	subscribe(button.Clicked, func() {
		fmt.Println("maximizing window")
		clickRegistered.Done()
	})
	subscribe(button.Clicked, func() {
		fmt.Println("displaying dialog")
		clickRegistered.Done()
	})
	button.Clicked.Broadcast()
	clickRegistered.Wait()
}

When I comment out some lines and run it again, it fails with the fatal error "all goroutines are asleep - deadlock!"
The altered subscribe function looks like this:

subscribe := func(c *sync.Cond, fn func()) {
	//var wg sync.WaitGroup
	//wg.Add(1)
	go func() {
		//wg.Done()
		c.L.Lock()
		defer c.L.Unlock()
		c.Wait()
		fn()
	}()
	//wg.Wait()
}

What confuses me is whether the go func is executed before the outer subscribe function returns. I thought the go func would keep running like a daemon even after the outer function has returned, so the wg variable would be unnecessary. But it turns out I was totally wrong. So if the go func might not have been scheduled yet, does that mean we must use a sync.WaitGroup in every function or code block to make sure the goroutine is scheduled before the function or code block returns?
Thank you all.

Answer 1

Score: 1

With the wg WaitGroup (as in your original code): when the subscribe function returns, you know that the waiting goroutine has at least started executing.

So when your main function reaches button.Clicked.Broadcast(), there's a good chance that the two goroutines are actually blocked in their button.Clicked.Wait() call.

Without the wg, you have no guarantee that the goroutines have even started, and your code may call button.Clicked.Broadcast() too soon. A Broadcast only wakes goroutines that are already waiting, so a goroutine that reaches Wait() afterwards is never woken.

Note that your use of wg merely makes the deadlock less likely; it does not prevent it in all cases, because wg.Done() runs before the goroutine has reached c.Wait().

Try compiling your binary with -race and running it in a loop (e.g. from bash: for i in {1..100}; do ./myprogram; done). I think you will see that the same problem still happens sometimes.

Answer 2

Score: 1

The problem is that c.Wait() in either call is not guaranteed to run before button.Clicked.Broadcast(), and even your original code's use of WaitGroup does not guarantee it either (what matters is the c.Wait() part, not the spawning of the goroutine).

Modified subscribe:

subscribe := func(c *sync.Cond, subWG *sync.WaitGroup, fn func()) {
	go func() {
		c.L.Lock()
		defer c.L.Unlock()
		subWG.Done() // [2]
		c.Wait()
		fn()
	}()
}

Code for waiting:

subWG.Wait()
button.Clicked.L.Lock()
button.Clicked.L.Unlock()

This relies on the observation that, because the goroutines share the same lock, [2] can only execute either first or after every goroutine that previously executed [2] is already blocked in c.Wait. So once subWG.Wait() returns, meaning [2] has executed 2 times (or once per subscription), at most one goroutine can still be short of c.Wait. Acquiring the lock one more time takes care of that one, because the lock only becomes available after that goroutine has released it inside c.Wait.

Playground: https://play.golang.org/p/6mjUEcn3ec5
