Getting a fatal error: all goroutines are asleep - deadlock! with a simple test scenario

Question

I am trying to reproduce an issue and arrived at a minimal test case with the following code. If I close all the channels (bypassing the i == 0 test), things work as expected: the wait group counter decrements, done is signalled, and main exits fine. When I deliberately skip closing one of these channels, I expect the main routine to wait, since the wait group semaphore will block indefinitely in this case. Instead, I am getting an error: "fatal error: all goroutines are asleep - deadlock!". Why is that? Have I missed something fundamental, or is the runtime being overzealous?

package main

import (
	"fmt"
	"sync"
)

const N int = 4

func main() {

	done := make(chan struct{})
	defer close(done)

	fmt.Println("Beginning...")

	chans := make([]chan int, N)
	var wg sync.WaitGroup

	for i := 0; i < N; i++ {
		wg.Add(1)
		chans[i] = make(chan int)
		go func(i int) { // p0
			defer wg.Done()
			for m := range chans[i] {
				fmt.Println("Received ", m)
			}
			fmt.Println("Ending p", i)
		}(i)
	}

	go func() {
		wg.Wait()
		done <- struct{}{} // signal main that we are done
	}()

	for i := 0; i < N; i++ {
		fmt.Println("Closing c", i)
		if i != 0 { // Skip #0 so wg doesn't reach '0'
			close(chans[i])
		}
	}

	<-done // wait to receive signal from anonymous join function
	fmt.Println("Ending.")
}

UPDATE: I edited the code to avoid the race condition. Still getting this error.

The if i != 0 check is intentional. I want wg.Wait to block forever (with its semaphore never reaching 0). Why can't I do that? It seems the same as using <-done without a matching done <- struct{}{} somewhere else. Would the compiler complain in that case too?

Answer 1

Score: 1

Here's what's going on:

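  • The first go func(i int) { goroutine (the one started with i == 0) does not exit because chans[0] is never closed, so its range loop keeps waiting for a value.
  • Because that goroutine does not exit, its deferred wg.Done is never called.
  • The call to wg.Wait() therefore blocks forever.
  • Main blocks forever on <-done because the signal is never sent to done.

At that point every goroutine in the program is blocked and none of them can ever be woken up. The Go runtime detects this situation and aborts with "fatal error: all goroutines are asleep - deadlock!". This is a runtime check, not a compiler check, so a lone <-done with no matching send fails at run time in the same way.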

You can fix the deadlock by removing the if i != 0 { check so that every channel is closed, but there is another issue: a race on the wait group. It's possible that wg.Done() is called before wg.Add(1) is called. Call wg.Add() before starting the goroutine to avoid the race (the updated code in the question already does this).
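
As a sketch of both fixes together, assuming the goal is simply to wait for every worker: the function name drainAll and the finished counter are hypothetical and exist only to make the result observable. wg.Add(1) runs in the launching goroutine before the go statement, and every channel is closed.

```go
package main

import (
	"fmt"
	"sync"
)

// drainAll starts one worker per channel, closes every channel, waits for
// all workers, and returns how many of them finished.
// Hypothetical helper for illustration.
func drainAll(n int) int {
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		finished int
	)
	chans := make([]chan int, n)
	for i := range chans {
		chans[i] = make(chan int)
		wg.Add(1) // increment before the goroutine exists, so Done can never run first
		go func(c <-chan int) {
			defer wg.Done()
			for range c { // returns once c is closed
			}
			mu.Lock()
			finished++
			mu.Unlock()
		}(chans[i])
	}
	for _, c := range chans {
		close(c) // no channel is skipped, so every worker can exit
	}
	wg.Wait()
	return finished
}

func main() {
	fmt.Println("workers finished:", drainAll(4))
}
```

Here main calls wg.Wait() directly, so the separate done channel and joining goroutine from the question are not needed; that is a simplification for this sketch rather than a requirement.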

Answer 2

Score: 0

The if statement in your for loop prevents the first channel (chans[0]) from being closed, so the goroutine ranging over it is left waiting for something to happen on that channel. That keeps its deferred wg.Done() from ever running, which in turn never lets wg.Wait() return, which then never lets done <- struct{}{} send its signal.

So in short, the if statement in your loop leaves one channel open and causes a deadlock, because no goroutine can make progress.

As @CodingPickle pointed out, move your wg.Add(1) to the beginning of your for loop to prevent any race conditions.

http://play.golang.org/p/j1D5LZGUhd

huangapple
  • Posted on January 4, 2016 at 10:29:52
  • When reposting, please keep the link to this article: https://go.coder-hub.com/34583936.html