How can I avoid deadlock
Question
Look at the following code snippet.
package main

import (
	"errors"
	"fmt"
	"math/rand"
	"runtime"
	"sync"
	"time"
)

func random(min, max int) int {
	rand.Seed(time.Now().Unix())
	return rand.Intn(max-min) + min
}

func err1(rand int, chErr chan error, wg *sync.WaitGroup) {
	if rand == 1 {
		chErr <- errors.New("Error 1")
	}
	wg.Done()
}

func err2(rand int, chErr chan error, wg *sync.WaitGroup) {
	if rand == 2 {
		chErr <- errors.New("Error 2")
	}
	wg.Done()
}

func err3(rand int, chErr chan error, wg *sync.WaitGroup) {
	if rand == 3 {
		chErr <- errors.New("Error 3")
	}
	wg.Done()
}

func err4(rand int, chErr chan error, wg *sync.WaitGroup) {
	if rand == 3 {
		chErr <- errors.New("Error 4")
	}
	wg.Done()
}

func err5(rand int, chErr chan error, wg *sync.WaitGroup) {
	if rand == 4 {
		chErr <- errors.New("Error 5")
	}
	wg.Done()
}

func main() {
	runtime.GOMAXPROCS(runtime.NumCPU())
	chErr := make(chan error, 1)
	wg := new(sync.WaitGroup)
	//n := random(1, 8)
	n := 3
	fmt.Println(n)
	wg.Add(5)
	go err1(n, chErr, wg)
	go err2(n, chErr, wg)
	go err3(n, chErr, wg)
	go err4(n, chErr, wg)
	go err5(n, chErr, wg)
	fmt.Println("Wait")
	wg.Wait()
	select {
	case err := <-chErr:
		fmt.Println(err)
		close(chErr)
	default:
		fmt.Println("NO error, job done")
	}
}
How can I avoid the deadlock here? I could give the channel a buffer of length 2, but maybe there is a more elegant way to solve the problem.
I deliberately used rand == 3 in both err3 and err4.
Answer 1
Score: 3
Your program deadlocks because your channel is full.
Your channel has a buffer size of one. You then call wg.Wait(), which waits until all five goroutines have called wg.Done(). With n == 3, both err3 and err4 send an error, and whichever of the two runs first puts its error on the channel.
At this point your channel is full, and nothing receives from it, because main only reads the channel after wg.Wait() returns.
The other of the two is also called with the value 3 and also wants to put an error on the channel. It blocks, because the channel is full and nothing has been received from it, so it never reaches wg.Done().
So your main goroutine blocks forever, because the WaitGroup never finishes.
The fix is indeed to make your channel buffer larger. That way the sends do not block, and every goroutine gets the chance to call wg.Done().
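As a rough sketch of that fix, the version below sizes the buffer to the number of goroutines that could possibly send, so no send can block regardless of n. The helper names worker and collect and the triggers slice are illustrative and not part of the original program; they only stand in for err1 to err5 and their conditions.

```go
package main

import (
	"fmt"
	"sync"
)

// worker sends an error when n matches its trigger value (hypothetical
// stand-in for err1..err5 from the question).
func worker(id, trigger, n int, chErr chan<- error, wg *sync.WaitGroup) {
	defer wg.Done()
	if n == trigger {
		chErr <- fmt.Errorf("Error %d", id)
	}
}

// collect starts one worker per trigger and returns every reported error.
func collect(n int, triggers []int) []error {
	// One buffer slot per worker: a send can never block,
	// so wg.Wait() below always returns.
	chErr := make(chan error, len(triggers))
	var wg sync.WaitGroup
	wg.Add(len(triggers))
	for i, t := range triggers {
		go worker(i+1, t, n, chErr, &wg)
	}
	wg.Wait()
	close(chErr) // safe: all senders have finished

	var errs []error
	for err := range chErr {
		errs = append(errs, err)
	}
	return errs
}

func main() {
	// Same trigger values as err1..err5 in the question.
	errs := collect(3, []int{1, 2, 3, 3, 4})
	fmt.Println(len(errs), "error(s)")
	for _, err := range errs {
		fmt.Println(err)
	}
}
```

With n = 3 both matching workers deliver their error; the order in which the two errors arrive is not deterministic.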
Answer 2
Score: 3
Generally, do not fall into the trap of thinking that larger buffers fix deadlocks. This approach might work in certain specific cases, but it does not hold in general.
Deadlock is best addressed by understanding how goroutines depend on each other. Essentially, you must eliminate communication loops in which there is a mutual dependency. The non-blocking send idea (see @izca's answer) is one helpful trick, but not the only one.
There is a considerable body of knowledge on how to avoid deadlock and livelock. Much of it dates from the '80s and '90s, when Occam was popular. There are a few special gems from people such as Jeremy Martin (Design Strategy for Deadlock-Free Concurrent Systems), Peter Welch (Higher Level Paradigms) and others.
- The client-server strategy is simple: describe your goroutine network as a set of communicating servers and their clients, and ensure that there are no loops in the network graph. Deadlock is then eliminated.
- I/O-par is a way to form rings and toruses of goroutines such that there will not be a deadlock within the structure; this is a particular case where loops are allowed but behave in a generally deadlock-free way.
So, my strategy is to reduce the buffer sizes first, think about what's happening, and fix the deadlocks. Then later, reintroduce buffers to improve performance, based on benchmarks. Deadlocks are caused by loops in the communication graph. Break the loops.
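One way to apply this to the program in the question, sketched here under my own assumptions rather than taken from the answer, is to remove the dependency loop itself: main waits for the workers, while a blocked worker waits for main to receive. Below, the waiting moves into a separate goroutine that closes the channel once all workers are done, and the receiving goroutine drains the channel. The channel is deliberately unbuffered. The names report and run and the triggers slice are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// report sends an error on chErr when n matches trigger (hypothetical
// stand-in for err1..err5 from the question).
func report(id, trigger, n int, chErr chan<- error, wg *sync.WaitGroup) {
	defer wg.Done()
	if n == trigger {
		chErr <- fmt.Errorf("Error %d", id)
	}
}

// run returns all errors reported by the workers.
func run(n int, triggers []int) []error {
	chErr := make(chan error) // unbuffered on purpose
	var wg sync.WaitGroup
	wg.Add(len(triggers))
	for i, t := range triggers {
		go report(i+1, t, n, chErr, &wg)
	}

	// Wait in a separate goroutine, so the receiver below is never
	// stuck behind wg.Wait() while a worker is blocked on its send.
	go func() {
		wg.Wait()
		close(chErr) // all senders are done; ends the range loop
	}()

	var errs []error
	for err := range chErr {
		errs = append(errs, err)
	}
	return errs
}

func main() {
	errs := run(3, []int{1, 2, 3, 3, 4})
	if len(errs) == 0 {
		fmt.Println("No error, job done")
		return
	}
	for _, err := range errs {
		fmt.Println(err)
	}
}
```

Because the receiver no longer depends on the workers finishing first, the program completes for any number of simultaneous senders, and a buffer can be added back later purely as a performance choice.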
Answer 3
Score: 1
Since you wrote that you intentionally used rand == 3 in both err3() and err4(), there are two possible solutions:
- Increase the buffer size of the channel
  Increase the buffer size of the chErr channel to at least 2, because with n = 3 two goroutines may send a value on the channel.
- Use a non-blocking send
  Use a non-blocking channel send, preferably in all of your errX() functions (but at least in err3() and err4(), because they send on the same condition), with a select statement:
	select {
	case chErr <- errors.New("Error 3"):
	default:
	}
This tries to send an error on the channel, but if the channel is not ready (it is full because another goroutine has already sent a value), the default case is selected, which does nothing.
Try it out on Go Playground.
Note: this will "lose" one of the errors because the channel can only hold one error, but you receive only one value from it anyway.
You can read more about non-blocking sends in the Go blog article "Go Concurrency Patterns: Timing out, moving on".
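For completeness, here is a minimal runnable sketch of the non-blocking send applied to the whole program. The helper trySend, the function firstError and the triggers slice are illustrative names that do not appear in the question; the buffer size of 1 is kept from the original code.

```go
package main

import (
	"fmt"
	"sync"
)

// trySend attempts a non-blocking send and reports whether the
// value was accepted by the channel.
func trySend(ch chan<- error, err error) bool {
	select {
	case ch <- err:
		return true
	default:
		return false // channel not ready (full): the error is dropped
	}
}

// firstError runs one goroutine per trigger and returns at most one
// of the reported errors, or nil if none was reported.
func firstError(n int, triggers []int) error {
	chErr := make(chan error, 1) // same buffer size as in the question
	var wg sync.WaitGroup
	wg.Add(len(triggers))
	for i, t := range triggers {
		go func(id, trigger int) {
			defer wg.Done()
			if n == trigger {
				trySend(chErr, fmt.Errorf("Error %d", id))
			}
		}(i+1, t)
	}
	wg.Wait() // cannot hang: no goroutine ever blocks on a send

	select {
	case err := <-chErr:
		return err
	default:
		return nil
	}
}

func main() {
	if err := firstError(3, []int{1, 2, 3, 3, 4}); err != nil {
		fmt.Println(err)
	} else {
		fmt.Println("No error, job done")
	}
}
```

With n = 3 exactly one of the two errors is kept; which one depends on goroutine scheduling.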