英文:
goroutine not seeing context cancel?
问题
我有两个 goroutine 同时运行。
在某个时刻,我希望程序能够优雅地退出,所以我使用 cancel()
函数通知我的 goroutine 需要停止,但只有其中一个接收到了消息。
这是我的主函数(简化版):
ctx := context.Background()
ctx, cancel := context.WithCancel(ctx)
done := make(chan os.Signal, 1)
signal.Notify(done, os.Interrupt, syscall.SIGINT, syscall.SIGTERM)
wg := &sync.WaitGroup{}
wg.Add(2)
go func() {
err := eng.Watcher(ctx, wg)
if err != nil {
cancel()
}
}()
go func() {
err := eng.Suspender(ctx, wg)
if err != nil {
cancel()
}
}()
<-done // 等待 SIGINT / SIGTERM
log.Print("接收到关闭信号")
cancel()
wg.Wait()
log.Print("控制器正常退出")
Suspender
goroutine 成功退出(以下是代码):
package main
import (
"context"
"sync"
"time"
log "github.com/sirupsen/logrus"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/client-go/util/retry"
)
func (eng *Engine) Suspender(ctx context.Context, wg *sync.WaitGroup) error {
contextLogger := eng.logger.WithFields(log.Fields{
"go-routine": "Suspender",
})
contextLogger.Info("启动 Suspender goroutine")
now := time.Now().In(eng.loc)
for {
select {
case n := <-eng.Wl:
// 做一些事情
case <-ctx.Done():
// 上下文结束,停止处理结果
contextLogger.Infof("goroutine Suspender 被上下文取消")
return nil
}
}
}
以下是没有接收到上下文取消的函数:
package main
import (
"context"
"sync"
"time"
log "github.com/sirupsen/logrus"
)
func (eng *Engine) Watcher(ctx context.Context, wg *sync.WaitGroup) error {
contextLogger := eng.logger.WithFields(log.Fields{
"go-routine": "Watcher",
"uptime-schedule": eng.upTimeSchedule,
})
contextLogger.Info("启动 Watcher goroutine")
ticker := time.NewTicker(time.Second * 30)
for {
select {
case <-ctx.Done():
contextLogger.Infof("goroutine watcher 被上下文取消")
log.Printf("toto")
return nil
case <-ticker.C:
// 做一些事情
}
}
}
请问我能帮到你什么?谢谢
英文:
I have two goroutines running at the same time.
At some point, I want my program to exit gracefully so I use the cancel()
func to notify my goroutines that they need to be stopped, but only one of the two receive the message.
here is my main (simplified):
ctx := context.Background()
ctx, cancel := context.WithCancel(ctx)
done := make(chan os.Signal, 1)
signal.Notify(done, os.Interrupt, syscall.SIGINT, syscall.SIGTERM)
wg := &sync.WaitGroup{}
wg.Add(2)
go func() {
err := eng.Watcher(ctx, wg)
if err != nil {
cancel()
}
}()
go func() {
err := eng.Suspender(ctx, wg)
if err != nil {
cancel()
}
}()
<-done // wait for SIGINT / SIGTERM
log.Print("receive shutdown")
cancel()
wg.Wait()
log.Print("controller exited properly")
The Suspender goroutine exist successfully (here is the code):
package main
import (
"context"
"sync"
"time"
log "github.com/sirupsen/logrus"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/client-go/util/retry"
)
func (eng *Engine) Suspender(ctx context.Context, wg *sync.WaitGroup) error {
contextLogger := eng.logger.WithFields(log.Fields{
"go-routine": "Suspender",
})
contextLogger.Info("starting Suspender goroutine")
now := time.Now().In(eng.loc)
for {
select {
case n := <-eng.Wl:
//dostuff
case <-ctx.Done():
// The context is over, stop processing results
contextLogger.Infof("goroutine Suspender canceled by context")
return nil
}
}
}
and here is the func that is not receiving the context cancellation:
package main
import (
"context"
"sync"
"time"
log "github.com/sirupsen/logrus"
)
func (eng *Engine) Watcher(ctx context.Context, wg *sync.WaitGroup) error {
contextLogger := eng.logger.WithFields(log.Fields{
"go-routine": "Watcher",
"uptime-schedule": eng.upTimeSchedule,
})
contextLogger.Info("starting Watcher goroutine")
ticker := time.NewTicker(time.Second * 30)
for {
select {
case <-ctx.Done():
contextLogger.Infof("goroutine watcher canceled by context")
log.Printf("toto")
return nil
case <-ticker.C:
//dostuff
}
}
}
}
Can you please help me ?
Thanks
答案1
得分: 2
你尝试过使用errgroup吗?它内置了上下文取消功能:
ctx := context.Background()
ctx, cancel := context.WithCancel(ctx)
defer cancel()
done := make(chan os.Signal, 1)
signal.Notify(done, os.Interrupt, syscall.SIGINT, syscall.SIGTERM)
// "golang.org/x/sync/errgroup"
wg, ctx := errgroup.WithContext(ctx)
wg.Go(func() error {
return eng.Watcher(ctx, wg)
})
wg.Go(func() error {
return eng.Suspender(ctx, wg)
})
wg.Go(func() error {
defer cancel()
<-done
return nil
})
err := wg.Wait()
if err != nil {
log.Print(err)
}
log.Print("receive shutdown")
log.Print("controller exited properly")
英文:
Did you try it with an errgroup? It has context cancellation baked in:
ctx := context.Background()
ctx, cancel := context.WithCancel(ctx)
defer cancel()
done := make(chan os.Signal, 1)
signal.Notify(done, os.Interrupt, syscall.SIGINT, syscall.SIGTERM)
// "golang.org/x/sync/errgroup"
wg, ctx := errgroup.WithContext(ctx)
wg.Go(func() error {
return eng.Watcher(ctx, wg)
})
wg.Go(func() error {
return eng.Suspender(ctx, wg)
})
wg.Go(func() error {
defer cancel()
<-done
return nil
})
err := wg.Wait()
if err != nil {
log.Print(err)
}
log.Print("receive shutdown")
log.Print("controller exited properly")
答案2
得分: 0
表面上看,代码看起来不错。我唯一能想到的是在"dostuff"中可能很繁忙。在调试器中逐步执行与时间相关的代码可能会很棘手,所以尝试添加一些日志记录:
case <-ticker.C:
log.Println("doing stuff")
//dostuff
log.Println("done stuff")
(我还假设你在go协程中的某个地方调用了wg.Done()
,尽管如果缺少这些调用,那也不会导致你所描述的问题。)
英文:
On the surface the code looks good. The only thing I can think is that it's busy in "dostuff". It can be tricky to step through timing related code in the debugger so try adding some logging:
case <-ticker.C:
log.Println("doing stuff")
//dostuff
log.Println("done stuff")
(I also assume you are calling wg.Done()
in your go-routines somewhere though if they are missing that would not be the cause of the problem you describe.)
答案3
得分: 0
Suspender
和Watcher
中的代码没有通过Done()
方法调用来减少waitgroup计数器 - 这是无限执行的原因。
老实说,忘记这样的小事情是相当正常的。这就是为什么作为Go的标准通用实践,建议使用defer
并在函数/方法的最开始处理那些关键的事情。
更新后的实现可能如下所示:
func (eng *Engine) Suspender(ctx context.Context, wg *sync.WaitGroup) error {
defer wg.Done()
// ------------------------------------
func (eng *Engine) Watcher(ctx context.Context, wg *sync.WaitGroup) error {
defer wg.Done()
contextLogger := eng.logger.WithFields(log.Fields{
另外,另一个建议是,查看主例程时,建议始终将context
按值传递给任何被调用的go例程或方法调用(lambda)。
这种方法可以帮助开发人员避免很难注意到的与程序相关的错误。
go func(ctx context.Context) {
err := eng.Watcher(ctx, wg)
if err != nil {
cancel()
}
}(ctx)
编辑-1:(确切的解决方案)
尝试按我之前提到的方式使用值传递来传递上下文。否则,两个go例程将使用相同的上下文(因为您正在引用它),并且只会触发一个ctx.Done()
。
通过将ctx
作为值传递,Go中创建了两个独立的子上下文。在使用cancel()关闭父上下文时,两个子上下文都会独立触发ctx.Done()
。
英文:
The code in Suspender
and in Watcher
doesn't decrement the waitgroup counter through the Done()
method call - the reason behind the infinite execution.
And to be honest it's quite normal to forget such small things. That's why as a standard general practice in Go, it is suggested to use defer
and handle things that are critical (and should be handled inside the function/method ) at the very beginning.
The updated implementation might look like
func (eng *Engine) Suspender(ctx context.Context, wg *sync.WaitGroup) error {
defer wg.Done()
// ------------------------------------
func (eng *Engine) Watcher(ctx context.Context, wg *sync.WaitGroup) error {
defer wg.Done()
contextLogger := eng.logger.WithFields(log.Fields{
Also, another suggestion, looking at the main routine, it is always suggested to pass context by value
to any go-routine or method calls (lambda) that are being invoked.
This approach saves developers from a lot of program-related bugs that can't be noticed very easily.
go func(ctx context.Context) {
err := eng.Watcher(ctx, wg)
if err != nil {
cancel()
}
}(ctx)
Edit-1: (the exact solution)
Try passing the context using the value in the go routines as I mentioned earlier. Otherwise, both of the go routine will use a single context (because you are referencing it) and only one ctx.Done()
will be fired.
By passing ctx
as a value 2 separate child contexts are created in Go. And while closing parent with cancel() - both children independently fires ctx.Done()
.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论