Why does my program not shut down when I use a goroutine in main?
Question
Context
Please read the comments in the code carefully; all the details are in them.
In case you have experience using discordgo
The full code can be found here: https://github.com/telephrag/kubinka/tree/bug (see packages strg and main). With the goroutine added, the command handlers stop working properly as well. Everything that interacts with the database (storing on /deploy and removing on /return) does not work at all. Users receive only a "The application did not respond" message instead of a proper response (see the packages whose names start with the cmd_ prefix).
package main

import (
	"context"
	"fmt"
	"log"
	"os"
	"os/signal"
	"syscall"
	"time"

	"go.etcd.io/bbolt"
)

/*
	TO REPRODUCE:
	Start the program, wait a few seconds, and press ^C.
	Expect the program not to shut down, even after a few attempts.
*/
func WatchExpirations(ctx context.Context, db *bbolt.DB, bkt string) error {
	timeout := time.After(time.Second * 5)
	for {
		select {
		case <-timeout:
			tx, err := db.Begin(true)
			if err != nil {
				return fmt.Errorf("bolt: failed to start transaction")
			}
			bkt := tx.Bucket([]byte(bkt))
			c := bkt.Cursor()
			for k, v := c.First(); k != nil; k, v = c.Next() {
				// do stuff with bucket...
				fmt.Println(v) // check whether v matches the condition; delete it if so
				if err := tx.Commit(); err != nil { // BUG: committing the transaction inside the loop
					tx.Rollback()
					return fmt.Errorf("bolt: failed to commit transaction: %w", err)
				}
				timeout = time.After(time.Second * 5)
			}
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}
func main() {
	ctx, cancel := context.WithCancel(context.Background())
	db, err := bbolt.Open("kubinka.db", 0666, nil)
	if err != nil {
		log.Panicf("failed to open db %s: %v", "kubinka.db", err)
	}
	if err = db.Update(func(tx *bbolt.Tx) error {
		_, err := tx.CreateBucketIfNotExists([]byte("players"))
		if err != nil {
			return fmt.Errorf("failed to create bucket %s: %w", "players", err)
		}
		return nil
	}); err != nil {
		log.Panic(err)
	}
	defer func() { // BUG?: panicking inside defer
		if err := db.Close(); err != nil { // closes normally in debug mode
			log.Panicf("error closing db conn: %v", err) // gets stuck otherwise
		}
	}()
	// use `ds` to handle commands from user while storing ctx internally
	go func() { // adding this goroutine prevents the program from shutting down
		err = WatchExpirations(ctx, db, "players")
		if err != nil {
			log.Printf("error while watching expirations in db")
			cancel()
		}
	}()
	interrupt := make(chan os.Signal, 1)
	signal.Notify(interrupt, syscall.SIGTERM, syscall.SIGINT)
	for {
		select {
		// as seen in the debugger, this branch is reached;
		// however, the program then stalls forever
		case <-interrupt:
			log.Println("Execution stopped by user")
			cancel()
			return // is reached, but the program doesn't stop
		case <-ctx.Done():
			log.Println("ctx cancelled")
			return
		default:
			time.Sleep(time.Millisecond * 200)
		}
	}
}
Answer 1
Score: 0
As per the comment in your repo, the issue appears to have been here:
tx, err := db.Begin(true)
if err != nil {
	return fmt.Errorf("bolt: failed to start transaction")
}
bkt := tx.Bucket([]byte(bkt))
c := bkt.Cursor()
for k, v := c.First(); k != nil; k, v = c.Next() {
	// do stuff with bucket...
	fmt.Println(v) // check whether v matches the condition; delete it if so
	if err := tx.Commit(); err != nil { // BUG: committing the transaction inside the loop
		tx.Rollback()
		return fmt.Errorf("bolt: failed to commit transaction: %w", err)
	}
	timeout = time.After(time.Second * 5)
}
The loop could iterate zero or more times.
- If there are no iterations, tx is not committed and timeout is not reset (so case <-timeout: will not be triggered again).
- If there is more than one iteration, you will attempt to tx.Commit() multiple times (an error).
This probably led to the issue you saw. The documentation for the bolt Close function says:
> Close releases all database resources. All transactions must be closed before closing the database.
So if a transaction is still running, Close blocks until it completes (internally, bolt locks a mutex when the transaction begins and releases it when the transaction is done).
The solution is to ensure that the transaction is always closed (and only closed once).