runtime: goroutine stack exceeds 1000000000-byte limit


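Question

I have a booking SaaS system on which thousands of merchants run their businesses.

To complete their bookings each day, I have a cron job that runs every 5 minutes. The cron job calls a scheduler API from the main goroutine, but the call is completed independently in another goroutine together with an exec command. I suddenly started getting this error on my cron system:

runtime: goroutine stack exceeds 1000000000-byte limit

Here is my code structure: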

package cron

import (
	"gopkg.in/robfig/cron.v3"
)

func RunCron() {
	c := cron.New()
	c.AddFunc("@every 0h5m0s", SendBookingMail)
	c.Start()
}

func SendBookingMail() {
	// This function fetches all merchants and issues a curl command to the API URL for each merchant; the function below is then executed.
}

func sendMailCron() {
	completeBkMailData := struct {
		Booking         models.Booking    `json:"booking"`
		TestCustomerIds []int             `json:"test_customer_ids"`
		SmsPermission   bool              `json:"sms_permission"`
		SmsKeys         map[string]string `json:"sms_keys"`
	}{
		booking,
		testCids,
		smsPermission,
		smsKeys,
	}
	b, err := json.Marshal(completeBkMailData)
	if err != nil {
		fmt.Println(err)
	}
	jsonString := string(b)
	command := "https://example.com/booking-mail"
	StartCurlCommand(command, "POST", jsonString)
}

func StartCurlCommand(url, reqType, jsonData string, headers ...string) error {
	var ip, userAgent, bearerToken string
	var cmd *exec.Cmd
	if len(headers) > 0 {
		ip = headers[0]
		userAgent = headers[1]
		bearerToken = headers[2]
	}
	if reqType == "POST" {
		cmd = exec.Command("curl", "-H", "Connection: close", "--no-keepalive", "-H", "Content-Type: application/json", "-X", "POST", "-d", jsonData, url)
	} else {
		cmd = exec.Command("curl", "-H", "Connection: close", "--no-keepalive", url)
	}
	var out bytes.Buffer
	var stderr bytes.Buffer
	cmd.Stdout = &out
	cmd.Stderr = &stderr
	err := cmd.Start()
	if err == nil {
		go func(cmd *exec.Cmd) {
			_ = cmd.Wait()
		}(cmd)
	}
	return err
}

I have already searched for this and found that the code may contain some form of recursion, but I cannot identify where it is. What is wrong here?


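Answer 1

Score: 1

`goroutine stack exceeds 1000000000-byte limit` means your program has infinite recursion (or recursion that is too deep). The call stack is a limited resource, so recursion should be used sparingly.

Example: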

package main

func test(x int) int {
    return x + test(x+1)
}

func main() {
    test(1)
}

Running it produces a fatal error:

$ go run .
runtime: goroutine stack exceeds 1000000000-byte limit
runtime: sp=0xc020160398 stack=[0xc020160000, 0xc040160000]
fatal error: stack overflow

runtime stack:
runtime.throw(0x474d4b, 0xe)
        /usr/local/go/src/runtime/panic.go:1116 +0x72
runtime.newstack()
        /usr/local/go/src/runtime/stack.go:1067 +0x78d
runtime.morestack()
        /usr/local/go/src/runtime/asm_amd64.s:449 +0x8f

goroutine 1 [running]:
main.test(0xffffdf, 0x0)
        /home/test/gtest/test.go:3 +0x50 fp=0xc0201603a8 sp=0xc0201603a0 pc=0x45dcd0
main.test(0xffffde, 0x0)
        /home/test/gtest/test.go:4 +0x2f fp=0xc0201603c8 sp=0xc0201603a8 pc=0x45dcaf
main.test(0xffffdd, 0x0)
        /home/test/gtest/test.go:4 +0x2f fp=0xc0201603e8 sp=0xc0201603c8 pc=0x45dcaf
main.test(0xffffdc, 0x0)
        /home/test/gtest/test.go:4 +0x2f fp=0xc020160408 sp=0xc0201603e8 pc=0x45dcaf
main.test(0xffffdb, 0x0)
        /home/test/gtest/test.go:4 +0x2f fp=0xc020160428 sp=0xc020160408 pc=0x45dcaf
main.test(0xffffda, 0x0)
        /home/test/gtest/test.go:4 +0x2f fp=0xc020160448 sp=0xc020160428 pc=0x45dcaf
main.test(0xffffd9, 0x0)
        /home/test/gtest/test.go:4 +0x2f fp=0xc020160468 sp=0xc020160448 pc=0x45dcaf
main.test(0xffffd8, 0x0)
        /home/test/gtest/test.go:4 +0x2f fp=0xc020160488 sp=0xc020160468 pc=0x45dcaf
main.test(0xffffd7, 0x0)
        /home/test/gtest/test.go:4 +0x2f fp=0xc0201604a8 sp=0xc020160488 pc=0x45dcaf
. . .

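From the goroutine trace we can see that the issue is in `test.go` at line `4`, which is the recursive call to the function declared on line `3`. That should be enough to fix the code.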
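
Once the trace points at the recursive call, the usual fix is to give the recursion a terminating condition or to replace it with a loop. The sketch below is one possible repair of the example above, not the only one. The names `sumFrom` and `sumFromLoop` and the `limit` parameter are hypothetical and do not appear in the original answer.

```go
package main

import "fmt"

// sumFrom adds x, x+1, ..., limit. The base case (x > limit) is what the
// original test function was missing, so the recursion now terminates.
func sumFrom(x, limit int) int {
	if x > limit {
		return 0
	}
	return x + sumFrom(x+1, limit)
}

// sumFromLoop computes the same sum iteratively, so stack use stays
// constant no matter how large the range is.
func sumFromLoop(x, limit int) int {
	total := 0
	for i := x; i <= limit; i++ {
		total += i
	}
	return total
}

func main() {
	fmt.Println(sumFrom(1, 4), sumFromLoop(1, 4))
}
```

The recursive version still uses one stack frame per step, so for very large ranges the loop is the safer choice.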




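# Answer 2
**Score**: 0

> But I am not able to identify where it is.

That might be because the fix for [issue 7181][1] has yet to be released.

> Any way to vote on this? Our dev team just spent a day trying to guess which part of our code might have been in the stack because our entire codebase was elided from the trace.

[Commit 3a81338][2] was reverted, but testing with a Go toolchain patched with this commit would help: instead of printing massive stack traces during endless recursion, which spam users and aren't useful, it prints the top and bottom 50 frames.

That would go a long way toward identifying the root cause of the recursion.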
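
Until the runtime prints both ends of the stack, a program can look for the repetition itself. The following is a minimal sketch, assuming it is acceptable to call a diagnostic helper from a suspect code path. `repeatedFrame` and `descend` are hypothetical names introduced for the example, while `runtime.Callers` and `runtime.CallersFrames` are standard-library functions.

```go
package main

import (
	"fmt"
	"runtime"
)

// repeatedFrame walks the calling goroutine's stack and returns the name of
// the first function that appears more than limit times, if there is one.
// Only the innermost 1024 frames are inspected.
func repeatedFrame(limit int) (string, bool) {
	pcs := make([]uintptr, 1024)
	n := runtime.Callers(2, pcs) // skip runtime.Callers and repeatedFrame
	frames := runtime.CallersFrames(pcs[:n])
	counts := make(map[string]int)
	for {
		frame, more := frames.Next()
		counts[frame.Function]++
		if counts[frame.Function] > limit {
			return frame.Function, true
		}
		if !more {
			return "", false
		}
	}
}

// descend is a hypothetical recursive function used only for this demo.
// It calls itself depth times and then inspects its own stack.
func descend(depth int) (string, bool) {
	if depth == 0 {
		return repeatedFrame(10)
	}
	return descend(depth - 1)
}

func main() {
	name, found := descend(50)
	fmt.Println(name, found)
}
```

Calling such a helper from a function suspected of recursing, and logging or panicking when it reports a repeat, yields a short trace long before the 1000000000-byte limit is reached. Walking the stack has a cost, so this is meant for debugging rather than production use.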

The [OP Amandeep Kaur][3] confirms in [the comments][4]:

> There was recursion mentioned in the stack trace.
I fixed that. Now the system is working fine.
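
The thread does not say where the recursion was. As an illustration only, and not a diagnosis of the code in the question, one place such a loop can hide near a `json.Marshal` call is a `MarshalJSON` method that marshals its own receiver. The `Booking` type below is a hypothetical stand-in for `models.Booking`, whose definition is not shown, and `encode` is a helper invented for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Booking is a hypothetical stand-in for models.Booking.
type Booking struct {
	ID     int    `json:"id"`
	Status string `json:"status"`
}

// MarshalJSON adds a computed field. Calling json.Marshal(b) in here would
// invoke MarshalJSON again and recurse until the stack overflows. The local
// alias type has the same fields but no methods, which breaks the cycle.
func (b Booking) MarshalJSON() ([]byte, error) {
	type alias Booking
	return json.Marshal(struct {
		alias
		Completed bool `json:"completed"`
	}{
		alias:     alias(b),
		Completed: b.Status == "completed",
	})
}

// encode marshals a Booking and returns the JSON as a string.
func encode(b Booking) (string, error) {
	out, err := json.Marshal(b)
	return string(out), err
}

func main() {
	s, err := encode(Booking{ID: 7, Status: "completed"})
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(s)
}
```

If a trace shows `MarshalJSON` and `json.Marshal` frames alternating, a method that marshals its own receiver without such an alias is a likely place to look.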

  [1]: https://github.com/golang/go/issues/7181
  [2]: https://github.com/golang/go/commit/3a81338622eb5c8b94f11001855e2a68a9e36bed
  [3]: https://stackoverflow.com/users/7533957/amandeep-kaur
  [4]: https://stackoverflow.com/questions/69625277/runtime-goroutine-stack-exceeds-1000000000-byte-limit?noredirect=1#comment123068142_69625277




Posted by huangapple on October 19, 2021, 12:52:03
When reposting, please keep this link: https://go.coder-hub.com/69625277.html