英文:
Golang stuck in WaitGroup
问题
我被卡在一个自己的等待循环中,不太确定原因。该函数接收一个输入通道和一个输出通道,然后对通道中的每个项目执行http.GET请求以获取内容,并从HTML中提取
获取和解析的过程在一个go协程中进行,并且我设置了一个等待组(innerWait),以确保在关闭输出通道之前我已经处理完所有内容。
日志内容如下:
2015/08/09 22:02:10 INFO: Revived query parameter: golang
2015/08/09 22:02:10 INFO: Getting active tweets from the last 7 days.
2015/08/09 22:02:10 INFO: Incrementing inner WaitGroup.
2015/08/09 22:02:10 INFO: Incrementing inner WaitGroup.
2015/08/09 22:02:10 INFO: Incrementing inner WaitGroup.
2015/08/09 22:02:10 INFO: Incrementing inner WaitGroup.
2015/08/09 22:02:10 INFO: Incrementing inner WaitGroup.
2015/08/09 22:02:10 INFO: Incrementing inner WaitGroup.
2015/08/09 22:02:10 INFO: Waiting for title pull wait group.
2015/08/09 22:02:10 INFO: Getting title for: http://devsisters.github.io/goquic/
2015/08/09 22:02:10 INFO: Pulled title: GoQuic by devsisters
2015/08/09 22:02:10 INFO: Getting title for: http://whizdumb.me/2015/03/03/matching-a-string-and-extracting-values-using-regex/
2015/08/09 22:02:10 INFO: Pulled title: Matching a string and extracting values using regex | Whizdumb's blog
2015/08/09 22:02:10 INFO: Getting title for: https://www.reddit.com/r/golang/comments/3g7tyv/dropboxs_infrastructure_is_go_at_a_huge_scale/
2015/08/09 22:02:10 INFO: Pulled title: Dropbox's infrastructure is Go at a huge scale : golang
2015/08/09 22:02:10 INFO: Getting title for: http://dave.cheney.net/2015/08/08/performance-without-the-event-loop
2015/08/09 22:02:10 INFO: Pulled title: Performance without the event loop | Dave Cheney
2015/08/09 22:02:11 INFO: Getting title for: https://github.com/ccirello/sublime-gosnippets
2015/08/09 22:02:11 INFO: Pulled title: ccirello/sublime-gosnippets · GitHub
2015/08/09 22:02:11 INFO: Getting title for: https://medium.com/iron-io-blog/an-easier-way-to-create-tiny-golang-docker-images-7ba2893b160?mkt_tok=3RkMMJWWfF9wsRonuqTMZKXonjHpfsX57ewoWaexlMI/0ER3fOvrPUfGjI4ATsNrI%2BSLDwEYGJlv6SgFQ7LMMaZq1rgMXBk%3D&utm_content=buffer45a1c&utm_medium=social&utm_source=twitter.com&utm_campaign=buffer
2015/08/09 22:02:11 INFO: Pulled title: An Easier Way to Create Tiny Golang Docker Images — Iron.io Blog — Medium
根据日志,我可以看到我已经到达了innerWait.Wait()命令,这也告诉我管道的另一端已经关闭了入站通道。
似乎匿名函数中的延迟语句没有被调用,因为我无法在任何地方看到延迟的日志语句。但是我无法弄清楚原因,因为该块中的所有代码似乎都执行了。
希望能得到帮助。
英文:
I'm stuck in my own wait loop and not really sure why. The function takes an input and output channel, then takes each item in the channel, executes an http.GET for the content and pulls the <title> tag from the html.
The process to GET and scrape is inside a go routine, and I've set up a wait group (innerWait) to be sure that I've processed everything before closing the output channel.
func (fp FeedProducer) getTitles(in <-chan feeds.Item,
out chan<- feeds.Item,
wg *sync.WaitGroup) {
defer wg.Done()
var innerWait sync.WaitGroup
for item := range in {
log.Infof(fp.c, "Incrementing inner WaitGroup.")
innerWait.Add(1)
go func(item feeds.Item) {
defer innerWait.Done()
defer log.Infof(fp.c, "Decriment inner wait group by defer.")
client := urlfetch.Client(fp.c)
resp, err := client.Get(item.Link.Href)
log.Infof(fp.c, "Getting title for: %v", item.Link.Href)
if err != nil {
log.Errorf(fp.c, "Error retriving page. %v", err.Error())
return
}
if strings.ToLower(resp.Header.Get("Content-Type")) == "text/html; charset=utf-8" {
title := fp.scrapeTitle(resp)
item.Title = title
} else {
log.Errorf(fp.c, "Wrong content type. Received: %v from %v", resp.Header.Get("Content-Type"), item.Link.Href)
}
out <- item
}(item)
}
log.Infof(fp.c, "Waiting for title pull wait group.")
innerWait.Wait()
log.Infof(fp.c, "Done waiting for title pull.")
close(out)
}
func (fp FeedProducer) scrapeTitle(request *http.Response) string {
defer request.Body.Close()
tokenizer := html.NewTokenizer(request.Body)
var titleIsNext bool
for {
token := tokenizer.Next()
switch {
case token == html.ErrorToken:
log.Infof(fp.c, "Hit the end of the doc without finding title.")
return ""
case token == html.StartTagToken:
tag := tokenizer.Token()
isTitle := tag.Data == "title"
if isTitle {
titleIsNext = true
}
case titleIsNext && token == html.TextToken:
title := tokenizer.Token().Data
log.Infof(fp.c, "Pulled title: %v", title)
return title
}
}
}
Log content looks like this:
2015/08/09 22:02:10 INFO: Revived query parameter: golang
2015/08/09 22:02:10 INFO: Getting active tweets from the last 7 days.
2015/08/09 22:02:10 INFO: Incrementing inner WaitGroup.
2015/08/09 22:02:10 INFO: Incrementing inner WaitGroup.
2015/08/09 22:02:10 INFO: Incrementing inner WaitGroup.
2015/08/09 22:02:10 INFO: Incrementing inner WaitGroup.
2015/08/09 22:02:10 INFO: Incrementing inner WaitGroup.
2015/08/09 22:02:10 INFO: Incrementing inner WaitGroup.
2015/08/09 22:02:10 INFO: Waiting for title pull wait group.
2015/08/09 22:02:10 INFO: Getting title for: http://devsisters.github.io/goquic/
2015/08/09 22:02:10 INFO: Pulled title: GoQuic by devsisters
2015/08/09 22:02:10 INFO: Getting title for: http://whizdumb.me/2015/03/03/matching-a-string-and-extracting-values-using-regex/
2015/08/09 22:02:10 INFO: Pulled title: Matching a string and extracting values using regex | Whizdumb's blog
2015/08/09 22:02:10 INFO: Getting title for: https://www.reddit.com/r/golang/comments/3g7tyv/dropboxs_infrastructure_is_go_at_a_huge_scale/
2015/08/09 22:02:10 INFO: Pulled title: Dropbox's infrastructure is Go at a huge scale : golang
2015/08/09 22:02:10 INFO: Getting title for: http://dave.cheney.net/2015/08/08/performance-without-the-event-loop
2015/08/09 22:02:10 INFO: Pulled title: Performance without the event loop | Dave Cheney
2015/08/09 22:02:11 INFO: Getting title for: https://github.com/ccirello/sublime-gosnippets
2015/08/09 22:02:11 INFO: Pulled title: ccirello/sublime-gosnippets · GitHub
2015/08/09 22:02:11 INFO: Getting title for: https://medium.com/iron-io-blog/an-easier-way-to-create-tiny-golang-docker-images-7ba2893b160?mkt_tok=3RkMMJWWfF9wsRonuqTMZKXonjHpfsX57ewoWaexlMI/0ER3fOvrPUfGjI4ATsNrI%2BSLDwEYGJlv6SgFQ7LMMaZq1rgMXBk%3D&utm_content=buffer45a1c&utm_medium=social&utm_source=twitter.com&utm_campaign=buffer
2015/08/09 22:02:11 INFO: Pulled title: An Easier Way to Create Tiny Golang Docker Images — Iron.io Blog — Medium
I can see that I'm getting to the innerWait.Wait() command based on the logs, which also tells me that the inbound channel has been closed on the other side of the pipe.
It would appear that the defer statements in the anonymous function are not being called, as I can't see the deferred log statement printed anywhere. But I can't for the life of me tell why as all code in that block appears to execute.
Help is appreciated.
答案1
得分: 7
以下是要翻译的内容:
goroutine在这一行代码中被卡住了,它们正在向out
发送数据:
out <- item
修复的方法是启动一个goroutine来接收out
的数据。
解决这类问题的一个好方法是通过发送SIGQUIT信号来转储goroutine的堆栈信息。
英文:
The goroutines are stuck sending to out
at this line:
out <- item
The fix is to start a goroutine to receive on out
.
A good way to debug issues like this is to dump the goroutine stacks by sending the process a SIGQUIT.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论