How to print the results from a concurrent and recursive function?
Question
I've been going through the Go tour, and I've finished the web crawler exercise, but I think the technique I used to print all the results may be inefficient.
Here is my code. I only edited the Crawl and main functions, so I'll just post those. Here is the link to the exercise: http://tour.golang.org/#70
var used = make(map[string]bool)

func Crawl(url string, depth int, fetcher Fetcher, results chan string) {
    if depth <= 0 {
        return
    }
    body, urls, err := fetcher.Fetch(url)
    if err != nil {
        results <- fmt.Sprintf("%v", err)
        return
    }
    results <- fmt.Sprintf("\nfound: %s %q\n", url, body)
    for _, u := range urls {
        if used[u] == false {
            used[u] = true
            go Crawl(u, depth-1, fetcher, results)
        }
    }
    return
}

//------------------------------------------------------------

func main() {
    used["http://golang.org/"] = true
    results := make(chan string)
    go Crawl("http://golang.org/", 4, fetcher, results)
    for i := 0; i < len(used); i++ {
        fmt.Println(<-results)
    }
}
I use the "for i < len(used)" line in main to ensure that the value from results is printed only if there is a result to print. I can't just use
for i := range results
because it is hard to use "close(results)" in the crawl function since it is recursive, but with the way I do it I have to find the length of the variable used every time.
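For illustration, ranging over a channel only terminates once the channel is closed, as this stripped-down, self-contained sketch (with placeholder strings instead of real crawl results) shows:

package main

import "fmt"

func main() {
    results := make(chan string)
    go func() {
        results <- "one"
        results <- "two"
        close(results) // without this, the range below would block forever
    }()
    for s := range results {
        fmt.Println(s)
    }
}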
Is there a better way to do this?
Answer 1
Score: 3
To wait for a collection of goroutines to finish, use a sync.WaitGroup.
I believe you'll find the example in the official documentation very familiar.
http://golang.org/pkg/sync/#example_WaitGroup
Quoting:
var wg sync.WaitGroup
var urls = []string{
    "http://www.golang.org/",
    "http://www.google.com/",
    "http://www.somestupidname.com/",
}
for _, url := range urls {
    // Increment the WaitGroup counter.
    wg.Add(1)
    // Launch a goroutine to fetch the URL.
    go func(url string) {
        // Fetch the URL.
        http.Get(url)
        // Decrement the counter.
        wg.Done()
    }(url)
}
// Wait for all HTTP fetches to complete.
wg.Wait()
This will block until all the work is done.
If you really want to print the results progressively as you collect them, the simplest way is to do it in the fetcher itself.
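As a rough sketch of how that advice could be adapted to the code in the question (one possible adaptation, not the only one; it assumes the Fetcher interface and fetcher value from the tour exercise, plus imports of "fmt" and "sync"), a WaitGroup can track every Crawl goroutine while a separate goroutine closes the results channel once they have all finished, so main can simply range over it:

var used = make(map[string]bool)
var mu sync.Mutex // guards used, since Crawl now runs concurrently

func Crawl(url string, depth int, fetcher Fetcher, results chan<- string, wg *sync.WaitGroup) {
    defer wg.Done()
    if depth <= 0 {
        return
    }
    body, urls, err := fetcher.Fetch(url)
    if err != nil {
        results <- fmt.Sprintf("%v", err)
        return
    }
    results <- fmt.Sprintf("found: %s %q", url, body)
    for _, u := range urls {
        mu.Lock()
        seen := used[u]
        used[u] = true
        mu.Unlock()
        if !seen {
            wg.Add(1) // add before launching, so Wait can't return early
            go Crawl(u, depth-1, fetcher, results, wg)
        }
    }
}

func main() {
    used["http://golang.org/"] = true
    results := make(chan string)
    var wg sync.WaitGroup

    wg.Add(1)
    go Crawl("http://golang.org/", 4, fetcher, results, &wg)

    // Close results exactly once, after every Crawl goroutine is done,
    // so the range loop below terminates on its own.
    go func() {
        wg.Wait()
        close(results)
    }()

    for s := range results {
        fmt.Println(s)
    }
}

The point is that close(results) happens in exactly one place, after the WaitGroup counter reaches zero, so the recursive calls never have to decide which of them should close the channel.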