
How to print the results from a concurrent and recursive function?

Question

I've been going through the Go tour, and I've finished the web crawler exercise, but I think the technique I used to print all the results may be inefficient.

Here is my code. I only edited the Crawl and main functions, so I'll just post those. Here is the link to the exercise: http://tour.golang.org/#70

    var used = make(map[string]bool)

    func Crawl(url string, depth int, fetcher Fetcher, results chan string) {
        if depth <= 0 {
            return
        }
        body, urls, err := fetcher.Fetch(url)
        if err != nil {
            results <- fmt.Sprintf("%v", err)
            return
        }
        results <- fmt.Sprintf("\nfound: %s %q\n", url, body)
        for _, u := range urls {
            if used[u] == false {
                used[u] = true
                go Crawl(u, depth-1, fetcher, results)
            }
        }
        return
    }

    //------------------------------------------------------------
    func main() {
        used["http://golang.org/"] = true
        results := make(chan string)
        go Crawl("http://golang.org/", 4, fetcher, results)
        for i := 0; i < len(used); i++ {
            fmt.Println(<-results)
        }
    }

I use the "for i < len(used)" loop in main to ensure that a value from results is printed only when there is a result to print. I can't simply use

    for i := range results

because it is hard to call "close(results)" in the Crawl function, since it is recursive; but the way I do it, I have to look up the length of the variable used on every iteration.

Is there a better way to do this?

Answer 1

Score: 3

To wait for a collection of goroutines to finish, use a sync.WaitGroup.

I believe you'll find the example in the official documentation very familiar:

http://golang.org/pkg/sync/#example_WaitGroup

Quoting:

    var wg sync.WaitGroup
    var urls = []string{
        "http://www.golang.org/",
        "http://www.google.com/",
        "http://www.somestupidname.com/",
    }
    for _, url := range urls {
        // Increment the WaitGroup counter.
        wg.Add(1)
        // Launch a goroutine to fetch the URL.
        go func(url string) {
            // Fetch the URL.
            http.Get(url)
            // Decrement the counter.
            wg.Done()
        }(url)
    }
    // Wait for all HTTP fetches to complete.
    wg.Wait()

This will block until all the work is done.

If you really want to print the results progressively as you collect them, the simplest way is to do it in the fetcher itself.
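
Applied to the tour's crawler, a minimal sketch of that combination might look like the following (my sketch, not part of the original answer; it assumes the Fetcher interface and fetcher variable provided by the exercise). Crawl prints each result right where it is produced, and a package-level sync.WaitGroup tracks the outstanding goroutines:

    var used = make(map[string]bool)
    var wg sync.WaitGroup

    func Crawl(url string, depth int, fetcher Fetcher) {
        defer wg.Done() // signal completion on every return path
        if depth <= 0 {
            return
        }
        body, urls, err := fetcher.Fetch(url)
        if err != nil {
            fmt.Println(err) // print progressively, where the result appears
            return
        }
        fmt.Printf("found: %s %q\n", url, body)
        for _, u := range urls {
            if !used[u] {
                used[u] = true
                wg.Add(1) // count the goroutine before launching it
                go Crawl(u, depth-1, fetcher)
            }
        }
    }

    func main() {
        used["http://golang.org/"] = true
        wg.Add(1)
        go Crawl("http://golang.org/", 4, fetcher)
        wg.Wait() // block until every Crawl goroutine has finished
    }

If you would rather keep the results channel and range over it, the same WaitGroup also solves the close(results) problem from the question: keep the channel-based Crawl, add the wg.Add/wg.Done calls as above, and close the channel from a helper goroutine once all the work is done, e.g.

    go func() {
        wg.Wait()
        close(results)
    }()
    for s := range results {
        fmt.Println(s)
    }

One caveat with either version: as in the original code, the used map is read and written from several goroutines at once, which is a data race; a complete solution would guard it with a sync.Mutex.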
