WebCrawl exercise done using sync.WaitGroup panics. What am I doing wrong? What would be the idiomatic Go solution?

Question

I am learning Go using "A Tour of Go". I managed to do all the exercises, but the last one has me frustrated. It dies with fatal error: all goroutines are asleep - deadlock!

package main

import (
	"fmt"
	"sync"
)

type Fetcher interface {
	// Fetch returns the body of URL and
	// a slice of URLs found on that page.
	Fetch(url string) (body string, urls []string, err error)
}

var urlCache = make(map[string]bool)
var mutex = sync.Mutex{}

// Crawl uses fetcher to recursively crawl
// pages starting with url, to a maximum of depth.
func Crawl(url string, depth int, fetcher Fetcher, wg *sync.WaitGroup) {
	defer wg.Done()

	fmt.Printf("Crawl: %v \n", url)

	if depth <= 0 {
		fmt.Println("Crawl: reached depth")
		return
	}

	mutex.Lock()
	if alreadyFetched := urlCache[url]; alreadyFetched {
		fmt.Printf("Crawl: %v already fetched\n", url)
		return
	}
	urlCache[url] = true
	mutex.Unlock()

	body, urls, err := fetcher.Fetch(url)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("Crawl: found %s %q\n", url, body)
	for _, u := range urls {
		wg.Add(1)
		go Crawl(u, depth-1, fetcher, wg)
	}
	return
}

func main() {
	var wg sync.WaitGroup

	fmt.Println("Main: Starting worker")
	wg.Add(1)
	go Crawl("https://golang.org/", 4, fetcher, &wg)
	fmt.Println("Main: Waiting for workers to finish")

	fmt.Println("Main: Completed")

	wg.Wait()
}

// fakeFetcher is a Fetcher that returns canned results.
type fakeFetcher map[string]*fakeResult

type fakeResult struct {
	body string
	urls []string
}

func (f fakeFetcher) Fetch(url string) (string, []string, error) {
	if res, ok := f[url]; ok {
		return res.body, res.urls, nil
	}
	return "", nil, fmt.Errorf("not found: %s", url)
}

// fetcher is a populated fakeFetcher.
var fetcher = fakeFetcher{
	"https://golang.org/": &fakeResult{
		"The Go Programming Language",
		[]string{
			"https://golang.org/pkg/",
			"https://golang.org/cmd/",
		},
	},
	"https://golang.org/pkg/": &fakeResult{
		"Packages",
		[]string{
			"https://golang.org/",
			"https://golang.org/cmd/",
			"https://golang.org/pkg/fmt/",
			"https://golang.org/pkg/os/",
		},
	},
	"https://golang.org/pkg/fmt/": &fakeResult{
		"Package fmt",
		[]string{
			"https://golang.org/",
			"https://golang.org/pkg/",
		},
	},
	"https://golang.org/pkg/os/": &fakeResult{
		"Package os",
		[]string{
			"https://golang.org/",
			"https://golang.org/pkg/",
		},
	},
}

Any tips and help will be greatly appreciated. Thank you.

Edit1: In-lined code


Answer 1

Score: 1


Here:

    mutex.Lock()
    if alreadyFetched := urlCache[url]; alreadyFetched {
        fmt.Printf("Crawl: %v already fetched\n", url)
        return
    }
    urlCache[url] = true
    mutex.Unlock()

When the if condition is true, you return without unlocking the shared mutex.

Eventually the other goroutines block forever on mutex.Lock(), because the goroutine that acquired the mutex never released it. Once every goroutine is blocked, the runtime reports the deadlock.

Also call mutex.Unlock() inside the if block before returning.
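As a minimal sketch of that fix, the check-and-set can be isolated in a small function that unlocks on both paths. The helper name markFetched is hypothetical and not part of the original code; urlCache and mutex mirror the globals from the question.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	urlCache = make(map[string]bool)
	mutex    sync.Mutex
)

// markFetched is a hypothetical helper (not in the original code).
// It reports whether url is being seen for the first time, and it
// releases the mutex on the early-return path as well as the normal path.
func markFetched(url string) bool {
	mutex.Lock()
	if alreadyFetched := urlCache[url]; alreadyFetched {
		mutex.Unlock() // the unlock that was missing before the early return
		return false
	}
	urlCache[url] = true
	mutex.Unlock()
	return true
}

func main() {
	fmt.Println(markFetched("https://golang.org/"))     // first visit
	fmt.Println(markFetched("https://golang.org/"))     // repeat visit
	fmt.Println(markFetched("https://golang.org/pkg/")) // still able to lock
}
```

The third call only succeeds because the second call released the mutex before returning; with the original code it would block forever.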

You could instead call defer mutex.Unlock() right after locking, before the if statement. In a trivial application this won't make an appreciable difference, but in a real-world scenario you want to hold the lock for the shortest time possible, and a deferred unlock keeps it held until the function returns. If the function body contains other long-running operations, unlocking explicitly right after the if is acceptable. In that case, you must remember to release the lock on every path where the if returns control to the caller.
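One way to get both properties, sketched here under assumptions, is to move the critical section into its own short method so that a deferred unlock releases the lock as soon as that method returns, well before any fetch runs. The visited type and its newVisited and firstVisit functions are hypothetical names, not part of the original code.

```go
package main

import (
	"fmt"
	"sync"
)

// visited is a hypothetical type (not in the original code) that bundles
// the map with the mutex guarding it.
type visited struct {
	mu   sync.Mutex
	seen map[string]bool
}

func newVisited() *visited {
	return &visited{seen: make(map[string]bool)}
}

// firstVisit marks url as seen and reports whether this call was the first
// to do so. The deferred Unlock runs on every return path, and the lock is
// held only for the duration of this short method.
func (v *visited) firstVisit(url string) bool {
	v.mu.Lock()
	defer v.mu.Unlock()
	if v.seen[url] {
		return false
	}
	v.seen[url] = true
	return true
}

func main() {
	v := newVisited()
	var wg sync.WaitGroup
	var countMu sync.Mutex
	firsts := 0

	// Ten goroutines race to claim the same URL; exactly one should win.
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if v.firstVisit("https://golang.org/") {
				countMu.Lock()
				firsts++
				countMu.Unlock()
			}
		}()
	}
	wg.Wait()
	fmt.Println("first visits:", firsts) // prints "first visits: 1"
}
```

In Crawl, the locked section would then shrink to a single check such as `if !cache.firstVisit(url) { return }` placed before fetcher.Fetch(url), where cache is an assumed *visited value shared by all goroutines.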

huangapple
  • Posted on November 18, 2021 at 00:34:31
  • When reposting, please keep the link to this article: https://go.coder-hub.com/70008225.html