Simple solution for golang tour webcrawler exercise

Question

I'm new to Go, and I've seen some solutions for this exercise, but I think they are complex...

In my solution everything seems simple, but I get a deadlock error. I can't figure out how to properly close the channels and stop the loop inside main. Is there a simple way to do this?

Thanks for any/all help one may provide!
package main

import (
	"fmt"
	"sync"
)

type Fetcher interface {
	// Fetch returns the body of URL and
	// a slice of URLs found on that page.
	Fetch(url string) (body string, urls []string, err error)
}

type SafeCache struct {
	cache map[string]bool
	mux   sync.Mutex
}

func (c *SafeCache) Set(s string) {
	c.mux.Lock()
	c.cache[s] = true
	c.mux.Unlock()
}

func (c *SafeCache) Get(s string) bool {
	c.mux.Lock()
	defer c.mux.Unlock()
	return c.cache[s]
}

var (
	sc   = SafeCache{cache: make(map[string]bool)}
	errs = make(chan error)
	ress = make(chan string)
)

// Crawl uses fetcher to recursively crawl
// pages starting with url, to a maximum of depth.
func Crawl(url string, depth int, fetcher Fetcher) {
	if depth <= 0 {
		return
	}
	var (
		body string
		err  error
		urls []string
	)
	if ok := sc.Get(url); !ok {
		sc.Set(url)
		body, urls, err = fetcher.Fetch(url)
	} else {
		err = fmt.Errorf("Already fetched: %s", url)
	}
	if err != nil {
		errs <- err
		return
	}
	ress <- fmt.Sprintf("found: %s %q\n", url, body)
	for _, u := range urls {
		go Crawl(u, depth-1, fetcher)
	}
	return
}

func main() {
	go Crawl("http://golang.org/", 4, fetcher)
	for {
		select {
		case res, ok := <-ress:
			fmt.Println(res)
			if !ok {
				break
			}
		case err, ok := <-errs:
			fmt.Println(err)
			if !ok {
				break
			}
		}
	}
}

// fakeFetcher is Fetcher that returns canned results.
type fakeFetcher map[string]*fakeResult

type fakeResult struct {
	body string
	urls []string
}

func (f fakeFetcher) Fetch(url string) (string, []string, error) {
	if res, ok := f[url]; ok {
		return res.body, res.urls, nil
	}
	return "", nil, fmt.Errorf("not found: %s", url)
}

// fetcher is a populated fakeFetcher.
var fetcher = fakeFetcher{
	"http://golang.org/": &fakeResult{
		"The Go Programming Language",
		[]string{
			"http://golang.org/pkg/",
			"http://golang.org/cmd/",
		},
	},
	"http://golang.org/pkg/": &fakeResult{
		"Packages",
		[]string{
			"http://golang.org/",
			"http://golang.org/cmd/",
			"http://golang.org/pkg/fmt/",
			"http://golang.org/pkg/os/",
		},
	},
	"http://golang.org/pkg/fmt/": &fakeResult{
		"Package fmt",
		[]string{
			"http://golang.org/",
			"http://golang.org/pkg/",
		},
	},
	"http://golang.org/pkg/os/": &fakeResult{
		"Package os",
		[]string{
			"http://golang.org/",
			"http://golang.org/pkg/",
		},
	},
}
Answer 1

Score: 2

You can solve this with sync.WaitGroup.

Start listening on your channels in separate goroutines. A WaitGroup will coordinate how many goroutines you have:

- wg.Add(1) says that we are about to start a new goroutine.
- wg.Done() says that a goroutine has finished.
- wg.Wait() blocks until all started goroutines have finished.

These three methods let you coordinate your goroutines.

PS. You might be interested in sync.RWMutex for your SafeCache.