Goroutines and channels in Go
Question
I'm trying to understand a Go code example that models multiple readers and writers.
The example computes the size of one or more web pages.
Code version 1:
package main

import (
	"fmt"
	"io/ioutil"
	"net/http"
)

func main() {
	urls := []string{"http://google.com", "http://yahoo.com", "http://reddit.com"}
	sizeCh := make(chan string)
	urlCh := make(chan string)
	for i := 0; i < 3; i++ { // later we change i < 3 to i < 2
		go worker(urlCh, sizeCh, i)
	}
	for _, u := range urls {
		urlCh <- u // later: go generator(u, urlCh)
	}
	for i := 0; i < len(urls); i++ {
		fmt.Println(<-sizeCh)
	}
}

// worker reads URLs from urlCh forever and reports one result per URL on sizeCh.
func worker(urlCh chan string, sizeCh chan string, id int) {
	for {
		url := <-urlCh
		length, err := getPage(url)
		if err == nil {
			sizeCh <- fmt.Sprintf("%s has length %d. worker %d", url, length, id)
		} else {
			sizeCh <- fmt.Sprintf("Error getting %s: %s. worker %d", url, err, id)
		}
	}
}

// getPage fetches url and returns the size of the response body in bytes.
func getPage(url string) (int, error) {
	resp, err := http.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	body, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		return 0, err
	}
	return len(body), nil
}
The result:
http://reddit.com has length 110937. worker 0
http://google.com has length 18719. worker 2
http://yahoo.com has length 326987. worker 1
But after changing for i := 0; i < 3; i++ (line 15, the loop that starts the workers) to for i := 0; i < 2; i++, so that there are fewer workers (2) than URLs (3), we get no result: the program waits forever.
In version 2, we add a helper function to version 1:
func generator(url string, urlCh chan string) {
	urlCh <- url
}
and change lines 19-21 (the loop that sends the URLs) to:
for _, u := range urls {
	go generator(u, urlCh)
}
It works fine even with i < 2:
http://google.com has length 18701. worker 1
http://reddit.com has length 112469. worker 0
http://yahoo.com has length 325752. worker 1
Why does version 1 fail with i < 2 (i.e. with fewer workers than URLs) while version 2 does not?
Answer 1
Score: 2
In your program, you have the following loop iterating over the 3 URLs:
for _, u := range urls {
	urlCh <- u // later: go generator(u, urlCh)
}
Since urlCh is unbuffered, the send operation in the loop body will not complete until another goroutine performs a corresponding receive.
When you had 3 worker goroutines, this was no problem. When you reduced the number to two, at least one worker has to progress far enough to receive a second value from urlCh.
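The blocking behaviour of an unbuffered send can be observed in isolation. The following is a minimal sketch, not part of the original program: trySend is a hypothetical helper that uses select with a default case to attempt a send without blocking, so it reports whether a receiver (or a free buffer slot) was available at that moment.

```go
package main

import "fmt"

// trySend is a hypothetical helper: it attempts a send without blocking
// and reports whether the send could proceed immediately.
func trySend(ch chan string, v string) bool {
	select {
	case ch <- v:
		return true
	default:
		return false
	}
}

func main() {
	unbuffered := make(chan string)
	buffered := make(chan string, 1)

	// No goroutine is receiving, so an unbuffered send cannot proceed.
	fmt.Println(trySend(unbuffered, "a")) // false
	// A buffered channel accepts a value while it has a free slot...
	fmt.Println(trySend(buffered, "a")) // true
	// ...and refuses once the buffer is full.
	fmt.Println(trySend(buffered, "b")) // false
}
```

In the program from the question there is no default case, so urlCh <- u simply waits until a worker reaches its receive.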
Now if we look at the body of worker, we can see the problem:
for {
	url := <-urlCh
	length, err := getPage(url)
	if err == nil {
		sizeCh <- fmt.Sprintf("%s has length %d. worker %d", url, length, id)
	} else {
		sizeCh <- fmt.Sprintf("Error getting %s: %s. worker %d", url, err, id)
	}
}
This loop can't complete an iteration until it successfully sends a value on sizeCh. And since this channel is also unbuffered, that won't happen until another goroutine is ready to receive a value from that channel.
Unfortunately, the only goroutine that will do that is main, which only does so once it has finished sending values to urlCh. With two workers, both end up blocked sending on sizeCh after taking the first two URLs, while main is blocked sending the third URL. Thus we have a deadlock.
Moving the sends to urlCh to separate goroutines fixes the problem because main can progress to the point where it is reading from sizeCh, even though not all values have been sent to urlCh.
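An offline sketch of that fix, under the same kind of assumptions: len(url) replaces the HTTP fetch, collect is a hypothetical helper, and the results are sorted only to make the output order stable. Each URL is sent from its own goroutine, as generator does in version 2, so main reaches the receive loop right away.

```go
package main

import (
	"fmt"
	"sort"
)

// worker reports one result per URL; len(url) stands in for the page
// size (an assumption for this sketch).
func worker(urlCh <-chan string, sizeCh chan<- string) {
	for url := range urlCh {
		sizeCh <- fmt.Sprintf("%s has length %d", url, len(url))
	}
}

// collect is a hypothetical helper: it starts the workers, sends every
// URL from a separate goroutine, and gathers one result per URL.
func collect(urls []string, workers int) []string {
	urlCh := make(chan string)
	sizeCh := make(chan string)
	for i := 0; i < workers; i++ {
		go worker(urlCh, sizeCh)
	}
	for _, u := range urls {
		// The send may block, but it blocks this goroutine, not main.
		go func(u string) { urlCh <- u }(u)
	}
	results := make([]string, 0, len(urls))
	for range urls {
		results = append(results, <-sizeCh)
	}
	sort.Strings(results) // arrival order varies between runs
	return results
}

func main() {
	urls := []string{"http://google.com", "http://yahoo.com", "http://reddit.com"}
	for _, r := range collect(urls, 2) {
		fmt.Println(r)
	}
}
```

Because main is already receiving from sizeCh, each worker can deliver its result and return to urlCh for the next URL, so two workers are enough for three URLs.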