How to start a new goroutine each time a channel is updated
Question
I am making a program that monitors different webpages. Each time a new URL is added to a page, I would like to start a new goroutine to scrape that URL.
I am trying to simulate it like this:
    package main

    import (
        "fmt"
        "sync"
        "time"
    )

    func main() {
        var Wg sync.WaitGroup
        link := make(chan string)
        startList := []string{}
        go func() {
            for i := 0; i < 20; i++ {
                // should simulate the monitoring of the original web page
                nextLink := fmt.Sprintf("cool-website-%d", i)
                link <- nextLink
            }
        }()
        for i := 0; i < 20; i++ {
            newLink := <-link
            startList = append(startList, newLink)
            Wg.Add(1)
            go simulateScraping(i, startList[i])
            Wg.Done()
        }
        Wg.Wait()
    }

    func simulateScraping(i int, link string) {
        fmt.Printf("Simulating process %d\n", i)
        fmt.Printf("scraping www.%s.com\n", link)
        time.Sleep(time.Duration(30) * time.Second)
        fmt.Printf("Finished process %d\n", i)
    }
This results in the following error:
    fatal error: all goroutines are asleep - deadlock!
How do I start the simulateScraping function only when newLink is updated or startList is appended to?
Thanks!
Answer 1
Score: 3
I see several problems with the code.
- The WaitGroup has no effect as written, because Wg.Done is called right after the goroutine is started rather than when simulateScraping finishes. Since simulateScraping runs concurrently, Wg.Wait returns without waiting for it.
  To fix this, wrap the call in a closure that calls Wg.Done once simulateScraping returns:
    go func(i int) {
        simulateScraping(i, newLink)
        Wg.Done()
    }(i)
- Instead of a counting loop, I would use a for-range loop over the channel. It runs the body as soon as a new value arrives on the channel and exits automatically once the channel is closed.
    var i int
    for newLink := range link {
        Wg.Add(1)
        // Pass newLink as an argument: before Go 1.22 the range variable is shared across iterations.
        go func(i int, newLink string) {
            simulateScraping(i, newLink)
            Wg.Done()
        }(i, newLink)
        i++
    }
    Wg.Wait()
- startList := []string{} looks unused. I am not sure how it was supposed to be used.
- The channel must be closed by the sender; otherwise the range loop never ends.
    go func() {
        for i := 0; i < 20; i++ {
            // should simulate the monitoring of the original web page
            nextLink := fmt.Sprintf("cool-website-%d", i)
            link <- nextLink
        }
        close(link) // closing the channel ends the range loop in main
    }()
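The last two points can be seen in isolation in the following minimal sketch. The helper names produce and collect are hypothetical and only serve the illustration: the sender closes the channel when it is done, and the range loop on the receiving side ends once the channel is closed and drained.

```go
package main

import "fmt"

// produce is a hypothetical helper: it sends n links on a new channel
// from a separate goroutine and closes the channel afterwards.
func produce(n int) <-chan string {
	ch := make(chan string)
	go func() {
		defer close(ch) // without this, the range loop in collect would block forever
		for i := 0; i < n; i++ {
			ch <- fmt.Sprintf("cool-website-%d", i)
		}
	}()
	return ch
}

// collect is a hypothetical helper: it receives from ch until the
// channel is closed and returns everything it received, in order.
func collect(ch <-chan string) []string {
	var got []string
	for v := range ch { // exits once ch is closed and drained
		got = append(got, v)
	}
	return got
}

func main() {
	links := collect(produce(3))
	fmt.Println(len(links), links[0], links[2]) // prints "3 cool-website-0 cool-website-2"
}
```

If the deferred close is removed, the receiver stays blocked after the last value and, with no other goroutine left to run, the runtime reports the "all goroutines are asleep" deadlock.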
The complete code:
    package main

    import (
        "fmt"
        "sync"
        "time"
    )

    func main() {
        var Wg sync.WaitGroup
        link := make(chan string)
        go func() {
            for i := 0; i < 20; i++ {
                // should simulate the monitoring of the original web page
                nextLink := fmt.Sprintf("cool-website-%d", i)
                link <- nextLink
            }
            close(link) // lets the range loop below terminate
        }()
        var i int
        for newLink := range link {
            Wg.Add(1)
            // Pass newLink as an argument: before Go 1.22 the range variable is shared across iterations.
            go func(i int, newLink string) {
                simulateScraping(i, newLink)
                Wg.Done()
            }(i, newLink)
            i++
        }
        Wg.Wait() // blocks until every scraping goroutine has called Done
    }

    func simulateScraping(i int, link string) {
        fmt.Printf("Simulating process %d\n", i)
        fmt.Printf("scraping www.%s.com\n", link)
        time.Sleep(3 * time.Second)
        fmt.Printf("Finished process %d\n", i)
    }
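To try the same pattern without waiting on the sleeps, it can be factored into a function that takes the scraping step as a parameter. This is only a sketch under that assumption; scrapeAll and its callback signature are hypothetical names and not part of the original code.

```go
package main

import (
	"fmt"
	"sync"
)

// scrapeAll starts one goroutine per link received from links, waits
// for all of them to finish, and returns how many were started.
// The caller must close links, or the range loop never ends.
func scrapeAll(links <-chan string, scrape func(id int, link string)) int {
	var wg sync.WaitGroup
	id := 0
	for link := range links {
		wg.Add(1)
		go func(id int, link string) {
			defer wg.Done() // runs when scrape returns, even if it panics
			scrape(id, link)
		}(id, link)
		id++
	}
	wg.Wait()
	return id
}

func main() {
	links := make(chan string)
	go func() {
		defer close(links)
		for i := 0; i < 5; i++ {
			links <- fmt.Sprintf("cool-website-%d", i)
		}
	}()

	var mu sync.Mutex // guards seen, which is written from several goroutines
	seen := map[string]bool{}
	n := scrapeAll(links, func(id int, link string) {
		mu.Lock()
		defer mu.Unlock()
		seen[link] = true
	})
	fmt.Println(n, len(seen)) // prints "5 5"
}
```

Calling wg.Add before the go statement, and wg.Done inside the goroutine, is what makes wg.Wait cover every scraping goroutine.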
Here is a good talk about "Concurrency Patterns In Go"