How to start a new goroutine each time a channel is updated

Question

I am making a program that monitors different webpages. Each time a new URL is added to a page, I would like to start a new goroutine to scrape that URL.

I am trying to simulate this as follows:

package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	var Wg sync.WaitGroup
	link := make(chan string)
	startList := []string{}
	go func() {
		for i := 0; i < 20; i++ {
			// should simulate the monitoring of the original web page
			nextLink := fmt.Sprintf("cool-website-%d", i)
			link <- nextLink
		}
	}()

	for i := 0; i < 20; i++ {
		newLink := <-link
		startList = append(startList, newLink)
		Wg.Add(1)
		go simulateScraping(i, startList[i])
		Wg.Done()
	}
	Wg.Wait()
}

func simulateScraping(i int, link string) {
	fmt.Printf("Simulating process %d\n", i)
	fmt.Printf("scraping www.%s.com\n", link)
	time.Sleep(time.Duration(30) * time.Second)
	fmt.Printf("Finished process %d\n", i)
}

This results in the following error: fatal error: all goroutines are asleep - deadlock!

How do I start the simulateScraping function only when newLink is updated or when startList is appended to?

Thanks!

Answer 1

Score: 3

I see several problems with the code.

  1. The WaitGroup is useless as written: Wg.Done is called immediately after the go statement, so Wg.Wait does not wait for simulateScraping to finish, because simulateScraping runs concurrently in its own goroutine.

To fix this, call Wg.Done inside a closure that wraps simulateScraping:

		go func(i int) {
			simulateScraping(i, newLink)
			Wg.Done()
		}(i)

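  2. Instead of a counting loop, I would use a for range loop over the channel. It runs the body as soon as a new value arrives on the channel and exits automatically when the channel is closed. The link is passed to the closure as an argument so that each goroutine gets its own copy (before Go 1.22, the range variable is shared across iterations).

	var i int
	for newLink := range link {
		Wg.Add(1)
		go func(i int, newLink string) {
			defer Wg.Done()
			simulateScraping(i, newLink)
		}(i, newLink)
		i++
	}
	Wg.Wait()
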
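  3. startList := []string{} looks unnecessary. I am not sure how it was supposed to be used.

  4. The channel must be closed by the sender once there is nothing more to send. Otherwise the range loop blocks forever waiting for the next value, and once every other goroutine has finished the runtime reports a deadlock.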

	go func() {
		for i := 0; i < 20; i++ {
			// should simulate the monitoring of the original web page
			nextLink := fmt.Sprintf("cool-website-%d", i)
			link <- nextLink
		}
		close(link) // closing the channel ends the range loop in main
	}()
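
To illustrate point 4 in isolation, here is a minimal sketch of how closing a channel ends a range loop. The helper names produce and collect are hypothetical and not part of the original code; the link count of 3 is arbitrary.

```go
package main

import "fmt"

// produce (hypothetical helper) sends n simulated links and then closes the channel.
func produce(n int) <-chan string {
	link := make(chan string)
	go func() {
		defer close(link) // without this, the range loop in collect never ends
		for i := 0; i < n; i++ {
			link <- fmt.Sprintf("cool-website-%d", i)
		}
	}()
	return link
}

// collect (hypothetical helper) receives values until the channel is closed.
func collect(link <-chan string) []string {
	var got []string
	for v := range link { // exits only after close(link)
		got = append(got, v)
	}
	return got
}

func main() {
	link := produce(3)
	fmt.Println(collect(link)) // [cool-website-0 cool-website-1 cool-website-2]

	// A receive from a closed, drained channel returns the zero value and ok == false.
	_, ok := <-link
	fmt.Println(ok) // false
}
```

If the deferred close is removed, collect blocks on the fourth receive and the program ends with the same "all goroutines are asleep" error.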

The complete code:

package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	var Wg sync.WaitGroup
	link := make(chan string)
	go func() {
		for i := 0; i < 20; i++ {
			// should simulate the monitoring of the original web page
			nextLink := fmt.Sprintf("cool-website-%d", i)
			link <- nextLink
		}
		close(link) // lets the range loop below terminate
	}()

	var i int
	for newLink := range link {
		Wg.Add(1)
		// Pass i and newLink as arguments so each goroutine gets its own copies.
		go func(i int, newLink string) {
			defer Wg.Done()
			simulateScraping(i, newLink)
		}(i, newLink)
		i++
	}
	Wg.Wait() // blocks until every scraping goroutine has called Done
}

func simulateScraping(i int, link string) {
	fmt.Printf("Simulating process %d\n", i)
	fmt.Printf("scraping www.%s.com\n", link)
	time.Sleep(3 * time.Second)
	fmt.Printf("Finished process %d\n", i)
}
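
The same pattern can be packaged as a reusable function, which may be easier to test than code that lives in main. This is only a sketch: scrapeAll and its scrape callback are hypothetical names introduced for illustration, and the callback stands in for simulateScraping.

```go
package main

import (
	"fmt"
	"sync"
)

// scrapeAll (hypothetical helper) starts one goroutine per link received on
// links, waits for all of them, and returns how many links were handled.
func scrapeAll(links <-chan string, scrape func(id int, link string)) int {
	var wg sync.WaitGroup
	n := 0
	for link := range links { // ends when the sender closes links
		wg.Add(1)
		go func(id int, link string) {
			defer wg.Done() // runs even if scrape returns early
			scrape(id, link)
		}(n, link)
		n++
	}
	wg.Wait()
	return n
}

func main() {
	links := make(chan string)
	go func() {
		defer close(links)
		for i := 0; i < 5; i++ {
			links <- fmt.Sprintf("cool-website-%d", i)
		}
	}()

	n := scrapeAll(links, func(id int, link string) {
		// The order of these lines varies from run to run.
		fmt.Printf("process %d scraping www.%s.com\n", id, link)
	})
	fmt.Println("links handled:", n)
}
```

Because scrapeAll returns only after wg.Wait, the caller can rely on every callback having finished by the time it gets the count.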

The talk "Concurrency Patterns In Go" is a good resource on this topic.

huangapple
  • Posted on August 7, 2022, 00:53:57
  • When reposting, please retain the link to this article: https://go.coder-hub.com/73261683.html