What is the default mode in GoColly, sync or async?

Question

What is the default mode in which GoColly executes network requests? Since the collector has an Async option, I would assume that the default mode is synchronous.
However, when I execute these 8 requests in the program below, I see no particular difference other than that I need to call Wait in async mode. It seems as if the option only controls how the rest of the program is executed, and the requests themselves are always asynchronous.

package main

import (
	"fmt"

	"github.com/gocolly/colly/v2"
)

func main() {

	urls := []string{
		"http://webcode.me",
		"https://example.com",
		"http://httpbin.org",
		"https://www.perl.org",
		"https://www.php.net",
		"https://www.python.org",
		"https://code.visualstudio.com",
		"https://clojure.org",
	}

	c := colly.NewCollector(
		colly.Async(true),
	)

	c.OnHTML("title", func(e *colly.HTMLElement) {
		fmt.Println(e.Text)
	})

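	for _, url := range urls {
		// Visit can fail (for example on a malformed URL), so check its error.
		if err := c.Visit(url); err != nil {
			fmt.Println("visit failed:", url, err)
		}
	}

	// In async mode, block here until all started requests have finished.
	c.Wait()
}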

Answer 1

Score: 1

By default, collection is synchronous.

The confusing bit is probably the collector option colly.Async(), which ignores its actual param, so colly.Async(false) still turns async mode on. In fact, the implementation at the time of writing is:

func Async(a ...bool) CollectorOption {
	return func(c *Collector) {
		c.Async = true // uh-oh...!
	}
}

Based on this issue, it was done this way for backwards compatibility, so that (I believe) you can pass the option with no param and it'll still work, e.g.:

colly.NewCollector(colly.Async()) // no param, async collection
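
For comparison, here is a minimal sketch of how a variadic option could stay backwards compatible and still honor an explicit false. The Collector, CollectorOption and NewCollector below are hypothetical stand-ins written for this example, not colly's actual code, and I'm not claiming this is how the library does (or should) implement it.

```go
package main

import "fmt"

// Collector is a hypothetical stand-in for colly.Collector,
// reduced to the single field this example needs.
type Collector struct {
	Async bool
}

// CollectorOption mirrors the functional-option shape shown above.
type CollectorOption func(*Collector)

// Async enables async mode when called with no argument (the
// backwards-compatible form) and otherwise honors the first argument.
func Async(a ...bool) CollectorOption {
	return func(c *Collector) {
		c.Async = len(a) == 0 || a[0]
	}
}

// NewCollector applies the options to a collector that is
// synchronous by default (the zero value of Async is false).
func NewCollector(opts ...CollectorOption) *Collector {
	c := &Collector{}
	for _, opt := range opts {
		opt(c)
	}
	return c
}

func main() {
	fmt.Println(NewCollector().Async)             // false
	fmt.Println(NewCollector(Async()).Async)      // true
	fmt.Println(NewCollector(Async(true)).Async)  // true
	fmt.Println(NewCollector(Async(false)).Async) // false
}
```

With the implementation quoted above, the last call would report true as well, which is exactly the surprising part.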

If you remove the async option altogether and instantiate with just colly.NewCollector(), the network requests will be clearly sequential: each c.Visit() call blocks until its request has finished, so you can also remove c.Wait() and the program still won't exit before the requests complete.
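
To make the difference concrete without touching the network, here is a simplified model of the two modes using only the standard library. miniCollector, fetch and crawl are hypothetical names made up for this sketch, and the fixed delay stands in for a real HTTP request; it only illustrates the blocking behavior described above, not colly's internals.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// delay simulates the duration of one network request.
const delay = 50 * time.Millisecond

// miniCollector is a hypothetical, heavily simplified model of a collector.
type miniCollector struct {
	async bool
	wg    sync.WaitGroup
	mu    sync.Mutex
	done  []string
}

// fetch simulates a request and records the URL once it has finished.
func (c *miniCollector) fetch(url string) {
	defer c.wg.Done()
	time.Sleep(delay)
	c.mu.Lock()
	c.done = append(c.done, url)
	c.mu.Unlock()
}

// Visit blocks until the fetch finishes in sync mode and
// returns right after starting it in async mode.
func (c *miniCollector) Visit(url string) {
	c.wg.Add(1)
	if c.async {
		go c.fetch(url)
		return
	}
	c.fetch(url)
}

// Wait blocks until every started fetch has finished.
func (c *miniCollector) Wait() { c.wg.Wait() }

// crawl visits all URLs and reports how long the Visit loop took,
// how long the whole run took, and how many fetches completed.
func crawl(async bool, urls []string) (loop, total time.Duration, n int) {
	c := &miniCollector{async: async}
	start := time.Now()
	for _, u := range urls {
		c.Visit(u)
	}
	loop = time.Since(start)
	c.Wait()
	total = time.Since(start)
	return loop, total, len(c.done)
}

func main() {
	urls := []string{"a", "b", "c", "d"}
	for _, async := range []bool{false, true} {
		loop, total, n := crawl(async, urls)
		fmt.Printf("async=%v loop=%v total=%v fetched=%d\n", async, loop, total, n)
	}
}
```

In sync mode the loop itself takes roughly one delay per URL and Wait has nothing left to wait for. In async mode the loop returns almost immediately, and without Wait the program could exit before any fetch completes.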

huangapple
  • Posted on January 23, 2022, 22:39:26
  • When reposting, please keep the link to this article: https://go.coder-hub.com/70823151.html