What is the default mode in GoColly, sync or async?
Question
What is the default mode in which network requests are executed in GoColly? Since the collector has an Async option, I would assume that the default mode is synchronous.
However, when I execute these 8 requests in the program, I see no particular difference other than that I need to call Wait in async mode. It seems as if the option only controls how the rest of the program is executed, and the requests are always asynchronous.
package main

import (
	"fmt"

	"github.com/gocolly/colly/v2"
)

func main() {
	urls := []string{
		"http://webcode.me",
		"https://example.com",
		"http://httpbin.org",
		"https://www.perl.org",
		"https://www.php.net",
		"https://www.python.org",
		"https://code.visualstudio.com",
		"https://clojure.org",
	}

	// Create a collector with asynchronous requests turned on.
	c := colly.NewCollector(
		colly.Async(true),
	)

	// Print the text of each page's <title> element.
	c.OnHTML("title", func(e *colly.HTMLElement) {
		fmt.Println(e.Text)
	})

	for _, url := range urls {
		c.Visit(url)
	}

	// Block until all pending requests have finished.
	c.Wait()
}
Answer 1
Score: 1
By default, collection is synchronous.
The confusing bit is probably the collector option colly.Async(), which ignores the argument you pass to it. In fact, the implementation at the time of writing is:
func Async(a ...bool) CollectorOption {
	return func(c *Collector) {
		c.Async = true // uh-oh...!
	}
}
Based on this issue, it was done this way for backwards compatibility, so that (I believe) you can pass the option with no param and it'll still work, e.g.:
colly.NewCollector(colly.Async()) // no param, async collection
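To see the effect of that implementation in isolation, here is a minimal sketch that does not import colly. The Collector type and NewCollector function below are hypothetical stand-ins with only the one field needed for the demonstration. Only the body of Async is copied from the snippet quoted above.

```go
package main

import "fmt"

// Collector is a hypothetical stand-in for colly.Collector,
// reduced to the single field this example needs.
type Collector struct {
	Async bool
}

// CollectorOption has the same shape as the option type quoted above.
type CollectorOption func(*Collector)

// Async reproduces the quoted behaviour: the variadic argument
// is accepted but never read.
func Async(a ...bool) CollectorOption {
	return func(c *Collector) {
		c.Async = true
	}
}

// NewCollector (hypothetical) applies each option to a zero-valued
// Collector, so Async starts out false.
func NewCollector(opts ...CollectorOption) *Collector {
	c := &Collector{}
	for _, opt := range opts {
		opt(c)
	}
	return c
}

func main() {
	fmt.Println(NewCollector().Async)             // false
	fmt.Println(NewCollector(Async()).Async)      // true
	fmt.Println(NewCollector(Async(true)).Async)  // true
	fmt.Println(NewCollector(Async(false)).Async) // true
}
```

Under this implementation, passing false does not switch async mode off. The only way to stay synchronous is to leave the option out.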
If you remove the async option altogether and instantiate with just colly.NewCollector(), the network requests will be clearly sequential. You can then also remove c.Wait(), and the program won't exit right away.
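The difference between the two modes can be modelled without any network access. The sketch below is a rough illustration, not colly's code: miniCollector, fetch and finished are hypothetical names, and a short sleep stands in for the HTTP request. In sync mode Visit returns only after the request has finished. In async mode Visit returns immediately, and Wait is what keeps the program alive until the requests are done.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// miniCollector is a hypothetical model of the two modes.
type miniCollector struct {
	async bool
	wg    sync.WaitGroup
	mu    sync.Mutex
	done  []string // URLs in the order their simulated requests finished
}

// fetch stands in for a network request; the sleep simulates latency.
func (c *miniCollector) fetch(url string) {
	time.Sleep(10 * time.Millisecond)
	c.mu.Lock()
	c.done = append(c.done, url)
	c.mu.Unlock()
}

// Visit blocks until the request finishes in sync mode
// and returns immediately in async mode.
func (c *miniCollector) Visit(url string) {
	if !c.async {
		c.fetch(url)
		return
	}
	c.wg.Add(1)
	go func() {
		defer c.wg.Done()
		c.fetch(url)
	}()
}

// Wait blocks until all in-flight async requests have finished.
func (c *miniCollector) Wait() { c.wg.Wait() }

// finished reports how many requests have completed so far.
func (c *miniCollector) finished() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return len(c.done)
}

func main() {
	urls := []string{"a", "b", "c"}

	syncC := &miniCollector{}
	for _, u := range urls {
		syncC.Visit(u)
	}
	fmt.Println("sync, finished without Wait:", syncC.finished())

	asyncC := &miniCollector{async: true}
	for _, u := range urls {
		asyncC.Visit(u)
	}
	fmt.Println("async, finished before Wait:", asyncC.finished())
	asyncC.Wait()
	fmt.Println("async, finished after Wait:", asyncC.finished())
}
```

In the sync case all three requests are complete as soon as the loop ends. In the async case the count before Wait depends on timing, which is why a program that skips Wait may exit before any output appears.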