How can I concurrently download files, limited to 3 at the same time
Question
I am currently downloading files sequentially like this:
for _, url := range urlSet.URLs {
	bytes, err := getXML(url.Loc)
}
How can I do this concurrently in golang, but limit it to 3 at a time?
Answer 1
Score: 2
The following code will create 3 workers inside goroutines, each one continually grabbing URLs from the jobs channel, and writing results to the results channel.
In this example, the results are meaningless and ignored, but you might want to keep track of errors for retry later, or maybe something else altogether.
If you don't care about results at all, then feel free to remove the results channel from this code, which would make it slightly more efficient.
package main

import (
	"fmt"
)

// URL, getXML and urlSet are the type and helpers from the question's code.
func worker(id int, jobs <-chan URL, results chan<- int) {
	for j := range jobs {
		bytes, err := getXML(j.Loc)
		if err != nil {
			fmt.Printf("worker %d: error fetching %s: %v\n", id, j.Loc, err)
		}
		_ = bytes      // handle the downloaded data here
		results <- 0   // flag that job is finished
	}
}

func main() {
	//
	// Get URLSET HERE
	//
	numJobs := len(urlSet.URLs)
	jobs := make(chan URL, numJobs)
	results := make(chan int, numJobs)
	for w := 1; w <= 3; w++ { // only 3 workers, all blocked initially
		go worker(w, jobs, results)
	}
	// continually feed in urls to workers
	for _, url := range urlSet.URLs {
		jobs <- url
	}
	close(jobs) // no more urls, so tell workers to stop their loop
	// needed if you want to make sure that workers don't block forever on
	// writing results; remove both this loop and the workers' result writes
	// if you don't need output from the workers
	for a := 1; a <= numJobs; a++ {
		<-results
	}
}
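As a rough sketch of the error-tracking variant mentioned above, the results channel can carry each URL together with its error instead of a throwaway int, so failed downloads can be collected for a later retry pass. The `result` type here is a helper added only for this sketch; URL, getXML and urlSet are still assumed to come from the question's code.

package main

import (
	"fmt"
)

// result pairs a URL with the error (if any) from downloading it.
type result struct {
	url URL
	err error
}

func worker(jobs <-chan URL, results chan<- result) {
	for j := range jobs {
		_, err := getXML(j.Loc) // the downloaded bytes would be handled here
		results <- result{url: j, err: err}
	}
}

func main() {
	//
	// Get URLSET HERE
	//
	numJobs := len(urlSet.URLs)
	jobs := make(chan URL, numJobs)
	results := make(chan result, numJobs)

	for w := 0; w < 3; w++ { // same limit of 3 concurrent downloads
		go worker(jobs, results)
	}
	for _, url := range urlSet.URLs {
		jobs <- url
	}
	close(jobs)

	var failed []URL
	for a := 0; a < numJobs; a++ {
		if r := <-results; r.err != nil {
			fmt.Println("download failed:", r.err)
			failed = append(failed, r.url) // candidates for a retry pass
		}
	}
	fmt.Println(len(failed), "downloads to retry")
}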
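And if you don't need any output from the workers at all, a sync.WaitGroup can replace the results channel entirely. A minimal sketch, again assuming the question's URL, getXML and urlSet:

package main

import (
	"log"
	"sync"
)

func main() {
	//
	// Get URLSET HERE
	//
	jobs := make(chan URL)

	var wg sync.WaitGroup
	for w := 0; w < 3; w++ { // still only 3 concurrent downloads
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobs {
				if _, err := getXML(j.Loc); err != nil {
					log.Printf("failed to fetch %s: %v", j.Loc, err)
				}
			}
		}()
	}

	for _, url := range urlSet.URLs {
		jobs <- url
	}
	close(jobs) // lets each worker's range loop finish

	wg.Wait() // blocks until every download attempt is done
}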