Simple method for limiting concurrency in Go
Question
I have a CSV file with ~10k URLs I need to HTTP GET. What is the simplest way to limit the concurrency of goroutines to no more than 16 at a time?
func getUrl(url string) {
	request := gorequest.New()
	resp, body, errs := request.Get(url).End()
	_ = resp
	_ = body
	_ = errs
}

func main() {
	csvfile, err := os.Open("urls.csv")
	if err != nil {
		panic(err)
	}
	defer csvfile.Close()
	reader := csv.NewReader(csvfile)
	reader.FieldsPerRecord = -1
	rawCSVdata, err := reader.ReadAll()
	if err != nil {
		panic(err)
	}
	completed := 0
	for _, each := range rawCSVdata {
		go getUrl(each[1])
		completed++
	}
}
Answer 1
Score: 11
A producer-consumer pattern:
package main

import (
	"encoding/csv"
	"os"
	"sync"

	"github.com/parnurzeal/gorequest"
)

const workersCount = 16

// getUrlWorker drains URLs from the channel until it is closed.
func getUrlWorker(urlChan chan string) {
	for url := range urlChan {
		request := gorequest.New()
		resp, body, errs := request.Get(url).End()
		_ = resp
		_ = body
		_ = errs
	}
}

func main() {
	csvfile, err := os.Open("urls.csv")
	if err != nil {
		panic(err)
	}
	defer csvfile.Close()

	reader := csv.NewReader(csvfile)
	reader.FieldsPerRecord = -1
	rawCSVdata, err := reader.ReadAll()
	if err != nil {
		panic(err)
	}

	var wg sync.WaitGroup
	urlChan := make(chan string)

	wg.Add(workersCount)
	for i := 0; i < workersCount; i++ {
		go func() {
			defer wg.Done()
			getUrlWorker(urlChan)
		}()
	}

	for _, each := range rawCSVdata {
		urlChan <- each[1]
	}
	close(urlChan)
	wg.Wait()
}
This example uses the producer-consumer pattern. It reads the URLs from "urls.csv" and issues the HTTP requests from 16 concurrent workers. Each worker pulls URLs off an unbuffered channel and performs the GET; the main function reads the CSV, starts the workers, feeds the channel, and then closes it and waits for all workers to finish. Because only 16 workers ever exist, at most 16 requests are in flight at once.
Comments