Simple method for limiting concurrency in Go


Question

I have a CSV file with ~10k URLs that I need to HTTP GET. What is the simplest way to limit the concurrency of goroutines to no more than 16 at a time?

One simple way to do this is to use a buffered channel as a semaphore:

// imports used below: "encoding/csv", "os", "sync" and "github.com/parnurzeal/gorequest"

func getUrl(url string) {
    request := gorequest.New()
    resp, body, errs := request.Get(url).End() // was request.Get(each[1]); "each" is not in scope here
    _ = resp
    _ = body
    _ = errs
}

func main() {
    csvfile, err := os.Open("urls.csv")
    if err != nil {
        panic(err)
    }
    defer csvfile.Close()

    reader := csv.NewReader(csvfile)
    reader.FieldsPerRecord = -1
    rawCSVdata, err := reader.ReadAll()
    if err != nil {
        panic(err)
    }

    var wg sync.WaitGroup
    concurrency := make(chan struct{}, 16) // buffered channel used as a semaphore with 16 slots

    for _, each := range rawCSVdata {
        concurrency <- struct{}{} // take a slot; this blocks while 16 fetches are already in flight
        wg.Add(1)
        go func(url string) {
            defer wg.Done()
            getUrl(url)
            <-concurrency // release the slot for the next goroutine
        }(each[1])
    }

    wg.Wait() // wait for every fetch to finish instead of polling a shared counter
}

A channel with a buffer of 16 acts as the concurrency limit. Before each goroutine is started, an empty struct is sent into the channel, occupying one slot; the send blocks once all 16 slots are taken. When a goroutine finishes, it receives a value from the channel and frees its slot. This guarantees that no more than 16 goroutines run at the same time.
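
If pulling in an extra dependency is acceptable, newer releases of golang.org/x/sync/errgroup can express the same limit more compactly via SetLimit, which caps how many goroutines the group runs at once. Below is a minimal sketch under that assumption; fetchAll and the example URLs are illustrative, and the standard net/http client stands in for gorequest:

package main

import (
	"log"
	"net/http"

	"golang.org/x/sync/errgroup"
)

// fetchAll GETs every URL with at most 16 requests in flight at a time.
func fetchAll(urls []string) error {
	var g errgroup.Group
	g.SetLimit(16) // g.Go blocks while 16 goroutines are already running

	for _, u := range urls {
		u := u // capture the loop variable (needed before Go 1.22)
		g.Go(func() error {
			resp, err := http.Get(u)
			if err != nil {
				return err
			}
			return resp.Body.Close()
		})
	}
	return g.Wait() // returns the first non-nil error, if any
}

func main() {
	urls := []string{"https://example.com", "https://example.org"} // placeholders
	if err := fetchAll(urls); err != nil {
		log.Fatal(err)
	}
}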

For reference, here is the original code, which simply starts one goroutine per URL with no limit:


func main() {
    csvfile, err := os.Open("urls.csv")
    if err != nil {
        panic(err)
    }
    defer csvfile.Close()

    reader := csv.NewReader(csvfile)
    reader.FieldsPerRecord = -1
    rawCSVdata, err := reader.ReadAll()
    if err != nil {
        panic(err)
    }

    for _, each := range rawCSVdata {
        // ~10k goroutines start at once, and main returns without waiting
        // for any of them, so most requests never get a chance to finish
        go getUrl(each[1])
    }
}

Answer 1

Score: 11

A producer-consumer pattern:

package main

import (
	"encoding/csv"
	"os"
	"sync"

	"github.com/parnurzeal/gorequest"
)

const workersCount = 16 // maximum number of requests in flight at once

// getUrlWorker keeps pulling URLs from urlChan until the channel is closed,
// performing one GET request per URL.
func getUrlWorker(urlChan chan string) {
	for url := range urlChan {
		request := gorequest.New()
		resp, body, errs := request.Get(url).End()
		_ = resp
		_ = body
		_ = errs
	}
}

func main() {
	csvfile, err := os.Open("urls.csv")
	if err != nil {
		panic(err)
	}
	defer csvfile.Close()

	reader := csv.NewReader(csvfile)
	reader.FieldsPerRecord = -1
	rawCSVdata, err := reader.ReadAll()
	if err != nil {
		panic(err)
	}

	var wg sync.WaitGroup
	urlChan := make(chan string) // unbuffered: each send waits for a free worker

	wg.Add(workersCount)

	// start a fixed pool of workersCount worker goroutines
	for i := 0; i < workersCount; i++ {
		go func() {
			getUrlWorker(urlChan)
			wg.Done()
		}()
	}

	// feed the URLs to the pool; each send blocks until a worker is free
	for _, each := range rawCSVdata {
		urlChan <- each[1]
	}
	close(urlChan) // no more work: lets the workers' range loops end

	wg.Wait() // block until every worker has drained the channel and returned
}

This example follows the producer-consumer pattern: the URLs are read from "urls.csv" and fanned out over a channel to 16 worker goroutines, each of which receives URLs and performs the HTTP request. main reads the CSV file, creates the URL channel, starts the workers, feeds them the URLs, closes the channel, and waits for all workers to finish.
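
The workers above discard resp and body, so a natural follow-up is how to get the responses back out of the pool. Below is a minimal sketch of the same worker pattern extended with a results channel; the result struct, fetchWorker, and the sample URLs are illustrative rather than part of the original answer, and the standard net/http package stands in for gorequest:

package main

import (
	"fmt"
	"io"
	"net/http"
	"sync"
)

// result is a hypothetical container for one fetched URL.
type result struct {
	URL  string
	Body []byte
	Err  error
}

// fetchWorker consumes URLs and emits one result per URL.
func fetchWorker(urls <-chan string, results chan<- result) {
	for u := range urls {
		resp, err := http.Get(u)
		if err != nil {
			results <- result{URL: u, Err: err}
			continue
		}
		body, err := io.ReadAll(resp.Body)
		resp.Body.Close()
		results <- result{URL: u, Body: body, Err: err}
	}
}

func main() {
	urls := []string{"https://example.com", "https://example.org"} // placeholders

	urlChan := make(chan string)
	results := make(chan result)

	var wg sync.WaitGroup
	for i := 0; i < 16; i++ { // same fixed pool of 16 workers
		wg.Add(1)
		go func() {
			defer wg.Done()
			fetchWorker(urlChan, results)
		}()
	}

	go func() {
		wg.Wait()
		close(results) // close results only after every worker has returned
	}()

	go func() {
		for _, u := range urls {
			urlChan <- u // feed the pool while main drains results below
		}
		close(urlChan)
	}()

	for r := range results {
		if r.Err != nil {
			fmt.Println(r.URL, "failed:", r.Err)
			continue
		}
		fmt.Println(r.URL, len(r.Body), "bytes")
	}
}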

