Fastest way to send multiple HTTP requests
Question

I have an array of about 2000 user objects (maps). For each user I need to call an API to get the user detail, process the response, and update my local DB as soon as possible. I used Go's WaitGroup and goroutines to send the requests concurrently, but making 2000 requests takes about 24 seconds on my 2014 MacBook Pro. Is there any way to make it faster?
var wg sync.WaitGroup
json.Unmarshal(responseData, &users)
wg.Add(len(users))
for i := 0; i < len(users); i++ {
    go func(userid string) {
        defer wg.Done() // mark this request finished even if it fails
        url := "https://www.example.com/user_detail/" + userid
        response, err := http.Get(url)
        if err != nil {
            return // don't dereference a nil response on error
        }
        defer response.Body.Close()
        data, _ := ioutil.ReadAll(response.Body)
        _ = data // process the response / update the local DB here
    }(users[i]["userid"])
}
wg.Wait()
Answer 1

Score: 4
This sort of situation is very difficult to address in general. Performance at this level depends very much on the specifics of your server, API, network, etc. But here are a few suggestions to get you going:
- Try limiting the number of concurrent connections.
  As mentioned by @JimB in the comments, trying to handle 2000 concurrent connections is likely inefficient, for both the server and the client. Try limiting to 10, 20, 50, or 100 simultaneous connections; benchmark each value and tweak until you get the best performance.
  On the client side, this also makes it possible to reuse connections (thus reducing the average per-request overhead), which is currently impossible, since you're initiating all 2000 connections before any of them complete. See the first sketch after this list.
- If the server supports HTTP/2, make sure you're using HTTP/2, which can be more efficient with multiple requests (so this really depends on the first point, too). See the documentation about debugging HTTP/2, and the second sketch below.
- If the API supports bulk requests, take advantage of this, and request multiple users in a single request (third sketch below).
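Here is a minimal sketch of the first suggestion, assuming a counting semaphore built from a buffered channel plus a shared http.Client so keep-alive connections can be reused. The URL and the users slice mirror the question; maxConcurrent and the sample data in main are placeholders you would tune by benchmarking:

package main

import (
    "fmt"
    "io"
    "net/http"
    "sync"
)

// fetchAll fetches every user's detail with at most maxConcurrent
// requests in flight at any one time.
func fetchAll(users []map[string]string) {
    const maxConcurrent = 50                  // try 10, 20, 50, 100 and benchmark
    sem := make(chan struct{}, maxConcurrent) // counting semaphore
    client := &http.Client{}                  // shared client reuses keep-alive connections

    var wg sync.WaitGroup
    for _, u := range users {
        wg.Add(1)
        go func(userid string) {
            defer wg.Done()
            sem <- struct{}{}        // acquire a slot
            defer func() { <-sem }() // release the slot
            resp, err := client.Get("https://www.example.com/user_detail/" + userid)
            if err != nil {
                fmt.Println("request failed:", err)
                return
            }
            defer resp.Body.Close()
            data, err := io.ReadAll(resp.Body)
            if err != nil {
                return
            }
            _ = data // process the response / update the local DB here
        }(u["userid"])
    }
    wg.Wait()
}

func main() {
    users := []map[string]string{{"userid": "1"}, {"userid": "2"}}
    fetchAll(users)
}

Launching all 2000 goroutines is cheap; the semaphore only caps how many requests are in flight at once, which is what lets the shared client reuse connections instead of opening a new one per request.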
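For the second suggestion, one simple way to confirm which protocol was actually negotiated is to inspect resp.Proto on a response; Go's client negotiates HTTP/2 automatically over HTTPS when the server supports it, and running the program with GODEBUG=http2debug=1 prints verbose HTTP/2 logs. A small sketch:

package main

import (
    "fmt"
    "log"
    "net/http"
)

func main() {
    resp, err := http.Get("https://www.example.com/")
    if err != nil {
        log.Fatal(err)
    }
    defer resp.Body.Close()
    fmt.Println(resp.Proto) // prints "HTTP/2.0" when HTTP/2 was negotiated
}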
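For the third suggestion, a batch endpoint would let a single round trip replace dozens of individual requests. The sketch below is purely illustrative: the /user_details endpoint and the {"userids": [...]} payload are invented here and are not part of the API in the question.

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "log"
    "net/http"
)

// fetchBatch asks for many users in one round trip. Hypothetical:
// the endpoint and payload shape are assumptions for illustration.
func fetchBatch(ids []string) ([]byte, error) {
    payload, err := json.Marshal(map[string][]string{"userids": ids})
    if err != nil {
        return nil, err
    }
    resp, err := http.Post("https://www.example.com/user_details",
        "application/json", bytes.NewReader(payload))
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()
    return io.ReadAll(resp.Body)
}

func main() {
    data, err := fetchBatch([]string{"1", "2", "3"})
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("got %d bytes\n", len(data))
}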