Is there a better way to limit requests at the "door"?

Question
Right now I'm testing an extremely simple semaphore in one of my production regions in AWS. On deployment the latency jumped from 150ms to 300ms. I assumed some latency would occur, but if it could be reduced that would be great. This is a bit new to me, so I'm experimenting. I've set the semaphore to allow 10000 connections, which is the same as the maximum number of connections Redis is configured for. Is the code below optimal? If not, can someone help me optimize it, or tell me if I'm doing something wrong? I want to keep this as a piece of middleware so that I can simply call it on the server like this: n.UseHandler(wrappers.DoorMan(wrappers.DefaultHeaders(myRouter), 10000)).
package wrappers

import "net/http"

// DoorMan limits the number of requests handled concurrently to n.
func DoorMan(h http.Handler, n int) http.Handler {
	sema := make(chan struct{}, n)

	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		sema <- struct{}{} // blocks when all n slots are taken
		defer func() { <-sema }()

		h.ServeHTTP(w, r)
	})
}
Answer 1
Score: 2
The solution you outline has some issues. But first, let's take a small step back; there are two questions in this, one of them implied:
- How do you rate limit inbound connections efficiently?
- How do you prevent overloading a backend service with outbound connections?
What it sounds like you want to do is actually the second, to prevent too many requests from hitting Redis. I'll start by addressing the first one and then make some comments on the second.
Rate limiting inbound connections
If you really do want to rate limit inbound connections "at the door", you should normally never do that by waiting inside the handler. With your proposed solution, the service will keep accepting requests, which will queue up at the sema <- struct{}{} statement. If the load persists, it will eventually take down your service, either by running out of sockets, memory, or some other resource. Also note that if your request rate is approaching saturation of the semaphore, you would see an increase in latency caused by goroutines waiting at the semaphore before handling the request.
A better way to do it is to always respond as quickly as possible (especially when under heavy load). This can be done by sending a 503 Service Unavailable back to the client, or to a smart load balancer, telling it to back off.
In your case, it could for example look like something along these lines:
select {
case sema <- struct{}{}:
	defer func() { <-sema }()
	h.ServeHTTP(w, r)
default:
	http.Error(w, "Overloaded", http.StatusServiceUnavailable)
}
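Putting that together with the middleware from the question, a minimal sketch of a non-blocking DoorMan could look like this (same package and signature as above; everything else is just illustration):

package wrappers

import "net/http"

// DoorMan limits concurrency to n; requests beyond that are rejected
// immediately with 503 instead of queueing on the semaphore.
func DoorMan(h http.Handler, n int) http.Handler {
	sema := make(chan struct{}, n)

	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case sema <- struct{}{}:
			// Acquired a slot; release it when the handler returns.
			defer func() { <-sema }()
			h.ServeHTTP(w, r)
		default:
			// All n slots are busy; fail fast.
			http.Error(w, "Overloaded", http.StatusServiceUnavailable)
		}
	})
}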
Rate limiting outbound connections to a backend service
If the reason for the rate limit is to avoid overloading a backend service, what you typically want to do is rather to react to that service being overloaded and apply back pressure through the request chain.
In practical terms, this could mean something as simple as putting the same kind of semaphore logic as above in a wrapper protecting all calls to the backend, and return an error through your call chain of a request if the semaphore overflows.
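As a rough sketch of that idea (the BackendLimiter type, ErrOverloaded, and the redisCall placeholder below are mine, not from the original post):

package wrappers

import "errors"

// ErrOverloaded is returned when all backend slots are in use.
var ErrOverloaded = errors.New("backend overloaded")

// BackendLimiter guards calls to a backend service (e.g. Redis) with a semaphore.
type BackendLimiter struct {
	sema chan struct{}
}

func NewBackendLimiter(n int) *BackendLimiter {
	return &BackendLimiter{sema: make(chan struct{}, n)}
}

// Do runs fn if a slot is free and returns ErrOverloaded otherwise, so the
// caller can propagate the error (or a 503) back up the request chain.
func (l *BackendLimiter) Do(fn func() error) error {
	select {
	case l.sema <- struct{}{}:
		defer func() { <-l.sema }()
		return fn()
	default:
		return ErrOverloaded
	}
}

A handler would then call something like err := limiter.Do(func() error { return redisCall() }) and turn ErrOverloaded into a 503 for its own caller.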
Additionally, if the backend sends status codes like 503 (or equivalent), you should typically propagate that indication downwards in the same way, or resort to some other fallback behaviour for handling the incoming request.
You might also want to consider combining this with a circuit breaker, cutting off attempts to call the backend service quickly if it seems to be unresponsive or down.
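For illustration only, a hand-rolled breaker can be as small as the sketch below (a real service would more likely reach for an existing library; the thresholds are arbitrary):

package wrappers

import (
	"errors"
	"sync"
	"time"
)

// ErrCircuitOpen is returned while the breaker is refusing calls.
var ErrCircuitOpen = errors.New("circuit open")

// Breaker opens after maxFailures consecutive errors and stays open for cooldown.
type Breaker struct {
	mu          sync.Mutex
	failures    int
	openUntil   time.Time
	maxFailures int
	cooldown    time.Duration
}

func NewBreaker(maxFailures int, cooldown time.Duration) *Breaker {
	return &Breaker{maxFailures: maxFailures, cooldown: cooldown}
}

// Call runs fn unless the breaker is currently open.
func (b *Breaker) Call(fn func() error) error {
	b.mu.Lock()
	if time.Now().Before(b.openUntil) {
		b.mu.Unlock()
		return ErrCircuitOpen
	}
	b.mu.Unlock()

	err := fn()

	b.mu.Lock()
	defer b.mu.Unlock()
	if err != nil {
		b.failures++
		if b.failures >= b.maxFailures {
			// Too many consecutive failures: stop calling the backend for a while.
			b.openUntil = time.Now().Add(b.cooldown)
			b.failures = 0
		}
		return err
	}
	b.failures = 0
	return nil
}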
Rate limiting by capping the number of concurrent or queued connections as above is usually a good way to handle overload. When the backend service is overloaded, requests will typically take longer, which will then reduce the effective number of requests per second. However, if, for some reason, you want a fixed limit on the number of requests per second, you could do that with a rate.Limiter instead of a semaphore.
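For instance, a minimal sketch using golang.org/x/time/rate (the rps and burst parameters are placeholders for whatever limit you actually want):

package wrappers

import (
	"net/http"

	"golang.org/x/time/rate"
)

// RateDoorMan rejects requests that exceed a fixed requests-per-second limit.
func RateDoorMan(h http.Handler, rps float64, burst int) http.Handler {
	limiter := rate.NewLimiter(rate.Limit(rps), burst)

	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !limiter.Allow() {
			// Over the configured rate: fail fast rather than queue.
			http.Error(w, "Too Many Requests", http.StatusTooManyRequests)
			return
		}
		h.ServeHTTP(w, r)
	})
}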
A comment on performance
The cost of sending and receiving trivial objects on a channel should be sub-microsecond. Even on a highly congested channel, synchronising with it would not add anywhere near 150 ms of latency. So, assuming the work done in the handler is otherwise the same, wherever your latency increase comes from, it is almost certainly associated with goroutines waiting somewhere (e.g. on I/O, or to get access to synchronised regions that are blocked by other goroutines).
If you are getting incoming requests at a rate close to what can be handled with your set concurrency limit of 10000, or if you are getting spikes of requests, it is possible you would see such an increase in average latency stemming from goroutines in the wait queue on the channel.
Either way, this should be easily measurable; you could for example trace timestamps at certain points in the handling pathway. I would do this on a sample (e.g. 0.1%) of all requests to avoid having the log output affect the performance.
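A rough sketch of such sampled timing as middleware (the 0.1% rate and log format are arbitrary choices, not something from the original answer):

package wrappers

import (
	"log"
	"math/rand"
	"net/http"
	"time"
)

// Timed logs the total handling time for roughly 0.1% of requests.
func Timed(h http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if rand.Float64() >= 0.001 {
			h.ServeHTTP(w, r)
			return
		}
		start := time.Now()
		h.ServeHTTP(w, r)
		log.Printf("sampled request %s took %s", r.URL.Path, time.Since(start))
	})
}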
Answer 2
Score: 1
I'd use a slightly different mechanism for this, probably a worker pool as described here:
https://gobyexample.com/worker-pools
I'd actually say keep 10000 goroutines running (they'll be sleeping, waiting to receive on a blocking channel, so it's not really a waste of resources), and send the request and response to the pool as they come in.
If you want a timeout that responds with an error when the pool is full, you could implement that with a select block as well.