Serve response of an HTTP request after receiving another request

Question

My use case is to serve the response to an HTTP request only after receiving another request from a separate server.

  1. I want to do this in the best possible way, keeping scaling in mind.
  2. We are using Golang 1.19 with the Gin framework.
  3. The server will have multiple pods, so channels will not work.
  4. All requests have timeouts; the initial request times out after 60 seconds.

My current solution is to use a shared cache that each pod keeps checking. I believe I can optimize this with channels, so that rather than checking the cache one key at a time, the system periodically checks for any completed response.

I would also like to know how this could be achieved in other programming languages.

PS: This is a design-based question. I have some reputation here to offer as a bounty, hence asking here. Please feel free to edit if the question is not clear.

Answer 1

Score: 2

  • client fires an HTTP call to server A.
  • server A creates a unique key & stores it in the cache server for 60s (the timeout period).
  • server A forwards the request to B via an HTTP call. It is a fire & forget call to B.
  • server A immediately returns a response to the client with the unique key.
  • client starts polling (let's say every 500ms) using a GET status HTTP API to check if processing is done at server A.
  • meanwhile, server B completes the task & calls back to server A via an HTTP API.
  • server A stores the response in cache against the unique key for a short period (let's say 60s).
  • the client's status check call to A gets the data from the cache & returns it to the client (a sketch of these endpoints follows this list).
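To make the flow concrete, here is a minimal Gin sketch of server A's three endpoints. The routes, the forwardToB helper, and the in-memory stand-in for the shared cache are all assumptions made for illustration; production would use something like Redis behind the same interface:

    package main

    import (
        "net/http"
        "sync"
        "time"

        "github.com/gin-gonic/gin"
        "github.com/google/uuid"
    )

    // Cache is a stand-in for the shared cache server (e.g. Redis).
    // The interface and the in-memory implementation below are
    // assumptions made for this sketch.
    type Cache interface {
        Set(key string, value []byte, ttl time.Duration)
        Get(key string) ([]byte, bool)
    }

    type memCache struct {
        mu sync.Mutex
        m  map[string][]byte
    }

    func (c *memCache) Set(key string, value []byte, ttl time.Duration) {
        c.mu.Lock()
        c.m[key] = value
        c.mu.Unlock()
        time.AfterFunc(ttl, func() { // crude TTL, good enough for a sketch
            c.mu.Lock()
            delete(c.m, key)
            c.mu.Unlock()
        })
    }

    func (c *memCache) Get(key string) ([]byte, bool) {
        c.mu.Lock()
        defer c.mu.Unlock()
        v, ok := c.m[key]
        return v, ok
    }

    // forwardToB is hypothetical: it would POST the body to server B
    // along with the key B must use in its callback.
    func forwardToB(key string, body []byte) {}

    func main() {
        var cache Cache = &memCache{m: map[string][]byte{}}
        r := gin.Default()

        // 1. Client submits the request; A fires-and-forgets to B and
        //    returns the unique key immediately.
        r.POST("/requests", func(c *gin.Context) {
            key := uuid.NewString()
            body, _ := c.GetRawData()
            cache.Set(key, nil, 60*time.Second) // reserve the key for the timeout period
            go forwardToB(key, body)            // fire & forget
            c.JSON(http.StatusAccepted, gin.H{"key": key})
        })

        // 2. Client polls this endpoint (let's say every 500ms).
        r.GET("/requests/:key", func(c *gin.Context) {
            result, ok := cache.Get(c.Param("key"))
            if !ok || result == nil {
                c.Status(http.StatusNoContent) // not ready yet (or expired)
                return
            }
            c.Data(http.StatusOK, "application/json", result)
        })

        // 3. Server B calls back here when processing is done.
        r.POST("/callbacks/:key", func(c *gin.Context) {
            result, _ := c.GetRawData()
            cache.Set(c.Param("key"), result, 60*time.Second)
            c.Status(http.StatusOK)
        })

        r.Run(":8080")
    }

The 204 status doubles as the "not ready" signal, so the polling client can keep retrying without parsing a body.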

There is no other component except server A & server B. There is a cache server, but it is internal to A; B doesn't need to know about it. Which service maintains what is clearly defined, so the system is easy to maintain. B can have an internal queue to process the forwarded request, but A doesn't need to know how B is implemented. Each service can be maintained by a different team using a simple HTTP contract.

Another advantage is that the client is not holding a long-running HTTP connection to A. I mentioned polling with the most primitive clients in mind, like an old browser. If your client supports web sockets, you can use those to send the response back from A. Either way, polling performs well here because the status check API is just a cache call & nothing else.

Someone can ask: where is the retry logic for the server-to-server communication between A & B? We don't need a queue for that. Essentially, the client wants a synchronous type of call here; we are just breaking it down into multiple calls, and the response needs to be quick anyway. We can have a retry mechanism of 3-5 retries in the HTTP client in case of network failure (a sketch follows). We are using pods, and I expect them to sit behind a k8s load balancer, which will retry against a healthy pod if the first pod goes down. So you are pretty much covered there.
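For the retry piece, a minimal sketch with the standard net/http client and a fixed backoff (the retry count, backoff, and 5xx policy are illustrative, not prescriptive):

    import (
        "bytes"
        "fmt"
        "net/http"
        "time"
    )

    // postWithRetry retries a POST up to maxRetries extra times on
    // network errors or 5xx responses, sleeping between attempts.
    func postWithRetry(client *http.Client, url string, body []byte, maxRetries int) (*http.Response, error) {
        var lastErr error
        for attempt := 0; attempt <= maxRetries; attempt++ {
            resp, err := client.Post(url, "application/json", bytes.NewReader(body))
            if err == nil && resp.StatusCode < 500 {
                return resp, nil // success (or a non-retryable 4xx)
            }
            if err != nil {
                lastErr = err
            } else {
                resp.Body.Close() // drain the failed attempt before retrying
                lastErr = fmt.Errorf("server returned %s", resp.Status)
            }
            time.Sleep(500 * time.Millisecond) // fixed backoff; exponential also works
        }
        return nil, fmt.Errorf("all retries failed: %w", lastErr)
    }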

I don't know your exact requirements & I am writing this after a quick thought, but overall this looks fine to me. It would be robust & have low-latency calls. Maybe a few tweaks would be needed here & there.

Update based on comment:

Implementation of client polling is pretty basic: it is a loop that runs every 0.5s & makes an HTTP GET call to check the status (sketched below). I believe any client should be able to implement that, but there might be restrictions where the client team doesn't want to make any changes; I don't want to get into that.
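In Go, that client loop could look like the sketch below. It assumes the status endpoint from the earlier sketch (204 while pending, 200 with the body when done); the base URL and paths are hypothetical:

    import (
        "context"
        "io"
        "net/http"
        "time"
    )

    // pollStatus polls GET {base}/requests/{key} every 500ms until the
    // result is ready or the 60s deadline expires.
    func pollStatus(ctx context.Context, base, key string) ([]byte, error) {
        ctx, cancel := context.WithTimeout(ctx, 60*time.Second)
        defer cancel()
        ticker := time.NewTicker(500 * time.Millisecond)
        defer ticker.Stop()
        for {
            select {
            case <-ctx.Done():
                return nil, ctx.Err() // gave up waiting for the result
            case <-ticker.C:
                req, err := http.NewRequestWithContext(ctx, http.MethodGet, base+"/requests/"+key, nil)
                if err != nil {
                    return nil, err
                }
                resp, err := http.DefaultClient.Do(req)
                if err != nil {
                    continue // transient network error; try again next tick
                }
                if resp.StatusCode == http.StatusOK {
                    data, err := io.ReadAll(resp.Body)
                    resp.Body.Close()
                    return data, err
                }
                resp.Body.Close() // 204: not ready yet
            }
        }
    }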

Let's say some clients don't have polling capabilities & some clients don't have web socket support. Then the only thing remaining is a long-lived HTTP connection between the client & server A. Suppose we take this approach. Leaving aside the obvious problems of holding a long-running request thread in the server, there is another issue specific to this scenario: the request thread in server A needs to be notified when server B makes the callback.

Approach 1: We use some kind of queue, as mentioned in the other answer (I would keep the queue internal to A, but let's not get into that), and we use some message key so that consumption happens in the same pod. Having the API & the consumer both running in the same pod is achievable. But the request thread & the consumer thread are different: the consumer thread somehow needs to notify the request thread that the result is available. So some inter-thread communication is required, which only makes things complex, and it is not the responsibility of a queue. So I would discard this approach.

Someone can ask: can the request thread directly listen for a message with a specific key? I don't think so. For the argument's sake, there are some queue technologies where you can do that, but you would essentially be creating an ephemeral consumer & a separate partition for each key, which won't scale. Listening for all messages & ignoring all but one in the ephemeral consumer is not a viable solution either.

Approach 2: From approach 1, we understood that the request thread needs to be notified when server B sends the callback. That would make the flow a whole lot more complex, & you would need additional components like ZooKeeper to do distributed locking or watch for changes. Instead of doing that, we can simply extend our current system to do server-side polling. It shouldn't make any noticeable difference in response time, as we are already polling on the client side; we are just moving it to the server side. Also, we would have to maintain a long-running request thread in server A whether we built distributed notification or server polling. The flow would look like:

  • Client calls server A.
  • Server A's request thread makes an HTTP call to server B.
  • Server B starts background processing & immediately returns an HTTP response to A.
  • The request thread in A starts a loop, checking the cache server every 500ms for the key (sketched after this list).
  • B processes the result & makes an HTTP callback to A.
  • The callback thread in A caches the result against the same key.
  • The original A request thread reads the value from the cache & returns the response to the client.
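A sketch of that server-side wait, reusing the hypothetical Cache interface from the first sketch; the request handler blocks on this loop until the callback from B fills the key or the 60-second budget expires:

    // waitForResult probes the shared cache every 500ms until server B's
    // callback has stored a result under key, or ctx expires.
    func waitForResult(ctx context.Context, cache Cache, key string) ([]byte, error) {
        ticker := time.NewTicker(500 * time.Millisecond)
        defer ticker.Stop()
        for {
            select {
            case <-ctx.Done():
                return nil, ctx.Err() // 60s budget exhausted
            case <-ticker.C:
                if v, ok := cache.Get(key); ok && v != nil {
                    return v, nil
                }
            }
        }
    }

    // Inside the Gin handler, the request thread would simply do:
    //
    //   ctx, cancel := context.WithTimeout(c.Request.Context(), 60*time.Second)
    //   defer cancel()
    //   result, err := waitForResult(ctx, cache, key)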

The original request thread sleeps for 500ms before making the next cache call, so other threads can utilize that time. Also, a cache get by key is super fast & you won't have any scalability issue there.

But you will exhaust your request threads quicker if you maintain long-running HTTP connections from clients, so you will need more A pods to handle the same request rate. I would still recommend talking to the client team & making the necessary changes there so that short-lived connections (same as your current flow) are used.

Answer 2

Score: 1

tl;dr

Problem description

So assuming your server application, called server_app for instance, has 3 pods:

     +---------------------+
     |  server_app_service |
     +---------------------+
     |  server_app_pod_a   |
     |  server_app_pod_b   |
     |  server_app_pod_c   |
     +---------------------+

your service receives a request called "request A" and decides to pass it to server_app_pod_a. Now your server_app_pod_a forwards the request to some gateway and waits for some sort of notification to continue processing the client's response. As you already know, there is no assurance that when the gateway makes request B, the service passes it to server_app_pod_a again. And even if it did, your application's state management would become a difficult task.

Messaging

As you might've noticed, I bolded the word "notification" in the previous paragraph. That's because if you really think about it, request "B" looks more like a notification carrying some message rather than a request for a resource. So my number 1 choice would be a message queue like Kafka (there are plenty of those, again, as you know). The idea is: if you can define an algorithm to calculate a unique key for each request, you can expect the resulting notification in the exact same pod. This way, state management becomes much simpler, and the chance of getting the notification in the same pod becomes much higher (this depends on many factors, of course, like the state of the message queue). Taking a look at your questions:

> 1. I want to do this in the best possible way, keeping scaling in mind.

Sure thing: you can use message queues like Kafka to achieve scaling and less data loss, both for the message queue and your application.

> 4. All requests have timeouts; the initial request times out after 60 seconds.

This one depends on how you manage timeouts in your codebase; using contexts would be a good idea.
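As a sketch of that, using the standard context package (the 60s figure comes from the question; forwardAndWait is a hypothetical downstream call that must honor ctx):

    import (
        "context"
        "errors"
        "net/http"
        "time"
    )

    func handle(w http.ResponseWriter, r *http.Request) {
        // Budget the whole request at 60s; every downstream call
        // (gateway HTTP call, queue wait) inherits the deadline.
        ctx, cancel := context.WithTimeout(r.Context(), 60*time.Second)
        defer cancel()

        result, err := forwardAndWait(ctx, r) // hypothetical downstream call
        if errors.Is(err, context.DeadlineExceeded) {
            http.Error(w, "timed out", http.StatusGatewayTimeout)
            return
        }
        if err != nil {
            http.Error(w, err.Error(), http.StatusInternalServerError)
            return
        }
        w.Write(result)
    }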

> I would also like to know how it could have been achieved in other programming languages.

Using message queues is a general idea applicable to almost any programming language, but depending on a language's programming paradigms and its specific libraries and tools, there might be other approaches to this problem. For instance, in Scala, if you use a specific tool called akka (which provides the actor model programming paradigm), you can use something called akka-cluster-sharding to handle this problem. The idea is pretty simple: we know that there must be some sort of supervisor which knows the exact location and state of its own subscribers. So when it receives a message, it knows exactly where and to which actor (we're talking about actor model programming) to forward the request. In other words, it can be used to share state between actors spawned on a cluster, whether they are on the same machine or not. But as a personal preference, I wouldn't go for language-specific communication and would stick to general ideas, because of the problems it might cause in the future.

Wrap-up

Long enough explanation :). Just to make some sense of what I'm talking about, let's follow the exact same scenario, with a difference in the communication model:

  1. Client sends request "A" to the server_app service.
  2. The service chooses one of the pods (server_app_pod_b for instance) to handle the request.
  3. The pod tries to define some key for the request, passes it to the gateway along with the request, and waits for a message with that key to be published to the queue.
  4. The gateway does what it's supposed to, and sends a message with the key to the message queue.
  5. The exact same pod server_app_pod_b receives the message with the key, fetches the data of the message, and continues to process the client's request (a sketch follows this list).
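A rough Go sketch of the waiting side of step 3, using segmentio/kafka-go. Everything here is an assumption for illustration: the topic name, the broker address, and above all the premise that each pod reads one fixed partition while the gateway publishes keyed messages, so the key-to-partition hash decides which pod receives the notification:

    import (
        "context"
        "sync"

        "github.com/segmentio/kafka-go"
    )

    var (
        mu      sync.Mutex
        waiters = map[string]chan []byte{} // this pod's in-flight requests
    )

    // wait registers interest in a key before the gateway is called.
    func wait(key string) chan []byte {
        ch := make(chan []byte, 1)
        mu.Lock()
        waiters[key] = ch
        mu.Unlock()
        return ch
    }

    // consume runs once per pod, reading only the partition this pod
    // owns, and hands each notification to the goroutine waiting on it.
    func consume(ctx context.Context, partition int) {
        r := kafka.NewReader(kafka.ReaderConfig{
            Brokers:   []string{"kafka:9092"},  // illustrative address
            Topic:     "gateway-notifications", // illustrative topic
            Partition: partition,               // fixed per pod; the key hash picks the partition
        })
        defer r.Close()
        for {
            msg, err := r.ReadMessage(ctx)
            if err != nil {
                return // ctx cancelled or fatal reader error
            }
            mu.Lock()
            ch, ok := waiters[string(msg.Key)]
            delete(waiters, string(msg.Key))
            mu.Unlock()
            if ok {
                ch <- msg.Value
            }
        }
    }

The request handler would call wait(key) before contacting the gateway, then select on that channel and its 60s context. The fragile part, as the answer itself hints, is keeping the key derivation and the partition-to-pod assignment aligned as pods scale.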

There are probably other approaches available to address this issue, but this is what I would go for. Hope that it helps!
