How to manage a long-running process in a web app

Question

I'm trying to implement the following functionality in Go.

I have a web page with a form, used to upload a .csv file.
Gorilla mux routes the request to a handler that takes the file, parses it, runs a series of operations on the data, and finally produces a report with the number of lines parsed, the number of rejected lines, etc.

My problem is that even though it works on my machine, on a server Apache will time out before I can get to the end of it all: the file upload itself isn't the issue, but I have to wait for transformations on the data to complete.

I've tried using Gorilla websocket to get feedback from the process (for instance, the growing count of lines parsed and processed) and to keep the connection open, but the upload is a POST request, and Gorilla websocket will only upgrade from HTTP to WebSocket on a GET request.

I'm not even sure I'm on the right track with websockets for doing this type of thing.

I can have a goroutine for the processing itself and return the handler before the goroutine completes, but then how do I show the result of the process in the UI?

So at this stage my question boils down to: what would be the best way, in Go, when you need to:

  • upload a file,
  • wait for a long process to complete
  • and display the result in a web page?

A clue as to the right direction to go in would be much appreciated.

Answer 1

Score: 4

You've stumbled onto a non-trivial problem. There are a lot of possible solutions, with different user experiences, implementation complexities, and side effects. This is a pretty big topic so this answer is intended mostly as a starting point for further research.

The Simplest Option

First, pretty much regardless of solution, you're going to have to give each long-running task a unique ID that the browser can use to get status updates later. The task runner itself can just flag jobs as complete, or it can periodically issue progress updates if you want to present progress to the user.
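
As a minimal sketch of the task-ID idea, assuming an in-memory store is good enough (one server process, jobs lost on restart): the names `JobStore`, `JobStatus`, `Create`, `Update`, and `Get` are invented for illustration and do not come from the question or any library. The task runner would call `Update` as it works, and whatever serves status to the browser would call `Get`.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"sync"
)

// JobStatus is a snapshot of one long-running task (hypothetical fields).
type JobStatus struct {
	Done     bool
	Parsed   int
	Rejected int
}

// JobStore keeps job statuses in memory, keyed by job ID.
type JobStore struct {
	mu   sync.Mutex
	jobs map[string]JobStatus
}

func NewJobStore() *JobStore {
	return &JobStore{jobs: make(map[string]JobStatus)}
}

// Create registers a new job under a random 128-bit ID and returns the ID.
func (s *JobStore) Create() (string, error) {
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	id := hex.EncodeToString(buf)
	s.mu.Lock()
	defer s.mu.Unlock()
	s.jobs[id] = JobStatus{}
	return id, nil
}

// Update applies fn to the stored status of job id; unknown IDs are ignored.
func (s *JobStore) Update(id string, fn func(*JobStatus)) {
	s.mu.Lock()
	defer s.mu.Unlock()
	st, ok := s.jobs[id]
	if !ok {
		return
	}
	fn(&st)
	s.jobs[id] = st
}

// Get returns a copy of the job's status and whether the job exists.
func (s *JobStore) Get(id string) (JobStatus, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	st, ok := s.jobs[id]
	return st, ok
}

func main() {
	store := NewJobStore()
	id, err := store.Create()
	if err != nil {
		panic(err)
	}
	// The task runner reports progress, then flags the job as complete.
	store.Update(id, func(st *JobStatus) { st.Parsed, st.Rejected = 10, 1 })
	store.Update(id, func(st *JobStatus) { st.Done = true })
	st, _ := store.Get(id)
	fmt.Printf("job %s: %+v\n", id, st)
}
```

Random IDs are used here so that one user cannot easily guess another user's job URL; a sequential counter would work too if that does not matter.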

The easiest option to implement is probably to have your form submission respond immediately with a page whose URL includes the task ID. That page's handler checks the task status and either a) returns a "still working" page (or something to that effect) that auto-refreshes after a few seconds, or b) returns a "completed" page that does not refresh. This isn't terribly difficult to implement, but it's not particularly smooth, either. If this is a simple internal-use project with simple UX and operational requirements, I'd just do this. Otherwise, further down the rabbit hole we go!
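
A rough sketch of that auto-refresh flow, using only the standard library (the same handlers could be registered on a Gorilla mux router). Everything named here is an assumption for illustration: the `/upload` and `/status` paths, the `file` form field, the `process` stand-in, and the counter-based IDs. The POST is answered with a 303 redirect so the browser lands on the status URL, and the page refreshes itself with a `<meta http-equiv="refresh">` tag.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strconv"
	"sync"
)

// job is the state of one upload. All names in this sketch are illustrative.
type job struct {
	done bool
	rows int
}

var (
	mu     sync.Mutex
	nextID int
	jobs   = map[string]job{}
)

// process stands in for the real, slow CSV work; here it only counts newlines.
func process(data []byte) int {
	return bytes.Count(data, []byte("\n"))
}

// startJob registers a job, processes data in a goroutine, and returns the job ID.
// A counter keeps the sketch short; random IDs are harder to guess.
func startJob(data []byte) string {
	mu.Lock()
	nextID++
	id := strconv.Itoa(nextID)
	jobs[id] = job{}
	mu.Unlock()
	go func() {
		n := process(data)
		mu.Lock()
		jobs[id] = job{done: true, rows: n}
		mu.Unlock()
	}()
	return id
}

// renderStatus returns a self-refreshing page while the job runs, a final one after.
func renderStatus(j job) string {
	if !j.done {
		return `<meta http-equiv="refresh" content="3"><p>Still working...</p>`
	}
	return fmt.Sprintf("<p>Completed: %d rows processed.</p>", j.rows)
}

// upload handles the form POST and answers at once with a redirect to the status page.
func upload(w http.ResponseWriter, r *http.Request) {
	f, _, err := r.FormFile("file") // "file" is an assumed form field name
	if err != nil {
		http.Error(w, "missing file", http.StatusBadRequest)
		return
	}
	defer f.Close()
	// Copy the upload before returning: net/http removes multipart temp files
	// once the handler ends, so the goroutine must not read f afterwards.
	data, err := io.ReadAll(f)
	if err != nil {
		http.Error(w, "could not read upload", http.StatusInternalServerError)
		return
	}
	http.Redirect(w, r, "/status?id="+startJob(data), http.StatusSeeOther)
}

// statusPage shows the current state of the job named in the query string.
func statusPage(w http.ResponseWriter, r *http.Request) {
	mu.Lock()
	j, ok := jobs[r.URL.Query().Get("id")]
	mu.Unlock()
	if !ok {
		http.NotFound(w, r)
		return
	}
	w.Header().Set("Content-Type", "text/html; charset=utf-8")
	fmt.Fprint(w, renderStatus(j))
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/upload", upload)
	mux.HandleFunc("/status", statusPage)
	// A real server would call http.ListenAndServe(":8080", mux) here. To keep
	// the demo self-terminating, start one job and fetch its status page once.
	id := startJob([]byte("a,b\n1,2\n"))
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/status?id="+id, nil))
	fmt.Println(rec.Body.String())
}
```

Because the upload handler returns as soon as the data is copied, the proxy in front of it only has to wait for the upload itself, not for the processing. Reading the whole file into memory is a simplification; a large upload would more likely be copied to a file the job owns.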

Live Updates

You could do live updates without reloading the page by a few different methods:

  • Regular AJAX requests to check the status of the task, updating the UI based on the response. This would have a REST-style handler on the back end.
  • You can use WebSockets to do the same thing over a single connection.
  • You can use HTTP long-polling to simulate WebSocket-like behavior, but this has generally been supplanted by WebSockets.

Any of these options will require both a handler to serve the status update information and some JavaScript wizardry on the front-end to call the handler, parse the response, and update the page.
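
For the AJAX polling variant, the back end could look roughly like the sketch below: the job publishes its counters after every row, and a status handler returns them as JSON for the page's JavaScript to poll. `Tracker`, `Status`, the `/status?id=...` URL, and the JSON field names are all assumptions for illustration. Rows that `encoding/csv` fails to parse, including rows whose field count differs from the first row, are counted as rejected; the real rejection rules would be whatever the application needs.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync"
)

// Status is the JSON document the browser polls for; field names are illustrative.
type Status struct {
	Parsed   int  `json:"parsed"`
	Rejected int  `json:"rejected"`
	Done     bool `json:"done"`
}

// Tracker guards the status of every job, keyed by job ID.
type Tracker struct {
	mu   sync.Mutex
	jobs map[string]Status
}

func (t *Tracker) set(id string, s Status) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.jobs[id] = s
}

func (t *Tracker) get(id string) (Status, bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	s, ok := t.jobs[id]
	return s, ok
}

// Run parses data row by row and publishes progress after every row.
func (t *Tracker) Run(id string, data []byte) {
	var s Status
	r := csv.NewReader(bytes.NewReader(data))
	for {
		_, err := r.Read()
		if err == io.EOF {
			break
		}
		if err != nil {
			s.Rejected++ // malformed row, or wrong number of fields
		} else {
			s.Parsed++
		}
		t.set(id, s)
	}
	s.Done = true
	t.set(id, s)
}

// ServeHTTP answers GET /status?id=... with the job's current status as JSON.
func (t *Tracker) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	s, ok := t.get(r.URL.Query().Get("id"))
	if !ok {
		http.NotFound(w, r)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(s) // a write error only means the client went away
}

func main() {
	t := &Tracker{jobs: map[string]Status{}}
	// An upload handler would call `go t.Run(id, data)`; it runs inline here
	// so the demo is deterministic.
	t.Run("demo", []byte("a,b\n1,2\n3\n"))
	rec := httptest.NewRecorder()
	t.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/status?id=demo", nil))
	fmt.Print(rec.Body.String())
}
```

On the page, a timer would fetch this URL every second or two and stop once `done` is true. A WebSocket version would push the same `Status` values over a connection the page opens with a separate GET request after the upload.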

Side-Effects

Depending on the scale and requirements of this service, there are some side-effects to consider; mainly that a long-running task is effectively a kind of application state, making your application stateful, which has some severe operational downsides when it comes to availability, scaling, and deployment. If you're running multiple load-balanced instances you'll have to use sticky sessions or share task status between instances somehow.

The most common way to handle long-running tasks at scale is to separate the worker from the web application, using some kind of work queue (either in a database or in a dedicated message broker like RabbitMQ or Kafka) to manage the tasks. This makes it a little more complicated to get status updates because you're working across processes, but it gives you a lot more flexibility operationally.
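
The shape of that split can be sketched inside a single process, with a buffered channel standing in for the queue. This is only an analogy: in a real deployment the queue would be a database table or a broker, the workers would be separate processes, and results would travel back through shared storage. `Task`, `Result`, `worker`, and `runQueue` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// Task is the message the web side puts on the queue.
type Task struct {
	ID   string
	Rows []string
}

// Result is what a worker reports back once a task is finished.
type Result struct {
	ID        string
	Processed int
}

// worker pulls tasks until the queue is closed and reports one result per task.
func worker(queue <-chan Task, results chan<- Result, wg *sync.WaitGroup) {
	defer wg.Done()
	for task := range queue {
		// The real processing would happen here.
		results <- Result{ID: task.ID, Processed: len(task.Rows)}
	}
}

// runQueue enqueues tasks, lets the given number of workers drain the queue,
// and returns the processed row count per task ID.
func runQueue(tasks []Task, workers int) map[string]int {
	// Both channels are buffered to len(tasks), so no send can block.
	queue := make(chan Task, len(tasks))
	results := make(chan Result, len(tasks))
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go worker(queue, results, &wg)
	}
	for _, task := range tasks {
		queue <- task
	}
	close(queue) // no more work: workers exit once the queue is drained
	wg.Wait()
	close(results)
	done := make(map[string]int)
	for r := range results {
		done[r.ID] = r.Processed
	}
	return done
}

func main() {
	tasks := []Task{
		{ID: "job-1", Rows: []string{"a", "b"}},
		{ID: "job-2", Rows: []string{"c"}},
	}
	fmt.Println(runQueue(tasks, 2))
}
```

The web handlers only ever enqueue a task and read a status, which is what lets the worker side be moved out of the web process later.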

I'm guessing this is a more complicated answer than you expected for "requests are timing out", but this is a case of a trivial issue with a non-trivial solution. You're certainly not alone in tackling this issue; researching how to handle long-running tasks in web applications will yield a ton of information you can leverage.

huangapple
  • Published on July 27, 2017 at 00:26:25
  • When reposting, please keep the link to this article: https://go.coder-hub.com/45332625.html