Cloud Functions Gen2 with Golang - Instance Lifetime and BigQuery Insertion Safety without Awaiting Job Completion
Question
I'm working with a Google Cloud Function (Gen2) written in Golang, which is triggered by an HTTP request. My use case requires storing some data in BigQuery, and I would like to avoid waiting for the job to complete before responding to the HTTP request.
However, I have concerns regarding the behavior of the Cloud Function instance after returning from the function:
- How long does the instance remain alive after sending the HTTP response?
- Is it safe not to wait for the BigQuery job to finish? Do I run the risk of losing data if the instance is terminated before the job completes?
Any insight or best practices regarding this scenario would be greatly appreciated.
Answer 1
Score: 2
This approach is discouraged. See the docs below:
> ### Function execution timeline
>
> A function has access to its allocated resources (memory and CPU) only for the duration of function execution. Code run outside of the execution period is not guaranteed to execute, and it can be stopped at any time. Therefore, you should always signal the end of your function execution correctly and avoid running any code beyond it.
>
> ### Do not start background activities
>
> Background activity is anything that happens after your function has terminated. A function invocation finishes once the function returns or otherwise signals completion, such as by calling the `callback` argument in Node.js event-driven functions. Any code run after graceful termination cannot access the CPU and will not make any progress.
>
> In addition, when a subsequent invocation is executed in the same environment, your background activity resumes, interfering with the new invocation. This may lead to unexpected behavior and errors that are hard to diagnose. Accessing the network after a function terminates usually leads to connections being reset (`ECONNRESET` error code).
>
> Background activity can often be detected in logs from individual invocations, by finding anything that is logged after the line saying that the invocation finished. Background activity can sometimes be buried deeper in the code, especially when asynchronous operations such as callbacks or timers are present. Review your code to make sure all asynchronous operations finish before you terminate the function.

An alternative is to implement the BigQuery write as an event-driven function (see Types of Cloud Functions) and give that function a Pub/Sub trigger on a Pub/Sub topic (see Pub/Sub triggers). The client then has to be rewritten to publish events to this topic.
If the client cannot be rewritten, a workaround is to keep both the HTTP function and the event-driven function, and have the HTTP function offload the work to the event-driven function by publishing an event to the topic. Depending on the size of the event and the execution time of the BigQuery job, this may not actually shorten the time the client waits, and I think it will increase the cost considerably.