Cloud Run Jobs with tasks from a database?

Question

I am looking into Cloud Run Jobs, but I'm having trouble with the documentation and the (scarce) examples for the product.

I want to create a Cloud Run Job that runs on a Cloud Scheduler schedule. The number of tasks depends on the number of items in a table: I want to run one task for each row.

Given that each task runs individually and there's no "parent" container holding the task information, the only solution that comes to mind is a limit 1 offset ${BATCH_TASK_INDEX} query to fetch one row per task, which doesn't seem very efficient. Also, with this method I wouldn't know the task count in advance.
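One way to sidestep both concerns is to fix the task count up front and let each task claim a slice of the rows rather than exactly one row. The sketch below is only an illustration under several assumptions: the table name `items`, its `id` and `payload` columns, and the in-memory SQLite database are all hypothetical stand-ins for the real database. Note that Cloud Run Jobs exposes the task index and task count to each task as `CLOUD_RUN_TASK_INDEX` and `CLOUD_RUN_TASK_COUNT`; the `BATCH_TASK_INDEX` name used in the question belongs to Batch.

```python
import os
import sqlite3


def rows_for_task(conn, task_index, task_count):
    """Return the rows this task owns: ids where id % task_count == task_index."""
    cur = conn.execute(
        "SELECT id, payload FROM items WHERE id % ? = ? ORDER BY id",
        (task_count, task_index),
    )
    return cur.fetchall()


def make_demo_db(n_rows):
    """Build a throwaway in-memory table (hypothetical schema) for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany(
        "INSERT INTO items (id, payload) VALUES (?, ?)",
        [(i, f"item-{i}") for i in range(n_rows)],
    )
    return conn


if __name__ == "__main__":
    # Defaults let the script also run outside Cloud Run as a single task.
    task_index = int(os.environ.get("CLOUD_RUN_TASK_INDEX", "0"))
    task_count = int(os.environ.get("CLOUD_RUN_TASK_COUNT", "1"))
    conn = make_demo_db(10)
    for row_id, payload in rows_for_task(conn, task_index, task_count):
        print(f"task {task_index}/{task_count} processing {row_id}: {payload}")
```

With this kind of sharding the job can be created once with a fixed number of tasks, and the number of rows only affects how much work each task does. It assumes reasonably evenly distributed integer ids; a different partitioning key may suit other schemas better.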

I've seen there's a newer product "Batch" which has a script for the job and a container for the task, which kind of works for my use case. But I fail to understand what Cloud Run Jobs is actually good for. Maybe someone can shed some light?

Answer 1

Score: 0

Thanks @yedpodtrzitko for the idea. I spent several hours with different approaches and I still don't know how Cloud Run Jobs is supposed to work 🤷

"parent" container idea

The issue here is that executing a Cloud Run Job doesn't take any arguments, so I can't just have a pre-defined job and pass arguments to it before running it; the entire job needs to be created with fixed arguments. This means that even if I have a parent container to fetch the rows, I'd still need to create the job first and delete it afterwards.

Batch

Batch seems to be a little more flexible and would probably work for this use case, but I found the documentation too sparse to justify starting another investigation into a new product.

Cloud Functions & Tasks

In the end I decided to use a "parent" Cloud Function to retrieve the rows and then create Cloud Tasks that call a second Function to process the individual items. Cloud Tasks adds some control over concurrency, which helps avoid rate-limit errors and similar problems.
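The fan-out described above could look roughly like the following. This is a minimal sketch, not the code actually used: the handler URL, the row shape, and the `enqueue` callable are all hypothetical. The task dictionary is modeled on the HTTP target of a Cloud Tasks task; in a real deployment `enqueue` would presumably wrap the Cloud Tasks client's `create_task` call for a specific queue, and the concurrency limits would be configured on that queue rather than in this code.

```python
import json


def build_http_task(row, handler_url):
    """Describe one task that POSTs a single row to the worker function."""
    return {
        "http_request": {
            "http_method": "POST",
            "url": handler_url,  # hypothetical URL of the second Function
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps(row).encode("utf-8"),
        }
    }


def fan_out(rows, handler_url, enqueue):
    """Create one task per row, pass each to `enqueue`, and return the count."""
    count = 0
    for row in rows:
        enqueue(build_http_task(row, handler_url))
        count += 1
    return count


if __name__ == "__main__":
    # Stand-in for the rows the parent function would read from the table.
    demo_rows = [{"id": 1}, {"id": 2}]
    queued = []  # a plain list plays the role of the queue in this demo
    n = fan_out(demo_rows, "https://example.com/process-item", queued.append)
    print(f"queued {n} tasks")
```

Keeping the task construction separate from the enqueue step makes the parent function easy to test locally without touching a real queue.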

I'm intrigued by Cloud Run Jobs, so I'll continue to look for problems that can be solved with that, but so far no luck.

huangapple
  • Posted on June 13, 2023, 11:57:41
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76461608.html