Cloud Run Jobs with tasks from database?
Question
I am looking into Cloud Run Jobs, but I'm having trouble with the documentation and the (scarce) examples for the product.
I want to create a Cloud Run Job that runs on a schedule via Cloud Scheduler. The number of tasks depends on the number of items in a table: I want to run one task per row.
Given that each task runs individually and there's no "parent" container holding the task information, the only solution that comes to mind is a limit 1 offset ${BATCH_TASK_INDEX} query to fetch one row per task, which doesn't seem very efficient. Also, with this method I wouldn't know the task count in advance.
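For illustration, here is a minimal sketch of a sharded variant of that idea: instead of one row per task, the job is created with a fixed task count and each task picks up every row whose id falls into its shard, so the row count does not need to be known up front. It assumes the task index and count arrive in the CLOUD_RUN_TASK_INDEX and CLOUD_RUN_TASK_COUNT environment variables (as far as I know, BATCH_TASK_INDEX is the Batch equivalent). The items table, its integer id column, and the in-memory SQLite database are hypothetical stand-ins for the real table.

```python
import os
import sqlite3


def rows_for_task(conn, task_index, task_count):
    """Return the rows whose id falls into this task's shard."""
    # Modulo sharding: every row belongs to exactly one task, and no task
    # has to know the total number of rows in advance.
    cur = conn.execute(
        "SELECT id, payload FROM items WHERE id % ? = ? ORDER BY id",
        (task_count, task_index),
    )
    return cur.fetchall()


def demo_connection():
    """Build a throwaway in-memory table (hypothetical schema) for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany(
        "INSERT INTO items (id, payload) VALUES (?, ?)",
        [(i, f"row-{i}") for i in range(10)],
    )
    return conn


if __name__ == "__main__":
    # Assumed to be set by Cloud Run Jobs for each task; the defaults let
    # the script run locally as a single task.
    task_index = int(os.environ.get("CLOUD_RUN_TASK_INDEX", "0"))
    task_count = int(os.environ.get("CLOUD_RUN_TASK_COUNT", "1"))
    conn = demo_connection()
    for row_id, payload in rows_for_task(conn, task_index, task_count):
        print(f"task {task_index}/{task_count} processing {row_id}: {payload}")
```

The trade-off is that each task may process several rows rather than exactly one, which may or may not fit the use case.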
I've seen there's a newer product, "Batch", which has a script for the job and a container for the task, and that more or less fits my use case. But I fail to understand what Cloud Run Jobs is actually good for. Can someone shed some light?
Answer 1
Score: 0
Thanks @yedpodtrzitko for the idea. I spent several hours with different approaches and I still don't know how Cloud Run Jobs is supposed to work 🤷
"parent" container idea
The issue here is that Cloud Run Jobs execute doesn't take any arguments, so I can't just have a pre-defined job and pass in the arguments before running it; the entire job needs to be created with fixed arguments. This means that even if I had a parent container to fetch the rows, I'd still need to create the job first and delete it afterwards.
Batch
Batch seems to be a little more flexible and would probably work for this use case, but I found the documentation too bare to justify starting another investigation into a new product.
Cloud Functions & Tasks
In the end I decided to use a "parent" Cloud Function to retrieve the rows and then create Cloud Tasks that call a second Function to process the individual items. Cloud Tasks adds some control over concurrency, which helps avoid rate limit errors and the like.
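A rough sketch of the fan-out step in the parent function, not the exact code I deployed: it builds one task per row in the shape of the Cloud Tasks REST Task resource (an httpRequest whose body is base64-encoded) and hands each one to an enqueue callable. The handler URL, the row dictionaries, and the enqueue callable are hypothetical placeholders; in a real function, enqueue would wrap the actual Cloud Tasks client call, and the concurrency limits would be configured on the queue itself.

```python
import base64
import json


def build_task(row, handler_url):
    """Build a task payload that POSTs one row to the worker function."""
    body = json.dumps(row).encode("utf-8")
    return {
        "httpRequest": {
            "httpMethod": "POST",
            "url": handler_url,
            "headers": {"Content-Type": "application/json"},
            # The REST API expects the request body as a base64 string.
            "body": base64.b64encode(body).decode("ascii"),
        }
    }


def fan_out(rows, handler_url, enqueue):
    """Create one task per row and return how many were enqueued."""
    count = 0
    for row in rows:
        enqueue(build_task(row, handler_url))
        count += 1
    return count


if __name__ == "__main__":
    # Stand-ins: rows would come from the database query, and enqueue
    # would submit the task to a Cloud Tasks queue instead of a list.
    rows = [{"id": 1, "name": "first"}, {"id": 2, "name": "second"}]
    queued = []
    total = fan_out(rows, "https://example.com/process-item", queued.append)
    print(f"{total} tasks queued")
```

Keeping the enqueue step injectable also makes the parent function easy to test without touching a real queue.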
I'm intrigued by Cloud Run Jobs, so I'll keep looking for problems it can solve, but so far no luck.