2023-05-10 17:57:58 | go

How to use a Redis zset effectively for scheduled task processing

Question

We have a use case where we need to schedule a job at different timestamps until it finishes successfully or the maximum number of attempts is reached.

We are thinking of using Redis sorted sets for this purpose.

For each new job we will put node[data(string), timestamp_to_execute(double) as score] into a zset (named 'delay-queue') in Redis.

We will have a few worker threads running in each instance of our application, which keep polling the zset (delay-queue) for the job at the top (the lowest score, i.e. the earliest execution time). If [score <= currentime.millis()] we will execute that job and remove it from the zset; otherwise we wait for some time and check again.

We don't want a job to execute twice. Since we have multiple app instances running, how can we make sure that a job picked up by one instance is not also picked up by another?
We are also evaluating some edge cases: if jobs scheduled for the future are sitting in the Redis zset and Redis goes down, or the data in the zset gets flushed or erased, those jobs would be lost. How do we make sure that does not happen? Is a Redis zset the right choice in this situation?

Answer 1

Score: 1

Picking up a job from the ZSET takes two steps:

Check whether the minimum score in the ZSET is less than or equal to the current time.
If it is, pop the minimum item from the ZSET.
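
To make the two steps concrete, here is a minimal in-memory sketch in Go. It does not talk to Redis: the `delayQueue` type, its `add` and `popDue` methods, and the job names are all hypothetical, and a mutex stands in for whatever makes the check and the pop atomic on the Redis side. It uses `score <= now`, as in the question.

```go
package main

import (
	"container/heap"
	"fmt"
	"sync"
)

// job pairs a payload with the time (Unix milliseconds) at which it becomes due.
type job struct {
	data  string
	dueMs int64
}

// jobHeap is a min-heap ordered by dueMs, standing in for the sorted set.
type jobHeap []job

func (h jobHeap) Len() int           { return len(h) }
func (h jobHeap) Less(i, j int) bool { return h[i].dueMs < h[j].dueMs }
func (h jobHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *jobHeap) Push(x any)        { *h = append(*h, x.(job)) }
func (h *jobHeap) Pop() any {
	old := *h
	last := old[len(old)-1]
	*h = old[:len(old)-1]
	return last
}

// delayQueue is an in-memory model of the 'delay-queue' zset (hypothetical).
type delayQueue struct {
	mu   sync.Mutex
	jobs jobHeap
}

// add inserts a job with its due time as the score.
func (q *delayQueue) add(data string, dueMs int64) {
	q.mu.Lock()
	defer q.mu.Unlock()
	heap.Push(&q.jobs, job{data: data, dueMs: dueMs})
}

// popDue performs both steps under one lock: check the minimum score,
// and pop it only if it is due. ok is false when nothing is due yet.
func (q *delayQueue) popDue(nowMs int64) (data string, ok bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.jobs) == 0 || q.jobs[0].dueMs > nowMs {
		return "", false
	}
	return heap.Pop(&q.jobs).(job).data, true
}

func main() {
	q := &delayQueue{}
	q.add("send-email", 1000)
	q.add("retry-payment", 3000)
	q.add("cleanup", 2000)

	// At t=2000 only the jobs due at 1000 and 2000 are eligible.
	for {
		data, ok := q.popDue(2000)
		if !ok {
			break
		}
		fmt.Println("claimed:", data)
	}
}
```

Because the check and the pop happen under one lock, two workers calling `popDue` concurrently can never receive the same job. Running the two steps as separate calls would leave a window in which both workers see the same minimum item.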

To prevent a job from being picked up by multiple apps, you have the following options:

Use a Redis script so that the two steps run atomically. See this post: https://stackoverflow.com/a/44109614/775640
Use WATCH with MULTI/EXEC so that the pop is applied only if the ZSET has not changed since the check. See the Redis documentation on transactions: https://redis.io/docs/manual/transactions/ (the end of that page has an example of implementing ZPOP with WATCH).
Use a Redis distributed lock so that only one app picks up a job at a time.
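
As a rough sketch of the first option, the check and the pop can be written as one Lua script and sent with EVAL. The Go wrapper below is an assumption for illustration: `evalFunc` is a hypothetical adapter over whichever Redis client you use (it is not a real library API), and `claimDue` and the stub in `main` are likewise made up. Only the Redis commands inside the script (ZRANGEBYSCORE with LIMIT, ZREM) are real.

```go
package main

import (
	"errors"
	"fmt"
)

// claimScript checks and pops in one step. Redis runs a script atomically,
// so no other client can act between ZRANGEBYSCORE and ZREM.
// KEYS[1] = the zset name, ARGV[1] = current time in milliseconds.
const claimScript = `
local due = redis.call('ZRANGEBYSCORE', KEYS[1], '-inf', ARGV[1], 'LIMIT', 0, 1)
if #due == 0 then
  return nil
end
redis.call('ZREM', KEYS[1], due[1])
return due[1]
`

// evalFunc is a hypothetical adapter over a Redis client; it should send
// EVAL with the given script, keys, and arguments. It is assumed to return
// a nil value when the script returns a nil reply.
type evalFunc func(script string, keys []string, args ...any) (any, error)

// claimDue asks Redis for one job whose score is <= nowMs.
// ok is false when no job is due.
func claimDue(eval evalFunc, queue string, nowMs int64) (data string, ok bool, err error) {
	reply, err := eval(claimScript, []string{queue}, nowMs)
	if err != nil {
		return "", false, err
	}
	if reply == nil {
		return "", false, nil
	}
	s, isString := reply.(string)
	if !isString {
		return "", false, errors.New("unexpected reply type from claim script")
	}
	return s, true, nil
}

func main() {
	// Stub standing in for a real Redis connection, for illustration only.
	stub := func(script string, keys []string, args ...any) (any, error) {
		return "job-42", nil
	}
	data, ok, err := claimDue(stub, "delay-queue", 1683741478000)
	fmt.Println(data, ok, err)
}
```

A worker would call `claimDue` in its polling loop and sleep briefly whenever `ok` is false. Note that the job leaves the zset as soon as it is claimed, so a worker that crashes mid-job loses it unless the job is recorded somewhere else first.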

If you want to prevent data loss when Redis goes down, you can configure Redis to persist data on disk: https://redis.io/docs/management/persistence/
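
For example, a redis.conf fragment along these lines turns on the append-only file. The values are illustrative, not a recommendation for your workload: `appendfsync everysec` can still lose roughly the last second of writes on a crash, while `appendfsync always` is safer but slower.

```
# Illustrative redis.conf fragment
# Log every write to the append-only file.
appendonly yes
# Flush the file to disk once per second.
appendfsync everysec
```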

If the data is extremely critical (for example, transferring money between accounts), then in my opinion Kafka is more suitable.

Update:

After reconsidering, Kafka might not be suitable for your case. You need an ordered set, but Kafka functions more like a message queue. You can stick with the Redis solution. If you require more data safety, using a database is also an option. You can save jobs in a table and use database transactions to ensure that only one app can pick up a job at a time.
