Does NATS Jetstream provide message ordering by a key?
Question
I am new to NATS Jetstream and I have been reading their official documentation (https://docs.nats.io/jetstream/jetstream) to understand its concepts and compare it with Kafka. One of the major use cases I have is to solve for message/event ordering based on a particular id (like a partition key in the Kafka world).
For example, there are several update events coming for an Order entity, and my system needs to consume the events for a particular Order in the same order. In this case, I would use the order-id as the partition key while publishing to the Kafka topic. How do I accomplish this in Jetstream?
I have come across a de-duplication key (Nats-Msg-Id) in Jetstream, but I think this feature is more synonymous with topic compaction in Kafka. Am I right?
Nevertheless, I have written the following code in Golang for publishing:
// js is assumed to be a nats.JetStreamContext from the nats.go client.
order := Order{
    OrderId: orderId,
    Status:  status,
}
orderJson, _ := json.Marshal(order) // marshal error ignored for brevity
// nats.MsgId sets the Nats-Msg-Id header, which JetStream uses for de-duplication.
dedupKey := nats.MsgId(order.OrderId)
_, err := js.Publish(subjectName, orderJson, dedupKey)
Am I doing this right? Will all orders for a particular orderId go to the same consumer within a consumer group in the Jetstream world, hence maintaining the sequence?
Edit 1
This is what I get from @tbeets' suggestion. For example, I have predefined 10 stream subjects like ORDER.1, ORDER.2, ORDER.3, ..., ORDER.10. On the publishing side, I can do an order-id%10+1 to find the exact stream subject to publish to. So here, we have ensured that all update events for the same orderId go to the same stream subject every time.
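For illustration, the publish side of this sharding scheme could look roughly like the sketch below (assuming a numeric orderId, the fmt and encoding/json imports, and the same js JetStream context as in the earlier snippet; the variable names are placeholders, not part of my original code):

// Hypothetical sketch: route each order update to one of the 10 predefined subjects.
shard := orderId%10 + 1                   // the same orderId always maps to the same shard
subject := fmt.Sprintf("ORDER.%d", shard) // e.g. ORDER.1 ... ORDER.10
payload, _ := json.Marshal(order)         // marshal error ignored for brevity
_, err := js.Publish(subject, payload)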
Now, on the subscriber side, I have 10 consumer groups (with 10 consumers within each group), and each group consumes from a particular stream subject: consumerGroup-1 consumes from ORDER.1, consumerGroup-2 consumes from ORDER.2, and so on.
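In nats.go terms, each of these consumer groups roughly corresponds to a queue subscription bound to one subject. A minimal sketch, with the queue name being illustrative:

// Hypothetical sketch: members of consumerGroup-1 share a queue subscription on ORDER.1,
// and JetStream load-balances ORDER.1 messages across them.
sub, err := js.QueueSubscribe("ORDER.1", "consumerGroup-1", func(m *nats.Msg) {
    // process the order update, then acknowledge it
    m.Ack()
})

Because the queue group load-balances across its members, two updates for the same orderId can end up on different consumers, which is exactly the ordering concern described next.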
Say 2 order update events come in for order-id 111; they map to the ORDER.1 stream subject, and correspondingly consumerGroup-1 will consume these 2 events. But within this consumer group, the 2 update events can go to different consumers, and if one of those consumers is busy or slow, then at an overall level the order update events may be consumed out of sync or out of order.
Kafka solves this with the concept of a partition key: each consumer in a consumer group is allocated a particular partition, so all events for the same orderId are consumed by the same consumer, maintaining the sequence of order update events. How do I solve this issue in Jetstream?
Answer 1
Score: 2
In NATS, your publish subject can contain multiple delimited tokens. So, for instance, your Order event could be published to ORDER.{store}.{orderid}, where the last two tokens are specific to each event and provide whatever slice-and-dice dimensions you need for your use case.
You then define a JetStream stream for ORDER.> (i.e. all of the events). Any number of consumers (ephemeral or durable) can be created on that stream, each with an optional filter subject (e.g. ORDER.Store24.>) to match your use case needs on the underlying stream's messages. JetStream guarantees that messages (filtered or unfiltered) are delivered in the order they were received.
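For concreteness, a minimal sketch of that setup with the nats.go client might look like this (the stream name ORDERS, the durable name, the Store24 filter, and the log/time imports are illustrative assumptions, not prescribed by the answer):

// Hypothetical sketch: one stream over all order events, plus a filtered durable consumer.
_, err := js.AddStream(&nats.StreamConfig{
    Name:     "ORDERS",
    Subjects: []string{"ORDER.>"}, // captures ORDER.{store}.{orderid}
})
if err != nil {
    log.Fatal(err)
}
// Subscribe with a filter subject; JetStream delivers the filtered messages
// in the order the stream received them.
sub, err := js.SubscribeSync("ORDER.Store24.>", nats.Durable("store24-worker"))
if err != nil {
    log.Fatal(err)
}
msg, err := sub.NextMsg(5 * time.Second)
if err == nil {
    msg.Ack() // process the event, then acknowledge
}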