Performance of Nats Jetstream
Question
I'm trying to understand how NATS JetStream scales and have a couple of questions.
- How efficient is subscribing by subject to historic messages? For example, let's say I have a stream foo that consists of 100 million messages with a subject of foo.bar and then a single message with a subject of foo.baz. If I then make a subscription to foo.baz from the start of the stream, will something on the server have to perform a linear scan of all messages in foo, or will it be able to immediately seek to the foo.baz message?
- How well does the system scale horizontally? I ask because I'm having issues getting JetStream to scale much above a few thousand messages per second, regardless of how many machines I throw at it. Test parameters are as follows:
  - NATS Server 2.6.3 running on 4-core, 8 GB nodes
  - Single stream replicated 3 times (disk or in-memory appears to make no difference)
  - 500-byte message payloads
  - n publishers, each publishing 1k messages per second
The bottleneck appears to be on the publishing side, as I can retrieve messages at least as fast as I can publish them.
Answer 1
Score: 8
Publishing in NATS JetStream is slightly different from publishing in Core NATS.
Yes, you can publish a Core NATS message to a subject that is recorded by a stream, and that message will indeed be captured in the stream. With a Core NATS publication, however, the publishing application does not expect an acknowledgement back from the nats-server, whereas with a JetStream publish call the nats-server sends an acknowledgement back to the client indicating whether the message was successfully persisted and replicated.
So when you call js.Publish() you are actually making a synchronous, relatively high-latency request-reply. The latency grows if your replication factor is 3 or 5, grows further if your stream is persisted to file, and also depends on the network latency between the client application and the nats-server. This means your throughput is going to be limited if you are just making those synchronous publish calls back to back.
If you want high throughput when publishing messages to a stream, use the asynchronous version of the JetStream publish call instead, i.e. js.PublishAsync(), which returns a PubAckFuture.
However, in that case you must also remember to introduce some flow control by limiting the number of in-flight asynchronous publications at any given time. This is because you can always publish asynchronously much, much faster than the nats-server(s) can replicate and persist messages.
If you were to continuously publish asynchronously as fast as you can (e.g. when publishing the result of some kind of batch process), you would eventually overwhelm your servers, which is something you really want to avoid.
You have two options to flow-control your JetStream async publications:
- Specify a maximum number of in-flight asynchronous publication requests as an option when obtaining your JetStream context, i.e.
js, err := nc.JetStream(nats.PublishAsyncMaxPending(100))
- Use a simple batching mechanism that checks for the publications' PubAcks every so many asynchronous publications, like
nats bench
does: https://github.com/nats-io/natscli/blob/e6b2b478dbc432a639fbf92c5c89570438c31ee7/cli/bench_command.go#L476
About the expected performance: using async publications allows you to really get the throughput that NATS and JetStream are capable of. A simple way to validate or measure performance is to use the nats CLI tool (https://github.com/nats-io/natscli) to run benchmarks.
For example, you can start with a simple test: nats bench foo --js --pub 4 --msgs 1000000 --replicas 3
(an in-memory stream with 3 replicas, and 4 goroutines, each with its own connection, publishing 128-byte messages in batches of 100), and you should get a lot more than a few thousand messages per second.
For more information and examples of how to use the nats bench command, you can take a look at this video: https://youtu.be/HwwvFeUHAyo
Answer 2
Score: 4
For an R3 filestore you can expect ~250k small msgs per second. If you use synchronous publish, throughput will be dominated by the RTT from the application to the system, and from the stream leader to the closest follower. You can use windowed, intelligent async publish to get better performance.
You can get higher numbers with memory stores, but throughput will again be dominated by RTT throughout the system.
If you give me a sense of how large your messages are, we can show you some results from nats bench against the demo servers (R1) and NGS (R1 & R3).
For the original question regarding filtered consumers, >= 2.8.x will not do a linear scan to retrieve foo.baz. We could also show an example of this if it would help.
Feel free to join the Slack channel (slack.nats.io), which has a pretty active community. Feel free to DM me directly as well; happy to help.
Answer 3
Score: 3
It would be good to get an opinion on this. I see similar behaviour, and the only way I can achieve higher throughput for publishers is to lower replication (from 3 to 1), but that isn't an acceptable solution.
I have tried adding more resources (CPU/RAM), with no success in increasing the publishing rate.
Scaling horizontally did not make any difference either.
In my case, I am using the bench tool to publish to JetStream.