Can I avoid network shuffle when creating a KeyedStream from a pre-partitioned Kinesis Data Stream in Apache Flink?

huangapple go评论50阅读模式
英文:

Can I avoid network shuffle when creating a KeyedStream from a pre-partitioned Kinesis Data Stream in Apache Flink?

问题

能否从预先分片/预先分区的Kinesis数据流创建KeyedStream,而无需进行网络洗牌(例如使用reinterpretAsKeyedStream或类似的方法)? - 如果不可能(即从Kinesis中消费然后使用keyBy是唯一可靠的方法),那么通过对源数据流的分片字段使用keyBy能否最小化网络洗牌(例如env.addSource(source).keyBy(pojo -> pojo.getTransactionId()),其中源数据流按transactionId字段分片)? - 如果上述操作是可能的,有哪些限制?

我目前了解到的情况

我的应用背景

  • 关于配置:Kinesis数据流和Flink都将以无服务器方式托管,并根据负载自动进行扩缩容(据我了解,这意味着不能使用reinterpretAsKeyedStream)。

任何帮助/见解都将不胜感激,谢谢!

英文:

Is it possible to create a KeyedStream from a pre-sharded/pre-partitioned Kinesis Data Stream without the need for a network shuffle (i.e. using reinterpretAsKeyedStream or something similar)?

  • If that is not possible (i.e. the only reliable is to consume from Kinesis and then use keyBy), then is network shuffling at least minimized by doing a keyBy on a the field that the source is sharded by (e.g. env.addSource(source).keyBy(pojo -> pojo.getTransactionId()), where the source is a kinesis data stream that is sharded by transactionId)
  • If the above is possible, what are the limitations?

What I've Learned so Far

Context of my Application

  • Re. configurations: both the Kinesis Data Stream and Flink will be hosted serverlessly, and automatically scale up/down depending on load (which as I understand it, means that reinterpretAsKeyedStream cannot be used)

Any help/insight is much appreciated, thanks!

答案1

得分: 2

我不相信有任何简单的方法来实现你想要的,至少不是一种能够适应源和集群并行度变化的方式。我曾经使用直升机特技来实现类似的功能,但这牵涉到脆弱的代码(取决于Flink如何处理分区)。

英文:

I don't believe there's any way to easily do what you want, at least not in a way that's resilient to changes in the parallelism of your source and your cluster. I have used helicopter stunts to achieve something similar to this, but it involved fragile code (depends on exactly how Flink handles partitioning).

huangapple
  • 本文由 发表于 2023年2月9日 02:53:02
  • 转载请务必保留本文链接:https://go.coder-hub.com/75390464.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定