Apache Beam/Dataflow Reshuffle deprecated, what to use instead?

Question

Apache Beam's Reshuffle was marked as deprecated in May 2017 with the note:

> For internal use only; no backwards compatibility guarantees.

In addition, the DataflowRunner installs a ReshuffleOverrideFactory, and it is unclear to me how this changes the reshuffling.

Anyway, the JavaDoc doesn't mention what to use instead. How are users supposed to deal with ParDo transforms with high fan-out, both in general and on Dataflow?

Answer 1

Score: 1

You can look at the withFanout option for GroupByKey and Combine operations. The Java API link below is for Combine.Globally.withFanout(int): https://beam.apache.org/releases/javadoc/2.0.0/org/apache/beam/sdk/transforms/Combine.Globally.html#withFanout-int-
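
To make the fanout idea concrete, here is a minimal plain-Java sketch of a two-stage combine: elements are first spread over a fixed number of intermediate accumulators, and the partial results are then merged. This only illustrates the concept and is not how Beam implements it. The class name FanoutCombineSketch and the helper sumWithFanout are hypothetical names made up for this example. The Beam call mentioned in the comment assumes a PCollection<Long> named input.

```java
import java.util.ArrayList;
import java.util.List;

public class FanoutCombineSketch {

    // Hypothetical helper (not part of Beam): sums values in two stages.
    // In a Beam pipeline, the corresponding call would look roughly like
    //   input.apply(Combine.globally(Sum.ofLongs()).withFanout(16));
    // where `input` is assumed to be a PCollection<Long>.
    static long sumWithFanout(List<Long> values, int fanout) {
        if (fanout < 1) {
            throw new IllegalArgumentException("fanout must be >= 1");
        }
        // Stage 1: spread the elements over `fanout` intermediate accumulators,
        // so no single accumulator has to absorb every element.
        long[] partial = new long[fanout];
        for (int i = 0; i < values.size(); i++) {
            partial[i % fanout] += values.get(i);
        }
        // Stage 2: merge the partial sums into the final result.
        long total = 0;
        for (long p : partial) {
            total += p;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Long> values = new ArrayList<>();
        for (long v = 1; v <= 100; v++) {
            values.add(v);
        }
        System.out.println(sumWithFanout(values, 16)); // prints 5050
    }
}
```

The result is the same as a single-stage sum. The fanout only changes how the work is split before the final merge, which is what makes it useful when one combine step would otherwise become a bottleneck.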

huangapple
  • Posted on March 16, 2020, 16:38:55
  • When reposting, please keep the link to this article: https://go.coder-hub.com/60702671.html