Apache Flink - Periodic Checkpoint Restore After Job Cancellation
Question
I'm using the Flink Java client and submitting a job via RemoteStreamEnvironment.
Periodic checkpointing is enabled, and checkpoints are retained on job cancellation:
streamExecutionEnvironment.getCheckpointConfig()
.setExternalizedCheckpointCleanup(CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);
The issue, though, is that it's not clear to me how to resume from the latest checkpoint when submitting "the same" job again: Flink generates a new random job ID and everything starts from scratch.
I know it's possible via the CLI or the Web UI, but I need to support this using the Java client.
Is there any support for that?
Thanks in advance
Answer 1
Score: 1
You can restore from a checkpoint the same way you would restore from a savepoint. Flink works out internally which of the two it has been given and restores accordingly (see the "restore from savepoint" doc).
You can call the REST endpoint from Java to submit your job together with the checkpoint. The exact details are in the REST doc (check the version of Flink you are using):
> /jars/:jarid/run
Set savepointPath in the request body to the path of either a savepoint or a retained checkpoint.
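As a rough illustration of the REST approach, here is a minimal sketch using the JDK's built-in java.net.http client (Java 11+). It assumes the jar has already been uploaded to the cluster and that you know its jar ID. The class name FlinkRestRestore, the helper method names, and the example host and paths are hypothetical, and the JSON body is built by hand only to keep the sketch dependency-free. Only the /jars/:jarid/run endpoint and the savepointPath field come from the answer above; check the REST doc for your Flink version for the other supported fields.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper, not part of Flink: submits an uploaded jar and
// asks Flink to restore state from a retained checkpoint (or savepoint).
public class FlinkRestRestore {

    // Builds the URL of the /jars/:jarid/run endpoint for an uploaded jar.
    static String runUrl(String baseUrl, String jarId) {
        return baseUrl + "/jars/" + jarId + "/run";
    }

    // Escapes backslashes and double quotes so the path is a valid JSON string.
    static String escapeJson(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds the request body. savepointPath accepts a savepoint path or
    // the directory of a retained checkpoint.
    static String runBody(String checkpointPath) {
        return "{\"savepointPath\":\"" + escapeJson(checkpointPath) + "\"}";
    }

    // Sends the POST request and returns the raw response body.
    static String submit(String baseUrl, String jarId, String checkpointPath)
            throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(runUrl(baseUrl, jarId)))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(runBody(checkpointPath)))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            System.err.println("usage: FlinkRestRestore <baseUrl> <jarId> <checkpointPath>");
            return;
        }
        // Example arguments (all assumed): http://localhost:8081 <jarId> <path to retained checkpoint>
        System.out.println(submit(args[0], args[1], args[2]));
    }
}
```

The checkpoint path passed in is the retained checkpoint's own directory (the one containing its metadata file), not the parent checkpoint directory. Picking "the latest" checkpoint is left to the caller, since the answer does not describe how to discover it.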