How to reduce or disable checkpoint logs in Flink

Question

<details>
<summary>Original (English)</summary>

I am using Flink 1.11.1 and run it on Kubernetes in standalone mode, with HDFS for storage and HA. I recently enabled Flink's checkpointing feature, but I notice that both the JobManager and the TaskManagers write too many checkpoint-related log lines, which is annoying. Examples follow:

**Jobmanager**

2020-10-08 19:54:23,237 [INFO] org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 1 (type=CHECKPOINT) @ 1602186863226 for job fbf26a33a6d5d235085d10e7a10c1cab.
2020-10-08 19:54:42,818 [INFO] org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 1 for job fbf26a33a6d5d235085d10e7a10c1cab (702534 bytes in 19488 ms).
2020-10-08 19:54:42,825 [INFO] org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 2 (type=CHECKPOINT) @ 1602186882820 for job fbf26a33a6d5d235085d10e7a10c1cab.
2020-10-08 19:54:43,384 [INFO] org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 2 for job fbf26a33a6d5d235085d10e7a10c1cab (729357 bytes in 494 ms).
2020-10-08 19:54:43,392 [INFO] org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 3 (type=CHECKPOINT) @ 1602186883388 for job fbf26a33a6d5d235085d10e7a10c1cab.
2020-10-08 19:54:44,295 [INFO] org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 3 for job fbf26a33a6d5d235085d10e7a10c1cab (736969 bytes in 836 ms).
2020-10-08 19:54:44,302 [INFO] org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 4 (type=CHECKPOINT) @ 1602186884298 for job fbf26a33a6d5d235085d10e7a10c1cab.
2020-10-08 19:54:44,794 [INFO] org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 4 for job fbf26a33a6d5d235085d10e7a10c1cab (748787 bytes in 431 ms).
2020-10-08 19:54:44,800 [INFO] org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 5 (type=CHECKPOINT) @ 1602186884796 for job fbf26a33a6d5d235085d10e7a10c1cab.
2020-10-08 19:54:45,198 [INFO] org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 5 for job fbf26a33a6d5d235085d10e7a10c1cab (755308 bytes in 327 ms).
2020-10-08 19:54:45,703 [INFO] org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 6 (type=CHECKPOINT) @ 1602186885698 for job fbf26a33a6d5d235085d10e7a10c1cab.
2020-10-08 19:54:45,897 [INFO] org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 6 for job fbf26a33a6d5d235085d10e7a10c1cab (757353 bytes in 163 ms).
2020-10-08 19:54:45,903 [INFO] org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 7 (type=CHECKPOINT) @ 1602186885899 for job fbf26a33a6d5d235085d10e7a10c1cab.

**Taskmanager**

2020-10-08 20:04:02,090 [INFO] org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction - FlinkKafkaProducer 1/2 - checkpoint 571 complete, committing transaction TransactionHolder{handle=KafkaTransactionState [transactionalId=null, producerId=-1, epoch=-1], transactionStartTime=1602187440992} from checkpoint 571
2020-10-08 20:04:03,086 [INFO] org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction - FlinkKafkaProducer 1/2 - checkpoint 572 complete, committing transaction TransactionHolder{handle=KafkaTransactionState [transactionalId=null, producerId=-1, epoch=-1], transactionStartTime=1602187441992} from checkpoint 572
2020-10-08 20:04:03,086 [INFO] org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction - FlinkKafkaProducer 1/2 - checkpoint 572 complete, committing transaction TransactionHolder{handle=KafkaTransactionState [transactionalId=null, producerId=-1, epoch=-1], transactionStartTime=1602187441992} from checkpoint 572
2020-10-08 20:04:04,099 [INFO] org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction - FlinkKafkaProducer 1/2 - checkpoint 573 complete, committing transaction TransactionHolder{handle=KafkaTransactionState [transactionalId=null, producerId=-1, epoch=-1], transactionStartTime=1602187442992} from checkpoint 573
2020-10-08 20:04:04,099 [INFO] org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction - FlinkKafkaProducer 1/2 - checkpoint 573 complete, committing transaction TransactionHolder{handle=KafkaTransactionState [transactionalId=null, producerId=-1, epoch=-1], transactionStartTime=1602187442992} from checkpoint 573
2020-10-08 20:04:05,130 [INFO] org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction - FlinkKafkaProducer 1/2 - checkpoint 574 complete, committing transaction TransactionHolder{handle=KafkaTransactionState [transactionalId=null, producerId=-1, epoch=-1], transactionStartTime=1602187443999} from checkpoint 574
2020-10-08 20:04:05,130 [INFO] org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction - FlinkKafkaProducer 1/2 - checkpoint 574 complete, committing transaction TransactionHolder{handle=KafkaTransactionState [transactionalId=null, producerId=-1, epoch=-1], transactionStartTime=1602187443999} from checkpoint 574
2020-10-08 20:04:06,096 [INFO] org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction - FlinkKafkaProducer 1/2 - checkpoint 575 complete, committing transaction TransactionHolder{handle=KafkaTransactionState [transactionalId=null, producerId=-1, epoch=-1], transactionStartTime=1602187444995} from checkpoint 575
2020-10-08 20:04:06,096 [INFO] org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction - FlinkKafkaProducer 1/2 - checkpoint 575 complete, committing transaction TransactionHolder{handle=KafkaTransactionState [transactionalId=null, producerId=-1, epoch=-1], transactionStartTime=1602187444992} from checkpoint 575

Is there any way we can disable or reduce the logs from checkpointing? Any help will be appreciated!

</details>


# Answer 1
**Score**: 4

The simplest way to silence noisy logs is to set the desired log level for the components that produce them. In your case, to suppress logs from `org.apache.flink.runtime.checkpoint`, or more broadly from all Flink components under `org.apache.flink`, raise the log level to `WARN`. To do so, edit the `flink/conf/log4j.properties` file and add the following (or uncomment the existing lines):

```properties
logger.flink.name = org.apache.flink
logger.flink.level = WARN
```

The changes are picked up after the application is restarted.
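
If raising the level for everything under `org.apache.flink` hides too much, a narrower variant could target only the two classes that appear in the log excerpts from the question. This is a minimal sketch, assuming the same Log4j 2 properties syntax as the snippet in this answer. The logger ids `checkpoint` and `twophase` are arbitrary names chosen here for illustration, not anything Flink defines.

```properties
# Sketch only: "checkpoint" and "twophase" are made-up logger ids; any unique id works.
# Silence the per-checkpoint "Triggering"/"Completed" INFO lines on the JobManager.
logger.checkpoint.name = org.apache.flink.runtime.checkpoint.CheckpointCoordinator
logger.checkpoint.level = WARN

# Silence the per-checkpoint "committing transaction" INFO lines on the TaskManagers.
logger.twophase.name = org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction
logger.twophase.level = WARN
```

Warnings and errors from both classes would still be logged, and the rest of Flink would stay at the root level. On a Kubernetes deployment the processes may run in the foreground and read `log4j-console.properties` rather than `log4j.properties`. That is an assumption about the setup, so it is worth checking which file your image actually loads.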

huangapple
  • Posted on 2020-10-09 04:21:21
  • Please keep the link to this article when reposting: https://go.coder-hub.com/64270081.html