Airflow BashOperator returns exit code 0 even when the task failed with exit code 1

Question

I am trying to run a Spark job on Kubernetes from Airflow's BashOperator. I have configured a failure callback (on_failure_callback) to call a function. However, even though the Spark job fails with exit code 1, my task is always marked as a success and the callback is never called. Here is a snippet of the Airflow log:

[2020-01-03 13:22:46,730] {{bash_operator.py:128}} INFO - 20/01/03 13:22:46 INFO LoggingPodStatusWatcherImpl: Container final statuses:
[2020-01-03 13:22:46,730] {{bash_operator.py:128}} INFO - 
[2020-01-03 13:22:46,730] {{bash_operator.py:128}} INFO - 
[2020-01-03 13:22:46,730] {{bash_operator.py:128}} INFO - 	 Container name: spark-kubernetes-driver
[2020-01-03 13:22:46,730] {{bash_operator.py:128}} INFO - 	 Container image: XXXXXXXXX.dkr.ecr.us-east-1.amazonaws.com/spark-py:XX_XX
[2020-01-03 13:22:46,730] {{bash_operator.py:128}} INFO - 	 Container state: Terminated
[2020-01-03 13:22:46,730] {{bash_operator.py:128}} INFO - 	 Exit code: 1
[2020-01-03 13:22:46,731] {{bash_operator.py:128}} INFO - 20/01/03 13:22:46 INFO Client: Application run_report_generator finished.
[2020-01-03 13:22:46,736] {{bash_operator.py:128}} INFO - 20/01/03 13:22:46 INFO ShutdownHookManager: Shutdown hook called
[2020-01-03 13:22:46,737] {{bash_operator.py:128}} INFO - 20/01/03 13:22:46 INFO ShutdownHookManager: Deleting directory /tmp/spark-adb99a7e-ce6c-49f6-8307-a17c28448043
[2020-01-03 13:22:46,761] {{bash_operator.py:132}} INFO - Command exited with return code 0
[2020-01-03 13:22:49,994] {{logging_mixin.py:95}} INFO - [2020-01-03 13:22:49,994] {{local_task_job.py:105}} INFO - Task exited with return code 0

Answer 1

Score: 1

You need to put set -e at the start of the bash command, so that the script stops at the first command that returns a non-zero exit code and BashOperator reports the error.
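
As a minimal sketch of what set -e changes, the snippet below runs the same two-line bash script with and without it and compares the exit codes. The script text and the helper name run_bash are made up for illustration, and false stands in for any failing command. Airflow is not involved here, only bash itself, on the assumption that the operator reports whatever exit code bash returns.

```python
import subprocess


def run_bash(script: str) -> int:
    """Run a script with bash and return its exit code (hypothetical helper)."""
    return subprocess.run(["bash", "-c", script], capture_output=True).returncode


# `false` stands in for any failing command; the final echo succeeds.
SCRIPT = "false\necho 'cleanup done'"

# Without set -e, bash reports the exit status of the last command (echo).
without_set_e = run_bash(SCRIPT)

# With set -e, bash stops at the first failing command and returns its status.
with_set_e = run_bash("set -e\n" + SCRIPT)

print(without_set_e, with_set_e)
```

The first run returns 0 because the last command succeeds, and the second returns 1 because the script stops at false. Note that set -e only helps when some command in the script actually returns a non-zero code; if the command itself exits with 0, an explicit check is still needed.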

Answer 2

Score: 0

You have to make sure that the last exit code is not 0.

Your log contains this line:

[2020-01-03 13:22:46,761] {{bash_operator.py:132}} INFO - Command exited with return code 0

Because of that, the bash operator treats the whole task as a success.

The solution is to make the command exit with an explicitly non-zero code, for example 1, whenever the job fails.

For example, in Python you can have:

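    import sys

    # condition_for_exiting is a placeholder for your own failure check
    if condition_for_exiting:
        sys.exit(1)  # a non-zero exit code makes the command fail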
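
Applied to the log in the question, one way to do this is to scan the command output for the container's "Exit code:" line and use that value as the exit code. This is only a sketch: it assumes the output always contains a line of that form, it treats a missing line as success, and driver_exit_code is a hypothetical helper, not part of Airflow or Spark.

```python
import re


def driver_exit_code(output: str) -> int:
    """Return the largest 'Exit code: N' value found in the output, or 0 if none."""
    codes = [int(m) for m in re.findall(r"Exit code:\s*(\d+)", output)]
    return max(codes, default=0)


# Two lines shaped like the ones in the question's log.
SAMPLE = (
    "INFO - \t Container state: Terminated\n"
    "INFO - \t Exit code: 1\n"
)

code = driver_exit_code(SAMPLE)
print(code)
```

In a wrapper script, passing this value to sys.exit would make the command, and therefore the task, fail when the driver container reports a non-zero exit code.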

huangapple
  • Posted on January 6, 2020, 20:35:48
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59612237.html