gitlab-ci.yml job failed (ERROR: Job failed: exit status 1)

Question

My GitLab CI job failed with the message ERROR: Job failed: exit status 1. This message is not informative enough for me to troubleshoot the error.

I am implementing CI/CD for a Node.js Express application. Before I build and deploy the server application, I gracefully stop the instance of the application that is currently running.

This is what I am trying to do inside the stop-job.

However, when I run the stop job, GitLab Runner fails with the message ERROR: Job failed: exit status 1.

This is my code, inside .gitlab-ci.yml:

stages:          # List of stages for jobs, and their order of execution
  - stop

stop-job:       # This job runs in the <stop> stage, which runs first.
  stage: stop
  script:
    - echo 'Stopping job ...'

    # Send a kill / shutdown message to a server application listening on port 3000 (and on port 80)
    - echo 'shutdown' | nc localhost 3000 || echo 'No process listening on port 3000'

    - |
      while true; do
          # Count the process using port 80
          process_count=$(lsof -i :80 | grep LISTEN | wc -l)

          # Check if no process is using port 80
          if [ "$process_count" -eq 0 ]; then
              echo "There is no application or process using port 80"
              break
          fi

          echo "Port 80 is currently in use. Retrying in 5 seconds..."
          # Wait 5 seconds before we retry again
          sleep 5
      done

      # We are out of the loop and the current instance using port 80 is closed
    - echo "Stopping job completed!"
  only:
    - main


I believe the error happens around this part of the code:

      while true; do
          # Count the process using port 80
          process_count=$(lsof -i :80 | grep LISTEN | wc -l)

Answer 1

Score: 1

When the command lsof -i :80 exits with status 1, it typically means that no process is currently listening on port 80. This may seem counterintuitive, but in this context an exit status of 1 does not necessarily indicate that the command failed. It signifies that lsof did not find any open files or connections associated with port 80.

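The shell was running with set -o pipefail. lsof -i :80 returns 1 when no process is using port 80, and pipefail makes a failure anywhere in a pipeline count (rather than only the last command's exit status), so the process_count pipeline was reported as failed. pipefail by itself only changes the pipeline's exit status; the script aborts on it when the shell also runs with set -e (errexit), which is presumably the case here given that the job stopped with exit status 1.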

If running with set -o pipefail, a failure at any stage in a shell pipeline will cause the entire pipeline to be considered failed.

To turn this off for the remainder of the current script, use set +o pipefail.
To turn it back on for the remainder of the current script, use set -o pipefail.
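
To see the effect in isolation, here is a minimal sketch, assuming bash. The command false is only a stand-in for lsof -i :80 finding nothing (exit status 1), and the variable names status_without and status_with are illustrative, not part of the original job.

```shell
#!/usr/bin/env bash
# Compare the exit status of the same pipeline with and without pipefail.
# `false` stands in for `lsof -i :80` when nothing uses the port (exit 1).

status_without=0
status_with=0

set +o pipefail
# Without pipefail the pipeline reports the status of `wc`, the last command.
false | wc -l > /dev/null || status_without=$?

set -o pipefail
# With pipefail the pipeline reports the failing stage's status instead.
false | wc -l > /dev/null || status_with=$?

set +o pipefail
echo "without pipefail: $status_without"   # prints "without pipefail: 0"
echo "with pipefail: $status_with"         # prints "with pipefail: 1"
```

The `|| status=$?` form records the status without letting errexit abort the demo if it is pasted into a stricter script.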

The following is the part of the script that needed the correction:

...

          # Count the process using port 80
          set +o pipefail
          process_count=$(lsof -i :80 | grep LISTEN | wc -l)
          set -o pipefail
          
...
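
An alternative that avoids toggling pipefail is to absorb the expected nonzero status with || true. This is only a sketch under assumptions: count_listeners, no_listener and one_listener are hypothetical names, the set -eo pipefail line is an assumed approximation of the job's shell options, and the two stand-in functions replace lsof so the example runs without it. Note that grep also exits with 1 when it matches nothing, so lsof is not the only stage that can trip pipefail.

```shell
#!/usr/bin/env bash
set -eo pipefail   # assumption: approximates the strict mode the job runs under

# Hypothetical helper: runs the given command and counts LISTEN lines in its
# output. `|| true` absorbs the nonzero status that lsof and grep return when
# nothing matches, so neither errexit nor pipefail aborts the script here.
count_listeners() {
    { "$@" 2>/dev/null | grep -c LISTEN; } || true
}

# Stand-ins for `lsof -i :80` (illustrative only).
no_listener() { return 1; }   # mimics lsof finding nothing: no output, exit 1
one_listener() { echo "node 123 user 20u IPv4 TCP *:http (LISTEN)"; }

idle_count=$(count_listeners no_listener)
busy_count=$(count_listeners one_listener)
echo "idle: $idle_count, busy: $busy_count"   # prints "idle: 0, busy: 1"
```

In the job itself this would be called as process_count=$(count_listeners lsof -i :80). The trade-off is that genuine lsof errors are swallowed as well, so the pipefail toggle shown above remains the more conservative fix.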

huangapple
  • Posted on July 11, 2023
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76657916.html