Ansible - starting a task after CPU load is below 2.0

Question


I am trying to create a playbook that runs a simple debug task once the CPU load drops below 2.0.

I have this so far in cpu-load.yml:

---
- name: Check CPU load and wait
  hosts: localhost
  gather_facts: yes
  
  tasks:
    - name: Check cpu load
      shell: uptime | awk -F 'load average:' '{print $2}' | awk -F ', ' '{print $1}'
      register: cpu_load
      
    - name: Wait until cpu load is below 2.0
      wait_for:
        timeout: 300
        delay: 10
        shell: Do something here
        msg: "cpu load is below 2.0"
      
    - name: Continue jobs
      debug:
        msg: "CPU load is below 2.0. Continue!!!"

Now I am not sure how to make the task wait for the CPU load to go below 2.0. Can you guys help?

Answer 1

Score: 3


You need to put an until loop around your "check cpu load" task:

- hosts: localhost
  gather_facts: false
  tasks:
    - name: Check cpu load
      shell: uptime | awk -F 'load average:' '{print $2}' | awk -F ', ' '{print $1}'
      register: cpu_load
      until: cpu_load.stdout|float < 2.0
      retries: 300
      delay: 1

    - name: Some other task
      debug:
        msg: hello world

This will wait up to five minutes (300 retries with a 1-second delay) for the load average to drop below 2.0.
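
As a minimal sketch of the same until loop with the threshold and timing pulled out into variables: the names max_load, load_retries and load_delay are hypothetical (they are not part of the answer above), and changed_when: false is an addition so that the read-only check is not reported as a change.

```yaml
- hosts: localhost
  gather_facts: false
  vars:
    # Hypothetical variable names, chosen for this sketch only.
    max_load: 2.0
    load_retries: 300
    load_delay: 1
  tasks:
    - name: Check cpu load
      shell: uptime | awk -F 'load average:' '{print $2}' | awk -F ', ' '{print $1}'
      register: cpu_load
      # The command only reads state, so never report it as "changed".
      changed_when: false
      # Cast both sides to float in case max_load is passed in as a string.
      until: cpu_load.stdout | float < max_load | float
      retries: "{{ load_retries }}"
      delay: "{{ load_delay }}"

    - name: Some other task
      debug:
        msg: hello world
```

With this layout the threshold can be overridden per run, for example with -e max_load=4.0, without editing the task itself.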


There are probably better ways to get the current 1-minute CPU load; reading from /proc/loadavg may be easiest:

- hosts: localhost
  gather_facts: false
  tasks:
    - name: Check cpu load
      command: cat /proc/loadavg
      register: cpu_load
      until: cpu_load.stdout.split()|first|float < 2.0
      retries: 300
      delay: 1

    - name: Some other task
      debug:
        msg: hello world
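
If the load never drops, the until task fails once its retries run out. One possible way to turn that into a clearer error is to wrap the check in a block with a rescue section. This is only a sketch: the task names and the failure message are made up for illustration, and changed_when: false is an addition.

```yaml
- hosts: localhost
  gather_facts: false
  tasks:
    - name: Wait for the load to drop, with an explicit failure message
      block:
        - name: Check cpu load
          command: cat /proc/loadavg
          register: cpu_load
          # Reading /proc/loadavg changes nothing on the host.
          changed_when: false
          until: cpu_load.stdout.split() | first | float < 2.0
          retries: 300
          delay: 1
      rescue:
        # Runs only if the check above still fails after all retries.
        - name: Report that the load never dropped
          fail:
            msg: "1-minute load is still {{ cpu_load.stdout.split() | first }} after all retries"

    - name: Some other task
      debug:
        msg: hello world
```

The play still stops on a busy host, but the error now states the last load value that was read.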

Answer 2

Score: 2


Building on @larsks's answer, and going further regarding the sentence
> There are probably better ways to get the current 1-minute CPU load

There is actually a fact containing this information, at least on Linux, e.g.

$ ansible localhost -m setup -a gather_subset='!all,!min,loadavg'
localhost | SUCCESS => {
    "ansible_facts": {
        "ansible_loadavg": {
            "15m": 0.669921875,
            "1m": 0.48974609375,
            "5m": 0.4501953125
        },
        "gather_subset": [
            "!all",
            "!min",
            "loadavg"
        ],
        "module_setup": true
    },
    "changed": false
}

Notes:

  1. gather_subset='!all,!min,loadavg' ensures that strictly only the needed facts are gathered (and refreshed in the playbook below). For more information, see the option documentation.
  2. The loadavg subset is not documented at the time of writing, but it is listed among the allowed options in the module's error message in my Ansible 2.14.6 / 2.15.0 versions (see below).

Knowing this and applying the same recipe as in @larsks's answer, the check can be implemented as:

---
- name: Demo play to check load and continue
  hosts: localhost
  gather_facts: false

  tasks:
    - name: Check load before next tasks
      ansible.builtin.setup:
        gather_subset:
          - '!all'
          - '!min'
          - 'loadavg'
      retries: 30
      delay: 5
      until: ansible_loadavg['1m'] < 2.00

    - name: Now do something on a cooled down system
      debug:
        msg: I'm doing something without pressure
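
A possible variant, sketched under the assumption that the setup result has the same shape as the ad-hoc output shown earlier (an ansible_facts key containing ansible_loadavg): register the result and test that registered value, so the condition does not rely on facts being injected as top-level variables. The variable name load_facts is hypothetical.

```yaml
---
- name: Demo play to check load via the registered setup result
  hosts: localhost
  gather_facts: false

  tasks:
    - name: Check load before next tasks
      ansible.builtin.setup:
        gather_subset:
          - '!all'
          - '!min'
          - 'loadavg'
      # Hypothetical variable name holding the module result of each attempt.
      register: load_facts
      retries: 30
      delay: 5
      # Same threshold as above, read from the registered result.
      until: load_facts.ansible_facts.ansible_loadavg['1m'] < 2.00

    - name: Now do something on a cooled down system
      debug:
        msg: I'm doing something without pressure
```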

Regarding the undocumented subset for the setup module, here is a quick and dirty solution to list all accepted subsets in your version of the module:

$ ansible localhost -m setup -a gather_subset='toto'
localhost | FAILED! => {
    "changed": false,
    "msg": "Bad subset 'toto' given to Ansible. gather_subset options allowed: all, all_ipv4_addresses, all_ipv6_addresses, apparmor, architecture, caps, chroot, cmdline, date_time, default_ipv4, default_ipv6, devices, distribution, distribution_major_version, distribution_release, distribution_version, dns, effective_group_ids, effective_user_id, env, facter, fibre_channel_wwn, fips, hardware, interfaces, is_chroot, iscsi, kernel, kernel_version, loadavg, local, lsb, machine, machine_id, mounts, network, nvme, ohai, os_family, pkg_mgr, platform, processor, processor_cores, processor_count, python, python_version, real_user_id, selinux, service_mgr, ssh_host_key_dsa_public, ssh_host_key_ecdsa_public, ssh_host_key_ed25519_public, ssh_host_key_rsa_public, ssh_host_pub_keys, ssh_pub_keys, system, system_capabilities, system_capabilities_enforced, user, user_dir, user_gecos, user_gid, user_id, user_shell, user_uid, virtual, virtualization_role, virtualization_tech_guest, virtualization_tech_host, virtualization_type"
}

huangapple
  • Published on June 1, 2023 at 21:37:04
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76382480.html