How do I assign Cloud Composer DAGs to run on specific Node Pools?

# Question

I am using Cloud Composer (Apache Airflow) on Google Cloud. Some of our processes require more resources than what's available on Composer's default node pool, so I've created an additional node pool within our cluster. The resource-intensive DAGs use the KubernetesPodOperator and specifically target the special node pool through the `affinity={ nodeAffinity... }` attribute.

My issue is that since creating the new node pool, I've noticed that ALL of my workloads are being scheduled on this new pool. How can I keep my normal workloads running on the default pool, while reserving the new node pool for special use cases?

Here is an example KubernetesPodOperator definition that targets the special pool. The regular KubernetesPodOperators don't have the affinity attribute filled out:

```python
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

KubernetesPodOperator(
    namespace='default',
    image="image_name",
    image_pull_policy='Always',
    name="example_name",
    task_id="example_name",
    get_logs=True,
    # Pin this pod to the resource-intensive node pool.
    affinity={
        'nodeAffinity': {
            'requiredDuringSchedulingIgnoredDuringExecution': {
                'nodeSelectorTerms': [{
                    'matchExpressions': [{
                        'key': 'cloud.google.com/gke-nodepool',
                        'operator': 'In',
                        'values': ['datascience-pool']
                    }]
                }]
            }
        }
    },
    is_delete_operator_pod=True,
    dag=dag)
```


# Answer 1
**Score**: 2

The KubernetesPodOperator does not have any default affinity preferences, so the decision that scheduled your normal workloads onto the new node pool was made by the Kubernetes scheduler. To avoid this, you will now have to set affinity on all instances of KubernetesPodOperator (which you can make somewhat less painful by using `default_args` and the `apply_defaults` Airflow decorator).

At least as of Cloud Composer versions up to v1.8.3, the Composer system pods will always run in the node pool `default-pool`. Therefore, you can use this to ensure your pods run in the Composer node pool instead of a custom one.
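
A minimal sketch of that approach (the DAG name, image, and task id are placeholders, and the import path assumes the Airflow 1.10 line that Composer shipped at the time): put a `default-pool` affinity block into the DAG's `default_args`, and every `KubernetesPodOperator` that does not pass its own `affinity` picks it up through `apply_defaults`.

```python
from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

# Affinity block that pins pods to Composer's default node pool.
DEFAULT_POOL_AFFINITY = {
    'nodeAffinity': {
        'requiredDuringSchedulingIgnoredDuringExecution': {
            'nodeSelectorTerms': [{
                'matchExpressions': [{
                    'key': 'cloud.google.com/gke-nodepool',
                    'operator': 'In',
                    'values': ['default-pool']
                }]
            }]
        }
    }
}

default_args = {
    'start_date': datetime(2020, 1, 1),
    # Operators are wrapped in apply_defaults, so any constructor argument
    # listed here (including affinity) is filled in automatically unless the
    # task passes its own value.
    'affinity': DEFAULT_POOL_AFFINITY,
}

with DAG('example_dag', default_args=default_args, schedule_interval=None) as dag:
    # Inherits the default-pool affinity from default_args.
    regular_task = KubernetesPodOperator(
        namespace='default',
        image='image_name',
        name='regular_task',
        task_id='regular_task',
        is_delete_operator_pod=True,
    )
```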




# Answer 2
**Score**: 0

I don't know if it is a workaround, but I have solved this issue by assigning affinity to all of the tasks. Tasks requiring high CPU or high memory are assigned to the respective node pool, and default tasks are assigned to `default-pool`. This resolves the issue; I have tested it in many flows.
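
As a sketch of what that per-task assignment can look like (the `pool_affinity` helper, DAG name, image, and task ids below are hypothetical, not from the original answer):

```python
from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator


def pool_affinity(pool_name):
    """Build a nodeAffinity block that pins a pod to the given GKE node pool."""
    return {
        'nodeAffinity': {
            'requiredDuringSchedulingIgnoredDuringExecution': {
                'nodeSelectorTerms': [{
                    'matchExpressions': [{
                        'key': 'cloud.google.com/gke-nodepool',
                        'operator': 'In',
                        'values': [pool_name]
                    }]
                }]
            }
        }
    }


dag = DAG('mixed_pools_example', start_date=datetime(2020, 1, 1), schedule_interval=None)

# Resource-hungry work is pinned to the custom pool...
heavy_task = KubernetesPodOperator(
    namespace='default',
    image='image_name',
    name='heavy_task',
    task_id='heavy_task',
    affinity=pool_affinity('datascience-pool'),
    is_delete_operator_pod=True,
    dag=dag)

# ...while everything else is pinned explicitly to default-pool.
light_task = KubernetesPodOperator(
    namespace='default',
    image='image_name',
    name='light_task',
    task_id='light_task',
    affinity=pool_affinity('default-pool'),
    is_delete_operator_pod=True,
    dag=dag)
```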



