Cannot pull container image into GKE Autopilot from a private Artifact Registry even though both are in the same project

Question

According to the articles below, it seems that GKE can pull container images from Artifact Registry without any additional authentication when both are in the same project.

https://cloud.google.com/artifact-registry/docs/integrate-gke

https://www.youtube.com/watch?v=BfS7mvPA-og

https://stackoverflow.com/questions/73205712/error-imagepullbackoff-and-error-errimagepull-errors-with-gke

But when I tried it, I got an ImagePullBackOff error.
Did I make a mistake or misunderstand something? Or do I need to set up some other authentication?

Steps to reproduce

Google Cloud Shell, opened in any project at https://console.cloud.google.com, is a convenient place to run these steps.

Create an Artifact Registry repository

gcloud artifacts repositories create test \
    --repository-format=docker \
    --location=asia-northeast2

Push a sample image

gcloud auth configure-docker asia-northeast2-docker.pkg.dev
docker pull nginx
docker tag nginx asia-northeast2-docker.pkg.dev/${PROJECT_NAME}/test/sample-nginx-image
docker push asia-northeast2-docker.pkg.dev/${PROJECT_NAME}/test/sample-nginx-image

Create a GKE Autopilot cluster

I created the GKE Autopilot cluster using the Cloud console.

Almost all options were left at their defaults, but I changed the following:

  • Set the cluster name to test.
  • Set the region to the same one as the registry (in this case, asia-northeast2).
  • Enabled Anthos Service Mesh.

Deploy container image to GKE from Artifact Registry

gcloud container clusters get-credentials test --zone asia-northeast2
kubectl run test --image asia-northeast2-docker.pkg.dev/${PROJECT_NAME}/test/sample-nginx-image

Check Pod state

kubectl describe po test
Name:             test
Namespace:        default
Priority:         0
Service Account:  default
Node:             xxxxxxxxxxxxxxxxxxx
Start Time:       Wed, 08 Feb 2023 12:38:08 +0000
Labels:           run=test
Annotations:      autopilot.gke.io/resource-adjustment:
                    {"input":{"containers":[{"name":"test"}]},"output":{"containers":[{"limits":{"cpu":"500m","ephemeral-storage":"1Gi","memory":"2Gi"},"reque...
                  seccomp.security.alpha.kubernetes.io/pod: runtime/default
Status:           Pending
IP:               10.73.0.25
IPs:
  IP:  10.73.0.25
Containers:
  test:
    Container ID:
    Image:          asia-northeast2-docker.pkg.dev/${PROJECT_NAME}/test/sample-nginx-image
    Image ID:
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       ErrImagePull
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:                500m
      ephemeral-storage:  1Gi
      memory:             2Gi
    Requests:
      cpu:                500m
      ephemeral-storage:  1Gi
      memory:             2Gi
    Environment:          <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-szq85 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  kube-api-access-szq85:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Guaranteed
Node-Selectors:              <none>
Tolerations:                 kubernetes.io/arch=amd64:NoSchedule
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age   From                                   Message
  ----     ------     ----  ----                                   -------
  Normal   Scheduled  19s   gke.io/optimize-utilization-scheduler  Successfully assigned default/test to xxxxxxxxxxxxxxxxxxx
  Normal   Pulling    16s   kubelet                                Pulling image "asia-northeast2-docker.pkg.dev/${PROJECT_NAME}/test/sample-nginx-image"
  Warning  Failed     16s   kubelet                                Failed to pull image "asia-northeast2-docker.pkg.dev/${PROJECT_NAME}/test/sample-nginx-image": rpc error: code = Unknown desc = failed to pull and unpack image "asia-northeast2-docker.pkg.dev/${PROJECT_NAME}/test/sample-nginx-image:latest": failed to resolve reference "asia-northeast2-docker.pkg.dev/${PROJECT_NAME}/test/sample-nginx-image:latest": failed to authorize: failed to fetch oauth token: unexpected status: 403 Forbidden
  Warning  Failed     16s   kubelet                                Error: ErrImagePull
  Normal   BackOff    15s   kubelet                                Back-off pulling image "asia-northeast2-docker.pkg.dev/${PROJECT_NAME}/test/sample-nginx-image"
  Warning  Failed     15s   kubelet                                Error: ImagePullBackOff

The Pod then ended up in ImagePullBackOff.

Answer 1

Score: 3


This could be because the service account used by the GKE Autopilot nodes does not have permission to read from Artifact Registry, which would be consistent with the 403 Forbidden in the Pod events. You can grant the needed permission by adding the roles/artifactregistry.reader role on the repository to the service account that the Autopilot nodes are configured to use. The command below assumes the Compute Engine default service account, where <nnn> is the project number.

gcloud artifacts repositories add-iam-policy-binding <repository-name> \
  --location=<location> \
  --member=serviceAccount:<nnn>-compute@developer.gserviceaccount.com \
  --role="roles/artifactregistry.reader"
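If you do not know the project number, the command above can be wrapped in a small script that looks it up first. This is a minimal sketch, not a definitive fix: it assumes the nodes run as the Compute Engine default service account, and the function names default_compute_member and grant_registry_reader are hypothetical helpers introduced here for illustration.

```shell
# Sketch only: assumes the Autopilot nodes run as the Compute Engine default
# service account. The helper names below are made up for this example.

# Build the IAM member string for the default Compute Engine service account
# of the project with the given project number ($1).
default_compute_member() {
  printf 'serviceAccount:%s-compute@developer.gserviceaccount.com\n' "$1"
}

# Grant read access on one repository to that service account.
# $1 = project ID, $2 = repository name, $3 = repository location
grant_registry_reader() {
  # Look up the numeric project number from the project ID.
  project_number="$(gcloud projects describe "$1" \
    --format='value(projectNumber)')" || return 1

  gcloud artifacts repositories add-iam-policy-binding "$2" \
    --project="$1" \
    --location="$3" \
    --member="$(default_compute_member "${project_number}")" \
    --role='roles/artifactregistry.reader'
}

# Example call with the values from the question (requires an authenticated
# gcloud; replace my-project-id with your own project ID):
#   grant_registry_reader my-project-id test asia-northeast2
```

After the binding is in place, deleting and recreating the Pod makes the kubelet retry the pull right away instead of waiting out the back-off.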

Alternatively, try creating a new service account, granting it the permissions needed to pull the image, and then pulling the image again.

Simple troubleshooting steps are:

  1. Ensure that your GKE cluster is configured to allow access to Artifact Registry. You can do this by going to the GKE dashboard and making sure that the “Allow access to Artifact Registry” option is enabled.
  2. The container image you are trying to pull may not exist in Artifact Registry. Check the registry to make sure that the image was uploaded correctly and can be accessed.
  3. Look at the error logs for more information on what is causing the issue. The GKE documentation also has more information on troubleshooting image pull errors.
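The image and log checks in the steps above can be sketched as a few shell helpers. This is an illustrative sketch under assumptions, not part of the original answer: the function names are hypothetical, and the argument values in the usage comments are taken from the question.

```shell
# Sketch only: hypothetical helpers for the checks described above.

# Build the fully qualified image reference with an explicit tag.
# $1 = location, $2 = project ID, $3 = repository, $4 = image, $5 = tag (optional)
image_ref() {
  printf '%s-docker.pkg.dev/%s/%s/%s:%s\n' "$1" "$2" "$3" "$4" "${5:-latest}"
}

# Confirm the image was really pushed: list images and tags in the repository.
# $1 = location, $2 = project ID, $3 = repository
list_images() {
  gcloud artifacts docker images list "$1-docker.pkg.dev/$2/$3" --include-tags
}

# Show which members hold which roles on the repository.
# $1 = location, $2 = project ID, $3 = repository
show_repo_policy() {
  gcloud artifacts repositories get-iam-policy "$3" --project="$2" --location="$1"
}

# Show the events recorded for one Pod, including the pull error message.
# $1 = Pod name
pod_events() {
  kubectl get events --field-selector "involvedObject.name=$1"
}

# Example usage (requires authenticated gcloud and kubectl; replace
# my-project-id with your own project ID):
#   list_images asia-northeast2 my-project-id test
#   show_repo_policy asia-northeast2 my-project-id test
#   pod_events test
```

If the image is listed but the node service account is missing from the policy output, the problem is most likely permissions rather than a missing image.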

Posted by huangapple on 2023-02-08 16:22:39