GKE Volume Attach/mount error for regional persistent disk

Question
I am struggling with a volume attach error. I have a regional persistent disk in the same GCP project as my regional GKE cluster. My regional cluster is in europe-west2, with nodes in europe-west2-a, b and c. The regional disk is replicated across zones europe-west2-b and c.
I have an nfs-server deployment manifest which refers to the gcePersistentDisk. This is my deployment manifest:
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations: []
  labels:
    app.kubernetes.io/managed-by: Helm
  name: nfs-server
  namespace: namespace
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  selector:
    matchLabels:
      role: nfs-server
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        role: nfs-server
    spec:
      serviceAccountName: nfs-server
      containers:
        - image: gcr.io/google_containers/volume-nfs:0.8
          imagePullPolicy: IfNotPresent
          name: nfs-server
          ports:
            - containerPort: 2049
              name: nfs
              protocol: TCP
            - containerPort: 20048
              name: mountd
              protocol: TCP
            - containerPort: 111
              name: rpcbind
              protocol: TCP
          securityContext:
            privileged: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /data
              name: nfs-pvc
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
        - gcePersistentDisk:
            fsType: ext4
            pdName: my-regional-disk-name
          name: nfs-pvc
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: topology.gke.io/zone
                    operator: In
                    values:
                      - europe-west2-b
                      - europe-west2-c
and my PV/PVC manifests:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv
spec:
  accessModes:
    - ReadWriteMany
  capacity:
    storage: 200Gi
  nfs:
    path: /
    server: nfs-server.namespace.svc.cluster.local
  persistentVolumeReclaimPolicy: Retain
  volumeMode: Filesystem
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app.kubernetes.io/managed-by: Helm
  name: nfs-pvc
  namespace: namespace
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 8Gi
  storageClassName: ""
  volumeMode: Filesystem
  volumeName: nfs-pv
When I apply my deployment manifest above I get the following error:
'rpc error: code = Unavailable desc = ControllerPublish not permitted on node "projects/ap-mc-qa-xxx-xxxx/zones/europe-west2-a/instances/node-instance-id" due to backoff condition'
The volume attachment tells me this:
Attach Error: Message: rpc error: code = NotFound desc = ControllerPublishVolume could not find volume with ID projects/UNSPECIFIED/zones/UNSPECIFIED/disks/my-regional-disk-name: googleapi: Error 0: , notFound
These manifests seemed to work fine when deployed to a zonal cluster/disk. I've checked things like making sure the cluster service account has the necessary permissions. The disk is currently not in use.
What am I missing???
Answer 1

Score: 0
I think we should focus on the type of Nodes that make up your Kubernetes cluster.
> Regional persistent disks are restricted from being used with memory-optimized machines or compute-optimized machines.
> Consider using a non-regional persistent disk storage class if using a regional persistent disk is not a hard requirement. If using a regional persistent disk is a hard requirement, consider scheduling strategies such as taints and tolerations to ensure that the Pods that need regional PD are scheduled on a node pool that are not optimized machines.
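As a minimal sketch of that suggestion (not from the docs quoted above), one way to keep the nfs-server Pod off compute-/memory-optimized machines is node affinity on the cloud.google.com/machine-family node label, assuming the nodes carry that label; the excluded family values below are assumptions about this cluster's node pools, not facts from the question:

# Sketch only: require a non-optimized machine family and stay in the
# regional disk's replica zones. Family values are illustrative.
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: cloud.google.com/machine-family
                    operator: NotIn
                    values:
                      - c2   # compute-optimized
                      - m1   # memory-optimized
                      - m2
                  - key: topology.gke.io/zone
                    operator: In
                    values:
                      - europe-west2-b
                      - europe-west2-c

Alternatively, the optimized node pool could be tainted and the toleration simply omitted from this Deployment, which keeps the Pod off those nodes to the same effect.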
Answer 2

Score: 0
So the reason that the above won't work is that the regional persistent disk feature allows the creation of persistent disks that are available in two zones within the same region. In order to use that feature, the volume must be provisioned as a PersistentVolume; referencing the volume directly from a pod is not supported. Something like this:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv
spec:
  capacity:
    storage: 200Gi
  accessModes:
    - ReadWriteMany
  gcePersistentDisk:
    pdName: my-regional-disk
    fsType: ext4
Now trying to figure out how to reconfigure the NFS server to use a regional disk.
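For completeness, a possible shape for that reconfiguration (a sketch, not a confirmed fix): a pre-provisioned PV backed by the regional disk would usually also carry node affinity for the disk's replica zones, and the Deployment would mount it through a PVC instead of an inline gcePersistentDisk volume. The PV/PVC names below are illustrative; the disk name and zones are taken from the question:

# Sketch, assuming the pre-provisioned regional disk from the question.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: regional-pd-pv
spec:
  capacity:
    storage: 200Gi
  accessModes:
    - ReadWriteOnce   # a GCE PD cannot be ReadWriteMany
  gcePersistentDisk:
    pdName: my-regional-disk-name
    fsType: ext4
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
                - europe-west2-b
                - europe-west2-c
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: regional-pd-pvc
  namespace: namespace
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ""
  volumeName: regional-pd-pv
  resources:
    requests:
      storage: 200Gi

The nfs-server Deployment would then replace its inline gcePersistentDisk volume with a persistentVolumeClaim volume referencing regional-pd-pvc, while clients keep using the existing NFS-backed PV/PVC.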