Kubernetes NFS storage class - where is the persistent data located?
# Question
I'm testing with PostgreSQL in Kubernetes and I installed Kubegres as in the [official instructions][1] with a single exception: I defined my own NFS storage class and 2 persistence volumes.
And everything works perfectly:
- I have 2 pods(primary and secondary), if I create a table on a pod the table is synchronized on the second pod.
- If I restart all the Kubernetes cluster nodes(the control-plane and all workers) I can still find my data so I have persistence.
The problem is that I can't find the data where is supposed to be, seems like a stupid question... but is not in 192.168.88.3/caidoNFS as the storage class is configured.
I mounted 192.168.88.3/caidoNFS on another machine and this folder is empty. So something is wrong or I'm missing something essential.
The storage class is:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: caido-nfs
provisioner: caido.ro/caido-nfs
reclaimPolicy: Retain
parameters:
  server: 192.168.88.3
  path: /caidoNFS
  readOnly: "false"
```
I have 2 Persistent Volumes of 500Mi each - exactly the size requested by the Persistent Volume Claims:
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: caido-pv1
  labels:
    type: nfs
spec:
  storageClassName: caido-nfs
  capacity:
    storage: 500Mi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data1"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: caido-pv2
  labels:
    type: nfs
spec:
  storageClassName: caido-nfs
  capacity:
    storage: 500Mi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data2"
```
If I run **"kubectl get pod,statefulset,svc,configmap,pv,pvc -o wide"** this is the output:
```
NAME                 READY   STATUS    RESTARTS   AGE    IP              NODE                NOMINATED NODE   READINESS GATES
pod/mypostgres-1-0   1/1     Running   1          105m   192.168.80.3    kubernetesworker1   <none>           <none>
pod/mypostgres-2-0   1/1     Running   1          79m    192.168.80.66   kubernetesworker2   <none>           <none>

NAME                            READY   AGE    CONTAINERS     IMAGES
statefulset.apps/mypostgres-1   1/1     105m   mypostgres-1   postgres:14.1
statefulset.apps/mypostgres-2   1/1     79m    mypostgres-2   postgres:14.1

NAME                         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE    SELECTOR
service/kubernetes           ClusterIP   10.96.0.1    <none>        443/TCP    5d5h   <none>
service/mypostgres           ClusterIP   None         <none>        5432/TCP   79m    app=mypostgres,replicationRole=primary
service/mypostgres-replica   ClusterIP   None         <none>        5432/TCP   73m    app=mypostgres,replicationRole=replica

NAME                             DATA   AGE
configmap/base-kubegres-config   7      105m
configmap/kube-root-ca.crt       1      5d5h

NAME                         CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                 STORAGECLASS   REASON   AGE   VOLUMEMODE
persistentvolume/caido-pv1   500Mi      RWO            Retain           Bound    default/postgres-db-mypostgres-1-0   caido-nfs               79m   Filesystem
persistentvolume/caido-pv2   500Mi      RWO            Retain           Bound    default/postgres-db-mypostgres-2-0   caido-nfs               74m   Filesystem

NAME                                               STATUS   VOLUME      CAPACITY   ACCESS MODES   STORAGECLASS   AGE    VOLUMEMODE
persistentvolumeclaim/postgres-db-mypostgres-1-0   Bound    caido-pv1   500Mi      RWO            caido-nfs      105m   Filesystem
persistentvolumeclaim/postgres-db-mypostgres-2-0   Bound    caido-pv2   500Mi      RWO            caido-nfs      79m    Filesystem
```
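A quick way to see which backend a bound PV actually uses is to inspect its volume source; with the manifests above, `kubectl describe pv` reports a HostPath source rather than an NFS one:

```sh
# Show the volume source of the bound PV; for caido-pv1 as defined
# above this prints Type: HostPath with Path: /mnt/data1.
kubectl describe pv caido-pv1 | grep -A 4 'Source:'
```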
The description of my master PostgreSQL pod with the command "**kubectl describe pod mypostgres-1**" is:
```
Name:             mypostgres-1-0
Namespace:        default
Priority:         0
Service Account:  default
Node:             kubernetesworker1/192.168.88.71
Start Time:       Thu, 22 Jun 2023 07:07:17 +0000
Labels:           app=mypostgres
                  controller-revision-hash=mypostgres-1-6f46f6f669
                  index=1
                  replicationRole=primary
                  statefulset.kubernetes.io/pod-name=mypostgres-1-0
Annotations:      cni.projectcalico.org/containerID: 910d046ac8b269cd67a48d8334c36a6d8849ba34ca2161403101ba507856e339
                  cni.projectcalico.org/podIP: 192.168.80.12/32
                  cni.projectcalico.org/podIPs: 192.168.80.12/32
Status:           Running
IP:               192.168.80.12
IPs:
  IP:  192.168.80.12
Controlled By:  StatefulSet/mypostgres-1
Containers:
  mypostgres-1:
    Container ID:  cri-o://d3196998458acec1797f12279c00e8e58366764e57e0ad3b58f5617a85c7d421
    Image:         postgres:14.1
    Image ID:      docker.io/library/postgres@sha256:043c256b5dc621860539d8036d906eaaef1bdfa69a0344b4509b483205f14e63
    Port:          5432/TCP
    Host Port:     0/TCP
    Args:
      -c
      config_file=/etc/postgres.conf
      -c
      hba_file=/etc/pg_hba.conf
    State:          Running
      Started:      Thu, 22 Jun 2023 07:07:17 +0000
    Ready:          True
    Restart Count:  0
    Liveness:       exec [sh -c exec pg_isready -U postgres -h $POD_IP] delay=60s timeout=15s period=20s #success=1 #failure=10
    Readiness:      exec [sh -c exec pg_isready -U postgres -h $POD_IP] delay=5s timeout=3s period=10s #success=1 #failure=3
    Environment:
      POD_IP:                         (v1:status.podIP)
      PGDATA:                         /var/lib/postgresql/data/pgdata
      POSTGRES_PASSWORD:              <set to the key 'superUserPassword' in secret 'mypostgres-secret'>        Optional: false
      POSTGRES_REPLICATION_PASSWORD:  <set to the key 'replicationUserPassword' in secret 'mypostgres-secret'>  Optional: false
    Mounts:
      /docker-entrypoint-initdb.d/primary_create_replication_role.sh from base-config (rw,path="primary_create_replication_role.sh")
      /docker-entrypoint-initdb.d/primary_init_script.sh from base-config (rw,path="primary_init_script.sh")
      /etc/pg_hba.conf from base-config (rw,path="pg_hba.conf")
      /etc/postgres.conf from base-config (rw,path="postgres.conf")
      /var/lib/postgresql/data from postgres-db (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-tlgnh (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  postgres-db:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  postgres-db-mypostgres-1-0
    ReadOnly:   false
  base-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      base-kubegres-config
    Optional:  false
  kube-api-access-tlgnh:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age  From               Message
  ----     ------            ---- ----               -------
  Warning  FailedScheduling  56s  default-scheduler  0/3 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/3 nodes are available: 3 No preemption victims found for incoming pod..
  Normal   Scheduled         54s  default-scheduler  Successfully assigned default/mypostgres-1-0 to kubernetesworker1
  Normal   Pulled            54s  kubelet            Container image "postgres:14.1" already present on machine
  Normal   Created           54s  kubelet            Created container mypostgres-1
  Normal   Started           54s  kubelet            Started container mypostgres-1
```
The message "pod has unbound immediate PersistentVolumeClaims" appears even if I have a Persistent Volume created and bound. Maybe this error appears because of some initial timeout? This is the output of "**kubectl describe pvc postgres-db-mypostgres-1-0**":
```
Name:          postgres-db-mypostgres-1-0
Namespace:     default
StorageClass:  caido-nfs
Status:        Bound
Volume:        caido-pv1
Labels:        app=mypostgres
               index=1
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      500Mi
Access Modes:  RWO
VolumeMode:    Filesystem
Used By:       mypostgres-1-0
Events:        <none>
```
To summarize, there are 2 questions:

- Where is the persistent data located? I accessed \\192.168.88.3\caidoNFS on a Windows machine and mounted it via a Linux fstab entry (below; see also the note after this list), but the folder is empty.

  ```
  //192.168.88.3/caidoNFS /mnt/nfs cifs username=admin,dom=mydomain,password=...# 0 0
  ```

- Why does the message "pod has unbound immediate PersistentVolumeClaims" appear if the PVC is bound?
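Note that the fstab entry above mounts the share over CIFS/SMB rather than NFS. A minimal fstab line for mounting the same export over NFS, assuming 192.168.88.3 actually exports /caidoNFS via an NFS server (mount point and options are illustrative):

```
# Mount the NFS export directly; requires an NFS client (e.g. the
# nfs-common package on Debian/Ubuntu) on the mounting machine.
192.168.88.3:/caidoNFS  /mnt/nfs  nfs  defaults  0 0
```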
[1]: https://www.kubegres.io/doc/getting-started.html
</details>
# Answer 1
**Score**: 1
I found my data: it was in /mnt/data1 and /mnt/data2 on every worker node, i.e. exactly the `hostPath` paths declared in the PV specs. I guess the NFS failed for some unknown reason and the system created local storage on every worker node.

I probably didn't configure the NFS as I should have; I'll read more about this.
</details>