Kubeadm, initializing a control plane cluster fails on preflight checks (system behind proxy)
Question
I am trying to bring up a control plane on the master node using kubeadm init. I have installed embedded Linux on the device, along with the dependencies Kubernetes needs. Kubernetes is now installed and running, but I cannot initialize the control plane. My system:
- The device is a Xilinx ZCU102 FPGA board.
- The operating system is Yocto embedded Linux.
- The Kubernetes version is v1.22.2-dirty.
The system is behind a company proxy. Kubernetes can access the internet (it can do version checks), and Docker can access the internet (it can pull images).
When I run kubeadm init, I get the following response:
The earlier part of the output is here: https://jpst.it/3hI4d
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all Kubernetes containers running in docker:
- 'docker ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher
Here it tells me to check whether the kubelet is healthy. When I run systemctl status kubelet, I get the output below, which shows that the kubelet is running.
* kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: en)
Drop-In: /lib/systemd/system/kubelet.service.d
`-10-kubeadm.conf
Active: active (running) since Tue 2023-07-04 14:59:27 UTC; 11min ago
Docs: https://kubernetes.io/docs/home/
Main PID: 12765 (kubelet)
Tasks: 17 (limit: 4409)
Memory: 63.7M
CGroup: /system.slice/kubelet.service
`-12765 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --confi...
Jul 04 15:10:56 xilinx-zcu102-20222 kubelet[12765]: E0704 15:10:56.788129 12765 kubelet.go:2407] "Error getting node" err="node \"xilinx-...not found"
Jul 04 15:10:57 xilinx-zcu102-20222 kubelet[12765]: I0704 15:10:57.408216 12765 cni.go:239] "Unable to update cni config" err="no network...cni/net.d"
Jul 04 15:10:57 xilinx-zcu102-20222 kubelet[12765]: E0704 15:10:57.492613 12765 kubelet.go:2407] "Error getting node" err="node \"xilinx-...not found"
Jul 04 15:10:57 xilinx-zcu102-20222 kubelet[12765]: E0704 15:10:57.593367 12765 kubelet.go:2407] "Error getting node" err="node \"xilinx-...not found"
Here are the journalctl logs of the kubelet.
Here you can also see the Docker containers that belong to Kubernetes:
docker ps -a | grep kube | grep -v pause
189a443c96ff 897a9db485af "kube-apiserver --ad…" 30 seconds ago Exited (1) 7 seconds ago k8s_kube-apiserver_kube-a5
53a685514bbc 2252d5eb703b "etcd --advertise-cl…" About a minute ago Exited (1) About a minute ago k8s_etcd_etcd-xilinx-zcu18
cde795979dd8 9894e0c256dc "kube-controller-man…" 58 minutes ago Up 58 minutes k8s_kube-controller-manag2
07ec1fd3ebb8 e06572384d3e "kube-scheduler --au…" 58 minutes ago Up 58 minutes k8s_kube-scheduler_kube-s3
Now I see that the kubelet constantly reports the error "Error getting node", and there is one "Unable to update cni config" message.
I have tried the highest-rated solutions in this question. The journalctl output on my system is different from the one in that question, so the problem is probably different too, and I could not find a solution on the internet.
I have also tried turning swap off as explained in that question, but it did not work:
sudo swapoff -a
sudo sed -i '/ swap / s/^/#/' /etc/fstab
Can you please help?
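For the two containers listed above as Exited (1) (kube-apiserver and etcd), kubeadm's own hint is to read their logs with docker logs CONTAINERID. A minimal sketch of that step, assuming Docker is the container runtime as in the question. The function name exited_kube_ids and the '|'-separated format string are illustrative choices, not part of the original post:

```shell
#!/bin/sh
# Hypothetical helper: print the IDs of Kubernetes containers that have exited.
# Reads lines of the form "<id>|<status>|<name>" on stdin, so it can be
# checked without a running Docker daemon.
exited_kube_ids() {
    # Keep k8s_* containers whose status starts with "Exited";
    # skip the pause (k8s_POD_*) sandbox containers.
    awk -F '|' '$2 ~ /^Exited/ && $3 ~ /^k8s_/ && $3 !~ /^k8s_POD_/ { print $1 }'
}

# Usage on the node (requires docker, so it is left as a comment here):
#   docker ps -a --format '{{.ID}}|{{.Status}}|{{.Names}}' \
#     | exited_kube_ids \
#     | while read -r id; do
#         echo "== $id =="
#         docker logs --tail 20 "$id"
#       done
```

The last lines of the etcd and kube-apiserver logs usually say why the process exited, which narrows things down more than the kubelet's repeated "Error getting node" message does.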
Answer 1
Score: 0
It worked after I cleaned everything up and enabled the docker and kubelet services.
systemctl start docker kubelet && systemctl enable docker kubelet
sudo kubeadm reset
rm -rf .kube/
sudo rm -rf /etc/kubernetes/
sudo rm -rf /var/lib/kubelet/
sudo rm -rf /var/lib/etcd
kubeadm init --pod-network-cidr=10.244.0.0/16 --kubernetes-version=1.22.0 --v
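Because the cleanup above deletes cluster state irreversibly, it can help to preview the commands first. The following is only a sketch of the same sequence wrapped in a dry-run helper. The names run, reset_node and DRY_RUN are hypothetical, "systemctl enable --now" stands in for the separate start and enable calls, and "$HOME/.kube" assumes the answer's relative .kube/ path was meant to be the home directory. The final kubeadm init call is left out because its last flag is cut off in the answer.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper: with DRY_RUN=1 (the default) commands are
# only printed; with DRY_RUN=0 they are executed. Run as root for real use.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Same steps as in the answer, in the same order.
reset_node() {
    run systemctl enable --now docker kubelet   # start and enable both services
    run kubeadm reset                           # undo the previous kubeadm init
    run rm -rf "$HOME/.kube"                    # stale kubeconfig (assumed location)
    run rm -rf /etc/kubernetes /var/lib/kubelet /var/lib/etcd
}

# Preview:  DRY_RUN=1 reset_node
# Execute:  DRY_RUN=0 reset_node
```

Defaulting to dry-run is a deliberate choice here: nothing is removed unless DRY_RUN=0 is set explicitly.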