Kubeadm, initializing a control plane cluster fails on preflight checks (system behind proxy)

Question

I am trying to bring up a control plane on the master node using kubeadm init. I have installed embedded Linux on the device, along with the dependencies that Kubernetes needs. Kubernetes is now installed, but I cannot initialize a control plane. My system:

  • The device is a Xilinx ZCU102 FPGA board.
  • The operating system is Yocto embedded Linux.
  • The Kubernetes version is v1.22.2-dirty.

The system is behind a company proxy. Kubernetes can access the internet (it can do version checks), and Docker can access the internet (it can pull images).
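Because the node sits behind a proxy, one thing worth ruling out is that node-local and cluster-internal traffic is being sent to the proxy. The sketch below only builds a NO_PROXY value. The function name build_no_proxy and the node IP in the example are hypothetical, 10.244.0.0/16 is the pod CIDR passed to kubeadm init in the answer, and 10.96.0.0/12 is kubeadm's default service CIDR. Nothing in the logs confirms that the proxy is the cause here.

```shell
# Build a comma-separated NO_PROXY list so that traffic to the node itself
# and to cluster-internal ranges bypasses the proxy.
# Arguments (all illustrative): node IP, pod CIDR, service CIDR.
build_no_proxy() {
    node_ip="$1"
    pod_cidr="$2"
    service_cidr="$3"
    printf '%s\n' "localhost,127.0.0.1,${node_ip},${pod_cidr},${service_cidr}"
}

# Example (192.168.1.10 is a placeholder for the node's address):
# export NO_PROXY="$(build_no_proxy 192.168.1.10 10.244.0.0/16 10.96.0.0/12)"
# export no_proxy="$NO_PROXY"
```

Variables exported this way only affect the current shell. Services started by systemd, such as docker and kubelet, would need the same value in their own unit environment.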

When I run kubeadm init, I get the following response:

The earlier part of the output is here: https://jpst.it/3hI4d

[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.

	Unfortunately, an error has occurred:
		timed out waiting for the condition

	This error is likely caused by:
		- The kubelet is not running
		- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

	If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
		- 'systemctl status kubelet'
		- 'journalctl -xeu kubelet'

	Additionally, a control plane component may have crashed or exited when started by the container runtime.
	To troubleshoot, list all containers using your preferred container runtimes CLI.

	Here is one example how you may list all Kubernetes containers running in docker:
		- 'docker ps -a | grep kube | grep -v pause'
		Once you have found the failing container, you can inspect its logs with:
		- 'docker logs CONTAINERID'

error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher

It tells me to check whether the kubelet is healthy. When I run systemctl status kubelet, I get the output below, which shows that the kubelet is running:

* kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: en)
    Drop-In: /lib/systemd/system/kubelet.service.d
             `-10-kubeadm.conf
     Active: active (running) since Tue 2023-07-04 14:59:27 UTC; 11min ago
       Docs: https://kubernetes.io/docs/home/
   Main PID: 12765 (kubelet)
      Tasks: 17 (limit: 4409)
     Memory: 63.7M
     CGroup: /system.slice/kubelet.service
             `-12765 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --confi...

Jul 04 15:10:56 xilinx-zcu102-20222 kubelet[12765]: E0704 15:10:56.788129   12765 kubelet.go:2407] "Error getting node" err="node \"xilinx-...not found"
Jul 04 15:10:57 xilinx-zcu102-20222 kubelet[12765]: I0704 15:10:57.408216   12765 cni.go:239] "Unable to update cni config" err="no network...cni/net.d"
Jul 04 15:10:57 xilinx-zcu102-20222 kubelet[12765]: E0704 15:10:57.492613   12765 kubelet.go:2407] "Error getting node" err="node \"xilinx-...not found"
Jul 04 15:10:57 xilinx-zcu102-20222 kubelet[12765]: E0704 15:10:57.593367   12765 kubelet.go:2407] "Error getting node" err="node \"xilinx-...not found"
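The kubeadm message above mentions cgroups, so a possible check (not something the logs confirm as the cause) is whether the kubelet and Docker use the same cgroup driver. This is a minimal sketch: the helper name kubelet_cgroup_driver is hypothetical, and /var/lib/kubelet/config.yaml is assumed to be where kubeadm wrote the kubelet configuration.

```shell
# Print the cgroupDriver value from a kubelet configuration file.
# The default path is an assumption (the location kubeadm normally uses).
kubelet_cgroup_driver() {
    config_file="${1:-/var/lib/kubelet/config.yaml}"
    awk '$1 == "cgroupDriver:" { print $2 }' "$config_file"
}

# On the node, compare it with the driver Docker reports:
# docker info --format '{{.CgroupDriver}}'
# kubelet_cgroup_driver
```

If the two values differ (for example cgroupfs on one side and systemd on the other), that mismatch would be worth fixing before looking further.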

Here are the journalctl logs of the kubelet.

You can also see the Docker containers that belong to Kubernetes:

docker ps -a | grep kube | grep -v pause                                                                                    
189a443c96ff   897a9db485af           "kube-apiserver --ad…"   30 seconds ago       Exited (1) 7 seconds ago                  k8s_kube-apiserver_kube-a5
53a685514bbc   2252d5eb703b           "etcd --advertise-cl…"   About a minute ago   Exited (1) About a minute ago             k8s_etcd_etcd-xilinx-zcu18
cde795979dd8   9894e0c256dc           "kube-controller-man…"   58 minutes ago       Up 58 minutes                             k8s_kube-controller-manag2
07ec1fd3ebb8   e06572384d3e           "kube-scheduler --au…"   58 minutes ago       Up 58 minutes                             k8s_kube-scheduler_kube-s3
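Following kubeadm's own hint, the logs of the exited containers (kube-apiserver and etcd in the listing above) are the next thing to read. A small sketch for picking their IDs out of that listing, where exited_container_ids is a hypothetical helper name:

```shell
# Read `docker ps -a` output on stdin and print the container ID
# (first column) of every line whose status contains "Exited (".
exited_container_ids() {
    awk '/Exited [(]/ { print $1 }'
}

# On the node, show the last log lines of each exited Kubernetes container:
# docker ps -a | grep kube | grep -v pause | exited_container_ids |
#     while read -r id; do docker logs --tail 20 "$id"; done
```

Since etcd exits first and the API server depends on it, reading the etcd log before the kube-apiserver log is probably the more useful order.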

Now I see that the kubelet constantly reports the error "Error getting node", and there is also one "Unable to update cni config" message.
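A CNI message of this kind is normally expected until a pod network add-on has been installed, so it may be unrelated to the failing init. As a rough sketch, the check below counts configuration files in the CNI directory. The log line is truncated, so the default /etc/cni/net.d is an assumption based on the conventional location, and cni_config_count is a hypothetical helper name.

```shell
# Count CNI configuration files directly inside a directory.
# The default directory is an assumption; the path in the log is cut off.
cni_config_count() {
    cni_dir="${1:-/etc/cni/net.d}"
    find "$cni_dir" -maxdepth 1 -type f \
        \( -name '*.conf' -o -name '*.conflist' -o -name '*.json' \) 2>/dev/null |
        wc -l | tr -d ' '
}

# On the node:
# cni_config_count
```

A count of 0 before any network add-on is applied would simply match the message in the log.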

I have tried the highest-rated solutions from a similar question, but the journalctl output on my system differs from the output in that question, so the problem is probably different as well. I could not find a solution online.

I have tried turning swap off as explained in that question, but it did not help:

    sudo swapoff -a
    sudo sed -i '/ swap / s/^/#/' /etc/fstab
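To confirm that swap really is off after these two commands, the kernel's swap table can be read directly. This is a minimal sketch, with swap_status as a hypothetical helper name. It takes an optional file argument only so that it can be tried against a sample file.

```shell
# Print "on" if the given /proc/swaps-style file lists an active swap area,
# otherwise "off". Defaults to the real /proc/swaps.
swap_status() {
    swaps_file="${1:-/proc/swaps}"
    # The first line is a header; every further line is an active swap area.
    if [ "$(tail -n +2 "$swaps_file" | grep -c .)" -gt 0 ]; then
        echo on
    else
        echo off
    fi
}

# On the node:
# swap_status
```

swapoff -a disables swap for the running system, while the sed edit to /etc/fstab only keeps it from coming back after a reboot.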

Can you please help?

Answer 1

Score: 0

It worked after I cleaned everything up and enabled the docker and kubelet services.

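# Make sure both services are running now and will start on boot.
systemctl start docker kubelet && systemctl enable docker kubelet
# Tear down the state left behind by the failed init.
sudo kubeadm reset
rm -rf .kube/          # run from the home directory
sudo rm -rf /etc/kubernetes/
sudo rm -rf /var/lib/kubelet/
sudo rm -rf /var/lib/etcd
# Re-initialize. The original command ended in a bare --v; the level 5 is an assumption.
kubeadm init --pod-network-cidr=10.244.0.0/16 --kubernetes-version=1.22.0 --v=5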
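After the re-init, the health probe that kubeadm runs (the curl call against http://localhost:10248/healthz quoted in the question) can also be polled by hand. The retry helper below is a hypothetical sketch, not part of kubeadm. It reruns any command until it succeeds or the attempts run out.

```shell
# Run a command up to max_attempts times, sleeping delay seconds between
# tries. Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
    max_attempts="$1"
    delay="$2"
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
        if [ "$attempt" -le "$max_attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# On the node: poll the kubelet health endpoint for up to about a minute.
# retry 30 2 curl -sSf http://localhost:10248/healthz
```

The attempt count and delay in the example are arbitrary.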

huangapple
  • Posted on July 4, 2023 at 20:59:52
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76612916.html