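1. Startup failure caused by swap not being disabled
Because of a disk-mounting problem I rebooted the server. The server itself came back up fine, but the kubelet service did not. The error was: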
[root@k8s2 tmp]# systemctl status kubelet
kubelet.service - Kubernetes Kubelet Server
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Active: failed (Result: start-limit) since Thu 2022-11-24 11:02:16 CST; 421ms ago
Process: 9236 ExecStart=/usr/bin/kubelet $KUBELET_ARGS (code=exited, status=1/FAILURE)
Main PID: 9236 (code=exited, status=1/FAILURE)
Nov 24 11:02:16 k8s2 systemd[1]: Unit kubelet.service entered failed state.
Nov 24 11:02:16 k8s2 systemd[1]: kubelet.service failed.
Nov 24 11:02:16 k8s2 systemd[1]: kubelet.service holdoff time over, scheduling restart.
Nov 24 11:02:16 k8s2 systemd[1]: Stopped Kubernetes Kubelet Server.
Nov 24 11:02:16 k8s2 systemd[1]: start request repeated too quickly for kubelet.service
Nov 24 11:02:16 k8s2 systemd[1]: Failed to start Kubernetes Kubelet Server.
Nov 24 11:02:16 k8s2 systemd[1]: Unit kubelet.service entered failed state.
Nov 24 11:02:16 k8s2 systemd[1]: kubelet.service failed.
[root@k8s2 tmp]# journalctl -xu kubelet -f
-- Logs begin at Thu 2022-11-24 10:53:58 CST. --
Nov 24 11:02:16 k8s2 kubelet[9236]: E1123 22:02:16.731284 9236 run.go:74] "command failed" err="failed to run Kubelet: running with swap on is not supported, please disable swap! or set --fail-swap-on flag to false. /proc/swaps contained: [Filename\t\t\t\tType\t\tSize\tUsed\tPriority /dev/dm-1 partition\t4194300\t0\t-2]"
Nov 24 11:02:16 k8s2 systemd[1]: kubelet.service: main process exited, code=exited, status=1/FAILURE
Nov 24 11:02:16 k8s2 systemd[1]: Unit kubelet.service entered failed state.
Nov 24 11:02:16 k8s2 systemd[1]: kubelet.service failed.
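The key part of the error is: err="failed to run Kubelet: running with swap on is not supported, please disable swap!"
In other words: turn swap off. That puzzled me, because I had already run swapoff -a when I installed kubelet, so why was it asking again? After some searching I found the reason: swapoff -a only lasts until the next reboot. The swap entry in /etc/fstab is activated again at boot, so swap has to be disabled permanently.
Solution: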
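Before changing anything, it can help to confirm that swap really did come back after the reboot. The kubelet error quotes the contents of /proc/swaps, so the sketch below simply checks whether that file lists anything besides its header line. The function name swap_active is my own, not part of any tool.

```shell
# swap_active [FILE]
# Succeeds (exit 0) if the /proc/swaps-style FILE lists at least one swap area.
# FILE defaults to /proc/swaps; the parameter exists only so the check can be
# tried against a copy of the file.
swap_active() {
    # Line 1 is the column header; any further non-empty line is an active swap area.
    awk 'NR > 1 && NF { found = 1 } END { exit !found }' "${1:-/proc/swaps}"
}

# Usage on the node:
#   if swap_active; then echo "swap is still on"; else echo "swap is off"; fi
```

If this reports swap as on right after a reboot, even though swapoff -a was run earlier, the swap entry in /etc/fstab is the likely cause.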
swapoff -a
vim /etc/fstab
Comment out the swap line in /etc/fstab:
#
# /etc/fstab
# Created by anaconda on Mon Nov 7 00:37:04 2022
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/mapper/centos-root / xfs defaults 0 0
UUID=bc62c885-78ff-45ae-9670-fc26d9829e5e /boot xfs defaults 0 0
/dev/mapper/centos-home /home xfs defaults 0 0
#/dev/mapper/centos-swap swap swap defaults 0 0
UUID=55421852-192f-4d50-86db-d65b0e8c79e6 /var/lib/containerd xfs defaults 0 0
That fixed it.
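The manual edit can also be scripted, which is handy when several nodes need the same change. This is only a sketch: the function name comment_out_swap is made up for illustration, and the pattern assumes the swap entry has the word swap as a whitespace-separated field, as in the fstab shown here. It keeps a backup copy with a .bak suffix.

```shell
# comment_out_swap FILE
# Prefix every uncommented line of the fstab-style FILE that has "swap" as a
# whitespace-separated field with '#'. The original is saved as FILE.bak.
comment_out_swap() {
    sed -i.bak -E '/^[[:space:]]*[^#[:space:]].*[[:space:]]swap[[:space:]]/ s/^/#/' "$1"
}

# Usage on the node (as root):
#   swapoff -a                     # turn swap off for the running system
#   comment_out_swap /etc/fstab    # keep it off after the next reboot
#   cat /proc/swaps                # should show only the header line
```

Lines that are already commented out are left alone, so running it twice does not stack up extra # characters.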
2. Startup failure caused by systemd start order
Today I rebooted the server again, and once more kubelet failed to start automatically. The error was:
kubelet.service - Kubernetes Kubelet Server
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Active: failed (Result: start-limit) since Fri 2022-11-25 19:34:10 CST; 2min 42s ago
Process: 1153 ExecStart=/usr/bin/kubelet $KUBELET_ARGS (code=exited, status=1/FAILURE)
Main PID: 1153 (code=exited, status=1/FAILURE)
Nov 25 19:34:10 k8s2 kubelet[1153]: }. Err: connection error: desc = "transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: no such file or directory"
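In other words, containerd was not up yet when kubelet tried to connect to its socket.
Solution:
In /usr/lib/systemd/system/kubelet.service, change After= in the [Unit] section to After=containerd.target. Before the change mine was After=docker.target (I forgot to update it when I removed docker during the Kubernetes upgrade). systemd only picks up an edited unit file after systemctl daemon-reload or a reboot.
One caveat: After= only orders kubelet behind a unit that actually exists. containerd is normally installed as containerd.service, so if the problem comes back, check the unit name with systemctl list-unit-files and use After=containerd.service instead.
After the change everything worked.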
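An alternative to editing the packaged unit file directly is a systemd drop-in, which survives package updates that overwrite /usr/lib/systemd/system/kubelet.service. The sketch below assumes containerd runs as containerd.service on your machine; the function name write_kubelet_dropin and the file name 10-after-containerd.conf are my own choices, not anything kubelet or systemd requires.

```shell
# write_kubelet_dropin DIR
# Write a drop-in into DIR that orders kubelet after containerd.
# Assumption: the containerd unit is called containerd.service.
write_kubelet_dropin() {
    mkdir -p "$1"
    cat > "$1/10-after-containerd.conf" <<'EOF'
[Unit]
After=containerd.service
Wants=containerd.service
EOF
}

# Usage on the node (as root):
#   write_kubelet_dropin /etc/systemd/system/kubelet.service.d
#   systemctl daemon-reload          # make systemd read the new drop-in
#   systemctl reset-failed kubelet   # clear the start-limit failure state
#   systemctl restart kubelet
```

After= only controls ordering; Wants= additionally asks systemd to start containerd when kubelet is started. Settings in a drop-in are added to the ones in the main unit file, so the existing After= line does not have to be removed for this to take effect.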