Continued from Kubernetes Architecture (3).
After the user token (kubeconfig) has been configured, the kubeadm init output tells you that the Pod network still needs to be configured. The URL in that output provides a lot of useful information and is worth keeping at hand.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.227.10:6443 --token 8r3qrc.l6g3ygl4pncbk27b \
--discovery-token-ca-cert-hash sha256:46648385aa490e4ffdeb60c79f8b1b4797bfe3727a686925c8419e5f60816bd8
[root@k8s-master tmp]#
With initialization and the kubeconfig setup complete, the kubectl command is now functional. Use the following command to check the node information.
[root@k8s-master .kube]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master NotReady master 21h v1.14.1
The master node is shown in the NotReady state. It is NotReady because the cluster has no Pod networking yet; that capability is provided by a network plugin (CNI).
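A hedged way to confirm that the missing CNI plugin is what keeps the node NotReady (the exact kubelet message wording is an assumption for this version):
kubectl describe node k8s-master | grep -A 6 "Conditions:"     # the Ready condition should mention an uninitialized cni config
journalctl -u kubelet | grep -i "network plugin is not ready"   # kubelet's own complaint about the missing CNI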
The URL above lists the network plugins that Kubernetes supports. Here we choose Flannel. The plugin manifest has been staged locally under /UAT.
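If the Flannel manifest is not already staged locally, it can usually be fetched from the upstream flannel repository. A hedged sketch (the URL is an assumption; the file has moved between the coreos and flannel-io organizations over time):
curl -o /tmp/UAT/kube-flannel.yml \
  https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml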
11 Deploy the Flannel network (run on k8s-master only)
Step 1: Check the cluster state before deploying the network. At this point everything is abnormal because the network is missing.
Check the node status:
[root@k8s-master UAT]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master NotReady master 21h v1.14.1
Check the pod status:
[root@k8s-master UAT]# kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
coredns-584795fc57-6rstd 0/1 Pending 0 21h
coredns-584795fc57-mknx6 0/1 Pending 0 21h
etcd-k8s-master 1/1 Running 1 21h
kube-apiserver-k8s-master 1/1 Running 1 21h
kube-controller-manager-k8s-master 1/1 Running 1 21h
kube-proxy-kpz7n 1/1 Running 1 21h
kube-scheduler-k8s-master 1/1 Running 1 21h
[root@k8s-master UAT]#
Step 2: Deploy Flannel
Deploy the Flannel network plugin as instructed by the kubeadm init output.
[root@k8s-master ~]# kubectl apply -f /tmp/UAT/kube-flannel.yml
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds-amd64 created
daemonset.apps/kube-flannel-ds-arm64 created
daemonset.apps/kube-flannel-ds-arm created
daemonset.apps/kube-flannel-ds-ppc64le created
daemonset.apps/kube-flannel-ds-s390x created
[root@k8s-master ~]#
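A hedged intermediate check: wait for the Flannel DaemonSet to finish rolling out before re-checking the nodes (the DaemonSet name matches the output above; only the amd64 variant matters on this hardware):
kubectl -n kube-system rollout status daemonset/kube-flannel-ds-amd64
kubectl -n kube-system get daemonset    # DESIRED and READY counts should match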
Check the node status again; the master node has changed to Ready.
[root@k8s-master ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master Ready master 21h v1.14.1
[root@k8s-master ~]#
Check the pod status again (--namespace can be shortened to -n). The coredns pods are no longer Pending; here they show CrashLoopBackOff, a problem that is investigated and fixed later in this article.
[root@k8s-master ~]# kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
coredns-584795fc57-6rstd 0/1 CrashLoopBackOff 2 21h
coredns-584795fc57-mknx6 0/1 CrashLoopBackOff 2 21h
etcd-k8s-master 1/1 Running 1 21h
kube-apiserver-k8s-master 1/1 Running 1 21h
kube-controller-manager-k8s-master 1/1 Running 2 21h
kube-flannel-ds-amd64-pj8c7 1/1 Running 0 2m7s
kube-proxy-kpz7n 1/1 Running 1 21h
kube-scheduler-k8s-master 1/1 Running 2 21h
[root@k8s-master ~]#
Also check the status of docker and kubelet (these two are deployed as system processes, not pods), as shown below.
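A minimal sketch of that process-level check (both are systemd services set up earlier in this series):
systemctl status kubelet --no-pager | grep Active   # should report active (running)
systemctl status docker --no-pager | grep Active    # should report active (running)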
Explanation of the columns:
NAME: the pod name.
READY: in 1/1, the number on the right is the total number of containers in the pod, and the number on the left is how many of them are currently up.
12 Join the other nodes to k8s-master to form the cluster
Step 1: On k8s-node1 and k8s-node2, run the join command echoed after the k8s-master initialization in section 3.4.
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.227.10:6443 --token 8r3qrc.l6g3ygl4pncbk27b \
--discovery-token-ca-cert-hash sha256:46648385aa490e4ffdeb60c79f8b1b4797bfe3727a686925c8419e5f60816bd8
[root@k8s-master tmp]
Problem: after the command is issued, it hangs with no response. The symptom looks like this:
[root@k8s-node1 ~]# kubeadm join 192.168.227.10:6443 --token 8r3qrc.l6g3ygl4pncbk27b \
> --discovery-token-ca-cert-hash sha256:46648385aa490e4ffdeb60c79f8b1b4797bfe3727a686925c8419e5f60816bd8
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
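The IsDockerSystemdCheck line is only a warning; the hang itself turns out to be a token problem (analysed below). The warning can nevertheless be silenced by switching Docker to the systemd cgroup driver. A hedged sketch, assuming /etc/docker/daemon.json does not yet contain other settings:
cat > /etc/docker/daemon.json <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
systemctl daemon-reload
systemctl restart docker
systemctl restart kubelet    # kubelet's cgroup driver must match Docker's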
Troubleshooting the problem, and commands learned along the way
1. Check the node and pod status on the master; the coredns pods are in a bad state.
[root@k8s-master ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master Ready master 162m v1.14.1
[root@k8s-master ~]# kubectl get pod -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-584795fc57-48swl 0/1 CrashLoopBackOff 14 54m 10.244.0.23 k8s-master <none> <none>
coredns-584795fc57-4sklt 0/1 CrashLoopBackOff 14 54m 10.244.0.24 k8s-master <none> <none>
etcd-k8s-master 1/1 Running 0 53m 192.168.227.10 k8s-master <none> <none>
kube-apiserver-k8s-master 1/1 Running 0 53m 192.168.227.10 k8s-master <none> <none>
kube-controller-manager-k8s-master 1/1 Running 0 53m 192.168.227.10 k8s-master <none> <none>
kube-flannel-ds-amd64-6zdr2 1/1 Running 0 52m 192.168.227.10 k8s-master <none> <none>
kube-proxy-bcc2z 1/1 Running 0 54m 192.168.227.10 k8s-master <none> <none>
kube-scheduler-k8s-master 1/1 Running 0 53m 192.168.227.10 k8s-master <none> <none>
[root@k8s-master ~]#
2. Check the log of the coredns-584795fc57-48swl pod; the suspicion is either a network problem or a credential problem.
[root@k8s-master ~]# kubectl logs coredns-584795fc57-48swl -n kube-system
E0127 03:13:18.668414 1 reflector.go:134] github.com/coredns/coredns/plugin/kubernetes/controller.go:317: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: no route to host
E0127 03:13:18.668414 1 reflector.go:134] github.com/coredns/coredns/plugin/kubernetes/controller.go:317: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: no route to host
log: exiting because of error: log: cannot create log: open /tmp/coredns.coredns-584795fc57-48swl.unknownuser.log.ERROR.20220127-031318.1: no such file or directory
[root@k8s-master ~]#
3. Check the credentials first
Kubernetes access here is token based, so check whether the token is still usable.
3.1 Record the token and hash from the join command
kubeadm join 192.168.227.10:6443 --token jzfit0.pcgjeo5bv1mtycxm \
--discovery-token-ca-cert-hash sha256:c6c5b8591e41dd84b2fdb477b44b0ce0d81b679dded38b6f0ae50a18ee6a425c
3.2 On the master node, check whether the token and hash are still valid. It turns out the token no longer exists, so recreate it and make it valid again; see http://blog.51yip.com/cloud/2404.html and https://zhuanlan.zhihu.com/p/111687358. The main commands are shown below.
3.2.1 Problem
[root@k8s-node-1 ~]# kubeadm join 192.168.122.201:6443 --token fmqvwn.6h11y2ayq23r7zmw --discovery-token-ca-cert-hash sha256:42e125ef64f5aabc67ae0e0f14b58270be35fde8ff4f7b9a47d5d76a74a97c4a
W0107 17:53:50.512517 14686 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
error execution phase preflight: couldn't validate the identity of the API Server: abort connecting to API servers after timeout of 5m0s
3.2.2 Solution
A kubeadm bootstrap token is only valid for 24 hours by default, so it has to be regenerated to resolve the error above.
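A hedged shortcut: kubeadm of this era can also mint a fresh token and print the complete join command in a single step (the flag is an assumption worth confirming with kubeadm token create --help):
kubeadm token create --print-join-command    # prints a ready-to-run kubeadm join ... line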
3.2.2.1 List the tokens on the master node
[root@k8s-master ~]# kubeadm token list //no token exists any more
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
3.2.2.2 Create tokens
[root@k8s-master ~]# kubeadm token create --ttl 0 //create a token that never expires
a7awp1.xp48h5hztcd8b03g
[root@k8s-master ~]# kubeadm token create //create a token that is only valid for 1 day
n4po1r.6ysc0sb7b2cuj80l
[root@k8s-master ~]# kubeadm token list //list the tokens
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
a7awp1.xp48h5hztcd8b03g <forever> <never> authentication,signing <none> system:bootstrappers:kubeadm:default-node-token
n4po1r.6ysc0sb7b2cuj80l 23h 2022-01-28T09:41:41+08:00 authentication,signing <none> system:bootstrappers:kubeadm:default-node-token
Note the TTL column: one token shows <forever>, the other shows 23h.
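A never-expiring token is convenient in a lab but is a standing credential. A hedged cleanup step once all nodes have joined (the token value is the permanent one created above):
kubeadm token delete a7awp1.xp48h5hztcd8b03g
kubeadm token list    # the permanent token should now be gone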
3.2.2.3 Recompute the discovery-token CA certificate hash
[root@k8s-master ~]# ll /etc/kubernetes/pki/ //list the cluster PKI files; ca.crt is used for the hash
total 56
-rw-r--r--. 1 root root 1224 Jan 24 16:32 apiserver.crt
-rw-r--r--. 1 root root 1090 Jan 24 16:32 apiserver-etcd-client.crt
-rw-------. 1 root root 1675 Jan 24 16:32 apiserver-etcd-client.key
-rw-------. 1 root root 1679 Jan 24 16:32 apiserver.key
-rw-r--r--. 1 root root 1099 Jan 24 16:32 apiserver-kubelet-client.crt
-rw-------. 1 root root 1679 Jan 24 16:32 apiserver-kubelet-client.key
-rw-r--r--. 1 root root 1025 Jan 24 16:32 ca.crt
-rw-------. 1 root root 1679 Jan 24 16:32 ca.key
drwxr-xr-x. 2 root root 162 Jan 24 16:32 etcd
-rw-r--r--. 1 root root 1038 Jan 24 16:32 front-proxy-ca.crt
-rw-------. 1 root root 1675 Jan 24 16:32 front-proxy-ca.key
-rw-r--r--. 1 root root 1058 Jan 24 16:32 front-proxy-client.crt
-rw-------. 1 root root 1679 Jan 24 16:32 front-proxy-client.key
-rw-------. 1 root root 1679 Jan 24 16:32 sa.key
-rw-------. 1 root root 451 Jan 24 16:32 sa.pub
[root@k8s-master ~]# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
46648385aa490e4ffdeb60c79f8b1b4797bfe3727a686925c8419e5f60816bd8
3.2.2.4 Record the new token and the hash, rebuild the join command, and run it.
n4po1r.6ysc0sb7b2cuj80l
46648385aa490e4ffdeb60c79f8b1b4797bfe3727a686925c8419e5f60816bd8
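Reassembled from the two values above, the command to run on each worker looks like this (a sketch; substitute your own token and hash):
kubeadm join 192.168.227.10:6443 --token n4po1r.6ysc0sb7b2cuj80l \
    --discovery-token-ca-cert-hash sha256:46648385aa490e4ffdeb60c79f8b1b4797bfe3727a686925c8419e5f60816bd8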
3.3 After running the new join command the problem remains, but the token has now been ruled out as the cause.
4.1 Check the logs: kubectl logs gives the concrete error
[root@k8s-master ~]# kubectl get pod -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-584795fc57-48swl 0/1 CrashLoopBackOff 11 38m 10.244.0.23 k8s-master <none> <none>
coredns-584795fc57-4sklt 0/1 CrashLoopBackOff 11 38m 10.244.0.24 k8s-master <none> <none>
etcd-k8s-master 1/1 Running 0 37m 192.168.227.10 k8s-master <none> <none>
kube-apiserver-k8s-master 1/1 Running 0 37m 192.168.227.10 k8s-master <none> <none>
kube-controller-manager-k8s-master 1/1 Running 0 36m 192.168.227.10 k8s-master <none> <none>
kube-flannel-ds-amd64-6zdr2 1/1 Running 0 35m 192.168.227.10 k8s-master <none> <none>
kube-proxy-bcc2z 1/1 Running 0 38m 192.168.227.10 k8s-master <none> <none>
kube-scheduler-k8s-master 1/1 Running 0 36m 192.168.227.10 k8s-master <none> <none>
[root@k8s-master ~]# kubectl logs coredns-584795fc57-48swl -n kube-system
E0127 03:13:18.668414 1 reflector.go:134] github.com/coredns/coredns/plugin/kubernetes/controller.go:317: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: no route to host
E0127 03:13:18.668414 1 reflector.go:134] github.com/coredns/coredns/plugin/kubernetes/controller.go:317: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: no route to host
log: exiting because of error: log: cannot create log: open /tmp/coredns.coredns-584795fc57-48swl.unknownuser.log.ERROR.20220127-031318.1: no such file or directory
[root@k8s-master ~]#
4.2 Inspect the pod details: kubectl describe pod returns information that turns out not to be very useful here
[root@k8s-master ~]# kubectl describe pod coredns-584795fc57-48swl -n kube-system
4.3 Editing the coredns configuration has no effect either
kubectl edit deployment coredns -n kube-system
4.4 Force-deleting the coredns pod has no effect
kubectl delete po coredns-fb8b8dccf-hhkfm --grace-period=0 --force -n kube-system
4.5 The kubelet logs also point to coredns errors
journalctl -f -u kubelet
4.6 The local DNS configuration looks fine as well
[root@k8s-master ~]# cat /etc/resolv.conf
nameserver 219.141.136.10
nameserver 219.141.140.10
4.7 Final solution
The problem is most likely caused by corrupted or stale iptables rules. Running the following commands in order resolves it.
[root@k8s-master ~]# systemctl stop kubelet
[root@k8s-master ~]# systemctl stop docker
[root@k8s-master ~]# iptables --flush //Delete all rules in chain or all chains
[root@k8s-master ~]# iptables -t nat --flush //also flush the nat table
[root@k8s-master ~]# systemctl start kubelet
[root@k8s-master ~]# systemctl start docker
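After the flush and the restarts, it may also help to recreate the coredns pods so that they come back up against the clean rules. A hedged follow-up (kubeadm-deployed coredns pods normally carry the k8s-app=kube-dns label):
kubectl -n kube-system delete pod -l k8s-app=kube-dns
kubectl -n kube-system get pod -w    # watch until the new coredns pods reach Running 1/1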
Step 2: Check the node and pod status
[root@k8s-master ~]# kubectl get pod -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-584795fc57-4sklt 1/1 Running 16 3h57m 10.244.0.25 k8s-master <none> <none>
coredns-584795fc57-9nk6q 1/1 Running 0 3h1m 10.244.0.27 k8s-master <none> <none>
etcd-k8s-master 1/1 Running 1 3h56m 192.168.227.10 k8s-master <none> <none>
kube-apiserver-k8s-master 1/1 Running 1 3h57m 192.168.227.10 k8s-master <none> <none>
kube-controller-manager-k8s-master 1/1 Running 1 3h56m 192.168.227.10 k8s-master <none> <none>
kube-flannel-ds-amd64-6zdr2 1/1 Running 1 3h55m 192.168.227.10 k8s-master <none> <none>
kube-flannel-ds-amd64-hp2k7 1/1 Running 0 179m 192.168.227.12 k8s-node2 <none> <none>
kube-flannel-ds-amd64-wbbrw 1/1 Running 0 3h 192.168.227.11 k8s-node1 <none> <none>
kube-proxy-bcc2z 1/1 Running 1 3h57m 192.168.227.10 k8s-master <none> <none>
kube-proxy-c7kf4 1/1 Running 0 3h 192.168.227.11 k8s-node1 <none> <none>
kube-proxy-qpttz 1/1 Running 0 179m 192.168.227.12 k8s-node2 <none> <none>
kube-scheduler-k8s-master 1/1 Running 1 3h56m 192.168.227.10 k8s-master <none> <none>
[root@k8s-master ~]#
There are more pods than before. From the output we can conclude:
1. Every node (master and workers) runs a kube-proxy pod and a flannel pod.
2. etcd, kube-apiserver, kube-controller-manager and kube-scheduler each run as a single pod on the master.
3. The two coredns pods both run on the master.
4. All of the components above are managed as pods.
5. kubelet and docker are managed as system processes.
If all of the following hold:
1) kubectl get nodes shows every node in the Ready state;
2) kubectl get pod -n kube-system shows every pod Running;
3) kubelet is active on every node (process-managed);
4) docker is active on every node (process-managed);
then the Kubernetes cluster is fully configured and healthy. A scripted version of this checklist is sketched below.
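A minimal scripted form of the checklist, assuming it is run on the master and that passwordless SSH to the workers is available for the process-level checks (host names match this lab):
kubectl get nodes --no-headers | awk '$2 != "Ready" {print "node not ready: " $1}'
kubectl get pod -n kube-system --no-headers | awk '$3 != "Running" {print "pod not running: " $1}'
for h in k8s-master k8s-node1 k8s-node2; do
  echo "== $h =="; ssh "$h" 'systemctl is-active kubelet docker'
done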
Tip: aliases. Run the commands below and k can then be used in place of kubectl.
echo 'alias k=kubectl' >>~/.bashrc
echo 'complete -F __start_kubectl k' >>~/.bashrc
source ~/.bashrc
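For the complete -F __start_kubectl k line to take effect, kubectl's bash completion must already be loaded in the shell. If it is not, something like the following is needed first (an assumption about the shell setup; the bash-completion package must be installed):
echo 'source <(kubectl completion bash)' >>~/.bashrc
source ~/.bashrc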
[root@k8s-master log]# k get nodes
NAME STATUS ROLES AGE VERSION
k8s-master Ready master 4h48m v1.14.1
k8s-node1 Ready <none> 3h50m v1.14.1
k8s-node2 Ready <none> 3h49m v1.14.1
[root@k8s-master log]# k get cs //check the component status
NAME AGE
scheduler <unknown>
controller-manager <unknown>
etcd-0 <unknown>
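The table here shows only <unknown>, which with some kubectl/apiserver combinations is just a display quirk of the componentstatuses table printer. A hedged workaround is to read the full objects instead:
kubectl get componentstatuses -o yaml    # each item's conditions block should show type: Healthy, status: "True"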
[root@k8s-master log]# kubectl get pods --field-selector spec.nodeName=k8s-node2 -n kube-system //list the pods running on node2
NAME READY STATUS RESTARTS AGE
kube-flannel-ds-amd64-hp2k7 1/1 Running 0 3h54m
kube-proxy-qpttz 1/1 Running 0 3h54m
