1. Temporary workaround for a highly available K8S cluster when hardware resources are insufficient
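When resources are insufficient, power on only two of the master nodes:
⇒ The etcd cluster is deployed on the three masters, so at least two masters must stay up to keep etcd at its minimum working size (a 3-member cluster needs a quorum of 2).
⇒ In production the masters do not run business Pods, but when resources are short the master nodes are allowed to run business Pods directly.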
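The "at least two masters" rule follows from etcd's majority quorum. As a minimal sketch of that arithmetic (the function names `quorum` and `tolerated_failures` are illustrative, not part of any tool):

```shell
#!/usr/bin/env bash
# quorum N: minimum number of healthy members an N-member etcd cluster
# needs in order to keep accepting writes (a strict majority).
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# tolerated_failures N: how many members may be down while quorum holds.
tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 3 5; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

For the three-master layout here this gives a quorum of 2 and one tolerated failure, which is why master03 can be powered off but a second master cannot.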
Hostname | Role | IP | OS / application version | Power state |
---|---|---|---|---|
k8s-master01 | K8S cluster (master) | 172.26.37.121 | OS: AlmaLinux release 8.6; K8S version: v1.23.8; resources: 2C4G | On |
k8s-master02 | K8S cluster (master) | 172.26.37.122 | OS: AlmaLinux release 8.6; K8S version: v1.23.8; resources: 2C4G | On |
k8s-master03 | K8S cluster (master) | 172.26.37.123 | OS: AlmaLinux release 8.6; K8S version: v1.23.8; resources: 2C4G | Usually off |
k8s-node01 | K8S cluster (node) | 172.26.37.124 | OS: AlmaLinux release 8.6; K8S version: v1.23.8; resources: 2C4G | Usually off |
k8s-node02 | K8S cluster (node) | 172.26.37.125 | OS: AlmaLinux release 8.6; K8S version: v1.23.8; resources: 2C4G | Usually off |
k8s-master-lb | K8S cluster (master LB) | 172.26.37.126 | - | - |
Check the status of each node: only the two master nodes are running
# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master01 Ready <none> 276d v1.23.8
k8s-master02 Ready <none> 276d v1.23.8
k8s-master03 NotReady <none> 276d v1.23.8
k8s-node01 NotReady <none> 276d v1.23.8
k8s-node02 NotReady <none> 276d v1.23.8
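To check this in a script rather than by eye, the Ready nodes can be counted from the same output. This is a rough sketch; `count_ready` is a hypothetical helper, and it matches the STATUS column exactly, so a cordoned node shown as `Ready,SchedulingDisabled` would not be counted.

```shell
#!/usr/bin/env bash
# count_ready: read `kubectl get nodes --no-headers` output on stdin and
# print how many nodes have STATUS exactly "Ready" (NotReady is excluded).
count_ready() {
  awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Sample input copied from the output above. On a live cluster use:
#   kubectl get nodes --no-headers | count_ready
count_ready <<'EOF'
k8s-master01 Ready <none> 276d v1.23.8
k8s-master02 Ready <none> 276d v1.23.8
k8s-master03 NotReady <none> 276d v1.23.8
k8s-node01 NotReady <none> 276d v1.23.8
k8s-node02 NotReady <none> 276d v1.23.8
EOF
# prints 2
```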
Confirm the working state of the etcd cluster
# export ETCDCTL_API=3
# etcdctl --endpoints="172.26.37.123:2379,172.26.37.122:2379,172.26.37.121:2379" --cacert=/etc/kubernetes/pki/etcd/etcd-ca.pem --cert=/etc/kubernetes/pki/etcd/etcd.pem --key=/etc/kubernetes/pki/etcd/etcd-key.pem endpoint status --write-out=table
{"level":"warn","ts":"2023-03-23T15:17:52.600+0800","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc00041e540/172.26.37.123:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"transport: Error while dialing dial tcp 172.26.37.123:2379: connect: no route to host\""}
Failed to get the status of endpoint 172.26.37.123:2379 (context deadline exceeded)
+--------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+--------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 172.26.37.122:2379 | c79a1101ab7dd89c | 3.5.1 | 6.4 MB | true | false | 49 | 129359 | 129359 | |
| 172.26.37.121:2379 | 7ee2e2811cb6a7f9 | 3.5.1 | 6.4 MB | false | false | 49 | 129359 | 129359 | |
+--------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
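The timeout warning appears because the powered-off member 172.26.37.123 is still in the endpoint list. One possible way to avoid it is to query only the members that are up. The sketch below assumes the same certificate paths and port 2379 as the command above; `join_endpoints` and `etcd_health` are illustrative helper names.

```shell
#!/usr/bin/env bash
# join_endpoints IP...: print "ip1:2379,ip2:2379,..." for etcdctl --endpoints.
join_endpoints() {
  local out="" ip
  for ip in "$@"; do
    out="${out:+$out,}${ip}:2379"
  done
  echo "$out"
}

# etcd_health IP...: run `etcdctl endpoint health` against the given members,
# reusing the certificate paths from the status command above.
etcd_health() {
  ETCDCTL_API=3 etcdctl \
    --endpoints="$(join_endpoints "$@")" \
    --cacert=/etc/kubernetes/pki/etcd/etcd-ca.pem \
    --cert=/etc/kubernetes/pki/etcd/etcd.pem \
    --key=/etc/kubernetes/pki/etcd/etcd-key.pem \
    endpoint health --write-out=table
}

# Usage on a master node, listing only the two powered-on members:
#   etcd_health 172.26.37.121 172.26.37.122
```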
Check the taints on the two master nodes
# kubectl describe node k8s-master01|grep Taints
Taints: node-role.kubernetes.io/master:NoSchedule
# kubectl describe node k8s-master02|grep Taints
Taints: node-role.kubernetes.io/master:NoSchedule
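Kubernetes taint effects:
- PreferNoSchedule: Kubernetes tries to avoid scheduling Pods onto a Node with this taint, unless no other Node is schedulable
- NoSchedule: Kubernetes does not schedule Pods onto a Node with this taint, but Pods already on the Node are not affected
- NoExecute: Kubernetes does not schedule Pods onto a Node with this taint, and also evicts Pods already on the Node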
Make the master02 node schedulable
# kubectl taint nodes k8s-master02 node-role.kubernetes.io/master=:NoSchedule-
node/k8s-master02 untainted
# kubectl describe node k8s-master02|grep Taints
Taints: <none>
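Taint syntax: kubectl taint node [node] key=value:[effect]
[effect] is one of: [ NoSchedule | PreferNoSchedule | NoExecute ]
NoSchedule: Pods must not be scheduled onto the node
PreferNoSchedule: avoid scheduling Pods onto the node where possible
NoExecute: no new Pods are scheduled, and Pods already on the Node are evicted
Examples: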
View taints:
# kubectl describe nodes k8s-master02 |grep Taints
Add a taint:
# kubectl taint nodes k8s-master01 node-role.kubernetes.io/master=:NoSchedule
Remove a taint (append a minus sign to the taint):
# kubectl taint nodes k8s-master01 node-role.kubernetes.io/master=:NoSchedule-
Add a role label to a node:
# kubectl label nodes k8s-master01 node-role.kubernetes.io/node=
Remove a role label from a node:
# kubectl label nodes k8s-master01 node-role.kubernetes.io/node-
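The taint syntax can be sketched as a small helper that assembles the argument and rejects an unknown effect. `taint_spec` is a hypothetical function, not a kubectl feature. The sketch only prints the commands, and it uses a `PreferNoSchedule` taint purely as an illustration of a softer alternative to removing the taint altogether.

```shell
#!/usr/bin/env bash
# taint_spec KEY VALUE EFFECT: print "KEY=VALUE:EFFECT", or "KEY:EFFECT"
# when VALUE is empty; fail on an effect Kubernetes does not define.
taint_spec() {
  local key=$1 value=$2 effect=$3
  case "$effect" in
    NoSchedule|PreferNoSchedule|NoExecute) ;;
    *) echo "unknown effect: $effect" >&2; return 1 ;;
  esac
  if [ -n "$value" ]; then
    echo "${key}=${value}:${effect}"
  else
    echo "${key}:${effect}"
  fi
}

# Print (do not run) the commands; drop the leading "echo" to apply them.
echo kubectl taint nodes k8s-master02 "$(taint_spec node-role.kubernetes.io/master "" PreferNoSchedule)"
echo kubectl taint nodes k8s-master02 "$(taint_spec node-role.kubernetes.io/master "" NoSchedule)-"
```

The trailing `-` on the second command is what turns an add into a removal, as in the examples above.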
2. Verify that the K8S cluster is still usable
Deploy a busybox Pod
# cat<<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
  - name: busybox
    image: busybox:1.28
    command:
      - sleep
      - "3600"
    imagePullPolicy: IfNotPresent
  restartPolicy: Always
EOF
The Pod is deployed on the master02 node
# kubectl get po -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
busybox 1/1 Running 0 76s 172.36.122.144 k8s-master02 <none> <none>
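The same placement check can be scripted. The sketch below assumes the Pod lives in the current namespace and uses `kubectl wait` and a `jsonpath` query instead of parsing the table; `pod_on_node` is an illustrative helper name.

```shell
#!/usr/bin/env bash
# pod_on_node POD NODE: succeed only if POD becomes Ready within 120s
# and is running on NODE; prints the node that was actually found.
pod_on_node() {
  local pod=$1 want=$2 got
  kubectl wait --for=condition=Ready "pod/$pod" --timeout=120s >/dev/null || return 1
  got=$(kubectl get pod "$pod" -o jsonpath='{.spec.nodeName}')
  echo "pod=$pod node=$got"
  [ "$got" = "$want" ]
}

# Usage:
#   pod_on_node busybox k8s-master02 && echo "scheduled on the untainted master"
```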
Log in to the container and verify the network
# kubectl exec -it busybox -- /bin/sh
/ # nslookup kubernetes
Server: 192.168.0.10
Address 1: 192.168.0.10 kube-dns.kube-system.svc.cluster.local
Name: kubernetes
Address 1: 192.168.0.1 kubernetes.default.svc.cluster.local
/ # nslookup kube-dns.kube-system
Server: 192.168.0.10
Address 1: 192.168.0.10 kube-dns.kube-system.svc.cluster.local
Name: kube-dns.kube-system
Address 1: 192.168.0.10 kube-dns.kube-system.svc.cluster.local
/ # nslookup www.baidu.com
Server: 192.168.0.10
Address 1: 192.168.0.10 kube-dns.kube-system.svc.cluster.local
Name: www.baidu.com
Address 1: 14.119.104.189
Address 2: 14.215.177.38
/ #
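The three lookups can also be run without an interactive shell. This is a minimal sketch that assumes the `nslookup` in the busybox image exits non-zero when a name cannot be resolved; `check_dns` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# check_dns POD NAME...: resolve each NAME from inside POD via kubectl exec,
# print OK/FAIL per name, and return non-zero if any lookup failed.
check_dns() {
  local pod=$1 rc=0 name
  shift
  for name in "$@"; do
    if kubectl exec "$pod" -- nslookup "$name" >/dev/null 2>&1; then
      echo "OK $name"
    else
      echo "FAIL $name"
      rc=1
    fi
  done
  return $rc
}

# Usage, covering the in-cluster and external names checked above:
#   check_dns busybox kubernetes kube-dns.kube-system www.baidu.com
```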