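1. Environment
A brief note on the servers I used:
All servers run CentOS 7.6; the scripts have not been tested on any other OS version.
Deployment script package
Link: https://pan.baidu.com/s/1S8yXIKTqQpXmF3SnULdCgQ
Extraction code: 16up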
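2. Preparation
1) Edit the following files
/data/config/environment.sh # change the IPs to those of the machines you will deploy to
/data/config/Kcsh/hosts # change the IPs to those of the machines you will deploy to
/data/config/Ketcd/etcd-csr.json # change the IPs to those of the machines you will deploy to
/data/config/Kmaster/Kha/haproxy.cfg # change the IPs to those of the machines you will deploy to
/data/config/Kmaster/Kapi/kubernetes-csr.json # change the IPs to those of the machines you will deploy to
/data/config/Kmaster/Kmanage/kube-controller-manager-csr.json # change the IPs to those of the machines you will deploy to
/data/config/Kmaster/Kscheduler/kube-scheduler-csr.json # change the IPs to those of the machines you will deploy to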
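Because the same edit is repeated across seven files, it can be scripted. The following is only a sketch, not part of the deployment package: `replace_ip` is a hypothetical helper, and it assumes GNU sed (as shipped with CentOS 7) and that you know which placeholder IP each file currently contains. Files that list several node IPs need one call per IP, and the result should still be reviewed by hand.

```shell
#!/bin/bash
# Hypothetical helper: replace one IP address with another in the given files.
# A .bak copy of each file is kept next to the original.
replace_ip() {
    local old_ip="$1" new_ip="$2"
    shift 2
    # Escape the dots so that "10.0.0.1" is matched literally by sed.
    local pattern="${old_ip//./\\.}"
    local f
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "skip (not found): $f" >&2
            continue
        fi
        # \b (GNU sed) keeps 10.0.0.1 from matching inside 10.0.0.10.
        sed -i.bak "s/\b${pattern}\b/${new_ip}/g" "$f"
    done
}
```

For example, `replace_ip <old-ip> 10.0.0.76 /data/config/Kcsh/hosts /data/config/Ketcd/etcd-csr.json` would rewrite both files in one go, where `<old-ip>` is whatever address the package shipped with.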
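2) Basic configuration
On the kube-master host, distribute the SSH public key to all nodes in bulk (non-interactive).
Note: follow the steps below exactly, otherwise the deployment script further down may not run to completion.
yum install -y sshpass # install sshpass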
cat /data/ip.txt
10.0.0.76
10.0.0.97
10.0.0.130
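Every later script iterates over this file, so a typo here fails late and confusingly. As a sketch (both function names are my own, not from the package), the list can be validated up front:

```shell
#!/bin/bash
# Return 0 if $1 is a dotted-quad IPv4 address with every octet in 0..255.
valid_ipv4() {
    local ip="$1" octet
    [[ "$ip" =~ ^[0-9]{1,3}(\.[0-9]{1,3}){3}$ ]] || return 1
    local IFS=.
    for octet in $ip; do
        # 10# forces base 10 so that "08" is not parsed as octal.
        (( 10#$octet <= 255 )) || return 1
    done
    return 0
}

# Report every non-blank line of the file that is not a valid address.
# Returns 1 if at least one bad entry was found.
check_ip_file() {
    local file="$1" line bad=0
    while IFS= read -r line || [ -n "$line" ]; do
        [ -z "$line" ] && continue
        if ! valid_ipv4 "$line"; then
            echo "invalid entry: $line" >&2
            bad=1
        fi
    done < "$file"
    return "$bad"
}
```

Running `check_ip_file /data/ip.txt` before the distribution script gives an early, explicit failure.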
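Script for distributing the public key and pushing the hosts file in bulk
#!/bin/bash
echo "################ Distribute public key in bulk (non-interactive) ####################"
# Source the init functions library, which provides the `action` helper
. /etc/init.d/functions
# create key pair
rm -fr /root/.ssh/id_rsa*
ssh-keygen -t rsa -f /root/.ssh/id_rsa -P "" -q
IPtest=$(cat /data/ip.txt)
# Push the public key to every host
for ip in $IPtest
do
    echo "=======Pushing key=========="
    # Assumes every host uses the same root password
    echo "$ip"
    sshpass -p 'BK#6u12G+rARoVoc-+9' ssh-copy-id -i /root/.ssh/id_rsa.pub -o StrictHostKeyChecking=no "root@$ip" &>/dev/null
    if [ $? -eq 0 ]
    then
        action "Host $ip [key distributed]" /bin/true
    else
        action "Host $ip [key distribution failed]" /bin/false
    fi
done
# Push the hosts file to every host
for ip in $IPtest
do
    echo "$ip"
    echo "=======Pushing hosts=========="
    # Adjust this path if your hosts file lives elsewhere (step 1 edits /data/config/Kcsh/hosts)
    scp /data/magic/config/Kcsh/hosts "root@$ip:/etc/hosts"
    if [ $? -eq 0 ]
    then
        action "Host $ip [hosts pushed]" /bin/true
    else
        action "Host $ip [hosts push failed]" /bin/false
    fi
done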
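The deployment depends on passwordless login working for every node, so it is worth confirming before moving on. A minimal sketch follows; `SSH_BIN` is a hypothetical override I added so the check can be dry-run with a stub, and `BatchMode=yes` makes ssh fail instead of prompting for a password.

```shell
#!/bin/bash
# Hypothetical override: point SSH_BIN at a stub to dry-run the check.
SSH_BIN="${SSH_BIN:-ssh}"

# Return 0 if root@$1 accepts key-based login without prompting.
can_login_with_key() {
    "$SSH_BIN" -o BatchMode=yes -o ConnectTimeout=5 "root@$1" true 2>/dev/null
}

# Print one status line per host listed in the file; return 1 if any host fails.
verify_key_login() {
    local ip_file="$1" ip failed=0
    while IFS= read -r ip || [ -n "$ip" ]; do
        [ -z "$ip" ] && continue
        # </dev/null stops ssh from swallowing the rest of the host list.
        if can_login_with_key "$ip" </dev/null; then
            echo "$ip OK"
        else
            echo "$ip FAILED"
            failed=1
        fi
    done < "$ip_file"
    return "$failed"
}
```

Calling `verify_key_login /data/ip.txt` after the distribution step prints one line per node and returns non-zero if any node still asks for a password.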
Set the hostnames; all-lowercase names are recommended
ssh -o StrictHostKeyChecking=no root@10.0.0.76 "hostname kube-master"
ssh -o StrictHostKeyChecking=no root@10.0.0.97 "hostname kube-node01"
ssh -o StrictHostKeyChecking=no root@10.0.0.130 "hostname kube-node02"
If a hostname contains uppercase letters, kubelet fails with: tokenGroups: Invalid value: []string{"system:bootstrappers:KUBE-MASTER"}: bootstrap group "system:bootstrappers:KUBE-MASTER" is invalid (must match system:bootstrappers:[a-z0-9:-]{0,255}[a-z0-9])
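The pattern quoted in that error message can be checked before a hostname is applied. This is a sketch with helper names of my own choosing; it tests only the part after `system:bootstrappers:`, which in this setup is the hostname.

```shell
#!/bin/bash
# Return 0 if $1 matches [a-z0-9:-]{0,255}[a-z0-9], the suffix pattern
# quoted in the kubelet error message.
valid_bootstrap_suffix() {
    local LC_ALL=C   # keep [a-z] from matching uppercase under some locales
    [[ "$1" =~ ^[a-z0-9:-]{0,255}[a-z0-9]$ ]]
}

# Print $1 converted to lowercase.
to_lower() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]'
}
```

For example, `valid_bootstrap_suffix "$(hostname)" || echo "rename this host"` flags a node whose name would be rejected.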
In magic.sh, change the hostnames after root@ to match your own
sed -n '322,324p' magic.sh
scp $base_dir/config/Kmaster/Kha/keepalived-master.conf root@kube-master:/etc/keepalived/keepalived.conf
scp $base_dir/config/Kmaster/Kha/keepalived-backup.conf root@kube-node01:/etc/keepalived/keepalived.conf
scp $base_dir/config/Kmaster/Kha/keepalived-backup.conf root@kube-node02:/etc/keepalived/keepalived.conf
3. Deployment
Deployment is simple: just run the magic.sh script
[root@kube-master data]# ll
total 508944
drwxr-xr-x 9 root root 4096 Sep 2 17:48 config
-rw-r--r-- 1 root root 26174 Sep 2 17:41 magic.sh # run this script directly
-rw-r--r-- 1 root root 521113600 Sep 2 17:02 magic.tar.gz
drwxr-xr-x 2 root root 4096 Sep 2 17:42 pack
drwxr-xr-x 2 root root 4096 Sep 3 10:36 script
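A few points are worth noting:
- 1. Before starting the deployment, check carefully that every configuration value matches your requirements, and adjust any that do not.
- 2. If the deployment stalls or exits abnormally, troubleshoot the stage where it stopped, then run the deployment script again; it resumes from where it left off.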
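One concrete pre-flight check, sketched here under the assumption that the hosts file uses the usual /etc/hosts layout, is to confirm that every node in the IP list has a hosts entry. `missing_from_hosts` is a hypothetical helper, not part of magic.sh.

```shell
#!/bin/bash
# Print each IP from the node list that has no entry in the hosts file.
# Returns 1 if at least one IP is missing.
missing_from_hosts() {
    local ip_file="$1" hosts_file="$2" ip missing=0
    while IFS= read -r ip || [ -n "$ip" ]; do
        [ -z "$ip" ] && continue
        # -F: fixed string, not a regex; -w: whole word, so 10.0.0.7
        # does not match inside 10.0.0.76.
        if ! grep -qwF -- "$ip" "$hosts_file"; then
            echo "$ip"
            missing=1
        fi
    done < "$ip_file"
    return "$missing"
}
```

For example, `missing_from_hosts /data/ip.txt /data/config/Kcsh/hosts` prints nothing when the two files agree.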
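4. Basic verification
Once the deployment has finished, the following checks give a first indication of whether the cluster is usable:
1) Check that all services have started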
cat > magic01_ckeck_server.sh << "EOF"
#!/bin/bash
# Check that all services have started
set -e
source /opt/k8s/bin/environment.sh
##set color##
echoRed() { echo $'\e[0;31m'"$1"$'\e[0m'; }
echoGreen() { echo $'\e[0;32m'"$1"$'\e[0m'; }
echoYellow() { echo $'\e[0;33m'"$1"$'\e[0m'; }
##set color##
for node_ip in ${NODE_IPS[@]}
do
echoGreen ">>> ${node_ip}"
ssh root@${node_ip} "systemctl status etcd|grep Active"
ssh root@${node_ip} "systemctl status flanneld|grep Active"
ssh root@${node_ip} "systemctl status haproxy|grep Active"
ssh root@${node_ip} "systemctl status keepalived|grep Active"
ssh root@${node_ip} "systemctl status kube-apiserver |grep 'Active:'"
ssh root@${node_ip} "systemctl status kube-controller-manager|grep Active"
ssh root@${node_ip} "systemctl status kube-scheduler|grep Active"
ssh root@${node_ip} "systemctl status docker|grep Active"
ssh root@${node_ip} "systemctl status kubelet | grep Active"
ssh root@${node_ip} "systemctl status kube-proxy|grep Active"
done
EOF
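Note that `grep Active` matches both `Active: active (running)` and `Active: inactive (dead)`, so the script above prints the state but does not judge it. As a sketch, a small filter (the name `active_state` is mine) can reduce the status output to the state word so it can be compared:

```shell
#!/bin/bash
# Read `systemctl status` output on stdin and print the state word that
# follows "Active:" (for example "active", "inactive" or "failed").
active_state() {
    awk '$1 == "Active:" { print $2; exit }'
}

# Example use against a node (not executed here):
#   state=$(ssh root@10.0.0.76 "systemctl status etcd" | active_state)
#   [ "$state" = "active" ] || echo "etcd is $state"
```

`systemctl is-active <unit>` prints the same state word directly and is an alternative when only the state matters.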
2) Check that the services are usable
2.1) Verify the etcd cluster
cat > magic02_verify_etcd.sh << "EOF"
#!/bin/bash
source /opt/k8s/bin/environment.sh
echoRed() { echo $'\e[0;31m'"$1"$'\e[0m'; }
echoGreen() { echo $'\e[0;32m'"$1"$'\e[0m'; }
echoYellow() { echo $'\e[0;33m'"$1"$'\e[0m'; }
# Verify the health of each etcd endpoint
for node_ip in ${NODE_IPS[@]}
do
echoGreen ">>> ${node_ip}"
ETCDCTL_API=3 /opt/k8s/bin/etcdctl \
--endpoints=https://${node_ip}:2379 \
--cacert=/etc/kubernetes/cert/ca.pem \
--cert=/etc/etcd/cert/etcd.pem \
--key=/etc/etcd/cert/etcd-key.pem endpoint health
done
EOF
2.2) Verify the flannel network
List the Pod subnets that have been allocated:
source /opt/k8s/bin/environment.sh
etcdctl \
--endpoints=${ETCD_ENDPOINTS} \
--ca-file=/etc/kubernetes/cert/ca.pem \
--cert-file=/etc/flanneld/cert/flanneld.pem \
--key-file=/etc/flanneld/cert/flanneld-key.pem \
ls ${FLANNEL_ETCD_PREFIX}/subnets
Output:
/kubernetes/network/subnets/172.30.84.0-24
/kubernetes/network/subnets/172.30.8.0-24
/kubernetes/network/subnets/172.30.29.0-24
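As the output shows, each key ends in the subnet with a dash in place of the slash. The sketch below (helper names are hypothetical) converts such a key back to CIDR form or to the bare network address, which is handy when feeding the subnets into a ping check instead of typing them by hand.

```shell
#!/bin/bash
# Convert /kubernetes/network/subnets/172.30.84.0-24 to 172.30.84.0/24.
subnet_key_to_cidr() {
    local name="${1##*/}"                 # keep the part after the last slash
    printf '%s/%s\n' "${name%-*}" "${name##*-}"
}

# Print only the network address, e.g. 172.30.84.0.
subnet_key_to_addr() {
    local name="${1##*/}"
    printf '%s\n' "${name%-*}"
}
```

Piping the `etcdctl ... ls` output through `while read -r key; do subnet_key_to_addr "$key"; done` yields one address per line.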
2.3) Verify that the nodes can reach each other over the Pod network:
Note: replace the subnet addresses below with your own
cat > magic03_ping_IP.sh << "EOF"
#!/bin/bash
source /opt/k8s/bin/environment.sh
echoRed() { echo $'\e[0;31m'"$1"$'\e[0m'; }
echoGreen() { echo $'\e[0;32m'"$1"$'\e[0m'; }
echoYellow() { echo $'\e[0;33m'"$1"$'\e[0m'; }
# Verify that the nodes can reach each other over the Pod network
for node_ip in ${NODE_IPS[@]}
do
echoGreen ">>> ${node_ip}"
ssh ${node_ip} "ping -c 2 172.30.8.0"
ssh ${node_ip} "ping -c 2 172.30.29.0"
ssh ${node_ip} "ping -c 2 172.30.84.0"
done
EOF
2.4) Verify the high-availability components
Find the node that holds the VIP and make sure the VIP can be pinged:
cat > magic04_verify_module.sh << "EOF"
#!/bin/bash
source /opt/k8s/bin/environment.sh
echoRed() { echo $'\e[0;31m'"$1"$'\e[0m'; }
echoGreen() { echo $'\e[0;32m'"$1"$'\e[0m'; }
echoYellow() { echo $'\e[0;33m'"$1"$'\e[0m'; }
# Find the node that holds the VIP and make sure the VIP can be pinged
for node_ip in ${NODE_IPS[@]}
do
echoGreen ">>> ${node_ip}"
ssh ${node_ip} "/usr/sbin/ip addr show ${VIP_IF}"
ssh ${node_ip} "ping -c 1 ${MASTER_VIP}"
done
EOF
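Reading three `ip addr show` dumps by eye to find the VIP is error-prone. A sketch of a filter that answers the question directly follows; `has_address` is a name I made up, and it relies only on the `inet <address>/<prefix>` lines that `ip addr show` prints.

```shell
#!/bin/bash
# Read `ip addr show` output on stdin; return 0 if the given IPv4 address
# is configured on the interface, 1 otherwise.
has_address() {
    local want="$1"
    awk -v want="$want" '
        $1 == "inet" { split($2, a, "/"); if (a[1] == want) found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Example use (not executed here):
#   ssh root@10.0.0.76 "/usr/sbin/ip addr show ${VIP_IF}" | has_address "${MASTER_VIP}" \
#       && echo "VIP is on 10.0.0.76"
```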
2.5) High-availability test
Check the current leader:
kubectl get endpoints kube-controller-manager --namespace=kube-system -o yaml
apiVersion: v1
kind: Endpoints
metadata:
annotations:
control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"kube-master_5b7afd9c-0c81-11ec-b95c-525400b20a8c","leaseDurationSeconds":15,"acquireTime":"2021-09-03T06:37:06Z","renewTime":"2021-09-03T06:46:06Z","leaderTransitions":0}'
creationTimestamp: 2021-09-03T06:37:06Z
name: kube-controller-manager
namespace: kube-system
resourceVersion: "957"
selfLink: /api/v1/namespaces/kube-system/endpoints/kube-controller-manager
uid: 5b7c2c9c-0c81-11ec-8ec4-5254006e8cb5
As shown, the current leader is the kube-master node.
Now stop kube-controller-manager on kube-master.
systemctl stop kube-controller-manager
systemctl status kube-controller-manager |grep Active
Active: inactive (dead) since Fri 2021-09-03 14:47:40 CST; 4s ago
After about a minute, check the current leader again:
kubectl get endpoints kube-controller-manager --namespace=kube-system -o yaml
apiVersion: v1
kind: Endpoints
metadata:
annotations:
control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"kube-node01_5bf8ccfa-0c81-11ec-9bdb-525400cc4651","leaseDurationSeconds":15,"acquireTime":"2021-09-03T06:47:57Z","renewTime":"2021-09-03T06:48:15Z","leaderTransitions":1}'
creationTimestamp: 2021-09-03T06:37:06Z
name: kube-controller-manager
namespace: kube-system
resourceVersion: "1117"
selfLink: /api/v1/namespaces/kube-system/endpoints/kube-controller-manager
uid: 5b7c2c9c-0c81-11ec-8ec4-5254006e8cb5
As you can see, leadership has automatically moved to kube-node01
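The leader is buried in a JSON annotation, so comparing before and after is easier with a small extractor. This is a sketch using sed rather than a JSON parser, and both function names are hypothetical; it assumes the `holderIdentity` value has the `<node>_<uuid>` form seen in the output above.

```shell
#!/bin/bash
# Read the Endpoints YAML on stdin and print the holderIdentity value
# from the leader annotation.
leader_identity() {
    sed -n 's/.*"holderIdentity":"\([^"]*\)".*/\1/p' | head -n 1
}

# Same input, but strip the _<uuid> suffix to leave only the node name.
leader_node() {
    local id
    id="$(leader_identity)"
    printf '%s\n' "${id%%_*}"
}

# Example use (not executed here):
#   kubectl get endpoints kube-controller-manager --namespace=kube-system -o yaml | leader_node
```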
2.6) Check that kube-proxy works
View the ipvs routing rules
cat > magic05_check_ipvs_rule.sh << "EOF"
#!/bin/bash
source /opt/k8s/bin/environment.sh
echoRed() { echo $'\e[0;31m'"$1"$'\e[0m'; }
echoGreen() { echo $'\e[0;32m'"$1"$'\e[0m'; }
echoYellow() { echo $'\e[0;33m'"$1"$'\e[0m'; }
# View the ipvs routing rules
for node_ip in ${NODE_IPS[@]}
do
echoGreen ">>> ${node_ip}"
ssh root@${node_ip} "/usr/sbin/ipvsadm -ln"
done
EOF
Output:
[root@kube-master script]# bash magic05_check_ipvs_rule.sh
>>> 10.0.0.76
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.254.0.1:443 rr persistent 10800
-> 10.0.0.76:6443 Masq 1 0 0
-> 10.0.0.97:6443 Masq 1 0 0
-> 10.0.0.130:6443 Masq 1 0 0
>>> 10.0.0.97
.......
>>> 10.0.0.130
......
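What matters in this output is that the virtual service 10.254.0.1:443 lists every kube-apiserver as a real server. As a sketch (the function name is mine), awk can pull out the backends of one virtual service so they can be counted or compared:

```shell
#!/bin/bash
# Read `ipvsadm -ln` output on stdin and print the real servers behind the
# virtual service given as address:port.
ipvs_backends() {
    local vip="$1"
    awk -v vip="$vip" '
        # A line starting with the protocol opens a new virtual service.
        $1 == "TCP" || $1 == "UDP" { in_svc = ($2 == vip); next }
        # Lines starting with "->" are the real servers of the current service.
        in_svc && $1 == "->" && $2 != "RemoteAddress:Port" { print $2 }
    '
}

# Example use (not executed here):
#   ssh root@10.0.0.76 "/usr/sbin/ipvsadm -ln" | ipvs_backends 10.254.0.1:443 | wc -l
```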
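2.7) Add kubectl to the PATH environment variable
Fix for the error when kubectl is not on the PATH:
1) Locate kubectl with: find / -name kubectl
In my environment kubectl is in /opt/k8s/bin/
2) Add that directory to the system PATH: edit /etc/profile (vim /etc/profile) and add: export PATH="/opt/k8s/bin/:$PATH"
3) Run: source /etc/profile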
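Sourcing a profile repeatedly with a plain `export PATH=...` line keeps prepending the same directory. A sketch of an idempotent variant follows; `path_prepend` is a hypothetical helper, not something the deployment package provides.

```shell
#!/bin/bash
# Prepend a directory to PATH unless it is already present.
path_prepend() {
    local dir="${1%/}"              # drop a trailing slash for a stable comparison
    case ":$PATH:" in
        *":$dir:"*|*":$dir/:"*) ;;  # already on PATH, nothing to do
        *) PATH="$dir:$PATH" ;;
    esac
}
```

With the function defined in /etc/profile, the added line would become `path_prepend /opt/k8s/bin/; export PATH`.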
List the cluster nodes:
[root@kube-master data]# kubectl get node
NAME STATUS ROLES AGE VERSION
kube-master Ready <none> 11m v1.10.4
kube-node01 Ready <none> 11m v1.10.4
kube-node02 Ready <none> 11m v1.10.4
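To turn this listing into a pass/fail check, the STATUS column can be filtered. The sketch below uses a function name of my own and treats anything other than exactly `Ready` as a problem, which also flags cordoned nodes such as `Ready,SchedulingDisabled`.

```shell
#!/bin/bash
# Read `kubectl get node` output on stdin and print the name of every node
# whose STATUS column is not exactly "Ready". The header line is skipped.
not_ready_nodes() {
    awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Example use (not executed here):
#   bad=$(kubectl get node | not_ready_nodes)
#   [ -z "$bad" ] || echo "nodes not ready: $bad"
```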
Create a test application:
cat > nginx-ds.yml <<EOF
apiVersion: v1
kind: Service
metadata:
name: nginx-ds
labels:
app: nginx-ds
spec:
type: NodePort
selector:
app: nginx-ds
ports:
- name: http
port: 80
targetPort: 80
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
name: nginx-ds
labels:
addonmanager.kubernetes.io/mode: Reconcile
spec:
template:
metadata:
labels:
app: nginx-ds
spec:
containers:
- name: my-nginx
image: nginx:1.7.9
ports:
- containerPort: 80
EOF
Apply the manifest. Before creating the resources, you can pull the image defined above in advance
[root@kube-master script]# kubectl create -f nginx-ds.yml
service "nginx-ds" created
daemonset.extensions "nginx-ds" created
Check Pod IP connectivity from each Node
[root@kube-master script]# kubectl get pods -o wide|grep nginx-ds
nginx-ds-kjclg 1/1 Running 0 4m 172.30.26.2 kube-master
nginx-ds-nl2c7 1/1 Running 0 4m 172.30.30.2 kube-node02
nginx-ds-vczsg 1/1 Running 0 4m 172.30.98.2 kube-node01
As shown, the nginx-ds Pod IPs are 172.30.26.2, 172.30.30.2 and 172.30.98.2. Ping these three IPs from every Node to check that they are reachable:
cat > magic06_ping_IP.sh << "EOF"
#!/bin/bash
source /opt/k8s/bin/environment.sh
echoRed() { echo $'\e[0;31m'"$1"$'\e[0m'; }
echoGreen() { echo $'\e[0;32m'"$1"$'\e[0m'; }
echoYellow() { echo $'\e[0;33m'"$1"$'\e[0m'; }
# Ping each Pod IP to check that it is reachable
for node_ip in ${NODE_IPS[@]}
do
echoGreen ">>> ${node_ip}"
ssh ${node_ip} "ping -c 1 172.30.26.2"
ssh ${node_ip} "ping -c 1 172.30.30.2"
ssh ${node_ip} "ping -c 1 172.30.98.2"
done
EOF
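The Pod IPs in the script above are hard-coded and change with every deployment. As a sketch, they can be read from the `kubectl get pods -o wide` listing instead; `pod_ips` is a hypothetical helper and assumes the IP is the sixth column, as in the output shown for this kubectl version.

```shell
#!/bin/bash
# Read `kubectl get pods -o wide` output on stdin and print the IP column
# for every pod whose name starts with the given prefix.
pod_ips() {
    local prefix="$1"
    awk -v p="$prefix" 'index($1, p) == 1 { print $6 }'
}

# Example use (not executed here):
#   for pod_ip in $(kubectl get pods -o wide | pod_ips nginx-ds); do
#       ssh root@10.0.0.76 "ping -c 1 ${pod_ip}"
#   done
```

Where the column layout differs, `kubectl get pods -l app=nginx-ds -o jsonpath='{.items[*].status.podIP}'` returns the same addresses without depending on column positions.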
Check that the Service IP and port are reachable
[root@kube-master script]# kubectl get svc |grep nginx-ds
nginx-ds NodePort 10.254.255.104 <none> 80:8556/TCP 5m
curl the Service IP from every Node:
cat > magic07_All_Node_Service_IP.sh << "EOF"
#!/bin/bash
source /opt/k8s/bin/environment.sh
echoRed() { echo $'\e[0;31m'"$1"$'\e[0m'; }
echoGreen() { echo $'\e[0;32m'"$1"$'\e[0m'; }
echoYellow() { echo $'\e[0;33m'"$1"$'\e[0m'; }
# curl the Service IP from every Node
for node_ip in ${NODE_IPS[@]}
do
echoGreen ">>> ${node_ip}"
ssh ${node_ip} "curl 10.254.255.104"
done
EOF
>>> 10.0.0.76
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 612 100 612 0 0 738k 0 --:--:-- --:--:-- --:--:-- 597k
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
body {
width: 35em;
margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif;
}
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p> # Note: seeing "Thank you for using nginx" means the test passed
</body>
</html>
>>> 10.0.0.97
.....
>>> 10.0.0.130
.....
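Scanning the full HTML on every node is tedious. A sketch of a one-line check for the welcome page follows; the function name is my own, and it keys on the `<title>` line visible in the output above.

```shell
#!/bin/bash
# Read an HTTP response body on stdin; return 0 if it is the nginx welcome page.
is_nginx_welcome() {
    grep -q '<title>Welcome to nginx!</title>'
}

# Example use (not executed here); -s hides curl's progress meter:
#   ssh root@10.0.0.76 "curl -s 10.254.255.104" | is_nginx_welcome && echo OK
```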
Check that the Service NodePort is reachable
cat > magic08_ckeck_service_NodePort.sh << "EOF"
#!/bin/bash
source /opt/k8s/bin/environment.sh
echoRed() { echo $'\e[0;31m'"$1"$'\e[0m'; }
echoGreen() { echo $'\e[0;32m'"$1"$'\e[0m'; }
echoYellow() { echo $'\e[0;33m'"$1"$'\e[0m'; }
# Check that the Service NodePort is reachable
for node_ip in ${NODE_IPS[@]}
do
echoGreen ">>> ${node_ip}"
# 8556 is the NodePort reported by `kubectl get svc` above; substitute your own
ssh ${node_ip} "curl ${node_ip}:8556"
done
EOF
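The NodePort is assigned per deployment, so reading it from the `kubectl get svc` listing avoids a stale hard-coded value. A sketch, with a hypothetical function name and assuming the PORT(S) column has the `port:nodePort/protocol` form shown earlier:

```shell
#!/bin/bash
# Read `kubectl get svc` output on stdin and print the NodePort of the
# named service, taken from a PORT(S) value such as 80:8556/TCP.
node_port() {
    local svc="$1"
    awk -v svc="$svc" '$1 == svc { split($5, a, "[:/]"); print a[2] }'
}

# Example use (not executed here):
#   port=$(kubectl get svc | node_port nginx-ds)
#   ssh root@10.0.0.76 "curl -s 10.0.0.76:${port}"
```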