Deploying a Complete Enterprise-Grade Kubernetes Cluster
1. Prepare the Environment
Server requirements:
• Recommended minimum hardware: 4 CPU cores, 4 GB RAM, 50 GB disk
• Servers should ideally have Internet access, since images are pulled from online registries; if they cannot reach the Internet, download the required images in advance and import them on each node
Software environment:
Software        Version
OS              CentOS 7.8_x64
Docker          19+
Kubernetes      1.20
Overall server plan:
Role           IP               Components installed
k8s-master1    192.168.172.40   docker, etcd, nginx, keepalived
k8s-master2    192.168.172.41   docker, etcd, nginx, keepalived
k8s-master3    192.168.172.42   docker, etcd, nginx, keepalived
k8s-node1      192.168.172.43   docker
Load balancer external IP: 192.168.172.199 (VIP)
Architecture: three master nodes behind an Nginx + Keepalived load balancer exposing the VIP 192.168.172.199, plus one worker node. (Diagram omitted.)
OS initialization (all nodes):
# Disable the firewall
systemctl stop firewalld
systemctl disable firewalld
# Disable SELinux
sed -i 's/enforcing/disabled/' /etc/selinux/config # permanent
setenforce 0 # temporary
# Disable swap
swapoff -a # temporary
sed -ri 's/.*swap.*/#&/' /etc/fstab # permanent
# Set the hostname according to the plan
hostnamectl set-hostname <hostname>
# Add hosts entries on the masters
cat >> /etc/hosts << EOF
192.168.172.40 k8s-master1
192.168.172.41 k8s-master2
192.168.172.42 k8s-master3
192.168.172.43 k8s-node1
EOF
# Pass bridged IPv4 traffic to iptables chains
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system # apply
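These bridge sysctls only take effect once the br_netfilter kernel module is loaded; if sysctl --system reports "No such file or directory" for the net.bridge keys, load the module first (standard CentOS 7 commands):
# Load br_netfilter now, and persist it across reboots
modprobe br_netfilter
cat > /etc/modules-load.d/k8s.conf << EOF
br_netfilter
EOF
# Verify the keys are now visible
sysctl net.bridge.bridge-nf-call-iptables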
# Time synchronization
yum install ntpdate -y
ntpdate time.windows.com
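The plan above lists Docker on every node, but the steps so far do not install it. A minimal sketch, assuming the Aliyun docker-ce mirror repository (any docker-ce 19+ repository works equally well):
# Add a docker-ce yum repository (Aliyun mirror assumed)
yum install -y yum-utils
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
yum install -y docker-ce
systemctl enable docker && systemctl start docker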
2. Deploy the Nginx + Keepalived High-Availability Load Balancer
2.1 Install packages (on both the LB master and backup)
yum install epel-release -y
yum install nginx keepalived -y
2.2 Nginx configuration file (identical on master and backup)
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;
include /usr/share/nginx/modules/*.conf;

events {
    worker_connections 1024;
}

# Layer-4 load balancing for the apiserver on each of the three master nodes
stream {
    log_format main '$remote_addr $upstream_addr - [$time_local] $status $upstream_bytes_sent';
    access_log /var/log/nginx/k8s-access.log main;

    upstream k8s-apiserver {
        server 192.168.172.40:6443;   # Master1 APISERVER IP:PORT
        server 192.168.172.41:6443;   # Master2 APISERVER IP:PORT
        server 192.168.172.42:6443;   # Master3 APISERVER IP:PORT
    }

    server {
        listen 16443;   # nginx shares the host with a master node, so it cannot listen on 6443 or it would conflict with the apiserver
        proxy_pass k8s-apiserver;
    }
}

http {
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';
    access_log /var/log/nginx/access.log main;

    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;

    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    server {
        listen 80 default_server;
        server_name _;

        location / {
        }
    }
}
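On CentOS 7 the EPEL nginx package ships the stream module as a separate dynamic module; if nginx -t rejects the stream block, install the module and re-validate:
yum install nginx-mod-stream -y
nginx -t   # validate the configuration before starting nginx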
2.3 Keepalived configuration file (Nginx master)
vi /etc/keepalived/keepalived.conf

global_defs {
    notification_email {
        acassen@firewall.loc
        failover@firewall.loc
        sysadmin@firewall.loc
    }
    notification_email_from Alexandre.Cassen@firewall.loc
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id NGINX_MASTER
}

vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
}

vrrp_instance VI_1 {
    state MASTER
    interface ens33        # change to the actual NIC name
    virtual_router_id 51   # VRRP router ID; unique per VRRP instance
    priority 100           # priority; set 90 on the backup server
    advert_int 1           # VRRP heartbeat advertisement interval, default 1 second
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    # Virtual IP
    virtual_ipaddress {
        192.168.172.199/24
    }
    track_script {
        check_nginx
    }
}
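The backup node uses the same file with only the three differences called out in the comments above:
router_id NGINX_BACKUP   # in global_defs
state BACKUP             # in vrrp_instance VI_1
priority 90              # lower than the master's 100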
Create the script referenced in the config above, which checks whether nginx is running:
cat > /etc/keepalived/check_nginx.sh << "EOF"
#!/bin/bash
# Count listeners on port 16443; a zero count means nginx is down, so exit 1 to trigger failover
count=$(ss -antp |grep 16443 |egrep -cv "grep|$$")
if [ "$count" -eq 0 ];then
    exit 1
else
    exit 0
fi
EOF
Make it executable:
chmod +x /etc/keepalived/check_nginx.sh
systemctl daemon-reload
systemctl start nginx ; systemctl enable nginx
systemctl status nginx
systemctl start keepalived ; systemctl enable keepalived
systemctl status keepalived
Run ip addr to confirm the VIP (192.168.172.199) is bound on the LB master.
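A quick failover test (the NIC name ens33 is taken from the keepalived config above): stop nginx on the LB master and the VIP should move to the backup within a few heartbeats.
# On the LB master: simulate an nginx failure
systemctl stop nginx
# On the LB backup: the VIP should now be bound here
ip addr show ens33 | grep 192.168.172.199
# Restore the master afterwards; with the higher priority it reclaims the VIP
systemctl start nginx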
Import the offline Docker images on every node:
docker load -i k8s-images-v1.20.4.tar.gz
Or load them onto a remote node over SSH:
gzip -dc k8s-images-v1.20.4.tar.gz | ssh root@<hostname> 'cat | docker load'
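As a convenience, the same transfer as a loop over the remaining nodes from the plan (assumes passwordless SSH for root; hostnames are the ones added to /etc/hosts earlier):
for host in k8s-master2 k8s-master3 k8s-node1; do
    gzip -dc k8s-images-v1.20.4.tar.gz | ssh root@$host 'cat | docker load'
done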
On the first master, create kubeadm-config.yaml:
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.20.4
controlPlaneEndpoint: 192.168.172.199:16443
imageRepository: registry.aliyuncs.com/google_containers
apiServer:
  certSANs:
  - 192.168.172.40
  - 192.168.172.41
  - 192.168.172.42
  - 192.168.172.43
  - 192.168.172.199
networking:
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.10.0.0/16
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
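Optionally pre-pull the control-plane images before initializing, which surfaces registry problems early:
kubeadm config images pull --config kubeadm-config.yaml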
Initialize the Kubernetes cluster with kubeadm:
kubeadm init --config kubeadm-config.yaml
Note: imageRepository: registry.aliyuncs.com/google_containers is set so images are pulled from a domestic mirror instead of an overseas site; by default kubeadm pulls from k8s.gcr.io.
The output ends with two join commands: the one containing --control-plane is run on the other master nodes, and the other is for worker nodes.
kubeadm join 192.168.172.199:16443 --token 4thpb5.jbwmftjg9rmxkbw3 \
--discovery-token-ca-cert-hash sha256:85cf38fda29840a592102e676f9b491895b22e458de404f0401f3da58fc44eeb \
--control-plane
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.172.199:16443 --token 4thpb5.jbwmftjg9rmxkbw3 \
--discovery-token-ca-cert-hash sha256:85cf38fda29840a592102e676f9b491895b22e458de404f0401f3da58fc44eeb
Configure the kubeconfig for kubectl; it stores the credentials kubectl uses to manage the cluster:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Check the nodes; at this point there is only one:
kubectl get nodes
NAME         STATUS     ROLES                  AGE   VERSION
<hostname>   NotReady   control-plane,master   60s   v1.20.4
The cluster is still NotReady because no network plugin has been started yet.
# Copy the certificates from the first master to the other two masters
# First, on master2 and master3, create the target directories:
cd /root && mkdir -p /etc/kubernetes/pki/etcd && mkdir -p ~/.kube/
# Then, from master1:
scp /etc/kubernetes/pki/ca.crt <hostname>:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/ca.key <hostname>:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/sa.key <hostname>:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/sa.pub <hostname>:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/front-proxy-ca.crt <hostname>:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/front-proxy-ca.key <hostname>:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/etcd/ca.crt <hostname>:/etc/kubernetes/pki/etcd/
scp /etc/kubernetes/pki/etcd/ca.key <hostname>:/etc/kubernetes/pki/etcd/
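The same copy expressed as a loop over both target masters (the target directories must already exist, as above):
for host in k8s-master2 k8s-master3; do
    scp /etc/kubernetes/pki/{ca.crt,ca.key,sa.key,sa.pub,front-proxy-ca.crt,front-proxy-ca.key} $host:/etc/kubernetes/pki/
    scp /etc/kubernetes/pki/etcd/{ca.crt,ca.key} $host:/etc/kubernetes/pki/etcd/
done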
After the certificates are copied, run the control-plane join command on each additional master (your token and hash will differ):
kubeadm join 192.168.172.199:16443 --token 4thpb5.jbwmftjg9rmxkbw3 \
--discovery-token-ca-cert-hash sha256:85cf38fda29840a592102e676f9b491895b22e458de404f0401f3da58fc44eeb \
--control-plane
Then on the worker node, run:
kubeadm join 192.168.172.199:16443 --token 4thpb5.jbwmftjg9rmxkbw3 \
--discovery-token-ca-cert-hash sha256:85cf38fda29840a592102e676f9b491895b22e458de404f0401f3da58fc44eeb
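If you join a node later, the bootstrap token above will have expired (the default lifetime is 24 hours); generate a fresh worker join command on master1 with:
kubeadm token create --print-join-command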
Check the cluster: kubectl get nodes
NAME      STATUS     ROLES                  AGE     VERSION
master1   NotReady   control-plane,master   19m     v1.20.4
master2   NotReady   control-plane,master   6m22s   v1.20.4
master3   NotReady   control-plane,master   2m29s   v1.20.4
node1     NotReady   <none>                 78s     v1.20.4
All nodes are NotReady, which means the network plugin is not installed yet.
About Calico
Calico is a networking solution for communication between containers. Virtualization platforms such as OpenStack and Docker need hosts to be interconnected while containers remain isolated from one another, and most implementations achieve this with layer-2 isolation. Those layer-2 techniques have drawbacks: they depend on VLANs, bridges, and tunnels, where bridging adds complexity, and VLAN isolation and tunnel encapsulation/decapsulation consume extra resources and place demands on the physical environment. As the network grows, the whole system becomes increasingly complex.
Calico instead treats each host as a router on the Internet, synchronizes routes with BGP, and enforces security policy with iptables.
Design idea: Calico uses no tunnels or NAT for forwarding; it converts all layer-2/3 traffic into plain layer-3 traffic and completes cross-host forwarding through route configuration on the hosts.
Comparison of common network plugins:
• flannel: supports address allocation, but not network policy.
• calico: supports address allocation and network policy (see the example after this list).
flannel supports several backends:
• VxLAN:
  (1) vxlan overlay network mode
  (2) Directrouting
• host-gw: Host Gateway direct routing mode
• UDP: generally not used
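A minimal NetworkPolicy illustrating what Calico can enforce and flannel cannot (the namespace and labels here are hypothetical):
# Allow ingress to pods labeled app=web only from pods labeled role=frontend
kubectl apply -f - << EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: web-allow-frontend
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: web
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: frontend
EOF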
Install the Calico network component:
kubectl apply -f calico.yaml
Note: the manifest can be downloaded from https://docs.projectcalico.org/manifests/calico.yaml
Pulling the images takes some time; the installation has succeeded once every pod shows Running:
kubectl get pod --all-namespaces
Finally, check the cluster state:
kubectl get nodes
NAME STATUS ROLES AGE VERSION
master1 Ready control-plane,master 22h v1.20.4
master2 Ready control-plane,master 22h v1.20.4
master3 Ready control-plane,master 22h v1.20.4
node1 Ready <none> 22h v1.20.4
Test that a pod created in the cluster can reach the network normally:
kubectl run busybox --image busybox:1.28 --restart=Never --rm -it -- sh
If you don't see a command prompt, try pressing enter.
/ # ping www.qq.com
PING www.qq.com (183.194.238.19): 56 data bytes
64 bytes from 183.194.238.19: seq=0 ttl=127 time=11.423 ms
64 bytes from 183.194.238.19: seq=1 ttl=127 time=11.267 ms
^C
If ping www.qq.com responds as above, pod networking is working.
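It is also worth checking in-cluster DNS from the same busybox shell (busybox:1.28 is chosen deliberately; nslookup in newer busybox images is known to misbehave):
/ # nslookup kubernetes.default
# Expect an answer pointing at the kubernetes service ClusterIP from the 10.10.0.0/16 service subnet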