1. Overview
This article uses kubeadm to deploy Kubernetes 1.27 on Rocky Linux 9.2, together with containerd, Calico, BGP, and related components.
OpenELB is used as the LoadBalancer, BIRD simulates a physical router, and kube-vip provides high availability for the control plane.
All Kubernetes-related components in this article are installed at pinned versions to avoid the problems that version drift can cause, e.g. kubelet-1.27.2, kubeadm-1.27.2, kubectl-1.27.2, calico-3.25.1, calicoctl-3.24.6, containerd-1.6.21, and so on.
2. Environment
No. | CPU | Memory (GB) | OS | IP | Hostname | Notes |
---|---|---|---|---|---|---|
1 | 2 | 12 | Rockylinux 9.2 | 192.168.3.51 | bgp-k8s-01.tiga.cc | master |
2 | 2 | 12 | Rockylinux 9.2 | 192.168.3.52 | bgp-k8s-02.tiga.cc | master |
3 | 2 | 12 | Rockylinux 9.2 | 192.168.3.53 | bgp-k8s-03.tiga.cc | master |
4 | 2 | 12 | Rockylinux 9.2 | 192.168.3.54 | bgp-k8s-04.tiga.cc | worker |
5 | 2 | 12 | Rockylinux 9.2 | 192.168.3.55 | bgp-k8s-05.tiga.cc | worker |
6 | 2 | 12 | Rockylinux 9.2 | 192.168.3.56 | bgp-k8s-06.tiga.cc | worker |
7 | 2 | 12 | Rockylinux 9.2 | 192.168.3.57 | bgp-k8s-07.tiga.cc | worker |
8 | 2 | 12 | Rockylinux 9.2 | 192.168.3.58 | bgp-k8s-08.tiga.cc | worker |
9 | 2 | 2 | Rockylinux 9.2 | 192.168.3.61 | bird-01.tiga.cc | bird (simulated router) |
3. Preparation
3.1 Check MAC addresses and product_uuid
All nodes in the same Kubernetes cluster must have unique MAC addresses and product_uuid values; verify this before deploying.
# Check MAC addresses
ip ad
# Check product_uuid
cat /sys/class/dmi/id/product_uuid
3.2 Update the hosts file (192.168.3.50 is the API server VIP that kube-vip will manage in section 6.1)
echo '192.168.3.50 bgp-k8s-api-server.tiga.cc' >> /etc/hosts
echo '192.168.3.51 bgp-k8s-01.tiga.cc' >> /etc/hosts
echo '192.168.3.52 bgp-k8s-02.tiga.cc' >> /etc/hosts
echo '192.168.3.53 bgp-k8s-03.tiga.cc' >> /etc/hosts
echo '192.168.3.54 bgp-k8s-04.tiga.cc' >> /etc/hosts
echo '192.168.3.55 bgp-k8s-05.tiga.cc' >> /etc/hosts
echo '192.168.3.56 bgp-k8s-06.tiga.cc' >> /etc/hosts
echo '192.168.3.57 bgp-k8s-07.tiga.cc' >> /etc/hosts
echo '192.168.3.58 bgp-k8s-08.tiga.cc' >> /etc/hosts
echo '192.168.3.61 bird-01.tiga.cc' >> /etc/hosts
3.3 Disable firewalld
systemctl disable firewalld
systemctl stop firewalld
3.4 Disable swap
sed -i 's:/dev/mapper/rl-swap:#/dev/mapper/rl-swap:g' /etc/fstab
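The sed above only comments out the swap entry for future boots. To also turn swap off in the running system (a common companion step, not part of the original text), you can run:
# Disable swap for the current boot as well
swapoff -a
# Confirm that swap is off (the Swap line should show 0)
free -h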
3.5 Disable SELinux
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
setenforce 0
3.6 Install ipvsadm
yum install -y ipvsadm
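ipvsadm is only the userspace tool. If you intend to run kube-proxy in IPVS mode, the IPVS kernel modules also need to be loaded; a minimal sketch, assuming the standard module set required by kube-proxy's IPVS mode (not covered in the original text):
# Load the IPVS modules now and on every boot
cat <<EOF | tee /etc/modules-load.d/ipvs.conf
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
EOF
modprobe -a ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack
# Verify that the modules are loaded
lsmod | grep -e ip_vs -e nf_conntrack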
3.7 Enable IP forwarding
echo 'net.ipv4.ip_forward=1' >> /etc/sysctl.conf
sysctl -p
3.8 Load the br_netfilter bridge module
yum install -y epel-release
yum install -y bridge-utils
modprobe br_netfilter
echo 'br_netfilter' >> /etc/modules-load.d/bridge.conf
echo 'net.bridge.bridge-nf-call-iptables=1' >> /etc/sysctl.conf
echo 'net.bridge.bridge-nf-call-ip6tables=1' >> /etc/sysctl.conf
sysctl -p
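A quick check (not in the original) to confirm the module and sysctls took effect:
# br_netfilter should be listed
lsmod | grep br_netfilter
# Both values should be 1
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables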
4. Install containerd
Official documentation: https://github.com/containerd/containerd/blob/main/docs/getting-started.md
4.1 Install containerd
yum install -y yum-utils
# The DEB and RPM packages of containerd.io are distributed by Docker, not by the containerd project
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
# List all containerd.io versions available in the yum repository
# yum list containerd.io --showduplicates | sort -r
yum install -y containerd.io-1.6.21
systemctl start containerd
systemctl enable containerd
Confirm that containerd was installed successfully
containerd -v
Output
containerd containerd.io 1.6.21 3dce8eb055cbb6872793272b4f20ed16117344f8
Set crictl's CRI endpoint to containerd
echo 'runtime-endpoint: unix:///run/containerd/containerd.sock' >> /etc/crictl.yaml
echo 'image-endpoint: unix:///run/containerd/containerd.sock' >> /etc/crictl.yaml
echo 'timeout: 10' >> /etc/crictl.yaml
echo 'debug: false' >> /etc/crictl.yaml
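With the endpoint configured, crictl should be able to talk to containerd; a quick sanity check (not in the original):
# Both commands should succeed without connection errors
crictl version
crictl info | head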
4.2 Install cni-plugins (optional)
If kubelet is installed via yum, kubernetes-cni is pulled in automatically and this step can be skipped.
Installing containerd from the yum repository also installs runc, but it does not install cni-plugins; those still have to be installed manually.
wget https://github.com/containernetworking/plugins/releases/download/v1.3.0/cni-plugins-linux-amd64-v1.3.0.tgz
mkdir -p /opt/cni/bin
tar Cxzvf /opt/cni/bin cni-plugins-linux-amd64-v1.3.0.tgz
Confirm that cni-plugins were installed successfully
/opt/cni/bin/host-local
Output
CNI host-local plugin v1.3.0
CNI protocol versions supported: 0.1.0, 0.2.0, 0.3.0, 0.3.1, 0.4.0, 1.0.0
4.3 Configure the cgroup driver
4.3.1 Check which cgroup version the system uses
stat -fc %T /sys/fs/cgroup/
If cgroup v2 is in use, the output is: cgroup2fs
If cgroup v1 is in use, the output is: tmpfs
cgroup v2 requires Linux kernel 5.8 or later and containerd v1.4 or later.
4.3.2 Configure the cgroup driver for containerd
Modify the configuration file
# Configuration file reference: https://github.com/containerd/containerd/blob/main/docs/man/containerd-config.toml.5.md
containerd config default > /etc/containerd/config.toml
# Enable the systemd cgroup driver
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/g' /etc/containerd/config.toml
systemctl restart containerd
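After the restart, verify that the systemd cgroup driver is actually enabled (a quick check, not in the original):
# Expect: SystemdCgroup = true
grep SystemdCgroup /etc/containerd/config.toml
systemctl is-active containerd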
5. Install kubelet, kubectl, and kubeadm
5.1 Add the yum repository
# Note: the el7 repository is used here; Google does not publish separate packages for rhel8/rhel9
cat <<EOF | tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=kubernetes
baseurl=https://mirrors.tuna.tsinghua.edu.cn/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
EOF
5.2 Install kubelet, kubectl, and kubeadm
- kubelet must be installed on every node
- kubectl can be installed on any machine that can reach the cluster's nodes
- kubeadm must be installed on every node
# Install the latest version from the yum repository
# yum install -y kubelet kubeadm kubectl
# List the kubelet versions available in the yum repository
# yum list kubelet kubeadm kubectl --showduplicates
# Install the pinned version 1.27.2 with yum
yum install -y kubelet-1.27.2-0 kubeadm-1.27.2-0 kubectl-1.27.2-0
systemctl enable kubelet
systemctl start kubelet
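A quick check of the pinned versions (not in the original). Note that kubelet will keep restarting until kubeadm init or kubeadm join supplies its configuration; that is expected at this stage:
kubeadm version -o short
kubelet --version
kubectl version --client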
6. Initialize the cluster
6.1 Control-plane high availability (kube-vip)
Official documentation: https://github.com/kubernetes/kubeadm/blob/main/docs/ha-considerations.md#kube-vip
kube-vip is an alternative to the more "traditional" keepalived-plus-haproxy approach: it provides both virtual IP management and load balancing in a single service. It can operate at layer 2 (using ARP and leader election) or at layer 3 using BGP peering. kube-vip runs as a static pod on the control-plane nodes.
export VIP=192.168.3.50
export INTERFACE='enp1s0'
# KVVERSION=$(curl -sL https://api.github.com/repos/kube-vip/kube-vip/releases | jq -r ".[0].name")
export KVVERSION='v0.6.0'
# For convenience, set up an alias for the command
alias kube-vip="ctr run --rm --net-host ghcr.io/kube-vip/kube-vip:$KVVERSION vip /kube-vip"
# Pull the image
ctr images pull ghcr.io/kube-vip/kube-vip:v0.6.0
# Run the command to generate the manifest yaml
kube-vip manifest pod \
--interface $INTERFACE \
--vip $VIP \
--controlplane \
--arp \
--leaderElection | tee /etc/kubernetes/manifests/kube-vip.yaml
# Change the image pull policy to IfNotPresent
sed -i 's/imagePullPolicy: Always/imagePullPolicy: IfNotPresent/g' /etc/kubernetes/manifests/kube-vip.yaml
The generated kube-vip.yaml:
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
name: kube-vip
namespace: kube-system
spec:
containers:
- args:
- manager
env:
- name: vip_arp
value: "true"
- name: port
value: "6443"
- name: vip_interface
      value: enp1s0 # name of the network interface
- name: vip_cidr
value: "32"
- name: cp_enable
value: "true"
- name: cp_namespace
value: kube-system
- name: vip_ddns
value: "false"
- name: vip_leaderelection
value: "true"
- name: vip_leaseduration
value: "5"
- name: vip_renewdeadline
value: "3"
- name: vip_retryperiod
value: "1"
- name: vip_address
      value: 192.168.3.50 # VIP through which the API server is exposed
- name: prometheus_server
value: :2112
image: ghcr.io/kube-vip/kube-vip:v0.6.0
imagePullPolicy: IfNotPresent
name: kube-vip
resources: {}
securityContext:
capabilities:
add:
- NET_ADMIN
- NET_RAW
volumeMounts:
- mountPath: /etc/kubernetes/admin.conf
name: kubeconfig
hostAliases:
- hostnames:
- kubernetes
ip: 127.0.0.1
hostNetwork: true
volumes:
- hostPath:
path: /etc/kubernetes/admin.conf
name: kubeconfig
status: {}
Finally, copy this manifest into /etc/kubernetes/manifests/ on every control-plane node
scp -rp kube-vip.yaml root@192.168.3.52:/etc/kubernetes/manifests/
scp -rp kube-vip.yaml root@192.168.3.53:/etc/kubernetes/manifests/
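Once the first control-plane node has been initialized in section 6.4 and the static pod is running, the VIP should come up; a quick check (assumed commands, not in the original):
# The VIP should answer and appear on the interface of the elected kube-vip leader
ping -c 2 192.168.3.50
ip addr show enp1s0 | grep 192.168.3.50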
6.2 etcd
This article uses kubeadm to deploy etcd (a stacked etcd cluster on the control-plane nodes).
6.3 Adjust the kubeadm init configuration file
Official documentation: https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-init/#custom-images
# Dump kubeadm's current init defaults to a file
kubeadm config print init-defaults > kubeadm-init.yaml
The modified configuration file:
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
- system:bootstrappers:kubeadm:default-node-token
token: abcdef.0123456789abcdef
ttl: 24h0m0s
usages:
- signing
- authentication
kind: InitConfiguration
localAPIEndpoint:
# advertiseAddress: 1.2.3.4
  advertiseAddress: 192.168.3.51 # advertise address of this control-plane node
bindPort: 6443
nodeRegistration:
criSocket: unix:///var/run/containerd/containerd.sock
imagePullPolicy: IfNotPresent
taints: null
---
apiServer:
timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
local:
dataDir: /var/lib/etcd
# imageRepository: registry.k8s.io
imageRepository: registry.aliyuncs.com/google_containers # pull the control-plane images from the Aliyun mirror
kind: ClusterConfiguration
# control-plane (HA) endpoint; in this environment the name resolves to the API server VIP 192.168.3.50
controlPlaneEndpoint: "bgp-k8s-api-server.tiga.cc:6443"
# kubernetesVersion: 1.27.0
kubernetesVersion: 1.27.2 # pin the Kubernetes version
networking:
dnsDomain: cluster.local
serviceSubnet: 10.96.0.0/12
  podSubnet: 10.112.0.0/12 # pod network CIDR
scheduler: {}
6.4 Initialize the cluster with kubeadm init
Check that the configuration file takes effect
# Check that the Aliyun image repository is used
kubeadm config images list --config kubeadm-init.yaml
Output
W0520 20:00:04.542807 3059 images.go:80] could not find officially supported version of etcd for Kubernetes v1.27.2, falling back to the nearest etcd version (3.5.7-0)
registry.aliyuncs.com/google_containers/kube-apiserver:v1.27.2
registry.aliyuncs.com/google_containers/kube-controller-manager:v1.27.2
registry.aliyuncs.com/google_containers/kube-scheduler:v1.27.2
registry.aliyuncs.com/google_containers/kube-proxy:v1.27.2
registry.aliyuncs.com/google_containers/pause:3.9
registry.aliyuncs.com/google_containers/etcd:3.5.7-0
registry.aliyuncs.com/google_containers/coredns:v1.10.1
Pull the images
# Pull the images
kubeadm config images pull --config kubeadm-init.yaml
Output
W0520 20:23:04.712897 1857 images.go:80] could not find officially supported version of etcd for Kubernetes v1.27.2, falling back to the nearest etcd version (3.5.7-0)
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-apiserver:v1.27.2
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-controller-manager:v1.27.2
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-scheduler:v1.27.2
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-proxy:v1.27.2
[config/images] Pulled registry.aliyuncs.com/google_containers/pause:3.9
[config/images] Pulled registry.aliyuncs.com/google_containers/etcd:3.5.7-0
[config/images] Pulled registry.aliyuncs.com/google_containers/coredns:v1.10.1
# Check the downloaded images
crictl images
Output
IMAGE TAG IMAGE ID SIZE
registry.aliyuncs.com/google_containers/coredns v1.10.1 ead0a4a53df89 16.2MB
registry.aliyuncs.com/google_containers/etcd 3.5.7-0 86b6af7dd652c 102MB
registry.aliyuncs.com/google_containers/kube-apiserver v1.27.2 c5b13e4f7806d 33.4MB
registry.aliyuncs.com/google_containers/kube-controller-manager v1.27.2 ac2b7465ebba9 31MB
registry.aliyuncs.com/google_containers/kube-proxy v1.27.2 b8aa50768fd67 23.9MB
registry.aliyuncs.com/google_containers/kube-scheduler v1.27.2 89e70da428d29 18.2MB
registry.aliyuncs.com/google_containers/pause 3.9 e6f1816883972 322kB
Change containerd's sandbox_image
# Replace the pause image with the Aliyun mirror
old_sandbox_image=`grep sandbox_image /etc/containerd/config.toml`
sed -i 's#'"${old_sandbox_image}"'#sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"#g' /etc/containerd/config.toml
systemctl restart containerd
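To confirm the replacement (not in the original):
# Expect: sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"
grep sandbox_image /etc/containerd/config.toml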
Run kubeadm init to initialize the cluster
kubeadm init --config kubeadm-init.yaml --upload-certs
The following output indicates success
... (output omitted)
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of the control-plane node running the following command on each as root:
kubeadm join bgp-k8s-api-server.tiga.cc:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:b0b51ba58c2d65463541b7dcbf63c78e95b1d9f1b349a698c4d00c54602569cc \
--control-plane --certificate-key a803128a1c14b8a64ad8146d19ca745c922fcafb56733595e632032b56bab198
Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join bgp-k8s-api-server.tiga.cc:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:b0b51ba58c2d65463541b7dcbf63c78e95b1d9f1b349a698c4d00c54602569cc
Configure kubeconfig
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
6.5 Join the remaining nodes to the cluster
Join the control-plane nodes
Remember to apply the containerd sandbox_image change (section 6.4) on these nodes as well
kubeadm join bgp-k8s-api-server.tiga.cc:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:b0b51ba58c2d65463541b7dcbf63c78e95b1d9f1b349a698c4d00c54602569cc \
--control-plane --certificate-key a803128a1c14b8a64ad8146d19ca745c922fcafb56733595e632032b56bab198
Output
... (output omitted)
This node has joined the cluster and a new control plane instance was created:
* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.
To start administering your cluster from this node, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Run 'kubectl get nodes' to see this node join the cluster.
Confirm that the nodes joined the cluster
kubectl get nodes
Output
NAME STATUS ROLES AGE VERSION
bgp-k8s-01.tiga.cc NotReady control-plane 4m57s v1.27.2
bgp-k8s-02.tiga.cc NotReady control-plane 38s v1.27.2
If the following error appears, the certificate key has expired and a new certificate-key must be generated
[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
error execution phase control-plane-prepare/download-certs: error downloading certs: error downloading the secret: Secret "kubeadm-certs" was not found in the "kube-system" Namespace. This Secret might have expired. Please, run `kubeadm init phase upload-certs --upload-certs` on a control plane to generate a new one
To see the stack trace of this error execute with --v=5 or higher
If the certificate-key was not saved, it can be regenerated with
kubeadm init phase upload-certs --upload-certs --config kubeadm-init.yaml
Output
a803128a1c14b8a64ad8146d19ca745c922fcafb56733595e632032b56bab198
Get the discovery-token-ca-cert-hash
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
Output
b0b51ba58c2d65463541b7dcbf63c78e95b1d9f1b349a698c4d00c54602569cc
Get a token
kubeadm token create
kubeadm token list
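If the token from the original kubeadm init output has expired, a complete worker join command can also be generated in one step (a convenient alternative, not shown in the original):
# Prints a ready-to-use "kubeadm join ..." command with a fresh token and CA cert hash
kubeadm token create --print-join-command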
Join the worker nodes
Remember to apply the containerd sandbox_image change (section 6.4) on these nodes as well
kubeadm join bgp-k8s-api-server.tiga.cc:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:b0b51ba58c2d65463541b7dcbf63c78e95b1d9f1b349a698c4d00c54602569cc
7. Deploy Calico
7.1 Install Calico
Official documentation: https://docs.tigera.io/calico/latest/getting-started/kubernetes/quickstart
7.1.1 Configure NetworkManager
Configure NetworkManager so that it does not manipulate the routing tables of interfaces in the default network namespace, which would interfere with the Calico agent's ability to route correctly.
echo '[keyfile]' >> /etc/NetworkManager/conf.d/calico.conf
echo 'unmanaged-devices=interface-name:cali*;interface-name:tunl*;interface-name:vxlan.calico;interface-name:vxlan-v6.calico;interface-name:wireguard.cali;interface-name:wg-v6.cali' >> /etc/NetworkManager/conf.d/calico.conf
7.1.2 Install Calico with the Tigera operator
wget https://raw.githubusercontent.com/projectcalico/calico/v3.25.1/manifests/tigera-operator.yaml -O tigera-operator.yaml
wget https://raw.githubusercontent.com/projectcalico/calico/v3.25.1/manifests/custom-resources.yaml -O custom-resources.yaml
# Install the Tigera operator and its CRDs
kubectl create -f tigera-operator.yaml
# Change Calico's default pod CIDR to 10.112.0.0/12, as defined in kubeadm-init.yaml in section 6.3 above
sed -i 's#192.168.0.0/16#10.112.0.0/12#g' custom-resources.yaml
# Install the required Calico custom resources
kubectl create -f custom-resources.yaml
Because the tigera-operator manifest containing the CRDs is very large, kubectl apply may exceed the request size limit; use kubectl create or kubectl replace instead.
7.1.3 Remove the taints from the control-plane nodes
kubectl taint nodes --all node-role.kubernetes.io/control-plane-
kubectl taint nodes --all node-role.kubernetes.io/master-
Output
node/bgp-k8s-01.tiga.cc untainted
node/bgp-k8s-02.tiga.cc untainted
node/bgp-k8s-03.tiga.cc untainted
taint "node-role.kubernetes.io/control-plane" not found
taint "node-role.kubernetes.io/control-plane" not found
taint "node-role.kubernetes.io/control-plane" not found
taint "node-role.kubernetes.io/control-plane" not found
taint "node-role.kubernetes.io/control-plane" not found
taint "node-role.kubernetes.io/master" not found
taint "node-role.kubernetes.io/master" not found
taint "node-role.kubernetes.io/master" not found
taint "node-role.kubernetes.io/master" not found
taint "node-role.kubernetes.io/master" not found
taint "node-role.kubernetes.io/master" not found
taint "node-role.kubernetes.io/master" not found
taint "node-role.kubernetes.io/master" not found
7.1.4 Confirm that all pods in calico-system are running
kubectl get pods -n calico-system -o wide
Output (re-check every couple of seconds until everything is Running):
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
calico-kube-controllers-789dc4c76b-4k84l 1/1 Running 0 26m 10.116.201.1 bgp-k8s-06.tiga.cc <none> <none>
calico-node-265n7 1/1 Running 0 26m 192.168.3.58 bgp-k8s-08.tiga.cc <none> <none>
calico-node-5njl9 1/1 Running 0 26m 192.168.3.55 bgp-k8s-05.tiga.cc <none> <none>
calico-node-7lh2k 1/1 Running 0 4m5s 192.168.3.52 bgp-k8s-02.tiga.cc <none> <none>
calico-node-cps8m 1/1 Running 0 26m 192.168.3.56 bgp-k8s-06.tiga.cc <none> <none>
calico-node-ddtwj 1/1 Running 0 26m 192.168.3.53 bgp-k8s-03.tiga.cc <none> <none>
calico-node-k5h59 1/1 Running 0 26m 192.168.3.54 bgp-k8s-04.tiga.cc <none> <none>
calico-node-p8p9v 1/1 Running 0 26m 192.168.3.51 bgp-k8s-01.tiga.cc <none> <none>
calico-node-tv49n 1/1 Running 0 26m 192.168.3.57 bgp-k8s-07.tiga.cc <none> <none>
calico-typha-5b88557fb6-qtwmt 1/1 Running 0 26m 192.168.3.57 bgp-k8s-07.tiga.cc <none> <none>
calico-typha-5b88557fb6-wlwkj 1/1 Running 0 26m 192.168.3.55 bgp-k8s-05.tiga.cc <none> <none>
calico-typha-5b88557fb6-z5wzj 1/1 Running 0 26m 192.168.3.56 bgp-k8s-06.tiga.cc <none> <none>
csi-node-driver-2qlph 2/2 Running 0 26m 10.113.110.3 bgp-k8s-04.tiga.cc <none> <none>
csi-node-driver-4t8fx 2/2 Running 0 13m 10.123.163.1 bgp-k8s-02.tiga.cc <none> <none>
csi-node-driver-8txs7 2/2 Running 0 26m 10.118.139.1 bgp-k8s-01.tiga.cc <none> <none>
csi-node-driver-h6jtk 2/2 Running 0 26m 10.116.201.2 bgp-k8s-06.tiga.cc <none> <none>
csi-node-driver-sl64h 2/2 Running 0 26m 10.120.222.1 bgp-k8s-03.tiga.cc <none> <none>
csi-node-driver-x6zt5 2/2 Running 0 26m 10.119.93.1 bgp-k8s-05.tiga.cc <none> <none>
csi-node-driver-x9kqg 2/2 Running 0 26m 10.120.202.1 bgp-k8s-07.tiga.cc <none> <none>
csi-node-driver-z4p72 2/2 Running 0 26m 10.127.183.1 bgp-k8s-08.tiga.cc <none> <none>
Because image pulls are limited by network bandwidth, it can take quite a while for all pods to reach the Running state.
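With the operator-based install you can also watch the overall Calico status instead of the individual pods; a sketch assuming the TigeraStatus resources created by tigera-operator:
# All components should eventually report AVAILABLE=True
kubectl get tigerastatus
# Or keep watching the pods until they are all Running
watch -n 2 kubectl get pods -n calico-system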
7.1.5 Confirm cluster status and networking
kubectl get nodes -o wide
Output
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
bgp-k8s-01.tiga.cc Ready control-plane 47m v1.27.2 192.168.3.51 <none> Rocky Linux 9.2 (Blue Onyx) 5.14.0-284.11.1.el9_2.x86_64 containerd://1.6.21
bgp-k8s-02.tiga.cc Ready control-plane 46m v1.27.2 192.168.3.52 <none> Rocky Linux 9.2 (Blue Onyx) 5.14.0-284.11.1.el9_2.x86_64 containerd://1.6.21
bgp-k8s-03.tiga.cc Ready control-plane 44m v1.27.2 192.168.3.53 <none> Rocky Linux 9.2 (Blue Onyx) 5.14.0-284.11.1.el9_2.x86_64 containerd://1.6.21
bgp-k8s-04.tiga.cc Ready <none> 43m v1.27.2 192.168.3.54 <none> Rocky Linux 9.2 (Blue Onyx) 5.14.0-284.11.1.el9_2.x86_64 containerd://1.6.21
bgp-k8s-05.tiga.cc Ready <none> 43m v1.27.2 192.168.3.55 <none> Rocky Linux 9.2 (Blue Onyx) 5.14.0-284.11.1.el9_2.x86_64 containerd://1.6.21
bgp-k8s-06.tiga.cc Ready <none> 43m v1.27.2 192.168.3.56 <none> Rocky Linux 9.2 (Blue Onyx) 5.14.0-284.11.1.el9_2.x86_64 containerd://1.6.21
bgp-k8s-07.tiga.cc Ready <none> 43m v1.27.2 192.168.3.57 <none> Rocky Linux 9.2 (Blue Onyx) 5.14.0-284.11.1.el9_2.x86_64 containerd://1.6.21
bgp-k8s-08.tiga.cc Ready <none> 43m v1.27.2 192.168.3.58 <none> Rocky Linux 9.2 (Blue Onyx) 5.14.0-284.11.1.el9_2.x86_64 containerd://1.6.21
7.2 Install calicoctl
Official documentation: https://docs.tigera.io/calico/latest/operations/calicoctl/install
For convenience, install calicoctl as a kubectl plugin so that it can be invoked directly as kubectl calico
curl -L https://github.com/projectcalico/calico/releases/download/v3.24.6/calicoctl-linux-amd64 -o /usr/sbin/kubectl-calico
chmod +x /usr/sbin/kubectl-calico
# Verify that the plugin works
kubectl calico node status
Output
Calico process is running.
IPv4 BGP status
+--------------+-------------------+-------+----------+-------------+
| PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
+--------------+-------------------+-------+----------+-------------+
| 192.168.3.53 | node-to-node mesh | up | 14:26:17 | Established |
| 192.168.3.54 | node-to-node mesh | up | 14:26:17 | Established |
| 192.168.3.55 | node-to-node mesh | up | 14:26:17 | Established |
| 192.168.3.56 | node-to-node mesh | up | 14:26:17 | Established |
| 192.168.3.57 | node-to-node mesh | up | 14:26:17 | Established |
| 192.168.3.58 | node-to-node mesh | up | 14:26:17 | Established |
| 192.168.3.52 | node-to-node mesh | up | 14:42:11 | Established |
+--------------+-------------------+-------+----------+-------------+
8. Prepare the BGP environment
8.1 About BIRD
BIRD (BIRD Internet Routing Daemon) is open-source routing software that implements several routing protocols, such as BGP, OSPF, and RIPv2. It runs as a daemon on Linux and Unix systems and controls how traffic flows between network routers.
Official documentation: https://bird.network.cz/
To keep the learning curve low, this article does not configure BGP on a physical router; BIRD is used to simulate one instead.
8.2 Install BIRD
Log in to the server 192.168.3.61 and install BIRD
yum install epel-release
yum install -y bird-2.13
8.3 BIRD configuration file
The modified /etc/bird.conf is as follows
router id 192.168.3.61;
protocol kernel {
scan time 60;
ipv4 {
import none;
export all;
};
merge paths yes;
}
protocol device {
scan time 60;
}
protocol bgp neighbor1 {
local as 65001; # local AS number is 65001
neighbor 192.168.3.51 port 178 as 64512; # peer is 192.168.3.51:178 with AS number 64512
source address 192.168.3.61;
enable route refresh off;
ipv4 {
import all;
export all;
};
}
Restart bird
systemctl restart bird
systemctl status bird
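Before the Calico side is configured, you can confirm that bird started cleanly and is waiting for its peer (an assumed check, not in the original):
# neighbor1 will stay in Active/Connect until Calico starts peering in section 9
birdc show protocols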
9. Configure Calico BGP
9.1 Calico BGP overview
Official introduction: https://docs.tigera.io/calico/latest/networking/configuring/bgp#bgp
Calico supports three common BGP topologies (summarized from the official docs):
- Full-mesh: when BGP is enabled, Calico's default behaviour is to create a full mesh of internal BGP (iBGP) connections in which every node peers with every other node. A full mesh works well for small and medium deployments of about 100 nodes or fewer, but becomes noticeably less efficient at larger scale, where route reflectors are recommended.
- Route reflectors: to build large iBGP clusters, BGP route reflectors can be used to reduce the number of BGP peers on each node. In this model a few nodes act as route reflectors and are configured in a full mesh among themselves; the remaining nodes peer with a subset of those route reflectors (typically two, for redundancy), which greatly reduces the total number of BGP sessions compared with a full mesh.
- Top of Rack (ToR): in on-premises deployments, Calico can be configured to peer directly with the physical network infrastructure. Typically this means disabling Calico's default full-mesh behaviour and peering Calico with the L3 ToR routers.
This article uses the ToR approach.
9.2 bgp-configuration.yaml
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
name: default
spec:
logSeverityScreen: Info
nodeToNodeMeshEnabled: false
asNumber: 64512
serviceClusterIPs:
- cidr: 10.96.0.0/12
serviceExternalIPs:
- cidr: 192.168.3.51/32
listenPort: 178
bindMode: NodeIP
communities:
- name: bgp-large-community
value: 64512:300:100
prefixAdvertisements:
- cidr: 172.16.2.0/24
communities:
- bgp-large-community
- 64512:120
9.3 bgp-peer.yaml
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
name: my-global-peer
spec:
peerIP: 192.168.3.61
asNumber: 65001
9.4 Apply the BGP configuration
kubectl create -f bgp-configuration.yaml
kubectl create -f bgp-peer.yaml
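The applied objects can be read back with calicoctl; a sketch assuming calicoctl is pointed at the Kubernetes datastore (DATASTORE_TYPE and KUBECONFIG are the documented calicoctl environment variables, not part of the original text):
export DATASTORE_TYPE=kubernetes
export KUBECONFIG=/etc/kubernetes/admin.conf
# Show the effective BGP configuration and the configured peers
kubectl calico get bgpconfig default -o yaml
kubectl calico get bgppeer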
9.5 Check the BGP status
kubectl calico node status
Output
Calico process is running.
IPv4 BGP status
+--------------+-----------+-------+----------+-------------+
| PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
+--------------+-----------+-------+----------+-------------+
| 192.168.3.61 | global | up | 08:48:36 | Established |
+--------------+-----------+-------+----------+-------------+
IPv6 BGP status
No IPv6 peers found.
9.6 Check the routes on the BIRD server
Log in to the server 192.168.3.61
ip route
Output
default via 192.168.3.1 dev enp1s0 proto static metric 100
10.96.0.0/12 via 192.168.3.51 dev enp1s0 proto bird metric 32
10.113.110.0/24 via 192.168.3.51 dev enp1s0 proto bird metric 32
10.116.201.0/24 via 192.168.3.51 dev enp1s0 proto bird metric 32
10.118.139.0/24 via 192.168.3.51 dev enp1s0 proto bird metric 32
10.119.93.0/24 via 192.168.3.51 dev enp1s0 proto bird metric 32
10.120.202.0/24 via 192.168.3.51 dev enp1s0 proto bird metric 32
10.120.222.0/24 via 192.168.3.51 dev enp1s0 proto bird metric 32
10.123.163.0/24 via 192.168.3.51 dev enp1s0 proto bird metric 32
10.127.183.0/24 via 192.168.3.51 dev enp1s0 proto bird metric 32
192.168.3.0/24 dev enp1s0 proto kernel scope link src 192.168.3.61 metric 100
192.168.3.51 via 192.168.3.51 dev enp1s0 proto bird metric 32
The 10.96.0.0/12 service route has been learned, with the next hop pointing to 192.168.3.51.
Now check the BGP peers
birdc show protocols
# birdc show protocols all neighbor1
Output
0001 BIRD 2.13 ready.
2002-Name Proto Table State Since Info
1002-kernel1 Kernel master4 up 12:04:16.556
device1 Device --- up 12:04:16.556
neighbor1 BGP --- up 16:48:36.567 Established
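To see exactly which prefixes were learned from the Calico peer, the routes can also be listed per protocol (assumed birdc syntax for BIRD 2.x):
# Routes received from neighbor1 (the Calico node 192.168.3.51)
birdc show route protocol neighbor1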
10. Deploy the LoadBalancer
10.1 About OpenELB
OpenELB is an open-source, cloud-native load balancer implementation that can expose services through LoadBalancer-type Services in bare-metal, edge, and virtualized Kubernetes environments.
The OpenELB project was originally started by the KubeSphere community; it has since joined the CNCF as a sandbox project and is maintained and supported by the OpenELB open-source community.
Official introduction: https://github.com/openelb/openelb/blob/master/README_zh.md
10.2 Deploy OpenELB
Official documentation: https://openelb.io/docs/getting-started/usage/use-openelb-in-bgp-mode/
10.2.1 Install OpenELB
wget https://raw.githubusercontent.com/openelb/openelb/v0.5.1/deploy/openelb.yaml
# Replace the image registry with a mirror reachable from China
sed -i 's#k8s.gcr.io#k8s.dockerproxy.com#g' openelb.yaml
kubectl apply -f openelb.yaml
kubectl get po -n openelb-system
Output
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
openelb-admission-create-5mxch 0/1 Completed 0 30m 10.120.202.3 bgp-k8s-07.tiga.cc <none> <none>
openelb-admission-patch-v9t42 0/1 Completed 2 30m 10.127.183.4 bgp-k8s-08.tiga.cc <none> <none>
openelb-keepalive-vip-hbxt9 1/1 Running 0 28m 192.168.3.58 bgp-k8s-08.tiga.cc <none> <none>
openelb-keepalive-vip-hxmst 1/1 Running 0 28m 192.168.3.57 bgp-k8s-07.tiga.cc <none> <none>
openelb-keepalive-vip-kk4gz 1/1 Running 0 28m 192.168.3.51 bgp-k8s-01.tiga.cc <none> <none>
openelb-keepalive-vip-l8fj4 1/1 Running 0 28m 192.168.3.56 bgp-k8s-06.tiga.cc <none> <none>
openelb-keepalive-vip-vnpt9 1/1 Running 0 28m 192.168.3.53 bgp-k8s-03.tiga.cc <none> <none>
openelb-keepalive-vip-wzn7n 1/1 Running 0 28m 192.168.3.52 bgp-k8s-02.tiga.cc <none> <none>
openelb-keepalive-vip-xhcl8 1/1 Running 0 28m 192.168.3.54 bgp-k8s-04.tiga.cc <none> <none>
openelb-keepalive-vip-zf8b2 1/1 Running 0 28m 192.168.3.55 bgp-k8s-05.tiga.cc <none> <none>
openelb-manager-cc779c856-54rlp 1/1 Running 0 30m 192.168.3.55 bgp-k8s-05.tiga.cc <none> <none>
Note: it is expected that openelb-admission-create and openelb-admission-patch are in the Completed state at this point.
10.2.2 Create the BgpConf object required by OpenELB
The BgpConf object configures the local (Kubernetes cluster) BGP properties in OpenELB.
Create the file openelb-bgp-conf.yaml
apiVersion: network.kubesphere.io/v1alpha2
kind: BgpConf
metadata:
name: default
spec:
  as: 64513 # note: must differ from the AS number configured for Calico (64512)
  listenPort: 179 # note: must differ from the port Calico listens on (178)
routerId: 192.168.3.55
kubectl apply -f openelb-bgp-conf.yaml
10.2.3 Create the BgpPeer object required by OpenELB
The BgpPeer object configures the peer (the BIRD machine) BGP properties in OpenELB.
Create the file openelb-bgp-peer.yaml
apiVersion: network.kubesphere.io/v1alpha2
kind: BgpPeer
metadata:
name: bgp-peer
spec:
conf:
peerAs: 65001
neighborAddress: 192.168.3.61
kubectl apply -f openelb-bgp-peer.yaml
10.2.4 Create the Eip object required by OpenELB
The Eip object serves as OpenELB's IP address pool.
Create the file openelb-bgp-eip.yaml
apiVersion: network.kubesphere.io/v1alpha2
kind: Eip
metadata:
name: bgp-eip
spec:
address: 172.22.0.2-172.22.0.10
kubectl apply -f openelb-bgp-eip.yaml
Check the Eip
kubectl get eip
Output
NAME CIDR USAGE TOTAL
bgp-eip 172.22.0.2-172.22.0.10 9
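The Eip object can also be inspected in detail; its USAGE column fills in once Services start allocating addresses from the pool (a quick check, not in the original):
kubectl describe eip bgp-eip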
10.2.5 Create a test Deployment for OpenELB
The following uses the luksa/kubia image to create a Deployment with two pods. Each pod responds to external requests with its own pod name.
Create the file openelb-bgp-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: bgp-openelb
spec:
replicas: 2
selector:
matchLabels:
app: bgp-openelb
template:
metadata:
labels:
app: bgp-openelb
spec:
containers:
- image: luksa/kubia
name: kubia
ports:
- containerPort: 8080
kubectl apply -f openelb-bgp-deploy.yaml
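Check that both test pods are up before creating the Service (not in the original):
# Expect two bgp-openelb pods in Running state, each with a pod IP from 10.112.0.0/12
kubectl get pods -l app=bgp-openelb -o wide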
10.2.6 Create a test Service for OpenELB
Create the file openelb-bgp-svc.yaml
kind: Service
apiVersion: v1
metadata:
name: bgp-svc
annotations:
lb.kubesphere.io/v1alpha1: openelb
protocol.openelb.kubesphere.io/v1alpha1: bgp
eip.openelb.kubesphere.io/v1alpha2: bgp-eip
spec:
selector:
app: bgp-openelb
type: LoadBalancer
ports:
- name: http
port: 80
targetPort: 8080
externalTrafficPolicy: Cluster
kubectl apply -f openelb-bgp-svc.yaml
10.2.7 Create the peer on the BIRD server
- Modify /etc/bird.conf and add the OpenELB peer
router id 192.168.3.61;
protocol kernel {
scan time 60;
ipv4 {
import none;
export all;
};
merge paths yes;
}
protocol device {
scan time 60;
}
protocol bgp neighbor1 {
local as 65001; # local AS number is 65001
neighbor 192.168.3.51 port 178 as 64512; # peer is 192.168.3.51:178 with AS number 64512
source address 192.168.3.61;
enable route refresh off;
ipv4 {
import all;
export all;
};
}
protocol bgp neighbor2 {
local as 65001; # local AS number is 65001
neighbor 192.168.3.55 port 179 as 64513; # why 192.168.3.55? Because in this environment the openelb-manager pod in the openelb-system namespace was scheduled onto 192.168.3.55; it can be pinned to any node you choose
source address 192.168.3.61;
enable route refresh off;
ipv4 {
import all;
export all;
};
}
- Reload the bird configuration
systemctl reload bird
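After the reload, the second BGP session should come up once openelb-manager starts peering (an assumed check, not in the original):
# neighbor2 should reach Established against 192.168.3.55:179
birdc show protocols all neighbor2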
10.3 Verify OpenELB
- Check the Service
kubectl get svc
Output
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
bgp-svc LoadBalancer 10.99.222.207 172.22.0.2 80:30502/TCP 60m
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 24h
- Log in to the BIRD server 192.168.3.61 and check the routing table
ip route | grep 172.22
Output
172.22.0.2 via 192.168.3.58 dev enp1s0 proto bird metric 32
The BIRD server now has a route for 172.22.0.2 whose next hop is 192.168.3.58.
- Log in to the BIRD server 192.168.3.61 and test with curl
curl 172.22.0.2
Output
You've hit bgp-openelb-769cf5cbc8-2m6lp
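Because externalTrafficPolicy is Cluster and the Deployment has two replicas, repeated requests should be answered by both pods; a simple way to observe the load balancing (not in the original):
# The pod name in the response should alternate between the two replicas
for i in $(seq 1 6); do curl -s 172.22.0.2; done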