Installing Kubernetes 1.27 + Calico + BGP + OpenELB on Rocky Linux 9.2

1. Overview


This article uses kubeadm to deploy Kubernetes 1.27 on Rocky Linux 9.2, together with containerd, Calico, and BGP;

OpenELB is used as the LoadBalancer;

BIRD is used to simulate a physical router;

kube-vip provides high availability for the control plane;

All Kubernetes-related components are installed at pinned versions to avoid problems caused by version drift: kubelet-1.27.2, kubeadm-1.27.2, kubectl-1.27.2, calico-3.25.1, calicoctl-3.24.6, containerd-1.6.21.

2. Environment

No.  CPU  Memory(GB)  OS               IP            Hostname            Role
1    2    12          Rocky Linux 9.2  192.168.3.51  bgp-k8s-01.tiga.cc  master
2    2    12          Rocky Linux 9.2  192.168.3.52  bgp-k8s-02.tiga.cc  master
3    2    12          Rocky Linux 9.2  192.168.3.53  bgp-k8s-03.tiga.cc  master
4    2    12          Rocky Linux 9.2  192.168.3.54  bgp-k8s-04.tiga.cc  worker
5    2    12          Rocky Linux 9.2  192.168.3.55  bgp-k8s-05.tiga.cc  worker
6    2    12          Rocky Linux 9.2  192.168.3.56  bgp-k8s-06.tiga.cc  worker
7    2    12          Rocky Linux 9.2  192.168.3.57  bgp-k8s-07.tiga.cc  worker
8    2    12          Rocky Linux 9.2  192.168.3.58  bgp-k8s-08.tiga.cc  worker
9    2    2           Rocky Linux 9.2  192.168.3.61  bird-01.tiga.cc     bird (simulated router)

3. Preparation

3.1 Check MAC addresses and product_uuid


All nodes in the same Kubernetes cluster must have unique MAC addresses and product_uuid values; verify this before deploying.

# Check the MAC address
ip ad

# Check the product_uuid
cat /sys/class/dmi/id/product_uuid
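
If passwordless SSH to the nodes is already set up, the checks can be collected from one machine and compared by eye; a minimal sketch, assuming root SSH access to the node IPs listed in section 2:

for ip in 192.168.3.{51..58}; do
  echo "== $ip =="
  # product_uuid plus a brief per-interface listing that includes the MAC
  ssh root@"$ip" 'cat /sys/class/dmi/id/product_uuid; ip -brief link show'
done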

3.2 Update the hosts file


echo '192.168.3.50 bgp-k8s-api-server.tiga.cc' >> /etc/hosts
echo '192.168.3.51 bgp-k8s-01.tiga.cc' >> /etc/hosts
echo '192.168.3.52 bgp-k8s-02.tiga.cc' >> /etc/hosts
echo '192.168.3.53 bgp-k8s-03.tiga.cc' >> /etc/hosts
echo '192.168.3.54 bgp-k8s-04.tiga.cc' >> /etc/hosts
echo '192.168.3.55 bgp-k8s-05.tiga.cc' >> /etc/hosts
echo '192.168.3.56 bgp-k8s-06.tiga.cc' >> /etc/hosts
echo '192.168.3.57 bgp-k8s-07.tiga.cc' >> /etc/hosts
echo '192.168.3.58 bgp-k8s-08.tiga.cc' >> /etc/hosts
echo '192.168.3.61 bird-01.tiga.cc'    >> /etc/hosts

3.3 Disable firewalld


systemctl disable firewalld
systemctl stop firewalld

3.4 Disable swap


# Turn swap off immediately and comment out the fstab entry so it stays off after reboot
swapoff -a
sed -i 's:/dev/mapper/rl-swap:#/dev/mapper/rl-swap:g' /etc/fstab

3.5 Disable SELinux


sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
setenforce 0

3.6 Install ipvsadm


yum install -y ipvsadm
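
Installing ipvsadm only provides the userspace tool. If you later intend to switch kube-proxy to IPVS mode, the ip_vs kernel modules must also be loaded; the following is an optional sketch (module names assume a 5.x kernel, where nf_conntrack replaced nf_conntrack_ipv4):

# Make the IPVS modules load on boot
cat <<EOF > /etc/modules-load.d/ipvs.conf
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
EOF

# Load the modules immediately
for mod in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack; do modprobe "$mod"; done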

3.7 Enable IP forwarding

echo 'net.ipv4.ip_forward=1' >> /etc/sysctl.conf
sysctl -p

3.8 Load the bridge module

yum install -y epel-release
yum install -y bridge-utils

modprobe br_netfilter
echo 'br_netfilter' >> /etc/modules-load.d/bridge.conf
echo 'net.bridge.bridge-nf-call-iptables=1' >> /etc/sysctl.conf
echo 'net.bridge.bridge-nf-call-ip6tables=1' >> /etc/sysctl.conf
sysctl -p
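
A quick sanity check that the module is loaded and the sysctls are applied (all three values should be 1):

lsmod | grep br_netfilter
sysctl net.ipv4.ip_forward net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables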

4. Install containerd


Official documentation: https://github.com/containerd/containerd/blob/main/docs/getting-started.md

4.1 Install containerd


yum install -y yum-utils

# The containerd.io DEB and RPM packages are distributed by Docker, not by the containerd project
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo

# List all containerd versions available in the yum repo
# yum list containerd.io --showduplicates | sort -r

yum install -y containerd.io-1.6.21

systemctl start containerd
systemctl enable containerd

Verify that containerd was installed successfully

containerd -v

Output

containerd containerd.io 1.6.21 3dce8eb055cbb6872793272b4f20ed16117344f8

Set crictl's CRI endpoint to containerd

echo 'runtime-endpoint: unix:///run/containerd/containerd.sock' >> /etc/crictl.yaml
echo 'image-endpoint: unix:///run/containerd/containerd.sock' >> /etc/crictl.yaml
echo 'timeout: 10' >> /etc/crictl.yaml
echo 'debug: false' >> /etc/crictl.yaml
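
To confirm that crictl can reach containerd through the configured endpoint:

# Should report the runtime name (containerd) and its version
crictl version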

4.2 Install cni-plugins (optional)


If kubelet is installed with yum, the kubernetes-cni package is pulled in automatically and this step can be skipped.

Installing containerd from the yum repo also installs runc, but it does not install cni-plugins, which must then be installed manually.

wget https://github.com/containernetworking/plugins/releases/download/v1.3.0/cni-plugins-linux-amd64-v1.3.0.tgz

mkdir -p /opt/cni/bin
tar Cxzvf /opt/cni/bin cni-plugins-linux-amd64-v1.3.0.tgz

Verify that cni-plugins was installed successfully

/opt/cni/bin/host-local

Output

CNI host-local plugin v1.3.0
CNI protocol versions supported: 0.1.0, 0.2.0, 0.3.0, 0.3.1, 0.4.0, 1.0.0

4.3 Configure the cgroup driver

4.3.1 Check the system's current cgroup version

stat -fc %T /sys/fs/cgroup/

For cgroup v2 the output is: cgroup2fs
For cgroup v1 the output is: tmpfs

cgroup v2 support requires Linux kernel 5.8 or later and containerd v1.4 or later.

4.3.2 Configure the cgroup driver for containerd


Modify the configuration file

# Configuration file reference: https://github.com/containerd/containerd/blob/main/docs/man/containerd-config.toml.5.md
containerd config default > /etc/containerd/config.toml
# Enable the systemd cgroup driver
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/g' /etc/containerd/config.toml
systemctl restart containerd
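
Verify that the change was picked up after the restart:

# Should print: SystemdCgroup = true
containerd config dump | grep SystemdCgroup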

5. Install kubelet, kubectl, and kubeadm


Official documentation: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#installing-kubeadm-kubelet-and-kubectl

5.1 Add the yum repository


# Note: the el7 repo is used here; Google does not publish separate packages for rhel8/rhel9
cat <<EOF | tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=kubernetes
baseurl=https://mirrors.tuna.tsinghua.edu.cn/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
EOF

5.2 Install kubelet, kubectl, and kubeadm


  • kubelet must be installed on every node
  • kubectl can be installed on any machine that can reach the cluster's API server
  • kubeadm must be installed on every node

# Install the latest version from the yum repo
# yum install -y kubelet kubeadm kubectl

# List the kubelet versions available in the yum repo
# yum list kubelet kubeadm kubectl --showduplicates

# Install the pinned 1.27.2 version
yum install -y kubelet-1.27.2-0 kubeadm-1.27.2-0 kubectl-1.27.2-0

systemctl enable kubelet
systemctl start kubelet

6. Initialize the cluster

6.1 Control plane high availability (kube-vip)


Official documentation: https://github.com/kubernetes/kubeadm/blob/main/docs/ha-considerations.md#kube-vip

kube-vip is an alternative to the more "traditional" keepalived-plus-haproxy approach: it implements both virtual IP management and load balancing in a single service. It can operate at layer 2 (using ARP and leader election) or at layer 3 using BGP peering. kube-vip runs as a static pod on the control plane nodes.

export VIP=192.168.3.50
export INTERFACE='enp1s0'

# KVVERSION=$(curl -sL https://api.github.com/repos/kube-vip/kube-vip/releases | jq -r ".[0].name")
export KVVERSION='v0.6.0'

# Set an alias to simplify the command
alias kube-vip="ctr run --rm --net-host ghcr.io/kube-vip/kube-vip:$KVVERSION vip /kube-vip"

# Pull the image
ctr images pull ghcr.io/kube-vip/kube-vip:v0.6.0

# Generate the static pod manifest
kube-vip manifest pod \
    --interface $INTERFACE \
    --vip $VIP \
    --controlplane \
    --arp \
    --leaderElection | tee /etc/kubernetes/manifests/kube-vip.yaml

# Change the image pull policy from Always to IfNotPresent
sed -i 's/imagePullPolicy: Always/imagePullPolicy: IfNotPresent/g' /etc/kubernetes/manifests/kube-vip.yaml

kube-vip.yaml

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  name: kube-vip
  namespace: kube-system
spec:
  containers:
  - args:
    - manager
    env:
    - name: vip_arp
      value: "true"
    - name: port
      value: "6443"
    - name: vip_interface
      value: enp1s0 # name of the network interface
    - name: vip_cidr
      value: "32"
    - name: cp_enable
      value: "true"
    - name: cp_namespace
      value: kube-system
    - name: vip_ddns
      value: "false"
    - name: vip_leaderelection
      value: "true"
    - name: vip_leaseduration
      value: "5"
    - name: vip_renewdeadline
      value: "3"
    - name: vip_retryperiod
      value: "1"
    - name: vip_address
      value: 192.168.3.50  # VIP through which the API server is exposed
    - name: prometheus_server
      value: :2112
    image: ghcr.io/kube-vip/kube-vip:v0.6.0
    imagePullPolicy: IfNotPresent 
    name: kube-vip
    resources: {}
    securityContext:
      capabilities:
        add:
        - NET_ADMIN
        - NET_RAW
    volumeMounts:
    - mountPath: /etc/kubernetes/admin.conf
      name: kubeconfig
  hostAliases:
  - hostnames:
    - kubernetes
    ip: 127.0.0.1
  hostNetwork: true
  volumes:
  - hostPath:
      path: /etc/kubernetes/admin.conf
    name: kubeconfig
status: {}

Finally, copy this manifest to /etc/kubernetes/manifests/ on all control plane nodes.

scp -rp kube-vip.yaml root@192.168.3.52:/etc/kubernetes/manifests/
scp -rp kube-vip.yaml root@192.168.3.53:/etc/kubernetes/manifests/
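
Once the cluster has been initialized (section 6.4), the VIP should be bound on whichever control plane node currently holds the kube-vip lease; a quick check, assuming the enp1s0 interface name used in the manifest:

# On the current kube-vip leader the VIP should appear on the interface
ip addr show enp1s0 | grep 192.168.3.50

# The VIP should also answer from other machines on the network
ping -c 2 192.168.3.50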

6.2 etcd


This article uses kubeadm to deploy a stacked etcd;

6.3 Modify the kubeadm init configuration file


Official documentation: https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-init/#custom-images

# Dump kubeadm's default init configuration to a file
kubeadm config print init-defaults > kubeadm-init.yaml

The modified configuration file

apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  # advertiseAddress: 1.2.3.4
  advertiseAddress: 192.168.3.51 # this node's IP address
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
# imageRepository: registry.k8s.io
imageRepository: registry.aliyuncs.com/google_containers  # pull images from the Aliyun mirror
kind: ClusterConfiguration
# HA control plane endpoint; in this environment the domain resolves to the API server VIP 192.168.3.50
controlPlaneEndpoint: "bgp-k8s-api-server.tiga.cc:6443"
# kubernetesVersion: 1.27.0
kubernetesVersion: 1.27.2  # pin the Kubernetes version
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.112.0.0/12  # pod CIDR
scheduler: {}

6.4 Initialize the cluster with kubeadm init

Check that the configuration file takes effect


# Check that the Aliyun mirror is being used
kubeadm config images list --config kubeadm-init.yaml

Output

W0520 20:00:04.542807    3059 images.go:80] could not find officially supported version of etcd for Kubernetes v1.27.2, falling back to the nearest etcd version (3.5.7-0)
registry.aliyuncs.com/google_containers/kube-apiserver:v1.27.2
registry.aliyuncs.com/google_containers/kube-controller-manager:v1.27.2
registry.aliyuncs.com/google_containers/kube-scheduler:v1.27.2
registry.aliyuncs.com/google_containers/kube-proxy:v1.27.2
registry.aliyuncs.com/google_containers/pause:3.9
registry.aliyuncs.com/google_containers/etcd:3.5.7-0
registry.aliyuncs.com/google_containers/coredns:v1.10.1

Pull the images

# Pull the images
kubeadm config images  pull --config kubeadm-init.yaml

Output

W0520 20:23:04.712897    1857 images.go:80] could not find officially supported version of etcd for Kubernetes v1.27.2, falling back to the nearest etcd version (3.5.7-0)
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-apiserver:v1.27.2
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-controller-manager:v1.27.2
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-scheduler:v1.27.2
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-proxy:v1.27.2
[config/images] Pulled registry.aliyuncs.com/google_containers/pause:3.9
[config/images] Pulled registry.aliyuncs.com/google_containers/etcd:3.5.7-0
[config/images] Pulled registry.aliyuncs.com/google_containers/coredns:v1.10.1
# Check the images
crictl images

Output

IMAGE                                                             TAG                 IMAGE ID            SIZE
registry.aliyuncs.com/google_containers/coredns                   v1.10.1             ead0a4a53df89       16.2MB
registry.aliyuncs.com/google_containers/etcd                      3.5.7-0             86b6af7dd652c       102MB
registry.aliyuncs.com/google_containers/kube-apiserver            v1.27.2             c5b13e4f7806d       33.4MB
registry.aliyuncs.com/google_containers/kube-controller-manager   v1.27.2             ac2b7465ebba9       31MB
registry.aliyuncs.com/google_containers/kube-proxy                v1.27.2             b8aa50768fd67       23.9MB
registry.aliyuncs.com/google_containers/kube-scheduler            v1.27.2             89e70da428d29       18.2MB
registry.aliyuncs.com/google_containers/pause                     3.9                 e6f1816883972       322kB

Change containerd's sandbox_image

# Replace the pause image with the Aliyun mirror's
old_sandbox_image=`grep sandbox_image /etc/containerd/config.toml`
sed -i 's#'"${old_sandbox_image}"'#sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"#g' /etc/containerd/config.toml

systemctl restart containerd
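
Confirm that the replacement took effect:

# Should now show the Aliyun pause image
grep sandbox_image /etc/containerd/config.toml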

Run kubeadm init to initialize the cluster


kubeadm init --config kubeadm-init.yaml --upload-certs

The following output indicates success

... (omitted)

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:

  kubeadm join bgp-k8s-api-server.tiga.cc:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:b0b51ba58c2d65463541b7dcbf63c78e95b1d9f1b349a698c4d00c54602569cc \
    --control-plane --certificate-key a803128a1c14b8a64ad8146d19ca745c922fcafb56733595e632032b56bab198

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join bgp-k8s-api-server.tiga.cc:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:b0b51ba58c2d65463541b7dcbf63c78e95b1d9f1b349a698c4d00c54602569cc 

Configure kubeconfig

mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config

6.5 Join the remaining nodes to the cluster

Join the control-plane nodes

Remember to apply the containerd sandbox_image change on these nodes as well.

kubeadm join bgp-k8s-api-server.tiga.cc:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:b0b51ba58c2d65463541b7dcbf63c78e95b1d9f1b349a698c4d00c54602569cc \
--control-plane --certificate-key a803128a1c14b8a64ad8146d19ca745c922fcafb56733595e632032b56bab198

Output

... (omitted)

This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.

Confirm the nodes have joined the cluster

kubectl get nodes

Output

NAME                 STATUS     ROLES           AGE     VERSION
bgp-k8s-01.tiga.cc   NotReady   control-plane   4m57s   v1.27.2
bgp-k8s-02.tiga.cc   NotReady   control-plane   38s     v1.27.2

If the following error appears, the certificate key has expired and a new certificate-key needs to be generated

[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
error execution phase control-plane-prepare/download-certs: error downloading certs: error downloading the secret: Secret "kubeadm-certs" was not found in the "kube-system" Namespace. This Secret might have expired. Please, run `kubeadm init phase upload-certs --upload-certs` on a control plane to generate a new one
To see the stack trace of this error execute with --v=5 or higher

If the certificate-key was not saved, it can be regenerated with

kubeadm init phase upload-certs --upload-certs --config kubeadm-init.yaml

Output

a803128a1c14b8a64ad8146d19ca745c922fcafb56733595e632032b56bab198

Get the discovery-token-ca-cert-hash

openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'

Output

b0b51ba58c2d65463541b7dcbf63c78e95b1d9f1b349a698c4d00c54602569cc

Get a token

kubeadm token create
kubeadm token list
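
As a convenience, kubeadm can also print a ready-to-use worker join command together with a fresh token:

# Prints a complete "kubeadm join ..." command for worker nodes
kubeadm token create --print-join-command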

Join the worker nodes

Remember to apply the containerd sandbox_image change on these nodes as well.

kubeadm join bgp-k8s-api-server.tiga.cc:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:b0b51ba58c2d65463541b7dcbf63c78e95b1d9f1b349a698c4d00c54602569cc

7. Deploy Calico

7.1 Install Calico

Official documentation: https://docs.tigera.io/calico/latest/getting-started/kubernetes/quickstart

7.1.1 Configure NetworkManager

Configure NetworkManager so that it does not manage the routing tables of interfaces in the default network namespace, which would interfere with the Calico agent's ability to route correctly.

echo '[keyfile]' >> /etc/NetworkManager/conf.d/calico.conf
echo 'unmanaged-devices=interface-name:cali*;interface-name:tunl*;interface-name:vxlan.calico;interface-name:vxlan-v6.calico;interface-name:wireguard.cali;interface-name:wg-v6.cali' >> /etc/NetworkManager/conf.d/calico.conf
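
The new configuration is typically only picked up after NetworkManager reloads; reloading (or restarting) the service should be harmless here:

systemctl reload NetworkManager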

7.1.2 Install Calico with the Tigera operator

wget https://raw.githubusercontent.com/projectcalico/calico/v3.25.1/manifests/tigera-operator.yaml -O tigera-operator.yaml
wget https://raw.githubusercontent.com/projectcalico/calico/v3.25.1/manifests/custom-resources.yaml -O custom-resources.yaml
# Install the Tigera operator and its CRDs
kubectl create -f tigera-operator.yaml

# Change Calico's default pod CIDR to the 10.112.0.0/12 defined in kubeadm-init.yaml in section 6.3
sed -i 's#192.168.0.0/16#10.112.0.0/12#g' custom-resources.yaml

# Create the Calico custom resources
kubectl create -f custom-resources.yaml

Because the Tigera operator manifest is very large, kubectl apply may exceed the request size limit; use kubectl create or kubectl replace instead.

7.1.3 Remove the taints from the control plane nodes

kubectl taint nodes --all node-role.kubernetes.io/control-plane-
kubectl taint nodes --all node-role.kubernetes.io/master-

Output

node/bgp-k8s-01.tiga.cc untainted
node/bgp-k8s-02.tiga.cc untainted
node/bgp-k8s-03.tiga.cc untainted
taint "node-role.kubernetes.io/control-plane" not found
taint "node-role.kubernetes.io/control-plane" not found
taint "node-role.kubernetes.io/control-plane" not found
taint "node-role.kubernetes.io/control-plane" not found
taint "node-role.kubernetes.io/control-plane" not found
taint "node-role.kubernetes.io/master" not found
taint "node-role.kubernetes.io/master" not found
taint "node-role.kubernetes.io/master" not found
taint "node-role.kubernetes.io/master" not found
taint "node-role.kubernetes.io/master" not found
taint "node-role.kubernetes.io/master" not found
taint "node-role.kubernetes.io/master" not found
taint "node-role.kubernetes.io/master" not found

7.1.4 Confirm that all pods in calico-system are running

kubectl get pods -n calico-system -o wide

Output, refreshed every couple of seconds until all pods are Running

NAME                                       READY   STATUS    RESTARTS   AGE    IP             NODE                 NOMINATED NODE   READINESS GATES
calico-kube-controllers-789dc4c76b-4k84l   1/1     Running   0          26m    10.116.201.1   bgp-k8s-06.tiga.cc   <none>           <none>
calico-node-265n7                          1/1     Running   0          26m    192.168.3.58   bgp-k8s-08.tiga.cc   <none>           <none>
calico-node-5njl9                          1/1     Running   0          26m    192.168.3.55   bgp-k8s-05.tiga.cc   <none>           <none>
calico-node-7lh2k                          1/1     Running   0          4m5s   192.168.3.52   bgp-k8s-02.tiga.cc   <none>           <none>
calico-node-cps8m                          1/1     Running   0          26m    192.168.3.56   bgp-k8s-06.tiga.cc   <none>           <none>
calico-node-ddtwj                          1/1     Running   0          26m    192.168.3.53   bgp-k8s-03.tiga.cc   <none>           <none>
calico-node-k5h59                          1/1     Running   0          26m    192.168.3.54   bgp-k8s-04.tiga.cc   <none>           <none>
calico-node-p8p9v                          1/1     Running   0          26m    192.168.3.51   bgp-k8s-01.tiga.cc   <none>           <none>
calico-node-tv49n                          1/1     Running   0          26m    192.168.3.57   bgp-k8s-07.tiga.cc   <none>           <none>
calico-typha-5b88557fb6-qtwmt              1/1     Running   0          26m    192.168.3.57   bgp-k8s-07.tiga.cc   <none>           <none>
calico-typha-5b88557fb6-wlwkj              1/1     Running   0          26m    192.168.3.55   bgp-k8s-05.tiga.cc   <none>           <none>
calico-typha-5b88557fb6-z5wzj              1/1     Running   0          26m    192.168.3.56   bgp-k8s-06.tiga.cc   <none>           <none>
csi-node-driver-2qlph                      2/2     Running   0          26m    10.113.110.3   bgp-k8s-04.tiga.cc   <none>           <none>
csi-node-driver-4t8fx                      2/2     Running   0          13m    10.123.163.1   bgp-k8s-02.tiga.cc   <none>           <none>
csi-node-driver-8txs7                      2/2     Running   0          26m    10.118.139.1   bgp-k8s-01.tiga.cc   <none>           <none>
csi-node-driver-h6jtk                      2/2     Running   0          26m    10.116.201.2   bgp-k8s-06.tiga.cc   <none>           <none>
csi-node-driver-sl64h                      2/2     Running   0          26m    10.120.222.1   bgp-k8s-03.tiga.cc   <none>           <none>
csi-node-driver-x6zt5                      2/2     Running   0          26m    10.119.93.1    bgp-k8s-05.tiga.cc   <none>           <none>
csi-node-driver-x9kqg                      2/2     Running   0          26m    10.120.202.1   bgp-k8s-07.tiga.cc   <none>           <none>
csi-node-driver-z4p72                      2/2     Running   0          26m    10.127.183.1   bgp-k8s-08.tiga.cc   <none>           <none>

Because image pulls are constrained by the network, it can take quite a while for all pods to reach the Running state.

7.1.5 Confirm cluster status and networking

kubectl get nodes -o wide

Output

NAME                 STATUS   ROLES           AGE   VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                      KERNEL-VERSION                 CONTAINER-RUNTIME
bgp-k8s-01.tiga.cc   Ready    control-plane   47m   v1.27.2   192.168.3.51   <none>        Rocky Linux 9.2 (Blue Onyx)   5.14.0-284.11.1.el9_2.x86_64   containerd://1.6.21
bgp-k8s-02.tiga.cc   Ready    control-plane   46m   v1.27.2   192.168.3.52   <none>        Rocky Linux 9.2 (Blue Onyx)   5.14.0-284.11.1.el9_2.x86_64   containerd://1.6.21
bgp-k8s-03.tiga.cc   Ready    control-plane   44m   v1.27.2   192.168.3.53   <none>        Rocky Linux 9.2 (Blue Onyx)   5.14.0-284.11.1.el9_2.x86_64   containerd://1.6.21
bgp-k8s-04.tiga.cc   Ready    <none>          43m   v1.27.2   192.168.3.54   <none>        Rocky Linux 9.2 (Blue Onyx)   5.14.0-284.11.1.el9_2.x86_64   containerd://1.6.21
bgp-k8s-05.tiga.cc   Ready    <none>          43m   v1.27.2   192.168.3.55   <none>        Rocky Linux 9.2 (Blue Onyx)   5.14.0-284.11.1.el9_2.x86_64   containerd://1.6.21
bgp-k8s-06.tiga.cc   Ready    <none>          43m   v1.27.2   192.168.3.56   <none>        Rocky Linux 9.2 (Blue Onyx)   5.14.0-284.11.1.el9_2.x86_64   containerd://1.6.21
bgp-k8s-07.tiga.cc   Ready    <none>          43m   v1.27.2   192.168.3.57   <none>        Rocky Linux 9.2 (Blue Onyx)   5.14.0-284.11.1.el9_2.x86_64   containerd://1.6.21
bgp-k8s-08.tiga.cc   Ready    <none>          43m   v1.27.2   192.168.3.58   <none>        Rocky Linux 9.2 (Blue Onyx)   5.14.0-284.11.1.el9_2.x86_64   containerd://1.6.21

7.2 Install calicoctl

Official documentation: https://docs.tigera.io/calico/latest/operations/calicoctl/install

For convenience, install calicoctl as a kubectl plugin so it can be invoked directly as kubectl calico.

curl -L https://github.com/projectcalico/calico/releases/download/v3.24.6/calicoctl-linux-amd64 -o /usr/sbin/kubectl-calico
chmod +x /usr/sbin/kubectl-calico

# Verify that the plugin works
kubectl calico node status

Output

Calico process is running.

IPv4 BGP status
+--------------+-------------------+-------+----------+-------------+
| PEER ADDRESS |     PEER TYPE     | STATE |  SINCE   |    INFO     |
+--------------+-------------------+-------+----------+-------------+
| 192.168.3.53 | node-to-node mesh | up    | 14:26:17 | Established |
| 192.168.3.54 | node-to-node mesh | up    | 14:26:17 | Established |
| 192.168.3.55 | node-to-node mesh | up    | 14:26:17 | Established |
| 192.168.3.56 | node-to-node mesh | up    | 14:26:17 | Established |
| 192.168.3.57 | node-to-node mesh | up    | 14:26:17 | Established |
| 192.168.3.58 | node-to-node mesh | up    | 14:26:17 | Established |
| 192.168.3.52 | node-to-node mesh | up    | 14:42:11 | Established |
+--------------+-------------------+-------+----------+-------------+

8. Prepare the BGP environment

8.1 About BIRD

BIRD (BIRD Internet Routing Daemon) is open-source routing software that implements several routing protocols, such as BGP, OSPF, and RIPv2. It runs as a daemon on Linux and Unix systems and controls how traffic is routed between networks.

Official documentation: https://bird.network.cz/

To keep the learning curve low, this article does not configure BGP on a physical router; instead, BIRD is used to simulate one.

8.2 Install BIRD

Log in to 192.168.3.61 and install BIRD

yum install -y epel-release
yum install -y bird-2.13

8.3 BIRD configuration file

The modified /etc/bird.conf is as follows

router id 192.168.3.61;

protocol kernel {
    scan time 60;
    ipv4 {
      import none;
      export all;
    };
    merge paths yes;
}

protocol device {
    scan time 60;
}

protocol bgp neighbor1 {
    local as 65001;  # local AS number 65001
    neighbor 192.168.3.51 port 178 as 64512; # peer at 192.168.3.51:178, peer AS number 64512
    source address 192.168.3.61;
    enable route refresh off;
    ipv4 {
      import all;
      export all;
    };
}

Restart bird

systemctl restart bird 
systemctl status bird 

9. Configure Calico BGP

9.1 Calico BGP overview

Official documentation: https://docs.tigera.io/calico/latest/networking/configuring/bgp#bgp

Calico supports three common BGP topologies:

  • Full-mesh: With BGP enabled, Calico's default behavior is to create a full mesh of internal BGP (iBGP) connections in which every node peers with every other node. A full mesh works well for small and medium deployments of 100 nodes or fewer, but becomes noticeably less efficient at larger scales, where route reflectors are recommended.

  • Route reflectors: To build larger iBGP clusters, BGP route reflectors can be used to reduce the number of BGP peers on each node. In this model some nodes act as route reflectors and are configured in a full mesh with one another, while the remaining nodes peer with a subset of those route reflectors (typically two for redundancy), greatly reducing the total number of BGP peering connections compared with a full mesh.

  • Top of Rack (ToR): In on-premises deployments, Calico can be configured to peer directly with the physical network infrastructure. Typically this means disabling Calico's default full-mesh behavior and instead peering Calico with the L3 ToR routers.

This article uses the ToR approach.

9.2 bgp-configuration.yaml

apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  logSeverityScreen: Info
  nodeToNodeMeshEnabled: false   # disable the default full mesh; ToR peering is used instead
  asNumber: 64512                # cluster AS number, matching "as 64512" in the BIRD neighbor1 config
  serviceClusterIPs:
    - cidr: 10.96.0.0/12         # advertise the service CIDR defined in kubeadm-init.yaml
  serviceExternalIPs:
    - cidr: 192.168.3.51/32
  listenPort: 178                # BGP listen port, matching "port 178" in the BIRD neighbor1 config
  bindMode: NodeIP
  communities:
    - name: bgp-large-community
      value: 64512:300:100
  prefixAdvertisements:
    - cidr: 172.16.2.0/24
      communities:
        - bgp-large-community
        - 64512:120

9.3 bgp-peer.yaml

apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: my-global-peer
spec:
  peerIP: 192.168.3.61
  asNumber: 65001

9.4 Apply the BGP configuration

kubectl create -f bgp-configuration.yaml
kubectl create -f bgp-peer.yaml

9.5 Check BGP status

kubectl calico node status

Output

Calico process is running.

IPv4 BGP status
+--------------+-----------+-------+----------+-------------+
| PEER ADDRESS | PEER TYPE | STATE |  SINCE   |    INFO     |
+--------------+-----------+-------+----------+-------------+
| 192.168.3.61 | global    | up    | 08:48:36 | Established |
+--------------+-----------+-------+----------+-------------+

IPv6 BGP status
No IPv6 peers found.

9.6 Check on the BIRD server

Log in to 192.168.3.61

ip route

Output

default via 192.168.3.1 dev enp1s0 proto static metric 100 
10.96.0.0/12 via 192.168.3.51 dev enp1s0 proto bird metric 32 
10.113.110.0/24 via 192.168.3.51 dev enp1s0 proto bird metric 32 
10.116.201.0/24 via 192.168.3.51 dev enp1s0 proto bird metric 32 
10.118.139.0/24 via 192.168.3.51 dev enp1s0 proto bird metric 32 
10.119.93.0/24 via 192.168.3.51 dev enp1s0 proto bird metric 32 
10.120.202.0/24 via 192.168.3.51 dev enp1s0 proto bird metric 32 
10.120.222.0/24 via 192.168.3.51 dev enp1s0 proto bird metric 32 
10.123.163.0/24 via 192.168.3.51 dev enp1s0 proto bird metric 32 
10.127.183.0/24 via 192.168.3.51 dev enp1s0 proto bird metric 32 
192.168.3.0/24 dev enp1s0 proto kernel scope link src 192.168.3.61 metric 100 
192.168.3.51 via 192.168.3.51 dev enp1s0 proto bird metric 32 

The route for 10.96.0.0/12 has been learned, with the next hop pointing to 192.168.3.51.

Now check the BGP peers

birdc  show protocols
# birdc show protocols  all neighbor1

Output

0001 BIRD 2.13 ready.
2002-Name       Proto      Table      State  Since         Info
1002-kernel1    Kernel     master4    up     12:04:16.556  
     device1    Device     ---        up     12:04:16.556  
     neighbor1  BGP        ---        up     16:48:36.567  Established  
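
The routes learned over the Calico session can also be inspected from inside BIRD itself:

# Show the routes received via the neighbor1 session
birdc show route protocol neighbor1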

10. Deploy the LoadBalancer

10.1 OpenELB overview

OpenELB is an open-source, cloud-native load balancer implementation that allows services of type LoadBalancer to expose applications in Kubernetes environments running on bare metal, at the edge, or in virtualized infrastructure.

The OpenELB project was originally started by the KubeSphere community; it has since joined the CNCF as a sandbox project and is maintained and supported by the OpenELB open-source community.

Official introduction: https://github.com/openelb/openelb/blob/master/README_zh.md

10.2 Deploy OpenELB

Official documentation: https://openelb.io/docs/getting-started/usage/use-openelb-in-bgp-mode/


10.2.1 Install OpenELB

wget https://raw.githubusercontent.com/openelb/openelb/v0.5.1/deploy/openelb.yaml

# Replace the image registry with a mirror reachable from China
sed -i 's#k8s.gcr.io#k8s.dockerproxy.com#g' openelb.yaml

kubectl apply -f openelb.yaml
kubectl get po -n openelb-system

Output

NAME                              READY   STATUS              RESTARTS   AGE     IP             NODE                 NOMINATED NODE   READINESS GATES
openelb-admission-create-5mxch    0/1     Completed           0          30m     10.120.202.3   bgp-k8s-07.tiga.cc   <none>           <none>
openelb-admission-patch-v9t42     0/1     Completed           2          30m     10.127.183.4   bgp-k8s-08.tiga.cc   <none>           <none>
openelb-keepalive-vip-hbxt9       1/1     Running             0          28m     192.168.3.58   bgp-k8s-08.tiga.cc   <none>           <none>
openelb-keepalive-vip-hxmst       1/1     Running             0          28m     192.168.3.57   bgp-k8s-07.tiga.cc   <none>           <none>
openelb-keepalive-vip-kk4gz       1/1     Running             0          28m     192.168.3.51   bgp-k8s-01.tiga.cc   <none>           <none>
openelb-keepalive-vip-l8fj4       1/1     Running             0          28m     192.168.3.56   bgp-k8s-06.tiga.cc   <none>           <none>
openelb-keepalive-vip-vnpt9       1/1     Running             0          28m     192.168.3.53   bgp-k8s-03.tiga.cc   <none>           <none>
openelb-keepalive-vip-wzn7n       1/1     Running             0          28m     192.168.3.52   bgp-k8s-02.tiga.cc   <none>           <none>
openelb-keepalive-vip-xhcl8       1/1     Running             0          28m     192.168.3.54   bgp-k8s-04.tiga.cc   <none>           <none>
openelb-keepalive-vip-zf8b2       1/1     Running             0          28m     192.168.3.55   bgp-k8s-05.tiga.cc   <none>           <none>
openelb-manager-cc779c856-54rlp   1/1     Running             0          30m     192.168.3.55   bgp-k8s-05.tiga.cc   <none>           <none>

Note that openelb-admission-create and openelb-admission-patch are expected to be in the Completed state at this point.

10.2.2 Create the BgpConf object required by OpenELB

The BgpConf object configures the local (Kubernetes cluster) BGP properties in OpenELB.

Create the file openelb-bgp-conf.yaml

apiVersion: network.kubesphere.io/v1alpha2
kind: BgpConf
metadata:
  name: default
spec:
  as: 64513   # note: must differ from the AS number configured for Calico
  listenPort: 179  # note: must differ from the port configured for Calico
  routerId: 192.168.3.55  # IP of the node where openelb-manager is running

kubectl apply -f openelb-bgp-conf.yaml

10.2.3 Create the BgpPeer object required by OpenELB

The BgpPeer object configures the peer's (the BIRD machine's) BGP properties in OpenELB.

Create the file openelb-bgp-peer.yaml

apiVersion: network.kubesphere.io/v1alpha2
kind: BgpPeer
metadata:
  name: bgp-peer
spec:
  conf:
    peerAs: 65001
    neighborAddress: 192.168.3.61

kubectl apply -f openelb-bgp-peer.yaml

10.2.4 Create the Eip object required by OpenELB

The Eip object serves as OpenELB's IP address pool.

Create the file openelb-bgp-eip.yaml

apiVersion: network.kubesphere.io/v1alpha2
kind: Eip
metadata:
  name: bgp-eip
spec:
  address: 172.22.0.2-172.22.0.10

kubectl apply -f openelb-bgp-eip.yaml

Check the Eip

kubectl get eip

Output

NAME      CIDR                     USAGE   TOTAL
bgp-eip   172.22.0.2-172.22.0.10           9

10.2.5 Create test pods for OpenELB

The following creates a Deployment with two pods from the luksa/kubia image. Each pod responds to external requests with its own pod name.

Create the file openelb-bgp-deploy.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: bgp-openelb
spec:
  replicas: 2
  selector:
    matchLabels:
      app: bgp-openelb
  template:
    metadata:
      labels:
        app: bgp-openelb
    spec:
      containers:
        - image: luksa/kubia
          name: kubia
          ports:
            - containerPort: 8080

kubectl apply -f openelb-bgp-deploy.yaml
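
Confirm that both replicas are running before exposing them:

kubectl get pods -l app=bgp-openelb -o wide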

10.2.6 Create a test Service for OpenELB

Create the file openelb-bgp-svc.yaml

kind: Service
apiVersion: v1
metadata:
  name: bgp-svc
  annotations:
    lb.kubesphere.io/v1alpha1: openelb
    protocol.openelb.kubesphere.io/v1alpha1: bgp
    eip.openelb.kubesphere.io/v1alpha2: bgp-eip
spec:
  selector:
    app: bgp-openelb
  type: LoadBalancer
  ports:
    - name: http
      port: 80
      targetPort: 8080
  externalTrafficPolicy: Cluster

kubectl apply -f openelb-bgp-svc.yaml

10.2.7 Create the peer on the BIRD server

  1. Modify /etc/bird.conf to add the OpenELB peer

router id 192.168.3.61;

protocol kernel {
    scan time 60;
    ipv4 {
      import none;
      export all;
    };
    merge paths yes;
}

protocol device {
    scan time 60;
}

protocol bgp neighbor1 {
    local as 65001;  # local AS number 65001
    neighbor 192.168.3.51 port 178 as 64512; # peer at 192.168.3.51:178, peer AS number 64512
    source address 192.168.3.61;
    enable route refresh off;
    ipv4 {
      import all;
      export all;
    };
}

protocol bgp neighbor2 {
    local as 65001;  # local AS number 65001
    neighbor 192.168.3.55 port 179 as 64513; # why 192.168.3.55? In this environment the openelb-manager pod in the openelb-system namespace was scheduled to 192.168.3.55; it can be scheduled to any node
    source address 192.168.3.61;
    enable route refresh off;
    ipv4 {
      import all;
      export all;
    };
}

  2. Reload the bird configuration
systemctl reload bird
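
Then check that the new neighbor2 session with OpenELB reaches the Established state:

birdc show protocols
# For session details:
# birdc show protocols all neighbor2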

10.3 Verify OpenELB

  1. Check the Service
kubectl get svc

Output

NAME         TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
bgp-svc      LoadBalancer   10.99.222.207   172.22.0.2    80:30502/TCP   60m
kubernetes   ClusterIP      10.96.0.1       <none>        443/TCP        24h

  2. Log in to the BIRD server 192.168.3.61 and check the routing table
ip route | grep 172.22

Output

172.22.0.2 via 192.168.3.58 dev enp1s0 proto bird metric 32

The BIRD server now has a route for 172.22.0.2 pointing to 192.168.3.58.

  3. Log in to the BIRD server 192.168.3.61 and test with curl
curl 172.22.0.2

Output

You've hit bgp-openelb-769cf5cbc8-2m6lp
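
Repeating the request several times should show the responses alternating between the two pods, confirming that traffic is balanced across the Deployment (the pod name suffixes will differ in your environment):

for i in $(seq 1 6); do curl -s 172.22.0.2; done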