k8s HA Cluster


Reference

K8s HA Cluster Architecture and Deployment


Architecture Highlights

Cluster Datastore

Use an internal (stacked) etcd cluster

Apiserver High Availability

Expose the apiserver to the worker nodes through a VIP
The VIP can be implemented with keepalived + haproxy or keepalived + nginx

This test uses a keepalived + haproxy setup

Implementation

Deploy the etcd cluster

Use an internal (stacked) etcd cluster

That is, every k8s master node runs its own etcd service and together they form a cluster. There is no need to install and deploy an etcd cluster manually: etcd runs as pods, and Kubernetes (kubeadm) automatically assembles them into a cluster
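
Once more than one control-plane node has joined, the stacked etcd members can be listed from any master. A minimal sketch, assuming the kubeadm defaults for the etcd pod name (etcd-k8s1) and certificate paths:

kubectl -n kube-system get pods -l component=etcd -o wide
# pod name follows the kubeadm convention etcd-<node-name>; adjust for your node
kubectl -n kube-system exec etcd-k8s1 -- etcdctl \
    --endpoints=https://127.0.0.1:2379 \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key \
    member list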

Apiserver High Availability

Software Overview

keepalived

Keepalived is a high-availability tool based on the VRRP protocol. A Keepalived setup has one master server and one or more backup servers running the same service configuration and serving clients through a single VIP address; when the master server fails, the VIP automatically floats to a backup server
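
To see which node currently holds the VIP, check the VRRP interface on each node (ens160 and 10.203.1.85 as used in this test):

ip addr show ens160 | grep 10.203.1.85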

Nginx

A web service that listens on port 80 by default

Haproxy

Configure the Services

keepalived + nginx architecture


Nginx Configuration

Install the package on all nodes

apt install -y nginx

Rewrite the html file on all nodes. This html file is the content nginx serves to clients; to make the test easy to follow, the three nodes should each be given different content, and here the node hostname is written

echo k8s1 > /var/www/html/index.nginx-debian.html 

Check the nginx service

root@k8s2:/etc/keepalived# curl 10.203.1.82:80 
k8s1

Keepalived Configuration

Install the package on all nodes

apt install -y keepalived

Write the configuration file. The Master node configuration is shown here; on the backup nodes, router_id, state and priority should be adjusted

root@k8s1:~# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived
 
global_defs {
   router_id k8s1   # should be unique within a network
}
 
vrrp_script chk_nginx {
    script "/etc/keepalived/nginx_check.sh" # script that periodically checks whether nginx is running
    interval 2   # run the script every 2 seconds
    weight -5    # priority change caused by the check result: on failure (non-zero exit) the priority drops by 5
    fall 2       # only after 2 consecutive failures is the check considered failed; the priority is then lowered by weight (1-255)
    rise 1       # 1 successful check counts as recovered, but the priority is not changed
}
 
 
 
vrrp_instance VI_1 {
    # role of this keepalived instance; the node set here is not necessarily the final MASTER,
    # the actual role is adjusted by priority; the other node is BACKUP
    state MASTER
    interface ens160        # network interface used for VRRP traffic
    virtual_router_id 200  # virtual router id (1-255), must be identical on master and backup
    # mcast_src_ip 192.168.79.191  #
    priority 100  # priority: the higher the number, the higher the priority; MASTER must be higher than BACKUP
    nopreempt
    advert_int 1   # interval, in seconds, between synchronization checks (VRRP advertisements) between MASTER and BACKUP
    authentication {
        auth_type PASS
        auth_pass 2222
    }
    # checks to track. Note: this block must not be placed immediately after the vrrp_script block
    # (a pitfall hit during testing), otherwise the nginx check does not take effect!
    track_script {
        chk_nginx    # references the VRRP script, i.e. the name defined in the vrrp_script section.
                     # it is run periodically to adjust the priority and eventually trigger a failover.
    }
 
    virtual_ipaddress { # VRRP HA virtual address; if there are multiple VIPs, add one per line
        10.203.1.85
    }
}

Write the nginx_check.sh script on all nodes. The script checks for the nginx process; if the process is not there, it tries to start nginx once, and if that fails it kills keepalived

#!/bin/bash
counter=`ps -C nginx --no-heading|wc -l`
echo "$counter"
if [ "${counter}" = 0 ]; then
    /etc/init.d/nginx start
    sleep 2
    counter=`ps -C nginx --no-heading|wc -l`
    if [ "${counter}" = 0 ]; then
        /etc/init.d/keepalived stop
    fi
fi

Make the nginx_check.sh script executable

chmod +x /etc/keepalived/nginx_check.sh

Start the keepalived service

systemctl daemon-reload
service keepalived start

Test

Access 10.203.1.85

root@k8smaster:~# curl 10.203.1.85      
k8s1

Manually stop the nginx service on the Master node

/etc/init.d/nginx stop

Access 10.203.1.85 again. Because the nginx_check.sh script restarts the nginx service, the master is still k8s1

root@k8smaster:~# curl 10.203.1.85      
k8s1

Shut down the k8s1 node and access 10.203.1.85 again; the VIP has migrated to the k8s2 node

root@k8smaster:~# curl 10.203.1.85      
k8s2

Start the k8s1 node and access 10.203.1.85 again; the VIP returns to the k8s1 node

root@k8smaster:~# curl 10.203.1.85      
k8s1

keepalived + haproxy architecture

Workflow

1. The master nodes receive commands through the apiserver.

2. haproxy has two relevant sections:
frontend:
   bind :8080
backend
    master1:apiserver
    master2:apiserver
The goal is to forward traffic arriving on the frontend port to the apiservers (this test uses frontend port 8443).

3. keepalived creates the VIP and provides failover.

4. As a result, kubectl commands can be sent to VIP:frontend-port (10.203.1.85:8443 in this test) and are forwarded to an apiserver, so the whole cluster stays manageable as long as any master node is alive.

5. Worker nodes are made highly available by the Kubernetes cluster itself: for example, if a Deployment runs on a worker node and that node goes down, Kubernetes automatically runs the Deployment's pods on another worker node.
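
Once keepalived and haproxy are in place (configured in the following sections) and the cluster has been initialized, a quick sanity check that the VIP endpoint reaches an apiserver (kubeadm allows anonymous access to /healthz by default; -k skips certificate verification):

curl -k https://10.203.1.85:8443/healthz
# expected output: ok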

Keepalived Configuration

Install keepalived on all nodes

apt install -y keepalived

Write the configuration file. The Master node configuration is shown here; on the backup nodes, router_id, state and priority should be adjusted

root@k8s1:~# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived
 
global_defs {
   router_id k8s1   # should be unique within a network
}
 
vrrp_script chk_api {
    script "/etc/keepalived/check_apiserver.sh" # script that periodically checks whether the apiserver is running
    interval 2   # run the script every 2 seconds
    weight -5    # priority change caused by the check result: on failure (non-zero exit) the priority drops by 5
    fall 2       # only after 2 consecutive failures is the check considered failed; the priority is then lowered by weight (1-255)
    rise 1       # 1 successful check counts as recovered, but the priority is not changed
}
 
 
 
vrrp_instance VI_1 {
    # role of this keepalived instance; the node set here is not necessarily the final MASTER,
    # the actual role is adjusted by priority; the other node is BACKUP
    state MASTER
    interface ens160        # network interface used for VRRP traffic
    virtual_router_id 200  # virtual router id (1-255), must be identical on master and backup
    # mcast_src_ip 192.168.79.191  #
    priority 100  # priority: the higher the number, the higher the priority; MASTER must be higher than BACKUP
    nopreempt
    advert_int 1   # interval, in seconds, between synchronization checks (VRRP advertisements) between MASTER and BACKUP
    authentication {
        auth_type PASS
        auth_pass 2222
    }
    # checks to track. Note: this block must not be placed immediately after the vrrp_script block
    # (a pitfall hit during testing), otherwise the apiserver check does not take effect!
    track_script {
        chk_api    # references the VRRP script, i.e. the name defined in the vrrp_script section.
                   # it is run periodically to adjust the priority and eventually trigger a failover.
    }
 
    virtual_ipaddress { # VRRP HA virtual address; if there are multiple VIPs, add one per line
        10.203.1.85
    }
}

Write the check_apiserver.sh script on all nodes. The script checks the apiserver; if the apiserver is not reachable, it kills keepalived so that the VIP floats to another node

#!/bin/sh

errorExit() {
    echo "*** $*" 1>&2
    exit 1
}

curl --silent --max-time 2 --insecure https://localhost:6443/ -o /dev/null || errorExit "Error GET https://localhost:6443/"
if ip addr | grep -q 10.203.1.85; then
    curl --silent --max-time 2 --insecure https://10.203.1.85:6443/ -o /dev/null || errorExit "Error GET https://10.203.1.85:6443/"
fi

Make the check_apiserver.sh script executable

chmod +x /etc/keepalived/check_apiserver.sh

Start the keepalived service

systemctl daemon-reload
service keepalived start

Haproxy Configuration

Install haproxy on all nodes

apt install -y haproxy

Edit the configuration file /etc/haproxy/haproxy.cfg; the configuration is identical on all 3 nodes

# /etc/haproxy/haproxy.cfg
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    log /dev/log local0
    log /dev/log local1 notice
    daemon

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 1
    timeout http-request    10s
    timeout queue           20s
    timeout connect         5s
    timeout client          20s
    timeout server          20s
    timeout http-keep-alive 10s
    timeout check           10s

#---------------------------------------------------------------------
# apiserver frontend which proxies to the masters
#---------------------------------------------------------------------
frontend apiserver
    bind *:8443
    mode tcp
    option tcplog
    default_backend apiserver

#---------------------------------------------------------------------
# round robin balancing for apiserver
#---------------------------------------------------------------------
backend apiserver
    option httpchk GET /healthz
    http-check expect status 200
    mode tcp
    option ssl-hello-chk
    balance     roundrobin
        server k8s1 10.203.1.82:6443 check
        server k8s2 10.203.1.83:6443 check
        server k8s3 10.203.1.84:6443 check
        # [...]

Restart the service

systemctl restart haproxy

Deploy the Kubernetes HA Cluster

Install docker, kubeadm, kubectl and kubelet on all nodes

Docker

curl -fsSL https://get.docker.com | bash -s docker --mirror Aliyun

Kubernetes components

curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
apt update
apt install -y kubelet kubeadm kubectl

Master Nodes

Initialize the first master node

Run the following command on any one of the master nodes; --control-plane-endpoint 10.203.1.85:8443 is the VIP plus the haproxy frontend port

kubeadm init --image-repository registry.aliyuncs.com/google_containers --pod-network-cidr=10.244.0.0/16 --control-plane-endpoint 10.203.1.85:8443 --upload-certs

The output is as follows

root@k8s1:~# kubeadm init --image-repository registry.aliyuncs.com/google_containers --pod-network-cidr=10.244.0.0/16 --control-plane-endpoint 10.203.1.85:8443 --upload-certs
[init] Using Kubernetes version: v1.20.5
[preflight] Running pre-flight checks
        [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
        [WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.5. Latest validated version: 19.03
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.203.1.82 10.203.1.85]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s1 localhost] and IPs [10.203.1.82 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s1 localhost] and IPs [10.203.1.82 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "admin.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[apiclient] All control plane components are healthy after 79.519175 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.20" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
a30f2492d765a17e244ffc650f09ead393397f7f1d05efbe1c7525eb9c5f721b
[mark-control-plane] Marking the node k8s1 as control-plane by adding the labels "node-role.kubernetes.io/master=''" and "node-role.kubernetes.io/control-plane='' (deprecated)"
[mark-control-plane] Marking the node k8s1 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: p42ggz.1lc9jebaqoag8ca6
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:

  kubeadm join 10.203.1.85:8443 --token p42ggz.1lc9jebaqoag8ca6 \
    --discovery-token-ca-cert-hash sha256:45297952d1b812be3c4ef88bf8060f5583e7a292e414f1eb82f0aa8bdcd71a3f \
    --control-plane --certificate-key a30f2492d765a17e244ffc650f09ead393397f7f1d05efbe1c7525eb9c5f721b

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 10.203.1.85:8443 --token p42ggz.1lc9jebaqoag8ca6 \
    --discovery-token-ca-cert-hash sha256:45297952d1b812be3c4ef88bf8060f5583e7a292e414f1eb82f0aa8bdcd71a3f

Run the following command on the other two master nodes

kubeadm join 10.203.1.85:8443 --token p42ggz.1lc9jebaqoag8ca6 \
    --discovery-token-ca-cert-hash sha256:45297952d1b812be3c4ef88bf8060f5583e7a292e414f1eb82f0aa8bdcd71a3f \
    --control-plane --certificate-key a30f2492d765a17e244ffc650f09ead393397f7f1d05efbe1c7525eb9c5f721b

The output is as follows

root@k8s2:~# kubeadm join 10.203.1.85:8443 --token p42ggz.1lc9jebaqoag8ca6 \
>     --discovery-token-ca-cert-hash sha256:45297952d1b812be3c4ef88bf8060f5583e7a292e414f1eb82f0aa8bdcd71a3f \
>     --control-plane --certificate-key a30f2492d765a17e244ffc650f09ead393397f7f1d05efbe1c7525eb9c5f721b
[preflight] Running pre-flight checks
        [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
        [WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.5. Latest validated version: 19.03
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s2 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.203.1.83 10.203.1.85]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s2 localhost] and IPs [10.203.1.83 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s2 localhost] and IPs [10.203.1.83 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "admin.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[check-etcd] Checking that the etcd cluster is healthy
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Creating static Pod manifest for "etcd"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[mark-control-plane] Marking the node k8s2 as control-plane by adding the labels "node-role.kubernetes.io/master=''" and "node-role.kubernetes.io/control-plane='' (deprecated)"
[mark-control-plane] Marking the node k8s2 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]

This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

        mkdir -p $HOME/.kube
        sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
        sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.

Worker Nodes

Run the following command on the two worker nodes

kubeadm join 10.203.1.85:8443 --token p42ggz.1lc9jebaqoag8ca6 \
>     --discovery-token-ca-cert-hash sha256:45297952d1b812be3c4ef88bf8060f5583e7a292e414f1eb82f0aa8bdcd71a3f

Output

root@k8s4:~# kubeadm join 10.203.1.85:8443 --token p42ggz.1lc9jebaqoag8ca6 \
>     --discovery-token-ca-cert-hash sha256:45297952d1b812be3c4ef88bf8060f5583e7a292e414f1eb82f0aa8bdcd71a3f
[preflight] Running pre-flight checks
        [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
        [WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.5. Latest validated version: 19.03
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

Configure the kubectl tool

Run the following commands on a master node

mkdir -p /root/.kube && \
cp /etc/kubernetes/admin.conf /root/.kube/config

If another machine on the LAN has the kubectl tool installed, it can be configured to manage this cluster with the following steps

mkdir -p /root/.kube

Create the config file by copying into it the contents of /etc/kubernetes/admin.conf from a Kubernetes master node
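
For example, copying it over ssh (a sketch assuming root ssh access to k8s1 at 10.203.1.82):

scp root@10.203.1.82:/etc/kubernetes/admin.conf /root/.kube/config   # requires ssh access to the master
kubectl get nodes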

Deploy the flannel network

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

Fix the ComponentStatus (cs)

On all master nodes, edit the controller-manager and scheduler manifests and comment out the default port setting (- --port=0) so that the components report as healthy

vi /etc/kubernetes/manifests/kube-controller-manager.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-controller-manager
    tier: control-plane
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-controller-manager
    - --allocate-node-cidrs=true
    - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --bind-address=127.0.0.1
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --cluster-cidr=10.244.0.0/16
    - --cluster-name=kubernetes
    - --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
    - --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
    - --controllers=*,bootstrapsigner,tokencleaner
    - --kubeconfig=/etc/kubernetes/controller-manager.conf
    - --leader-elect=true
      # - --port=0
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --root-ca-file=/etc/kubernetes/pki/ca.crt
    - --service-account-private-key-file=/etc/kubernetes/pki/sa.key
    - --service-cluster-ip-range=10.96.0.0/12
    - --use-service-account-credentials=true
    image: registry.aliyuncs.com/google_containers/kube-controller-manager:v1.20.5
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1

vi /etc/kubernetes/manifests/kube-scheduler.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-scheduler
    tier: control-plane
  name: kube-scheduler
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=true
      #- --port=0
    image: registry.aliyuncs.com/google_containers/kube-scheduler:v1.20.5
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10259
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: kube-scheduler
    resources:
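
These are static pod manifests, so the kubelet on each master picks up the change and recreates the pods automatically; a quick check that the control-plane pods came back up (tier=control-plane is the label kubeadm puts on these pods, as shown in the manifests above):

kubectl -n kube-system get pods -l tier=control-plane -o wide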

View Status

Nodes

root@k8s1:~# kubectl get node
NAME   STATUS   ROLES                  AGE     VERSION
k8s1   Ready    control-plane,master   4d19h   v1.20.5
k8s2   Ready    control-plane,master   4d19h   v1.20.5
k8s3   Ready    control-plane,master   4d19h   v1.20.5
k8s4   Ready    <none>                 4d2h    v1.20.5
k8s5   Ready    <none>                 4d2h    v1.20.5

ComponentStatus

root@k8s1:~# kubectl get cs 
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok                  
controller-manager   Healthy   ok                  
etcd-0               Healthy   {"health":"true"}   

Test

Master HA Test

Test steps

Create a resource on one master node

Any resource will do, for example a Deployment

Shut down the node

Use kubectl get to check the resource just created; once it has been created successfully, shut down this master node

Check the status

At this point, kubectl get node shows one master node in the NotReady state
The resource created earlier is still in a normal state and can still be deleted, edited, and so on

Restart the node

A short while after the node is restarted, kubectl get node shows it back in the Ready state
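
A minimal sketch of the sequence above, run from a machine whose kubectl points at the VIP (the deployment name test-ha is an arbitrary example):

kubectl create deployment test-ha --image=nginx --replicas=2
kubectl get deployment test-ha
# shut down one master (e.g. k8s1), then:
kubectl get node                      # k8s1 shows NotReady
kubectl get deployment test-ha        # still manageable through the VIP
kubectl scale deployment test-ha --replicas=3
# power k8s1 back on, then:
kubectl get node                      # k8s1 returns to Ready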

Resource HA Test

Test steps

Create a Deployment resource

Check how the resource is running

Use kubectl get all -o wide to see which worker node the pods are running on

Shut down the node

Shut down the worker node the pods were seen running on; the pods are moved to another worker node

Restart the node

After the node is restarted, the pods do not fail back; the node simply becomes Ready again
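
Again as a hedged sketch (the deployment name test-app is arbitrary):

kubectl create deployment test-app --image=nginx --replicas=2
kubectl get pods -o wide              # note the worker node, e.g. k8s4
# shut down that worker node, then watch the pods move:
kubectl get pods -o wide --watch      # new pods come up on the other worker
# after restarting the node:
kubectl get nodes                     # the node returns to Ready; pods stay where they are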

Kubernetes HA Cluster & LINSTOR Test

Architecture


Description

LINSTOR Deployment

Install DRBD + LINSTOR on the worker nodes of the Kubernetes HA cluster built earlier and join them to the LINSTOR cluster; this test adds one more LINSTOR diskful node. In the Kubernetes cluster, create the LINSTOR CSI related services and create a storage class.

Test Plan

  1. Create a persistent volume claim of the LINSTOR type
  2. Create a deployment that uses this pvc
  3. Check which worker node the Pod runs on, shut down that node, and check whether the pod moves to another worker node and whether the data is preserved

Deploy the LINSTOR CSI Service

Install the software

Run the following on the two worker nodes

apt install software-properties-common
add-apt-repository ppa:linbit/linbit-drbd9-stack
apt update
apt install drbd-utils drbd-dkms lvm2
modprobe drbd
echo drbd > /etc/modules-load.d/drbd.conf
apt install linstor-controller linstor-satellite  linstor-client
# What the commands above do:
# install the software-properties-common tool, needed before the second command can be run
# add the DRBD9 PPA repository
# update the apt package index
# install DRBD9 and the related packages
# load the DRBD9 kernel module
# load DRBD9 automatically at boot
# install the LINSTOR packages

Join the worker nodes to the LINSTOR cluster

root@k8s5:~# linstor n c k8s5 10.203.1.96
SUCCESS:
Description:
    New node 'k8s5' registered.
Details:
    Node 'k8s5' UUID is: 20d41b64-f6b4-4712-88df-c151c8f00e37
SUCCESS:
Description:
    Node 'k8s5' authenticated
Details:
    Supported storage providers: [diskless, lvm, lvm_thin, file, file_thin, openflex_target]
    Supported resource layers  : [drbd, luks, cache, storage]
    Unsupported storage providers:
        ZFS: 'cat /sys/module/zfs/version' returned with exit code 1
        ZFS_THIN: 'cat /sys/module/zfs/version' returned with exit code 1
        SPDK: IO exception occured when running 'rpc.py get_spdk_version': Cannot run program "rpc.py": error=2, No such file or directory
    
    Unsupported resource layers:
        NVME: IO exception occured when running 'nvme version': Cannot run program "nvme": error=2, No such file or directory
        WRITECACHE: 'modprobe dm-writecache' returned with exit code 1
        OPENFLEX: IO exception occured when running 'nvme version': Cannot run program "nvme": error=2, No such file or directory
        
root@k8s5:~# linstor n c k8s4 10.203.1.95
SUCCESS:
Description:
    New node 'k8s4' registered.
Details:
    Node 'k8s4' UUID is: db78f129-a23d-4245-a744-534a5365925e
SUCCESS:
Description:
    Node 'k8s4' authenticated
Details:
    Supported storage providers: [diskless, lvm, lvm_thin, file, file_thin, openflex_target]
    Supported resource layers  : [drbd, luks, cache, storage]
    Unsupported storage providers:
        ZFS: 'cat /sys/module/zfs/version' returned with exit code 1
        ZFS_THIN: 'cat /sys/module/zfs/version' returned with exit code 1
        SPDK: IO exception occured when running 'rpc.py get_spdk_version': Cannot run program "rpc.py": error=2, No such file or directory
    
    Unsupported resource layers:
        NVME: IO exception occured when running 'nvme version': Cannot run program "nvme": error=2, No such file or directory
        WRITECACHE: 'modprobe dm-writecache' returned with exit code 1
        OPENFLEX: IO exception occured when running 'nvme version': Cannot run program "nvme": error=2, No such file or directory

Deploy the LINSTOR CSI services

Apply the following yaml (LINSTOR_IP in the yaml must be changed to the actual IP of the LINSTOR controller)

---
kind: StatefulSet
apiVersion: apps/v1
metadata:
  name: linstor-csi-controller
  namespace: kube-system
spec:
  serviceName: "linstor-csi"
  replicas: 1
  selector:
    matchLabels:
      app: linstor-csi-controller
      role: linstor-csi
  template:
    metadata:
      labels:
        app: linstor-csi-controller
        role: linstor-csi
    spec:
      priorityClassName: system-cluster-critical
      serviceAccount: linstor-csi-controller-sa
      containers:
        - name: csi-provisioner
          image: teym88/csi-provisioner:v1.5.0
          args:
            - "--csi-address=$(ADDRESS)"
            - "--v=5"
            - "--feature-gates=Topology=true"
            - "--timeout=120s"
          env:
            - name: ADDRESS
              value: /var/lib/csi/sockets/pluginproxy/csi.sock
          imagePullPolicy: "Always"
          volumeMounts:
            - name: socket-dir
              mountPath: /var/lib/csi/sockets/pluginproxy/
        - name: csi-attacher
          image: teym88/csi-attacher:v2.1.1
          args:
            - "--v=5"
            - "--csi-address=$(ADDRESS)"
            - "--timeout=120s"
          env:
            - name: ADDRESS
              value: /var/lib/csi/sockets/pluginproxy/csi.sock
          imagePullPolicy: "Always"
          volumeMounts:
            - name: socket-dir
              mountPath: /var/lib/csi/sockets/pluginproxy/
        - name: csi-resizer
          image: teym88/csi-resizer:v0.5.0
          args:
          - "--v=5"
          - "--csi-address=$(ADDRESS)"
          env:
          - name: ADDRESS
            value: /var/lib/csi/sockets/pluginproxy/csi.sock
          imagePullPolicy: "Always"
          volumeMounts:
          - mountPath: /var/lib/csi/sockets/pluginproxy/
            name: socket-dir
        - name: csi-snapshotter
          image: teym88/csi-snapshotter:v2.0.1
          args:
            - "-csi-address=$(ADDRESS)"
            - "-timeout=120s"
          env:
            - name: ADDRESS
              value: /var/lib/csi/sockets/pluginproxy/csi.sock
          imagePullPolicy: Always
          volumeMounts:
            - name: socket-dir
              mountPath: /var/lib/csi/sockets/pluginproxy/
        - name: linstor-csi-plugin
          image: teym88/piraeus-csi:v0.11.0
          args:
            - "--csi-endpoint=$(CSI_ENDPOINT)"
            - "--node=$(KUBE_NODE_NAME)"
            - "--linstor-endpoint=$(LINSTOR_IP)"
            - "--log-level=debug"
          env:
            - name: CSI_ENDPOINT
              value: unix:///var/lib/csi/sockets/pluginproxy/csi.sock
            - name: KUBE_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: LINSTOR_IP
              value: "http://10.203.1.81:3370"
          imagePullPolicy: "Always"
          volumeMounts:
            - name: socket-dir
              mountPath: /var/lib/csi/sockets/pluginproxy/
      volumes:
        - name: socket-dir
          emptyDir: {}
---

kind: ServiceAccount
apiVersion: v1
metadata:
  name: linstor-csi-controller-sa
  namespace: kube-system

---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: linstor-csi-provisioner-role
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["list", "watch", "create", "update", "patch"]
  - apiGroups: ["snapshot.storage.k8s.io"]
    resources: ["volumesnapshots"]
    verbs: ["get", "list"]
  - apiGroups: ["snapshot.storage.k8s.io"]
    resources: ["volumesnapshotcontents"]
    verbs: ["get", "list"]

---

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: linstor-csi-provisioner-binding
subjects:
  - kind: ServiceAccount
    name: linstor-csi-controller-sa
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: linstor-csi-provisioner-role
  apiGroup: rbac.authorization.k8s.io

---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: linstor-csi-attacher-role
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "update", "patch"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["csinodes"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["volumeattachments"]
    verbs: ["get", "list", "watch", "update", "patch"]

---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: linstor-csi-attacher-binding
subjects:
  - kind: ServiceAccount
    name: linstor-csi-controller-sa
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: linstor-csi-attacher-role
  apiGroup: rbac.authorization.k8s.io

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: linstor-csi-resizer-role
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "patch"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims/status"]
    verbs: ["patch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["list", "watch", "create", "update", "patch"]

---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: linstor-csi-resizer-binding
subjects:
  - kind: ServiceAccount
    name: linstor-csi-controller-sa
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: linstor-csi-resizer-role
  apiGroup: rbac.authorization.k8s.io

---

kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: linstor-csi-node
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: linstor-csi-node
      role: linstor-csi
  template:
    metadata:
      labels:
        app: linstor-csi-node
        role: linstor-csi
    spec:
      priorityClassName: system-node-critical
      serviceAccount: linstor-csi-node-sa
      containers:
        - name: csi-node-driver-registrar
          image: teym88/csi-node-driver-registrar:v1.2.0
          args:
            - "--v=5"
            - "--csi-address=$(ADDRESS)"
            - "--kubelet-registration-path=$(DRIVER_REG_SOCK_PATH)"
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "rm -rf /registration/linstor.csi.linbit.com /registration/linstor.csi.linbit.com-reg.sock"]
          env:
            - name: ADDRESS
              value: /csi/csi.sock
            - name: DRIVER_REG_SOCK_PATH
              value: /var/lib/kubelet/plugins/linstor.csi.linbit.com/csi.sock
            - name: KUBE_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          volumeMounts:
            - name: plugin-dir
              mountPath: /csi/
            - name: registration-dir
              mountPath: /registration/
        - name: linstor-csi-plugin
          image: teym88/piraeus-csi:v0.11.0
          args:
            - "--csi-endpoint=$(CSI_ENDPOINT)"
            - "--node=$(KUBE_NODE_NAME)"
            - "--linstor-endpoint=$(LINSTOR_IP)"
            - "--log-level=debug"
          env:
            - name: CSI_ENDPOINT
              value: unix:///csi/csi.sock
            - name: KUBE_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: LINSTOR_IP
              value: "http://10.203.1.81:3370"
          imagePullPolicy: "Always"
          securityContext:
            privileged: true
            capabilities:
              add: ["SYS_ADMIN"]
            allowPrivilegeEscalation: true
          volumeMounts:
            - name: plugin-dir
              mountPath: /csi
            - name: pods-mount-dir
              mountPath: /var/lib/kubelet
              mountPropagation: "Bidirectional"
            - name: device-dir
              mountPath: /dev
      volumes:
        - name: registration-dir
          hostPath:
            path: /var/lib/kubelet/plugins_registry/
            type: DirectoryOrCreate
        - name: plugin-dir
          hostPath:
            path: /var/lib/kubelet/plugins/linstor.csi.linbit.com/
            type: DirectoryOrCreate
        - name: pods-mount-dir
          hostPath:
            path: /var/lib/kubelet
            type: Directory
        - name: device-dir
          hostPath:
            path: /dev
---

apiVersion: v1
kind: ServiceAccount
metadata:
  name: linstor-csi-node-sa
  namespace: kube-system

---

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: linstor-csi-driver-registrar-role
  namespace: kube-system
rules:
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]

---

apiVersion: storage.k8s.io/v1beta1
kind: CSIDriver
metadata:
  name: linstor.csi.linbit.com
spec:
  attachRequired: true
  podInfoOnMount: true

---

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: linstor-csi-driver-registrar-binding
subjects:
  - kind: ServiceAccount
    name: linstor-csi-node-sa
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: linstor-csi-driver-registrar-role
  apiGroup: rbac.authorization.k8s.io

---

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: linstor-csi-snapshotter-role
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["list", "watch", "create", "update", "patch"]
  - apiGroups: ["snapshot.storage.k8s.io"]
    resources: ["volumesnapshotclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["snapshot.storage.k8s.io"]
    resources: ["volumesnapshotcontents"]
    verbs: ["create", "get", "list", "watch", "update", "delete"]
  - apiGroups: ["snapshot.storage.k8s.io"]
    resources: ["volumesnapshotcontents/status"]
    verbs: ["update"]
  - apiGroups: ["snapshot.storage.k8s.io"]
    resources: ["volumesnapshots"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["apiextensions.k8s.io"]
    resources: ["customresourcedefinitions"]
    verbs: ["create", "list", "watch", "delete"]
  - apiGroups: ["snapshot.storage.k8s.io"]
    resources: ["volumesnapshots/status"]
    verbs: ["update"]

Check the service status

Run the following command

kubectl get all -A | grep linstor

If the output looks like the following, the LINSTOR CSI services have been deployed successfully

root@ubuntu:~/k8sYaml/linstor# kubectl get all -A | grep linstor
kube-system            pod/linstor-csi-controller-0                     5/5     Running   0          179m
kube-system            pod/linstor-csi-node-6gl45                       2/2     Running   0          179m
kube-system            pod/linstor-csi-node-6s969                       2/2     Running   0          179m
kube-system   daemonset.apps/linstor-csi-node   2         2         2       2            2           <none>                   179m
kube-system   statefulset.apps/linstor-csi-controller   1/1     179m

Test

Create the LINSTOR storage class

Apply the following yaml (if there are two diskful nodes, autoPlace can be set to 2, and so on; storagePool is the name of a storage pool created beforehand on the diskful node)

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor
provisioner: linstor.csi.linbit.com
parameters:
  autoPlace: "1"
  storagePool: "poola"
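
The storage pool "poola" referenced above was created beforehand on the diskful node. A hedged sketch of how such an LVM-backed pool might be created (the volume group name vg0 is hypothetical; ubuntu is the diskful node in this test):

linstor storage-pool create lvm ubuntu poola vg0   # vg0 is a placeholder volume group name
linstor storage-pool list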

After applying, check the SC status

root@ubuntu:~/k8sYaml/linstor# kubectl get sc
NAME      PROVISIONER              RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
linstor   linstor.csi.linbit.com   Delete          Immediate           false                  9s

Create a persistent volume claim

Apply the following yaml

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fs-pvc5g
spec:
  storageClassName: linstor
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

Check the pvc and pv

root@ubuntu:~/k8sYaml/linstor# kubectl get pvc
NAME       STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
fs-pvc5g   Bound    pvc-64c11eac-5ac8-4f22-ab22-6cd5f87d2811   5Gi        RWO            linstor        56m
root@ubuntu:~/k8sYaml/linstor# kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM              STORAGECLASS   REASON   AGE
pvc-64c11eac-5ac8-4f22-ab22-6cd5f87d2811   5Gi        RWO            Delete           Bound    default/fs-pvc5g   linstor                 56m
root@ubuntu:~/k8sYaml/linstor# 

Create a Deployment that uses this pvc

Apply the following yaml; nginx is used as the image and the name is ng1

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ng1
spec:
  replicas: 2
  strategy:
    type: Recreate
  selector:
    matchLabels:
      run: ng1
  template:
    metadata:
      labels:
        run: ng1
    spec:
      containers:
      - name: ng1
        image: nginx
        ports:
        - containerPort: 80
        volumeMounts:
        - mountPath: /usr/share/nginx/html      
          name: linstor-volume
      volumes:
      - name: linstor-volume
        persistentVolumeClaim:
          claimName: fs-pvc5g

Check the status

root@ubuntu:~/k8sYaml/linstor# kubectl get all -A -o wide | grep ng1
default                pod/ng1-84794695b7-2pg4v                         1/1     Running   0          58m     10.244.3.17   k8s4   <none>           <none>
default                pod/ng1-84794695b7-pt4xx                         1/1     Running   0          58m     10.244.3.16   k8s4   <none>           <none>
default                service/ng1                         NodePort       10.110.200.0     <none>        80:31075/TCP             57m     run=ng1
default                deployment.apps/ng1                         2/2     2            2           58m     ng1                         nginx                                                   run=ng1
default                replicaset.apps/ng1-84794695b7                         2         2         2       58m     ng1                         nginx                                                   pod-template-hash=84794695b7,run=ng1

The Pods are already Running and are scheduled on the k8s4 node.

Check the DRBD resource status on k8s4

root@k8s4:~# drbdadm status
pvc-64c11eac-5ac8-4f22-ab22-6cd5f87d2811 role:Primary
  disk:Diskless
  ubuntu role:Secondary
    peer-disk:UpToDate

Check where this volume is mounted on the host system

root@k8s4:~# df -h | grep pvc-64c11eac-5ac8-4f22-ab22-6cd5f87d2811
/dev/drbd1007   4.9G   21M  4.6G   1% /var/lib/kubelet/pods/bc325e27-06f0-4ab6-895f-e8b66e19aa2d/volumes/kubernetes.io~csi/pvc-64c11eac-5ac8-4f22-ab22-6cd5f87d2811/mount

Go into this path and add the index.html file that the nginx service will display, with the content File from drbd res
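
For example, writing directly into the mount path shown above:

echo "File from drbd res" > /var/lib/kubelet/pods/bc325e27-06f0-4ab6-895f-e8b66e19aa2d/volumes/kubernetes.io~csi/pvc-64c11eac-5ac8-4f22-ab22-6cd5f87d2811/mount/index.html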

root@k8s4:~# cat /var/lib/kubelet/pods/bc325e27-06f0-4ab6-895f-e8b66e19aa2d/volumes/kubernetes.io~csi/pvc-64c11eac-5ac8-4f22-ab22-6cd5f87d2811/mount/index.html 
File from drbd res

Expose this Deployment as a service

root@ubuntu:~/k8sYaml/linstor# kubectl expose deployment ng1 --port=80 --type=NodePort   
service/ng1 exposed
root@ubuntu:~/k8sYaml/linstor# kubectl get svc
NAME                   TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
kubernetes             ClusterIP      10.96.0.1      <none>        443/TCP        5d19h
loadbalancer-service   LoadBalancer   10.100.64.28   <pending>     80:32416/TCP   3d23h
ng1                    NodePort       10.110.200.0   <none>        80:31075/TCP   4s

Access the service from another system on the LAN

root@k8smaster:~# curl 10.203.1.85:31075      
File from drbd res

k8s4 node failover test

Since the Pods are currently running on k8s4, shut down this node and observe whether the Pods move to the k8s5 node and whether the data stays intact

Shut down k8s4

root@k8s4:~# shutdown now

Observation

The service is temporarily interrupted at this point, which is expected: the DRBD resource is not in dual-primary mode, and it takes some time to recreate the Pods on the k8s5 node

root@k8smaster:~# curl 10.203.1.85:31075      
curl: (7) Failed to connect to 10.203.1.85 port 31075: Connection refused

The pod placement on the worker nodes changes; the Kubernetes cluster tries to create new Pods on the k8s5 node

root@ubuntu:~/k8sYaml/linstor# kubectl get pod -A -o wide | grep ng1
default                ng1-84794695b7-2pg4v                         1/1     Terminating         0          79m     10.244.3.17   k8s4   <none>           <none>
default                ng1-84794695b7-hg72j                         0/1     ContainerCreating   0          2m5s    <none>        k8s5   <none>           <none>
default                ng1-84794695b7-mghwl                         0/1     ContainerCreating   0          2m5s    <none>        k8s5   <none>           <none>
default                ng1-84794695b7-pt4xx                         1/1     Terminating         0          79m     10.244.3.16   k8s4   <none>           <none>

After watching for a while, the Pods on the k8s5 node stay stuck in the creating state; check the details

root@ubuntu:~/k8sYaml/linstor# kubectl describe pv pvc-64c11eac-5ac8-4f22-ab22-6cd5f87d2811
Name:              pvc-64c11eac-5ac8-4f22-ab22-6cd5f87d2811
Labels:            <none>
Annotations:       pv.kubernetes.io/provisioned-by: linstor.csi.linbit.com
Finalizers:        [kubernetes.io/pv-protection external-attacher/linstor-csi-linbit-com]
StorageClass:      linstor
Status:            Bound
Claim:             default/fs-pvc5g
Reclaim Policy:    Delete
Access Modes:      RWO
VolumeMode:        Filesystem
Capacity:          5Gi
Node Affinity:     
  Required Terms:  
    Term 0:        linbit.com/hostname in [ubuntu]
    Term 1:        linbit.com/sp-DfltDisklessStorPool in [true]
Message:           
Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            linstor.csi.linbit.com
    FSType:            ext4
    VolumeHandle:      pvc-64c11eac-5ac8-4f22-ab22-6cd5f87d2811
    ReadOnly:          false
    VolumeAttributes:      storage.kubernetes.io/csiProvisionerIdentity=1617072579350-8081-linstor.csi.linbit.com
Events:                <none>
root@ubuntu:~/k8sYaml/linstor# kubectl describe pod ng1-84794695b7-hg72j
Name:           ng1-84794695b7-hg72j
Namespace:      default
Priority:       0
Node:           k8s5/10.203.1.96
Start Time:     Tue, 30 Mar 2021 14:36:59 +0800
Labels:         pod-template-hash=84794695b7
                run=ng1
Annotations:    <none>
Status:         Pending
IP:             
IPs:            <none>
Controlled By:  ReplicaSet/ng1-84794695b7
Containers:
  ng1:
    Container ID:   
    Image:          nginx
    Image ID:       
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /usr/share/nginx/html from linstor-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-qbmdj (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  linstor-volume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  fs-pvc5g
    ReadOnly:   false
  default-token-qbmdj:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-qbmdj
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason              Age                From                     Message
  ----     ------              ----               ----                     -------
  Normal   Scheduled           14m                default-scheduler        Successfully assigned default/ng1-84794695b7-hg72j to k8s5
  Warning  FailedAttachVolume  14m                attachdetach-controller  Multi-Attach error for volume "pvc-64c11eac-5ac8-4f22-ab22-6cd5f87d2811" Volume is already used by pod(s) ng1-84794695b7-pt4xx, ng1-84794695b7-2pg4v
  Warning  FailedMount         5m33s              kubelet                  Unable to attach or mount volumes: unmounted volumes=[linstor-volume], unattached volumes=[default-token-qbmdj linstor-volume]: timed out waiting for the condition
  Warning  FailedMount         63s (x5 over 12m)  kubelet                  Unable to attach or mount volumes: unmounted volumes=[linstor-volume], unattached volumes=[linstor-volume default-token-qbmdj]: timed out waiting for the condition

The following error appears: the volume is already in use (mounted). Because the k8s4 node is shut down, its old Pods cannot be deleted successfully, so the new Pods remain stuck in this state

Warning  FailedAttachVolume  14m                attachdetach-controller  Multi-Attach error for volume "pvc-64c11eac-5ac8-4f22-ab22-6cd5f87d2811" Volume is already used by pod(s) ng1-84794695b7-pt4xx, ng1-84794695b7-2pg4v

k8s4 node failback test

Start the k8s4 node

Observation

Now that k8s4 has been restarted, the old pods can be deleted normally, so the pods on k8s5 go from creating to Running

root@ubuntu:~/k8sYaml/linstor# kubectl get pod -A -o wide | grep ng1    
default                ng1-84794695b7-hg72j                         1/1     Running   0          23m     10.244.4.28   k8s5   <none>           <none>
default                ng1-84794695b7-mghwl                         1/1     Running   0          23m     10.244.4.27   k8s5   <none>           <none>

The nginx service recovers and its content is unchanged

root@k8smaster:~# curl 10.203.1.85:31075      
File from drbd res

Problem

In the k8s4 node failover case, the pods cannot reach Running on the k8s5 node and the service is unavailable

Second Test

Reason

Because the previous test leaves the service unavailable, we need to find out whether there is a way for the service to recover automatically after the interruption

Check the PVC status from the previous test

yaml file

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fs-pvc5g
spec:
  storageClassName: linstor
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

Describe

root@ubuntu:~/k8sYaml/linstor# kubectl describe pvc fs-pvc5g
Name:          fs-pvc5g
Namespace:     default
StorageClass:  linstor
Status:        Bound
Volume:        pvc-64c11eac-5ac8-4f22-ab22-6cd5f87d2811
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: linstor.csi.linbit.com
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      5Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Used By:       ng1-84794695b7-hg72j
               ng1-84794695b7-mghwl
Events:        <none>

The pvc has an accessModes attribute whose value is ReadWriteOnce, which may be related to this behaviour

Look up pvc accessModes

The documentation gives the following

1 ReadWriteOnce - the volume can be mounted read-write by a single node
2 ReadOnlyMany  - the volume can be mounted read-only by many nodes
3 ReadWriteMany - the volume can be mounted read-write by many nodes

Create a new pvc with accessModes set to ReadWriteMany and test again

Create the pvc

Apply the following yaml

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fs-pvc1g
spec:
  storageClassName: linstor
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi

Create a Deployment that uses this PVC

Apply the following yaml, again with the nginx image; the name is ng2

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ng2
spec:
  replicas: 2
  strategy:
    type: Recreate
  selector:
    matchLabels:
      run: ng2
  template:
    metadata:
      labels:
        run: ng2
    spec:
      containers:
      - name: ng2
        image: nginx
        ports:
        - containerPort: 80
        volumeMounts:
        - mountPath: /usr/share/nginx/html      
          name: linstor-volume
      volumes:
      - name: linstor-volume
        persistentVolumeClaim:
          claimName: fs-pvc1g

Check the status

root@ubuntu:~/k8sYaml/linstor# kubectl get pod -o wide
NAME                                READY   STATUS    RESTARTS   AGE    IP            NODE   NOMINATED NODE   READINESS GATES
frontend                            1/1     Running   0          28h    10.244.4.10   k8s5   <none>           <none>
ng1-84794695b7-hg72j                1/1     Running   0          32m    10.244.4.28   k8s5   <none>           <none>
ng1-84794695b7-mghwl                1/1     Running   0          32m    10.244.4.27   k8s5   <none>           <none>
ng2-56fb7f7bdf-p7lwh                1/1     Running   0          35s    10.244.3.19   k8s4   <none>           <none>
ng2-56fb7f7bdf-zln7l                1/1     Running   0          35s    10.244.3.20   k8s4   <none>           <none>
nginx-deployment-59586cc59f-k69xv   1/1     Running   0          4d4h   10.244.4.7    k8s5   <none>           <none>
nginx-deployment-59586cc59f-nptx9   1/1     Running   0          4d4h   10.244.4.6    k8s5   <none>           <none>
nginx-deployment-59586cc59f-tpkfc   1/1     Running   0          4d4h   10.244.4.8    k8s5   <none>           <none>

The ng2 pods are running on the k8s4 node

Expose the service

root@ubuntu:~/k8sYaml/linstor# kubectl expose deployment ng2 --port=80 --type=NodePort       
service/ng2 exposed
root@ubuntu:~/k8sYaml/linstor# kubectl get svc
NAME                   TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
kubernetes             ClusterIP      10.96.0.1      <none>        443/TCP        5d21h
loadbalancer-service   LoadBalancer   10.100.64.28   <pending>     80:32416/TCP   4d1h
ng1                    NodePort       10.110.200.0   <none>        80:31075/TCP   109m
ng2                    NodePort       10.109.98.75   <none>        80:31012/TCP   4s
ng2-service            NodePort       10.106.127.4   <none>        80:30001/TCP   4d1h

On the k8s4 node, add an index.html file to the volume

root@k8s4:~# cd /var/lib/kubelet/pods/a9539bf6-41b9-4d3c-827d-54b30135de5d/volumes/kubernetes.io~csi/pvc-b9a41c03-d9eb-43c8-942b-a0dabba62e69/mount
root@k8s4:/var/lib/kubelet/pods/a9539bf6-41b9-4d3c-827d-54b30135de5d/volumes/kubernetes.io~csi/pvc-b9a41c03-d9eb-43c8-942b-a0dabba62e69/mount# vi index.html

The content is as follows

file from rwx drbd res

Access the service from another node

root@k8smaster:~# curl 10.203.1.85:31012
file from rwx drbd res

Shut down the k8s4 node

Check the status

The service is interrupted

root@k8smaster:~# curl 10.203.1.85:31012
curl: (7) Failed to connect to 10.203.1.85 port 31012: Connection refused

After waiting a while, the Pod status shows that although the pods on the k8s4 node are again not fully deleted, two new pods are now Running on the k8s5 node

root@ubuntu:~/k8sYaml/linstor# kubectl get pod -o wide | grep ng2
ng2-56fb7f7bdf-75wz7                1/1     Running       0          21m    10.244.4.30   k8s5   <none>           <none>
ng2-56fb7f7bdf-lk6n6                1/1     Running       0          21m    10.244.4.29   k8s5   <none>           <none>
ng2-56fb7f7bdf-p7lwh                1/1     Terminating   0          34m    10.244.3.19   k8s4   <none>           <none>
ng2-56fb7f7bdf-zln7l                1/1     Terminating   0          34m    10.244.3.20   k8s4   <none>           <none>

Access the service again; it has recovered

root@k8smaster:~# curl 10.203.1.85:31012
file from rwx drbd res

Start the k8s4 node

The pods on the k8s4 node are deleted normally; the test passes

root@ubuntu:~/k8sYaml/linstor# kubectl get pod -o wide | grep ng2
ng2-56fb7f7bdf-75wz7                1/1     Running       0          25m    10.244.4.30   k8s5   <none>           <none>
ng2-56fb7f7bdf-lk6n6                1/1     Running       0          25m    10.244.4.29   k8s5   <none>           <none>
ng2-56fb7f7bdf-p7lwh                0/1     Terminating   0          38m    <none>        k8s4   <none>           <none>
ng2-56fb7f7bdf-zln7l                0/1     Terminating   0          38m    <none>        k8s4   <none>           <none>
root@ubuntu:~/k8sYaml/linstor# kubectl get pod -o wide | grep ng2
ng2-56fb7f7bdf-75wz7                1/1     Running   0          25m    10.244.4.30   k8s5   <none>           <none>
ng2-56fb7f7bdf-lk6n6                1/1     Running   0          25m    10.244.4.29   k8s5   <none>           <none>