Introduction
kubeadm now supports setting up a highly available Kubernetes cluster, which greatly lowers the barrier to doing so. The official documentation is also quite concise and clear, but there are still a few pitfalls, so I am recording my steps below.
Architecture
The Master nodes mainly run the following three services:
- kube-apiserver: stateless; high availability is achieved by putting an LB in front of it.
- kube-controller-manager: has built-in leader election; kubeadm initializes it with --leader-elect=true by default.
- kube-scheduler: has built-in leader election; kubeadm initializes it with --leader-elect=true by default (a quick way to check the current leader is shown right after this list).
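Once the cluster is up, you can check which instance currently holds the lease; the leader is recorded as an annotation on an Endpoints object in kube-system (kube-scheduler works the same way). A hedged sketch:
kubectl -n kube-system get endpoints kube-controller-manager -o yaml | grep control-plane.alpha.kubernetes.io/leader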
Environment
- OS: Ubuntu 16.04
- Kubernetes: 1.9
- CNI Network: Calico
- Node information:
Hostname | IP | Notes |
---|---|---|
k8s-master-01 | 192.168.4.24 | |
k8s-master-02 | 192.168.4.25 | |
k8s-master-03 | 192.168.4.26 | |
k8s-node-01 | 192.168.4.27 | Kubernetes worker node |
lb-haproxy | 192.168.4.40 | LB for kube-apiserver via haproxy |
Prerequisites
Before you start, make sure you have an environment that can get around the GFW; I recommend shadowsocks with its HTTP proxy enabled.
1. LB setup
There are many open-source LB options: ipvs, nginx, haproxy, and so on. Choose carefully; this was the first pitfall I hit. I initially went with IPVS in DR mode as a layer-4 load balancer. Look at the traffic flow:
master kube-proxy -> LB -> master
# this is effectively the flow below, which is broken under IPVS DR: the master is itself one of the real servers behind IPVS and carries the route "route add -host $vip lo:0", so traffic to $vip gets delivered straight to its loopback interface
master kube-proxy -> master
I later switched to haproxy and the problem went away. I will skip the installation steps and just give my minimal config:
haproxy.cfg
########## Kube-API LB #####################
listen kube-api-lb
bind 0.0.0.0:6443
mode tcp
balance roundrobin
server k8s-master-01 192.168.4.24:6443 weight 1 maxconn 10000 check inter 10s
server k8s-master-02 192.168.4.25:6443 weight 1 maxconn 10000 check inter 10s
server k8s-master-03 192.168.4.26:6443 weight 1 maxconn 10000 check inter 10s
######## stats ############################
listen admin_stats
bind 0.0.0.0:8099
mode http
option httplog
maxconn 10
stats refresh 30s
stats uri /stats
2. Disable swap
swapoff -a
# to disable swap permanently, comment out the swap line in the file below
vi /etc/fstab
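If you prefer not to edit the file by hand, a one-liner such as the following comments out the swap entry (a sketch; double-check /etc/fstab afterwards):
sed -i '/\sswap\s/ s/^/#/' /etc/fstab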
3. Enable http_proxy
Installing kubeadm and the related packages later requires going through the proxy, so enable it in your shell:
export http_proxy="http://192.168.4.18:1080"
export https_proxy="http://192.168.4.18:1080"
export no_proxy="192.168.4.24,192.168.4.25,192.168.4.26,127.0.0.1"
4. etcd cluster
I did not enable TLS for the etcd cluster here; refer to the official documentation if you need it. Also note: if you want to look at what Kubernetes or Calico stores in etcd, use the v3 API (export ETCDCTL_API=3); otherwise etcdctl ls / will show nothing.
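For example, to peek at the keys once the cluster is running (using one of the endpoints from this setup; etcdctl get with --prefix and --keys-only is the v3 way to list keys):
export ETCDCTL_API=3
etcdctl --endpoints=http://192.168.4.24:2379 get / --prefix --keys-only | head -20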
Download the etcd binaries:
export ETCD_VERSION=v3.1.10
curl -sSL https://github.com/coreos/etcd/releases/download/${ETCD_VERSION}/etcd-${ETCD_VERSION}-linux-amd64.tar.gz | tar -xzv --strip-components=1 -C /usr/local/bin/
rm -rf etcd-$ETCD_VERSION-linux-amd64*
Configure the environment file and the systemd unit:
# node1
touch /etc/etcd.env
echo "PEER_NAME=k8s-master-01" >> /etc/etcd.env
echo "PRIVATE_IP=192.168.4.24" >> /etc/etcd.env
# node2
touch /etc/etcd.env
echo "PEER_NAME=k8s-master-02" >> /etc/etcd.env
echo "PRIVATE_IP=192.168.4.25" >> /etc/etcd.env
# node3
touch /etc/etcd.env
echo "PEER_NAME=k8s-master-03" >> /etc/etcd.env
echo "PRIVATE_IP=192.168.4.26" >> /etc/etcd.env
# quote the heredoc delimiter so that ${PEER_NAME} and ${PRIVATE_IP} are written literally
# and expanded by systemd from /etc/etcd.env at runtime, not by the current shell
cat >/etc/systemd/system/etcd.service <<'EOL'
[Unit]
Description=etcd
Documentation=https://github.com/coreos/etcd
Conflicts=etcd.service
Conflicts=etcd2.service
[Service]
EnvironmentFile=/etc/etcd.env
Type=notify
Restart=always
RestartSec=5s
LimitNOFILE=40000
TimeoutStartSec=0
ExecStart=/usr/local/bin/etcd --name ${PEER_NAME} \
--data-dir /var/lib/etcd \
--listen-client-urls http://${PRIVATE_IP}:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://${PRIVATE_IP}:2379 \
--listen-peer-urls http://${PRIVATE_IP}:2380 \
--initial-advertise-peer-urls http://${PRIVATE_IP}:2380 \
--initial-cluster k8s-master-01=http://192.168.4.24:2380,k8s-master-02=http://192.168.4.25:2380,k8s-master-03=http://192.168.4.26:2380 \
--initial-cluster-token my-etcd-token \
--initial-cluster-state new
[Install]
WantedBy=multi-user.target
EOL
Start etcd:
systemctl daemon-reload
systemctl start etcd
Check the cluster health:
etcdctl cluster-health
5. Docker Daemon
The Docker version also seems to matter:
On each of your machines, install Docker. Version v1.12 is recommended, but v1.11, v1.13 and 17.03 are known to work as well. Versions 17.06+ might work, but have not yet been tested and verified by the Kubernetes node team.
apt-get update
apt-get install -y docker.io=1.13.1-0ubuntu1~16.04.2
Configure the Docker daemon to use the HTTP proxy:
mkdir -p /etc/systemd/system/docker.service.d
vim /etc/systemd/system/docker.service.d/http-proxy.conf
[Service]
Environment="HTTP_PROXY=http://192.168.4.18:1080"
Environment="HTTPS_PROXY=http://192.168.4.18:1080"
Environment="NO_PROXY=192.168.4.0/24"
systemctl daemon-reload
systemctl restart docker
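To confirm the daemon actually picked up the proxy settings:
systemctl show --property=Environment docker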
6. Install kubeadm, kubelet and kubectl
apt-get update && apt-get install -y apt-transport-https
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb http://apt.kubernetes.io/ kubernetes-xenial main
EOF
apt-get update
apt-get install -y kubelet kubeadm kubectl
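Optionally pin the three packages so a later apt-get upgrade does not move them to a newer, untested version:
apt-mark hold kubelet kubeadm kubectl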
Set up the Master nodes
1. kubeadm init
Initialize from a config file. Note that podSubnet must match the CNI network configured later.
cat >config.yaml <<EOL
apiVersion: kubeadm.k8s.io/v1alpha1
kind: MasterConfiguration
api:
  advertiseAddress: 192.168.4.24
etcd:
  endpoints:
  - http://192.168.4.24:2379
  - http://192.168.4.25:2379
  - http://192.168.4.26:2379
networking:
  podSubnet: 10.1.0.0/16
apiServerCertSANs:
- 192.168.4.24
- 192.168.4.25
- 192.168.4.26
- 192.168.4.27
- 192.168.4.40
apiServerExtraArgs:
  endpoint-reconciler-type: lease
EOL
kubeadm init --config=config.yaml
2. Copy the certificates
After the first master has finished initializing, copy all the files under /etc/kubernetes/pki to the other two nodes.
Then run kubeadm init --config=config.yaml on the other two nodes, just as above.
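A minimal sketch of the copy, assuming root SSH access between the masters:
# run on k8s-master-01
scp -r /etc/kubernetes/pki root@192.168.4.25:/etc/kubernetes/
scp -r /etc/kubernetes/pki root@192.168.4.26:/etc/kubernetes/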
3. Configure kubectl
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
If you are the root user:
export KUBECONFIG=/etc/kubernetes/admin.conf
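At this point you can verify that kubectl can reach the cluster:
kubectl get nodes
kubectl get componentstatuses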
4. Set up the CNI network
Kubernetes requires Pods to be able to communicate across hosts. The solution I chose is Calico; until the CNI network is in place, the Master nodes stay NotReady. I use the etcd cluster above as Calico's datastore; for more details see the Calico Standard Hosted Install.
Note: here comes another pitfall. Masters initialized by kubeadm are not allowed to schedule Pods by default, while calico-node is deployed as a DaemonSet, so we first need to lift that restriction. The restriction is implemented with taints on the Master nodes; see https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/ for more.
Remove the restriction:
kubectl taint nodes k8s-master-01 node-role.kubernetes.io/master:NoSchedule-
kubectl taint nodes k8s-master-02 node-role.kubernetes.io/master:NoSchedule-
kubectl taint nodes k8s-master-03 node-role.kubernetes.io/master:NoSchedule-
Create the RBAC roles
https://docs.projectcalico.org/v3.0/getting-started/kubernetes/installation/hosted/hosted
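Download the RBAC manifest linked from the page above and apply it before deploying Calico itself (the exact filename on that page may differ, so treat this as a sketch):
kubectl apply -f rbac.yaml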
Deploy the CNI with calico.yaml
wget https://docs.projectcalico.org/v3.0/getting-started/kubernetes/installation/hosted/calico.yaml
# calico.yaml needs two changes:
1. etcd_endpoints: "http://192.168.4.24:2379,http://192.168.4.25:2379,http://192.168.4.26:2379"
2. - name: CALICO_IPV4POOL_CIDR
     value: "10.1.0.0/16" # must match podSubnet in the kubeadm config.yaml above
kubectl apply -f calico.yaml
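Once applied, calico-node should be running on every node and the nodes should turn Ready (the k8s-app=calico-node label assumes the stock manifest):
kubectl -n kube-system get pods -l k8s-app=calico-node -o wide
kubectl get nodes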
Re-add the Pod scheduling restriction to the Master nodes:
kubectl taint nodes k8s-master-01 node-role.kubernetes.io/master=:NoSchedule
kubectl taint nodes k8s-master-02 node-role.kubernetes.io/master=:NoSchedule
kubectl taint nodes k8s-master-03 node-role.kubernetes.io/master=:NoSchedule
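You can confirm the taint is back with:
kubectl describe node k8s-master-01 | grep Taints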
Worker nodes
1. Preparation
Worker nodes also need the Docker daemon, kubelet, kubeadm and kubectl; refer to the Master node steps above.
2. Join the worker node
kubeadm join --token a65269.b45f13a6e90114e3 192.168.4.24:6443 --discovery-token-ca-cert-hash sha256:f71fb0208c16a54d782a2a05f33d5b22a062ede8f14c901384e9020534dca169
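The token and CA cert hash above come from the kubeadm init output on the first master. If you no longer have that output, you can list the tokens and recompute the hash on a master (this is the standard kubeadm procedure; treat the exact pipeline as a sketch):
kubeadm token list
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | \
  openssl rsa -pubin -outform der 2>/dev/null | \
  openssl dgst -sha256 -hex | sed 's/^.* //'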
3. Point kube-proxy and kubelet at the LB for kube-apiserver
In the commands below, <masterLoadBalancerFQDN> is the address of the LB, i.e. 192.168.4.40 in this setup.
kubectl get configmap -n kube-system kube-proxy -o yaml > kube-proxy.yaml
sudo sed -i 's#server:.*#server: https://<masterLoadBalancerFQDN>:6443#g' kube-proxy.yaml
kubectl apply -f kube-proxy.yaml --force
# restart all kube-proxy pods to ensure that they load the new configmap
kubectl delete pod -n kube-system -l k8s-app=kube-proxy
sudo sed -i 's#server:.*#server: https://<masterLoadBalancerFQDN>:6443#g' /etc/kubernetes/kubelet.conf
sudo systemctl restart kubelet
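A quick check that both now point at the LB:
grep 'server:' /etc/kubernetes/kubelet.conf
kubectl -n kube-system get configmap kube-proxy -o yaml | grep 'server:'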
References
https://kubernetes.io/docs/setup/independent/high-availability/