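I. Preparation
1. All nodes must be able to reach one another over the network, and k8s-m1 must be able to SSH into the other nodes without a password, because many steps are run from one node (k8s-m1) and copy files to or operate on the other nodes over SSH.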
sed -i 's#PermitRootLogin without-password#PermitRootLogin yes#g' /etc/ssh/sshd_config
systemctl restart sshd
# Install the required packages
yum -y install bash-completion.noarch net-tools vim lrzsz wget tree screen lsof tcpdump
# Create an SSH key pair non-interactively
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
cd /root/.ssh && cat id_rsa.pub > authorized_keys
scp -rp /root/.ssh master1:/root
scp -rp /root/.ssh node1:/root
scp -rp /root/.ssh node2:/root
# Add one line per node to /etc/hosts ("ip" stands for that node's real address)
vim /etc/hosts
ip master1
ip master2
ip node1
2. Make sure the firewall and SELinux are turned off on all nodes. On CentOS, for example:
systemctl stop firewalld && systemctl disable firewalld
setenforce 0
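Note that setenforce 0 only lasts until the next reboot. A minimal sketch of making the setting persistent, assuming the standard CentOS configuration file /etc/selinux/config; the function name set_selinux_mode is made up for this illustration:

```shell
# Hypothetical helper: rewrite the SELINUX= line of an SELinux config file.
# $1 = mode (permissive or disabled)
# $2 = config file (assumed default: /etc/selinux/config)
set_selinux_mode() {
    mode=$1
    config=${2:-/etc/selinux/config}
    tmp=$(mktemp) || return 1
    # "^SELINUX=" does not match the SELINUXTYPE= line, so only the mode changes.
    # Writing back with cat keeps the original file's inode and permissions.
    sed "s/^SELINUX=.*/SELINUX=$mode/" "$config" > "$tmp" && cat "$tmp" > "$config"
    rc=$?
    rm -f "$tmp"
    return $rc
}

# Usage on a real node (as root):
#   set_selinux_mode permissive
```

The "disabled" mode only takes full effect after a reboot, so setenforce 0 is still needed for the current session.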
3. On every node, /etc/hosts must resolve all hosts in the cluster.
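As a rough sketch, the entries can be added idempotently so that rerunning the setup does not duplicate lines. The helper name add_host_entry and the 192.0.2.x addresses are placeholders for illustration, not values from this article:

```shell
# Hypothetical helper: append "IP NAME" to a hosts file unless NAME is already listed.
# $1 = IP address, $2 = host name, $3 = hosts file (default: /etc/hosts)
add_host_entry() {
    ip=$1
    name=$2
    hosts_file=${3:-/etc/hosts}
    # Look for the name in any field after the address on non-comment lines.
    if awk -v n="$name" '
        !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$hosts_file"; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
}

# Usage (placeholder addresses):
#   add_host_entry 192.0.2.10 master1
#   add_host_entry 192.0.2.11 node1
```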
4. Set the following kernel parameters on all nodes.
cat <<EOF | tee /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl -p /etc/sysctl.d/k8s.conf
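To confirm that the settings took effect, the values under /proc/sys can be compared with the file. The sketch below assumes the simple "key = value" format used above; check_sysctl_conf is a made-up name. On some kernels the net.bridge.* keys only exist once the br_netfilter module is loaded, in which case they show up here as MISSING.

```shell
# Hypothetical helper: compare each "key = value" line of a sysctl file with the
# live value under /proc/sys. Prints OK/MISMATCH/MISSING per key and returns
# non-zero if any key is missing or different.
# $1 = sysctl conf file, $2 = root of the sysctl tree (default: /proc/sys)
check_sysctl_conf() {
    conf_file=$1
    proc_root=${2:-/proc/sys}
    status=0
    while IFS='=' read -r key value; do
        key=$(printf '%s' "$key" | tr -d '[:space:]')
        value=$(printf '%s' "$value" | tr -d '[:space:]')
        # Skip blank lines and comments.
        case "$key" in ''|'#'*) continue ;; esac
        # net.ipv4.ip_forward maps to <root>/net/ipv4/ip_forward
        path="$proc_root/$(printf '%s' "$key" | tr '.' '/')"
        if [ ! -r "$path" ]; then
            echo "MISSING $key"
            status=1
        elif [ "$(tr -d '[:space:]' < "$path")" != "$value" ]; then
            echo "MISMATCH $key"
            status=1
        else
            echo "OK $key"
        fi
    done < "$conf_file"
    return $status
}

# Usage on a node:
#   check_sysctl_conf /etc/sysctl.d/k8s.conf
```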
5. Disable swap. Run the following commands on all nodes:
swapoff -a && sysctl -w vm.swappiness=0
# The swap entry in /etc/fstab differs from system to system; adjust the pattern to match yours
sed '/swap.img/d' -i /etc/fstab
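Because the swap entry is named differently on different systems, a pattern-independent alternative is to match on the filesystem type column instead. This is only a sketch; disable_swap_entries is a hypothetical helper, and it comments the lines out rather than deleting them so the change is easy to revert:

```shell
# Hypothetical helper: comment out every active swap entry in an fstab-style file.
# $1 = fstab file (default: /etc/fstab)
disable_swap_entries() {
    fstab=${1:-/etc/fstab}
    tmp=$(mktemp) || return 1
    # Field 3 of an fstab line is the filesystem type; "swap" marks a swap entry.
    # Lines that are already comments are left untouched.
    awk '!/^[[:space:]]*#/ && $3 == "swap" { print "#" $0; next } { print }' "$fstab" > "$tmp" \
        && cat "$tmp" > "$fstab"
    rc=$?
    rm -f "$tmp"
    return $rc
}

# Usage on a node (as root), after swapoff -a:
#   disable_swap_entries
```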
II. Install Docker
1. Remove any previously installed Docker packages
sudo yum remove docker \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-selinux \
docker-engine-selinux \
docker-engine
2. Configure the yum repository
cd /etc/yum.repos.d/
yum clean all
yum makecache
3. List the available Docker versions:
yum list docker-ce --showduplicates
4. Install a specific version
yum install -y --setopt=obsoletes=0 \
docker-ce-17.03.2.ce-1.el7.centos.x86_64 \
docker-ce-selinux-17.03.2.ce-1.el7.centos.noarch
Start Docker
systemctl start docker
III. Install Kubernetes
Install with kubeadm:
1. First, configure the Aliyun Kubernetes yum repository on every node
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
EOF
yum -y install epel-release
yum clean all
yum makecache
2. Install kubeadm and the related packages on every node (this article installs version 1.10.0)
yum install -y kubelet-1.10.0 kubeadm-1.10.0 kubectl-1.10.0 --disableexcludes=kubernetes
3. Start the kubelet service
systemctl enable kubelet && systemctl start kubelet
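Note: at this point the kubelet service is not running properly because its main configuration file, kubelet.conf, is missing. This can be ignored for now; the file is only generated once the master node has been initialized.
Modify the kubelet configuration and start kubelet (all nodes)
Note: keep watching the log output in /var/log/messages; you will see kubelet repeatedly failing to start.
4. Edit the 10-kubeadm.conf file and change the cgroup-driver setting: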
[root@centos7-base-ok]# cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--kubeconfig=/etc/kubernetes/kubelet.conf --require-kubeconfig=true"
Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true"
Environment="KUBELET_NETWORK_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin"
Environment="KUBELET_DNS_ARGS=--cluster-dns=10.96.0.10 --cluster-domain=cluster.local"
Environment="KUBELET_AUTHZ_ARGS=--authorization-mode=Webhook --client-ca-file=/etc/kubernetes/pki/ca.crt"
Environment="KUBELET_CADVISOR_ARGS=--cadvisor-port=0"
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=cgroupfs"
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_AUTHZ_ARGS $KUBELET_CADVISOR_ARGS $KUBELET_CGROUP_ARGS $KUBELET_EXTRA_ARGS
Change "--cgroup-driver=systemd" to "--cgroup-driver=cgroupfs", reload the systemd unit files, and restart kubelet.
systemctl daemon-reload
systemctl restart kubelet
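The kubelet cgroup driver has to be the same one Docker uses, so it is worth comparing the two before restarting. A small sketch, assuming docker info prints a "Cgroup Driver:" line and that the drop-in lives at the path shown above; both helper names are hypothetical:

```shell
# Hypothetical helper: read `docker info` output on stdin and print the cgroup driver.
docker_cgroup_driver() {
    sed -n 's/^[[:space:]]*Cgroup Driver:[[:space:]]*//p' | head -n 1
}

# Hypothetical helper: print the --cgroup-driver value found in a kubelet drop-in.
# $1 = drop-in file (default: the path used in this article)
kubelet_cgroup_driver() {
    file=${1:-/etc/systemd/system/kubelet.service.d/10-kubeadm.conf}
    sed -n 's/.*--cgroup-driver=\([A-Za-z]*\).*/\1/p' "$file" | head -n 1
}

# Usage on a node:
#   d=$(docker info 2>/dev/null | docker_cgroup_driver)
#   k=$(kubelet_cgroup_driver)
#   [ "$d" = "$k" ] && echo "cgroup drivers match ($d)" || echo "docker=$d kubelet=$k"
```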
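4. Download the Kubernetes images (on the master node)
Because gcr.io cannot be reached directly to pull images, a container registry mirror inside China has to be configured
Configure an Aliyun registry mirror:
On the page, find and click the registry mirror (accelerator) button to see your own dedicated mirror URL; after selecting the CentOS version, the configuration steps are displayed.
Note: when running Docker on Aliyun with the Aliyun registry mirror configured, daemon.json may prevent the Docker daemon from starting. The following workaround resolves this.
Edit this file: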
vim /etc/sysconfig/docker
and set:
OPTIONS='--selinux-enabled --log-driver=journald --registry-mirror=http://xxxx.mirror.aliyuncs.com'
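Set registry-mirror to your own mirror URL.
Finally, restart the daemon with service docker restart.
Then run ps aux | grep docker and you will see the mirror option among the daemon's startup arguments.
5. Pull the Kubernetes images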
#!/bin/bash
images=(kube-proxy-amd64:v1.10.0 kube-scheduler-amd64:v1.10.0 kube-controller-manager-amd64:v1.10.0 kube-apiserver-amd64:v1.10.0
etcd-amd64:3.1.12 pause-amd64:3.1 kubernetes-dashboard-amd64:v1.8.3 k8s-dns-sidecar-amd64:1.14.8 k8s-dns-kube-dns-amd64:1.14.8
k8s-dns-dnsmasq-nanny-amd64:1.14.8)
for imageName in "${images[@]}" ; do
    docker pull "keveon/$imageName"
    docker tag "keveon/$imageName" "k8s.gcr.io/$imageName"
    docker rmi "keveon/$imageName"
done
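The shell script above does three things: it pulls the required container images, retags them with the names Kubernetes expects, and removes the old mirror tags.
Note: the image versions must match the installed kubeadm version; otherwise the initialization will time out.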
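Since a version mismatch is the usual cause of the timeout, the image list can be checked before pulling anything. A minimal sketch, assuming that only images whose names start with "kube-" follow the Kubernetes release version (etcd, pause, the DNS images and the dashboard carry their own versions); mismatched_kube_images is a made-up name:

```shell
# Hypothetical helper: print kube-* images whose tag differs from the expected version.
# $1 = expected version (for example v1.10.0), remaining arguments = image:tag names
mismatched_kube_images() {
    expected=$1
    shift
    for image in "$@"; do
        case "$image" in
            # ${image##*:} strips everything up to the last colon, leaving the tag.
            kube-*) [ "${image##*:}" = "$expected" ] || echo "$image" ;;
        esac
    done
}

# Usage with the array from the script above (bash):
#   mismatched_kube_images v1.10.0 "${images[@]}"
```

No output means every kube-* image matches the version passed in.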
6. Initialize the Kubernetes master
Run the shell script above, wait for the downloads to finish, then run kubeadm init
[root@k8smaster ~]# kubeadm init --kubernetes-version=v1.10.0 --pod-network-cidr=10.244.0.0/16
[init] Using Kubernetes version: v1.10.0
[init] Using Authorization modes: [Node RBAC]
[preflight] Running pre-flight checks.
[WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
[WARNING FileExisting-crictl]: crictl not found in system path
Suggestion: go get github.com/kubernetes-incubator/cri-tools/cmd/crictl
[preflight] Starting the kubelet service
[certificates] Generated ca certificate and key.
[certificates] Generated apiserver certificate and key.
[certificates] apiserver serving cert is signed for DNS names [k8smaster kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.0.100.202]
[certificates] Generated apiserver-kubelet-client certificate and key.
[certificates] Generated etcd/ca certificate and key.
[certificates] Generated etcd/server certificate and key.
[certificates] etcd/server serving cert is signed for DNS names [localhost] and IPs [127.0.0.1]
[certificates] Generated etcd/peer certificate and key.
[certificates] etcd/peer serving cert is signed for DNS names [k8smaster] and IPs [10.0.100.202]
[certificates] Generated etcd/healthcheck-client certificate and key.
[certificates] Generated apiserver-etcd-client certificate and key.
[certificates] Generated sa key and public key.
[certificates] Generated front-proxy-ca certificate and key.
[certificates] Generated front-proxy-client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[controlplane] Wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] Wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] Wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[init] Waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests".
[init] This might take a minute or longer if the control plane images have to be pulled.
[apiclient] All control plane components are healthy after 21.001790 seconds
[uploadconfig] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[markmaster] Will mark node k8smaster as master by adding a label and a taint
[markmaster] Master k8smaster tainted and labelled with key/value: node-role.kubernetes.io/master=""
[bootstraptoken] Using token: thczis.64adx0imeuhu23xv
[bootstraptoken] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: kube-dns
[addons] Applied essential addon: kube-proxy
Your Kubernetes master has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of machines by running the following on each node
as root:
kubeadm join 10.0.100.202:6443 --token thczis.64adx0imeuhu23xv --discovery-token-ca-cert-hash sha256:fa7b11bb569493fd44554aab0afe55a4c051cccc492dbdfafae6efeb6ffa80e6
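Note: the --kubernetes-version=v1.10.0 option is required; without it the command fails because kubeadm tries to reach Google's servers, which are blocked. v1.10.0 is used here, and as mentioned earlier, the downloaded image versions must match the Kubernetes version or a timeout occurs.
The command above takes about one minute. While it runs, you can watch tail -f /var/log/messages to follow the configuration process and its progress. Save a copy of the last part of the output; it is needed later when adding worker nodes.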
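One way to keep the join command is to capture the kubeadm init output in a file and extract the line from it. The sketch below assumes the output was saved with tee; the file name kubeadm-init.log and the helper extract_join_command are illustrative only:

```shell
# Hypothetical helper: print the last "kubeadm join ..." line found in a saved
# copy of the kubeadm init output, with leading whitespace removed.
# $1 = file holding the captured output
extract_join_command() {
    sed -n 's/^[[:space:]]*\(kubeadm join .*\)$/\1/p' "$1" | tail -n 1
}

# Usage on the master (kubeadm-init.log is an assumed file name):
#   kubeadm init --kubernetes-version=v1.10.0 --pod-network-cidr=10.244.0.0/16 | tee kubeadm-init.log
#   extract_join_command kubeadm-init.log
```

Bootstrap tokens expire after 24 hours by default, so a saved command stops working after that. Your kubeadm version may be able to print a fresh one with kubeadm token create --print-join-command; check kubeadm token create --help to confirm.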
7. Configure kubectl credentials (on the master node)
# For a non-root user
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# For the root user
export KUBECONFIG=/etc/kubernetes/admin.conf
You can also add it directly to ~/.bash_profile
echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile
8. Install the flannel network (on the master node)
mkdir -p /etc/cni/net.d/
cat <<EOF> /etc/cni/net.d/10-flannel.conf
{
  "name": "cbr0",
  "type": "flannel",
  "delegate": {
    "isDefaultGateway": true
  }
}
EOF
mkdir /usr/share/oci-umount/oci-umount.d -p
mkdir /run/flannel/
cat <<EOF> /run/flannel/subnet.env
FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.1.0/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true
EOF
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.9.1/Documentation/kube-flannel.yml
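The flannel network has to be the same range that was passed to kubeadm init with --pod-network-cidr (10.244.0.0/16 in this article). A quick sanity check, sketched with a hypothetical helper named check_flannel_network:

```shell
# Hypothetical helper: check that FLANNEL_NETWORK in a subnet.env file equals
# the CIDR given to kubeadm init via --pod-network-cidr.
# $1 = expected CIDR, $2 = subnet.env file (default: /run/flannel/subnet.env)
check_flannel_network() {
    expected=$1
    env_file=${2:-/run/flannel/subnet.env}
    actual=$(sed -n 's/^FLANNEL_NETWORK=//p' "$env_file" | head -n 1)
    if [ "$actual" = "$expected" ]; then
        echo "OK $actual"
    else
        echo "MISMATCH subnet.env=$actual expected=$expected"
        return 1
    fi
}

# Usage on the master:
#   check_flannel_network 10.244.0.0/16
```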
9. Join node1 and node2 to the cluster
Run the kubeadm join command on node1 and node2 to add each of them to the cluster:
[root@k8snode1 ~]# kubeadm join 10.0.100.202:6443 --token thczis.64adx0imeuhu23xv --discovery-token-ca-cert-hash sha256:fa7b11bb569493fd44554aab0afe55a4c051cccc492dbdfafae6efeb6ffa80e6
[preflight] Running pre-flight checks.
[WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
[WARNING FileExisting-crictl]: crictl not found in system path
Suggestion: go get github.com/kubernetes-incubator/cri-tools/cmd/crictl
[discovery] Trying to connect to API Server "10.0.100.202:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://10.0.100.202:6443"
[discovery] Requesting info from "https://10.0.100.202:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "10.0.100.202:6443"
[discovery] Successfully established connection with API Server "10.0.100.202:6443"
This node has joined the cluster:
* Certificate signing request was sent to master and a response
was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the master to see this node join the cluster.
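Note: attentive readers will have noticed that this is the same command I asked you to save after the Kubernetes master was initialized successfully.
By default, the master node does not run workloads. If you want an all-in-one Kubernetes environment, run the following command to remove the master taint (note the trailing "-") so that the master also acts as a worker node:
kubectl taint nodes --all node-role.kubernetes.io/master-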
10. Verify that the Kubernetes master was set up successfully (on the master node)
# Check node status
kubectl get nodes
# Check pod status
kubectl get pods --all-namespaces
# Check the status of the cluster components
kubectl get cs
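For scripted checks, the node listing can be filtered down to the nodes that still need attention. A small sketch that assumes the default kubectl get nodes table layout (NAME in column 1, STATUS in column 2); not_ready_nodes is a made-up name:

```shell
# Hypothetical helper: read `kubectl get nodes` output on stdin and print the
# names of nodes whose STATUS column does not start with "Ready".
not_ready_nodes() {
    # NR > 1 skips the header row; "NotReady" does not match /^Ready/.
    awk 'NR > 1 && $2 !~ /^Ready/ { print $1 }'
}

# Usage on the master:
#   kubectl get nodes | not_ready_nodes
```

Nodes typically stay NotReady until the pod network from the flannel step is running, so an empty result is the state to wait for.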
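Common errors
The most common problem during installation is a timeout. The Kubernetes images are hosted overseas, which is why we pulled them in advance earlier. Alternatively, you can use a machine located overseas to set up a private registry with Harbor and download all the images there.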
[root@k8smaster ~]# kubeadm init
[init] Using Kubernetes version: v1.10.0
[init] Using Authorization modes: [Node RBAC]
[preflight] Running pre-flight checks.
[WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
[WARNING FileExisting-crictl]: crictl not found in system path
Suggestion: go get github.com/kubernetes-incubator/cri-tools/cmd/crictl
[preflight] Starting the kubelet service
[certificates] Generated ca certificate and key.
[certificates] Generated apiserver certificate and key.
[certificates] apiserver serving cert is signed for DNS names [k8smaster kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.0.100.202]
[certificates] Generated apiserver-kubelet-client certificate and key.
[certificates] Generated etcd/ca certificate and key.
[certificates] Generated etcd/server certificate and key.
[certificates] etcd/server serving cert is signed for DNS names [localhost] and IPs [127.0.0.1]
[certificates] Generated etcd/peer certificate and key.
[certificates] etcd/peer serving cert is signed for DNS names [k8smaster] and IPs [10.0.100.202]
[certificates] Generated etcd/healthcheck-client certificate and key.
[certificates] Generated apiserver-etcd-client certificate and key.
[certificates] Generated sa key and public key.
[certificates] Generated front-proxy-ca certificate and key.
[certificates] Generated front-proxy-client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[controlplane] Wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] Wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] Wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[init] Waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests".
[init] This might take a minute or longer if the control plane images have to be pulled.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
- Either there is no internet connection, or imagePullPolicy is set to "Never",
so the kubelet cannot pull or find the following control plane images:
- k8s.gcr.io/kube-apiserver-amd64:v1.10.0
- k8s.gcr.io/kube-controller-manager-amd64:v1.10.0
- k8s.gcr.io/kube-scheduler-amd64:v1.10.0
- k8s.gcr.io/etcd-amd64:3.1.12 (only if no external etcd endpoints are configured)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
couldn't initialize a Kubernetes cluster
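In most cases this problem occurs because the installed Kubernetes version does not match the version of the Kubernetes images it depends on. To troubleshoot, check /var/log/messages; as mentioned at the start of the installation, read the logs often.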
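Besides the logs, it helps to confirm that every image named in the error message is actually present locally. A rough sketch, assuming the required names are kept one per line in a text file (required.txt is an assumed name) and the local list comes from docker images; missing_images is a hypothetical helper:

```shell
# Hypothetical helper: print every required image that is absent from the local list.
# $1    = file with one required "repository:tag" per line
# stdin = local images, one "repository:tag" per line
missing_images() {
    required=$1
    local_list=$(mktemp) || return 1
    # Store stdin in a file so it can be searched once per required image.
    cat > "$local_list"
    while IFS= read -r image; do
        [ -n "$image" ] || continue
        # -F fixed string, -x whole-line match, -q quiet
        grep -Fxq -- "$image" "$local_list" || echo "$image"
    done < "$required"
    rm -f "$local_list"
}

# Usage on the master:
#   docker images --format '{{.Repository}}:{{.Tag}}' | missing_images required.txt
```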
Some readers may ask: my installation failed, so how do I clean up the environment and start over? One command does it:
kubeadm reset