References:
《Kubernetes权威指南》 (Kubernetes: The Definitive Guide)
https://www.kubernetes.org.cn/5462.html
Troubleshooting: https://blog.csdn.net/u012570862/article/details/80150988
Official documentation:
Installing kubeadm: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#letting-iptables-see-bridged-traffic
Creating a single control-plane cluster with kubeadm: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/
Installation environment: CentOS 7
1 Preparation
Disable the firewall
systemctl stop firewalld
systemctl disable firewalld
Disable SELinux so that containers can read the host filesystem. The command below only lasts until the next reboot.
setenforce 0
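Alternatively, to make the change permanent, edit the configuration file /etc/sysconfig/selinux, change SELINUX=enforcing to SELINUX=disabled, and reboot for the change to take effect.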
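The edit to the SELinux configuration file can also be scripted. The following is only a sketch: set_selinux_mode is a helper name I made up for illustration, and it takes the file path as an argument so it can be tried on a copy first.

```shell
# set_selinux_mode FILE MODE  (hypothetical helper, not part of any tool)
# Rewrites the SELINUX= line of FILE to MODE and leaves every other line alone.
set_selinux_mode() {
    case "$2" in
        enforcing|permissive|disabled) ;;
        *) echo "unsupported mode: $2" >&2; return 1 ;;
    esac
    # "^SELINUX=" does not match the SELINUXTYPE= line, so that line is preserved.
    # The result is written back with cat so that a symlinked file stays a symlink.
    sed "s/^SELINUX=.*/SELINUX=$2/" "$1" > "$1.tmp" &&
        cat "$1.tmp" > "$1" &&
        rm -f "$1.tmp"
}
```

As root this would be called as set_selinux_mode /etc/sysconfig/selinux disabled, followed by a reboot.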
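2 Install kubeadm and related tools
Configure the yum repository mirror
The official yum repository is
https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
Because this address may be blocked in mainland China, it can be replaced with
https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
To change the repository, add the following to the configuration file /etc/yum.repos.d/kubernetes.repo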
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
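The repository file can be written in one step with a here-document. This is a sketch only: write_k8s_repo is a hypothetical helper name, and the target path is passed as an argument so the result can be inspected before it is placed under /etc/yum.repos.d.

```shell
# write_k8s_repo FILE  (hypothetical helper)
# Writes the repository definition shown above to FILE.
write_k8s_repo() {
    cat > "$1" <<'EOF'
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
EOF
}
```

As root: write_k8s_repo /etc/yum.repos.d/kubernetes.repo. Note that gpgcheck=0, as in the configuration above, turns off package signature verification.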
Install and start Docker, kubeadm, and related tools
- Install Docker
Remove old versions
yum remove docker \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-selinux \
docker-engine-selinux \
docker-engine
Install the required system tools:
yum install -y yum-utils device-mapper-persistent-data lvm2
Add the package repository:
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
Update the yum cache:
yum makecache fast
Install Docker CE:
yum -y install docker-ce
Start the Docker daemon
systemctl start docker
systemctl enable docker
Test by running hello-world
docker run hello-world
Confirm that the default policy of the FORWARD chain in the iptables filter table is ACCEPT.
iptables -nvL
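Starting with version 1.13, Docker changed its default firewall rules and set the default policy of the FORWARD chain in the iptables filter table to DROP, which prevents Pods on different Nodes of a Kubernetes cluster from communicating. With Docker 18.06 installed here, however, the default policy turned out to be ACCEPT again. I do not know in which version this was changed back; the 17.06 version we run in production still needs this policy adjusted by hand.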
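To avoid reading the whole iptables listing by eye, the policy can be extracted from it. The sketch below assumes the usual "Chain FORWARD (policy ...)" header line; forward_policy is a hypothetical helper name.

```shell
# forward_policy  (hypothetical helper)
# Reads `iptables -nvL` (or `iptables -L`) output on stdin and prints the
# default policy of the FORWARD chain, for example ACCEPT or DROP.
forward_policy() {
    sed -n 's/^Chain FORWARD (policy \([A-Z]*\).*/\1/p'
}
```

Usage: iptables -nvL | forward_policy. If it prints DROP, the policy can be switched with iptables -P FORWARD ACCEPT; that change lives only in memory and is lost on reboot unless the rules are saved.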
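- Install kubeadm and related tools
yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
# --disableexcludes=kubernetes ignores any exclude= rules configured for the kubernetes repository, so that these packages can be installed
The following packages were installed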
Installing : libnetfilter_cttimeout-1.0.0-6.el7_7.1.x86_64
Installing : socat-1.7.3.2-2.el7.x86_64
Installing : cri-tools-1.13.0-0.x86_64
Installing : kubectl-1.17.4-0.x86_64
Installing : libnetfilter_queue-1.0.2-2.el7_2.x86_64
Installing : libnetfilter_cthelper-1.0.0-10.el7_7.1.x86_64
Installing : conntrack-tools-1.4.4-5.el7_7.2.x86_64
Installing : kubernetes-cni-0.7.5-0.x86_64
Installing : kubelet-1.17.4-0.x86_64
Installing : kubeadm-1.17.4-0.x86_64
- Start Docker and the kubelet and enable them at boot. If Docker was already installed and started, the Docker steps can be skipped.
Enable and start Docker
systemctl enable docker && systemctl start docker
Enable and start the kubelet
systemctl enable kubelet && systemctl start kubelet
3 kubeadm config
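Related subcommands
kubeadm config upload from-file: upload a configuration file to the cluster and generate a ConfigMap from it
kubeadm config upload from-flags: generate a ConfigMap from command-line flags
kubeadm config view: show the configuration values currently stored in the cluster
kubeadm config print init-defaults: print the default init configuration
kubeadm config print join-defaults: print the default join configuration
kubeadm config migrate: convert a configuration file between old and new versions
kubeadm config images list: list the required images
kubeadm config images pull: pull the images to the local machine
Get the initial configuration
kubeadm config print init-defaults > init.default.yaml
Edit the generated init.default.yaml as needed and save the result as init-config.yaml, the file name used by the commands in the following sections.
For example, to customize the image repository and the Pod address range, use the following configuration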
apiVersion: kubeadm.k8s.io/v1beta2
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.17.0
networking:
  podSubnet: 192.168.0.0/24
4 Download the Kubernetes images
- Use kubeadm config images pull to download the images
kubeadm config images pull --config=init-config.yaml
5 Install the Master with kubeadm init
Use the following command to install the Master
kubeadm init --config=init-config.yaml
On success, the output looks like this
[init] Using Kubernetes version: v1.17.0
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [192.168.0.120 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.0.120]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [192.168.0.120 localhost] and IPs [192.168.0.120 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [192.168.0.120 localhost] and IPs [192.168.0.120 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
W0324 11:05:24.385293 10915 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-scheduler"
W0324 11:05:24.386627 10915 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 37.004323 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.17" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node 192.168.0.120 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node 192.168.0.120 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: b5ech1.ew5z582otx2mjyju
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.0.120:6443 --token b5ech1.ew5z582otx2mjyju \
--discovery-token-ca-cert-hash sha256:06afc39b62bd5b15f4b3529b78b16e464cdaf99792a71714659a925b423a2b82
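This output means that the Master of the Kubernetes cluster has been installed successfully.
However, the cluster has no usable Node yet, and the container network has not been configured.
Also note the last few lines printed after kubeadm init succeeds: they contain the command for joining a node to the cluster (kubeadm join) and the token it requires.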
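The endpoint, token, and CA certificate hash can be pulled out of that kubeadm join line rather than copied by hand. The following is a sketch; parse_join_command is a hypothetical helper name and it assumes the command has the shape printed above.

```shell
# parse_join_command  (hypothetical helper)
# Reads the "kubeadm join ..." lines on stdin and prints the API server
# endpoint, the bootstrap token, and the CA certificate hash.
parse_join_command() {
    # Drop the line-continuation backslash, then put one word on each line.
    tr -d '\\' | tr -s ' \t\n' '\n' | awk '
        prev == "join"                           { print "endpoint=" $0 }
        prev == "--token"                        { print "token=" $0 }
        prev == "--discovery-token-ca-cert-hash" { print "hash=" $0 }
        { prev = $0 }
    '
}
```

The hash is not used by the join configuration later in this post, which skips CA verification; as far as I know it could be listed under caCertHashes in the bootstrapToken section instead.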
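Authorize the user and verify
Following the instructions printed after the Master was installed successfully, run the commands below to give the current user access to the cluster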
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Finally, use kubectl to verify that the Master is working
[root@192 ~]# kubectl get -n kube-system configmap
Expected output:
NAME DATA AGE
coredns 1 8h
extension-apiserver-authentication 6 8h
kube-proxy 2 8h
kubeadm-config 2 8h
kubelet-config-1.17 1 8h
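6 Install a Node
Install Docker and kubeadm
Because the Node is installed on a new machine, repeat sections 1, 2, and 3 on that machine to install Docker and kubeadm.
Create the join-config.yaml configuration file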
apiVersion: kubeadm.k8s.io/v1beta2
kind: JoinConfiguration
discovery:
  bootstrapToken:
    apiServerEndpoint: 192.168.0.120:6443
    token: b5ech1.ew5z582otx2mjyju
    unsafeSkipCAVerification: true
  timeout: 5m0s
  tlsBootstrapToken: b5ech1.ew5z582otx2mjyju
Here apiServerEndpoint is the address of the Master, and token and tlsBootstrapToken take the token shown in the last part of the output printed when kubeadm init installed the Master successfully.
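Since the same token appears twice and the indentation matters, the file can be generated rather than typed. A minimal sketch, with both helper names made up for illustration; the format check relies on bootstrap tokens having the form of six characters, a dot, and sixteen characters from a-z and 0-9.

```shell
# valid_bootstrap_token TOKEN  (hypothetical helper)
# Succeeds if TOKEN looks like "abcdef.0123456789abcdef".
valid_bootstrap_token() {
    printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# write_join_config FILE ENDPOINT TOKEN  (hypothetical helper)
# Writes a JoinConfiguration like the one above to FILE.
write_join_config() {
    valid_bootstrap_token "$3" || { echo "malformed token: $3" >&2; return 1; }
    cat > "$1" <<EOF
apiVersion: kubeadm.k8s.io/v1beta2
kind: JoinConfiguration
discovery:
  bootstrapToken:
    apiServerEndpoint: $2
    token: $3
    unsafeSkipCAVerification: true
  timeout: 5m0s
  tlsBootstrapToken: $3
EOF
}
```

For the values in this post: write_join_config join-config.yaml 192.168.0.120:6443 b5ech1.ew5z582otx2mjyju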
Join the cluster with kubeadm join
[root@192 ~]# kubeadm join --config=join-config.yaml
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.17" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
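7 Install a network plugin
Running the following command on the Master shows that every node has STATUS NotReady, because no CNI network plugin has been installed yet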
[root@192 ~]# kubectl get nodes -n kube-system
NAME STATUS ROLES AGE VERSION
192.168.0.120 NotReady master 24h v1.17.4
192.168.0.122 NotReady <none> 6m18s v1.17.4
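When there are more than a couple of nodes, the ones that are not ready can be filtered out of this listing. A small sketch, with not_ready_nodes as a hypothetical helper name:

```shell
# not_ready_nodes  (hypothetical helper)
# Reads `kubectl get nodes` output on stdin and prints the NAME of every node
# whose STATUS column is not exactly "Ready" (a cordoned node, shown as
# "Ready,SchedulingDisabled", is therefore listed as well).
not_ready_nodes() {
    awk 'NR > 1 && $2 != "Ready" { print $1 }'
}
```

Usage: kubectl get nodes | not_ready_nodes. Once the network plugin is running, it should print nothing.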
There are several CNI network plugins to choose from; for details see https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#pod-network
For example, the weave plugin can be installed as follows
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
Output:
serviceaccount/weave-net created
clusterrole.rbac.authorization.k8s.io/weave-net created
clusterrolebinding.rbac.authorization.k8s.io/weave-net created
role.rbac.authorization.k8s.io/weave-net created
rolebinding.rbac.authorization.k8s
Verify the installation
Use the following command to check whether all Pods in the Kubernetes cluster are running normally
[root@192 ~]# kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-9d85f5447-mfmmv 1/1 Running 0 4d22h
kube-system coredns-9d85f5447-t5mxm 1/1 Running 0 4d22h
kube-system etcd-192.168.0.120 1/1 Running 2 4d22h
kube-system kube-apiserver-192.168.0.120 1/1 Running 2 4d22h
kube-system kube-controller-manager-192.168.0.120 1/1 Running 2 4d22h
kube-system kube-proxy-6hcw6 1/1 Running 2 4d22h
kube-system kube-proxy-mbj86 1/1 Running 0 3d22h
kube-system kube-scheduler-192.168.0.120 1/1 Running 2 4d22h
kube-system weave-net-bgpr8 2/2 Running 0 12h
kube-system weave-net-qwh58 2/2 Running 0 12h
If any Pod is in an error state, use the following command to find out why
kubectl describe pod <pod_name> --namespace=<namespace_name>
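The describe commands can be generated from the Pod listing so that the names and namespaces do not have to be typed. This is a sketch under the assumption that the listing has the six columns shown above; describe_failing_pods is a hypothetical helper name.

```shell
# describe_failing_pods  (hypothetical helper)
# Reads `kubectl get pod -A` output on stdin and prints one
# `kubectl describe pod` command for each Pod whose STATUS is neither
# Running nor Completed. It only prints the commands; it does not run them.
describe_failing_pods() {
    awk 'NR > 1 && $4 != "Running" && $4 != "Completed" {
        printf "kubectl describe pod %s --namespace=%s\n", $2, $1
    }'
}
```

Usage: kubectl get pod -A | describe_failing_pods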
If the installation still fails, use kubeadm reset to restore the host to its original state and install again.
8 Troubleshooting
The Master's token has expired when a Node joins the cluster (kubeadm join)
Error message:
kubeadm join —
error execution phase preflight: couldn’t validate the identity of the API Server: abort connecting to API servers after timeout of 5m0s
Solution
#1. Create a new token on the Master
kubeadm token create
#2. List the tokens to see the new one
kubeadm token list
Output:
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
inrs2y.dhx1vftn4s4h5vkc 13h 2020-03-29T23:05:11-04:00 authentication,signing <none> system:bootstrappers:kubeadm:default-node-token
#3. Put the new token into join-config.yaml
The discovery section of the modified join-config.yaml
discovery:
  bootstrapToken:
    apiServerEndpoint: 192.168.0.120:6443
    token: inrs2y.dhx1vftn4s4h5vkc
    unsafeSkipCAVerification: true
  timeout: 5m0s
  tlsBootstrapToken: inrs2y.dhx1vftn4s4h5vkc
#4. Join the cluster again
kubeadm join --config=join-config.yaml
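Step 3 replaces the token in two places, which is easy to get half right. The sketch below does both substitutions in one pass; update_join_token is a hypothetical helper name, and it assumes the token contains only letters, digits, and a dot, as bootstrap tokens do.

```shell
# update_join_token FILE NEW_TOKEN  (hypothetical helper)
# Replaces the values of both "token:" and "tlsBootstrapToken:" in FILE,
# keeping the indentation of each line.
update_join_token() {
    sed -E "s/^([[:space:]]*(token|tlsBootstrapToken):[[:space:]]*).*/\1$2/" "$1" > "$1.tmp" &&
        mv "$1.tmp" "$1"
}
```

Usage: update_join_token join-config.yaml inrs2y.dhx1vftn4s4h5vkc. If no config file is used, kubeadm token create --print-join-command on the Master prints a complete join command with a fresh token.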
Blocked by the firewall
Error message:
[WARNING Firewalld]: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
There are two solutions:
1. Stop and disable firewalld
systemctl stop firewalld
systemctl disable firewalld
2. Open the two ports in the firewall
firewall-cmd --zone=public --add-port=6443/tcp --permanent
firewall-cmd --zone=public --add-port=10250/tcp --permanent
firewall-cmd --reload
Docker is using "cgroupfs" as its cgroup driver, but the recommended driver is "systemd"
Error message:
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
error execution phase preflight: [preflight] Some fatal errors occurred:
Solution:
Edit /etc/docker/daemon.json
vim /etc/docker/daemon.json
and add the following:
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
Then restart Docker
systemctl restart docker
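On a fresh host where daemon.json does not exist yet, the file can be created non-interactively. A sketch only: write_docker_daemon_json is a hypothetical helper name, and it deliberately refuses to touch an existing file, because other settings in it would have to be merged by hand.

```shell
# write_docker_daemon_json FILE  (hypothetical helper)
# Creates FILE with the systemd cgroup driver setting shown above.
# Fails without changing anything if FILE already exists.
write_docker_daemon_json() {
    if [ -e "$1" ]; then
        echo "$1 already exists; add the exec-opts key by hand" >&2
        return 1
    fi
    cat > "$1" <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
}
```

After write_docker_daemon_json /etc/docker/daemon.json and a Docker restart, docker info --format '{{.CgroupDriver}}' should report systemd.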
Swap error
[ERROR Swap]: running with swap on is not supported. Please disable swap
[preflight] If you know what you are doing, you can make a check non-fatal with --ignore-preflight-errors=...
To see the stack trace of this error execute with --v=5 or higher
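Solution:
- Turn off swap in the operating system
Since version 1.8, Kubernetes requires swap to be turned off; with the default configuration the kubelet will not start otherwise. Turn off swap as follows: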
swapoff -a
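Edit /etc/fstab and comment out the swap entry so that it is not mounted automatically at boot, then use free -m to confirm that swap is off. To adjust swappiness, add the following line to /etc/sysctl.d/k8s.conf: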
vm.swappiness=0
Run
sysctl -p /etc/sysctl.d/k8s.conf
to apply the change.
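Commenting out the swap entry in /etc/fstab can be scripted as well. The sketch below assumes the usual whitespace-separated fstab layout in which a swap entry contains the word swap as a field of its own; comment_out_swap is a hypothetical helper name, and it is worth trying on a copy of the file first.

```shell
# comment_out_swap FILE  (hypothetical helper)
# Prefixes every active swap entry in FILE with "#".
# Lines that are already commented out are left untouched.
comment_out_swap() {
    sed -E '/^[[:space:]]*[^#[:space:]].*[[:space:]]swap[[:space:]]/ s/^/#/' "$1" > "$1.tmp" &&
        cat "$1.tmp" > "$1" &&
        rm -f "$1.tmp"
}
```

Usage as root: cp /etc/fstab /etc/fstab.bak, then comment_out_swap /etc/fstab.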
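- Remove the kubelet's swap restriction (experimental)
The two test hosts used here also run other services, and turning off swap could affect them, so I changed the kubelet configuration to lift the restriction instead. The kubelet startup flag --fail-swap-on=false removes the requirement that swap be off. kubeadm's own preflight check still treats active swap as a fatal error, so kubeadm init and kubeadm join also have to be run with --ignore-preflight-errors=Swap. Edit /etc/sysconfig/kubelet and add: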
KUBELET_EXTRA_ARGS=--fail-swap-on=false
Finally, restart the installation
If the error occurred during kubeadm init, fix it and then run the following command to reinstall the Master
kubeadm reset && systemctl restart kubelet && kubeadm init --config=init-config.yaml
If the error occurred during kubeadm join, fix it and then run the following command to rejoin the Node
kubeadm reset && systemctl restart kubelet && kubeadm join --config=join-config.yaml