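Note: my Kubernetes notes are currently based on what I learned in Yangming's k8s training camp course. Most of the content matches Yangming's own documentation, but I verified every step hands-on myself and added a few things here and there.
Yangming's blog: https://www.qikqiak.com/post/promotion-51/ (consult it as needed).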
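Before installing k8s, let's first review ipvs.
ipvs (IP Virtual Server) implements transport-layer load balancing, commonly called layer-4 LAN switching, as part of the Linux kernel. ipvs runs on a host and acts as a load balancer in front of a cluster of real servers. It can forward TCP- and UDP-based service requests to the real servers and make their services appear as a virtual service on a single IP address.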
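ipvs vs. iptables
kube-proxy supports two modes, iptables and ipvs. ipvs mode was introduced in Kubernetes v1.8, reached beta in v1.9, and became generally available in v1.11. iptables mode was added in v1.1 and has been kube-proxy's default operating mode since v1.2. Both ipvs and iptables are built on netfilter, so how do the two modes differ?
ipvs offers better scalability and performance for large clusters
ipvs supports more sophisticated load-balancing algorithms than iptables (least load, least connections, weighted, and so on)
ipvs supports server health checks, connection retries, and similar features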
Environment preparation
3 nodes, all running CentOS 7.6 with kernel 3.10.0-957.21.3.el7.x86_64. Add the following hosts entries on every node:
172.17.122.150 master
172.17.122.151 node01
172.17.122.152 node02
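Node hostnames must follow standard DNS naming, and you should never keep a default hostname such as localhost, which causes all sorts of errors. In Kubernetes, machine names and every API object stored in etcd must use standard DNS names (RFC 1123). You can change the hostname with the command hostnamectl set-hostname ydzs-node1.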
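To catch a bad hostname before kubeadm complains about it, a small check along these lines can help. It is only a sketch: the function name is_rfc1123_hostname is my own, and the rule it encodes is the RFC 1123 subdomain form (lowercase letters, digits, '-' and '.', at most 253 characters), not the exact validation code Kubernetes runs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when the name is a valid RFC 1123 subdomain.
# Each dot-separated label must start and end with a lowercase letter or digit
# and may contain '-' in between; the whole name is limited to 253 characters.
is_rfc1123_hostname() {
  local LC_ALL=C                      # make [a-z] mean ASCII lowercase only
  local name="$1"
  local label='[a-z0-9]([-a-z0-9]*[a-z0-9])?'
  local re="^${label}(\\.${label})*\$"
  [ "${#name}" -le 253 ] || return 1
  [[ "$name" =~ $re ]]
}

# Try it on a few candidate names (the last two are deliberately invalid).
for candidate in ydzs-node1 node01 Node_01 -bad-; do
  if is_rfc1123_hostname "$candidate"; then
    echo "$candidate: ok"
  else
    echo "$candidate: invalid"
  fi
done
```

A name that passes can then be applied with hostnamectl set-hostname as described above.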
- Disable the firewall and SELinux. On Alibaba Cloud servers both are already disabled by default.
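A minimal sketch of how this could be done on CentOS 7, assuming firewalld is the firewall in use and SELinux is configured through /etc/selinux/config. The helper name set_selinux_disabled is hypothetical; it only edits the file it is given, so it can be tried on a copy first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite the SELINUX= line of an SELinux config file
# to "disabled". Other lines (comments, SELINUXTYPE=...) are left untouched.
set_selinux_disabled() {
  local config_file="$1"
  sed -i -E 's/^SELINUX=(enforcing|permissive)$/SELINUX=disabled/' "$config_file"
}

# On a real node (as root) the full sequence would be roughly:
#   systemctl stop firewalld && systemctl disable firewalld
#   setenforce 0                              # permissive until the next reboot
#   set_selinux_disabled /etc/selinux/config  # persist across reboots
```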
The net.bridge.* kernel settings used below only exist once the br_netfilter module is loaded, so load that module first:
modprobe br_netfilter
Create the file /etc/sysctl.d/k8s.conf with the following content:
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
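bridge-nf
bridge-nf lets netfilter filter IPv4/ARP/IPv6 packets on Linux bridges. For example, with net.bridge.bridge-nf-call-iptables=1, packets forwarded by a layer-2 bridge are also filtered by the iptables FORWARD rules. Commonly used options:
net.bridge.bridge-nf-call-arptables: whether to filter the bridge's ARP packets in the arptables FORWARD chain
net.bridge.bridge-nf-call-ip6tables: whether to filter IPv6 packets in the ip6tables chains
net.bridge.bridge-nf-call-iptables: whether to filter IPv4 packets in the iptables chains
net.bridge.bridge-nf-filter-vlan-tagged: whether to filter VLAN-tagged packets in iptables/arptables
Run the following command to apply the changes: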
sysctl -p /etc/sysctl.d/k8s.conf
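To double-check the file itself, something like the following could be used. The helper missing_k8s_sysctls is a made-up name; it only parses the file passed to it and does not read the live kernel values (for those, sysctl <key> on the node is the place to look).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every required key that is not set to 1 in the
# given sysctl file. Prints nothing and returns 0 when all three are enabled.
missing_k8s_sysctls() {
  local conf_file="$1" key status=0
  for key in net.bridge.bridge-nf-call-ip6tables \
             net.bridge.bridge-nf-call-iptables \
             net.ipv4.ip_forward; do
    # Split each line on '=', strip blanks, and compare key and value exactly.
    if ! awk -F= -v key="$key" '
          {
            name = $1; value = $2
            gsub(/[ \t]/, "", name); gsub(/[ \t]/, "", value)
            if (name == key && value == "1") found = 1
          }
          END { exit (found ? 0 : 1) }' "$conf_file"; then
      echo "$key"
      status=1
    fi
  done
  return "$status"
}

# On a node: missing_k8s_sysctls /etc/sysctl.d/k8s.conf
```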
Load the ipvs kernel modules:
$ cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
EOF
$ chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4
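The script above creates the file /etc/sysconfig/modules/ipvs.modules so that the required modules are loaded automatically after a node reboot. Use the command lsmod | grep -e ip_vs -e nf_conntrack_ipv4 to check that the required kernel modules are loaded correctly.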
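The grep above shows what is loaded; to see what is missing instead, a small filter over the lsmod output could look like this. The function name missing_ipvs_modules is hypothetical, and the module list is simply the five modules from the script above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `lsmod` output on stdin and print each of the
# five required modules that does not appear in the first column.
missing_ipvs_modules() {
  local loaded module
  loaded="$(awk 'NR > 1 { print $1 }')"   # module names, header line skipped
  for module in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack_ipv4; do
    grep -qxF "$module" <<<"$loaded" || echo "$module"
  done
}

# On a node: lsmod | missing_ipvs_modules
# No output means all five modules are loaded.
```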
Next, make sure the ipset package is installed on every node:
yum install ipset
To make the ipvs proxy rules easier to inspect, it is best to also install the management tool ipvsadm:
yum install ipvsadm
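Synchronize the server clocks. Alibaba Cloud servers in the same region are already in sync, so nothing more is needed here; where it is needed, chrony can be used to set it up.
Turn off the swap partition: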
swapoff -a
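Keep in mind that swapoff -a only lasts until the next reboot. One way to make it stick, assuming swap is configured through /etc/fstab, is to comment out the swap entries there. comment_out_swap below is a hypothetical helper that works on whatever file it is given, so it can be tested on a copy.

```shell
#!/usr/bin/env bash
# Hypothetical helper: comment out every active swap entry in an fstab-style
# file. Lines that are already comments are skipped, so it is safe to re-run.
comment_out_swap() {
  local fstab_file="$1"
  sed -i -E '/^[[:space:]]*#/!s/^(.*[[:space:]]swap[[:space:]].*)$/#\1/' "$fstab_file"
}

# On a node (as root):
#   swapoff -a && comment_out_swap /etc/fstab
```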
Next, install Docker:
[root@master ~]# yum install -y yum-utils device-mapper-persistent-data
[root@master ~]# yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
[root@master ~]# yum list docker-ce --showduplicates|sort -r # choose one of the listed versions to install; I install 18.09.9 here
[root@master ~]# yum install docker-ce-18.09.9 -y
Configure a Docker registry mirror:
# daemon.json may not exist yet, in which case create it yourself
# the "graph" key changes where Docker stores its images
[root@master docker]# cd /etc/docker && cat daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "registry-mirrors": [
    "https://ot2k4d59.mirror.aliyun.com/"
  ],
  "graph": "/data/docker"
}
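Docker and the kubelet must use the same cgroup driver, and systemd is the driver kubeadm recommends, so here we set Docker's cgroupdriver to systemd. If you leave Docker unchanged, adjust the kubelet's startup configuration instead; either way the two must match.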
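A quick way to see which driver a daemon.json asks for is sketched below. The function name configured_cgroup_driver is hypothetical and it uses a plain text match rather than a JSON parser, so treat it as a convenience check only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the cgroup driver requested through "exec-opts"
# in a daemon.json. Prints nothing when the option is absent, in which case
# Docker falls back to its own default.
configured_cgroup_driver() {
  local daemon_json="$1"
  grep -oE 'native\.cgroupdriver=[a-z]+' "$daemon_json" | head -n 1 | cut -d= -f2
}

# On a node:
#   configured_cgroup_driver /etc/docker/daemon.json
# Once Docker is running, the effective driver can be read with:
#   docker info --format '{{.CgroupDriver}}'
```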
Start Docker:
systemctl start docker
systemctl enable docker
With Docker installed and the environment configuration above complete, we can now install kubeadm. Here we install it from a dedicated yum repository:
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
# install from the Alibaba Cloud mirror
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
       http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
Then install kubeadm, kubelet, and kubectl:
# --disableexcludes=kubernetes ignores any exclude= rules configured for the kubernetes repo
[root@master sysctl.d]# yum install -y kubelet-1.16.2 kubeadm-1.16.2 kubectl-1.16.2 --disableexcludes=kubernetes
[root@master sysctl.d]# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T19:15:39Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
As shown, v1.16.2 is installed here. Next, enable kubelet to start on boot:
systemctl enable --now kubelet
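Up to this point, every step above must be carried out on all nodes.
Initializing the cluster
Next, on the master node, prepare the kubeadm initialization file. The default initialization configuration can be exported with the following command: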
kubeadm config print init-defaults > kubeadm.yaml
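Then adjust the configuration file to our needs, for example change the imageRepository value and set the kube-proxy mode to ipvs. Also note that we are going to install the flannel network plugin, so networking.podSubnet must be set to 10.244.0.0/16, as follows: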
[root@master ~]# cat kubeadm.yaml
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 172.17.122.150 # internal IP of the apiserver node
  bindPort: 6443 # apiserver port; nodes use this port and the IP above when they join the cluster
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: master # defaults to the hostname of the current master node
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: gcr.azk8s.cn/google_containers # changed to the Microsoft mirror
kind: ClusterConfiguration
kubernetesVersion: v1.16.0
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16 # Pod network; the flannel plugin needs this subnet
  serviceSubnet: 10.96.0.0/12
scheduler: {}
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs # kube-proxy mode
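Since a wrong podSubnet or a forgotten mode line only shows up later, it can be worth checking the edited file before running kubeadm init. The sketch below is one way to do that; yaml_value and check_kubeadm_yaml are hypothetical helpers, and yaml_value is a simple line matcher for scalar keys, not a real YAML parser.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of the first "key: value" line found
# in the file, with any trailing "# comment" removed.
yaml_value() {
  local key="$1" file="$2"
  awk -v key="$key" '
    { line = $0; sub(/[ \t]*#.*$/, "", line); sub(/[ \t]+$/, "", line) }
    match(line, "^[ \t]*" key ":[ \t]*") { print substr(line, RLENGTH + 1); exit }
  ' "$file"
}

# Hypothetical helper: verify the two settings this guide relies on.
check_kubeadm_yaml() {
  local file="$1"
  [ "$(yaml_value podSubnet "$file")" = "10.244.0.0/16" ] \
    || { echo "podSubnet is not 10.244.0.0/16"; return 1; }
  [ "$(yaml_value mode "$file")" = "ipvs" ] \
    || { echo "kube-proxy mode is not ipvs"; return 1; }
  echo "kubeadm.yaml looks consistent"
}

# On the master: check_kubeadm_yaml kubeadm.yaml
```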
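- Configuration tip
The documentation for the manifest above is rather scattered. For a complete picture of the fields of these resource objects, see the corresponding godoc page: https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2
Then initialize the cluster with the configuration file above: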
kubeadm init --config kubeadm.yaml
Part of the initialization output:
......
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 22.002232 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.16" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node master as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdef
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 172.17.122.150:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:27d2a487e4412c5085ccf97690133f0fed2db6a3d81e3af17af88e90bcbfb613
Continue as prompted: copy the kubeconfig file, which kubectl reads:
[root@master ~]# mkdir -p $HOME/.kube
[root@master ~]# cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@master ~]# chown $(id -u):$(id -g) $HOME/.kube/config
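The execution flow of the kubeadm init command is shown in the figure below:
Adding nodes
Remember to complete the configuration and preparation steps that precede cluster initialization on each node beforehand. Copy the $HOME/.kube/config file from the master node to the same path on the node (so that kubectl there can read it and query the cluster), install kubeadm, kubelet, and kubectl (kubectl is optional), and then run the join command printed at the end of the initialization above.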
[root@node01 ~]# kubeadm join 172.17.122.150:6443 --token abcdef.0123456789abcdef \
> --discovery-token-ca-cert-hash sha256:27d2a487e4412c5085ccf97690133f0fed2db6a3d81e3af17af88e90bcbfb613
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.16" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
- join command
If you have forgotten the join command above, you can obtain it again with kubeadm token create --print-join-command.
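The token in kubeadm.yaml above has ttl: 24h0m0s, so a join command saved from the init output stops working after a day. Before pasting an old command onto a new node, its pieces can be pulled apart and sanity-checked as sketched here. Both function names are hypothetical; the token pattern is the documented bootstrap token format [a-z0-9]{6}.[a-z0-9]{16}, and the check says nothing about whether the token is still valid on the cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when the string has the bootstrap token shape.
is_bootstrap_token() {
  local LC_ALL=C
  local re='^[a-z0-9]{6}\.[a-z0-9]{16}$'
  [[ "$1" =~ $re ]]
}

# Hypothetical helper: print the argument that follows the given flag.
# Usage: join_flag_value --token kubeadm join <endpoint> --token <value> ...
join_flag_value() {
  local flag="$1"; shift
  local args=("$@") i
  for ((i = 0; i < ${#args[@]} - 1; i++)); do
    if [ "${args[i]}" = "$flag" ]; then
      echo "${args[i + 1]}"
      return 0
    fi
  done
  return 1
}

# Example using the token from this guide.
join_args=(kubeadm join 172.17.122.150:6443 --token abcdef.0123456789abcdef)
token="$(join_flag_value --token "${join_args[@]}")"
is_bootstrap_token "$token" && echo "token format ok: $token"
```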
After the join succeeds, run the get nodes command:
[root@master ~]# kubectl get nodes
NAME     STATUS     ROLES    AGE   VERSION
master   NotReady   master   34m   v1.16.2
node01   NotReady   <none>   22s   v1.16.2
node02   NotReady   <none>   5s    v1.16.2
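As shown, the nodes are in the NotReady state because no network plugin has been installed yet. Install one next; the page https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/ lists the network plugins to choose from. Here we use flannel: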
wget https://raw.githubusercontent.com/coreos/flannel/2140ac876ef134e0ed5af15c65e414cf26827915/Documentation/kube-flannel.yml
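Some nodes may have more than one network interface, so the internal interface needs to be specified in the manifest.
Find the DaemonSet named kube-flannel-ds-amd64 and specify the interface under its kube-flannel container (flanneld accepts an --iface=<interface name> argument for this).
Then install the flannel network plugin:
kubectl apply -f kube-flannel.yml # install the flannel network plugin; running this on the master node is enough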
[root@master ~]# kubectl apply -f kube-flannel.yml
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds-amd64 created
daemonset.apps/kube-flannel-ds-arm64 created
daemonset.apps/kube-flannel-ds-arm created
daemonset.apps/kube-flannel-ds-ppc64le created
daemonset.apps/kube-flannel-ds-s390x created
Check the Pod status after a while:
[root@master ~]# kubectl get pods -n kube-system
NAME                             READY   STATUS     RESTARTS   AGE
coredns-667f964f9b-99rbr         1/1     Running    0          100m
coredns-667f964f9b-w2gt7         1/1     Running    0          100m
etcd-master                      1/1     Running    0          99m
kube-apiserver-master            1/1     Running    0          99m
kube-controller-manager-master   1/1     Running    0          99m
kube-flannel-ds-amd64-d7bb5      0/1     Init:0/1   0          62s     #
kube-flannel-ds-amd64-tqzkw      0/1     Init:0/1   0          62s     #
kube-flannel-ds-amd64-x922j      1/1     Running    0          62s     #
kube-proxy-25qd4                 1/1     Running    0          100m
kube-proxy-bqb2f                 1/1     Running    0          66m
kube-proxy-rz2tb                 1/1     Running    0          66m
kube-scheduler-master            1/1     Running    0          99m
Check again a little later:
[root@master ~]# kubectl get pods -n kube-system
NAME                             READY   STATUS     RESTARTS   AGE
coredns-667f964f9b-99rbr         1/1     Running    0          103m
coredns-667f964f9b-w2gt7         1/1     Running    0          103m
etcd-master                      1/1     Running    0          103m
kube-apiserver-master            1/1     Running    0          102m
kube-controller-manager-master   1/1     Running    0          102m
kube-flannel-ds-amd64-d7bb5      0/1     Init:0/1   0          4m25s   #
kube-flannel-ds-amd64-tqzkw      1/1     Running    0          4m25s   #
kube-flannel-ds-amd64-x922j      1/1     Running    0          4m25s   #
kube-proxy-25qd4                 1/1     Running    0          103m
kube-proxy-bqb2f                 1/1     Running    0          69m
kube-proxy-rz2tb                 1/1     Running    0          69m
kube-scheduler-master            1/1     Running    0          103m
Wait a bit longer:
[root@master ~]# kubectl get pods -n kube-system
NAME                             READY   STATUS     RESTARTS   AGE
coredns-667f964f9b-99rbr         1/1     Running    0          106m
coredns-667f964f9b-w2gt7         1/1     Running    0          106m
etcd-master                      1/1     Running    0          105m
kube-apiserver-master            1/1     Running    0          105m
kube-controller-manager-master   1/1     Running    0          105m
kube-flannel-ds-amd64-d7bb5      1/1     Running    0          6m54s   #
kube-flannel-ds-amd64-tqzkw      1/1     Running    0          6m54s   #
kube-flannel-ds-amd64-x922j      1/1     Running    0          6m54s   #
kube-proxy-25qd4                 1/1     Running    0          106m
kube-proxy-bqb2f                 1/1     Running    0          72m
kube-proxy-rz2tb                 1/1     Running    0          72m
kube-scheduler-master            1/1     Running    0          105m
Note how the rows marked with # change between checks.
You can see the network plugin coming up on each node until every Pod is in the Running state.
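Instead of re-running the command by hand, the waiting can be scripted. The following is a rough sketch under my own naming: wait_for, pending_pods, and kube_system_ready are all hypothetical helpers, and "ready" here simply means every Pod in kube-system reports STATUS Running.

```shell
#!/usr/bin/env bash
# Hypothetical helper: re-run a command every <interval> seconds until it
# succeeds or <timeout> seconds have passed. Returns 1 on timeout.
wait_for() {
  local timeout="$1" interval="$2"; shift 2
  local deadline=$((SECONDS + timeout))
  until "$@"; do
    [ "$SECONDS" -lt "$deadline" ] || return 1
    sleep "$interval"
  done
}

# Hypothetical helper: read `kubectl get pods --no-headers` output on stdin
# and print the names of Pods whose STATUS column is not "Running".
pending_pods() {
  awk '$3 != "Running" { print $1 }'
}

# Hypothetical helper: succeed only when kubectl works and nothing is pending.
kube_system_ready() {
  local listing
  listing="$(kubectl get pods -n kube-system --no-headers)" || return 1
  [ -z "$(pending_pods <<<"$listing")" ]
}

# On the master: wait_for 600 10 kube_system_ready && echo "all Pods Running"
```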
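- Flannel network plugin
After the network plugin is deployed, running ifconfig normally shows two new virtual devices, cni0 and flannel.1. If cni0 is missing, there is no need to worry: check whether the /var/lib/cni directory exists. If it does not, the deployment is not broken; the node simply has no workload running yet. Run a Pod on that node and the directory will be created, along with the cni0 device.
Checking the cluster status again now shows everything is normal: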
[root@master ~]# kubectl get nodes
NAME     STATUS   ROLES    AGE    VERSION
master   Ready    master   112m   v1.16.2
node01   Ready    <none>   78m    v1.16.2
node02   Ready    <none>   78m    v1.16.2
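For scripts, the same check can be reduced to a list of nodes that still need attention. This is a small sketch with a made-up function name; it treats anything other than an exact Ready status as not ready, which also flags values such as Ready,SchedulingDisabled.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubectl get nodes --no-headers` output on stdin
# and print the names of nodes whose STATUS column is not exactly "Ready".
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 }'
}

# On the master: kubectl get nodes --no-headers | not_ready_nodes
# No output means every node is Ready.
```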
Installing the Dashboard
A v1.16.2 cluster needs the latest 2.0+ version of the Dashboard.
The following approach is recommended:
wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta5/aio/deploy/recommended.yaml
vim recommended.yaml
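Add type: NodePort to the kubernetes-dashboard Service so that the Dashboard can be reached from outside the k8s cluster. For details, see the three ways of exposing Kubernetes services externally: NodePort, LoadBalancer, and Ingress.
- Monitoring component
The YAML file shows that the new Dashboard bundles a metrics-scraper component. It collects basic resource monitoring data through the Kubernetes Metrics API and displays it on the web page, so showing monitoring data on the page requires something that provides the Metrics API, for example Metrics Server.
Create it directly: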
kubectl apply -f recommended.yaml
[root@master ~]# kubectl apply -f recommended.yaml
namespace/kubernetes-dashboard created
serviceaccount/kubernetes-dashboard created
service/kubernetes-dashboard created
secret/kubernetes-dashboard-certs created
secret/kubernetes-dashboard-csrf created
secret/kubernetes-dashboard-key-holder created
configmap/kubernetes-dashboard-settings created
role.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrole.rbac.authorization.k8s.io/kubernetes-dashboard created
rolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
deployment.apps/kubernetes-dashboard created
service/dashboard-metrics-scraper created
deployment.apps/dashboard-metrics-scraper created
The new Dashboard is installed in the kubernetes-dashboard namespace by default:
[root@master ~]# kubectl get pods -n kubernetes-dashboard -l k8s-app=kubernetes-dashboard
NAME                                    READY   STATUS    RESTARTS   AGE
kubernetes-dashboard-6b86b44f87-z5zjq   1/1     Running   0          54s
Parameter notes:
-n: the Kubernetes namespace to query
-l: --selector='': Selector (label query) to filter on, supports '=', '==', and '!='.(e.g. -l key1=value1,key2=value2)
[root@master ~]# kubectl get svc -n kubernetes-dashboard
NAME                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)         AGE
dashboard-metrics-scraper   ClusterIP   10.100.100.100   <none>        8000/TCP        11m
kubernetes-dashboard        NodePort    10.97.88.252     <none>        443:31982/TCP   11m
The Dashboard can then be accessed through port 31982 shown above; remember to use https.
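The NodePort is assigned by the cluster, so it will differ between installs. A small filter such as the one below can pick it out of the listing; node_port_for is a hypothetical name, and it expects the default kubectl get svc column layout shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubectl get svc` output on stdin and print the
# NodePort mapped to the given service port, e.g. 443:31982/TCP -> 31982.
# Prints nothing for services without a NodePort mapping.
node_port_for() {
  local service="$1" port="$2"
  awk -v svc="$service" -v port="$port" '
    $1 == svc {
      n = split($5, mappings, ",")              # PORT(S) may list several
      for (i = 1; i <= n; i++) {
        split(mappings[i], parts, "[:/]")       # port, nodePort, protocol
        if (parts[1] == port && parts[2] ~ /^[0-9]+$/) { print parts[2]; exit }
      }
    }'
}

# On the master:
#   kubectl get svc -n kubernetes-dashboard | node_port_for kubernetes-dashboard 443
# kubectl can also return it directly:
#   kubectl get svc kubernetes-dashboard -n kubernetes-dashboard -o jsonpath='{.spec.ports[0].nodePort}'
```

The Dashboard URL is then https://<any node IP>:<that port>.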
Next, create a user with full cluster-wide permissions for logging in to the Dashboard (admin.yaml):
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: admin
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: admin
  namespace: kubernetes-dashboard
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin
  namespace: kubernetes-dashboard
Create it directly:
[root@master ~]# kubectl apply -f admin.yaml
clusterrolebinding.rbac.authorization.k8s.io/admin created
serviceaccount/admin created
[root@master ~]# kubectl get secret -n kubernetes-dashboard|grep admin-token
admin-token-mwcv6 kubernetes.io/service-account-token 3 2m58s
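[root@master ~]# kubectl get secret admin-token-mwcv6 -o jsonpath={.data.token} -n kubernetes-dashboard |base64 -d # prints a long string
# Note: the first field in the output of the first command is used as the secret name in the second command to fetch the token
Paste this token into the token field on the login page to sign in to the Dashboard console.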
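The two commands can be chained so the secret name does not have to be copied by hand. This is a sketch with a hypothetical helper, token_secret_name, that assumes the secret follows the <account>-token-<suffix> naming seen in the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubectl get secret` output on stdin and print
# the first service-account token secret named "<account>-token-...".
token_secret_name() {
  local account="$1"
  awk -v prefix="${account}-token-" '
    index($1, prefix) == 1 && $2 == "kubernetes.io/service-account-token" {
      print $1; exit
    }'
}

# On the master, chained together:
#   secret="$(kubectl get secret -n kubernetes-dashboard | token_secret_name admin)"
#   kubectl get secret "$secret" -n kubernetes-dashboard -o jsonpath='{.data.token}' | base64 -d
```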
If you run into other problems while installing the cluster, you can reset it with the following commands:
$ kubeadm reset
$ ifconfig cni0 down && ip link delete cni0
$ ifconfig flannel.1 down && ip link delete flannel.1
$ rm -rf /var/lib/cni/