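I. Prepare the machines
1. Set the node hostnames
Every node must have a unique hostname. If two nodes share a name, the master cannot discover the node after it joins.
Master node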
hostnamectl --static set-hostname k8s-master
Worker nodes (replace the trailing n with a number that is unique per node, for example k8s-node1)
hostnamectl --static set-hostname k8s-noden
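The new name takes effect after a reboot. To apply it to the running system without rebooting, set it from /etc/hostname, which hostnamectl has just updated:
hostname "$(cat /etc/hostname)"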
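Kubernetes uses the hostname as the node name by default, and node names have to be valid DNS subdomain names (lowercase letters, digits, '-' and '.'). A minimal sketch of a pre-check follows; is_valid_node_name is a hypothetical helper written for this note, not part of kubeadm or any other tool.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if "$1" is usable as a Kubernetes
# node name (RFC 1123 DNS subdomain, at most 253 characters).
is_valid_node_name() {
  [ "${#1}" -le 253 ] || return 1
  printf '%s\n' "$1" | LC_ALL=C grep -Eq \
    '^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$'
}
```

For example, is_valid_node_name "$(cat /etc/hostname)" || echo "rename this host first" can be run on each machine before kubeadm init or kubeadm join. It does not detect duplicates across machines; that still has to be checked by hand.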
2. Disable swap
Turn it off temporarily (until the next reboot)
swapoff -a
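swapoff -a does not survive a reboot. One common way to make the change permanent is to comment out the swap entries in /etc/fstab. The sketch below assumes a standard fstab layout; disable_swap_in_fstab is a hypothetical helper that takes the file path as an argument so it can be tried on a copy first.

```shell
#!/bin/sh
# Hypothetical helper: comment out every active swap entry in the
# fstab-style file "$1". A backup of the original is kept as "$1.bak".
disable_swap_in_fstab() {
  sed -E -i.bak '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$1"
}
```

Try it on a copy (cp /etc/fstab /tmp/fstab, then disable_swap_in_fstab /tmp/fstab) and inspect the result before running it as root against /etc/fstab.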
3. Disable the firewall
ufw status
ufw disable
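II. Install docker-ce
If Docker is already installed, either remove the existing installation first or skip this section.
1. One-shot script: install the latest docker-ce from the Aliyun mirror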
#!/bin/bash
# Run as root. Installs the latest docker-ce from the Aliyun mirror.
apt update
# -y keeps apt from stopping at a confirmation prompt inside the script
apt install -y apt-transport-https ca-certificates curl gnupg-agent software-properties-common
curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | apt-key add -
add-apt-repository \
"deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu \
$(lsb_release -cs) \
stable"
apt update
apt install -y docker-ce docker-ce-cli containerd.io
docker --version
2. Step by step: install a specific docker-ce version
a. Install the prerequisites
apt update
apt install apt-transport-https ca-certificates curl gnupg-agent software-properties-common
b. Add the GPG key
sudo curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add -
c. Add the package source
add-apt-repository \
"deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu \
$(lsb_release -cs) \
stable"
d. Install the chosen docker-ce version
1. Refresh the package index
apt update
2. List the available docker-ce versions
apt-cache madison docker-ce
3. Install, using the version string exactly as listed
apt install docker-ce=18.06.3~ce~3-0~ubuntu
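Copying the version string by hand from the madison listing is easy to get wrong. The sketch below picks it out automatically, assuming the usual "package | version | source" column layout of apt-cache madison; find_pkg_version is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: read `apt-cache madison <pkg>` output on stdin and
# print the first version that starts with the prefix "$1".
# Exits non-zero if no listed version matches.
find_pkg_version() {
  awk -F'|' -v p="$1" '
    { v = $2; gsub(/[[:space:]]/, "", v) }
    index(v, p) == 1 { print v; found = 1; exit }
    END { exit !found }
  '
}
```

Possible usage: apt install docker-ce="$(apt-cache madison docker-ce | find_pkg_version 18.06)"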
3. Configure Docker Hub registry mirrors
Pulling images from registries outside China can fail on mainland networks, so add mirrors.
vim /etc/docker/daemon.json
---------------------------------------------------
{
  "registry-mirrors": [
    "https://hub-mirror.c.163.com",
    "https://ustc-edu-cn.mirror.aliyuncs.com",
    "https://mirror.baidubce.com"
  ]
}
---------------------------------------------------
4. Restart Docker
systemctl daemon-reload && systemctl restart docker
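A hand-edited daemon.json with a stray comma keeps Docker from starting. As an alternative to editing the file in vim, the sketch below generates it from a list of mirror URLs. render_daemon_json is a hypothetical helper, and it assumes the URLs contain no characters that need JSON escaping.

```shell
#!/bin/sh
# Hypothetical helper: print a daemon.json document whose
# "registry-mirrors" array holds the URLs passed as arguments.
render_daemon_json() {
  printf '{\n  "registry-mirrors": [\n'
  while [ "$#" -gt 0 ]; do
    # Every entry except the last one is followed by a comma.
    if [ "$#" -gt 1 ]; then rdj_sep=','; else rdj_sep=''; fi
    printf '    "%s"%s\n' "$1" "$rdj_sep"
    shift
  done
  printf '  ]\n}\n'
}
```

Possible usage as root: render_daemon_json https://hub-mirror.c.163.com https://mirror.baidubce.com > /etc/docker/daemon.json, followed by the restart command above. Note that this overwrites any other settings already in the file.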
III. Install a specific version of kubeadm
#!/bin/bash
apt update && apt install -y apt-transport-https
curl -fsSL https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
add-apt-repository "deb [arch=amd64] https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main"
apt-get update
apt-cache madison kubelet kubectl kubeadm |grep '1.15.4-00'
apt install -y kubelet=1.15.4-00 kubectl=1.15.4-00 kubeadm=1.15.4-00
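Allow kubelet to start while swap is enabled (only needed if swap has not been turned off permanently)
vim /etc/default/kubelet
---------------------------------------------------
KUBELET_EXTRA_ARGS="--fail-swap-on=false"
---------------------------------------------------
Restart the service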
systemctl daemon-reload && systemctl restart kubelet
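The install script above repeats the version string for each of the three packages. A small sketch that derives all three package specs from one value is shown below; k8s_pkg_specs is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: print the apt package specs for kubelet, kubectl
# and kubeadm, all pinned to the version given as "$1".
k8s_pkg_specs() {
  printf 'kubelet=%s kubectl=%s kubeadm=%s\n' "$1" "$1" "$1"
}
```

Possible usage as root: apt install -y $(k8s_pkg_specs 1.15.4-00), optionally followed by apt-mark hold kubelet kubectl kubeadm so that a later apt upgrade does not move the cluster to a different version.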
IV. Initialize the cluster
1. Bring up the master node
a. Initialize the node
kubeadm init \
--kubernetes-version=v1.15.4 \
--image-repository registry.aliyuncs.com/google_containers \
--pod-network-cidr=10.24.0.0/16 \
--ignore-preflight-errors=Swap
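On success it prints output like the following. Save it, because the kubeadm join command at the end is needed when adding nodes.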
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.1.21:6443 --token xcczbg.zr6mb4dzlu6wdg6r \
--discovery-token-ca-cert-hash sha256:3594158e202d0280512f8a3bab2de144b601fb3c7f928dcebc2556a55d673ff0
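One way to keep that output is to pipe kubeadm init through tee into a file. The sketch below then pulls the join command back out of the saved file as a single line. extract_join_command is a hypothetical helper and kubeadm-init.log is an assumed file name; the parsing assumes the two-line layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the "kubeadm join ..." command found in the
# saved `kubeadm init` output "$1", with its continuation line merged in.
extract_join_command() {
  grep -m1 -A1 'kubeadm join' "$1" |
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*\\$//' |
    paste -s -d ' ' -
}
```

Possible usage: append "| tee kubeadm-init.log" to the kubeadm init command, then run extract_join_command kubeadm-init.log whenever a node has to be added.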
b. Run the commands from that output so that kubectl can talk to the cluster
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
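c. Deploy a pod network to the cluster; flannel is used here. The stock kube-flannel.yml sets Network to 10.244.0.0/16 in its net-conf.json, so when a different --pod-network-cidr is used (10.24.0.0/16 above), download the manifest and change that value to match before applying it.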
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
2. Add nodes to the cluster
On each node, run the join command printed by kubeadm init:
kubeadm join 192.168.1.21:6443 --token xcczbg.zr6mb4dzlu6wdg6r \
--discovery-token-ca-cert-hash sha256:3594158e202d0280512f8a3bab2de144b601fb3c7f928dcebc2556a55d673ff0
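A mistyped token or hash is one possible reason for a join that fails. kubeadm bootstrap tokens have the form of 6 characters, a dot, then 16 characters, all lowercase letters or digits, and the hash is "sha256:" followed by 64 hex digits. The sketch below checks both before the join is attempted; check_join_args is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: validate the format of a bootstrap token ("$1")
# and of a --discovery-token-ca-cert-hash value ("$2").
check_join_args() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$' ||
    { echo "malformed token: $1" >&2; return 1; }
  printf '%s\n' "$2" | LC_ALL=C grep -Eq '^sha256:[0-9a-f]{64}$' ||
    { echo "malformed CA cert hash: $2" >&2; return 1; }
}
```

This only checks the format. A well-formed token can still be expired, which is covered under common errors.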
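3. Single-node cluster: by default, pods are not scheduled on the master node
Removing the master taint with the command below lets pods be scheduled on the master: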
kubectl taint nodes --all node-role.kubernetes.io/master-
4. Dashboard
a. Deploy the dashboard to the cluster
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta4/aio/deploy/recommended.yaml
b. Create a service account
vim admin-user.yaml
---------------------------------------------------
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kube-system
---------------------------------------------------
kubectl create -f admin-user.yaml
c. Bind the cluster-admin role
vim admin-user-role-binding.yaml
---------------------------------------------------
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kube-system
---------------------------------------------------
kubectl create -f admin-user-role-binding.yaml
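d. Get the token
Run the command below. Its output starts with a Name line like the one shown (the b9bwj suffix will differ on your cluster) and contains a token field. Save the token; it is needed to log in to the dashboard.
kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin-user | awk '{print $1}')
Name: admin-user-token-b9bwj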
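To print only the token instead of the whole description, the output can be filtered. The sketch below assumes the usual "token: <value>" line in the describe output; extract_token is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl describe secret` output on stdin and
# print the value of its "token:" field. Exits non-zero if there is none.
extract_token() {
  awk '$1 == "token:" { print $2; found = 1; exit } END { exit !found }'
}
```

Possible usage: pipe the describe command above into extract_token and redirect the result to a file that only you can read.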
e. Create a client certificate
grep 'client-certificate-data' ~/.kube/config | head -n 1 | awk '{print $2}' | base64 -d > kubecfg.crt
grep 'client-key-data' ~/.kube/config | head -n 1 | awk '{print $2}' | base64 -d > kubecfg.key
openssl pkcs12 -export -clcerts -inkey kubecfg.key -in kubecfg.crt -out kubecfg.p12 -name "kubernetes-client"
Download the generated kubecfg.p12 to your workstation and double-click it to install the certificate.
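The two extraction commands differ only in the field name, so they can be folded into one function. The sketch below assumes each field sits on a single "name: value" line of the kubeconfig; kubeconfig_field is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: print the base64-decoded value of the first field
# named "$1" (for example client-key-data) in the kubeconfig file "$2".
kubeconfig_field() {
  awk -v key="$1:" '$1 == key { print $2; exit }' "$2" | base64 -d
}
```

Possible usage: kubeconfig_field client-certificate-data ~/.kube/config > kubecfg.crt and kubeconfig_field client-key-data ~/.kube/config > kubecfg.key. Writing with > means a second run overwrites the files rather than appending to them.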
f. Open the dashboard
Enter the following in the browser address bar, replacing the IP with your master's address:
https://192.168.3.101:6443/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/
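V. Test
1. Add an nginx pod
- Open the dashboard
- Click the plus sign in the upper-right corner (Create new resource)
- Click Create from input
- Enter:
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    name: lab-ngx
    app: lab-ngx
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80
- Call nginx
The pod gets its IP from the --pod-network-cidr range set during kubeadm init; kubectl get pod nginx -o wide shows the actual address.
curl 10.24.0.6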
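Right after the pod is created the image may still be downloading, so the first curl can fail. A minimal retry sketch is shown below; retry is a hypothetical helper, and the attempt count and delay are arbitrary choices.

```shell
#!/bin/sh
# Hypothetical helper: run a command until it succeeds.
# $1: maximum number of attempts, $2: seconds to wait between attempts,
# remaining arguments: the command to run.
retry() {
  retry_max=$1 retry_delay=$2
  shift 2
  retry_n=1
  until "$@"; do
    # Give up once the last allowed attempt has failed.
    [ "$retry_n" -ge "$retry_max" ] && return 1
    retry_n=$((retry_n + 1))
    sleep "$retry_delay"
  done
}
```

Possible usage: retry 30 2 curl -fsS http://10.24.0.6 tries for roughly a minute before giving up.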
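VI. Common errors
1. Two nodes share a hostname: the node joins successfully but the master does not list it
a. Rename the host
hostnamectl --static set-hostname k8s-master
hostname "$(cat /etc/hostname)"
After the rename, run kubeadm reset first, then init again and re-join the machines.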
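2. A node hangs while joining the cluster
a. The bootstrap token has expired
Check the existing tokens and their expiry
kubeadm token list
Create a token that never expires
kubeadm token create --ttl 0
Compute the CA certificate hash
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
Join the cluster again with the new token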
kubeadm join 192.168.3.206:6443 --token yev1gf.njaktxs6sqyml7kr \
--discovery-token-ca-cert-hash sha256:9a3df0018f0c8d7c4d02aa7066c96f3180b668edbeefd381ad5b9b06819c56b4
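The hash pipeline above can be wrapped so that it prints the value in the exact form the join command expects. The sketch below assumes an RSA CA key, as the pipeline itself does; ca_cert_hash is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: print "sha256:<hex>" for the CA certificate at
# "$1", ready to pass to --discovery-token-ca-cert-hash.
ca_cert_hash() {
  printf 'sha256:%s\n' "$(
    openssl x509 -pubkey -in "$1" |
      openssl rsa -pubin -outform der 2>/dev/null |
      openssl dgst -sha256 -hex | sed 's/^.* //'
  )"
}
```

Possible usage on the master: ca_cert_hash /etc/kubernetes/pki/ca.crt. kubeadm can also do all of this in one step: kubeadm token create --print-join-command creates a new token and prints a complete join command.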
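3. After deploying an image, the pod does not start
kubectl describe pod {POD_NAME} --namespace {NAMESPACE}
Read the events and error messages in the output and search for a fix for the specific error.
Common causes
a. The hostname was changed without a reboot
b. network: failed to set bridge addr: "cni0" already has an IP address different from 10.24.4.1/24
Delete the cni0 interface; it is recreated automatically.
sudo ifconfig cni0 down
sudo ip link delete cni0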