In this project, the Java application runs as containerized services on a k8s cluster.
Machines in the K8S cluster:
n1628 (master)
n1542
n1509
k8s version in use:
v1.15.3
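During operation, expired certificates left kube-apiserver unable to communicate, which caused a service outage.
Discovering the problem
On July 22, 2021, we noticed that the comparison system was misbehaving: tasks recorded no clicks and there was no deduplication data. We logged in to n1628 to check the containers
and found that kube-apiserver had exited abnormally, so we pulled its logs: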
docker logs -t --since="2021-07-20T13:23:37" --until "2021-07-23T12:23:37" 2d25ad94ef32
The logs showed the following errors:
2021-07-22T03:18:27.477804611Z I0722 03:18:27.477643 1 controller.go:107] OpenAPI AggregationController: Processing item
2021-07-22T03:18:27.477852624Z I0722 03:18:27.477691 1 controller.go:130] OpenAPI AggregationController: action for item : Nothing (removed from the queue).
2021-07-22T03:18:27.477878308Z I0722 03:18:27.477705 1 controller.go:130] OpenAPI AggregationController: action for item k8s_internal_local_delegation_chain_0000000000: Nothing (removed from the queue).
2021-07-22T03:18:27.487305589Z I0722 03:18:27.487152 1 storage_scheduling.go:128] all system priority classes are created successfully or already exist.
2021-07-22T03:18:28.120293228Z E0722 03:18:28.120118 1 authentication.go:65] Unable to authenticate the request due to an error: x509: certificate has expired or is not yet valid
2021-07-22T03:18:28.126428343Z E0722 03:18:28.126310 1 authentication.go:65] Unable to authenticate the request due to an error: x509: certificate has expired or is not yet valid
2021-07-22T03:18:28.130438641Z E0722 03:18:28.130330 1 authentication.go:65] Unable to authenticate the request due to an error: x509: certificate has expired or is not yet valid
2021-07-22T03:18:28.133040542Z E0722 03:18:28.132961 1 authentication.go:65] Unable to authenticate the request due to an error: x509: certificate has expired or is not yet valid
2021-07-22T03:18:28.135957952Z E0722 03:18:28.135862 1 authentication.go:65] Unable to authenticate the request due to an error: x509: certificate has expired or is not yet valid
2021-07-22T03:18:30.234536126Z E0722 03:18:30.234254 1 authentication.go:65] Unable to authenticate the request due to an error: x509: certificate has expired or is not yet valid
2021-07-22T03:18:30.238039464Z E0722 03:18:30.237922 1 authentication.go:65] Unable to authenticate the request due to an error: x509: certificate has expired or is not yet valid
2021-07-22T03:18:30.240711863Z E0722 03:18:30.240616 1 authentication.go:65] Unable to authenticate the request due to an error: x509: certificate has expired or is not yet valid
2021-07-22T03:18:30.243274286Z E0722 03:18:30.243179 1 authentication.go:65] Unable to authenticate the request due to an error: x509: certificate has expired or is not yet valid
2021-07-22T03:18:30.245872244Z E0722 03:18:30.245775 1 authentication.go:65] Unable to authenticate the request due to an error: x509: certificate has expired or is not yet valid
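In a longer log the relevant lines are easy to miss. The following is a minimal sketch, not part of the original troubleshooting, of two filters that read log lines on stdin: one counts the x509 failures, the other prints the Docker timestamp of the first one. The function names are made up for illustration, and the sketch assumes the log format shown above (as produced by docker logs -t).

```shell
#!/usr/bin/env bash
# Sketch: summarize x509 authentication failures in kube-apiserver logs.
# Both functions read log lines from stdin. Names are illustrative only.

X509_PATTERN='x509: certificate has expired or is not yet valid'

# Print how many lines contain the x509 expiry error.
# Note: grep -c prints 0 (and returns a non-zero status) when nothing matches.
count_x509_errors() {
    grep -c "$X509_PATTERN"
}

# Print the first field (the timestamp added by `docker logs -t`)
# of the first line that contains the error.
first_x509_error_time() {
    grep -m1 "$X509_PATTERN" | awk '{print $1}'
}
```

For example, docker logs -t 2d25ad94ef32 2>&1 | count_x509_errors would report how often the error occurred (2>&1 is needed because docker logs forwards the container's stderr to stderr).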
This points to expired certificates. To verify:
for item in `find /etc/kubernetes/pki -maxdepth 2 -name "*.crt"`;
do openssl x509 -in $item -text -noout| grep Not;
echo ======================$item===============;
done
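The output shows that the leaf certificates expired at about 03:16 GMT on July 22, 2021. Each separator line names the file whose dates are printed directly above it; the three CA certificates remain valid until 2029.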
Not Before: Sep 19 06:27:23 2019 GMT
Not After : Jul 22 03:16:23 2021 GMT
======================/etc/kubernetes/pki/front-proxy-client.crt===============
Not Before: Sep 19 06:27:24 2019 GMT
Not After : Jul 22 03:16:21 2021 GMT
======================/etc/kubernetes/pki/apiserver-etcd-client.crt===============
Not Before: Sep 19 06:27:25 2019 GMT
Not After : Sep 16 06:27:25 2029 GMT
======================/etc/kubernetes/pki/ca.crt===============
Not Before: Sep 19 06:27:25 2019 GMT
Not After : Jul 22 03:16:21 2021 GMT
======================/etc/kubernetes/pki/apiserver.crt===============
Not Before: Sep 19 06:27:25 2019 GMT
Not After : Jul 22 03:16:21 2021 GMT
======================/etc/kubernetes/pki/apiserver-kubelet-client.crt===============
Not Before: Sep 19 06:27:23 2019 GMT
Not After : Sep 16 06:27:23 2029 GMT
======================/etc/kubernetes/pki/front-proxy-ca.crt===============
Not Before: Sep 19 06:27:24 2019 GMT
Not After : Jul 22 03:16:22 2021 GMT
======================/etc/kubernetes/pki/etcd/server.crt===============
Not Before: Sep 19 06:27:24 2019 GMT
Not After : Sep 16 06:27:24 2029 GMT
======================/etc/kubernetes/pki/etcd/ca.crt===============
Not Before: Sep 19 06:27:24 2019 GMT
Not After : Jul 22 03:16:22 2021 GMT
======================/etc/kubernetes/pki/etcd/healthcheck-client.crt===============
Not Before: Sep 19 06:27:24 2019 GMT
Not After : Jul 22 03:16:22 2021 GMT
======================/etc/kubernetes/pki/etcd/peer.crt===============
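The loop above prints each file name after its dates, which is easy to misread. A minimal sketch of an alternative that puts status, expiry date and path on one line is shown below. It relies on the openssl x509 options -enddate and -checkend; the function names are illustrative and the default directory is the kubeadm path used above.

```shell
#!/usr/bin/env bash
# Sketch: print "<status> <notAfter> <file>" for every certificate in a directory.
# Function names are illustrative; the default path is the one used in this write-up.

# Print OK if the certificate is still valid right now, EXPIRED otherwise.
# An unreadable or malformed file is also reported as EXPIRED.
cert_status() {
    local cert="$1"
    # -checkend 0 exits non-zero if the certificate has already expired.
    if openssl x509 -in "$cert" -noout -checkend 0 >/dev/null 2>&1; then
        echo "OK"
    else
        echo "EXPIRED"
    fi
}

list_cert_expiry() {
    local dir="${1:-/etc/kubernetes/pki}"
    local cert end
    find "$dir" -maxdepth 2 -name '*.crt' | sort | while IFS= read -r cert; do
        # -enddate prints a line such as "notAfter=Jul 22 03:16:21 2021 GMT".
        end="$(openssl x509 -in "$cert" -noout -enddate | cut -d= -f2)"
        printf '%-8s %s  %s\n' "$(cert_status "$cert")" "$end" "$cert"
    done
}
```

Calling list_cert_expiry with no arguments scans /etc/kubernetes/pki; a different directory can be passed as the first argument.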
Renewing the certificates
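Renewal rewrites the files under /etc/kubernetes/pki in place, so it can be worth keeping a copy of the directory first. The following is a minimal sketch of such a backup step; it was not part of the original procedure, and the function name and the /root destination are assumptions.

```shell
#!/usr/bin/env bash
# Sketch: copy the PKI directory to a timestamped backup before renewing.
# The function name and the default destination under /root are assumptions.

backup_pki() {
    local src="${1:-/etc/kubernetes/pki}"
    local dest_root="${2:-/root}"
    local dest
    dest="$dest_root/pki-backup-$(date +%Y%m%d-%H%M%S)"
    # -a preserves permissions, ownership and timestamps of keys and certificates.
    # On success, print the backup path so the caller can record it.
    cp -a "$src" "$dest" && echo "$dest"
}
```

Running backup_pki with no arguments copies /etc/kubernetes/pki to a directory such as /root/pki-backup-<timestamp> and prints that path.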
Renew all certificates with kubeadm (in v1.15 the subcommand still lives under kubeadm alpha):
kubeadm alpha certs renew all --config=/root/kubeadm.conf
On the master, restart the kube-apiserver, kube-controller-manager, kube-scheduler, and etcd containers so that the renewed certificates take effect:
docker ps -a | grep -v pause | grep -E "etcd|scheduler|controller|apiserver" | awk '{print $1}' | awk '{print "docker","restart",$1}' | bash
Check the certificate expiry dates again:
for item in `find /etc/kubernetes/pki -maxdepth 2 -name "*.crt"`;do openssl x509 -in $item -text -noout| grep Not;echo ======================$item===============;done
The renewed certificates now expire on July 22, 2022:
Not Before: Sep 19 06:27:23 2019 GMT
Not After : Jul 22 03:16:23 2022 GMT
======================/etc/kubernetes/pki/front-proxy-client.crt===============
Not Before: Sep 19 06:27:24 2019 GMT
Not After : Jul 22 03:16:21 2022 GMT
======================/etc/kubernetes/pki/apiserver-etcd-client.crt===============
Not Before: Sep 19 06:27:25 2019 GMT
Not After : Sep 16 06:27:25 2029 GMT
======================/etc/kubernetes/pki/ca.crt===============
Not Before: Sep 19 06:27:25 2019 GMT
Not After : Jul 22 03:16:21 2022 GMT
======================/etc/kubernetes/pki/apiserver.crt===============
Not Before: Sep 19 06:27:25 2019 GMT
Not After : Jul 22 03:16:21 2022 GMT
======================/etc/kubernetes/pki/apiserver-kubelet-client.crt===============
Not Before: Sep 19 06:27:23 2019 GMT
Not After : Sep 16 06:27:23 2029 GMT
======================/etc/kubernetes/pki/front-proxy-ca.crt===============
Not Before: Sep 19 06:27:24 2019 GMT
Not After : Jul 22 03:16:22 2022 GMT
======================/etc/kubernetes/pki/etcd/server.crt===============
Not Before: Sep 19 06:27:24 2019 GMT
Not After : Sep 16 06:27:24 2029 GMT
======================/etc/kubernetes/pki/etcd/ca.crt===============
Not Before: Sep 19 06:27:24 2019 GMT
Not After : Jul 22 03:16:22 2022 GMT
======================/etc/kubernetes/pki/etcd/healthcheck-client.crt===============
Not Before: Sep 19 06:27:24 2019 GMT
Not After : Jul 22 03:16:22 2022 GMT
======================/etc/kubernetes/pki/etcd/peer.crt===============
Follow-up
- Monitor certificate expiry dates
- Renew certificates automatically
- Use cert-manager
- Extend the certificate validity period to 10 years
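The monitoring item could start as a small script run from cron. The following is a minimal sketch under stated assumptions: the function names and the 30-day default threshold are illustrative, and how the non-zero exit status gets turned into an alert is left to whatever monitoring is already in place.

```shell
#!/usr/bin/env bash
# Sketch: flag certificates that expire within a given number of days.
# Function names and the 30-day default are assumptions for illustration.

# Print the path of every *.crt under DIR that expires within WARN_DAYS.
expiring_certs() {
    local dir="${1:-/etc/kubernetes/pki}"
    local warn_days="${2:-30}"
    local seconds=$(( warn_days * 86400 ))
    local cert
    find "$dir" -maxdepth 2 -name '*.crt' | sort | while IFS= read -r cert; do
        # -checkend N exits non-zero if the certificate expires within N seconds.
        if ! openssl x509 -in "$cert" -noout -checkend "$seconds" >/dev/null 2>&1; then
            echo "$cert"
        fi
    done
}

# Return 1 and list the affected files on stderr if anything expires soon.
check_certs() {
    local expiring
    expiring="$(expiring_certs "$@")"
    if [ -n "$expiring" ]; then
        echo "Certificates expiring soon:" >&2
        echo "$expiring" >&2
        return 1
    fi
    return 0
}
```

A daily cron job could call check_certs /etc/kubernetes/pki 30 and raise an alert on a non-zero exit status, leaving about a month to renew before an outage like the one described here.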