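Liveness Probe
A Liveness probe lets users define their own condition for judging whether a container is healthy. If the probe fails, Kubernetes restarts the container.
Step 1: Create the following Deployment: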
[root@master-01 k8s]# vim liveness.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-healthy
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp
      release: canary
  template:
    metadata:
      labels:
        app: myapp
        release: canary
    spec:
      imagePullSecrets:
      - name: regsecret
      hostAliases:
      - ip: "10.1.1.5"
        hostnames:
        - "harbor-ali.abc.com"
      containers:
      - name: myapp
        image: "harbor-ali.abc.com/k8s_img/myapp:v1"
        imagePullPolicy: Always
        ports:
        - name: http
          containerPort: 80
        args:
        - /bin/sh
        - -c
        - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 400
        livenessProbe:
          exec:
            command:
            - cat
            - /tmp/healthy
          initialDelaySeconds: 10
          periodSeconds: 5
      nodeSelector:
        node-label: test
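The startup process first creates the file /tmp/healthy and deletes it 30 seconds later. In this setup, the container is considered healthy as long as /tmp/healthy exists, and faulty otherwise.
The livenessProbe section defines how the Liveness probe is run:
The probe method: use the cat command to check whether /tmp/healthy exists. If the command succeeds and returns zero, Kubernetes treats this Liveness probe as successful; if the command returns a non-zero value, this Liveness probe has failed.
initialDelaySeconds: 10 means Liveness probing starts 10 seconds after the container starts. This value is usually set according to how long the application needs to start up. For example, if an application normally takes 30 seconds to start, initialDelaySeconds should be greater than 30.
periodSeconds: 5 means a Liveness probe is run every 5 seconds. If 3 consecutive Liveness probes fail (3 is the default failureThreshold), Kubernetes kills and restarts the container.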
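The exec probe described above depends only on the exit status of cat. The following is a minimal local sketch of that check, not the kubelet's actual implementation; the probe function and the scratch directory are hypothetical names used only for illustration, and the file lives in a temporary directory rather than in a container.

```shell
# probe FILE: mimic how an exec probe is judged, using only cat's exit status.
probe() {
  if cat "$1" >/dev/null 2>&1; then
    echo "success"   # exit status 0: the probe would pass
  else
    echo "failure"   # non-zero exit status: the probe would fail
  fi
}

workdir="$(mktemp -d)"      # hypothetical scratch directory standing in for /tmp
touch "$workdir/healthy"
probe "$workdir/healthy"    # prints "success" while the file exists
rm -f "$workdir/healthy"
probe "$workdir/healthy"    # prints "failure" once the file is gone
rm -rf "$workdir"
```

Any command can be used in place of cat, as long as it exits with zero when the container is healthy and non-zero otherwise.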
Step 2: Create the Deployment:
[root@master-01 k8s]# kubectl apply -f liveness.yaml
deployment.apps/myapp-healthy created
[root@master-01 k8s]# kubectl get pod
NAME                             READY   STATUS    RESTARTS   AGE
myapp-healthy-5cd98cfb54-hw7m9   1/1     Running   0          5s
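As the configuration shows, /tmp/healthy exists for the first 30 seconds, so cat returns 0 and the Liveness probe succeeds.
Step 3: During this period, the Events section of kubectl describe pod shows only normal entries.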
[root@master-01 k8s]# kubectl describe pod myapp-healthy-5cd98cfb54-hw7m9
...
Events:
  Type    Reason     Age        From               Message
  ----    ------     ----       ----               -------
  Normal  Scheduled  <unknown>  default-scheduler  Successfully assigned default/myapp-healthy-6654757bc6-lcjj2 to node-02
  Normal  Pulling    15s        kubelet, node-02   Pulling image "harbor-ali.abc.com/k8s_img/myapp:v1"
  Normal  Pulled     15s        kubelet, node-02   Successfully pulled image "harbor-ali.abc.com/k8s_img/myapp:v1"
  Normal  Created    15s        kubelet, node-02   Created container myapp
  Normal  Started    15s        kubelet, node-02   Started container myapp
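Step 4: Check the events again after 33 seconds
After 33 seconds, the events show that /tmp/healthy no longer exists and the Liveness probe has failed. Over the next few tens of seconds, once several consecutive probes have failed, the container is restarted.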
[root@master-01 k8s]# kubectl describe pod myapp-healthy-5cd98cfb54-hw7m9
...
Events:
  Type     Reason     Age        From               Message
  ----     ------     ----       ----               -------
  Normal   Scheduled  <unknown>  default-scheduler  Successfully assigned default/myapp-healthy-6654757bc6-lcjj2 to node-02
  Normal   Pulling    33s        kubelet, node-02   Pulling image "harbor-ali.abc.com/k8s_img/myapp:v1"
  Normal   Pulled     33s        kubelet, node-02   Successfully pulled image "harbor-ali.abc.com/k8s_img/myapp:v1"
  Normal   Created    33s        kubelet, node-02   Created container myapp
  Normal   Started    33s        kubelet, node-02   Started container myapp
  Warning  Unhealthy  0s         kubelet, node-02   Liveness probe failed: cat: can't open '/tmp/healthy': No such file or directory
Step 5: Check the Pod
The RESTARTS column shows that the container has been restarted:
[root@master-01 k8s]# kubectl get pod
NAME                             READY   STATUS    RESTARTS   AGE
myapp-healthy-6654757bc6-lcjj2   1/1     Running   1          80s
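Readiness Probe
Besides the Liveness probe, the Kubernetes Health Check mechanism also includes the Readiness probe.
A Liveness probe tells Kubernetes when to restart a container to achieve self-healing; a Readiness probe tells Kubernetes when a container can be added to the Service load-balancing pool to serve requests.
Step 1: The configuration syntax of the Readiness probe is exactly the same as that of the Liveness probe
This configuration file only replaces liveness with readiness in the previous example (the Deployment is also renamed to myapp-readiness). Let's see how the behavior differs.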
[root@master-01 healthy]# cat readiness.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-readiness
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp
      release: canary
  template:
    metadata:
      labels:
        app: myapp
        release: canary
    spec:
      imagePullSecrets:
      - name: regsecret
      hostAliases:
      - ip: "10.1.1.5"
        hostnames:
        - "harbor-ali.abc.com"
      containers:
      - name: myapp
        image: "harbor-ali.abc.com/k8s_img/myapp:v1"
        imagePullPolicy: Always
        ports:
        - name: http
          containerPort: 80
        args:
        - /bin/sh
        - -c
        - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 400
        readinessProbe:
          exec:
            command:
            - cat
            - /tmp/healthy
          initialDelaySeconds: 10
          periodSeconds: 5
      nodeSelector:
        node-label: test
Step 2: Deploy
[root@master-01 healthy]# kubectl get pod
NAME                               READY   STATUS    RESTARTS   AGE
myapp-readiness-6bdd66f6cd-p2j4g   0/1     Running   0          3s
[root@master-01 healthy]# kubectl get pod myapp-readiness-6bdd66f6cd-5fvw5
NAME                               READY   STATUS    RESTARTS   AGE
myapp-readiness-6bdd66f6cd-5fvw5   1/1     Running   0          15s
[root@master-01 healthy]# kubectl get pod myapp-readiness-6bdd66f6cd-5fvw5
NAME                               READY   STATUS    RESTARTS   AGE
myapp-readiness-6bdd66f6cd-5fvw5   0/1     Running   0          43s
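The READY state of the readiness Pod went through the following changes:
When first created, READY is unavailable.
Within about 15 seconds (at most initialDelaySeconds + periodSeconds), the first Readiness probe runs and succeeds, and READY is set to available.
After 30 seconds, /tmp/healthy is deleted; once 3 consecutive Readiness probes have failed, READY is set to unavailable.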
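The point at which READY flips back to unavailable can be estimated with simple arithmetic. The sketch below is a simplified model, not the kubelet's scheduling logic: it assumes probes fire exactly every periodSeconds, ignores jitter and probe execution time, and takes failureThreshold to be the default of 3. The function name failure_window is hypothetical.

```shell
# failure_window DELETE_AT PERIOD THRESHOLD
# Prints the earliest and latest second (counted from container start) at which
# the THRESHOLD-th consecutive failed probe can occur in this simplified model.
failure_window() (
  delete_at=$1
  period=$2
  threshold=$3
  # Earliest: a probe fires right after the file is deleted.
  earliest=$(( delete_at + (threshold - 1) * period ))
  # Latest: the first failing probe fires a full period after the deletion.
  latest=$(( delete_at + threshold * period ))
  echo "$earliest $latest"
)

failure_window 30 5 3   # values from readiness.yaml; prints "40 45"
```

Under these assumptions the state change falls between roughly 40 and 45 seconds after the container starts, which is roughly consistent with the 0/1 reading at an AGE of 43s in the output above.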
Step 3: The Readiness probe failures can also be seen in the output of kubectl describe pod.
[root@master-01 healthy]# kubectl describe pod myapp-readiness-6bdd66f6cd-5fvw5
...
Events:
  Type     Reason     Age                 From               Message
  ----     ------     ----                ----               -------
  Normal   Scheduled  <unknown>           default-scheduler  Successfully assigned default/myapp-readiness-6bdd66f6cd-5fvw5 to node-02
  Normal   Pulling    2m50s               kubelet, node-02   Pulling image "harbor-ali.abc.com/k8s_img/myapp:v1"
  Normal   Pulled     2m50s               kubelet, node-02   Successfully pulled image "harbor-ali.abc.com/k8s_img/myapp:v1"
  Normal   Created    2m50s               kubelet, node-02   Created container myapp
  Normal   Started    2m50s               kubelet, node-02   Started container myapp
  Warning  Unhealthy  39s (x21 over 43s)  kubelet, node-02   Readiness probe failed: cat: can't open '/tmp/healthy': No such file or directory
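The following compares the Liveness probe and the Readiness probe:
The Liveness probe and the Readiness probe are two Health Check mechanisms. If neither is explicitly configured, Kubernetes applies the same default to both: the container is considered healthy as long as its startup process keeps running, and a process exit with a non-zero return value is treated as a failure.
The two probes are configured in exactly the same way and support the same parameters. The difference lies in what happens after a failure: a failed Liveness probe restarts the container, whereas a failed Readiness probe marks the container as unavailable so that it no longer receives requests forwarded by the Service.
The Liveness probe and the Readiness probe run independently and do not depend on each other, so they can be used separately or together. Use the Liveness probe to decide whether a container needs to be restarted for self-healing; use the Readiness probe to decide whether a container is ready to serve requests.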
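Since the two probes are independent, both can be set on the same container. The sketch below writes a hypothetical container fragment to a temporary file to show what that might look like; it reuses the image and probe values from the earlier examples, and the function name write_probe_fragment is an assumption made for illustration. Nothing is applied to a cluster.

```shell
# write_probe_fragment FILE: write a container fragment with both probes to FILE.
write_probe_fragment() {
  cat > "$1" <<'EOF'
containers:
- name: myapp
  image: "harbor-ali.abc.com/k8s_img/myapp:v1"
  livenessProbe:            # a failure restarts the container
    exec:
      command:
      - cat
      - /tmp/healthy
    initialDelaySeconds: 10
    periodSeconds: 5
  readinessProbe:           # a failure marks the container as not ready
    exec:
      command:
      - cat
      - /tmp/healthy
    initialDelaySeconds: 10
    periodSeconds: 5
EOF
}

fragment="$(mktemp)"
write_probe_fragment "$fragment"
grep -c 'Probe:' "$fragment"   # prints 2: one liveness and one readiness section
```

In practice the two probes usually check different things, for example a cheap process check for liveness and a dependency check for readiness; the identical commands here only keep the sketch close to the examples above.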