1. Download the metrics-server source code
$ git clone https://github.com/kubernetes-incubator/metrics-server.git
2. Check which images the deployment depends on
$ cd metrics-server/deploy/1.8+
$ grep 'image:' *
metrics-server-deployment.yaml: image: k8s.gcr.io/metrics-server-amd64:v0.3.3
If the gcr.io image cannot be reached, replace the image in metrics-server-deployment.yaml with: registry.cn-hangzhou.aliyuncs.com/kubernets-imags/metrics-server-amd64:v0.3.3
$ sed -i "s/image: .*/image: registry.cn-hangzhou.aliyuncs.com\/kubernets-imags\/metrics-server-amd64:v0.3.3/g" metrics-server-deployment.yaml
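Before editing the real manifest, the substitution can be previewed on a throwaway file. The following is only a minimal sketch: the sample file contents and the variable names (tmpfile, mirror, result) are assumptions made for illustration, and sed is run without -i so nothing is modified in place.

```shell
# Create a throwaway file that mimics the image line of the manifest (hypothetical sample).
tmpfile=$(mktemp)
printf '      - name: metrics-server\n        image: k8s.gcr.io/metrics-server-amd64:v0.3.3\n' > "$tmpfile"

# Same substitution as above, but with '|' as the delimiter so the slashes need no escaping.
mirror='registry.cn-hangzhou.aliyuncs.com/kubernets-imags/metrics-server-amd64:v0.3.3'

# Preview only: sed writes to stdout, the file itself is left untouched.
result=$(sed "s|image: .*|image: ${mirror}|" "$tmpfile" | grep 'image:')
echo "$result"

rm -f "$tmpfile"
```

The pattern starts at "image:", so the indentation in front of it is kept and the YAML structure is not disturbed.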
3. Install metrics-server
$ cd ../..   # back to the repository root (metrics-server), since step 2 changed into deploy/1.8+
$ kubectl create -f deploy/1.8+/
After a short while the metrics-server pod should be running:
$ kubectl -n kube-system get pods -l k8s-app=metrics-server
NAME                              READY   STATUS    RESTARTS   AGE
metrics-server-54957b58f4-dnntx   1/1     Running   0          21s
4. Verify that the installation succeeded
$ kubectl top node
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
The output above shows that metrics-server is not working yet, even though its pod is in the Running state. Check the metrics-server logs:
$ kubectl logs metrics-server-54957b58f4-dnntx -n kube-system
E1005 11:58:15.654250 1 manager.go:111] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:mesos-test2: unable to fetch metrics from Kubelet mesos-test2 (mesos-test2): Get https://mesos-test2:10250/stats/summary/: dial tcp: lookup mesos-test2 on 10.96.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:k8s-slave20: unable to fetch metrics from Kubelet k8s-slave20 (k8s-slave20): Get https://k8s-slave20:10250/stats/summary/: dial tcp: lookup k8s-slave20 on 10.96.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:mesos-test1: unable to fetch metrics from Kubelet mesos-test1 (mesos-test1): Get https://mesos-test1:10250/stats/summary/: dial tcp: lookup mesos-test1 on 10.96.0.10:53: no such host]
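The log shows that metrics-server uses each node's hostname when it fetches data from the kubelet on port 10250. This cluster is a standalone Kubernetes demo environment: the node names are only defined in the /etc/hosts file on the nodes themselves, and there is no internal DNS server. metrics-server resolves names through the cluster DNS (10.96.0.10 in the log), which does not know these hostnames, so every lookup fails with "no such host".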
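To see at a glance which node names fail to resolve, the hostnames can be pulled out of the log text. This is a small sketch, not part of metrics-server: the helper name unresolved_hosts is hypothetical, and the sample string is a shortened copy of the log line above.

```shell
# Print the unique hostnames that failed DNS lookup, one per line.
# Reads metrics-server log text on stdin.
unresolved_hosts() {
  grep -o 'lookup [^ ]* on' | awk '{print $2}' | sort -u
}

# Shortened sample based on the log line above (one host repeated on purpose).
sample='dial tcp: lookup mesos-test2 on 10.96.0.10:53: no such host, dial tcp: lookup k8s-slave20 on 10.96.0.10:53: no such host, dial tcp: lookup mesos-test2 on 10.96.0.10:53: no such host'
hosts=$(printf '%s\n' "$sample" | unresolved_hosts)
echo "$hosts"

# Against a live cluster (pod name taken from the output above):
# kubectl logs metrics-server-54957b58f4-dnntx -n kube-system 2>&1 | unresolved_hosts
```

The names printed here can be compared with the INTERNAL-IP column of kubectl get nodes -o wide to confirm that each node has an internal IP that metrics-server could use instead.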
Solution:
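- Delete metrics-server. Deleting only the pod is not enough, because the Deployment recreates it immediately, so remove the resources created from the manifests (run from the repository root):
$ kubectl delete -f deploy/1.8+/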
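- Edit metrics-server-deployment.yaml and add the command section shown below to the metrics-server container, then redeploy metrics-server. --kubelet-preferred-address-types=InternalIP makes metrics-server reach each kubelet through the node's internal IP instead of its hostname; --kubelet-insecure-tls skips verification of the kubelet's serving certificate, which is acceptable only in a test or demo environment.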
        imagePullPolicy: Always
        command:
        - /metrics-server
        - --kubelet-preferred-address-types=InternalIP
        - --kubelet-insecure-tls
        volumeMounts:
        - name: tmp-dir
          mountPath: /tmp
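After redeploying, kubectl top node may keep returning an error for a minute or so until metrics-server has completed its first scrape. A retry loop avoids checking by hand. The sketch below is one possible approach: the helper name retry and the attempt and delay values are assumptions, not something prescribed by metrics-server.

```shell
# Run a command repeatedly until it succeeds or the attempts are used up.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
  retry_attempts=$1
  retry_delay=$2
  shift 2
  retry_i=1
  while [ "$retry_i" -le "$retry_attempts" ]; do
    if "$@"; then
      return 0            # command succeeded
    fi
    retry_i=$((retry_i + 1))
    sleep "$retry_delay"  # wait before the next attempt
  done
  return 1                # every attempt failed
}

# Against a live cluster, from the repository root:
# kubectl apply -f deploy/1.8+/
# retry 12 10 kubectl top node
```

If the command still fails after all attempts, the metrics-server logs are the next place to look, as in step 4.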