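Enabling the grafana-kubernetes-app plugin in Grafana
Grafana has a plugin dedicated to monitoring Kubernetes clusters: grafana-kubernetes-app.
Installing the plugin
- The plugin can be installed when Grafana is deployed, by listing it in the GF_INSTALL_PLUGINS environment variable: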
    apiVersion: apps/v1   # extensions/v1beta1 Deployments are no longer served since Kubernetes 1.16
    kind: Deployment
    metadata:
      name: monitoring-grafana
      namespace: janny
    spec:
      replicas: 1
      selector:           # required by apps/v1; must match the pod template labels
        matchLabels:
          k8s-app: grafana
      template:
        metadata:
          labels:
            task: monitoring
            k8s-app: grafana
        spec:
          containers:
          - name: grafana
            image: registry.cn-shanghai.aliyuncs.com/grafana_cluster/grafana:latest
            ports:
            - containerPort: 3000
              protocol: TCP
            env:
            - name: INFLUXDB_HOST
              value: monitoring-influxdb
            # Plugins listed here are installed when the Grafana container starts
            - name: GF_INSTALL_PLUGINS
              value: grafana-kubernetes-app
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: monitoring-grafana
      namespace: janny    # must be the Deployment's namespace, otherwise the selector finds no pods
    spec:
      ports:
      - port: 80
        targetPort: 3000
      type: LoadBalancer
      selector:
        k8s-app: grafana
- Alternatively, run the install command inside the running Grafana pod:
    kubectl get pods -n <namespace>
    kubectl exec -it <pod name> -n <namespace> -- /bin/bash
    grafana-cli plugins install grafana-kubernetes-app
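Once installed, the plugin must be enabled and configured in Grafana before it takes effect.
On the Grafana page, click Plugins.
Click Kubernetes, then Enable.
Configure the cluster access URL and the access certificates: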
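For the access certificates, use the certificate information from the cluster's config file (kubeconfig).
The properties certificate-authority-data, client-certificate-data and client-key-data correspond to the CA certificate, the client certificate and the client private key. The values in the config file are base64-encoded, so decode them before pasting them into the form (any base64 decoder will do, for example base64 -d on the command line), then save.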
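The decoding step can also be scripted rather than done with an online decoder. The following is a minimal sketch, assuming the kubeconfig stores the three values inline, each on a single line. `decode_kubeconfig_fields` is a hypothetical helper name, and the regex lookup is a shortcut rather than a real YAML parser: it only picks up the first occurrence of each field, so it suits a kubeconfig with one cluster and one user.

```python
import base64
import re

# The three kubeconfig fields that hold base64-encoded credentials.
FIELDS = (
    "certificate-authority-data",
    "client-certificate-data",
    "client-key-data",
)


def decode_kubeconfig_fields(kubeconfig_text):
    """Return {field: decoded text} for each field found in the kubeconfig text."""
    decoded = {}
    for field in FIELDS:
        # Match "<field>: <base64>" on one line; only the first occurrence is used.
        pattern = r"^\s*" + re.escape(field) + r":\s*(\S+)\s*$"
        match = re.search(pattern, kubeconfig_text, re.MULTILINE)
        if match:
            decoded[field] = base64.b64decode(match.group(1)).decode("utf-8")
    return decoded
```

Calling it on the contents of the kubeconfig (commonly ~/.kube/config, which is an assumption about your setup) returns the PEM text for each field, ready to paste into the plugin's configuration form. The client key is a secret, so avoid writing the output anywhere shared.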
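The dashboards shown in the figure below then appear automatically in Grafana's dashboard list.
As the figure shows, none of the dashboards has any data yet.
Edit each panel and adjust the metrics it monitors.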
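Configuring DingTalk alerts in Prometheus Alertmanager
Prometheus Alertmanager can send alerts to DingTalk, but only through a webhook receiver, so prometheus-webhook-dingtalk has to be deployed first: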
    apiVersion: v1
    kind: Service
    metadata:
      name: prometheus-webhook-dingtalk
      namespace: kube-ops
      labels:
        app: prometheus-webhook-dingtalk
    spec:
      type: ClusterIP
      selector:
        app: prometheus-webhook-dingtalk
      ports:
      - name: http
        port: 5358
        targetPort: 5358
    ---
    apiVersion: apps/v1   # extensions/v1beta1 Deployments are no longer served since Kubernetes 1.16
    kind: Deployment
    metadata:
      name: prometheus-webhook-dingtalk
      namespace: kube-ops
    spec:
      selector:           # required by apps/v1; must match the pod template labels
        matchLabels:
          app: prometheus-webhook-dingtalk
      template:
        metadata:
          labels:
            app: prometheus-webhook-dingtalk
        spec:
          containers:
          - name: prometheus-webhook-dingtalk
            image: timonwong/prometheus-webhook-dingtalk:latest
            args:
            - '--web.listen-address=:5358'
            # Format is <profile name>=<robot URL>; the profile name ("webhook") becomes
            # part of the request path. Replace the access_token with your own robot's token.
            - '--ding.profile=webhook=https://oapi.dingtalk.com/robot/send?access_token=c83f3a98adf27544d6c1b01cbf30674cbb18c5de63784d62ccd3a42c2c06bb2c'
            - '--ding.timeout=5s'
            - '--log.level=info'
            ports:
            - containerPort: 5358
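Once the Deployment is running, the service can be smoke-tested by hand before Alertmanager is wired up. The following is a rough sketch, assuming it runs from a pod in the same namespace (so the Service name resolves) and that the request path is /dingtalk/webhook/send, where "webhook" is the profile name from --ding.profile. The payload follows the shape of Alertmanager's webhook notification format. The function names, label values, annotation text and the two example.* URLs are placeholders made up for illustration.

```python
import json
import urllib.request
from datetime import datetime, timezone

# Assumption: the Service name resolves from where this runs (same namespace).
WEBHOOK_URL = "http://prometheus-webhook-dingtalk:5358/dingtalk/webhook/send"


def build_test_payload(alertname="TestAlert", status="firing"):
    """Build a payload shaped like an Alertmanager webhook notification."""
    labels = {"alertname": alertname, "team": "node"}  # illustrative labels
    annotations = {"summary": "Test alert sent by hand"}
    now = datetime.now(timezone.utc).isoformat()
    return {
        "version": "4",
        "groupKey": '{}:{alertname="%s"}' % alertname,
        "status": status,
        "receiver": "webhook",
        "groupLabels": {"alertname": alertname},
        "commonLabels": labels,
        "commonAnnotations": annotations,
        "externalURL": "http://alertmanager.example",  # placeholder
        "alerts": [
            {
                "status": status,
                "labels": labels,
                "annotations": annotations,
                "startsAt": now,
                "endsAt": "0001-01-01T00:00:00Z",
                "generatorURL": "http://prometheus.example/graph",  # placeholder
            }
        ],
    }


def send_test_alert(url=WEBHOOK_URL, timeout=5):
    """POST the test payload as JSON and return the HTTP status code."""
    body = json.dumps(build_test_payload()).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Calling `send_test_alert()` should make a message appear in the DingTalk group if the robot token and the group's security settings allow it. If nothing arrives, the logs of the prometheus-webhook-dingtalk pod are the first place to look.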
The Alertmanager configuration (prometheus-alert-configuration) is as follows:
    ---
    kind: ConfigMap
    apiVersion: v1
    metadata:
      name: alertmanager
      namespace: kube-ops
    data:
      config.yml: |-
        global:
          resolve_timeout: 5m
          smtp_smarthost: 'localhost:25'
          smtp_from: 'alertmanager@example.org'
          smtp_auth_username: 'alertmanager'
          smtp_auth_password: 'password'
        route:
          receiver: webhook
          group_wait: 30s
          group_interval: 5m
          repeat_interval: 4h
          group_by: [alertname]
          routes:
          - receiver: webhook
            group_wait: 10s
            match:
              team: node
        receivers:
        - name: webhook
          webhook_configs:
          - url: 'http://prometheus-webhook-dingtalk:5358/dingtalk/webhook/send'
            send_resolved: true
          pagerduty_configs:
          - service_key: 84c023a8c96f4339aa9716dcd213f421
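To make the routing in config.yml concrete: an alert labelled team: node matches the child route and is grouped with a group_wait of 10s, while every other alert falls back to the top-level 30s. Both end up at the webhook receiver. The following is a simplified sketch of that selection logic, not Alertmanager's actual implementation. It handles only equality `match` entries, ignores `match_re`, `continue` and `group_by`, and `resolve_route` is a hypothetical name.

```python
# The route tree from config.yml above, transcribed as a Python dict.
ROUTE = {
    "receiver": "webhook",
    "group_wait": "30s",
    "group_interval": "5m",
    "repeat_interval": "4h",
    "routes": [
        {"receiver": "webhook", "group_wait": "10s", "match": {"team": "node"}},
    ],
}

# Settings a child route inherits from its parent unless it overrides them.
INHERITED_KEYS = ("receiver", "group_wait", "group_interval", "repeat_interval")


def resolve_route(route, labels, inherited=None):
    """Return the effective settings for an alert with the given labels."""
    settings = dict(inherited or {})
    for key in INHERITED_KEYS:
        if key in route:
            settings[key] = route[key]
    # Descend into the first child whose match labels all equal the alert's labels.
    for child in route.get("routes", []):
        match = child.get("match", {})
        if all(labels.get(name) == value for name, value in match.items()):
            return resolve_route(child, labels, settings)
    return settings
```

For example, `resolve_route(ROUTE, {"alertname": "NodeDown", "team": "node"})` yields the 10s group_wait together with the inherited 5m group_interval and 4h repeat_interval, while an alert without the team label keeps the 30s default.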