Prometheus official site (focus on understanding the configuration file):
https://prometheus.io/docs/introduction/overview/
Monitoring K8S with prometheus+grafana (version without alerting):
- Prometheus configuration file: prometheus-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: kube-ops
data:
  prometheus.yml: |
    global:
      scrape_interval: 30s
      scrape_timeout: 30s
    scrape_configs:
    - job_name: 'prometheus'
      static_configs:
      - targets: ['localhost:9090']
    - job_name: 'kubernetes-apiservers'
      kubernetes_sd_configs:
      - role: endpoints
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https
    - job_name: 'kubernetes-nodes'
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics
    - job_name: 'kubernetes-cadvisor'
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
    - job_name: 'kubernetes-node-exporter'
      scheme: http
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - source_labels: [__meta_kubernetes_role]
        action: replace
        target_label: kubernetes_role
      - source_labels: [__address__]
        regex: '(.*):10250'
        replacement: '${1}:31672'
        target_label: __address__
- Prometheus v2.0.0 is used here, so a ServiceAccount has to be created; with version 1.0, service_Account.yaml is not needed. Reference: https://www.kubernetes.org.cn/4062.html
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: kube-ops
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: prometheus
  namespace: kube-ops
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: prometheus
  namespace: kube-ops
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: kube-ops
- node-exporter.yaml
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: kube-ops
  labels:
    k8s-app: node-exporter
spec:
  template:
    metadata:
      labels:
        k8s-app: node-exporter
    spec:
      containers:
      - image: prom/node-exporter
        name: node-exporter
        ports:
        - containerPort: 9100
          protocol: TCP
          name: http
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: node-exporter
  name: node-exporter
  namespace: kube-ops
spec:
  ports:
  - name: http
    port: 9100
    nodePort: 31672
    protocol: TCP
  type: NodePort
  selector:
    k8s-app: node-exporter
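- prometheus_deploy.yaml
Prometheus can be made reachable from outside the cluster in several ways:
- Use a LoadBalancer Service, which allocates an externally reachable IP. The drawback is that once the Service is deleted and recreated, the IP changes, so the address used to reach Prometheus changes with it. A LoadBalancer Service fronts the Deployment's pods directly.
- Access by domain name, which requires exposing the service through an Ingress. An Ingress exposes a Service.
- Several Ingress classes are available:
  - kubernetes.io/ingress.class: "nginx" (introduction: http://blog.51cto.com/newfly/2060587)
  - kubernetes.io/ingress.class: "traefik" (introduction: https://www.jianshu.com/p/121b58782865)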
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    k8s-app: prometheus
  name: prometheus
  namespace: kube-ops
spec:
  replicas: 1
  template:
    metadata:
      labels:
        k8s-app: prometheus
    spec:
      serviceAccountName: prometheus
      containers:
      - image: prom/prometheus:v2.0.0
        name: prometheus
        command:
        - "/bin/prometheus"
        args:
        - "--config.file=/etc/prometheus/prometheus.yml"
        - "--storage.tsdb.path=/prometheus"
        - "--storage.tsdb.retention=15d"
        ports:
        - containerPort: 9090
          protocol: TCP
          name: http
        volumeMounts:
        - mountPath: "/prometheus"
          name: data
          subPath: prometheus/data
        - mountPath: "/etc/prometheus"
          name: config-volume
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
          limits:
            cpu: 200m
            memory: 1Gi
      volumes:
      - name: data
        emptyDir: {}
      - configMap:
          name: prometheus-config
        name: config-volume
# LoadBalancer approach
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus-srv
  namespace: kube-ops
  labels:
    k8s-app: prometheus
spec:
  ports:
  - port: 80
    targetPort: 9090
  type: LoadBalancer  # deployed to an Alibaba Cloud cluster; a LoadBalancer exposes the service
  selector:
    k8s-app: prometheus
# Ingress approach: the Service itself uses ClusterIP; a Service with no type set defaults to ClusterIP:
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus-srv
  namespace: kube-ops
  labels:
    k8s-app: prometheus
spec:
  ports:
  - port: 80
    targetPort: 9090
  selector:
    k8s-app: prometheus
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: traefik-default-ingress
  namespace: kube-ops
  annotations:
    kubernetes.io/ingress.class: "nginx"
spec:
  rules:
  - host: "*.prometheus.cfd1afbd5543c44c58397f5d17a601026.cn-shanghai.alicontainer.com"
    http:
      paths:
      - backend:
          serviceName: prometheus-srv
          servicePort: 80  # must match the Service port (80), not the container's targetPort (9090)
        path: /
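The same Service can be routed through traefik instead of nginx by changing the class annotation. The following is only a sketch, assuming a traefik ingress controller is already running in the cluster; the resource name and the host `prometheus.example.com` are placeholders, not values from the original setup.

```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: prometheus-traefik-ingress   # hypothetical name
  namespace: kube-ops
  annotations:
    kubernetes.io/ingress.class: "traefik"
spec:
  rules:
  - host: prometheus.example.com     # placeholder host, replace with your own domain
    http:
      paths:
      - path: /
        backend:
          serviceName: prometheus-srv
          servicePort: 80            # the Service port, which forwards to container port 9090
```

Only one of the two Ingress objects should claim a given host; which class to use depends on the controller actually deployed.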
- grafana.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: monitoring-grafana
  namespace: kube-ops
spec:
  replicas: 1
  template:
    metadata:
      labels:
        k8s-app: grafana
        task: monitoring
    spec:
      containers:
      - name: grafana
        image: <your grafana image path>
        ports:
        - containerPort: 3000
          protocol: TCP
        resources:
          limits:
            cpu: 200m
            memory: 256Mi
          requests:
            cpu: 100m
            memory: 100Mi
        env:
        - name: INFLUXDB_HOST
          value: monitoring-influxdb
        - name: GF_INSTALL_PLUGINS
          value: grafana-kubernetes-app, grafana-clock-panel, briangann-gauge-panel, michaeldmoore-annunciator-panel, jdbranham-diagram-panel, grafana-piechart-panel, grafana-worldmap-panel, vonage-status-panel
---
apiVersion: v1
kind: Service
metadata:
  labels:
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: grafana
  name: monitoring-grafana
  namespace: kube-ops
spec:
  ports:
  - port: 80
    targetPort: 3000
  type: LoadBalancer
  selector:
    k8s-app: grafana
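Grafana still needs a data source pointing at Prometheus before any dashboard shows data. It can be added by hand in the UI; alternatively, the sketch below is a data source provisioning file. It assumes the Grafana image is version 5.0 or later (older images have no provisioning) and that the file is mounted under /etc/grafana/provisioning/datasources/, neither of which the setup above specifies. The data source name is hypothetical; the URL uses the prometheus-srv Service in kube-ops defined earlier.

```yaml
apiVersion: 1
datasources:
- name: Prometheus                            # hypothetical display name
  type: prometheus
  access: proxy                               # Grafana's backend queries Prometheus, not the browser
  url: http://prometheus-srv.kube-ops.svc:80  # in-cluster address of the Service (port 80 -> 9090)
  isDefault: true
```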
grafana: dashboard templates can be downloaded from the official site
Reference: https://www.qikqiak.com/post/kubernetes-monitor-prometheus-grafana/