Installing and Using Prometheus on Kubernetes

Prometheus is quite something: the breakout monitoring product of the cloud-native era. Usually paired with Grafana, it is the go-to monitoring solution for internet companies today.

1. Installing Prometheus

There are two main ways to install: raw YAML manifests or the Prometheus Operator. Starting with YAML gives a better feel for the details (the Operator ultimately generates YAML manifests too). A few things need to be considered:

  • Access permissions (RBAC)
  • Configuration files
  • Storage volumes

The RBAC-related configuration:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
  - apiGroups: [""] # "" indicates the core API group
    resources:
      - nodes
      - nodes/proxy
      - services
      - endpoints
      - pods
    verbs:
      - get
      - watch
      - list
  - apiGroups:
      - extensions
    resources:
      - ingresses
    verbs:
      - get
      - watch
      - list
  - nonResourceURLs: ["/metrics"]
    verbs:
      - get
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: smac
  labels:
    app: prometheus
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
subjects:
  - kind: ServiceAccount
    name: prometheus
    namespace: smac
roleRef:
  kind: ClusterRole
  name: prometheus
  apiGroup: rbac.authorization.k8s.io
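Before moving on, the binding can be sanity-checked by impersonating the ServiceAccount (assuming the manifests above have already been applied with kubectl apply):

```
# Should print "yes" if the ClusterRoleBinding took effect
kubectl auth can-i list pods --as=system:serviceaccount:smac:prometheus
# Non-resource URLs can be checked the same way
kubectl auth can-i get /metrics --as=system:serviceaccount:smac:prometheus
```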

The configuration ConfigMaps:

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-rules
  namespace: smac
  labels:
    app: prometheus
data:
  cpu-usage.rule: |
    # content omitted for length
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-conf
  namespace: smac
  labels:
    app: prometheus
data:
  prometheus.yml: |-
    # content omitted for length
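Since the actual file is omitted above, here is a minimal, hypothetical sketch of what the prometheus.yml data could look like (intervals and job names are illustrative, not the author's config):

```
data:
  prometheus.yml: |-
    global:
      scrape_interval: 15s       # how often to scrape targets (illustrative)
      evaluation_interval: 15s   # how often to evaluate rules
    rule_files:
      - /etc/prometheus/rules/*.rule   # matches the rules ConfigMap mount used below
    scrape_configs:
      - job_name: prometheus     # Prometheus scraping itself
        static_configs:
          - targets: ["localhost:9090"]
```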

For the storage volume, a StorageClass is recommended. The official docs advise against NFS, which can lose data in extreme cases. The configuration:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: prometheus-pvc
  namespace: smac
  labels:
    app: prometheus
  annotations:
    volume.beta.kubernetes.io/storage-class: "local"
  finalizers:
    - kubernetes.io/pvc-protection
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
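For reference, a minimal local StorageClass that the "local" annotation above could point at might look like this (a sketch; a static setup also needs a matching local PersistentVolume created by hand):

```
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local
provisioner: kubernetes.io/no-provisioner  # static provisioning: no dynamic PV creation
volumeBindingMode: WaitForFirstConsumer    # delay binding until the pod is scheduled
```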

After that comes the usual Deployment and Service configuration:

kind: Deployment
apiVersion: apps/v1
metadata:
  labels:
    app: prometheus
  name: prometheus
  namespace: smac
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      serviceAccountName: prometheus
      securityContext:
        runAsUser: 0
      containers:
        - name: prometheus
          image: prom/prometheus:v2.29.1
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - mountPath: /prometheus
              name: prometheus-data-volume
            - name: prometheus-conf-volume  # note: do not use subPath here, or ConfigMap hot-reload will break
              mountPath: /etc/prometheus
            - name: prometheus-rules-volume
              mountPath: /etc/prometheus/rules
          ports:
            - containerPort: 9090
              protocol: TCP
      volumes:
        - name: prometheus-data-volume
          persistentVolumeClaim:
            claimName: prometheus-pvc  # must match the PVC name defined above
        - name: prometheus-conf-volume
          configMap:
            name: prometheus-conf
        - name: prometheus-rules-volume
          configMap:
            name: prometheus-rules
---
#service
kind: Service
apiVersion: v1
metadata:
  annotations:
    prometheus.io/scrape: 'true'
  labels:
    app: prometheus
  name: prometheus-service
  namespace: smac
spec:
  ports:
    - port: 9090
      targetPort: 9090
  selector:
    app: prometheus
  type: NodePort
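Once applied, the randomly assigned NodePort can be looked up and the UI reached from outside the cluster; a sketch (service name and namespace as defined above):

```
# Find the assigned NodePort
kubectl get svc prometheus-service -n smac
# Then open http://<any-node-ip>:<node-port> in a browser,
# or probe the health endpoint:
curl http://<node-ip>:<node-port>/-/healthy
```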

2. Hot-Reloading the Configuration

Next, let's add a job to Prometheus. Edit prometheus.yml in the ConfigMap and append the following:

scrape_configs:
  ...
  - job_name: "demo-service"
    metrics_path: "/actuator/prometheus"
    static_configs:
      - targets: ["10.233.97.135:8080"]

Hmm, it doesn't take effect? Does Prometheus need a restart? Is there a hot-reload mechanism?
Some searching leads to the following conclusion:

Prometheus supports hot reloading: start it with the --web.enable-lifecycle flag, then trigger a reload with curl -X POST http://localhost:9090/-/reload.
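From outside the container, the reload can be driven with kubectl; validating the config first avoids loading a broken file (promtool ships inside the prom/prometheus image, so the exec below should work, though the exact paths match this article's setup):

```
# Validate the config inside the running container
kubectl exec -n smac deploy/prometheus -- \
  promtool check config /etc/prometheus/prometheus.yml

# Trigger the reload (requires --web.enable-lifecycle)
kubectl port-forward -n smac svc/prometheus-service 9090:9090 &
curl -X POST http://localhost:9090/-/reload
```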

So adjust the configuration as follows:

      containers:
        - name: prometheus
          args:
            - '--config.file=/etc/prometheus/prometheus.yml'
            - '--web.enable-lifecycle'

After redeploying with this change, editing the ConfigMap and running the command above does the trick. Still, reloading by hand is tedious. Can it be automated? Another round of searching turned up a gem, configmap-reload, so let's wire it in:

      containers:
        - name: prometheus
          image: prom/prometheus:v2.29.1
          args:
            - '--config.file=/etc/prometheus/prometheus.yml'
            - '--web.enable-lifecycle'
          ...
        - name: prometheus-configmap-reloader
          image: 'jimmidyson/configmap-reload:v0.3.0'
          args:
            - '--webhook-url=http://localhost:9090/-/reload'
            - '--volume-dir=/etc/prometheus'   # volume-dir must match the volumeMounts definition exactly
          volumeMounts:
            # note: do not use subPath here, or ConfigMap hot-reload will break
            - name: prometheus-conf-volume
              mountPath: /etc/prometheus

After this change, the Pod gains an extra container, prometheus-configmap-reloader. Now edit the ConfigMap again, wait a short while (about 10 seconds; don't ask why, you know how it is), and the new config takes effect: 'demo-service' shows up under Targets.

Note: a ConfigMap mounted with subPath will not update automatically.

3. Service Discovery for Jobs

The job above lists a hard-coded ip:port in targets, which is clearly impractical in Kubernetes; targets must be discovered dynamically. Fortunately Prometheus already supports this: it polls the apiserver for pod information. Through kubernetes_sd_configs, service discovery is available for all kinds of resources; below is an example for pods:

      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
        - role: pod
        relabel_configs:
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
          action: keep
          regex: true
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
          action: replace
          regex: ([^:]+)(?::\d+)?;(\d+)
          replacement: $1:$2
          target_label: __address__
        - action: labelmap
          regex: __meta_kubernetes_pod_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          action: replace
          target_label: kubernetes_namespace
        - source_labels: [__meta_kubernetes_pod_name]
          action: replace
          target_label: kubernetes_pod_name

This is the official example; for the full list of labels and what they mean, see the official docs: Prometheus#Configuration#kubernetes_sd_config
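The official job above keys off pod annotations; a pod opts in with metadata along these lines (values are illustrative):

```
apiVersion: v1
kind: Pod
metadata:
  name: example-app                           # illustrative name
  annotations:
    prometheus.io/scrape: "true"              # matched by the keep rule
    prometheus.io/path: "/actuator/prometheus" # rewrites __metrics_path__
    prometheus.io/port: "8080"                # rewrites the port in __address__
```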

My real-world configuration:

scrape_configs:
  - job_name: "demo-service"
    metrics_path: "/actuator/prometheus"
    static_configs:
      - targets: ["demo-service:8080"]

    kubernetes_sd_configs:
    - role: pod

    relabel_configs:
    - source_labels: [__meta_kubernetes_pod_label_app]
      regex: demo-service
      action: keep
    - source_labels: [__meta_kubernetes_pod_label_app]
      action: replace
      target_label: application
    - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
      action: replace
      regex: ([^:]+)(?::\d+)?;(\d+)
      replacement: $1:$2
      target_label: __address__

The effect of this configuration: among all pods, keep those with the label app=demo-service, rewrite __address__ with each pod's address and port (replacing the static targets above), and add an application=demo-service label.
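For this to work, the demo-service pods must carry the matching label and port annotation; the relevant fragment of their Deployment's pod template might look like this (a sketch, not the actual demo-service manifest):

```
  template:
    metadata:
      labels:
        app: demo-service            # matched by the keep rule
      annotations:
        prometheus.io/port: "8080"   # used to rewrite __address__
```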

That covers everyday use. Beyond this lie high availability (HA), third-party storage, hands-on PromQL, and other advanced topics. Stay tuned!

