Prometheus+Grafana监控系统

说明

Prometheus负责收集数据,Grafana负责展示数据。其中采用Prometheus 中的 Exporter含:

  1. Node Exporter,负责收集 host 硬件和操作系统数据。它将以容器方式运行在所有 host 上
  2. 2cAdvisor,负责收集容器数据。它将以容器方式运行在所有 host 上
  3. Alertmanager,负责告警。它将以容器方式运行在所有 host 上

docker安装

环境准备

使用清华镜像

curl -fsSL https://mirrors.tuna.tsinghua.edu.cn/docker-ce/linux/debian/gpg | sudo apt-key add -

配置docker apt

echo 'deb https://mirrors.tuna.tsinghua.edu.cn/docker-ce/linux/debian/ buster stable' | sudo tee /etc/apt/sources.list.d/docker.list

更新apt

apt-get update
安装docker

如果已经安装过的需要先卸载就版本,没有安装过输入命令显示无效操作

apt-get docker docker-engine docker.io

进行docker安装

apt-get install docker-ce

查看docker版本

docker -v

查看docker运行状态

systemctl status docker 或者 service docker status

安装docker-compose

apt install docker-compose

添加配置文件

mkdir -p /usr/local/src/config
cd /usr/local/src/config
添加prometheus.yml配置文件
global:
  scrape_interval:     15s
  evaluation_interval: 15s

alerting:
  alertmanagers:
  - static_configs:
    - targets: ['192.168.127.130:9093']

rule_files:
  - "node_down.yml"

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
    - targets: ['192.168.127.130:9090']

  - job_name: 'cadvisor'
    static_configs:
    - targets: ['192.168.127.130:8080']

  - job_name: 'node'
    scrape_interval: 8s
    static_configs:
      - targets: ['192.168.127.130:9100']
添加邮件告警配置文件
global:
  smtp_smarthost: 'smtp.qq.com:25'
  smtp_from: ''
  smtp_auth_username: ''
  smtp_auth_password: ''
  smtp_require_tls: false

route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 10m
  receiver: live-monitoring

receivers:
- name: 'live-monitoring'
  email_configs:
  - to: ''
添加报警规则
groups:
- name: node_down
  rules:
  - alert: InstanceDown
    expr: up == 0
    for: 1m
    labels:
      user: test
    annotations:
      summary: "Instance {{ $labels.instance }} down"
      description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minutes."
编写docker-compose
version: '2'

networks:
    monitor:
        driver: bridge

services:
    prometheus:
        image: prom/prometheus:latest
        container_name: prometheus
        hostname: prometheus
        restart: always
        volumes:
            - /usr/local/src/config/prometheus.yml:/etc/prometheus/prometheus.yml
            - /usr/local/src/config/node_down.yml:/etc/prometheus/node_down.yml
        ports:
            - "9090:9090"
        networks:
            - monitor

    alertmanager:
        image: quay.io/prometheus/alertmanager:v0.12.0
        container_name: alertmanager
        hostname: alertmanager
        restart: always
        volumes:
            - /usr/local/src/config/alertmanager.yml:/etc/alertmanager/alertmanager.yml
        ports:
            - "9093:9093"
        networks:
            - monitor

    grafana:
        image: grafana/grafana:6.3.5
        container_name: grafana
        hostname: grafana
        restart: always
        ports:
            - "3000:3000"
        networks:
            - monitor

    node-exporter:
        image: quay.io/prometheus/node-exporter
        container_name: node-exporter
        hostname: node-exporter
        restart: always
        ports:
            - "9100:9100"
        networks:
            - monitor

    cadvisor:
        image: google/cadvisor:latest
        container_name: cadvisor
        hostname: cadvisor
        restart: always
        volumes:
            - /:/rootfs:ro
            - /var/run:/var/run:rw
            - /sys:/sys:ro
            - /var/lib/docker/:/var/lib/docker:ro
        ports:
            - "8080:8080"
        networks:
            - monitor

启动docker-compose

启动容器:
docker-compose -f /usr/local/src/config/docker-compose-monitor.yml up -d
删除容器:
docker-compose -f /usr/local/src/config/docker-compose-monitor.yml down
重启容器:
docker restart id
image.png

9100:收集到数据,有了它可以做数据展示
9093:配置文件的数据
9090:prometheus的web界面
8080:用于收集,聚合,处理和导出有关正在运行的容器的信息
3000:grafana的web界面

附录:单独命令启动各容器

#启动prometheus
docker run -d -p 9090:9090 --name=prometheus \
-v /usr/local/src/config/prometheus.yml:/etc/prometheus/prometheus.yml \
-v /usr/local/src/config/node_down.yml:/etc/prometheus/node_down.yml \
prom/prometheus

# 启动grafana
docker run -d -p 3000:3000 --name=grafana grafana/grafana

#启动alertmanager容器
docker run -d -p 9093:9093 -v /usr/local/src/config/config.yml:/etc/alertmanager/config.yml --name alertmanager prom/alertmanager

#启动node exporter
docker run -d \
  -p 9100:9100 \
  -v "/:/host:ro,rslave" \
  --name=node_exporter \
  quay.io/prometheus/node-exporter \
  --path.rootfs /host

#启动cadvisor
docker run                                    \
--volume=/:/rootfs:ro                         \
--volume=/var/run:/var/run:rw                 \
--volume=/sys:/sys:ro                         \
--volume=/var/lib/docker/:/var/lib/docker:ro  \
--publish=8080:8080                           \
--detach=true                                 \
--name=cadvisor                               \
google/cadvisor:latest
©著作权归作者所有,转载或内容合作请联系作者
平台声明:文章内容(如有图片或视频亦包括在内)由作者上传并发布,文章内容仅代表作者本人观点,简书系信息发布平台,仅提供信息存储服务。

推荐阅读更多精彩内容