Monitoring with Prometheus and Grafana
Install node-exporter and cAdvisor
docker stop cadvisor
docker rm cadvisor
sudo docker run \
--volume=/:/rootfs:ro \
--volume=/var/run:/var/run:rw \
--volume=/sys:/sys:ro \
--volume=/var/lib/docker/:/var/lib/docker:ro \
--publish=8081:8080 \
--detach=true \
--name=cadvisor \
docker.io/google/cadvisor
docker stop nodeexporter
docker rm nodeexporter
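# --net=host already exposes port 9100 on the host, so -p is dropped (Docker ignores it in host mode)
docker run -d \
--name=nodeexporter \
-v "/proc:/host/proc:ro" \
-v "/sys:/host/sys:ro" \
-v "/:/rootfs:ro" \
--net="host" \
quay.io/prometheus/node-exporter \
--path.procfs=/host/proc \
--path.sysfs=/host/sys \
--path.rootfs=/rootfs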
docker stop prometheus
docker rm prometheus
docker run -itd -p 9090:9090 \
-v /opt/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml \
-v /opt/prometheus/prometheus-data:/prometheus \
-v /etc/localtime:/etc/localtime:ro \
--privileged=true \
--restart always \
--name=prometheus \
docker.io/prom/prometheus
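Once the three containers are running, a quick probe of each endpoint confirms they are reachable before Prometheus starts scraping them. The following is a minimal sketch, not part of the original setup: the function names and the MONITOR_HOST variable are hypothetical, and the ports are taken from the docker run commands above (8081 for cAdvisor, 9100 for node-exporter, 9090 for Prometheus).

```shell
# Sketch: probe the three services started above.
# MONITOR_HOST is an assumption; set it to the address of the Docker host.
MONITOR_HOST="${MONITOR_HOST:-localhost}"

# Print the URL to probe for a given service name.
probe_url() {
  case "$1" in
    cadvisor)     echo "http://${MONITOR_HOST}:8081/metrics" ;;  # published as 8081:8080
    nodeexporter) echo "http://${MONITOR_HOST}:9100/metrics" ;;
    prometheus)   echo "http://${MONITOR_HOST}:9090/-/ready" ;;  # Prometheus readiness endpoint
    *)            return 1 ;;
  esac
}

# Probe every service and report OK/FAIL; returns non-zero if any probe fails.
check_all() {
  rc=0
  for svc in cadvisor nodeexporter prometheus; do
    url="$(probe_url "$svc")"
    if curl -fsS --max-time 5 -o /dev/null "$url"; then
      echo "OK   $svc $url"
    else
      echo "FAIL $svc $url"
      rc=1
    fi
  done
  return "$rc"
}

# Usage: MONITOR_HOST=192.168.1.54 check_all
```

The functions only define the checks; nothing is contacted until check_all is called.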
Configuration (/opt/prometheus/prometheus.yml)
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['192.168.1.54:9090']
  - job_name: 'mysql'
    scrape_interval: 5s
    static_configs:
      - targets: ['192.168.1.54:9104']
        labels:
          instance: cargts_ali
  - job_name: 'nodeexporter'
    scrape_interval: 5s
    static_configs:
      - targets: ['192.168.1.54:9100']
        labels:
          instance: node
  - job_name: 'docker-monitor'
    scrape_interval: 5s
    static_configs:
      - targets: ['192.168.1.54:8081']
        labels:
          group: 'prod'
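Because indentation errors in prometheus.yml only surface when Prometheus starts, it can help to validate the file before reloading. The sketch below assumes the promtool binary shipped in the docker.io/prom/prometheus image and the host path /opt/prometheus/prometheus.yml used in the docker run command. The function names and the DOCKER/CONFIG variables are hypothetical helpers; setting DOCKER=echo prints the command instead of running it.

```shell
# Sketch: validate prometheus.yml with promtool from the same image, then reload.
# DOCKER can be overridden (e.g. DOCKER=echo) to print the command instead of running it.
DOCKER="${DOCKER:-docker}"
CONFIG="${CONFIG:-/opt/prometheus/prometheus.yml}"

# Run "promtool check config" in a throwaway container against the host file.
check_config() {
  "$DOCKER" run --rm \
    -v "${CONFIG}:/etc/prometheus/prometheus.yml:ro" \
    --entrypoint promtool \
    docker.io/prom/prometheus \
    check config /etc/prometheus/prometheus.yml
}

# Ask the running container to re-read its configuration (Prometheus reloads on SIGHUP).
reload_config() {
  "$DOCKER" kill --signal=HUP prometheus
}

# Usage: check_config && reload_config
```

If the reload does not pick up an edited file, `docker restart prometheus` is the blunter alternative.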
Kafka monitoring
https://www.robustperception.io/monitoring-kafka-with-prometheus/
Troubleshooting
1. cAdvisor fails to map the cgroup path
sudo mount -o remount,rw '/sys/fs/cgroup'
sudo ln -s /sys/fs/cgroup/cpu,cpuacct /sys/fs/cgroup/cpuacct,cpu
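The two commands above fail if run a second time, because the symlink already exists. A guarded version is sketched below. The function names and the CGROUP_ROOT variable are hypothetical; the default path is the one used above, and the workaround is applied only when cpu,cpuacct exists and the cpuacct,cpu alias is missing.

```shell
# Sketch: apply the symlink workaround only when it is actually needed.
# CGROUP_ROOT defaults to the path used above; overriding it is only useful for dry runs.
CGROUP_ROOT="${CGROUP_ROOT:-/sys/fs/cgroup}"

# Succeeds when cpu,cpuacct exists but the cpuacct,cpu alias is missing.
needs_cgroup_link() {
  [ -e "${CGROUP_ROOT}/cpu,cpuacct" ] && [ ! -e "${CGROUP_ROOT}/cpuacct,cpu" ]
}

# Remount read-write and create the alias, mirroring the two commands above.
fix_cgroup_link() {
  if needs_cgroup_link; then
    sudo mount -o remount,rw "${CGROUP_ROOT}" &&
      sudo ln -s "${CGROUP_ROOT}/cpu,cpuacct" "${CGROUP_ROOT}/cpuacct,cpu"
  else
    echo "no change needed under ${CGROUP_ROOT}"
  fi
}

# Usage: fix_cgroup_link && docker restart cadvisor
```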
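2. Imported dashboard template does not display
When adding the data source in Grafana, note: the name must be spelled Prometheus, with a capital P!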
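One way to avoid the naming mistake is to create the data source through Grafana's HTTP API, so the name is written once in a script. This is only a sketch under assumptions the notes do not state: Grafana is assumed to listen on its default port 3000 with the default admin:admin login, and the function names and variables are hypothetical. The Prometheus URL reuses the address from the scrape configuration.

```shell
# Sketch: create the data source through Grafana's HTTP API with the exact name "Prometheus".
# GRAFANA_URL and GRAFANA_AUTH are assumptions (default port and default admin login).
GRAFANA_URL="${GRAFANA_URL:-http://localhost:3000}"
GRAFANA_AUTH="${GRAFANA_AUTH:-admin:admin}"
PROM_URL="${PROM_URL:-http://192.168.1.54:9090}"

# Print the JSON body; the name is spelled with a capital P on purpose.
datasource_payload() {
  printf '{"name":"Prometheus","type":"prometheus","url":"%s","access":"proxy","isDefault":true}\n' "$PROM_URL"
}

# POST the payload to Grafana's /api/datasources endpoint.
create_datasource() {
  datasource_payload | curl -fsS -u "$GRAFANA_AUTH" \
    -H 'Content-Type: application/json' \
    -X POST --data-binary @- "${GRAFANA_URL}/api/datasources"
}

# Usage: GRAFANA_AUTH=admin:yourpassword create_datasource
```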