blackbox_exporter
常规的各种exporter都是和需要监控的机器一起安装的,如果需要监控一些tcp端口和七层应用层的状态呢,这个时候就需要黑盒监控了,不需要安装在目标机器上即可从外部去监控。
9115是它的http端点的默认监听端口,blackbox.yml它的配置文件里以基础的http、dns、tcp、icmp等prober定制配置出各种监测模块(module),在prometheus server的配置文件里声明用哪个模块去探测哪个targets,下面以docker-compose启动一组实例,docker的网络自带dns,所以里面全部用名字替代ip。
安装
sudo useradd --no-create-home --shell /bin/false blackbox_exporter
sudo useradd --no-create-home --shell /bin/false alertmanager
wget https://github.com/prometheus/blackbox_exporter/releases/download/v0.14.0/blackbox_exporter-0.14.0.linux-amd64.tar.gz
tar xvf blackbox_exporter-0.14.0.linux-amd64.tar.gz
sudo mv ./blackbox_exporter-0.14.0.linux-amd64/blackbox_exporter /usr/local/bin
sudo chown blackbox_exporter:blackbox_exporter /usr/local/bin/blackbox_exporter
sudo mkdir /etc/blackbox_exporter
sudo chown blackbox_exporter:blackbox_exporter /etc/blackbox_exporter
sudo vim /etc/blackbox_exporter/blackbox.yml
modules:
http_2xx: # http 监测模块
prober: http
http:
http_post_2xx: # http post 监测模块
prober: http
http:
method: POST
tcp_connect: # tcp 监测模块
prober: tcp
ping: # icmp 检测模块
prober: icmp
timeout: 5s
icmp:
preferred_ip_protocol: "ip4"
sudo vim /etc/systemd/system/blackbox_exporter.service
[Unit]
Description=Blackbox Exporter
Wants=network-online.target
After=network-online.target
[Service]
User=blackbox_exporter
Group=blackbox_exporter
Type=simple
ExecStart=/usr/local/bin/blackbox_exporter --config.file /etc/blackbox_exporter/blackbox.yml
[Install]
WantedBy=multi-user.target
sudo systemctl daemon-reload
sudo systemctl start blackbox_exporter
sudo systemctl status blackbox_exporter
sudo systemctl enable blackbox_exporter
使用场景
ping检测
在内网可以通过ping (icmp)检测服务器的存活,以前面的最基本的module配置为例,在Prometheus的配置文件中配置使用ping module:
- job_name: 'ping_all'
scrape_interval: 1m
metrics_path: /probe
params:
module: [ping]
static_configs:
- targets:
- 192.168.1.2
labels:
instance: node2
- targets:
- 192.168.1.3
labels:
instance: node3
relabel_configs:
- source_labels: [__address__]
target_label: __param_target
- target_label: __address__
replacement: 127.0.0.1:9115 # black_exporter 这里和Prometheus在一台机器上
curl "http://127.0.0.1:9115/probe?module=ping&target=180.101.49.11"
# HELP probe_dns_lookup_time_seconds Returns the time taken for probe dns lookup in seconds
# TYPE probe_dns_lookup_time_seconds gauge
probe_dns_lookup_time_seconds 4.3875e-05
# HELP probe_duration_seconds Returns how long the probe took to complete in seconds
# TYPE probe_duration_seconds gauge
probe_duration_seconds 0.059794089
# HELP probe_icmp_duration_seconds Duration of icmp request by phase
# TYPE probe_icmp_duration_seconds gauge
probe_icmp_duration_seconds{phase="resolve"} 4.3875e-05
probe_icmp_duration_seconds{phase="rtt"} 0.01433185
probe_icmp_duration_seconds{phase="setup"} 0.045255703
# HELP probe_ip_protocol Specifies whether probe ip protocol is IP4 or IP6
# TYPE probe_ip_protocol gauge
probe_ip_protocol 4
# HELP probe_success Displays whether or not the probe was a success
# TYPE probe_success gaug
http检测
以前面的最基本的module配置为例,在Prometheus的配置文件中配置使用http_2xx module:
- job_name: 'http_get_all'
scrape_interval: 30s
metrics_path: /probe
params:
module: [http_2xx]
static_configs:
- targets:
- https://frognew.com
relabel_configs:
- source_labels: [__address__]
target_label: __param_target
- source_labels: [__param_target]
target_label: instance
- target_label: __address__
replacement: 127.0.0.1:9115
http检测除了可以探测http服务的存活外,还可以根据指标probe_ssl_earliest_cert_expiry进行ssl证书有效期预警。
参考:
https://zhangguanzhang.github.io/2018/12/04/black-box-exporter/
https://blog.frognew.com/2018/02/prometheus-blackbox-exporter.html
https://prometheus.io/download/