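Monitoring application servers and Tomcat with Prometheus + Grafana + node_exporter + jmx_exporter + Alertmanager
- node-exporter: collects metrics from each node
- prometheus: an open-source monitoring system and time-series database; it scrapes and stores the data exposed by the exporters
- grafana: presents the stored data to users as graphs in a web UI
- jmx_exporter: monitors Java programs
- alertmanager: manages alerts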
1. Install node-exporter (deploy it on every application machine to be monitored)
- Download node-exporter
wget https://github.com/prometheus/node_exporter/releases/download/v1.0.0-rc.0/node_exporter-1.0.0-rc.0.linux-amd64.tar.gz
- Extract
tar -zxvf node_exporter-1.0.0-rc.0.linux-amd64.tar.gz
- Start node_exporter, then visit:
http://ip:9100/metrics
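A minimal sketch of starting node_exporter and checking the endpoint, assuming the tarball was extracted in the current directory and the default port 9100 is used. `NODE_EXPORTER_DIR`, `metrics_url`, `start_node_exporter`, and `check_metrics` are illustrative names, not part of node_exporter:

```shell
# Directory created by the tar command above (assumed to be in the current directory).
NODE_EXPORTER_DIR="node_exporter-1.0.0-rc.0.linux-amd64"

# Build the metrics URL for a host and port (hypothetical helper).
metrics_url() {
  printf 'http://%s:%s/metrics\n' "$1" "$2"
}

# Start node_exporter in the background; it listens on :9100 by default.
start_node_exporter() {
  ( cd "$NODE_EXPORTER_DIR" && nohup ./node_exporter > node_exporter.log 2>&1 & )
}

# Succeed if the metrics endpoint answers; requires curl.
check_metrics() {
  curl -fsS "$(metrics_url "$1" "$2")" > /dev/null
}
```

For example, `start_node_exporter` followed by `check_metrics localhost 9100` confirms that the exporter is serving metrics on the local machine.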
2. Download jmx_exporter
https://github.com/prometheus/jmx_exporter
- Download the jar
wget https://repo1.maven.org/maven2/io/prometheus/jmx/jmx_prometheus_javaagent/0.12.0/jmx_prometheus_javaagent-0.12.0.jar
- Download the configuration file
https://github.com/prometheus/jmx_exporter/blob/master/example_configs/tomcat.yml
- Add the agent to JAVA_OPTS in Tomcat's catalina.sh
JAVA_OPTS="-Djava.awt.headless=true -Xmx1028m -XX:+UseConcMarkSweepGC -javaagent:/usr/soft/agent/jmx_prometheus_javaagent-0.12.0.jar=9191:/usr/soft/agent/tomcat.yml"
- Restart Tomcat, then visit:
http://ip:9191/metrics
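The `-javaagent` value has the form `<jar>=<port>:<config file>`. A small sketch that assembles it from its parts, so the port and paths are stated once. `jmx_agent_opt`, `AGENT_DIR`, and `JMX_OPT` are hypothetical names; the paths and port are the ones used in this guide:

```shell
# Assemble the -javaagent option: <jar>=<port>:<config file>.
jmx_agent_opt() {
  printf '%s=%s:%s\n' "-javaagent:$1" "$2" "$3"
}

AGENT_DIR="/usr/soft/agent"
JMX_OPT="$(jmx_agent_opt "$AGENT_DIR/jmx_prometheus_javaagent-0.12.0.jar" 9191 "$AGENT_DIR/tomcat.yml")"

# Append to any existing JAVA_OPTS instead of overwriting it.
JAVA_OPTS="${JAVA_OPTS:+$JAVA_OPTS }$JMX_OPT"
export JAVA_OPTS
```

Tomcat's catalina.sh sources bin/setenv.sh when that file exists, so these lines could live there instead of in catalina.sh itself.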
3. Install Prometheus
- Download Prometheus
wget https://github.com/prometheus/prometheus/releases/download/v2.17.1/prometheus-2.17.1.linux-amd64.tar.gz
- Extract
tar -zxvf prometheus-2.17.1.linux-amd64.tar.gz
- Configure Prometheus
vim prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['localhost:9090']
  - job_name: "node"
    static_configs:
      - targets: ["localhost:9100"]
  - job_name: "jmx"
    static_configs:
      - targets: ["localhost:9191"]
- Check the configuration file:
./promtool check config prometheus.yml
- Start Prometheus
./prometheus --config.file=prometheus.yml &
- Open the web UI:
http://ip:9090
- Check the status of the scrape targets:
http://ip:9090/targets
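The target status shown on the /targets page is also available from the Prometheus HTTP API at /api/v1/targets. A rough sketch that counts targets reported as down. `count_down_targets` and `down_targets` are illustrative helpers that use plain text matching rather than real JSON parsing, so treat this as a quick check only:

```shell
# Count occurrences of "health":"down" in a /api/v1/targets response read from stdin.
count_down_targets() {
  { grep -Eo '"health"[[:space:]]*:[[:space:]]*"down"' || true; } | wc -l | tr -d '[:space:]'
}

# Query a Prometheus server (default http://localhost:9090) and print the count.
down_targets() {
  curl -fsS "${1:-http://localhost:9090}/api/v1/targets" | count_down_targets
}
```

With the three jobs configured above, `down_targets http://ip:9090` would be expected to print 0 once Prometheus, node_exporter, and the Tomcat agent are all reachable.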
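4. Install Grafana
- Download the package
wget https://dl.grafana.com/oss/release/grafana-6.7.2.linux-amd64.tar.gz
- Extract
tar -zxvf grafana-6.7.2.linux-amd64.tar.gz
- Start it (the binary is in the bin directory of the extracted package):
./grafana-server &
- Visit:
http://ip:3000/login
- Configure the Prometheus data source
- Confirm in the browser that the installation succeeded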
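The data source can be added in the web UI or through Grafana's HTTP API. A sketch of the API route, assuming the default admin/admin login has not been changed yet. `datasource_json` and `add_prometheus_datasource` are hypothetical helper names, and the data source name "Prometheus" is an arbitrary choice:

```shell
# Print the JSON body for a Prometheus data source pointing at the given URL.
datasource_json() {
  printf '{"name":"Prometheus","type":"prometheus","access":"proxy","url":"%s","isDefault":true}\n' "$1"
}

# POST the data source to Grafana's /api/datasources endpoint; requires curl.
# $1 = Grafana base URL, $2 = Prometheus URL, $3 = user:password (assumed admin:admin).
add_prometheus_datasource() {
  datasource_json "$2" | curl -fsS -X POST \
    -H 'Content-Type: application/json' \
    -u "${3:-admin:admin}" \
    --data-binary @- \
    "$1/api/datasources"
}
```

For example, `add_prometheus_datasource http://ip:3000 http://ip:9090` registers the Prometheus server from step 3 as the default data source.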
5. Install Alertmanager
- Download
wget https://github.com/prometheus/alertmanager/releases/download/v0.20.0/alertmanager-0.20.0.linux-amd64.tar.gz
- Extract
tar -zxvf alertmanager-0.20.0.linux-amd64.tar.gz
- Configuration template
alertmanager.yml
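# alertmanager.yml
# Global settings
global:
  resolve_timeout: 5s # time after which an alert with no further updates is treated as resolved (default 5m)
  # SMTP settings for email notifications
  smtp_smarthost: 'smtp.qq.com:587'
  smtp_from: 'zhaosc007@qq.com'
  smtp_auth_username: 'zhaosc007@qq.com'
  smtp_auth_password: 'password' # replace with the real password
  smtp_hello: 'qq.com'
  # smtp_require_tls: false
# Templates
templates:
  - '/alertmanager/*.tmpl'
# Routing
route:
  group_by: ['alertname'] # labels used to group alerts
  group_wait: 20s # wait before sending the first notification for a new group
  group_interval: 20s # wait before notifying about new alerts added to an existing group
  repeat_interval: 12h # wait before repeating a notification that was already sent
  receiver: 'email' # default receiver
# Receivers
receivers:
  - name: 'email' # receiver name
    email_configs:
      - to: '545843950@qq.com' # address that receives the alerts
        html: '{{ template "emai.html" . }}' # email body template
        headers: { Subject: " {{ .CommonLabels.instance }} {{ .CommonAnnotations.summary }}" } # subject line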
- Email template
emai.tmpl
{{ define "emai.html" }}
{{ range .Alerts }}
<pre>
Instance: {{ .Labels.instance }}
Summary: {{ .Annotations.summary }}
Description: {{ .Annotations.description }}
Time: {{ .StartsAt.Format "2006-01-02 15:04:05" }}
</pre>
{{ end }}
{{ end }}
- Start
./alertmanager --config.file=alertmanager.yml # alertmanager.yml is the default config file
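The email template above reads `.Labels.instance`, `.Annotations.summary`, and `.Annotations.description`, so alerts only render fully if the Prometheus rules set those annotations. A minimal sketch of such a rule, written from the shell. The file name first_rules.yml matches the commented-out entry in prometheus.yml; the alert name `InstanceDown`, the group name, the 1m duration, the severity label, and the `write_rules` helper are illustrative assumptions:

```shell
# Write an example rule file that fires when any scrape target has been down for 1 minute.
write_rules() {
  cat > "$1" <<'EOF'
groups:
  - name: example
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute."
EOF
}
```

Running `write_rules first_rules.yml` in the Prometheus directory and then `./promtool check rules first_rules.yml` validates the file. For the alert to reach Alertmanager, the `first_rules.yml` entry under `rule_files` and the Alertmanager target under `alerting` in prometheus.yml would also need to be uncommented, with the target pointing at the host where Alertmanager listens on port 9093.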