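Prometheus deployment:
Installing the Go environment
Prometheus is written in Go. The prebuilt release tarball used below runs without a Go toolchain, so this step is only needed if you plan to build Prometheus or its exporters from source. Go is cross-platform (Windows, Linux, Mac OS X and others) and is also distributed as source that can be compiled and installed.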
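- Binary package download
wget https://dl.google.com/go/go1.13.4.linux-amd64.tar.gz
To install a specific earlier version, see the official Go website.
- After downloading, upload the tarball to the target server, extract it to the chosen path, then set the environment variable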
[root@prometheus ~]# tar xf go1.13.4.linux-amd64.tar.gz -C /usr/local
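- Configure the environment variable
[root@prometheus ~]# vim /etc/profile
# Append the following line to the end of the file:
export PATH=$PATH:/usr/local/go/bin
# Apply the new environment variable
[root@prometheus ~]# source /etc/profile
- Verify that Go is installed correctly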
[root@prometheus ~]# go version
go version go1.13.4 linux/amd64
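Editing /etc/profile by hand works, but repeating the steps can leave duplicate export lines behind. The following is a minimal sketch of an idempotent alternative; `append_once` is a hypothetical helper name introduced here for illustration, not part of Go or Prometheus.

```shell
#!/bin/sh
# append_once FILE LINE
# Append LINE to FILE only if an identical line is not already present.
# (append_once is a hypothetical helper, named here for illustration.)
append_once() {
    file=$1
    line=$2
    # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example (as root):
#   append_once /etc/profile 'export PATH=$PATH:/usr/local/go/bin'
#   . /etc/profile
```

Running the example twice leaves a single export line in the file.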
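Installing Prometheus
- Binary package download
wget https://github.com/prometheus/prometheus/releases/download/v2.14.0/prometheus-2.14.0.linux-amd64.tar.gz
To install a specific earlier version, see the official Prometheus website (it also lists all the exporter components).
- After downloading, copy the tarball to the prepared server and extract it to the target directory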
[root@prometheus ~]# tar xf prometheus-2.14.0.linux-amd64.tar.gz -C /usr/local
- Create a symlink to make later configuration easier
[root@prometheus ~]# ln -sv /usr/local/prometheus-2.14.0.linux-amd64/ /usr/local/Prometheus
- Then edit the Prometheus configuration file
[root@prometheus ~]# vim /usr/local/Prometheus/prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
    - targets: ['localhost:9090']
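Because YAML is sensitive to indentation, it can help to validate the file before starting Prometheus. The release tarball ships a `promtool` binary next to `prometheus`, and `promtool check config` parses a configuration file. The wrapper below is a sketch: `validate_config` is a hypothetical function name, and the default directory assumes the symlink created above.

```shell
#!/bin/sh
# validate_config [PROM_HOME]
# Run promtool against PROM_HOME/prometheus.yml.
# (validate_config is a hypothetical wrapper; the default path assumes
#  the /usr/local/Prometheus symlink from this guide.)
validate_config() {
    prom_home=${1:-/usr/local/Prometheus}
    if [ ! -x "$prom_home/promtool" ]; then
        echo "promtool not found in $prom_home" >&2
        return 2
    fi
    # promtool exits non-zero when the configuration is invalid
    "$prom_home/promtool" check config "$prom_home/prometheus.yml"
}

# Example:
#   validate_config && echo "configuration looks valid"
```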
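- Start Prometheus in the background
[root@prometheus ~]# nohup /usr/local/Prometheus/prometheus --config.file=/usr/local/Prometheus/prometheus.yml &
- Check whether it is running
[root@prometheus ~]# cat ./nohup.out
The built-in Prometheus web UI is available at http://localhost:9090. Click Targets to open the list of monitoring targets, which shows every target that has been configured.
If the State column shows a configured target as DOWN, that node is either not set up or misconfigured.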
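Reading nohup.out confirms that the process started, but not that it is serving requests. Prometheus exposes a `/-/healthy` endpoint for that purpose. The sketch below polls it with curl; `wait_healthy` is a hypothetical helper name, and the retry count and one-second delay are arbitrary choices.

```shell
#!/bin/sh
# wait_healthy [BASE_URL] [TRIES]
# Poll BASE_URL/-/healthy once per second until it answers successfully
# or TRIES attempts have been made.
# (wait_healthy is a hypothetical helper, named here for illustration.)
wait_healthy() {
    base_url=${1:-http://localhost:9090}
    tries=${2:-10}
    i=0
    while [ "$i" -lt "$tries" ]; do
        # -f: treat HTTP errors as failure, -sS: quiet but still show errors
        if curl -fsS -o /dev/null "$base_url/-/healthy"; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# Example:
#   wait_healthy http://localhost:9090 30 || echo "Prometheus did not come up" >&2
```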
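Configuration hot reload
Once the Prometheus time-series database holds a large amount of data, each restart of the Prometheus process takes longer and longer. Configuration changes are frequent in day-to-day operations, so Prometheus provides a way to reload its configuration at runtime.
- How to reload: send a POST request to /-/reload. This requires starting Prometheus with the --web.enable-lifecycle option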
[root@prometheus ~]# nohup /usr/local/Prometheus/prometheus --config.file=/usr/local/Prometheus/prometheus.yml --web.enable-lifecycle &
- Reload command
[root@prometheus ~]# curl -XPOST http://localhost:9090/-/reload
This way, Prometheus does not need to be stopped and started again after a configuration change.
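The plain curl command prints nothing on success, so a failed reload is easy to miss. A minimal sketch that inspects the HTTP status code instead is shown below; `reload_prometheus` is a hypothetical function name, and the hint in the error message is only a guess at the most common cause.

```shell
#!/bin/sh
# reload_prometheus [BASE_URL]
# POST to BASE_URL/-/reload and report whether Prometheus accepted it.
# (reload_prometheus is a hypothetical helper, named here for illustration.)
reload_prometheus() {
    base_url=${1:-http://localhost:9090}
    # -w '%{http_code}' prints only the status code; the body is discarded
    status=$(curl -s -o /dev/null -w '%{http_code}' -X POST "$base_url/-/reload")
    if [ "$status" = "200" ]; then
        echo "reload ok"
    else
        echo "reload failed (HTTP $status); was --web.enable-lifecycle set?" >&2
        return 1
    fi
}

# Example:
#   reload_prometheus http://localhost:9090
```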
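Extension (adding nodes in bulk)
When adding a large cluster of hosts, listing them one by one in prometheus.yml is clearly inconvenient. Instead, write a discovery file and manage the hosts in bulk.
- Write the service discovery file (the targets directory sits next to the prometheus binary, since /usr/local/Prometheus/prometheus is a file and cannot contain subdirectories)
[root@prometheus ~]# mkdir -p /usr/local/Prometheus/targets/node
[root@prometheus ~]# vim /usr/local/Prometheus/targets/node/node.yml
- targets:
  - '172.16.214.141:9100'
  - '172.16.214.140:9100'
  - '172.16.214.139:9100'
  labels:
    idc: "bj" # cluster name label
- Edit the configuration file
[root@prometheus ~]# vim /usr/local/Prometheus/prometheus.yml
# add this job under scrape_configs:
  - job_name: 'nm_mch-app_node'
    file_sd_configs:
    - files: ['/usr/local/Prometheus/targets/node/node.yml']
      refresh_interval: 5s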
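With many hosts, the discovery file itself can be generated from a plain host list rather than typed by hand. The sketch below reads one host per line and prints a target group in the same shape as the node.yml example above; `gen_targets` is a hypothetical helper name, and the default port 9100 is taken from the targets shown in this guide.

```shell
#!/bin/sh
# gen_targets IDC [PORT]
# Read one host per line on stdin and print a file_sd target group as YAML.
# (gen_targets is a hypothetical helper, named here for illustration.)
gen_targets() {
    idc=$1
    port=${2:-9100}
    echo "- targets:"
    while IFS= read -r host; do
        # skip blank lines and comment lines
        case $host in
            ''|'#'*) continue ;;
        esac
        printf "  - '%s:%s'\n" "$host" "$port"
    done
    echo "  labels:"
    printf '    idc: "%s"\n' "$idc"
}

# Example: hosts.txt holds one IP address or hostname per line
#   gen_targets bj < hosts.txt > node.yml
```

Redirect the output to the discovery file referenced in file_sd_configs; Prometheus rereads that file on its own, so no reload request is needed for target changes.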
Running Prometheus under systemd
[root@prometheus ~]# vim /lib/systemd/system/prometheus.service
[Unit]
Description=Prometheus
After=network.target
[Service]
ExecStart=/usr/local/Prometheus/prometheus --config.file=/usr/local/Prometheus/prometheus.yml --web.enable-lifecycle
[Install]
WantedBy=multi-user.target
# Reload systemd, start the service, and enable it at boot
[root@prometheus ~]# systemctl daemon-reload
[root@prometheus ~]# systemctl start prometheus
[root@prometheus ~]# systemctl enable prometheus
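After enabling the unit, it is worth confirming that the service actually stayed up. The sketch below uses `systemctl is-active` and, on failure, prints the most recent journal lines for the unit; `check_service` is a hypothetical helper name, and the 20-line limit is an arbitrary choice.

```shell
#!/bin/sh
# check_service [UNIT]
# Report whether a systemd unit is active; on failure, show recent log lines.
# (check_service is a hypothetical helper, named here for illustration.)
check_service() {
    unit=${1:-prometheus}
    if systemctl is-active --quiet "$unit"; then
        echo "$unit is running"
    else
        echo "$unit is not running; recent log lines:" >&2
        journalctl -u "$unit" -n 20 --no-pager >&2
        return 1
    fi
}

# Example:
#   check_service prometheus
```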