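Read this and you have three years of ops experience

Zabbix

Monitoring

Why monitor? What kinds of monitoring exist, how do they differ, and what are their pros and cons?
What to monitor:
    1. Understand the monitored object: how a CPU works and the principles behind it
    2. Know its metrics: CPU utilization, CPU load, CPU count, context switches
    3. Establish a performance baseline: what counts as a fault? How high does the CPU load have to be before it counts as high?
Monitoring scope:
    1. Hardware monitoring: server hardware faults
    2. Operating system monitoring: CPU, memory, disk, I/O, processes
    3. Application monitoring: nginx, MySQL, and other services
    4. Business monitoring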

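Hardware monitoring:

1. Use IPMI
2. Inspect the machine room in person
Remote management cards:
    Dell servers: iDRAC
    HP servers: iLO
    IBM servers: IMM
    On Linux, all of these can be reached through IPMI (which depends on the BMC controller).
    The Linux tool for managing IPMI is 'ipmitool' (monitoring and control).

1. The hardware must support IPMI

2. The operating system must provide 'Linux IPMI' support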

Installing ipmitool:

[root@localhost ~]# yum install OpenIPMI ipmitool -y
[root@localhost ~]# rpm -qa OpenIPMI ipmitool
ipmitool-1.8.13-8.el7_1.x86_64
OpenIPMI-2.0.19-11.el7.x86_64

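There are two ways to use IPMI:
1. Local invocation
2. Remote invocation (needs the IP address, a username, and a password)

This walkthrough uses CentOS 7.
[root@localhost ~]# systemctl start ipmi  # start the IPMI service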
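The local and remote modes can both feed the same health check. The sketch below filters sensor lines and prints only the sensors that are not healthy. It assumes the usual `ipmitool sdr` layout of `name | reading | status`, with `ok` meaning healthy and `ns` meaning no reading; the function name, host, and credentials are placeholders, not part of the original notes.

```shell
# failing_sensors: read `ipmitool sdr`-style lines on stdin and print the
# name and status of every sensor whose status is neither "ok" nor "ns".
# Assumed line format: "<name> | <reading> | <status>".
failing_sensors() {
    awk -F'|' '
        NF >= 3 {
            name = $1; status = $3
            gsub(/^[ \t]+|[ \t]+$/, "", name)     # trim the sensor name
            gsub(/^[ \t]+|[ \t]+$/, "", status)   # trim the status code
            if (status != "ok" && status != "ns") print name ": " status
        }'
}

# Local call (goes through /dev/ipmi0, so the ipmi service must be running):
#   ipmitool sdr | failing_sensors
# Remote call (host and credentials are placeholders):
#   ipmitool -I lanplus -H 192.0.2.10 -U admin -P secret sdr | failing_sensors
```

An empty result means every sensor with a reading reported `ok`, which makes the output easy to alert on.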

IPMI commands

[root@localhost ~]# ipmitool --help
ipmitool: invalid option -- '-'
ipmitool version 1.8.13
usage: ipmitool [options...] <command>
       -h             This help
       -V             Show version information
       -v             Verbose (can use multiple times)
       -c             Display output in comma separated format
       -d N           Specify a /dev/ipmiN device to use (default=0)
       -I intf        Interface to use
       -H hostname    Remote host name for LAN interface
       -p port        Remote RMCP port [default=623]
       -U username    Remote session username
       -f file        Read remote session password from file
       -z size        Change Size of Communication Channel (OEM)
       -S sdr         Use local file for remote SDR cache
       -D tty:b[:s]   Specify the serial device, baud rate to use
                      and, optionally, specify that interface is the system one
       -a             Prompt for remote password
       -Y             Prompt for the Kg key for IPMIv2 authentication
       -e char        Set SOL escape character
       -C ciphersuite Cipher suite to be used by lanplus interface
       -k key         Use Kg key for IPMIv2 authentication
       -y hex_key     Use hexadecimal-encoded Kg key for IPMIv2 authentication
       -L level       Remote session privilege level [default=ADMINISTRATOR]
                      Append a '+' to use name/privilege lookup in RAKP1
       -A authtype    Force use of auth type NONE, PASSWORD, MD2, MD5 or OEM
       -P password    Remote session password
       -E             Read password from IPMI_PASSWORD environment variable
       -K             Read kgkey from IPMI_KGKEY environment variable
       -m address     Set local IPMB address
       -b channel     Set destination channel for bridged request
       -t address     Bridge request to remote target address
       -B channel     Set transit channel for bridged request (dual bridge)
       -T address     Set transit address for bridge request (dual bridge)
       -l lun         Set destination lun for raw commands
       -o oemtype     Setup for OEM (use 'list' to see available OEM types)
       -O seloem      Use file for OEM SEL event descriptions
       -N seconds     Specify timeout for lan [default=2] / lanplus [default=1] interface
       -R retry       Set the number of retries for lan/lanplus interface [default=4]
Interfaces:
    open          Linux OpenIPMI Interface [default]
    imb           Intel IMB Interface 
    lan           IPMI v1.5 LAN Interface 
    lanplus       IPMI v2.0 RMCP+ LAN Interface 
    serial-terminal  Serial Interface, Terminal Mode 
    serial-basic  Serial Interface, Basic Mode 
Commands:
    raw           Send a RAW IPMI request and print response
    i2c           Send an I2C Master Write-Read command and print response
    spd           Print SPD info from remote I2C device
    lan           Configure LAN Channels
    chassis       Get chassis status and set power state
    power         Shortcut to chassis power commands
    event         Send pre-defined events to MC
    mc            Management Controller status and global enables
    sdr           Print Sensor Data Repository entries and readings
    sensor        Print detailed sensor information
    fru           Print built-in FRU and scan SDR for FRU locators
    gendev        Read/Write Device associated with Generic Device locators sdr
    sel           Print System Event Log (SEL)
    pef           Configure Platform Event Filtering (PEF)
    sol           Configure and connect IPMIv2.0 Serial-over-LAN
    tsol          Configure and connect with Tyan IPMIv1.5 Serial-over-LAN
    isol          Configure IPMIv1.5 Serial-over-LAN
    user          Configure Management Controller users
    channel       Configure Management Controller channels
    session       Print session information
    dcmi          Data Center Management Interface
    sunoem        OEM Commands for Sun servers
    kontronoem    OEM Commands for Kontron devices
    picmg         Run a PICMG/ATCA extended cmd
    fwum          Update IPMC using Kontron OEM Firmware Update Manager
    firewall      Configure Firmware Firewall
    delloem       OEM Commands for Dell systems
    shell         Launch interactive IPMI shell
    exec          Run list of commands from file
    set           Set runtime variable for shell and exec
    hpm           Update HPM components using PICMG HPM.1 file
    ekanalyzer    run FRU-Ekeying analyzer using FRU files
    ime           Update Intel Manageability Engine Firmware

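There are two ways to put IPMI on the network:
IPMI over LAN (roughly, the connection goes through one of the server's regular network ports)
Dedicated port (the server gets a separate network cable just for management). On Dell servers IPMI can be configured from the small front panel. On cloud hosts IPMI is not something we need to consider.

For routers and switches: SNMP
I won't go into detail on these devices, since I have not worked with them.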

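System monitoring

For a systems operations engineer, system monitoring is the main focus:

- CPU
- Memory
- I/O, Input/Output (network, disk)

Three important CPU concepts:

    1. Context switches: the CPU scheduler switching the CPU from one process to another
    2. Run queue (load): the processes queued for CPU time; the article "I Am a Process" is a good reference for how the queuing works
    3. Utilization
To monitor the CPU, first determine the type of service:
(1) I/O-bound (databases)
(2) CPU-bound (web/mail)

Establishing the performance baseline
    Run queue: 1-3 threads per core, so a single 4-core CPU should stay at a load of 12 or below
    CPU utilization: 65%-70% user time
                     30%-35% kernel time
                     0%-5% idle
    Context switches: the fewer the better
All monitoring has to be considered in light of the business.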
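The run-queue baseline above turns into a simple threshold check. This is a minimal sketch, assuming the limit of 3 runnable threads per core quoted above; `load_status` is a hypothetical helper name and the limit should be tuned per service.

```shell
# load_status LOAD CORES [PER_CORE_LIMIT]
# Prints "ok" when LOAD <= CORES * PER_CORE_LIMIT, otherwise "high".
# PER_CORE_LIMIT defaults to 3, the upper end of the 1-3 range above.
load_status() {
    awk -v load="$1" -v cores="$2" -v limit="${3:-3}" 'BEGIN {
        if (load + 0 > cores * limit) print "high"; else print "ok"
    }'
}

# Example with live values on Linux:
#   load_status "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```

awk does the comparison because load averages are fractional and plain shell arithmetic only handles integers.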

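Common system monitoring tools

top, sysstat, mpstat (mpstat ships in the sysstat package)

How to use the tools
top fields explained

# The top command in detail

The top tool
top is a commonly used Linux performance analysis tool that shows each process's resource usage in real time.

Type top on the command line to start it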
[root@tigerfive ~]# top

top - 07:47:17 up 28 min,  2 users,  load average: 1.08, 1.54, 1.13
Tasks: 209 total,   1 running, 208 sleeping,   0 stopped,   0 zombie
%Cpu(s):  2.4 us, 12.2 sy,  0.1 ni, 82.1 id,  3.2 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  7946924 total,  3838888 free,  1033464 used,  3074572 buff/cache
KiB Swap:  8388604 total,  8388604 free,        0 used.  4911180 avail Mem

PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND

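Line 1

07:47:17   current time

28 min     uptime

2 users    users currently logged in

load average: 1.08, 1.54, 1.13   load averages over the last 1, 5, and 15 minutes

Line 2

209 total       total number of tasks

1 running       running tasks

208 sleeping    sleeping tasks

0 stopped       stopped tasks

0 zombie        zombie tasks

Line 3

2.4 us     CPU time spent in user space (%)

12.2 sy    CPU time spent in the kernel (%)

0.1 ni     CPU time spent on user processes with an adjusted nice value (%)

82.1 id    idle time (%)

3.2 wa     time spent waiting for I/O (%)

0.0 hi     time spent servicing hardware interrupts (%)

0.0 si     time spent servicing software interrupts (%)

0.0 st     steal time: CPU time the hypervisor took away from this virtual machine (%)

These CPU percentages are averaged over all CPUs; press 1 to show each CPU separately.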
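For monitoring, the single most useful number on line 3 is usually "how busy is the CPU", that is, 100 minus the idle percentage. The sketch below extracts it from a `%Cpu(s)` line. `cpu_busy_percent` is a hypothetical helper; it assumes the procps-ng layout shown above and a C locale (a period as decimal separator).

```shell
# cpu_busy_percent: read top's "%Cpu(s)" summary line on stdin and print
# 100 minus the idle percentage, with one decimal.
# The pattern also handles "ni,100.0 id", where top leaves no space
# between the comma and a three-digit idle value.
cpu_busy_percent() {
    sed -n 's/.*[ ,]\([0-9.]\{1,\}\) id.*/\1/p' |
        awk '{ printf "%.1f\n", 100 - $1 }'
}

# Usage: LC_ALL=C top -bn1 | grep '^%Cpu' | cpu_busy_percent
```

Keep in mind that the first batch-mode sample from top reflects averages since boot on some versions, so a second iteration (`-n2`) can give a more current reading.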

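Line 4

total         total physical memory

free          free physical memory

used          physical memory in use

buff/cache    memory used for buffers and cache

Line 5

total    total swap space

free     free swap space

used     swap space in use

Process information

PID        process ID

USER       owner of the process

PR         scheduling priority (the kernel has 140 priority levels in total)

NI         nice value (a negative value means higher priority)

VIRT       virtual memory used by the process

RES        resident (physical) memory used by the process

SHR        shared memory

S          process state

%CPU       CPU usage of the process

%MEM       memory usage of the process

TIME+      total CPU time the process has consumed

COMMAND    process name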

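Other interactive commands in top

h or ?  help
M    sort by memory usage
P    sort by CPU usage
N    sort by PID
R    reverse the sort order
f    choose which fields to display
1    show each CPU separately

<    move the sort column left
\>    move the sort column right
z    toggle color
W    save the current top settings to ~/.toprc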

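`What needs to be monitored`

The **load average** on the first line of top should be monitored. Its three values cover the last 1, 5, and 15 minutes; each is the average length of the queue of processes waiting for the CPU.
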
[root@tigerfive ~]# top -bn1 | head -1
top - 11:28:31 up  4:00,  3 users,  load average: 0.53, 0.46, 0.49
[root@tigerfive ~]# top -bn1 | head -1 | cut -d ',' -f 3-
  load average: 0.42, 0.54, 0.53
[root@tigerfive ~]# top -bn1 | head -1 | awk -F ',' '{print $3","$4","$5}'
  load average: 0.33, 0.46, 0.50
[root@tigerfive ~]# top -bn1 | head -1 | awk '{print $10$11$12}'
0.72,0.51,0.51

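Note that once uptime exceeds one day the header reads "up N days, HH:MM", which adds a field and a comma, so the field numbers used above shift. A script that picks fields by position has to check the uptime format first, or anchor on the text "load average:" instead.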
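One way around the shifting field positions is to ignore everything before the literal text `load average:`. The sketch below does that; `parse_loadavg` is a hypothetical helper name, and it assumes a C locale so that commas only ever appear as separators.

```shell
# parse_loadavg: read a top/uptime header line on stdin and print the
# 1-, 5- and 15-minute load averages separated by spaces.
parse_loadavg() {
    sed -n 's/.*load average: *//p' | tr -d ','
}

# Usage: LC_ALL=C top -bn1 | head -1 | parse_loadavg
# On Linux the same three numbers are the first fields of /proc/loadavg.
```

Because nothing depends on the uptime portion of the line, the same function works whether the machine has been up for minutes or for months.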

The **Tasks** line (line 2) should be monitored as well.

Watch for zombie processes.
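A zombie check does not have to go through top at all. As a minimal sketch, the hypothetical helper below counts state codes that begin with `Z`, which is how `ps` marks zombie (defunct) processes.

```shell
# count_zombies: read process state codes on stdin (one per line, as printed
# by `ps -eo stat=`) and print how many start with Z (zombie).
count_zombies() {
    awk '$1 ~ /^Z/ { n++ } END { print n + 0 }'
}

# Usage: ps -eo stat= | count_zombies
```

Printing `0` rather than nothing when there are no zombies keeps the result usable as a numeric monitoring item.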

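...

These days I actually prefer htop and gtop over top. gtop is more colorful and more capable, but it is not in the EPEL repository, so it is far less widely used than htop.

gtop is good enough to deserve a try (I cover it in a separate article). Before getting to gtop, a word on htop: it does the same job as top with a cleaner, more readable display, and because it is in EPEL it installs with a single yum command.

gtop looks even better than htop, but since it is not in EPEL it has to be installed with npm.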

Here is what it looks like:

[Image: gtop screenshot]

Installing gtop requires a Node.js environment, so set that up first.
Setting up Node.js

wget https://nodejs.org/dist/v8.9.2/node-v8.9.2-linux-x64.tar.xz
xz -d node-v8.9.2-linux-x64.tar.xz
tar xf node-v8.9.2-linux-x64.tar -C /usr/local/
ln -s /usr/local/node-v8.9.2-linux-x64 /usr/local/node
echo 'export PATH=$PATH:/usr/local/node/bin' >>/etc/profile
source /etc/profile
node -v
npm install -g cnpm --registry=https://registry.npm.taobao.org
npm -v
cnpm install gtop -g
gtop

To quit gtop, press q, or Ctrl+C in most shell environments.

You can sort the process table by pressing:

p: process ID
c: CPU usage
m: memory usage


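Line 1 shows the current system time, the uptime, the number of logged-in users, and the system load.
    The load average has three values: the average load over the last 1, 5, and 15 minutes. On a single processor a load between 0 and 1.00 is normal, and anything above 1.00 deserves attention. On a multi-core machine the load average should not exceed the total number of cores.

Line 2 shows: total (all tasks), running, sleeping, stopped, and zombie.

Line 3
shows:
%us  CPU time spent in user space,
%sy  CPU time spent in kernel space,
%ni  CPU time spent on user-space processes whose priority was changed,
%id  idle CPU time,
%wa  CPU time spent waiting for I/O,
%hi  time spent servicing hardware interrupts, %si time spent servicing software interrupts,
%st  applies to virtual CPUs: CPU time the hypervisor took away from this virtual machine.
The id% value usually reflects how busy or idle the CPU is.

Line 4, Mem: total (total physical memory), used (physical memory in use), free (free memory), buffers (memory used for kernel buffers).

Line 5, Swap: total (total swap), used (swap in use), free (free swap), cached (cached amount).
The difference between buffers and cached deserves a note: buffers is the read/write buffer for block devices, while cached is the filesystem's own page cache. Both are low-level Linux mechanisms whose purpose is to speed up disk access.

Line 6: PID (process ID), USER (user running the process), PR (priority), NI (nice value), VIRT (virtual memory used; VIRT = SWAP + RES), RES (physical memory used), SHR (shared memory used), S (process state), %CPU (CPU usage), %MEM (share of physical memory), TIME+ (cumulative CPU time), COMMAND (command name or command line).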
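A brief introduction to using the top command:
top [-] [d] [q] [c] [C] [S] [n]
Every ops engineer should know these!
Options
d  Sets the delay between two screen refreshes. The s interactive command can change it later.
p  Monitors only the process with the given PID.
q  Makes top refresh without any delay. If the caller has superuser privileges, top runs at the highest possible priority.
S  Enables cumulative mode.
s  Runs top in secure mode, which removes the potential danger of the interactive commands.
i  Makes top hide idle and zombie processes.
c  Shows the full command line instead of just the command name.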
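The following interactive commands can be used while top is running.
    In day-to-day use, knowing these commands well matters more than knowing the options.
    Each command is a single letter. If top was started with the s option, some of them may be disabled.
Ctrl+L  Clear and redraw the screen.
h or ?  Show the help screen with a short summary of the commands.
k  Kill a process. top prompts for the PID and for the signal to send. Signal 15 ends a process normally; if that does not work, use signal 9 to force it. The default is signal 15. Disabled in secure mode.
i  Ignore idle and zombie processes. This is a toggle.
q  Quit.
r  Renice a process. top prompts for the PID and the new nice value. A positive value lowers the priority and a negative value raises it. The default is 10.
s  Change the delay between refreshes. top prompts for the new delay in seconds; fractions are treated as milliseconds. Entering 0 makes top refresh continuously. The default is 5 s in classic top (3 s in procps-ng). Too small a delay makes the screen refresh so fast that nothing can be read, and it also raises the system load considerably.
f or F  Add fields to or remove fields from the display.
o or O  Change the order of the displayed fields.
l  Toggle the load average and uptime line.
m  Toggle the memory information.
t  Toggle the task and CPU state information.
c  Toggle between the command name and the full command line.
M  Sort by resident memory size.
P  Sort by CPU usage percentage.
T  Sort by time / cumulative time.
W  Write the current settings to ~/.toprc. This is the recommended way to create a top configuration file.
Shift+M  Sort by memory usage (the same key as M above).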



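#### What does Zabbix monitor?

| Category               | Items monitored                                              |
| ---------------------- | ------------------------------------------------------------ |
| Hardware monitoring    | Temperature, hardware faults, etc.                           |
| System monitoring      | CPU, memory, disk, NIC traffic, TCP states, process count    |
| Application monitoring | Nginx, Tomcat, PHP, MySQL, Redis, etc.                       |
| Log monitoring         | System logs, service logs, access logs, error logs           |
| Security monitoring    | WAF, sensitive file monitoring                               |
| API monitoring         | Availability, API requests, response time                    |
| Business monitoring    | For an e-commerce site: orders per minute, new registrations, active users, campaign effectiveness |
| Traffic monitoring     | User information derived from traffic, such as geographic location, visits to a given page, time spent on a page |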

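Common Zabbix items

Common built-in Zabbix items
agent.ping -- checks whether the agent is reachable; returns nothing if unreachable, 1 if reachable
system.cpu.load -- CPU load; returns a float
system.cpu.util -- CPU utilization; returns a float
vfs.dev.read -- disk read statistics; the mode is one of sps, ops, bps; returns a float; a custom multiplier of 1024 needs to be defined
vfs.dev.write -- disk write statistics; the mode is one of sps, ops, bps; returns a float; a custom multiplier of 1024 needs to be defined
net.if.out[br0] -- NIC throughput, outgoing direction; update interval 60 s
net.if.in[br0] -- NIC throughput, incoming direction (unit: bytes); update interval 60 s
proc.num[] -- total number of processes on the system; update interval 60 s
proc.num[,,run] -- number of processes currently running; update interval 60 s
### Processor information
Fetch the load value with zabbix_get
Keeping user time, system time, and I/O wait in reasonable proportion lets processes run efficiently
High system time means the process makes many system calls; for an ordinary program, excessive system time is a sign that the program should be optimized to make fewer system calls
High I/O wait points to poor disk I/O performance; if the workload reads and writes files frequently and needs high throughput, consider replacing the disks or building a RAID array from several disks
system.cpu.switches -- CPU context switches, in sps (switches per second); the history parameter in API calls must be 3
system.cpu.intr -- number of CPU interrupts; the history parameter in API calls must be 3
system.cpu.load[percpu,avg1] -- 1-minute CPU load averaged over the number of cores (Processor load (1 min average per core)); the history parameter in API calls must be 0
system.cpu.load[percpu,avg5] -- 5-minute CPU load averaged over the number of cores (Processor load (5 min average per core)); the history parameter in API calls must be 0
system.cpu.load[percpu,avg15] -- 15-minute CPU load averaged over the number of cores (Processor load (15 min average per core)); the history parameter in API calls must be 0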

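Common custom Zabbix items
#### Memory
vim /usr/local/zabbix/etc/zabbix_agentd.conf.d/catcarm.conf
UserParameter=ram.info[*],/bin/cat /proc/meminfo | awk '/^$1:/{print $$2}'
ram.info[Cached] -- memory used by the page cache; returns an integer; a custom multiplier of 1024 needs to be defined
ram.info[MemFree] -- free memory; returns an integer; a custom multiplier of 1024 needs to be defined
ram.info[Buffers] -- memory used by buffers; returns an integer; a custom multiplier of 1024 needs to be defined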
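The multiplier of 1024 is needed because /proc/meminfo reports values in kB. As a sketch of the same lookup outside Zabbix, the hypothetical helper below reads meminfo-style text and prints the value already converted to bytes.

```shell
# meminfo_value KEY: read /proc/meminfo-style text on stdin and print the
# value of KEY converted from kB to bytes (the "multiplier of 1024").
# Prints nothing and returns non-zero when KEY is not present.
meminfo_value() {
    kb=$(awk -v key="$1:" '$1 == key { print $2 }')
    [ -n "$kb" ] && echo $((kb * 1024))
}

# Usage on Linux: meminfo_value MemFree < /proc/meminfo
```

Matching the whole first field (`MemFree:`) rather than a substring avoids picking up similarly named keys such as `SwapCached` when asking for `Cached`.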

#### Custom TCP items
vim /usr/local/zabbix/share/zabbix/alertscripts/tcp_connection.sh
#!/bin/bash
# Each function prints the number of TCP sockets in one state, taken from `ss -ant`.
function ESTAB {
/usr/sbin/ss -ant | awk '{++s[$1]} END {for(k in s) print k,s[k]}' | grep 'ESTAB' | awk '{print $2}'
}
function TIMEWAIT {
/usr/sbin/ss -ant | awk '{++s[$1]} END {for(k in s) print k,s[k]}' | grep 'TIME-WAIT' | awk '{print $2}'
}
function LISTEN {
/usr/sbin/ss -ant | awk '{++s[$1]} END {for(k in s) print k,s[k]}' | grep 'LISTEN' | awk '{print $2}'
}
# Run the function named by the first argument.
$1

vim /usr/local/zabbix/etc/zabbix_agentd.conf.d/cattcp.conf
UserParameter=tcp[*],/usr/local/zabbix/share/zabbix/alertscripts/tcp_connection.sh $1

tcp[TIMEWAIT] -- number of TCP connections in TIME-WAIT; returns an integer
tcp[ESTAB] -- number of established TCP connections; returns an integer
tcp[LISTEN] -- number of listening TCP sockets; returns an integer
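The three functions in the script differ only in the state name, so the counting can be sketched as one parameterized filter. `tcp_state_count` is a hypothetical helper, not part of the original script; it assumes the state is the first column of `ss -ant` output.

```shell
# tcp_state_count STATE: read `ss -ant` output on stdin and print how many
# sockets are in STATE (e.g. ESTAB, TIME-WAIT, LISTEN); prints 0 if none.
tcp_state_count() {
    awk -v state="$1" '$1 == state { n++ } END { print n + 0 }'
}

# Usage: ss -ant | tcp_state_count TIME-WAIT
```

Unlike the grep pipeline, this prints `0` when no socket is in the requested state, so the item always returns a number.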

#### Custom nginx items

vim /etc/nginx/conf.d/default.conf
    location /nginx-status
    {
        stub_status on;
        access_log off;
        allow 127.0.0.1;
        deny all;
    }

    
vim /usr/local/zabbix/etc/zabbix_agentd.conf.d/nginx.conf
UserParameter=Nginx.active,/usr/bin/curl -s "http://127.0.0.1:80/nginx-status" | awk '/Active/ {print $NF}'
UserParameter=Nginx.read,/usr/bin/curl -s "http://127.0.0.1:80/nginx-status" | grep 'Reading' | cut -d" " -f2
UserParameter=Nginx.write,/usr/bin/curl -s "http://127.0.0.1:80/nginx-status" | grep 'Writing' | cut -d" " -f4
UserParameter=Nginx.wait,/usr/bin/curl -s "http://127.0.0.1:80/nginx-status" | grep 'Waiting' | cut -d" " -f6
UserParameter=Nginx.accepted,/usr/bin/curl -s "http://127.0.0.1:80/nginx-status" | awk '/^[ \t]+[0-9]+[ \t]+[0-9]+[ \t]+[0-9]+/ {print $1}'
UserParameter=Nginx.handled,/usr/bin/curl -s "http://127.0.0.1:80/nginx-status" | awk '/^[ \t]+[0-9]+[ \t]+[0-9]+[ \t]+[0-9]+/ {print $2}'
UserParameter=Nginx.requests,/usr/bin/curl -s "http://127.0.0.1:80/nginx-status" | awk '/^[ \t]+[0-9]+[ \t]+[0-9]+[ \t]+[0-9]+/ {print $3}'
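All seven items above parse the same four-line stub_status page. The sketch below shows that parsing in one place, so the field positions are easy to see. `nginx_status_field` and its field names are hypothetical; the layout assumed is the standard stub_status output.

```shell
# nginx_status_field FIELD: read stub_status output on stdin and print one
# of: active, accepts, handled, requests, reading, writing, waiting.
nginx_status_field() {
    awk -v want="$1" '
        /^Active connections:/ { v["active"] = $3 }
        /^ *[0-9]+ +[0-9]+ +[0-9]+/ {
            v["accepts"] = $1; v["handled"] = $2; v["requests"] = $3
        }
        /^Reading:/ { v["reading"] = $2; v["writing"] = $4; v["waiting"] = $6 }
        END { if (want in v) print v[want] }'
}

# Usage: curl -s http://127.0.0.1:80/nginx-status | nginx_status_field waiting
```

An unknown field name prints nothing, which Zabbix would report as an unsupported item rather than a misleading zero.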

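PHP.listenqueue -- length of the PHP listen queue; returns an integer
PHP.idle -- number of idle PHP processes; returns an integer
PHP.active -- number of active PHP processes; returns an integer
PHP.conn -- number of PHP requests; returns an integer
PHP.reached -- number of times PHP hit its process limit; returns an integer
PHP.requets -- number of slow PHP requests; returns an integer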

#### Custom Redis items
vim /usr/local/zabbix/etc/zabbix_agentd.conf.d/redis.conf
UserParameter=Redis.Status,/usr/local/redis/bin/redis-cli -h 127.0.0.1 -p 6379 ping |grep -c PONG
UserParameter=Redis_conn[*],/usr/local/redis/bin/redis-cli -h $1 -p $2 info | grep -w "connected_clients" | awk -F':' '{print $$2}'
UserParameter=Redis_rss_mem[*],/usr/local/redis/bin/redis-cli -h $1 -p $2 info | grep -w "used_memory_rss" | awk -F':' '{print $$2}'
UserParameter=Redis_lua_mem[*],/usr/local/redis/bin/redis-cli -h $1 -p $2 info | grep -w "used_memory_lua" | awk -F':' '{print $$2}'
UserParameter=Redis_cpu_sys[*],/usr/local/redis/bin/redis-cli -h $1 -p $2 info | grep -w "used_cpu_sys" | awk -F':' '{print $$2}'
UserParameter=Redis_cpu_user[*],/usr/local/redis/bin/redis-cli -h $1 -p $2 info | grep -w "used_cpu_user" | awk -F':' '{print $$2}'
UserParameter=Redis_cpu_sys_cline[*],/usr/local/redis/bin/redis-cli -h $1 -p $2 info | grep -w "used_cpu_sys_children" | awk -F':' '{print $$2}'
UserParameter=Redis_cpu_user_cline[*],/usr/local/redis/bin/redis-cli -h $1 -p $2 info | grep -w "used_cpu_user_children" | awk -F':' '{print $$2}'
UserParameter=Redis_keys_num[*],/usr/local/redis/bin/redis-cli -h $1 -p $2 info | grep -w "keys" | grep db$3 | awk -F'=' '{print $$2}' | awk -F',' '{print $$1}'
UserParameter=Redis_loading[*],/usr/local/redis/bin/redis-cli -h $1 -p $2 info | grep loading | awk -F':' '{print $$2}'

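Redis.Status -- whether Redis is running; returns an integer
Redis_conn -- number of connected clients; returns an integer
Redis_rss_mem -- memory the operating system has allocated to Redis; returns an integer
Redis_lua_mem -- memory used by the Lua engine; returns an integer
Redis_cpu_sys -- system (kernel) CPU consumed by the Redis main process; returns a number
Redis_cpu_user -- user CPU consumed by the Redis main process; returns a number
Redis_cpu_sys_cline -- system (kernel) CPU consumed by Redis background processes; returns a number
Redis_cpu_user_cline -- user CPU consumed by Redis background processes; returns a number
Redis_keys_num -- number of keys in a database; returns an integer
Redis_loading -- whether a persistence file is being loaded; returns an integer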
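Every Redis item above follows the same pattern: run `info`, pick one `name:value` line, print the value. The sketch below isolates that step. `redis_info_field` is a hypothetical helper; it assumes INFO lines are terminated with CRLF, which leaves a stray carriage return on the value unless it is stripped.

```shell
# redis_info_field NAME: read `redis-cli info` output on stdin and print the
# value of NAME. The carriage return at the end of each line is stripped so
# the result is a clean number.
redis_info_field() {
    tr -d '\r' | awk -F: -v name="$1" '$1 == name { print $2 }'
}

# Usage:
#   redis-cli -h 127.0.0.1 -p 6379 info | redis_info_field connected_clients
```

Comparing the whole field name also keeps `used_cpu_sys` from matching `used_cpu_sys_children`.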



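mysql:
version: database version
key_buffer_size: size of the MyISAM index buffer
sort_buffer_size: per-session sort buffer (each thread allocates its own)
join_buffer_size: minimum size of the buffer allocated for joins that use plain index scans, range scans, or no index at all
max_connections: maximum number of simultaneous connections allowed
max_connect_errors: maximum number of failed connection attempts allowed from one host; beyond that, further connections from the host are refused (default 100). Use the flush hosts command to lift the block
open_files_limit: number of files the operating system lets MySQL open. Separately, the opened_tables status value shows whether table_open_cache needs to grow: if opened_tables is large and keeps climbing, increase table_open_cache
max_heap_table_size: maximum size of a user-created in-memory table (default 16M). Together with tmp_table_size it also limits the size of internal temporary tables (the smaller of the two applies). Past the limit the table is converted to an InnoDB or MyISAM table (MyISAM by default before 5.7.5, InnoDB from 5.7.6; adjustable with internal_tmp_disk_storage_engine).
max_allowed_packet: maximum size of one packet
##########GET INNODB INFO
#INNODB variables
innodb_version:
innodb_buffer_pool_instances: number of instances the InnoDB buffer pool is split into (default 1)
innodb_buffer_pool_size: size of the InnoDB buffer pool; 5.7.5 introduced innodb_buffer_pool_chunk_size,
innodb_doublewrite: whether doublewrite is enabled (on by default)
innodb_read_io_threads: number of I/O read threads
innodb_write_io_threads: number of I/O write threads
########innodb status
innodb_buffer_pool_pages_total: number of pages in the InnoDB buffer pool, equal to innodb_buffer_pool_size/(16*1024)
innodb_buffer_pool_pages_data: number of InnoDB buffer pool pages that contain data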
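The page-count formula is plain integer division. As a worked example, a 128 MiB buffer pool with 16 KiB pages holds 134217728 / 16384 = 8192 pages. The hypothetical helper below does the same arithmetic and lets the page size be overridden, since 16 KiB is only the default.

```shell
# buffer_pool_pages BYTES [PAGE_BYTES]: number of pages in a buffer pool of
# BYTES, assuming a 16 KiB page (16*1024) unless PAGE_BYTES is given.
buffer_pool_pages() {
    echo $(( $1 / ${2:-16384} ))
}

# Usage: buffer_pool_pages 134217728
```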
########## GET MYSQL HITRATE
1. Query cache hit rate
If Qcache_hits+Com_select<>0, the value is Qcache_hits/(Qcache_hits+Com_select); otherwise 0

2. Thread cache hit rate
If Connections<>0, the value is 1-Threads_created/Connections; otherwise 0

3. MyISAM key cache read hit rate
If Key_read_requests<>0, the value is 1-Key_reads/Key_read_requests; otherwise 0

4. MyISAM key cache write hit rate
If Key_write_requests<>0, the value is 1-Key_writes/Key_write_requests; otherwise 0

5. Key block usage
If Key_blocks_used+Key_blocks_unused<>0, the value is Key_blocks_used/(Key_blocks_used+Key_blocks_unused); otherwise 0

6. Share of temporary tables created on disk
If Created_tmp_disk_tables+Created_tmp_tables<>0, the value is Created_tmp_disk_tables/(Created_tmp_disk_tables+Created_tmp_tables); otherwise 0

7. Connection usage
If max_connections<>0, the value is threads_connected/max_connections; otherwise 0

8. Open files ratio
If open_files_limit<>0, the value is open_files/open_files_limit; otherwise 0

9. Table cache usage
If table_open_cache<>0, the value is open_tables/table_open_cache; otherwise 0
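All nine formulas above reduce to two shapes: `a/b` or `1-a/b`, each guarded against a zero denominator. A minimal sketch of both, with hypothetical helper names; the counter values would come from SHOW GLOBAL STATUS and SHOW VARIABLES.

```shell
# ratio NUM DEN: print NUM/DEN to four decimals, or 0 when DEN is 0.
ratio() {
    awk -v n="$1" -v d="$2" 'BEGIN {
        if (d + 0 == 0) print 0; else printf "%.4f\n", n / d
    }'
}

# one_minus_ratio NUM DEN: print 1 - NUM/DEN, or 0 when DEN is 0.
one_minus_ratio() {
    awk -v n="$1" -v d="$2" 'BEGIN {
        if (d + 0 == 0) print 0; else printf "%.4f\n", 1 - n / d
    }'
}

# Query cache hit rate:  ratio "$Qcache_hits" "$((Qcache_hits + Com_select))"
# Thread cache hit rate: one_minus_ratio "$Threads_created" "$Connections"
# Connection usage:      ratio "$threads_connected" "$max_connections"
```

For instance, 10 threads created over 1000 connections gives a thread cache hit rate of 1 - 10/1000 = 0.99.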

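Other Zabbix skills you need to master

Custom monitoring

Auto-discovery --> match to templates

Auto-registration --> match to templates

Segmented monitoring

Fine-grained alert recipients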

Nginx

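Structure of the nginx configuration

global settings
events {

}

http {                            # only one allowed
    log_format   ;
    access_log   ;

    upstream name {               # multiple allowed, inside http

    }

    server {                      # multiple allowed, inside http
        listen  ;
        server_name _;
        location pattern {        # multiple allowed, inside server

        }
        location pattern {
            proxy_pass http://name;
        }
    }
}

include   ;

The above is the structure of an nginx configuration file. You must know it by heart.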

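Key topics to review
https   ssl   443
tcp
status
authentication
hotlink protection
URL rewriting  rewrite
rate limiting


When the configuration grows too large, split it into multiple files with include

nginx -t              # test the configuration
nginx -s reload       # reload the configuration
nginx -c file-path    # use the given configuration file
nginx -v              # print the version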
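The first two commands are normally used together: test, and reload only if the test passes. A minimal sketch of that habit follows; `reload_nginx` and the `NGINX_BIN` override are hypothetical names introduced here for illustration.

```shell
# reload_nginx: reload nginx only if the configuration test passes.
# NGINX_BIN is a hypothetical override for the path to the nginx binary.
reload_nginx() {
    nginx_bin="${NGINX_BIN:-nginx}"
    if "$nginx_bin" -t; then
        "$nginx_bin" -s reload
    else
        echo "configuration test failed, not reloading" >&2
        return 1
    fi
}

# Usage: reload_nginx
```

A failed test leaves the running configuration untouched, which matters most after splitting the configuration across several include files.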

nginx load balancing

upstream, proxy_pass: weights, algorithms, (down, backup), timeouts, retry counts

upstream --> proxy_pass

Load balancing with ports

nginx reverse proxy

See load balancing

nginx

nginx strengths
Third-party modules    --add-module=
Session persistence  -- hash, cookie, jvm_route
Separating static and dynamic content