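vmstat -- virtual memory statistics
vmstat (Virtual Memory Statistics) is a common tool for monitoring memory on Linux. It reports on the overall state of the operating system's virtual memory, processes, and CPU.
Typical usage: vmstat interval times, which samples once every interval seconds, times times in total. If times is omitted, vmstat keeps collecting data until the user stops it manually.
The first line of output shows averages since the system booted. Each following line shows what happened during the most recent sampling interval (every 5 seconds with vmstat 5, for example). The meaning of each column is given in the header, as follows: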
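▪ procs: the r column shows how many processes are runnable (running or waiting for CPU time); the b column shows how many are in uninterruptible sleep (usually waiting for I/O).
▪ memory: the swpd column shows how much memory has been swapped out to disk; the remaining columns show how much memory is free (unused), how much is used as buffers, and how much is used as the operating system's cache. Values are in KiB by default.
▪ swap: shows swap activity: how much memory is swapped in from disk (si) and swapped out to disk (so) per second.
▪ io: shows how many blocks per second are read from (bi) and written to (bo) block devices, which usually reflects disk I/O.
▪ system: shows the number of interrupts (in) and context switches (cs) per second.
▪ cpu: shows the percentage of total CPU time spent on each kind of work: running user (non-kernel) code (us), running system (kernel) code (sy), idle (id), and waiting for I/O (wa).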
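Signs of a memory shortage: free memory drops sharply and reclaiming buffers and cache does not help; swap space is heavily used (swpd); paging to and from swap is frequent (si, so); disk reads and writes increase (bi, bo); interrupts (in) and context switches (cs) increase; more processes wait for I/O (b); and a large share of CPU time is spent waiting for I/O (wa).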
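The signs listed above can be checked mechanically. The following is a minimal sketch, not a definitive monitor: the function name flag_swap_activity is made up for illustration, and the column positions assume the classic 17-column vmstat layout (newer procps versions may append columns, which does not affect the fields used here).

```shell
# Reads `vmstat` output on stdin and prints one line per sample that shows
# swap traffic. Assumed column order (classic layout):
# r b swpd free buff cache si so bi bo in cs us sy id wa st
flag_swap_activity() {
  awk '$1 ~ /^[0-9]+$/ && ($7 > 0 || $8 > 0) {
    print "swap activity: si=" $7 " so=" $8 " b=" $2 " wa=" $16
  }'
}

# Example: sample every 5 seconds, 12 times (one minute in total).
# vmstat 5 12 | flag_swap_activity
```

Keep in mind that the first data line holds averages since boot, so a hit on that line alone does not mean the system is swapping right now.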
top
pi@M:~ $ top
Tasks: 133 total, 1 running, 128 sleeping, 4 stopped, 0 zombie
%Cpu(s): 0.5 us, 1.2 sy, 0.0 ni, 98.2 id, 0.1 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 926.1 total, 152.5 free, 138.4 used, 635.2 buff/cache
MiB Swap: 100.0 total, 100.0 free, 0.0 used. 708.5 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
274 root 20 0 51420 4696 3904 S 4.9 0.5 12:56.35 oraynewph
3602 pi 20 0 10288 2916 2544 R 1.0 0.3 0:00.35 top
264 root 20 0 14972 4288 3884 S 0.7 0.5 1:57.47 oraysl
495 www-data 20 0 50764 4452 2872 S 0.7 0.5 0:00.74 nginx
3564 root 20 0 0 0 0 I 0.3 0.0 0:00.07 kworker/1:0-events
1 root 20 0 34872 8364 6468 S 0.0 0.9 0:36.57 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.04 kthreadd
3 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_gp
4 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_par_gp
8 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 mm_percpu_wq
9 root 20 0 0 0 0 S 0.0 0.0 0:01.48 ksoftirqd/0
10 root 20 0 0 0 0 I 0.0 0.0 0:05.96 rcu_sched
11 root 20 0 0 0 0 I 0.0 0.0 0:00.00 rcu_bh
12 root rt 0 0 0 0 S 0.0 0.0 0:00.05 migration/0
13 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/0
14 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/1
15 root rt 0 0 0 0 S 0.0 0.0 0:00.04 migration/1
16 root 20 0 0 0 0 S 0.0 0.0 0:00.10 ksoftirqd/1
19 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/2
20 root rt 0 0 0 0 S 0.0 0.0 0:00.02 migration/2
21 root 20 0 0 0 0 S 0.0 0.0 0:00.71 ksoftirqd/2
23 root 0 -20 0 0 0 I 0.0 0.0 0:00.44 kworker/2:0H-kblockd
24 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/3
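The summary area of top shows five kinds of system performance information:
1. Load: current time, number of logged-in users, and system load averages;
2. Tasks: running, sleeping, stopped, and zombie;
3. CPU: user mode, kernel mode, nice, idle, I/O wait, interrupts, and so on;
4. Memory: total, used, free (from the system's point of view), buffers, and cache;
5. Swap: total, used, and free.
By default the task area shows: process ID, effective user, priority, nice value, virtual memory used by the process, resident (physical) memory, shared memory, process state, CPU usage, memory usage, cumulative CPU time, and the command line.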
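The summary area is also easy to consume from a script when top runs in batch mode. The sketch below assumes the %Cpu(s) line format shown in the transcript above and a C locale (decimal points, not commas); the function name cpu_busy_percent is hypothetical.

```shell
# Reads `top -b -n 1` output on stdin and prints the non-idle CPU percentage,
# computed as 100 minus the "id" value of the %Cpu(s) summary line.
cpu_busy_percent() {
  awk '/^%Cpu\(s\):/ {
    sub(/^%Cpu\(s\):/, "")      # drop the label
    gsub(/,/, " ")              # turn "98.2 id," into "98.2 id "
    for (i = 1; i < NF; i++)
      if ($(i + 1) == "id") { printf "%.1f\n", 100 - $i; exit }
  }'
}

# Example (batch mode, a single iteration):
# LC_ALL=C top -b -n 1 | cpu_busy_percent
```

For the summary line in the transcript above (98.2 id), this prints 1.8.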
htop
htop is an interactive process viewer for Linux. It is a text-mode application (for the console or an X terminal) and requires ncurses.
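htop lets the user work interactively, supports color themes, can scroll the process list horizontally or vertically, and supports the mouse.
Compared with top, htop has the following advantages:
▪ The process list can be scrolled horizontally or vertically, so you can see all processes and their complete command lines.
▪ It starts faster than top.
▪ You do not need to type a process ID to kill a process.
▪ It supports mouse operation.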
netstat
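netstat displays statistics for the IP, TCP, UDP, and ICMP protocols and is generally used to check the network connections on each port of the local machine.
Usage:
netstat -npl shows listening sockets together with the owning process, so you can check whether the port you intended to open is actually open.
netstat -rn prints the routing table.
netstat -in shows information about the system's interfaces: for each interface, the MTU, input packets, input errors, output packets, output errors, collisions, and the current output queue length.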
pi@M:~ $ netstat
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 192.168.31.235:56548 116.62.43.125:6061 ESTABLISHED
tcp 0 0 192.168.31.235:34011 116.62.43.125:6061 ESTABLISHED
tcp 0 0 192.168.31.235:33459 121.40.234.27:https ESTABLISHED
tcp 0 0 192.168.31.235:48858 103.46.128.53:6061 ESTABLISHED
tcp 0 0 192.168.31.235:39237 103.46.128.53:6061 ESTABLISHED
tcp 0 0 192.168.31.235:34899 121.40.234.27:https ESTABLISHED
tcp 0 0 192.168.31.235:46026 192.168.31.235:http ESTABLISHED
tcp 0 0 192.168.31.235:http 192.168.31.235:46026 ESTABLISHED
tcp 0 0 192.168.31.235:36573 103.46.128.53:6061 ESTABLISHED
tcp6 0 0 fe80::37f0:196d:959:ssh fe80::3871:a5a5:3:51993 ESTABLISHED
tcp6 0 0 fe80::37f0:microsoft-ds fe80::3871:a5a5:3:56648 ESTABLISHED
tcp6 0 880 fe80::37f0:196d:959:ssh fe80::3871:a5a5:3:50340 ESTABLISHED
udp 0 0 192.168.31.235:48157 120.132.2.236:6060 ESTABLISHED
Active UNIX domain sockets (w/o servers)
Proto RefCnt Flags Type State I-Node Path
unix 2 [ ] DGRAM 17172 /var/lib/samba/private/msg.sock/788
unix 3 [ ] DGRAM 1581 /run/systemd/notify
unix 2 [ ] DGRAM 1591 /run/systemd/journal/syslog
unix 2 [ ] DGRAM 36273 /var/lib/samba/private/msg.sock/3561
unix 2 [ ] DGRAM 17176 /var/lib/samba/private/msg.sock/790
unix 18 [ ] DGRAM 1611 /run/systemd/journal/dev-log
unix 2 [ ] DGRAM 17748 /run/user/1000/systemd/notify
unix 7 [ ] DGRAM 1620 /run/systemd/journal/socket
unix 2 [ ] DGRAM 13592 /tmp/.vncserver-license/0.376
unix 2 [ ] DGRAM 15707 /var/lib/samba/private/msg.sock/362
unix 2 [ ] DGRAM 13882 /var/lib/samba/private/msg.sock/789
unix 2 [ ] DGRAM 17504 /var/lib/samba/private/msg.sock/782
unix 3 [ ] STREAM CONNECTED 10181 /run/systemd/journal/stdout
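When a listing like the one above gets long, a per-state count is often easier to read. This is a small sketch under the assumption that the connection state is the last column of each tcp line (true for netstat -ant, not when -p adds a PID/Program column); count_tcp_states is an illustrative name.

```shell
# Reads `netstat -ant` output on stdin and prints "STATE count" for every
# TCP state seen (covers both tcp and tcp6 lines).
count_tcp_states() {
  awk '$1 ~ /^tcp/ { count[$NF]++ }
       END { for (state in count) print state, count[state] }' | sort
}

# Example:
# netstat -ant | count_tcp_states
```

An unusually large number of connections in a single state (TIME_WAIT or CLOSE_WAIT, for instance) is usually the first thing to look for.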
ps
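▪ To kill a given program (mysqld here): ps aux | grep mysqld | grep -v grep | awk '{print $2}' | xargs kill -9
▪ To kill zombie processes: ps -eal | awk '{if ($2 == "Z") {print $4}}' | xargs kill -9 (a zombie has already exited, so in practice it only disappears once its parent reaps it or is itself terminated)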
pi@M:~ $ ps
PID TTY TIME CMD
825 pts/0 00:00:00 bash
3557 pts/0 00:00:03 top
3690 pts/0 00:00:00 ps
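Before killing anything, it helps to see which processes are zombies and who their parents are. The sketch below assumes input in the column order produced by ps -eo pid,ppid,stat,comm; the function name list_zombies is made up for illustration.

```shell
# Reads `ps -eo pid,ppid,stat,comm` output on stdin and reports every process
# whose state starts with Z (zombie), together with its parent PID.
list_zombies() {
  awk 'NR > 1 && $3 ~ /^Z/ {
    print "zombie pid=" $1 " parent=" $2 " cmd=" $4
  }'
}

# Example:
# ps -eo pid,ppid,stat,comm | list_zombies
```

The parent PID is printed because the parent is the process that has to call wait() for the zombie entry to go away.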
strace
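strace traces the system calls a program makes and the signals it receives while running, which helps analyze abnormal behavior in a program or command.
Example: to see which configuration files mysqld loads on Linux, run the following command:
strace -e stat64 mysqld --print-defaults > /dev/null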
pi@M:~ $ strace -e stat64 mysqld -print -defaults > /dev/null
stat64("/etc/my.cnf", 0x7ea27e60) = -1 ENOENT (No such file or directory)
stat64("/etc/mysql/my.cnf", {st_mode=S_IFREG|0644, st_size=869, ...}) = 0
stat64("/etc/mysql/conf.d/mysql.cnf", {st_mode=S_IFREG|0644, st_size=8, ...}) = 0
stat64("/etc/mysql/conf.d/mysqldump.cnf", {st_mode=S_IFREG|0644, st_size=55, ...}) = 0
stat64("/etc/mysql/mariadb.conf.d/50-client.cnf", {st_mode=S_IFREG|0644, st_size=677, ...}) = 0
stat64("/etc/mysql/mariadb.conf.d/50-mysql-clients.cnf", {st_mode=S_IFREG|0644, st_size=336, ...}) = 0
stat64("/etc/mysql/mariadb.conf.d/50-mysqld_safe.cnf", {st_mode=S_IFREG|0644, st_size=320, ...}) = 0
stat64("/etc/mysql/mariadb.conf.d/50-server.cnf", {st_mode=S_IFREG|0644, st_size=3608, ...}) = 0
stat64("/home/pi/.my.cnf", 0x7ea27e60) = -1 ENOENT (No such file or directory)
211017 22:21:27 [Note] mysqld (mysqld 10.0.28-MariaDB-2+b1) starting as process 3728 ...
stat64("/usr/share/mysql/charsets/Index.xml", {st_mode=S_IFREG|0644, st_size=18307, ...}) = 0
211017 22:21:27 [Warning] Can't create test file /var/lib/mysql/MyNAS.lower-test
211017 22:21:27 [Warning] One can only use the --user switch if running as root
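In a trace like the one above, the files that actually exist are the stat64 calls that return 0. The following sketch filters them out; it assumes the stat64 line format shown above (on 64-bit systems the call is typically named differently, so the pattern would need adjusting), and existing_files_from_strace is a hypothetical helper name.

```shell
# Reads strace output on stdin and prints the path of every stat64 call
# that succeeded (return value 0), i.e. the files that exist.
existing_files_from_strace() {
  awk -F'"' '/^stat64\(/ && / = 0$/ { print $2 }'
}

# Example: strace writes its trace to stderr, so send stderr into the pipe
# and discard the traced program's own stdout.
# strace -e trace=stat64 mysqld --print-defaults 2>&1 >/dev/null | existing_files_from_strace
```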
uptime
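uptime prints how long the system has been running and the system load averages. The last three numbers in its output are the average load over the past 1, 5, and 15 minutes.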
pi@M:~ $ uptime
22:22:26 up 5:09, 2 users, load average: 0.07, 0.06, 0.61
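A load average is easier to judge relative to the number of CPU cores. The sketch below divides the 1-minute figure by a core count; it assumes the English "load average: a, b, c" format shown above (C locale), and the name load_per_core is an illustration only.

```shell
# $1: a line of `uptime` output, $2: number of CPU cores (must be non-zero).
# Prints the 1-minute load average divided by the core count.
load_per_core() {
  printf '%s\n' "$1" |
    awk -v cores="$2" -F'load average: ' '{
      split($2, load, ", ")           # load[1] is the 1-minute average
      printf "%.2f\n", load[1] / cores
    }'
}

# Example:
# load_per_core "$(LC_ALL=C uptime)" "$(nproc)"
```

As a rough rule of thumb, a result that stays above 1 suggests more runnable work than the cores can handle.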
lsof
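lsof (list open files) lists the files currently open on the system. Reviewing that list helps with system inspection and troubleshooting. Common uses:
See what is keeping a file system busy: lsof /boot
See which process is using a port: lsof -i :3306
See which files a user has open: lsof -u username
See which files a process has open: lsof -p 4838
See open network connections to a remote host: lsof -i @192.168.34.128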
pi@MyNAS:~ $ lsof
COMMAND PID TID TASKCMD USER FD TYPE DEVICE SIZE/OFF NODE NAME
systemd 1 root cwd unknown /proc/1/cwd (readlink: Permission denied)
systemd 1 root rtd unknown /proc/1/root (readlink: Permission denied)
systemd 1 root txt unknown /proc/1/exe (readlink: Permission denied)
systemd 1 root NOFD /proc/1/fd (opendir: Permission denied)
kthreadd 2 root cwd unknown /proc/2/cwd (readlink: Permission denied)
kthreadd 2 root rtd unknown /proc/2/root (readlink: Permission denied)
kthreadd 2 root txt unknown /proc/2/exe (readlink: Permission denied)
kthreadd 2 root NOFD /proc/2/fd (opendir: Permission denied)
rcu_gp 3 root cwd unknown /proc/3/cwd (readlink: Permission denied)
rcu_gp 3 root rtd unknown /proc/3/root (readlink: Permission denied)
rcu_gp 3 root txt unknown /proc/3/exe (readlink: Permission denied)
rcu_gp 3 root NOFD /proc/3/fd (opendir: Permission denied)
rcu_par_g 4 root cwd unknown /proc/4/cwd (readlink: Permission denied)
rcu_par_g 4 root rtd unknown /proc/4/root (readlink: Permission denied)
rcu_par_g 4 root txt unknown /proc/4/exe (readlink: Permission denied)
rcu_par_g 4 root NOFD /proc/4/fd (opendir: Permission denied)
mm_percpu 8 root cwd unknown /proc/8/cwd (readlink: Permission denied)
mm_percpu 8 root rtd unknown /proc/8/root (readlink: Permission denied)
mm_percpu 8 root txt unknown /proc/8/exe (readlink: Permission denied)
mm_percpu 8 root NOFD /proc/8/fd (opendir: Permission denied)
ksoftirqd 9 root cwd unknown /proc/9/cwd (readlink: Permission denied)
ksoftirqd 9 root rtd unknown /proc/9/root (readlink: Permission denied)
ksoftirqd 9 root txt unknown /proc/9/exe (readlink: Permission denied)
ksoftirqd 9 root NOFD /proc/9/fd (opendir: Permission denied)
rcu_sched 10 root cwd unknown /proc/10/cwd (readlink: Permission denied)
rcu_sched 10 root rtd unknown /proc/10/root (readlink: Permission denied)
rcu_sched 10 root txt unknown /proc/10/exe (readlink: Permission denied)
rcu_sched 10 root NOFD /proc/10/fd (opendir: Permission denied)
rcu_bh 11 root cwd unknown /proc/11/cwd (readlink: Permission denied)
rcu_bh 11 root rtd unknown /proc/11/root (readlink: Permission denied)
rcu_bh 11 root txt unknown /proc/11/exe (readlink: Permission denied)
rcu_bh 11 root NOFD /proc/11/fd (opendir: Permission denied)
migration 12 root cwd unknown /proc/12/cwd (readlink: Permission denied)
migration 12 root rtd unknown /proc/12/root (readlink: Permission denied)
migration 12 root txt unknown /proc/12/exe (readlink: Permission denied)
migration 12 root NOFD /proc/12/fd (opendir: Permission denied)
cpuhp/0 13 root cwd unknown /proc/13/cwd (readlink: Permission denied)
cpuhp/0 13 root rtd unknown /proc/13/root (readlink: Permission denied)
cpuhp/0 13 root txt unknown /proc/13/exe (readlink: Permission denied)
cpuhp/0 13 root NOFD /proc/13/fd (opendir: Permission denied)
cpuhp/1 14 root cwd unknown /proc/14/cwd (readlink: Permission denied)
cpuhp/1 14 root rtd unknown /proc/14/root (readlink: Permission denied)
cpuhp/1 14 root txt unknown /proc/14/exe (readlink: Permission denied)
cpuhp/1 14 root NOFD /proc/14/fd (opendir: Permission denied)
migration 15 root cwd unknown /proc/15/cwd (readlink: Permission denied)
migration 15 root rtd unknown /proc/15/root (readlink: Permission denied)
migration 15 root txt unknown /proc/15/exe (readlink: Permission denied)
migration 15 root NOFD /proc/15/fd (opendir: Permission denied)
ksoftirqd 16 root cwd unknown /proc/16/cwd (readlink: Permission denied)
ksoftirqd 16 root rtd unknown /proc/16/root (readlink: Permission denied)
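Full lsof output repeats each process many times, as the listing above shows. A minimal sketch for reducing it to one line per process follows; it assumes the command name contains no spaces and is in the first column with the PID in the second, and procs_from_lsof is a made-up name.

```shell
# Reads lsof output on stdin and prints each distinct "COMMAND PID" pair once.
procs_from_lsof() {
  awk 'NR > 1 { print $1, $2 }' | sort -u
}

# Example: which processes hold sockets on port 3306?
# lsof -i :3306 | procs_from_lsof
```

Running lsof as root avoids the "Permission denied" entries seen in the transcript above.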
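perf (partially supported)
perf is the performance analysis tool that ships with the Linux kernel. Its advantage is tight integration with the kernel: it is the first to support new features as they are added to the kernel. It is used to find hot functions and to check cache miss rates, helping developers optimize program performance.
Performance tuning tools such as perf and OProfile all work by sampling the monitored target. In the simplest case, sampling is driven by the tick interrupt: a sample point is triggered inside the tick interrupt, and the sample records the context the program was in at that moment. If a program spends 90% of its time in function foo(), then about 90% of the sample points should fall within foo(). Sampling is subject to chance, but as long as the sampling frequency is high enough and the sampling period long enough, this inference is reasonably reliable. Tick-triggered sampling therefore shows which parts of a program consume the most time, so that analysis can focus on them.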
pi@M:~ $ perf
/usr/bin/perf: line 13: exec: perf_4.19: not found
E: linux-perf-4.19 is not installed.
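The sampling principle described above comes down to counting where the samples land. The sketch below illustrates only that counting step and is not how perf itself is implemented: it assumes a hypothetical input with one function name per line (one line per sample), and sample_histogram is an illustrative name.

```shell
# Reads one function name per line (one line per sample) on stdin and prints
# the share of samples that fell in each function, largest share first.
sample_histogram() {
  awk '{ hits[$1]++; total++ }
       END { for (fn in hits) printf "%s %.1f%%\n", fn, 100 * hits[fn] / total }' |
    sort -k2 -rn
}

# Example with ten samples, nine of which landed in foo:
# printf 'foo\nfoo\nfoo\nfoo\nfoo\nfoo\nfoo\nfoo\nfoo\nbar\n' | sample_histogram
```

With real data, perf record followed by perf report produces this kind of per-function breakdown directly, once the linux-perf package matching the running kernel is installed.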