Performance testing with sysbench: a tutorial. Install it with apt install -y sysbench.
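Contents
- Operating system
- CPU
- Checking clock frequency, core count, and thread count
- Preparation before benchmarking
- Benchmarking
- GPU
- Checking memory size, model, and driver installation
- Benchmarking
- Memory
- Checking size
- Throughput
- Disk
- Checking size
- I/O performance
- Switch / cluster
- Connectivity between all nodes
- Network speed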
Operating system
$ lsb_release -a
CPU
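Checking clock frequency, core count, and thread count
In the lscpu output, Socket(s) is the number of physical CPU sockets, Core(s) per socket is the number of cores on each chip, and Thread(s) per core is the number of hardware threads each core supports, which shows whether hyper-threading is in use.
The total number of logical CPUs is Socket(s) * Core(s) per socket * Thread(s) per core (2 * 28 * 2 = 112 in the output below).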
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 57 bits virtual
Byte Order: Little Endian
CPU(s): 112
On-line CPU(s) list: 0-111
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) Gold 6330 CPU @ 2.00GHz
CPU family: 6
Model: 106
Thread(s) per core: 2
Core(s) per socket: 28
Socket(s): 2
Stepping: 6
CPU max MHz: 3100.0000
CPU min MHz: 800.0000
BogoMIPS: 4000.00
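As a quick cross-check of the formula above, the three topology fields can be multiplied straight from the lscpu text. This is a minimal sketch: the function name total_logical_cpus is made up for illustration, and it assumes lscpu prints English field labels as in the output above.

```shell
# Multiply sockets * cores per socket * threads per core from
# lscpu-style text read on stdin (English labels assumed).
total_logical_cpus() {
  awk -F: '
    /Thread\(s\) per core/ { threads = $2 + 0 }
    /Core\(s\) per socket/ { cores   = $2 + 0 }
    /Socket\(s\)/          { sockets = $2 + 0 }
    END { print sockets * cores * threads }
  '
}
```

On a live system this would be called as lscpu | total_logical_cpus, and the result should match the CPU(s) line.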
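Preparation before benchmarking
The directories /sys/devices/system/cpu/cpu*/cpufreq hold the configuration and status of each CPU, where * is the CPU number (0 to N-1). Each file can be read with cat.
/sys/devices/system/cpu/cpu*/cpufreq/scaling_governor holds the type of CPU frequency governor. The following command changes it for all CPUs:
$ echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor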
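The available governors are:
- performance: sets the CPU frequency to the maximum for best performance. Suited to workloads that need fast response and high throughput, at the cost of higher power draw and heat.
- powersave: sets the CPU frequency to the minimum to save power. Suited to battery-powered devices or power-sensitive setups.
- userspace: lets a user-space program set the CPU frequency by writing to the scaling_setspeed attribute.
- ondemand: adjusts the CPU frequency dynamically with system load. The frequency rises under load for better performance and falls under light load to save power.
- conservative: similar to ondemand, but changes the frequency more gradually instead of jumping straight to the maximum. Suited to setups that need to balance performance and power.
- schedutil: adjusts the frequency based on utilization data from the CPU scheduler. It is a newer governor, generally regarded as a replacement for ondemand and conservative because it is more tightly integrated with the scheduler and has lower overhead.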
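/sys/devices/system/cpu/cpu*/cpufreq/scaling_max_freq holds the upper limit for frequency scaling (values in these files are in kHz).
/sys/devices/system/cpu/cpu*/cpufreq/scaling_min_freq holds the lower limit for frequency scaling.
/sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq holds the current CPU frequency.
You can also open a second terminal and keep watch -n 1 "cat /proc/cpuinfo | grep 'MHz'" running to monitor the current CPU frequency.
/sys/devices/system/cpu/cpu*/cpufreq/base_frequency holds a base frequency value.
When scaling_governor is performance and base_frequency exists, the CPU frequency does not climb to scaling_max_freq but holds at base_frequency. Likewise, when scaling_governor is powersave and base_frequency exists, the frequency does not fall to scaling_min_freq but holds at base_frequency.
/sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_max_freq holds the hardware maximum frequency, which corresponds to CPU max MHz in lscpu.
/sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_min_freq holds the hardware minimum frequency, which corresponds to CPU min MHz in lscpu.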
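Before starting a run it can help to confirm that every CPU really picked up the intended governor. The sketch below counts CPUs per governor by reading the scaling_governor files described above. The function name governor_summary and its optional directory argument are illustrative choices, not part of any tool.

```shell
# Print "<governor>: <number of CPUs>" for every governor in use.
# The optional argument is the sysfs CPU directory; it defaults to the
# real location and mainly exists so the function can be tried on a copy.
governor_summary() {
  cat "${1:-/sys/devices/system/cpu}"/cpu[0-9]*/cpufreq/scaling_governor 2>/dev/null |
    sort | uniq -c | awk '{ print $2 ": " $1 }'
}
```

Calling governor_summary with no argument after the tee command should print a single line for performance with the full CPU count. More than one line suggests that some CPUs kept a different governor.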
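Benchmarking
Single-core test
$ sysbench cpu --cpu-max-prime=20000 --time=30 run
Multi-core test (--threads matches the 112 logical CPUs reported by lscpu)
$ sysbench cpu --cpu-max-prime=20000 --threads=112 --time=30 run
The result reports events per second, per-event latency, and how evenly the work was spread across threads: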
CPU speed:
events per second: 50015.22
General statistics:
total time: 30.0023s
total number of events: 1500650
Latency (ms):
min: 0.98
avg: 2.24
max: 26.24
95th percentile: 2.26
sum: 3358727.27
Threads fairness:
events (avg/stddev): 13398.6607/139.68
execution time (avg/stddev): 29.9886/0.01
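To compare the single-core and multi-core runs, the events per second figure can be pulled out of each report and divided. This is a rough sketch: events_per_second and speedup are hypothetical helper names, and the parsing assumes the report layout shown above.

```shell
# Extract the "events per second" value from sysbench cpu output on stdin.
events_per_second() {
  awk -F: '/events per second/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Ratio of multi-thread to single-thread throughput, to two decimals.
# Arguments: <multi-thread events/s> <single-thread events/s>
speedup() {
  awk -v multi="$1" -v single="$2" 'BEGIN { printf "%.2f\n", multi / single }'
}
```

For example, sysbench cpu --cpu-max-prime=20000 --time=30 run | events_per_second prints just the number. A speedup well below the thread count is expected when hyper-threading is on, since two threads share one physical core.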
GPU
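Checking memory size, model, and driver installation
For NVIDIA cards, use nvidia-smi. The Name column shows the model, and the Memory-Usage column shows used / total GPU memory. If the command prints the table with a driver version, the driver is installed and loaded.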
$ nvidia-smi
Tue Feb 18 06:41:55 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.12 Driver Version: 550.90.12 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA A100 80GB PCIe Off | 00000000:52:00.0 Off | 0 |
| N/A 44C P0 62W / 300W | 1MiB / 81920MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 1 NVIDIA A100 80GB PCIe Off | 00000000:56:00.0 Off | 0 |
| N/A 45C P0 68W / 300W | 1MiB / 81920MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 2 NVIDIA A100 80GB PCIe Off | 00000000:D1:00.0 Off | 0 |
| N/A 41C P0 66W / 300W | 1MiB / 81920MiB | 2% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 3 NVIDIA A100 80GB PCIe Off | 00000000:D5:00.0 Off | 0 |
| N/A 42C P0 65W / 300W | 1MiB / 81920MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
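For scripted checks, nvidia-smi can also print selected fields as CSV through its --query-gpu and --format options, which is easier to parse than the table. The sketch below counts GPUs per model from that CSV. The helper name gpu_model_counts is hypothetical, and the field list in the comment is just one possible choice.

```shell
# Count GPUs per model from CSV produced by:
#   nvidia-smi --query-gpu=index,name,memory.total,driver_version --format=csv,noheader
# Input is read from stdin; the model name is the second field.
gpu_model_counts() {
  awk -F', ' '{ count[$2]++ } END { for (m in count) print count[m] " x " m }'
}
```

On the machine above, piping the query into gpu_model_counts would be expected to report four A100 cards on one line.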
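Single-GPU / multi-GPU performance
Use gpu_burn (see its official repository) for testing.
The dash-prefixed options documented there did not seem to work for me, so the command below passes only the duration in seconds.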
$ cd /home/hx/gpu_burn
$ ./gpu_burn 100
GPU 0: NVIDIA A100 80GB PCIe (UUID: GPU-751c75f1-b612-d705-c571-9173e4969f8b)
GPU 1: NVIDIA A100 80GB PCIe (UUID: GPU-569f62c0-3b97-4d25-a7fc-70b2a2724478)
GPU 2: NVIDIA A100 80GB PCIe (UUID: GPU-2be55775-2295-c096-411d-4f28a4b50ec4)
GPU 3: NVIDIA A100 80GB PCIe (UUID: GPU-4f6920dc-f153-925e-a803-181dd91a232f)
Initialized device 0 with 81037 MB of memory (80602 MB available, using 72542 MB of it), using FLOATS
Initialized device 3 with 81037 MB of memory (80602 MB available, using 72542 MB of it), using FLOATS
Initialized device 2 with 81037 MB of memory (80602 MB available, using 72542 MB of it), using FLOATS
Initialized device 1 with 81037 MB of memory (80602 MB available, using 72542 MB of it), using FLOATS
11.0% proc'd: 9062 (17018 Gflop/s) - 9062 (16879 Gflop/s) - 9062 (17042 Gflop/s) - 4531 (14077 Gflop/s) errors: 0 - 0 - 0 - 0 temps: 72 C - 73 C - 68 C - 81 C
Summary at: Wed Feb 19 04:23:49 AM UTC 2025
24.0% proc'd: 22655 (16849 Gflop/s) - 18124 (16789 Gflop/s) - 18124 (16967 Gflop/s) - 18124 (15897 Gflop/s) errors: 0 - 0 - 0 - 0 temps: 80 C - 79 C - 75 C - 85 C
Summary at: Wed Feb 19 04:24:02 AM UTC 2025
36.0% proc'd: 31717 (16763 Gflop/s) - 31717 (16672 Gflop/s) - 31717 (16856 Gflop/s) - 27186 (14160 Gflop/s) errors: 0 - 0 - 0 - 0 temps: 83 C - 82 C - 78 C - 84 C
Summary at: Wed Feb 19 04:24:14 AM UTC 2025
47.0% proc'd: 40779 (15897 Gflop/s) - 40779 (16435 Gflop/s) - 45310 (16754 Gflop/s) - 36248 (12967 Gflop/s) errors: 0 - 0 - 0 - 0 temps: 85 C - 84 C - 82 C - 85 C
Summary at: Wed Feb 19 04:24:25 AM UTC 2025
58.0% proc'd: 49841 (14627 Gflop/s) - 54372 (14800 Gflop/s) - 54372 (16656 Gflop/s) - 45310 (11643 Gflop/s) errors: 0 - 0 - 0 - 0 temps: 85 C - 85 C - 84 C - 84 C
Summary at: Wed Feb 19 04:24:36 AM UTC 2025
69.0% proc'd: 58903 (13925 Gflop/s) - 63434 (14173 Gflop/s) - 63434 (15784 Gflop/s) - 49841 (10948 Gflop/s) errors: 0 - 0 - 0 - 0 temps: 84 C - 84 C - 84 C - 84 C
Summary at: Wed Feb 19 04:24:47 AM UTC 2025
80.0% proc'd: 67965 (13543 Gflop/s) - 72496 (13905 Gflop/s) - 72496 (15110 Gflop/s) - 58903 (10935 Gflop/s) errors: 0 - 0 - 0 - 0 temps: 84 C - 84 C - 84 C - 84 C
Summary at: Wed Feb 19 04:24:58 AM UTC 2025
91.0% proc'd: 77027 (13270 Gflop/s) - 77027 (13868 Gflop/s) - 81558 (14820 Gflop/s) - 63434 (10594 Gflop/s) errors: 0 - 0 - 0 - 0 temps: 84 C - 84 C - 84 C - 84 C
Summary at: Wed Feb 19 04:25:09 AM UTC 2025
100.0% proc'd: 86089 (13270 Gflop/s) - 86089 (13676 Gflop/s) - 90620 (14577 Gflop/s) - 72496 (10184 Gflop/s) errors: 0 - 0 - 0 - 0 temps: 84 C - 84 C - 84 C - 84 C
Killing processes.. done
Tested 4 GPUs:
GPU 0: OK
GPU 1: OK
GPU 2: OK
GPU 3: OK
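In the run above the reported Gflop/s falls as the temperatures settle around 84 C, which may point to thermal or power throttling. To compare cards it helps to extract the per-GPU rate from a progress line. This is a sketch only: gflops_per_gpu is a made-up name, it assumes the output was saved as plain text with one progress report per line as shown, and it assumes the values appear in device order.

```shell
# Read one gpu_burn progress line on stdin and print the Gflop/s
# figure for each GPU, numbered in the order they appear on the line.
gflops_per_gpu() {
  grep -o '([0-9]* Gflop/s)' | tr -d '()' |
    awk '{ print "GPU " (NR - 1) ": " $1 " Gflop/s" }'
}
```

For example, if the output was saved to a file, grep "proc'd" on that file piped through tail -n 1 and then gflops_per_gpu prints the final rate of each card.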
Memory
Checking size
$ sudo lshw -C memory
*-memory
description: System Memory
physical id: 4f
slot: System board or motherboard
size: 256GiB
$ free -h
total used free shared buff/cache available
Mem: 251Gi 1.0Gi 247Gi 2.0Mi 2.9Gi 249Gi
Swap: 8.0Gi 0B 8.0Gi
Throughput
Multi-threaded random write throughput
$ sysbench memory --memory-block-size=1M --memory-total-size=200G --threads=50 --memory-access-mode=rnd run
Multi-threaded random read throughput
$ sysbench memory --memory-block-size=1M --memory-total-size=200G --memory-access-mode=rnd --threads=50 --memory-oper=read run
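Memory throughput usually depends on the thread count, so it can be useful to repeat the same test over several values instead of only 50. The sketch below wraps the commands above in a loop. The function name mem_sweep and the SYSBENCH override variable are illustrative assumptions. Setting SYSBENCH=echo only prints the commands without running them.

```shell
# Run the sysbench memory test once per thread count.
# Usage: mem_sweep <read|write> <threads>...
# SYSBENCH may be overridden (for example SYSBENCH=echo) for a dry run.
mem_sweep() (
  oper="$1"; shift
  for t in "$@"; do
    "${SYSBENCH:-sysbench}" memory --memory-block-size=1M --memory-total-size=200G \
      --memory-access-mode=rnd --memory-oper="$oper" --threads="$t" run
  done
)
```

For example, mem_sweep read 1 14 28 56 112 covers one thread up to all logical CPUs of the machine above.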
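Disk
Checking size and partition layout
Tools such as lsblk, fdisk, and df -h report capacity in binary units (1 GiB = 1024 MiB), while drive manufacturers use decimal units (1 GB = 1000 MB), so the reported capacity looks noticeably smaller than the labeled one. parted prints sizes in decimal units by default, so its figure matches the label.
In lsblk, a ROTA value of 1 means a rotational disk (HDD) and 0 means an SSD.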
$ lsblk --output NAME,ROTA,SIZE,TYPE,RM,RO,MOUNTPOINTS
nvme0n1 0 3.5T disk 0 0
├─nvme0n1p1 0 1G part 0 0 /boot/efi
├─nvme0n1p2 0 2G part 0 0 /boot
└─nvme0n1p3 0 3.5T part 0 0
└─ubuntu--vg-ubuntu--lv 0 100G lvm 0 0 /
$ sudo parted /dev/nvme0n1 print
[sudo] password for hx:
Model: SAMSUNG MZQL23T8HCLS-00A07 (nvme)
Disk /dev/nvme0n1: 3841GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:
Number Start End Size File system Name Flags
1 1049kB 1128MB 1127MB fat32 boot, esp
2 1128MB 3276MB 2147MB ext4
3 3276MB 3841GB 3837GB
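The gap between the two tools is pure unit conversion: 3841 GB is 3841 * 10^9 bytes, and dividing by 1024^4 gives about 3.5 TiB, which is the 3.5T that lsblk prints. A small sketch of that arithmetic, with gb_to_tib as a hypothetical helper name:

```shell
# Convert a decimal-gigabyte figure (as printed by parted) to TiB,
# rounded to one decimal place like the lsblk SIZE column.
gb_to_tib() {
  awk -v gb="$1" 'BEGIN { printf "%.1f\n", gb * 1e9 / (1024 ^ 4) }'
}
```

Here gb_to_tib 3841 prints 3.5, so the drive is not missing any capacity.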
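I/O performance
To benchmark disk reads and writes, sysbench first needs prepare to create the test data, then run to execute the test, and finally cleanup to remove the test data.
Multi-threaded random write test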
$ sysbench fileio --file-total-size=25G --file-test-mode=rndwr --threads=10 --file-num=100 prepare
$ sysbench fileio --file-total-size=25G --file-test-mode=rndwr --threads=10 --file-num=100 --report-interval=1 run
$ sysbench fileio --file-total-size=25G --file-test-mode=rndwr --threads=10 --file-num=100 cleanup
Multi-threaded random read test
$ sysbench fileio --file-total-size=25G --file-test-mode=rndrd --threads=10 --file-num=100 prepare
$ sysbench fileio --file-total-size=25G --file-test-mode=rndrd --threads=10 --file-num=100 --report-interval=1 run
$ sysbench fileio --file-total-size=25G --file-test-mode=rndrd --threads=10 --file-num=100 cleanup
Multi-threaded mixed random read/write, read:write ratio 6:4 (--file-rw-ratio=1.5)
$ sysbench fileio --file-total-size=25G --file-test-mode=rndrw --threads=10 --file-num=100 --file-rw-ratio=1.5 prepare
$ sysbench fileio --file-total-size=25G --file-test-mode=rndrw --threads=10 --file-num=100 --file-rw-ratio=1.5 --report-interval=1 run
$ sysbench fileio --file-total-size=25G --file-test-mode=rndrw --threads=10 --file-num=100 --file-rw-ratio=1.5 cleanup
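Since the three steps always share the same options, they can be wrapped in one function so that cleanup is never forgotten, even when run fails. This is a sketch under stated assumptions: fileio_cycle and the SYSBENCH override variable are made-up names, and the sizes are copied from the commands above. sysbench creates its test files in the current directory, so change into the filesystem you want to measure first.

```shell
# Run prepare, run, and cleanup for one fileio test mode.
# Usage: fileio_cycle <rndwr|rndrd|rndrw> [extra sysbench options...]
# SYSBENCH may be overridden (for example SYSBENCH=echo) for a dry run.
fileio_cycle() (
  mode="$1"; shift
  bench="${SYSBENCH:-sysbench}"
  # Options shared by all three steps, followed by any extra options.
  set -- fileio --file-total-size=25G --file-num=100 --threads=10 \
    --file-test-mode="$mode" "$@"
  "$bench" "$@" prepare || exit 1
  "$bench" "$@" --report-interval=1 run
  status=$?
  "$bench" "$@" cleanup   # remove the test files even after a failed run
  exit "$status"
)
```

For example, fileio_cycle rndrw --file-rw-ratio=1.5 reproduces the mixed test above, and SYSBENCH=echo fileio_cycle rndwr shows the three commands without touching the disk.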