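1. Introduction
Arthas is an online monitoring and diagnostic tool. It gives a global, real-time view of an application's load, memory, GC, and thread state, and it can diagnose business problems without modifying application code: inspecting method arguments, return values, and exceptions, measuring method execution time, viewing class-loading information, and so on. This greatly speeds up troubleshooting in production.
2. What can Arthas (阿尔萨斯) do for you?
Arthas is a Java diagnostic tool open-sourced by Alibaba and is popular among developers.
When you are stuck on problems like the following, Arthas can help:
- Which jar was this class loaded from? Why am I getting all kinds of class-related Exceptions?
- Why is the code I changed not being executed? Did I forget to commit? Am I on the wrong branch?
- I cannot debug in production. Is adding logs and redeploying really the only option?
- Data processing fails for one particular user in production, but I cannot debug there and cannot reproduce it offline!
- Is there a global view of how the system is running?
- Is there a way to monitor the real-time state of the JVM?
- How can I quickly locate application hot spots and generate a flame graph?
- How can I look up instances of a class directly inside the JVM?
Arthas supports JDK 6+ and runs on Linux, Mac, and Windows. It uses an interactive command-line mode with rich Tab auto-completion, which makes locating and diagnosing problems more convenient.
3. Environment used in this walkthrough
Server: Alibaba Cloud ECS, Ubuntu 18.04 64-bit
Local machine: Windows 11 Home (Chinese edition)
Client used to connect to the server: Windows PowerShell (x86)
4. Quick start
4.1 Download and start math-game
4.1.1 Preparation
1. Log in to the ECS server (window 1)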
ssh root@47.105.*.74
# Enter the login password
2. Create a directory to hold the math-game jar
mkdir math-game
cd math-game
4.1.2 Download and start math-game
curl -O https://arthas.aliyun.com/math-game.jar
java -jar math-game.jar
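math-game is a simple program that generates a random number every second, computes its prime factorization, and prints the result. Its decompiled source code is shown in section 4.5.
4.2 Download and start Arthas
4.2.1 Preparation
1. Open a new window and log in to the ECS server (window 2)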
ssh root@47.105.*.74
# Enter the login password
2. Create a directory to hold the Arthas jar
mkdir arthas
cd arthas
4.2.2 Download and start Arthas
curl -O https://arthas.aliyun.com/arthas-boot.jar
java -jar arthas-boot.jar
Output:
root@iZm5eetszs07500os8erolZ:~/data/arthas# java -jar arthas-boot.jar
[INFO] arthas-boot version: 3.6.5
[INFO] Found existing java process, please choose one and input the serial number of the process, eg : 1. Then hit ENTER.
* [1]: 24865 demo-0.0.1-SNAPSHOT.jar
[2]: 25771 math-game.jar
The math-game process has PID 25771 and is listed as number 2.
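To make the process listing less of a black box, here is a minimal sketch of enumerating local JVMs with the JDK Attach API (com.sun.tools.attach). It is only an illustration of one way to obtain such a list, not a description of how arthas-boot does it. The class name JvmLister and the row format are assumptions made for this example. On JDK 8 the Attach API lives in tools.jar, which must be on the classpath; on JDK 9+ it is in the jdk.attach module.

```java
import com.sun.tools.attach.VirtualMachine;
import com.sun.tools.attach.VirtualMachineDescriptor;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Arthas: lists local JVMs through the JDK Attach API.
final class JvmLister {

    /** Formats one menu row in a "[n]: pid name" shape similar to the listing above. */
    static String formatRow(int index, String pid, String displayName) {
        return "[" + index + "]: " + pid + " " + displayName;
    }

    /** Returns one formatted row per JVM that the current user can see. */
    static List<String> listJvms() {
        List<String> rows = new ArrayList<>();
        int index = 1;
        for (VirtualMachineDescriptor vm : VirtualMachine.list()) {
            rows.add(formatRow(index++, vm.id(), vm.displayName()));
        }
        return rows;
    }

    public static void main(String[] args) {
        listJvms().forEach(System.out::println);
    }
}
```

The list normally includes the JVM that runs this code, so the numbering will not match the arthas-boot menu exactly.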
4.2.3 Select the math-game Java process
math-game is the second process, so type 2 and press Enter. Arthas attaches to the target process and prints the following log:
root@iZm5eetszs07500os8erolZ:~/data/arthas# java -jar arthas-boot.jar
[INFO] arthas-boot version: 3.6.5
[INFO] Found existing java process, please choose one and input the serial number of the process, eg : 1. Then hit ENTER.
* [1]: 24865 demo-0.0.1-SNAPSHOT.jar
[2]: 25771 math-game.jar
2
[INFO] arthas home: /root/.arthas/lib/3.6.5/arthas
[INFO] Try to attach process 25771
[INFO] Attach process 25771 success.
[INFO] arthas-client connect 127.0.0.1 3658
,---. ,------. ,--------.,--. ,--. ,---. ,---.
/ O \ | .--. ''--. .--'| '--' | / O \ ' .-'
| .-. || '--'.' | | | .--. || .-. |`. `-.
| | | || |\ \ | | | | | || | | |.-' |
`--' `--'`--' '--' `--' `--' `--'`--' `--'`-----'
wiki https://arthas.aliyun.com/doc
tutorials https://arthas.aliyun.com/doc/arthas-tutorials.html
version 3.6.5
main_class
pid 25771
time 2022-08-31 15:18:21
[arthas@25771]$
4.3 View the dashboard
Run the following in window 2:
Type dashboard and press Enter to display information about the current process. Press Ctrl+C to stop it.
[arthas@25771]$ dashboard
ID NAME GROUP PRIORITY STATE %CPU DELTA_TIME TIME INTERRUPTE DAEMON
-1 C2 CompilerThread0 - -1 - 0.0 0.000 0:0.537 false true
-1 C1 CompilerThread1 - -1 - 0.0 0.000 0:0.536 false true
-1 VM Periodic Task Thread - -1 - 0.0 0.000 0:0.314 false true
1 main main 5 TIMED_WAIT 0.0 0.000 0:0.237 false false
-1 VM Thread - -1 - 0.0 0.000 0:0.091 false true
20 arthas-NettyHttpTelnetBootstrap-3- system 5 RUNNABLE 0.0 0.000 0:0.077 false true
13 arthas-NettyHttpTelnetBootstrap-3- system 5 RUNNABLE 0.0 0.000 0:0.028 false true
8 Attach Listener system 9 RUNNABLE 0.0 0.000 0:0.025 false true
21 arthas-command-execute system 5 TIMED_WAIT 0.0 0.000 0:0.003 false true
2 Reference Handler system 10 WAITING 0.0 0.000 0:0.001 false true
3 Finalizer system 8 WAITING 0.0 0.000 0:0.001 false true
14 arthas-NettyWebsocketTtyBootstrap- system 5 RUNNABLE 0.0 0.000 0:0.001 false true
15 arthas-NettyWebsocketTtyBootstrap- system 5 RUNNABLE 0.0 0.000 0:0.001 false true
16 arthas-shell-server system 9 TIMED_WAIT 0.0 0.000 0:0.001 false true
22 Timer-for-arthas-dashboard-0133c8d system 5 RUNNABLE 0.0 0.000 0:0.001 false true
4 Signal Dispatcher system 9 RUNNABLE 0.0 0.000 0:0.000 false true
10 arthas-timer system 9 WAITING 0.0 0.000 0:0.000 false true
17 arthas-session-manager system 9 TIMED_WAIT 0.0 0.000 0:0.000 false true
18 arthas-UserStat system 9 WAITING 0.0 0.000 0:0.000 false true
-1 Service Thread - -1 - 0.0 0.000 0:0.000 false true
Memory used total max usage GC
heap 19M 30M 483M 3.98% gc.copy.count 8
eden_space 528K 8704K 136576K 0.39% gc.copy.time(ms) 54
survivor_space 1M 1M 16M 6.39% gc.marksweepcompact.count 0
tenured_gen 17M 21M 333M 5.29% gc.marksweepcompact.time(ms) 0
nonheap 27M 28M -1 97.50%
code_cache 4M 4M 240M 2.05%
metaspace 20M 20M -1 97.34%
compressed_class_space 2M 2M 1024M 0.24%
direct 0K 0K - 0.00%
mapped 0K 0K - 0.00%
Runtime
os.name Linux
os.version 4.15.0-66-generic
java.version 1.8.0_312
java.home /usr/lib/jvm/java-8-openjdk-amd64/jre
systemload.average 0.04
processors 1
timestamp/uptime Wed Aug 31 15:20:47 CST 2022/511s
ID NAME GROUP PRIORITY STATE %CPU DELTA_TIME TIME INTERRUPTE DAEMON
22 Timer-for-arthas-dashboard-0133c8d system 5 RUNNABLE 0.73 0.036 0:0.038 false true
-1 C1 CompilerThread1 - -1 - 0.56 0.028 0:0.564 false true
-1 VM Thread - -1 - 0.51 0.025 0:0.116 false true
20 arthas-NettyHttpTelnetBootstrap-3- system 5 RUNNABLE 0.33 0.016 0:0.094 false true
-1 C2 CompilerThread0 - -1 - 0.2 0.010 0:0.547 false true
-1 VM Periodic Task Thread - -1 - 0.06 0.002 0:0.317 false true
1 main main 5 TIMED_WAIT 0.04 0.001 0:0.239 false false
3 Finalizer system 8 WAITING 0.0 0.000 0:0.001 false true
2 Reference Handler system 10 WAITING 0.0 0.000 0:0.001 false true
4 Signal Dispatcher system 9 RUNNABLE 0.0 0.000 0:0.000 false true
8 Attach Listener system 9 RUNNABLE 0.0 0.000 0:0.025 false true
10 arthas-timer system 9 WAITING 0.0 0.000 0:0.000 false true
13 arthas-NettyHttpTelnetBootstrap-3- system 5 RUNNABLE 0.0 0.000 0:0.028 false true
14 arthas-NettyWebsocketTtyBootstrap- system 5 RUNNABLE 0.0 0.000 0:0.001 false true
15 arthas-NettyWebsocketTtyBootstrap- system 5 RUNNABLE 0.0 0.000 0:0.001 false true
16 arthas-shell-server system 9 TIMED_WAIT 0.0 0.000 0:0.001 false true
17 arthas-session-manager system 9 TIMED_WAIT 0.0 0.000 0:0.000 false true
18 arthas-UserStat system 9 WAITING 0.0 0.000 0:0.000 false true
21 arthas-command-execute system 5 TIMED_WAIT 0.0 0.000 0:0.003 false true
-1 Service Thread - -1 - 0.0 0.000 0:0.000 false true
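Much of what the Memory and Runtime panels show can also be read from inside a JVM through the standard java.lang.management MXBeans. The sketch below prints a few comparable rows for the JVM it runs in. It is an illustration under assumptions, not the Arthas implementation: the class name MiniDashboard and the row layout are made up, and the rule "usage is used/max, or used/total when max is -1" is only inferred from the nonheap and metaspace rows above.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import java.lang.management.OperatingSystemMXBean;
import java.util.Locale;

// Hypothetical helper: prints a few dashboard-like rows for the current JVM.
final class MiniDashboard {

    /** used/max when max is defined, otherwise used/committed (assumed fallback). */
    static double usagePercent(long used, long committed, long max) {
        long base = max > 0 ? max : committed;
        return base > 0 ? 100.0 * used / base : 0.0;
    }

    /** One memory row: name, used, total (committed), max, and usage percentage. */
    static String memoryRow(String name, MemoryUsage u) {
        return String.format(Locale.ROOT, "%-8s used=%dK total=%dK max=%s usage=%.2f%%",
                name, u.getUsed() / 1024, u.getCommitted() / 1024,
                u.getMax() < 0 ? "-1" : (u.getMax() / 1024) + "K",
                usagePercent(u.getUsed(), u.getCommitted(), u.getMax()));
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        System.out.println(memoryRow("heap", memory.getHeapMemoryUsage()));
        System.out.println(memoryRow("nonheap", memory.getNonHeapMemoryUsage()));
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println("gc." + gc.getName() + " count=" + gc.getCollectionCount()
                    + " time(ms)=" + gc.getCollectionTime());
        }
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        System.out.println("systemload.average " + os.getSystemLoadAverage());
        System.out.println("processors " + os.getAvailableProcessors());
        System.out.println("uptime " + ManagementFactory.getRuntimeMXBean().getUptime() / 1000 + "s");
    }
}
```

Unlike this sketch, the dashboard reads these values from the attached target process and refreshes them periodically.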
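4.4 Use the thread command to find the Main Class of the math-game process
Run the following in window 2:
thread 1 prints the stack of the thread with ID 1, which is usually the thread running the main function.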
thread 1 | grep 'main('
Output:
[arthas@25771]$ thread 1 | grep 'main('
at demo.MathGame.main(MathGame.java:17)
[arthas@25771]$
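The idea behind thread 1 | grep 'main(' can be reproduced in plain Java: take the stack of the thread whose ID is 1 and keep only the frames that contain main(. The following is a rough sketch that inspects its own JVM rather than a remote one. MainFrameFinder and grepFrames are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: filters stack frames by substring, similar in spirit to `grep`.
final class MainFrameFinder {

    /** Returns the frames, rendered as "at ...", whose text contains the given substring. */
    static List<String> grepFrames(StackTraceElement[] stack, String needle) {
        List<String> hits = new ArrayList<>();
        for (StackTraceElement frame : stack) {
            String line = "at " + frame;
            if (line.contains(needle)) {
                hits.add(line);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Look up the thread with ID 1 among all live threads and filter its stack.
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            if (entry.getKey().getId() == 1) {
                grepFrames(entry.getValue(), "main(").forEach(System.out::println);
            }
        }
    }
}
```

The frame that survives the filter names the class declaring main, which is how the Main Class demo.MathGame is read off the output above.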
4.5 Use jad to decompile the Main Class
Run the following in window 2:
jad demo.MathGame
Output:
$ jad demo.MathGame
ClassLoader:
+-sun.misc.Launcher$AppClassLoader@3d4eac69
+-sun.misc.Launcher$ExtClassLoader@66350f69
Location:
/tmp/math-game.jar
/*
 * Decompiled with CFR 0_132.
 */
package demo;

import java.io.PrintStream;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;
import java.util.concurrent.TimeUnit;

public class MathGame {
    private static Random random = new Random();
    private int illegalArgumentCount = 0;

    public static void main(String[] args) throws InterruptedException {
        MathGame game = new MathGame();
        do {
            game.run();
            TimeUnit.SECONDS.sleep(1L);
        } while (true);
    }

    public void run() throws InterruptedException {
        try {
            int number = random.nextInt();
            List<Integer> primeFactors = this.primeFactors(number);
            MathGame.print(number, primeFactors);
        }
        catch (Exception e) {
            System.out.println(String.format("illegalArgumentCount:%3d, ", this.illegalArgumentCount) + e.getMessage());
        }
    }

    public static void print(int number, List<Integer> primeFactors) {
        StringBuffer sb = new StringBuffer("" + number + "=");
        Iterator<Integer> iterator = primeFactors.iterator();
        while (iterator.hasNext()) {
            int factor = iterator.next();
            sb.append(factor).append('*');
        }
        if (sb.charAt(sb.length() - 1) == '*') {
            sb.deleteCharAt(sb.length() - 1);
        }
        System.out.println(sb);
    }

    public List<Integer> primeFactors(int number) {
        if (number < 2) {
            ++this.illegalArgumentCount;
            throw new IllegalArgumentException("number is: " + number + ", need >= 2");
        }
        ArrayList<Integer> result = new ArrayList<Integer>();
        int i = 2;
        while (i <= number) {
            if (number % i == 0) {
                result.add(i);
                number /= i;
                i = 2;
                continue;
            }
            ++i;
        }
        return result;
    }
}
Affect(row-cnt:1) cost in 970 ms.
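The ClassLoader and Location lines at the top of the jad output answer the question "which jar was this class loaded from?". From inside an application, similar information can be obtained with standard reflection calls, as in this small sketch. ClassOrigin and its method names are hypothetical, and the code describes classes in its own JVM, not in an attached one.

```java
import java.security.CodeSource;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reports the class loader chain and the load location of a class.
final class ClassOrigin {

    /** Class loaders from the defining loader up through its parents (empty for bootstrap classes). */
    static List<String> loaderChain(Class<?> type) {
        List<String> chain = new ArrayList<>();
        for (ClassLoader loader = type.getClassLoader(); loader != null; loader = loader.getParent()) {
            chain.add(loader.toString());
        }
        return chain;
    }

    /** Location the class was loaded from, or "unknown" when the JVM reports no code source. */
    static String location(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "unknown";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        System.out.println("ClassLoader:");
        loaderChain(ClassOrigin.class).forEach(loader -> System.out.println("+-" + loader));
        System.out.println("Location:");
        System.out.println(location(ClassOrigin.class));
    }
}
```

For a class packaged in a jar, the location is typically the jar's URL, which corresponds to the Location entry printed by jad.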
4.6 watch
Use the watch command to inspect the return value of the demo.MathGame#primeFactors function:
watch demo.MathGame primeFactors returnObj
$ watch demo.MathGame primeFactors returnObj
Press Ctrl+C to abort.
Affect(class-cnt:1 , method-cnt:1) cost in 107 ms.
ts=2018-11-28 19:22:30; [cost=1.715367ms] result=null
ts=2018-11-28 19:22:31; [cost=0.185203ms] result=null
ts=2018-11-28 19:22:32; [cost=19.012416ms] result=@ArrayList[
@Integer[5],
@Integer[47],
@Integer[2675531],
]
ts=2018-11-28 19:22:33; [cost=0.311395ms] result=@ArrayList[
@Integer[2],
@Integer[5],
@Integer[317],
@Integer[503],
@Integer[887],
]
ts=2018-11-28 19:22:34; [cost=10.136007ms] result=@ArrayList[
@Integer[2],
@Integer[2],
@Integer[3],
@Integer[3],
@Integer[31],
@Integer[717593],
]
ts=2018-11-28 19:22:35; [cost=29.969732ms] result=@ArrayList[
@Integer[5],
@Integer[29],
@Integer[7651739],
]
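The result=null lines are worth a second look. Judging from the decompiled source in section 4.5, primeFactors throws IllegalArgumentException whenever the random number is less than 2, and as far as I understand watch by default reports when the method finishes, whether it returns or throws, so those calls have no return value to show. The sketch below copies the algorithm (without the counter field) to illustrate both cases. PrimeFactorsDemo and observe are hypothetical names for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: same factorization loop as the decompiled demo.MathGame#primeFactors.
final class PrimeFactorsDemo {

    static List<Integer> primeFactors(int number) {
        if (number < 2) {
            throw new IllegalArgumentException("number is: " + number + ", need >= 2");
        }
        List<Integer> result = new ArrayList<>();
        int i = 2;
        while (i <= number) {
            if (number % i == 0) {
                result.add(i);   // i divides number: record it and restart from 2
                number /= i;
                i = 2;
                continue;
            }
            ++i;
        }
        return result;
    }

    /** Mimics what an observer sees: the returned list, or null when the call threw. */
    static String observe(int number) {
        List<Integer> returnObj = null;
        try {
            returnObj = primeFactors(number);
        } catch (IllegalArgumentException e) {
            // The method threw, so there is no return value to report.
        }
        return "result=" + returnObj;
    }

    public static void main(String[] args) {
        System.out.println(observe(12)); // result=[2, 2, 3]
        System.out.println(observe(-7)); // result=null
    }
}
```

Since random.nextInt() produces negative values about half of the time, a fair share of result=null lines in the watch output is expected.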
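4.7 Exit Arthas
To leave only the current connection, use the quit or exit command. The Arthas instance attached to the target process keeps running and its port stays open, so the next connection can be made directly. To shut Arthas down completely, run the stop command.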
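One way to see the difference between quit and stop is to check whether the Arthas port still accepts connections. The sketch below tries a TCP connection to 127.0.0.1:3658, the address shown in the attach log above (arthas-client connect 127.0.0.1 3658). ArthasPortCheck is a hypothetical helper, and the port is an assumption taken from that log; adjust it if your setup differs.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks whether something is listening on a TCP port.
final class ArthasPortCheck {

    /** Returns true if a TCP connection to host:port succeeds within the timeout. */
    static boolean isPortOpen(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false; // refused, unreachable, or timed out
        }
    }

    public static void main(String[] args) {
        // 3658 is the port from the attach log above (assumed unchanged).
        System.out.println("Arthas port open: " + isPortOpen("127.0.0.1", 3658, 500));
    }
}
```

Run on the server, this would be expected to report true after quit or exit and false after stop, provided no other program is using that port.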
References:
Official Arthas documentation