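Problem description
While load testing a server, CPU usage stayed high. The goal is to find the thread responsible.
Environment
Kubernetes (k8s), Java service
Steps
- First, log in to the VM and list the namespaces to see where the pod lives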
[root@kube-node01 ~]# kubectl get namespace
NAME STATUS AGE
default Active 102d
yy Active 34d
- Find the problem pod (if you are not familiar with the basic k8s concepts, learn those first)
[root@kube-node01 ~]# kubectl get pods --namespace=yy | grep 'nihao'
nihao-hessian-6dc768ffdb-f7smm 2/2 Running 0 40m
- Open a shell inside the pod. The general form of the command is:
kubectl exec -it $POD_NAME --namespace=$NAMESPACE -- /bin/bash
[root@kube-node01 ~]# kubectl exec -it nihao-hessian-6dc768ffdb-f7smm --namespace=yy -- /bin/bash
Defaulting container name to nihao-hessian.
Use 'kubectl describe pod/nihao-hessian-6dc768ffdb-f7smm -n yy' to see all of the containers in this pod.
[root@nihao-hessian-6dc768ffdb-f7smm /]#
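The "Defaulting container name" message appears because the pod runs more than one container (2/2 in the listing above) and none was named on the command line. Passing -c selects the container explicitly. The following is a minimal sketch: pod_shell is a hypothetical wrapper name, and the container name nihao-hessian is taken from the message above.

```shell
# pod_shell: hypothetical wrapper around kubectl exec, for illustration only.
# $1 = pod name, $2 = namespace, $3 = container name
pod_shell() {
  kubectl exec -it "$1" --namespace="$2" -c "$3" -- /bin/bash
}

# Example, using the names from the session above:
# pod_shell nihao-hessian-6dc768ffdb-f7smm yy nihao-hessian
```

Naming the container also avoids landing in a sidecar by accident when the default container is not the Java service.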
- Find the service process with top; here the java process (PID 30) is at 201.3% CPU
[root@nihao-hessian-6dc768ffdb-f7smm /]# top
top - 19:38:31 up 100 days, 2:45, 0 users, load average: 5.18, 3.37, 2.64
Tasks: 5 total, 1 running, 4 sleeping, 0 stopped, 0 zombie
%Cpu(s): 6.9 us, 0.7 sy, 0.0 ni, 92.1 id, 0.0 wa, 0.0 hi, 0.2 si, 0.0 st
KiB Mem : 13174908+total, 862080 free, 75972832 used, 54914184 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 46853388 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
30 root 20 0 18.8g 2.8g 15984 S 201.3 2.2 89:26.16 java
1 root 20 0 15264 1700 1328 S 0.0 0.0 0:00.04 entrypoint.sh
15 root 20 0 15260 1764 1360 S 0.0 0.0 0:00.00 startup.sh
419 root 20 0 15392 2196 1636 S 0.0 0.0 0:00.02 bash
437 root 20 0 59576 2116 1508 R 0.0 0.0 0:00.03 top
- Run top -Hp <pid> (here, top -Hp 30) to list the threads of that process and find the ones using the most CPU
top - 19:52:56 up 100 days, 2:59, 0 users, load average: 5.08, 2.12, 2.08
Threads: 241 total, 28 running, 213 sleeping, 0 stopped, 0 zombie
%Cpu(s): 5.5 us, 0.8 sy, 0.0 ni, 93.5 id, 0.0 wa, 0.0 hi, 0.2 si, 0.0 st
KiB Mem : 13174908+total, 789344 free, 75998560 used, 54961192 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 46816480 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
294 root 20 0 18.9g 2.8g 15988 S 12.1 2.2 5:23.91 java
361 root 20 0 18.9g 2.8g 15988 S 12.1 2.2 5:18.12 java
301 root 20 0 18.9g 2.8g 15988 S 10.8 2.2 5:19.84 java
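When the interactive view is inconvenient, for example when capturing the result from a script, top can run once in batch mode and the thread table can be filtered with awk. The sketch below assumes procps top with its default column layout (PID in column 1, %CPU in column 9), as in the output above. busiest_threads is a hypothetical helper name.

```shell
# busiest_threads: hypothetical helper, for illustration only.
# Reads the output of "top -H -b -n 1 -p <pid>" on stdin and prints
# "TID %CPU" for the N busiest threads (default 3).
# Assumes the default procps layout: PID is column 1, %CPU is column 9.
busiest_threads() {
  n="${1:-3}"
  awk '
    seen && NF >= 12 { print $1, $9 }  # thread rows below the column header
    $1 == "PID"      { seen = 1 }      # the column header marks the start of the table
  ' | sort -k2,2nr | head -n "$n"
}

# Example, run inside the pod:
# top -H -b -n 1 -p 30 | busiest_threads 3
```

A single batch sample can differ from the numbers the interactive view settles on, so treat the result as a rough snapshot and repeat it a few times before picking a thread.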
- Convert the thread ID to hexadecimal, because jstack reports the native thread ID in hex (the nid field)
[root@nihao-hessian-6dc768ffdb-f7smm /]# printf "%x\n" 294
126
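The conversion can be wrapped in a small helper so that the result is ready to paste into a grep. This is a minimal sketch: the name tid_to_nid is made up for illustration, and the helper prepends the 0x prefix that jstack uses in its nid= field.

```shell
# tid_to_nid: hypothetical helper, not part of any tool.
# Converts a decimal thread ID (as shown by top -Hp) into the hex form
# that jstack prints after "nid=".
tid_to_nid() {
  printf '0x%x\n' "$1"
}

tid_to_nid 294   # prints 0x126
```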
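- Look up that thread in the jstack output by its nid (the number at the start of each output line is the byte offset printed by grep's -b option)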
[root@nihao-hessian-6dc768ffdb-f7smm /]# jstack 30 | grep 0x126 --color -ab30
165950:"http-nio-8080-exec-1" #204 daemon prio=5 os_prio=0 tid=0x00007f88c61fe000 nid=0x126 runnable [0x00007f87a4837000]
166065- java.lang.Thread.State: RUNNABLE
166101- at java.lang.Class.getDeclaredFields0(Native Method)
166155- at java.lang.Class.privateGetDeclaredFields(Class.java:2583)
166217- at java.lang.Class.getDeclaredField(Class.java:2068)
166271- at jdk.nashorn.internal.runtime.Context$ContextCodeInstaller$1.run(Context.java:209)
166357- at jdk.nashorn.internal.runtime.Context$ContextCodeInstaller$1.run(Context.java:204)
166443- at java.security.AccessController.doPrivileged(Native Method)
166506- at jdk.nashorn.internal.runtime.Context$ContextCodeInstaller.initialize(Context.java:204)
166597- at jdk.nashorn.internal.codegen.CompilationPhase$InstallPhase.transform(CompilationPhase.java:508)
166697- at jdk.nashorn.internal.codegen.CompilationPhase.apply(CompilationPhase.java:624)
166780- at jdk.nashorn.internal.codegen.Compiler.compile(Compiler.java:655)
166849- at jdk.nashorn.internal.runtime.Context.compile(Context.java:1317)
166917- - locked <0x00000000e9c00368> (a jdk.nashorn.internal.runtime.Context)
166989- at jdk.nashorn.internal.runtime.Context.compileScript(Context.java:1251)
167063- at jdk.nashorn.internal.runtime.Context.compileScript(Context.java:627)
167136- at jdk.nashorn.api.scripting.NashornScriptEngine.compileImpl(NashornScriptEngine.java:535)
167228- at jdk.nashorn.api.scripting.NashornScriptEngine.compileImpl(NashornScriptEngine.java:524)
167320- at jdk.nashorn.api.scripting.NashornScriptEngine.evalImpl(NashornScriptEngine.java:402)
167409- at jdk.nashorn.api.scripting.NashornScriptEngine.eval(NashornScriptEngine.java:155)
167494- at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:264)
167568- at com.yeepay.yjf.core.aop.CacheHanlerAop.getSelDefKey(CacheHanlerAop.java:104)
167649- at com.yeepay.yjf.core.aop.CacheHanlerAop.cacheAop(CacheHanlerAop.java:52)
167725- at sun.reflect.GeneratedMethodAccessor274.invoke(Unknown Source)
167791- at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
167881- at java.lang.reflect.Method.invoke(Method.java:498)
167934- at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:629)
168056- at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:618)
168165- at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:70)
168257- at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:168)
168367- at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)
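The conversion and the lookup can also be combined so that only the stack of the hot thread is printed, rather than a fixed number of context lines. This is a minimal sketch, assuming the usual jstack layout in which thread entries are separated by blank lines. thread_stack is a hypothetical helper name.

```shell
# thread_stack: hypothetical helper, for illustration only.
# Reads a jstack thread dump on stdin and prints only the entry whose
# native thread ID (nid) matches the given decimal thread ID.
thread_stack() {
  # The trailing space keeps 0x126 from also matching 0x1260.
  nid="$(printf 'nid=0x%x ' "$1")"
  # RS = "" switches awk to paragraph mode: each blank-line-separated
  # thread entry becomes one record.
  awk -v nid="$nid" 'BEGIN { RS = "" } index($0, nid)'
}

# Example, run inside the pod:
# jstack 30 | thread_stack 294
```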
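- Summary:
The jstack output above shows where the CPU time is going: thread http-nio-8080-exec-1 is RUNNABLE inside NashornScriptEngine.eval, compiling a script, and the call comes from CacheHanlerAop.getSelDefKey (CacheHanlerAop.java:104). That is the code to examine when fixing the problem.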