根据日志确定发生OOM的原因和区域,以下几个内存区域都可能发生OOM,先找到打印出的OOM错误日志和dump文件(linux系统的OOM不会产生),错误日志一般可以通过
1.堆heap
错误提示有:
java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: unable to create new native thread
java.lang.OutOfMemoryError:GC overhead limit exceeded
这三种情况产生的原因都是因为JVM设置的堆内存较小,或者代码中产生了内存泄漏导致耗光了内存,定位步骤可以参照网上的攻略,步骤基本相同
1.先查看设置的xms和xmx大小,如果太小增大该参数;
2.如果增大了参数还是会出现此问题,可能是程序中存在内存泄漏,通过工具(jvisualvm或jconsole或mat或jmap等)进行定位:
ps -ef|grep java 查看所有的java进程,找到出现问题的java进程,例如此pid为45338
jmap -histo:live 45338|head -100 找到占用内存最大的对象
num #instances #bytes class name
----------------------------------------------
1: 211963 25960888 [C
2: 141554 12456752 java.lang.reflect.Method
3: 262890 8412480 java.util.concurrent.ConcurrentHashMap$Node
4: 151455 7269840 org.aspectj.weaver.reflect.ShadowMatchImpl
5: 209866 5036784 java.lang.String
6: 151455 4846560 org.aspectj.weaver.patterns.ExposedState
7: 93251 3730040 java.util.LinkedHashMap$Entry
8: 60755 3402280 java.util.LinkedHashMap
9: 801 2688000 [Ljava.util.concurrent.ConcurrentHashMap$Node;
10: 40718 2631792 [Ljava.lang.Object;
11: 39048 2623560 [Ljava.util.HashMap$Node;
12: 19925 2468120 [B
13: 30926 2226672 java.lang.reflect.Field
14: 18914 2155672 java.lang.Class
15: 88266 2022544 [Ljava.lang.Class;
16: 14117 1692472 [I
17: 39147 1252704 java.util.HashMap$Node
18: 34709 833016 org.springframework.core.MethodClassKey
19: 24055 769760 java.lang.ref.WeakReference
20: 30624 734976 java.util.ArrayList
21: 27200 684208 [Z
22: 26754 642088 [Lorg.aspectj.weaver.ast.Var;
23: 38527 616432 java.lang.Object
24: 15276 611040 java.util.WeakHashMap$Entry
25: 7266 547248 [Ljava.lang.reflect.Method;
26: 6573 525840 java.lang.reflect.Constructor
27: 7198 518256 org.springframework.core.annotation.AnnotationAttributes
28: 16154 516928 java.util.LinkedList
29: 16787 402888 java.util.LinkedList$Node
30: 24436 390976 java.util.LinkedHashSet
31: 7998 383904 java.util.HashMap
32: 12668 366872 [Ljava.lang.String;
33: 4630 355088 [S
34: 7749 309960 java.lang.ref.SoftReference
35: 3059 293664 org.springframework.beans.GenericTypeAwarePropertyDescriptor
36: 4000 288000 org.springframework.core.type.classreading.AnnotationMetadataReadingVisitor
37: 5054 283024 java.lang.Class$ReflectionData
38: 6275 251000 com.sun.org.apache.xerces.internal.dom.DeferredAttrImpl
39: 10072 241728 sun.reflect.generics.tree.SimpleClassTypeSignature
40: 5895 235800 com.sun.org.apache.xerces.internal.dom.DeferredTextImpl
41: 9744 233856 sun.reflect.annotation.AnnotationInvocationHandler
42: 4727 226896 org.springframework.core.ResolvableType
43: 9152 219648 org.springframework.cglib.core.Signature
44: 8533 204792 java.beans.MethodRef
45: 8972 199592 [Ljava.lang.reflect.Type;
46: 451 198640 [Ljava.util.WeakHashMap$Entry;
47: 10072 191568 [Lsun.reflect.generics.tree.TypeArgument;
48: 3129 175224 com.sun.org.apache.xerces.internal.dom.DeferredElementImpl
49: 10371 165936 java.util.LinkedHashMap$LinkedValues
50: 976 156160 org.springframework.beans.factory.support.RootBeanDefinition
51: 2615 152992 [Ljava.lang.reflect.Field;
52: 6364 152736 java.util.Collections$UnmodifiableRandomAccessList
找到异常的对象实例后进行定位,排查出代码存在的问题
2.栈
java.lang.StackOverflowError
java.lang.OutOfMemoryError: Java heap space
StackOverflowError出现的原因一般是递归的条件控制不好导致的无限递归调用而抛出的异常,后面异常同堆异常的排查方法。
3.直接内存
Exception in thread"main"java.lang.OutOfMemoryError
at sun.misc.Unsafe.allocateMemory(Native Method)
at org.fenixsoft.oom.DMOOM.main(DMOOM.java:20)
一般有如下三种方法可以申请直接内存(堆外内存)
1.Unsafe.allocateMemory()
2.ByteBuffer.allocateDirect()
3.native方法
具体可以参考这篇文章:
https://blog.csdn.net/zhuguang10/article/details/88966563
4. linux系统 OOM-Killer
例如jvm配置为
-Xms6g -Xmx6g -Xss512k -XX:MetaspaceSize=256m
-XX:MaxMetaspaceSize=256m
-XX:NewRatio=1 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
-XX:+CMSClassUnloadingEnabled -XX:+DisableExplicitGC
-XX:MaxTenuringThreshold=2 -XX:CMSInitiatingOccupancyFraction=75
-verbose:gc -Xloggc:/Users/ding/Desktop/logs/gc.log -XX:+PrintGCDetails
-XX:+PrintGCDateStamps -XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/Users/ding/Desktop/logs
-XX:ErrorFile=/Users/ding/Desktop/logs/hs_err_pid%p.log
-Dons.client.logRoot=/Users/ding/Desktop/logs -Dons.client.logLevel=WARN
-Dons.client.logFileMaxIndex=2
如果此时业务日志文件没有报错日志,也没有生成dump文件,但是hs_err_pid文件会会产生此类日志
-bash-4.2$ cat hs_err_pid8777.log
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 249561088 bytes for committing reserved memory.
# Possible reasons:
# The system is out of physical RAM or swap space
# In 32 bit mode, the process size limit was hit
# Possible solutions:
# Reduce memory load on the system
# Increase physical memory or swap space
# Check if swap backing store is full
# Use 64 bit Java on a 64 bit OS
# Decrease Java heap size (-Xmx/-Xms)
# Decrease number of Java threads
# Decrease Java thread stack sizes (-Xss)
# Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
# Out of Memory Error (os_linux.cpp:2640), pid=8777, tid=0x00002abdf7403700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_131-b11) (build 1.8.0_131-b11)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.131-b11 mixed mode linux-amd64 compressed oops)
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
此时大概可以定位到是触发了linux的oom-killer,它是Linux内核的一种内存管理机制,在系统可用内存较少的情况下,内核为保证系统还能够继续运行下去,会选择杀掉一些进程释放掉一些内存。通常oom_killer的触发流程是:进程A想要分配物理内存(通常是读写内存)->触发缺页异常->内核去分配物理内存->物理内存不足,触发OOM;
查看/var/log/message(Out of memory: Kill process 29164 (java) score 908 or sacrifice child)
/var/log/message包括整体系统信息,其中也包含系统启动期间的日志。此外,mail,cron,daemon,kern和auth等内容也记录在var/log/messages日志中
Jul 18 14:35:19 izbp764664httdtdff11wtycz systemd: Starting Session 126131 of user admin.
Jul 18 14:38:43 izbp764664httdtdff11wtycz kernel: java invoked oom-killer: gfp_mask=0x280da, order=0, oom_score_adj=0
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: java cpuset=/ mems_allowed=0
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: CPU: 1 PID: 29168 Comm: java Not tainted 3.10.0-693.17.1.el7.x86_64 #1
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.
org 04/01/2014
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: Call Trace:
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [<ffffffff816a6071>] dump_stack+0x19/0x1b
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [<ffffffff816a1466>] dump_header+0x90/0x229
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [<ffffffff810eb28c>] ? ktime_get_ts64+0x4c/0xf0
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [<ffffffff8113f24f>] ? delayacct_end+0x8f/0xb0
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [<ffffffff81188274>] oom_kill_process+0x254/0x3d0
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [<ffffffff81187d1d>] ? oom_unkillable_task+0xcd/0x120
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [<ffffffff81187dc6>] ? find_lock_task_mm+0x56/0xc0
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [<ffffffff81188ab6>] out_of_memory+0x4b6/0x4f0
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [<ffffffff816a1f6a>] __alloc_pages_slowpath+0x5d6/0x724
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [<ffffffff8118ec85>] __alloc_pages_nodemask+0x405/0x420
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [<ffffffff811d6255>] alloc_pages_vma+0xb5/0x200
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [<ffffffff811b4480>] handle_mm_fault+0xb60/0xfa0
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [<ffffffff816b37e4>] __do_page_fault+0x154/0x450
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [<ffffffff816b3bc6>] trace_do_page_fault+0x56/0x150
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [<ffffffff816b325a>] do_async_page_fault+0x1a/0xd0
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [<ffffffff816af928>] async_page_fault+0x28/0x30
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: Mem-Info:
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: active_anon:915301 inactive_anon:36 isolated_anon:0#012 active_file:963 inactive_file:995 isolated_file:0#012
unevictable:0 dirty:11 writeback:0 unstable:0#012 slab_reclaimable:12666 slab_unreclaimable:4309#012 mapped:426 shmem:110 pagetables:3300 bounce:0#012 free:21
938 free_pcp:261 free_cma:0
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: Node 0 DMA free:15348kB min:276kB low:344kB high:412kB active_anon:420kB inactive_anon:0kB active_file:0kB ina
ctive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:4kB slab_unreclaimable:20kB kernel_stack:0kB pagetables:32kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: lowmem_reserve[]: 0 2814 3773 3773
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: Node 0 DMA32 free:54208kB min:50208kB low:62760kB high:75312kB active_anon:2756212kB inactive_anon:88kB active_file:2004kB inactive_file:2048kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3129216kB managed:2884196kB mlocked:0kB dirty:36kB writeback:0kB mapped:1076kB shmem:296kB slab_reclaimable:38172kB slab_unreclaimable:10040kB kernel_stack:2912kB pagetables:8532kB unstable:0kB bounce:0kB free_pcp:148kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: lowmem_reserve[]: 0 0 958 958
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: Node 0 Normal free:16980kB min:17096kB low:21368kB high:25644kB active_anon:904572kB inactive_anon:56kB active_file:1592kB inactive_file:1428kB unevictable:0kB isolated(anon):0kB isolated(file):100kB present:1048576kB managed:981532kB mlocked:0kB dirty:8kB writeback:0kB mapped:728kB shmem:144kB slab_reclaimable:12488kB slab_unreclaimable:7176kB kernel_stack:1552kB pagetables:4636kB unstable:0kB bounce:0kB free_pcp:204kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:7 all_unreclaimable? no
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: lowmem_reserve[]: 0 0 0 0
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: Node 0 DMA: 2*4kB (UE) 7*8kB (UEM) 2*16kB (UE) 2*32kB (UE) 3*64kB (UEM) 1*128kB (E) 2*256kB (UE) 2*512kB (EM) 3*1024kB (UEM) 1*2048kB (E) 2*4096kB (M) = 15328kB
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: Node 0 DMA32: 432*4kB (UEM) 1022*8kB (UEM) 866*16kB (UEM) 436*32kB (UEM) 120*64kB (UEM) 29*128kB (UEM) 10*256kB (UE) 4*512kB (UE) 1*1024kB (M) 0*2048kB 0*4096kB = 54736kB
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: Node 0 Normal: 102*4kB (EM) 364*8kB (UEM) 274*16kB (UEM) 101*32kB (UEM) 43*64kB (UEM) 12*128kB (UEM) 3*256kB (UM) 2*512kB (UM) 0*1024kB 0*2048kB 0*4096kB = 17016kB
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: 325 total pagecache pages
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: 0 pages in swap cache
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: Swap cache stats: add 0, delete 0, find 0/0
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: Free swap = 0kB
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: Total swap = 0kB
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: 1048446 pages RAM
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: 0 pages HighMem/MovableOnly
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: 78037 pages reserved
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [ 336] 0 336 22802 104 50 0 0 systemd-journal
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [ 359] 0 359 11092 262 22 0 -1000 systemd-udevd
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [ 393] 0 393 13863 114 27 0 -1000 auditd
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [ 412] 81 412 6137 118 17 0 -900 dbus-daemon
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [ 414] 998 414 133562 1422 58 0 0 polkitd
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [ 415] 0 415 5369 62 14 0 0 irqbalance
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [ 417] 0 417 6129 167 16 0 0 systemd-logind
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [ 419] 0 419 171693 444 154 0 0 rsyslogd
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [ 427] 0 427 6464 51 18 0 0 atd
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [ 442] 0 442 27511 34 10 0 0 agetty
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [ 444] 0 444 27511 32 11 0 0 agetty
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [ 713] 0 713 140599 3186 93 0 0 tuned
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [ 738] 38 738 7476 159 19 0 0 ntpd
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [ 1105] 0 1105 26499 245 54 0 -1000 sshd
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [ 1138] 0 1138 16599 204 16 0 0 aliyun-service
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [ 1293] 0 1293 8657 313 20 0 0 staragentd
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [ 1294] 0 1294 323022 3276 68 0 0 staragent-core
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [ 1295] 0 1295 200147 15278 71 0 0 staragent-ppf
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [18215] 0 18215 30529 734 61 0 0 nginx
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [28759] 0 28759 31558 155 18 0 0 crond
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [ 1327] 1000 1327 30530 835 60 0 0 nginx
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [ 1328] 1000 1328 30530 835 60 0 0 nginx
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [ 3947] 0 3947 8179 415 20 0 0 AliYunDunUpdate
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [ 4051] 0 4051 31625 697 59 0 0 AliYunDun
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [20736] 0 20736 56938 867 25 0 0 CmsGoAgent.linu
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [20742] 0 20742 199942 2717 40 0 0 exe
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [21951] 0 21951 36554 327 73 0 0 sshd
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [21955] 1000 21955 36554 326 71 0 0 sshd
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [21956] 1000 21956 28901 126 15 0 0 bash
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [29164] 1000 29164 1550710 879523 1872 0 0 java
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [29499] 0 29499 36554 326 73 0 0 sshd
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [29501] 1000 29501 36554 332 72 0 0 sshd
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [29502] 1000 29502 28871 126 13 0 0 bash
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: [29617] 1000 29617 26986 25 10 0 0 tail
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: Out of memory: Kill process 29164 (java) score 908 or sacrifice child
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: Killed process 29164 (java) total-vm:6202840kB, anon-rss:3518092kB, file-rss:0kB, shmem-rss:0kB
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: exe invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: exe cpuset=/ mems_allowed=0
Jul 18 14:38:44 izbp764664httdtdff11wtycz kernel: CPU: 0 PID: 20750 Comm: exe Not tainted 3.10.0-693.17.1.el7.x86_64 #1
机器是8G内存,java分配了6G堆内存,导致后续运行机器内存不足;修改JVM参数成4G,-Xms4g -Xmx4g,修改参数后再次oom时生成了dump文件,经过分析是exel解析出现问题,导致内存不够出现oom
5.永久代或者元数据空间
jdk1.8以后移除永久代,增加了元数据空间,常量保存在堆中,方法区放在元数据空间中,运行时出现此异常一般是动态代理生成class文件导致,例如jsp运行时的动态编译产生
参考文章
https://blog.csdn.net/windforce828/article/details/104819055/