Alert name: intelligent-rec Exceptions_OutOfMemoryError_Error
Current JVM parameters
-Xmx20G
-Xms20G
-XX:MaxDirectMemorySize=5G
-XX:+UnlockExperimentalVMOptions
-XX:G1NewSizePercent=5
-XX:G1HeapRegionSize=16M
-XX:G1RSetUpdatingPauseTimePercent=1
-XX:+ParallelRefProcEnabled
-XX:MetaspaceSize=1G
-XX:+UseG1GC
-XX:MaxGCPauseMillis=100
-XX:GCPauseIntervalMillis=300
-XX:G1MixedGCCountTarget=16
-XX:StringTableSize=4000000
-XX:+PrintStringTableStatistics
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-Xloggc:/data/logs/intelligent-rec-gc-%p-%t.log
-XX:+PrintGCApplicationConcurrentTime
-XX:+PrintGCApplicationStoppedTime
-XX:+HeapDumpOnOutOfMemoryError
-Djava.net.preferIPv4Stack=true
`-XX:G1MaxNewSizePercent=50`
`-XX:InitiatingHeapOccupancyPercent=30`
-XX:G1ReservePercent=15
-XX:+UnlockDiagnosticVMOptions
-XX:+PrintHeapAtGC
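The young generation's share of the heap is bounded by two flags: the lower bound -XX:G1NewSizePercent (default 5%) and the upper bound -XX:G1MaxNewSizePercent (default 60%).
With the defaults, the old generation is therefore left with at least 40% of the heap when the young generation is at its upper bound.
GC log screenshot analysis
JVM memory distribution
Allocated size of each generation (allocated)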
Indicates the allocated size for each generation. This data point is gathered from the GC log,
thus it may or may not match the size that is specified by the JVM system properties (i.e. -Xmx, -Xms, ...).
Say you have configured the total heap size (i.e. -Xmx) as 2 GB, whereas at runtime the JVM has allocated only 1 GB;
then in this report you are going to see the allocated size as 1 GB only.
Peak memory utilization of each generation (peak)
Peak memory utilization of each generation. Typically it won't exceed the `allocated` size.
However, in a few cases we have seen peak utilization go beyond the allocated size as well, especially in G1 GC.
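With -XX:G1NewSizePercent=5, -XX:G1MaxNewSizePercent=50 and -Xmx20G, the young generation ranges from 20 * 0.05 = 1 G to 20 * 0.5 = 10 G, which leaves between 20 - 10 = 10 G and 20 - 1 = 19 G for the old generation.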
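The arithmetic above can be reproduced with a small sketch. The function name `generation_bounds_gb` is hypothetical and exists only for illustration. It also ignores that G1 sizes generations in whole regions, so real values may differ slightly.

```python
def generation_bounds_gb(heap_gb, new_min_pct, new_max_pct):
    """Return (young_min, young_max, old_min, old_max) in GB.

    Assumption: the old generation can use whatever part of the heap the
    young generation does not occupy; region rounding is ignored.
    """
    young_min = heap_gb * new_min_pct / 100
    young_max = heap_gb * new_max_pct / 100
    old_min = heap_gb - young_max  # young generation at its upper bound
    old_max = heap_gb - young_min  # young generation at its lower bound
    return young_min, young_max, old_min, old_max


# Current settings: -Xmx20G, G1NewSizePercent=5, G1MaxNewSizePercent=50
print(generation_bounds_gb(20, 5, 50))  # (1.0, 10.0, 10.0, 19.0)
```

Calling the same function with the default upper bound of 60 shows where the "at least 40%" figure for the old generation comes from.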
Sudden growth of old-generation memory
https://zhuanlan.zhihu.com/p/79643783
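When the old regions run out, are all objects in the young regions, live or dead, converted into old-region objects, so that total memory usage spikes?
That conclusion is doubtful. What can be confirmed is that during promotion from the young generation, objects go directly into the old generation when survivor space is insufficient.
Two possibilities:
- Reachable objects (more likely)
- Unreachable objects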
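Mixed GC stops
After 12:15 mixed GCs stop, which indicates that there is little reclaimable garbage (serious).
Once the GC log shows to-space exhausted or to-space overflow, a Full GC follows immediately.
Full GC
Full GC (Allocation Failure)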
2021-05-23T12:24:47.616+0800: 403254.508: Total time for which application threads were stopped: 0.0385721 seconds, Stopping threads took: 0.0002912 seconds
2021-05-23T12:24:47.616+0800: 403254.508: Application time: 0.0001031 seconds
{Heap before GC invocations=29097 (full 0):
garbage-first heap total 20971520K, used 20938752K [0x00000002c0000000, 0x00000002c1002800, 0x00000007c0000000)
region size 16384K, 0 young (0K), 0 survivors (0K)
Metaspace used 115382K, capacity 122379K, committed 123648K, reserved 1157120K
class space used 13564K, capacity 14845K, committed 15104K, reserved 1048576K
2021-05-23T12:24:47.625+0800: 403254.517: [Full GC (Allocation Failure) 19G->15G(20G), 38.9985388 secs]
[Eden: 0.0B(1024.0M)->0.0B(1024.0M) Survivors: 0.0B->0.0B Heap: 20.0G(20.0G)->15.2G(20.0G)], [Metaspace: 115382K->115370K(1157120K)]
Heap after GC invocations=29098 (full 1):
garbage-first heap total 20971520K, used 15899924K [0x00000002c0000000, 0x00000002c1002800, 0x00000007c0000000)
region size 16384K, 0 young (0K), 0 survivors (0K)
Metaspace used 115370K, capacity 122361K, committed 123648K, reserved 1157120K
class space used 13563K, capacity 14842K, committed 15104K, reserved 1048576K
}
[Times: user=59.83 sys=0.37, real=39.00 secs]
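To put numbers on the Full GC record above, the following sketch parses the two "garbage-first heap" lines and computes how much was reclaimed. The helper names are hypothetical. The regular expression assumes the `total ...K, used ...K` layout seen in this log and is not a general GC log parser.

```python
import re

# Matches the "total ...K, used ...K" part of a PrintHeapAtGC heap summary line.
HEAP_RE = re.compile(r"total\s+(\d+)K,\s+used\s+(\d+)K")


def heap_usage_kb(line):
    """Return (total_kb, used_kb) parsed from one heap summary line."""
    match = HEAP_RE.search(line)
    if match is None:
        raise ValueError("not a heap summary line: " + line)
    return int(match.group(1)), int(match.group(2))


def full_gc_summary(before_line, after_line):
    """Summarize one GC from its before/after heap summary lines."""
    total_kb, used_before_kb = heap_usage_kb(before_line)
    _, used_after_kb = heap_usage_kb(after_line)
    return {
        "reclaimed_kb": used_before_kb - used_after_kb,
        "before_pct": 100 * used_before_kb / total_kb,
        "after_pct": 100 * used_after_kb / total_kb,
    }


# The two heap lines of the Full GC record, copied from the log.
BEFORE = ("garbage-first heap total 20971520K, used 20938752K "
          "[0x00000002c0000000, 0x00000002c1002800, 0x00000007c0000000)")
AFTER = ("garbage-first heap total 20971520K, used 15899924K "
         "[0x00000002c0000000, 0x00000002c1002800, 0x00000007c0000000)")

summary = full_gc_summary(BEFORE, AFTER)
print("reclaimed %d K, occupancy %.1f%% -> %.1f%%" % (
    summary["reclaimed_kb"], summary["before_pct"], summary["after_pct"]))
```

For this record the heap is still about 76% occupied after a 39-second Full GC, which is consistent with the hypothesis that most old-generation objects are reachable.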
YGC
105.0711085 secs
{Heap before GC invocations=29071 (full 0):
garbage-first heap total 20971520K, used 13940955K [0x00000002c0000000, 0x00000002c1002800, 0x00000007c0000000)
region size 16384K, 640 young (10485760K), 12 survivors (196608K)
Metaspace used 115257K, capacity 122315K, committed 123648K, reserved 1157120K
class space used 13556K, capacity 14845K, committed 15104K, reserved 1048576K
2021-05-23T12:21:02.040+0800: 403028.932: [GC pause (G1 Evacuation Pause) (young), 105.0711085 secs]
[Parallel Time: 105051.7 ms, GC Workers: 13]
[GC Worker Start (ms): Min: 403028933.0, Avg: 403028933.3, Max: 403028933.6, Diff: 0.6]
[Ext Root Scanning (ms): Min: 12.4, Avg: 12.7, Max: 13.0, Diff: 0.6, Sum: 165.2]
[Update RS (ms): Min: 5.1, Avg: 5.1, Max: 5.5, Diff: 0.4, Sum: 66.8]
[Processed Buffers: Min: 22, Avg: 51.3, Max: 89, Diff: 67, Sum: 667]
[Scan RS (ms): Min: 26.6, Avg: 26.9, Max: 27.0, Diff: 0.4, Sum: 349.6]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[Object Copy (ms): Min: 95548.6, Avg: 96411.4, Max: 97892.4, Diff: 2343.8, Sum: 1253348.3]
[Termination (ms): Min: 7113.7, Avg: 8594.7, Max: 9457.6, Diff: 2343.9, Sum: 111731.2]
[Termination Attempts: Min: 1443879, Avg: 1480471.4, Max: 1575977, Diff: 132098, Sum: 19246128]
[GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.3, Sum: 1.8]
[GC Worker Total (ms): Min: 105050.5, Avg: 105051.0, Max: 105051.3, Diff: 0.8, Sum: 1365663.0]
[GC Worker End (ms): Min: 403133984.2, Avg: 403133984.3, Max: 403133984.4, Diff: 0.3]
[Code Root Fixup: 0.4 ms]
[Code Root Purge: 0.0 ms]
[Clear CT: 1.2 ms]
[Other: 17.8 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 4.7 ms]
[Ref Enq: 0.3 ms]
[Redirty Cards: 8.3 ms]
[Humongous Register: 0.1 ms]
[Humongous Reclaim: 0.0 ms]
[Free CSet: 1.8 ms]
[Eden: 10048.0M(10048.0M)->0.0B(352.0M) Survivors: 192.0M->1280.0M Heap: 13.3G(20.0G)->7754.5M(20.0G)]
Heap after GC invocations=29072 (full 0):
garbage-first heap total 20971520K, used 7940573K [0x00000002c0000000, 0x00000002c1002800, 0x00000007c0000000)
region size 16384K, 80 young (1310720K), 80 survivors (1310720K)
Metaspace used 115257K, capacity 122315K, committed 123648K, reserved 1157120K
class space used 13556K, capacity 14845K, committed 15104K, reserved 1048576K
}
[Times: user=1364.20 sys=1.26, real=105.07 secs]
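A rough way to see where the 105 seconds went is to compare each worker phase's average time with the average GC Worker Total. The sketch below does this for the phase lines of the record above. The regular expression and helper names are assumptions made for illustration and only cover the `Name (ms): ... Avg: x` layout shown here.

```python
import re

# Captures the phase name and its Avg value from lines such as
# "[Object Copy (ms): Min: ..., Avg: 96411.4, Max: ...]".
PHASE_RE = re.compile(r"\[(?P<name>[A-Za-z ]+) \(ms\): .*?Avg: (?P<avg>[\d.]+)")

# Worker phase lines copied from the young GC record above.
LOG = """\
[Ext Root Scanning (ms): Min: 12.4, Avg: 12.7, Max: 13.0, Diff: 0.6, Sum: 165.2]
[Update RS (ms): Min: 5.1, Avg: 5.1, Max: 5.5, Diff: 0.4, Sum: 66.8]
[Scan RS (ms): Min: 26.6, Avg: 26.9, Max: 27.0, Diff: 0.4, Sum: 349.6]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[Object Copy (ms): Min: 95548.6, Avg: 96411.4, Max: 97892.4, Diff: 2343.8, Sum: 1253348.3]
[Termination (ms): Min: 7113.7, Avg: 8594.7, Max: 9457.6, Diff: 2343.9, Sum: 111731.2]
[GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.3, Sum: 1.8]
[GC Worker Total (ms): Min: 105050.5, Avg: 105051.0, Max: 105051.3, Diff: 0.8, Sum: 1365663.0]
"""


def phase_averages(log_text):
    """Map each phase name to its average time in milliseconds."""
    return {m.group("name").strip(): float(m.group("avg"))
            for m in PHASE_RE.finditer(log_text)}


def phase_shares(log_text, total_name="GC Worker Total"):
    """Return each phase's average as a fraction of the worker total."""
    averages = phase_averages(log_text)
    total = averages.pop(total_name)
    return {name: avg / total for name, avg in averages.items()}


shares = phase_shares(LOG)
for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print("%-20s %5.1f%%" % (name, 100 * share))
```

Object Copy accounts for roughly 92% of the worker time and Termination for most of the rest, so the pause is dominated by copying surviving objects rather than by root or remembered-set scanning.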
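-XX:ParallelGCThreads: increase the thread count
Number of parallel GC threads: set with -XX:ParallelGCThreads, i.e. the number of GC threads that work during stop-the-world (STW) phases. Its value is determined as follows:
- If the user explicitly specifies ParallelGCThreads, the specified value is used.
- Otherwise, ParallelGCThreads is computed from the number of threads the CPU actually supports, using the two rules below.
- If the number of threads supported by the physical CPU is at most 8, ParallelGCThreads equals that number. The threshold is 8 because every JVM call to nof_parallel_worker_threads passes 8 as switch_pt.
- If the number of threads supported by the physical CPU is greater than 8, ParallelGCThreads is 8 plus an adjustment: 5/8 or 5/16 of (supported threads minus 8), rounded down. The JVM chooses between 5/8 and 5/16 depending on the actual situation.
For example, on a 64-thread x86 CPU, if the user does not specify ParallelGCThreads, the default is computed as: ParallelGCThreads = 8 + (64 - 8) * (5/8) = 8 + 35 = 43.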
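The rules above can be sketched as follows. This is a Python transcription of the calculation as described in this note, not the JVM's actual code. The function name `default_parallel_gc_threads` is hypothetical, and the parameter names mirror the `switch_pt` and 5/8 values mentioned in the text.

```python
def default_parallel_gc_threads(ncpus, num=5, den=8, switch_pt=8):
    """Default ParallelGCThreads for a machine with `ncpus` hardware threads.

    Assumption: integer arithmetic with truncation, as in the example
    8 + (64 - 8) * 5/8 = 43 from the text.
    """
    if ncpus <= switch_pt:
        return ncpus
    return switch_pt + (ncpus - switch_pt) * num // den


print(default_parallel_gc_threads(64))          # 43
print(default_parallel_gc_threads(64, den=16))  # 25, the 5/16 variant
print(default_parallel_gc_threads(16))          # 13
```

The last call gives 13, which matches the "GC Workers: 13" line in the young GC log if the host exposes 16 hardware threads. The host's CPU count is not stated in this note, so that reading is an inference.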
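Conclusions
- to-space exhausted or to-space overflow: a large number of objects probably could not be promoted normally and, under the old-generation guarantee, went directly into the old generation.
- Mixed GC stopped after 12:15. The reason is that reclaimable garbage did not reach the 5% threshold, so no mixed GC was started. If the old generation is filled with reachable objects, the situation is fairly serious, and a heap dump is needed for further analysis.
Heap waste percentage that triggers mixed GC: set with `-XX:G1HeapWastePercent`, default 5%. After global marking ends, G1 computes the reclaimable garbage in the candidate CSet regions as a share of the whole heap. If it exceeds 5%, several rounds of mixed GC follow. If not, global concurrent marking is started again during a later young GC.
- When the young generation is relatively large and many objects survive, the copying pressure is high.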
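The G1HeapWastePercent rule described above amounts to a simple percentage check. The sketch below is a simplification with a hypothetical function name, and the reclaimable amounts in the usage lines are made-up inputs rather than values from the log.

```python
GIB = 1024 ** 3


def mixed_gc_starts(reclaimable_bytes, heap_bytes, heap_waste_percent=5):
    """Return True if reclaimable garbage exceeds the G1HeapWastePercent share.

    Assumption: only the threshold comparison is modeled; how G1 selects
    candidate regions is out of scope here.
    """
    reclaimable_pct = reclaimable_bytes * 100 / heap_bytes
    return reclaimable_pct > heap_waste_percent


# With a 20 GiB heap, the 5% threshold corresponds to 1 GiB of reclaimable garbage.
print(mixed_gc_starts(GIB // 2, 20 * GIB))  # False: 2.5% is below the threshold
print(mixed_gc_starts(2 * GIB, 20 * GIB))   # True: 10% exceeds the threshold
```

On this service, mixed GCs stopping therefore means that marking found less than about 1 GiB of reclaimable old-generation space.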
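Proposed solution
Modify the JVM parameters
-XX:HeapDumpPath=/data/logs/gc%p-%t.hprof
-XX:SurvivorRatio=6
-XX:ParallelGCThreads=64
-XX:G1MaxNewSizePercent=30
Note: -XX:ParallelGCThreads=64 has a particularly large impact.
-XX:G1NewSizePercent=20
Setting the young generation's minimum share of the heap to 20% (the default is 5%) addresses this problem. With this setting, even if reference processing takes longer, the Eden size cannot drop below this threshold, which avoids premature promotion of objects. We also added -XX:+ParallelRefProcEnabled so that reference objects are processed by multiple threads in parallel during the Remark phase.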