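Recently, to strictly control the resources each Java program consumes on Mesos, we put hard limits on CPU and memory through cgroups. This protects the mesos-slave machine as a whole, but services have lately been killed (kill -9) frequently because of OOM, and for some services this hurts the user experience. To solve this, we tuned the Java heap and the memory allocated by Mesos. After the adjustment, we saved 75G of memory in production.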
==================
JVM Memory Component
-
Heap: The heap is the runtime data area from which memory for all class instances and arrays is allocated.
How to determine a proper Java heap size
Profile the service (what it does) and estimate the request throughput.
Check whether the service needs to load a third-party data file (such as the robot training datasets, which are pretty large). If it does, -Xmx must be larger than the dataset size.
-
Configure GC logging and watch the heap usage after each Full GC. That value is the lower bound for the heap size you should set.
1. How to configure GC logging in a Java application:
```
GC_LOG_FILE='service-gc.log'
java -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:MetaspaceSize=100m -Xloggc:$GC_LOG_FILE \
  -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10M \
  -jar package.jar
```
The gc.log output:
```
2017-07-28T05:59:17.749+0800: 49606.298: [Full GC (Ergonomics) [PSYoungGen: 288K->0K(348672K)] [ParOldGen: 699337K->75536K(699392K)] 699625K->75536K(1048064K), [Metaspace: 73744K->73744K(1116160K)], 0.1417127 secs] [Times: user=0.42 sys=0.00, real=0.14 secs]
2017-07-28T08:55:43.627+0800: 60192.176: [Full GC (Ergonomics) [PSYoungGen: 256K->0K(348672K)] [ParOldGen: 699270K->72322K(699392K)] 699526K->72322K(1048064K), [Metaspace: 73764K->73764K(1116160K)], 0.1451565 secs] [Times: user=0.43 sys=0.00, real=0.14 secs]
2017-07-28T11:37:47.821+0800: 69916.370: [Full GC (Ergonomics) [PSYoungGen: 384K->0K(347648K)] [ParOldGen: 699259K->73371K(699392K)] 699643K->73371K(1047040K), [Metaspace: 73894K->73894K(1116160K)], 0.1603913 secs] [Times: user=0.48 sys=0.00, real=0.16 secs]
2017-07-28T14:22:50.322+0800: 79818.872: [Full GC (Ergonomics) [PSYoungGen: 416K->0K(348672K)] [ParOldGen: 699275K->73551K(699392K)] 699691K->73551K(1048064K), [Metaspace: 73942K->73918K(1116160K)], 0.1654482 secs] [Times: user=0.49 sys=0.00, real=0.17 secs]
```
**In the GC log above, the service runs with -Xmx set to 1024M. In the last Full GC the OldGen drops from 699275K to 73551K (capacity 699392K), so the collection reclaimed about 610M and roughly 72M stays live. -Xmx therefore has to be larger than about 72M. In this case we can try 512M:
`-Xms512m -Xmx512m`, then keep watching the GC log to check whether 512M is enough. If the setting is too low, an OutOfMemoryError will occur.**
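The reading of the log described above can be automated. The following is a minimal sketch, assuming the Parallel collector log format shown in this post (`ParOldGen` entries on `Full GC` lines). The function names `old_gen_after_full_gc` and `heap_floor_mb` are made up for illustration and are not part of any JVM tool.

```python
import re

# Matches the old-generation part of a Full GC line, e.g.
# [ParOldGen: 699275K->73551K(699392K)]
OLD_GEN = re.compile(r"\[ParOldGen: (\d+)K->(\d+)K\((\d+)K\)\]")


def old_gen_after_full_gc(log_text):
    """Return the old-gen occupancy (KB) left after each Full GC in log_text."""
    after_kb = []
    for line in log_text.splitlines():
        if "Full GC" not in line:
            continue
        match = OLD_GEN.search(line)
        if match:
            after_kb.append(int(match.group(2)))
    return after_kb


def heap_floor_mb(log_text):
    """Largest post-Full-GC old-gen occupancy in MB, rounded up."""
    after_kb = old_gen_after_full_gc(log_text)
    if not after_kb:
        raise ValueError("no Full GC lines with ParOldGen found")
    return -(-max(after_kb) // 1024)  # ceiling division


# Two of the Full GC lines from the log above, with the trailing
# [Times: ...] part trimmed for width.
SAMPLE = """\
2017-07-28T05:59:17.749+0800: 49606.298: [Full GC (Ergonomics) [PSYoungGen: 288K->0K(348672K)] [ParOldGen: 699337K->75536K(699392K)] 699625K->75536K(1048064K), [Metaspace: 73744K->73744K(1116160K)], 0.1417127 secs]
2017-07-28T14:22:50.322+0800: 79818.872: [Full GC (Ergonomics) [PSYoungGen: 416K->0K(348672K)] [ParOldGen: 699275K->73551K(699392K)] 699691K->73551K(1048064K), [Metaspace: 73942K->73918K(1116160K)], 0.1654482 secs]
"""

if __name__ == "__main__":
    print(old_gen_after_full_gc(SAMPLE))
    print(heap_floor_mb(SAMPLE), "MB is the lower bound for -Xmx")
```

For the two sample lines the largest live set is 75536K, so the floor comes out at 74 MB. The value actually chosen for -Xmx should sit well above that floor to leave room for the young generation and for load spikes.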
Method area: The method area is created on virtual machine startup and shared among all Java Virtual Machine threads. It is logically part of the heap. It stores per-class structures such as the run-time constant pool, field and method data, and the code for methods and constructors.
-
Stack (-Xss: default value 1024K): holds local variables and partial results, and plays a part in method invocation and return. Once a thread is created, its stack can hold at most 1024K. The stack grows dynamically up to that limit, so each thread might occupy anywhere from 1K to 1024K, depending on what the thread does.
How to determine a proper Java stack size
-
Check the thread count of the JVM process.
e.g. if the Java pid is 26065, `cat /proc/26065/status | grep Threads` prints `Threads: 1569`. This Java process has 1569 threads, and each thread takes 1K ~ 1024K of memory.
-
Estimate the stack size from the thread count
This is not straightforward. It depends on what each thread does and how it handles data, but a thread's stack cannot take more than 1M; if it tries to, a StackOverflowError is thrown. This service instance occupies 2.8G of memory in total (2G of it is heap), so each thread takes about 0.5M on average.
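The per-thread average above can be reproduced with a few lines. This is a rough sketch, assuming that all memory outside the heap is attributed to thread stacks, which is the simplification this post uses. The helper names `thread_count` and `avg_non_heap_mb_per_thread` are hypothetical.

```python
def thread_count(status_text):
    """Extract the Threads: value from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1])
    raise ValueError("no Threads: line found")


def avg_non_heap_mb_per_thread(total_mb, heap_mb, threads):
    """Average non-heap memory per thread (MB), treating all of it as stack."""
    return (total_mb - heap_mb) / threads


if __name__ == "__main__":
    # On a live host this text would come from open("/proc/26065/status").read().
    status = "Name:\tjava\nThreads:\t1569\n"
    n = thread_count(status)
    # 2.8G total and 2G heap are the figures quoted in this post.
    print(n, round(avg_non_heap_mb_per_thread(2.8 * 1024, 2 * 1024, n), 2))
```

With 1569 threads, 2.8G total and 2G of heap, this gives about 0.52 MB per thread, matching the 0.5M estimate.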
-
Note: Based on the result above, we set `-Xms2048m -Xmx2048m` and limit the Mesos memory to 4096m. If there are more concurrent requests, we can raise the Mesos memory limit.
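As a sanity check on that limit, the worst case under this post's simple model (heap plus every thread using its full stack) can be computed directly. This is a sketch only: it ignores Metaspace, native memory and other JVM overhead, and the function names are invented for illustration.

```python
def worst_case_mb(heap_mb, threads, xss_kb=1024):
    """Heap plus every thread using its full -Xss stack, in MB."""
    return heap_mb + threads * xss_kb / 1024


def fits(limit_mb, heap_mb, threads, xss_kb=1024):
    """True if the worst case stays within the container memory limit."""
    return worst_case_mb(heap_mb, threads, xss_kb) <= limit_mb


if __name__ == "__main__":
    # Figures from this post: 2048m heap, 1569 threads, 4096m Mesos limit.
    print(worst_case_mb(2048, 1569), fits(4096, 2048, 1569))
```

With a 2048m heap and 1569 threads at the default 1024K stack, the worst case is 3617 MB, which stays under the 4096m limit with a little under 500 MB to spare for more threads.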
Native Method: a Full GC calls native methods, which use a certain amount of memory, but this is only a tiny part of the JVM's memory.
pc (Program counter) register: Each Java Virtual Machine thread has its own pc (program counter) register. At any point, each Java Virtual Machine thread is executing the code of a single method, namely the current method (§2.6) for that thread. If that method is not native, the pc register contains the address of the Java Virtual Machine instruction currently being executed. If the method currently being executed by the thread is native, the value of the Java Virtual Machine's pc register is undefined.
Summary: Roughly, JVM memory size = Heap + Stack. Keep monitoring Full GCs to decide on a reasonable heap value.
Reference:
http://howtodoinjava.com/core-java/garbage-collection/jvm-memory-model-structure-and-components/