1. Background
While analyzing Kafka's thread stacks with jstack, we found that jstack would not run and printed the following error:
Unable to open socket file: target process not responding or HotSpot VM not loaded
The -F option can be used when the target process is not responding
2. Analysis
- Start with the source of JStack.main. When no options are passed, execution takes the else branch and calls runThreadDump:
    // now execute using the SA JStack tool or the built-in thread dumper
    if (useSA) {
        // parameters (<pid> or <exe> <core>
        String params[] = new String[paramCount];
        for (int i=optionCount; i<args.length; i++ ){
            params[i-optionCount] = args[i];
        }
        runJStackTool(mixed, locks, params);
    } else {
        // pass -l to thread dump operation to get extra lock info
        String pid = args[optionCount];
        String params[];
        if (locks) {
            params = new String[] { "-l" };
        } else {
            params = new String[0];
        }
        runThreadDump(pid, params);
    }
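- Next, runThreadDump. If VirtualMachine.attach throws, the exception message is printed, the hint about the -F option is added for an AttachNotSupportedException, and the process exits with status 1: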
    private static void runThreadDump(String pid, String args[]) throws Exception {
        VirtualMachine vm = null;
        try {
            vm = VirtualMachine.attach(pid);
        } catch (Exception x) {
            String msg = x.getMessage();
            if (msg != null) {
                System.err.println(pid + ": " + msg);
            } else {
                x.printStackTrace();
            }
            if ((x instanceof AttachNotSupportedException) &&
                (loadSAClass() != null)) {
                System.err.println("The -F option can be used when the target " +
                    "process is not responding");
            }
            System.exit(1);
        }
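- Next, VirtualMachine.attach. It tries each installed AttachProvider in turn and returns the subclass that matches the platform, for example BsdVirtualMachine, LinuxVirtualMachine, or SolarisVirtualMachine: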
    public static VirtualMachine attach(String id)
        throws AttachNotSupportedException, IOException
    {
        if (id == null) {
            throw new NullPointerException("id cannot be null");
        }
        List<AttachProvider> providers = AttachProvider.providers();
        if (providers.size() == 0) {
            throw new AttachNotSupportedException("no providers installed");
        }
        AttachNotSupportedException lastExc = null;
        for (AttachProvider provider: providers) {
            try {
                return provider.attachVirtualMachine(id);
            } catch (AttachNotSupportedException x) {
                lastExc = x;
            }
        }
        throw lastExc;
    }
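The fallback loop in attach can be reproduced in isolation. The following is only a sketch of that pattern: Provider and NotSupportedException are hypothetical stand-ins for AttachProvider and AttachNotSupportedException, so that it runs without the JDK attach classes on the classpath.

```java
import java.util.Arrays;
import java.util.List;

public class ProviderFallbackSketch {
    /** Hypothetical stand-in for AttachNotSupportedException. */
    static class NotSupportedException extends Exception {
        NotSupportedException(String msg) {
            super(msg);
        }
    }

    /** Hypothetical stand-in for AttachProvider. */
    interface Provider {
        String attach(String id) throws NotSupportedException;
    }

    /** Tries each provider in order; rethrows the last failure if none succeeds. */
    static String attach(String id, List<Provider> providers) throws NotSupportedException {
        if (id == null) {
            throw new NullPointerException("id cannot be null");
        }
        if (providers.isEmpty()) {
            throw new NotSupportedException("no providers installed");
        }
        NotSupportedException lastExc = null;
        for (Provider provider : providers) {
            try {
                return provider.attach(id);
            } catch (NotSupportedException x) {
                lastExc = x; // remember the failure and try the next provider
            }
        }
        throw lastExc;
    }

    public static void main(String[] args) throws Exception {
        Provider failing = id -> { throw new NotSupportedException("provider mismatch"); };
        Provider working = id -> "attached:" + id;
        System.out.println(attach("1234", Arrays.asList(failing, working)));
    }
}
```

The consequence for this investigation is that the message jstack prints comes from the last provider that failed, which on Linux is the LinuxVirtualMachine constructor examined next.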
- Our runtime environment is Linux, so the relevant provider is LinuxAttachProvider. Its attachVirtualMachine method mainly instantiates LinuxVirtualMachine:
    public VirtualMachine attachVirtualMachine(VirtualMachineDescriptor vmd)
        throws AttachNotSupportedException, IOException
    {
        if (vmd.provider() != this) {
            throw new AttachNotSupportedException("provider mismatch");
        }
        // To avoid re-checking if the VM if attachable, we check if the descriptor
        // is for a hotspot VM - these descriptors are created by the listVirtualMachines
        // implementation which only returns a list of attachable VMs.
        if (vmd instanceof HotSpotVirtualMachineDescriptor) {
            assert ((HotSpotVirtualMachineDescriptor)vmd).isAttachable();
            checkAttachPermission();
            return new LinuxVirtualMachine(this, vmd.id());
        } else {
            return attachVirtualMachine(vmd.id());
        }
    }
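- Next, the LinuxVirtualMachine constructor. The exception is thrown only if findSocketFile still returns null after a QUIT signal has been sent to the target VM and all retries within the attach timeout have been used up: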
    // "/tmp" is used as a global well-known location for the files
    // .java_pid<pid>. and .attach_pid<pid>. It is important that this
    // location is the same for all processes, otherwise the tools
    // will not be able to find all Hotspot processes.
    // Any changes to this needs to be synchronized with HotSpot.
    private static final String tmpdir = "/tmp";

    // Indicates if this machine uses the old LinuxThreads
    static boolean isLinuxThreads;

    // The patch to the socket file created by the target VM
    String path;

    /**
     * Attaches to the target VM
     */
    LinuxVirtualMachine(AttachProvider provider, String vmid)
        throws AttachNotSupportedException, IOException
    {
        super(provider, vmid);

        // This provider only understands pids
        int pid;
        try {
            pid = Integer.parseInt(vmid);
        } catch (NumberFormatException x) {
            throw new AttachNotSupportedException("Invalid process identifier");
        }

        // Find the socket file. If not found then we attempt to start the
        // attach mechanism in the target VM by sending it a QUIT signal.
        // Then we attempt to find the socket file again.
        path = findSocketFile(pid);
        if (path == null) {
            File f = createAttachFile(pid);
            try {
                // On LinuxThreads each thread is a process and we don't have the
                // pid of the VMThread which has SIGQUIT unblocked. To workaround
                // this we get the pid of the "manager thread" that is created
                // by the first call to pthread_create. This is parent of all
                // threads (except the initial thread).
                if (isLinuxThreads) {
                    int mpid;
                    try {
                        mpid = getLinuxThreadsManager(pid);
                    } catch (IOException x) {
                        throw new AttachNotSupportedException(x.getMessage());
                    }
                    assert(mpid >= 1);
                    sendQuitToChildrenOf(mpid);
                } else {
                    sendQuitTo(pid);
                }

                // give the target VM time to start the attach mechanism
                int i = 0;
                long delay = 200;
                int retries = (int)(attachTimeout() / delay);
                do {
                    try {
                        Thread.sleep(delay);
                    } catch (InterruptedException x) { }
                    path = findSocketFile(pid);
                    i++;
                } while (i <= retries && path == null);
                if (path == null) {
                    throw new AttachNotSupportedException(
                        "Unable to open socket file: target process not responding " +
                        "or HotSpot VM not loaded");
                }
            } finally {
                f.delete();
            }
        }
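The wait loop in the constructor is a fixed-delay poll: sleep, look for the file, and give up once the retry budget derived from the timeout is spent. Here is a minimal standalone sketch of the same loop. The class name, the method waitForFile, and the timeout values in main are made up for illustration and are not part of the JDK.

```java
import java.io.File;
import java.io.IOException;

public class SocketFilePollSketch {
    /**
     * Polls for a file the way the constructor above polls for the socket file:
     * sleep first, then check, and stop after (timeoutMs / delayMs) + 1 attempts.
     * Returns true if the file showed up in time.
     */
    static boolean waitForFile(File file, long timeoutMs, long delayMs)
            throws InterruptedException {
        int retries = (int) (timeoutMs / delayMs);
        int i = 0;
        boolean found;
        do {
            Thread.sleep(delayMs);
            found = file.exists();
            i++;
        } while (i <= retries && !found);
        return found;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // A temporary file stands in for the socket file (illustrative only).
        File present = File.createTempFile("java_pid_demo", ".tmp");
        present.deleteOnExit();
        File missing = new File(present.getParentFile(), "no_such_socket_file_for_demo");

        System.out.println("present: " + waitForFile(present, 100, 20));
        System.out.println("missing: " + waitForFile(missing, 100, 20));
    }
}
```

If the file never appears, the loop simply runs out of retries, which is the point at which the real constructor throws the "Unable to open socket file" exception.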
- Finally, LinuxVirtualMachine.findSocketFile. It returns null only when the file .java_pid<pid> cannot be found under tmpdir, and the source above shows that tmpdir is /tmp:
    // Return the socket file for the given process.
    private String findSocketFile(int pid) {
        File f = new File(tmpdir, ".java_pid" + pid);
        if (!f.exists()) {
            return null;
        }
        return f.getPath();
    }
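To check quickly whether a given JVM is affected, the same lookup can be done by hand. The sketch below mirrors findSocketFile. The class name AttachSocketCheck and the command-line handling are assumptions added for illustration, and /tmp is hard-coded because that is the value of tmpdir in the source above.

```java
import java.io.File;

public class AttachSocketCheck {
    /** Builds the path the attach mechanism looks for: <tmpdir>/.java_pid<pid>. */
    static File socketFileFor(File tmpdir, int pid) {
        return new File(tmpdir, ".java_pid" + pid);
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java AttachSocketCheck <pid>");
            return;
        }
        int pid = Integer.parseInt(args[0]);
        File f = socketFileFor(new File("/tmp"), pid);
        System.out.println(f.getPath() + (f.exists() ? " exists" : " is missing"));
    }
}
```

Pass it the pid of the target JVM, for example the Kafka broker process. Note that a missing file is normal for a JVM that nothing has attached to yet, since the constructor above triggers the attach mechanism on demand.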
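3. Conclusion
From the analysis above, jstack fails with this exception only when the file /tmp/.java_pid<pid> does not exist. The fix is to restart the application. The key point is never to delete the .java_pid<pid> file. We run CentOS 7, which by default deletes this file automatically after 7 days, so we changed the system configuration to stop it from being deleted.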
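The exact configuration change is not shown here. One possible way to do it, assuming the cleanup is performed by systemd-tmpfiles (the /tmp cleaner on CentOS 7), is a drop-in file that excludes the socket files from cleaning. The file name below is hypothetical:

```
# /etc/tmpfiles.d/java-attach.conf  (hypothetical file name)
# Type "x" tells systemd-tmpfiles to ignore matching paths during cleanup.
x /tmp/.java_pid*
```

If a different tool cleans /tmp in your environment, the equivalent exclusion has to be configured there instead.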