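Introduction
A while ago I ran into an anomaly while using Docker. After some digging, it turned out that the container processes had become orphan processes, like this: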
ps -ef |grep docker
root 9077 1 0 14:51 ? 00:00:00 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/b9c7dbb97bacd851e861b8edd5bf66226054d336e528ec7ed308558826701ab1 -address /var/run/docker/containerd/docker-containerd.sock -containerd-binary /usr/bin/docker-containerd -runtime-root /var/run/docker/runtime-runc
root 9102 1 0 14:51 ? 00:00:00 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/ac660ea7b14a67421367c6882d63026734e7ea51b35cc259d80214d86a7951fa -address /var/run/docker/containerd/docker-containerd.sock -containerd-binary /usr/bin/docker-containerd -runtime-root /var/run/docker/runtime-runc
root 9132 1 0 14:51 ? 00:00:00 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/b0ec4f46a4512fea68fd96b130bde1b287911421808922e21309d1655fb8ffc3 -address /var/run/docker/containerd/docker-containerd.sock -containerd-binary /usr/bin/docker-containerd -runtime-root /var/run/docker/runtime-runc
root 9171 1 0 14:51 ? 00:00:00 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/4a8e0271ad08854bd58f2952cb957a20e07d1d093849f72e6839cf8ee043921d -address /var/run/docker/containerd/docker-containerd.sock -containerd-binary /usr/bin/docker-containerd -runtime-root /var/run/docker/runtime-runc
......
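The parent of each docker-containerd-shim process had become systemd (PID 1) and the shims no longer sat under the dockerd process tree.
I could not reproduce the root cause. The only way I found to reproduce the state above was to run kill -9 on the dockerd PID, which is not something we would normally do when operating Docker. Later I noticed that when some containers are busy (heavy use of CPU, memory, and other resources), running systemctl stop docker usually leaves the docker-containerd-shim processes of those busy containers orphaned. To understand this part of Docker better, I spent some time studying the Docker process tree. My notes follow.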
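The orphaned state described above can be checked for mechanically. The following is a minimal sketch, assuming a procps-style ps and the docker-containerd-shim process name used by this Docker version; the function name find_orphan_shims is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads "PID PPID ARGS..." lines on stdin and prints
# the PID of every containerd-shim process whose parent is PID 1.
find_orphan_shims() {
  awk '$2 == 1 && $3 ~ /containerd-shim$/ { print $1 }'
}

# Usage on a live host (separate -o options suppress the header line):
#   ps -e -o pid= -o ppid= -o args= | find_orphan_shims
```

An empty result means every shim still has a non-init parent. On newer Docker releases the shim binary has a different name, so the pattern would need adjusting.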
Hands-on
- Environment: CentOS 7.4 + Docker 18.03-ce
- After installing Docker and starting the dockerd service with systemctl start docker, the Docker-related processes on the system look like this:
pstree -a
systemd --system --deserialize 20
......
├─dockerd --storage-driver=overlay2
│ ├─docker-containe --config /var/run/docker/containerd/containerd.toml
│ │ └─8*[{docker-containe}]
│ └─10*[{dockerd}]
......
ps -ef | grep docker | grep -v grep
root 28114 1 0 15:25 ? 00:00:00 /usr/bin/dockerd --storage-driver=overlay2
root 28121 28114 0 15:25 ? 00:00:01 docker-containerd --config /var/run/docker/containerd/containerd.toml
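Observations:
1. First comes the dockerd (Docker daemon) process, whose parent is systemd (PID 1).
2. Next comes the docker-containerd process (containerd is the executor for containers), whose parent is dockerd.
Note: in the pstree output, 10*[{dockerd}] denotes 10 threads of the dockerd process; see man pstree for details.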
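The parent and thread count that pstree shows can also be read directly from /proc. A minimal sketch, assuming a Linux host with /proc mounted; proc_field is a hypothetical helper name, while PPid and Threads are standard fields of /proc/<pid>/status.

```shell
#!/bin/sh
# Hypothetical helper: print one field from /proc/<pid>/status.
#   $1 = PID, $2 = field name without the colon (e.g. PPid, Threads)
proc_field() {
  awk -v key="$2:" '$1 == key { print $2 }' "/proc/$1/status"
}

# Usage on a host running Docker (commented out so the sketch has no side effects):
#   pid=$(pgrep -x dockerd)
#   proc_field "$pid" PPid      # parent PID of dockerd
#   proc_field "$pid" Threads   # total thread count, including the main thread
```

Note that Threads counts the main thread as well, so it may read one higher than the N*[{dockerd}] figure in pstree.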
- Create an nginx container and inspect its processes
Create the nginx container, mapping container port 80 to host port 8888
docker run -itd --name nginx -p 8888:80 nginx:1.14
dd12c48715428d71ab64e1a7aa9242b8d6c9786e417c8595168646c39745b50f
List the processes running inside the nginx container
docker top nginx
UID PID PPID C STIME TTY TIME CMD
root 28543 28527 0 16:39 pts/0 00:00:00 nginx: master process nginx -g daemon off;
101 28580 28543 0 16:39 pts/0 00:00:00 nginx: worker process
pstree -a
systemd --system --deserialize 20
......
├─dockerd --storage-driver=overlay2
│ ├─docker-containe --config /var/run/docker/containerd/containerd.toml
│ │ ├─docker-containe -namespace moby -workdir ...
│ │ │ ├─nginx
│ │ │ │ └─nginx
│ │ │ └─8*[{docker-containe}]
│ │ └─10*[{docker-containe}]
│ ├─docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 8888 -container-ip 172.17.0.2 -container-port 80
│ │ └─5*[{docker-proxy}]
│ └─11*[{dockerd}]
......
ps -ef | grep -E "docker|nginx" | grep -v grep
root 28114 1 0 15:25 ? 00:00:12 /usr/bin/dockerd --storage-driver=overlay2
root 28121 28114 0 15:25 ? 00:00:27 docker-containerd --config /var/run/docker/containerd/containerd.toml
root 28521 28114 0 16:39 ? 00:00:00 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 8888 -container-ip 172.17.0.2 -container-port 80
root 28527 28121 0 16:39 ? 00:00:00 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/dd12c48715428d71ab64e1a7aa9242b8d6c9786e417c8595168646c39745b50f -address /var/run/docker/containerd/docker-containerd.sock -containerd-binary /usr/bin/docker-containerd -runtime-root /var/run/docker/runtime-runc
root 28543 28527 0 16:39 pts/0 00:00:00 nginx: master process nginx -g daemon off;
101 28580 28543 0 16:39 pts/0 00:00:00 nginx: worker process
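Observations:
1. The docker-proxy process handles port mapping; it can be turned off by starting dockerd with --userland-proxy=false. Like docker-containerd, docker-proxy is a child of dockerd.
2. docker-containerd-shim is the process that hosts a running container. Each container has its own docker-containerd-shim process, which is a child of docker-containerd.
3. The application process nginx is a child of the container's docker-containerd-shim process. The application process is normally the ENTRYPOINT+CMD process, and processes that it later spawns inside the container are its descendants.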
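To confirm the parent chain for one container without reading the whole pstree output, one can walk upward from the container's main process. A minimal sketch, assuming Linux /proc; the function name ancestors is made up for illustration, and docker inspect -f '{{.State.Pid}}' is the usual way to get a container's host PID.

```shell
#!/bin/sh
# Hypothetical helper: print "PID NAME" for a process and each of its
# ancestors, stopping when the parent PID is 0 or the process is unreadable.
# NAME is the first word of the (possibly truncated) Name: field.
ancestors() {
  pid=$1
  while [ -n "$pid" ] && [ "$pid" -gt 0 ] && [ -r "/proc/$pid/status" ]; do
    name=$(awk '$1 == "Name:" { print $2 }' "/proc/$pid/status")
    printf '%s %s\n' "$pid" "$name"
    pid=$(awk '$1 == "PPid:" { print $2 }' "/proc/$pid/status")
  done
}

# Usage on the host from this article (commented out; needs a running container):
#   ancestors "$(docker inspect -f '{{.State.Pid}}' nginx)"
```

On a healthy setup like the one above, the chain should pass through the shim, docker-containerd, and dockerd before reaching PID 1. If the shim is followed directly by PID 1, the container has been orphaned.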
- Process information with several containers running:
ps axf | grep docker -A 1
1223 pts/1 S+ 0:00 | \_ grep --color=auto docker -A 1
28046 ? Ss 0:00 \_ sshd: root@pts/0
--
29267 ? Ssl 0:57 /usr/bin/dockerd --storage-driver=overlay
29274 ? Ssl 0:04 \_ docker-containerd --config /var/run/docker/containerd/containerd.toml
29670 ? Sl 0:00 | \_ docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/b4922fcaa64304c0aad46de5aaf487a16c3926ee04ffa74357c73eecd9f460d7 -address /var/run/docker/containerd/docker-containerd.sock -containerd-binary /usr/bin/docker-containerd -runtime-root /var/run/docker/runtime-runc
29684 pts/0 Ssl+ 0:01 | | \_ registry serve /etc/docker/registry/config.yml
32673 ? Sl 0:00 | \_ docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/74e6fdbbd29e976d1c334a02558d653991fd6740a91b5796245599dd8624627a -address /var/run/docker/containerd/docker-containerd.sock -containerd-binary /usr/bin/docker-containerd -runtime-root /var/run/docker/runtime-runc
32689 pts/0 Ssl+ 0:04 | | \_ /bin/prometheus -config.file=/etc/prometheus/prometheus.yml -storage.local.path=/prometheus -web.console.libraries=/usr/share/prometheus/console_libraries -web.console.templates=/usr/share/prometheus/consoles
357 ? Sl 0:00 | \_ docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/38d837585be68b658fdc678c80f5add34556e6b883a4f2125e9ad4fa8f915496 -address /var/run/docker/containerd/docker-containerd.sock -containerd-binary /usr/bin/docker-containerd -runtime-root /var/run/docker/runtime-runc
373 pts/0 Ssl+ 0:01 | | \_ /usr/sbin/grafana-server --homepath=/usr/share/grafana --config=/etc/grafana/grafana.ini cfg:default.log.mode=console cfg:default.paths.data=/var/lib/grafana cfg:default.paths.logs=/var/log/grafana cfg:default.paths.plugins=/var/lib/grafana/plugins
724 ? Sl 0:00 | \_ docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/a1f7009ce53b17f31e6f84a1920ea770612dd15f030f1c561a31dd1c9a93ea78 -address /var/run/docker/containerd/docker-containerd.sock -containerd-binary /usr/bin/docker-containerd -runtime-root /var/run/docker/runtime-runc
749 pts/0 Ssl+ 0:01 | | \_ /bin/node_exporter -collector.procfs /host/proc -collector.sysfs /host/sys -collector.filesystem.ignored-mount-points ^/(sys|proc|dev|host|etc)($|/)
858 ? Sl 0:00 | \_ docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/5a8ef16cc491583903aadc24ce282261e7821284b461cc6fa9b108f15846fdef -address /var/run/docker/containerd/docker-containerd.sock -containerd-binary /usr/bin/docker-containerd -runtime-root /var/run/docker/runtime-runc
874 pts/0 Ssl+ 0:20 | \_ /usr/bin/cadvisor -logtostderr -port 8090
29661 ? Sl 0:00 \_ /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 5000 -container-ip 172.17.0.2 -container-port 5000
- The Docker process tree therefore has this structure:
/usr/bin/dockerd ......
 \_ docker-containerd ......
 |   \_ docker-containerd-shim ......
 |   |   \_ application process (ENTRYPOINT+CMD)
 |   \_ docker-containerd-shim ......
 |   |   \_ app
 |   ......
 \_ /usr/bin/docker-proxy ......
 \_ /usr/bin/docker-proxy ......
 ......
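Since each shim carries its container ID as the last component of the -workdir path, the tree above can be reduced to a compact shim-to-container table. A minimal sketch, assuming the argument layout shown in the ps output in this article; shim_containers is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: reads "PID PPID ARGS..." lines on stdin and prints
# "PID PPID SHORT_ID" for each containerd-shim process, where SHORT_ID is
# the first 12 characters of the last path component of its -workdir value.
shim_containers() {
  awk '$3 ~ /containerd-shim$/ {
    for (i = 4; i < NF; i++) {
      if ($i == "-workdir") {
        n = split($(i + 1), parts, "/")
        print $1, $2, substr(parts[n], 1, 12)
      }
    }
  }'
}

# Usage on a live host:
#   ps -e -o pid= -o ppid= -o args= | shim_containers
```

The short ID matches what docker ps prints in its CONTAINER ID column, so a row with PPID 1 points directly at the container whose shim has been orphaned.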
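Summary
From the investigation above: the parent of dockerd is systemd (PID 1), and its children include docker-containerd and the docker-proxy processes, where docker-proxy handles port mapping for containers. docker-containerd-shim is a child of docker-containerd, and each container has exactly one docker-containerd-shim process. I also found that 耗子叔 ran into a similar problem; his article mentions that when systemctl stop docker hangs, pressing Ctrl+C can also turn container processes into orphans.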