现象:
主机安装了frp软件,作为客户端,设置了开机自启动,但是每次开机自启动都会失败,但是开机后手动重启frpc服务则可以正常启动。
排查及处理过程
通过排查日志得知,frpc服务在NetworkManager网络服务完成启动之前进行了启动尝试,由于此时网络不通,frpc无法与服务端取得通讯,造成的启动失败。
- 自启动报错信息:
[root@appser3 ~]# systemctl status frpc
× frpc.service - frp cilent
Loaded: loaded (/etc/systemd/system/frpc.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Wed 2024-10-23 09:07:02 CST; 2min 59s ago
Process: 1039 ExecStart=/soft/frp_0.61.0/frpc -c /soft/frp_0.61.0/frpc.toml (code=exited, status=1/FAILURE)
Main PID: 1039 (code=exited, status=1/FAILURE)
Oct 23 09:07:02 appser3 systemd[1]: Started frp cilent.
Oct 23 09:07:02 appser3 frpc[1039]: 2024-10-23 09:07:02.385 [I] [sub/root.go:142] start frpc service for config file [/soft/frp_0.61.0/frpc.toml]
Oct 23 09:07:02 appser3 frpc[1039]: 2024-10-23 09:07:02.386 [I] [client/service.go:295] try to connect to server...
Oct 23 09:07:02 appser3 frpc[1039]: 2024-10-23 09:07:02.386 [W] [client/service.go:298] connect to server error: dial tcp 192.168.0.10:7000: connect: network is unreachable
Oct 23 09:07:02 appser3 frpc[1039]: 2024-10-23 09:07:02.386 [I] [sub/root.go:160] frpc service for config file [/soft/frp_0.61.0/frpc.toml] stopped
Oct 23 09:07:02 appser3 frpc[1039]: login to the server failed: dial tcp 192.168.0.10:7000: connect: network is unreachable. With loginFailExit enabled, no additional retries will be attempted
Oct 23 09:07:02 appser3 systemd[1]: frpc.service: Main process exited, code=exited, status=1/FAILURE
Oct 23 09:07:02 appser3 systemd[1]: frpc.service: Failed with result 'exit-code'.
- 手动重启服务可以恢复正常,说明跟服务端无关
[root@appser3 ~]# systemctl restart frpc
[root@appser3 ~]# systemctl status frpc
● frpc.service - frp cilent
Loaded: loaded (/etc/systemd/system/frpc.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2024-10-23 09:10:08 CST; 1s ago
Main PID: 2642 (frpc)
Tasks: 6 (limit: 408710)
Memory: 6.9M
CGroup: /system.slice/frpc.service
└─ 2642 /soft/frp_0.61.0/frpc -c /soft/frp_0.61.0/frpc.toml
Oct 23 09:10:08 appser3 systemd[1]: Started frp cilent.
Oct 23 09:10:08 appser3 frpc[2642]: 2024-10-23 09:10:08.793 [I] [sub/root.go:142] start frpc service for config file [/soft/frp_0.61.0/frpc.toml]
Oct 23 09:10:08 appser3 frpc[2642]: 2024-10-23 09:10:08.793 [I] [client/service.go:295] try to connect to server...
Oct 23 09:10:08 appser3 frpc[2642]: 2024-10-23 09:10:08.802 [I] [client/service.go:287] [bf8aaf54b463f90d] login to server success, get run id [bf8aaf54b463f90d]
Oct 23 09:10:08 appser3 frpc[2642]: 2024-10-23 09:10:08.803 [I] [proxy/proxy_manager.go:173] [bf8aaf54b463f90d] proxy added: [ssh-3]
Oct 23 09:10:08 appser3 frpc[2642]: 2024-10-23 09:10:08.804 [I] [client/control.go:168] [bf8aaf54b463f90d] [ssh-3] start proxy success
- 查看message日志查看启动过程
经过多次测试,发现frp软件会在NetworkManager服务重启过程中启动,由于网络还没准备好,造成frpc无法与frps通讯,从而启动失败。
[root@appser3 log]# more messages
…………
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.3553] hostname: hostname: using hostnamed
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.3554] hostname: hostname changed from (none) to "appser3"
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.3556] dns-mgr[0x559990cb5250]: init: dns=default rc-manager=symlink
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.3617] manager[0x559990cd6000]: rfkill: Wi-Fi hardware radio set enabled
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.3618] manager[0x559990cd6000]: rfkill: WWAN hardware radio set enabled
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.3730] Loaded device plugin: NMTeamFactory (/usr/lib64/NetworkManager/1.32.12-14.oe2203sp1/libnm-device-plugin-team.so)
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.3741] Loaded device plugin: NMAtmManager (/usr/lib64/NetworkManager/1.32.12-14.oe2203sp1/libnm-device-plugin-adsl.so)
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.3742] manager: rfkill: Wi-Fi enabled by radio killswitch; enabled by state file
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.3743] manager: rfkill: WWAN enabled by radio killswitch; enabled by state file
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.3743] manager: Networking is enabled by state file
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.3744] dhcp-init: Using DHCP client 'internal'
Oct 23 09:07:02 appser3 dbus-daemon[937]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service' requested by ':1.3' (uid=0 pid=1035 comm="/usr/sbin/Net
workManager --no-daemon ")
Oct 23 09:07:02 appser3 systemd[1]: Starting Network Manager Script Dispatcher Service...
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.3772] settings: Loaded settings plugin: ifcfg-rh ("/usr/lib64/NetworkManager/1.32.12-14.oe2203sp1/libnm-settings-plugin-ifcfg-rh.so")
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.3772] settings: Loaded settings plugin: keyfile (internal)
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.3840] device (lo): carrier: link connected
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.3843] manager: (lo): new Generic device (/org/freedesktop/NetworkManager/Devices/1)
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.3851] manager: (eth0): new Ethernet device (/org/freedesktop/NetworkManager/Devices/2)
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.3857] device (eth0): state change: unmanaged -> unavailable (reason 'managed', sys-iface-state: 'external')
Oct 23 09:07:02 appser3 frpc[1039]: #033[1;34m2024-10-23 09:07:02.385 [I] [sub/root.go:142] start frpc service for config file [/soft/frp_0.61.0/frpc.toml]
Oct 23 09:07:02 appser3 frpc[1039]: #033[0m#033[1;34m2024-10-23 09:07:02.386 [I] [client/service.go:295] try to connect to server...
Oct 23 09:07:02 appser3 frpc[1039]: #033[0m#033[1;33m2024-10-23 09:07:02.386 [W] [client/service.go:298] connect to server error: dial tcp 192.168.0.10:7000: connect: network is unreachable
Oct 23 09:07:02 appser3 frpc[1039]: #033[0m#033[1;34m2024-10-23 09:07:02.386 [I] [sub/root.go:160] frpc service for config file [/soft/frp_0.61.0/frpc.toml] stopped
Oct 23 09:07:02 appser3 frpc[1039]: #033[0mlogin to the server failed: dial tcp 192.168.0.10:7000: connect: network is unreachable. With loginFailExit enabled, no additional retries will be attempted
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.3869] device (eth0): carrier: link connected
Oct 23 09:07:02 appser3 systemd[1]: frpc.service: Main process exited, code=exited, status=1/FAILURE
Oct 23 09:07:02 appser3 systemd[1]: frpc.service: Failed with result 'exit-code'.
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.3900] device (eth0): state change: unavailable -> disconnected (reason 'none', sys-iface-state: 'managed')
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.3905] policy: auto-activating connection 'eth0' (5fb06bd0-0bb0-7ffb-45f1-d6edd65f3e03)
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.3911] device (eth0): Activation: starting connection 'eth0' (5fb06bd0-0bb0-7ffb-45f1-d6edd65f3e03)
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.3912] device (eth0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
Oct 23 09:07:02 appser3 dbus-daemon[937]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Oct 23 09:07:02 appser3 systemd[1]: Started Network Manager Script Dispatcher Service.
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.3914] manager: NetworkManager state is now CONNECTING
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.3915] device (eth0): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.3919] device (eth0): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.3923] dhcp4 (eth0): activation: beginning transaction (timeout in 45 seconds)
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.4031] dhcp4 (eth0): state changed unknown -> bound, address=192.168.0.3
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.4039] device (eth0): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.4079] device (eth0): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed')
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.4081] device (eth0): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.4084] manager: NetworkManager state is now CONNECTED_LOCAL
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.4087] manager: NetworkManager state is now CONNECTED_SITE
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.4089] policy: set 'eth0' (eth0) as default for IPv4 routing and DNS
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.4202] device (eth0): Activation: successful, device activated.
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.4206] manager: NetworkManager state is now CONNECTED_GLOBAL
Oct 23 09:07:02 appser3 NetworkManager[1035]: <info> [1729645622.4210] manager: startup complete
…………
- 修改frpc systemd启动服务脚本,延迟2秒钟启动。
[root@appser3 ~]# vi /etc/systemd/system/frpc.service
[Unit]
Description = frp cilent
After = network.target syslog.target
Wants = network.target
[Service]
Type = simple
ExecStartPre = -/bin/sleep 2s //延迟2秒钟启动服务
ExecStart = /soft/frp_0.61.0/frpc -c /soft/frp_0.61.0/frpc.toml
[Install]
WantedBy = multi-user.target
- 修改完成后重启,验证frpc服务状态
[root@appser3 ~]# systemctl status frpc
● frpc.service - frp cilent
Loaded: loaded (/etc/systemd/system/frpc.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2024-10-23 13:31:00 CST; 9min ago
Process: 1042 ExecStartPre=/bin/sleep 2s (code=exited, status=0/SUCCESS)
Main PID: 1663 (frpc)
Tasks: 7 (limit: 408710)
Memory: 18.8M
CGroup: /system.slice/frpc.service
└─ 1663 /soft/frp_0.61.0/frpc -c /soft/frp_0.61.0/frpc.toml
Oct 23 13:30:58 appser3 systemd[1]: Starting frp cilent...
Oct 23 13:31:00 appser3 systemd[1]: Started frp cilent.
Oct 23 13:31:00 appser3 frpc[1663]: 2024-10-23 13:31:00.592 [I] [sub/root.go:142] start frpc service for config file [/soft/frp_0.61.0/frpc.toml]
Oct 23 13:31:00 appser3 frpc[1663]: 2024-10-23 13:31:00.592 [I] [client/service.go:295] try to connect to server...
Oct 23 13:31:00 appser3 frpc[1663]: 2024-10-23 13:31:00.599 [I] [client/service.go:287] [c402a336e72a6ab6] login to server success, get run id [c402a336e72a6ab6]
Oct 23 13:31:00 appser3 frpc[1663]: 2024-10-23 13:31:00.599 [I] [proxy/proxy_manager.go:173] [c402a336e72a6ab6] proxy added: [ssh]
Oct 23 13:31:00 appser3 frpc[1663]: 2024-10-23 13:31:00.600 [I] [client/control.go:168] [c402a336e72a6ab6] [ssh] start proxy success
- 查看系统启动日志-messages
[root@appser3 ~]# tail -n 600 /var/log/messages | more
…………
Oct 23 13:30:58 appser3 NetworkManager[1038]: <info> [1729661458.6366] settings: Loaded settings plugin: ifcfg-rh ("/usr/lib64/NetworkManager/1.32.12-14.oe2203sp1/libnm-settings-plugin-ifcfg-rh.so")
Oct 23 13:30:58 appser3 NetworkManager[1038]: <info> [1729661458.6367] settings: Loaded settings plugin: keyfile (internal)
Oct 23 13:30:58 appser3 NetworkManager[1038]: <info> [1729661458.6426] device (lo): carrier: link connected
Oct 23 13:30:58 appser3 NetworkManager[1038]: <info> [1729661458.6429] manager: (lo): new Generic device (/org/freedesktop/NetworkManager/Devices/1)
Oct 23 13:30:58 appser3 NetworkManager[1038]: <info> [1729661458.6436] manager: (eth0): new Ethernet device (/org/freedesktop/NetworkManager/Devices/2)
Oct 23 13:30:58 appser3 NetworkManager[1038]: <info> [1729661458.6443] device (eth0): state change: unmanaged -> unavailable (reason 'managed', sys-iface-state: 'external')
Oct 23 13:30:58 appser3 NetworkManager[1038]: <info> [1729661458.6453] device (eth0): carrier: link connected
Oct 23 13:30:58 appser3 NetworkManager[1038]: <info> [1729661458.6477] device (eth0): state change: unavailable -> disconnected (reason 'none', sys-iface-state: 'managed')
Oct 23 13:30:58 appser3 NetworkManager[1038]: <info> [1729661458.6482] policy: auto-activating connection 'eth0' (5fb06bd0-0bb0-7ffb-45f1-d6edd65f3e03)
Oct 23 13:30:58 appser3 NetworkManager[1038]: <info> [1729661458.6487] device (eth0): Activation: starting connection 'eth0' (5fb06bd0-0bb0-7ffb-45f1-d6edd65f3e03)
Oct 23 13:30:58 appser3 NetworkManager[1038]: <info> [1729661458.6488] device (eth0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
Oct 23 13:30:58 appser3 NetworkManager[1038]: <info> [1729661458.6490] manager: NetworkManager state is now CONNECTING
Oct 23 13:30:58 appser3 NetworkManager[1038]: <info> [1729661458.6492] device (eth0): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')
Oct 23 13:30:58 appser3 dbus-daemon[944]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Oct 23 13:30:58 appser3 NetworkManager[1038]: <info> [1729661458.6498] device (eth0): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')
Oct 23 13:30:58 appser3 systemd[1]: Started Network Manager Script Dispatcher Service.
Oct 23 13:30:58 appser3 NetworkManager[1038]: <info> [1729661458.6502] dhcp4 (eth0): activation: beginning transaction (timeout in 45 seconds)
Oct 23 13:30:58 appser3 NetworkManager[1038]: <info> [1729661458.6610] dhcp4 (eth0): state changed unknown -> bound, address=192.168.0.3
Oct 23 13:30:58 appser3 NetworkManager[1038]: <info> [1729661458.6618] device (eth0): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')
Oct 23 13:30:58 appser3 NetworkManager[1038]: <info> [1729661458.6656] device (eth0): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed')
Oct 23 13:30:58 appser3 NetworkManager[1038]: <info> [1729661458.6658] device (eth0): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
Oct 23 13:30:58 appser3 NetworkManager[1038]: <info> [1729661458.6662] manager: NetworkManager state is now CONNECTED_LOCAL
Oct 23 13:30:58 appser3 NetworkManager[1038]: <info> [1729661458.6664] manager: NetworkManager state is now CONNECTED_SITE
Oct 23 13:30:58 appser3 NetworkManager[1038]: <info> [1729661458.6665] policy: set 'eth0' (eth0) as default for IPv4 routing and DNS
Oct 23 13:30:58 appser3 NetworkManager[1038]: <info> [1729661458.6781] device (eth0): Activation: successful, device activated.
Oct 23 13:30:58 appser3 NetworkManager[1038]: <info> [1729661458.6786] manager: NetworkManager state is now CONNECTED_GLOBAL
Oct 23 13:30:58 appser3 NetworkManager[1038]: <info> [1729661458.6790] manager: startup complete
…………
Oct 23 13:31:00 appser3 systemd[1]: Started frp cilent.
Oct 23 13:31:00 appser3 systemd[1]: Reached target Multi-User System.
Oct 23 13:31:00 appser3 systemd[1]: Reached target Cloud-init target.
Oct 23 13:31:00 appser3 systemd[1]: Starting Record Runlevel Change in UTMP...
Oct 23 13:31:00 appser3 systemd[1]: systemd-update-utmp-runlevel.service: Deactivated successfully.
Oct 23 13:31:00 appser3 systemd[1]: Finished Record Runlevel Change in UTMP.
Oct 23 13:31:00 appser3 systemd[1]: Startup finished in 702ms (kernel) + 5.956s (initrd) + 4.043s (userspace) = 10.703s.
Oct 23 13:31:00 appser3 frpc[1663]: #033[1;34m2024-10-23 13:31:00.592 [I] [sub/root.go:142] start frpc service for config file [/soft/frp_0.61.0/frpc.toml]
Oct 23 13:31:00 appser3 frpc[1663]: #033[0m#033[1;34m2024-10-23 13:31:00.592 [I] [client/service.go:295] try to connect to server...
Oct 23 13:31:00 appser3 frpc[1663]: #033[0m#033[1;34m2024-10-23 13:31:00.599 [I] [client/service.go:287] [c402a336e72a6ab6] login to server success, get run id [c402a336e72a6ab6]
Oct 23 13:31:00 appser3 frpc[1663]: #033[0m#033[1;34m2024-10-23 13:31:00.599 [I] [proxy/proxy_manager.go:173] [c402a336e72a6ab6] proxy added: [ssh]
Oct 23 13:31:00 appser3 frpc[1663]: #033[0m#033[1;34m2024-10-23 13:31:00.600 [I] [client/control.go:168] [c402a336e72a6ab6] [ssh] start proxy success
…………
由上面日志可见,frpc是在NetworkManager重启完成后再进行的启动,至此问题得到解决。