A failure caused by command permissions in a keepalived check script

Two IBM MQ servers are set up for high availability (HA) with keepalived.

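While configuring the check script, I found that it seemed to misbehave: as soon as a check ran, the keepalived service was stopped automatically.
The check script is shown below (it is rough and only meant to illustrate the problem):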

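$ cat chk-mq.sh
#!/bin/bash

# xxx-qm is the queue manager name; the second field of the dspmq output is STATUS(...)
qm_sta=$(/opt/mqm/bin/dspmq -m xxx-qm | awk '{print $2}')
# Quote the variable so that empty dspmq output does not break the test
if [ "$qm_sta" != "STATUS(Running)" ]; then
    # Not running: try to start the queue manager, then give it a few seconds
    /opt/mqm/bin/strmqm xxx-qm
    sleep 3

    # Still not running: stop keepalived on this node
    if [ "$(/opt/mqm/bin/dspmq -m xxx-qm | awk '{print $2}')" != "STATUS(Running)" ]; then
        systemctl stop keepalived
    fi
fi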
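
The status test in the script depends on how the dspmq output line is split into fields. A minimal sketch of an alternative is shown here: it extracts the text inside STATUS(...) with sed, so that multi-word states such as "Ended immediately" are kept whole. The function name qm_status and the sample line are illustrative assumptions and not part of the original script; the sample stands in for real dspmq output so the snippet runs without MQ installed.

```shell
#!/bin/bash
# Hypothetical helper: read one line of dspmq output on stdin and print
# the text found inside STATUS(...). Prints nothing if there is no match.
qm_status() {
    sed -n 's/.*STATUS(\([^)]*\)).*/\1/p'
}

# Sample line in the shape dspmq prints; used here instead of a real call.
sample='QMNAME(xxx-qm)                STATUS(Ended immediately)'
status=$(printf '%s\n' "$sample" | qm_status)

# Quoting "$status" keeps the test valid even when the output is empty.
if [ "$status" != "Running" ]; then
    echo "queue manager not running: $status"
fi
```

In a real check the sample line would be replaced by the output of /opt/mqm/bin/dspmq -m for the queue manager in question.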

The log shows the following:

$ tail -f /var/log/messages
......
Jul 21 04:02:53 atms3-ibmmq00 Keepalived_healthcheckers[14109]: Opening file '/etc/keepalived/keepalived.conf'.
Jul 21 04:02:53 atms3-ibmmq00 Keepalived_vrrp[14110]: Registering Kernel netlink reflector
Jul 21 04:02:53 atms3-ibmmq00 Keepalived_vrrp[14110]: Registering Kernel netlink command channel
Jul 21 04:02:53 atms3-ibmmq00 Keepalived_vrrp[14110]: Registering gratuitous ARP shared channel
Jul 21 04:02:53 atms3-ibmmq00 Keepalived_vrrp[14110]: Opening file '/etc/keepalived/keepalived.conf'.
Jul 21 04:02:53 atms3-ibmmq00 Keepalived_vrrp[14110]: VRRP_Instance(VI_1) removing protocol VIPs.
Jul 21 04:02:53 atms3-ibmmq00 Keepalived_vrrp[14110]: SECURITY VIOLATION - scripts are being executed but script_security not enabled.
Jul 21 04:02:53 atms3-ibmmq00 Keepalived_vrrp[14110]: Using LinkWatch kernel netlink reflector...
Jul 21 04:02:53 atms3-ibmmq00 Keepalived_vrrp[14110]: VRRP_Instance(VI_1) Entering BACKUP STATE
Jul 21 04:02:53 atms3-ibmmq00 Keepalived_vrrp[14110]: VRRP sockpool: [ifindex(2), proto(112), unicast(0), fd(10,11)]
Jul 21 04:02:54 atms3-ibmmq00 Keepalived_vrrp[14110]: VRRP_Instance(VI_1) Changing effective priority from 100 to 80
Jul 21 04:02:56 atms3-ibmmq00 Keepalived[14108]: Stopping
Jul 21 04:02:56 atms3-ibmmq00 systemd: Stopping LVS and VRRP High Availability Monitor...
Jul 21 04:02:56 atms3-ibmmq00 Keepalived_healthcheckers[14109]: Stopped
Jul 21 04:02:57 atms3-ibmmq00 Keepalived_vrrp[14110]: Stopped
Jul 21 04:02:57 atms3-ibmmq00 Keepalived[14108]: Stopped Keepalived v1.3.5 (03/19,2017), git commit v1.3.5-6-g6fa32f2
Jul 21 04:02:57 atms3-ibmmq00 systemd: Stopped LVS and VRRP High Availability Monitor.
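
The "SECURITY VIOLATION" line in this log is a warning about script security, not the reason keepalived stopped. As far as I understand the keepalived.conf man page, it can be addressed in global_defs roughly as follows. This fragment is a sketch and is not taken from the original configuration; running the script as root is an assumption based on the fact that it calls systemctl stop keepalived.

```
global_defs {
    # Assumption: the check script needs root because it stops keepalived.
    script_user root
    # With this option keepalived is expected to refuse scripts run as root
    # that are writable by non-root users, so check the file permissions.
    enable_script_security
}
```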

However, running the check script directly by hand works: it starts the stopped queue manager xxx-qm.

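After a long debugging session, I appended output redirection to the start command "/opt/mqm/bin/strmqm xxx-qm": /opt/mqm/bin/strmqm hacdm >> /tmp/ibmmq.log 2>&1
I then restarted the keepalived service and checked /tmp/ibmmq.log, which contained: AMQ7077E: You are not authorized to perform the requested operation.
So the problem was the permissions under which strmqm was executed. After changing the command to sudo -u mqm /opt/mqm/bin/strmqm hacdm, the check script worked normally.
The dspmq command, by contrast, had no permission problem. Presumably keepalived ran the script as a user that is not allowed to start the queue manager, whereas my manual runs were done as a user that is.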
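
The redirection trick above can be wrapped in a small helper so that every command the check script runs leaves a trace. This is only a sketch: run_logged is a hypothetical name, the log file is a temporary file, and a stand-in command that fails with the same message replaces the real strmqm call so the snippet runs anywhere.

```shell
#!/bin/bash
# Hypothetical helper: run a command, append its stdout and stderr to a log
# file under a timestamped header, and return the command's exit status.
run_logged() {
    local log=$1
    shift
    {
        printf '[%s] running: %s\n' "$(date '+%F %T')" "$*"
        "$@"
    } >> "$log" 2>&1
}

log=$(mktemp)
# Stand-in for a failing start command. In the real script this would be
# something like: run_logged /tmp/ibmmq.log sudo -u mqm /opt/mqm/bin/strmqm <name>
run_logged "$log" sh -c 'echo "AMQ7077E: You are not authorized to perform the requested operation." >&2; exit 1'
rc=$?
echo "exit status: $rc"
cat "$log"
```

Because the helper returns the exit status of the wrapped command, the calling script can still branch on success or failure while the log keeps the error text that keepalived would otherwise discard.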

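Summary: when configuring check scripts, or any other kind of script, for the keepalived service:
1. If the script calls or starts programs or services that belong to another user, pay attention to the permissions those commands run with.
2. When you suspect a command is failing, redirect its output to a log file to confirm it.
3. After calling a start-type command, leave a short buffer before checking the result. Keep that buffer shorter than the interval (check interval) parameter in keepalived.conf, so that the script returns before keepalived's time limit for it runs out.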
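
For the buffer time after a start command, one option is to poll with an upper bound instead of sleeping for a fixed period. The sketch below is an illustration under assumptions: wait_until is a hypothetical helper, the 3-second limit is an arbitrary demo value that would have to be chosen with the keepalived interval in mind, and a marker file stands in for the real "is the queue manager running" test.

```shell
#!/bin/bash
# Hypothetical helper: retry a command once per second until it succeeds or
# the given number of seconds has elapsed. Returns 0 on success, 1 on timeout.
wait_until() {
    local timeout=$1
    shift
    local waited=0
    until "$@"; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Demo with a stand-in condition: a marker file that appears after one second.
marker=$(mktemp -u)
( sleep 1; touch "$marker" ) &
if wait_until 3 test -e "$marker"; then
    echo "condition met"
else
    echo "timed out"
fi
rm -f "$marker"
```

Compared with a fixed sleep, this returns as soon as the condition holds and still gives up after a known maximum time.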
