随着微服务的普及,系统规模也由原来的单机运行转换到集群模式,小则三五台,大则成百上千。如果应用的更新仍由人力操作,则无疑延长了整个升级流程。博主在近期项目也遇到了类似的问题。以前的客户都配备了devOps平台,整个升级操作都屏蔽人物操作。奈何这次的客户只负责掏几台机器,并没有配备相关的运维支持。考虑到时间及成本,我们决定使用shell脚本实现集群应用的升级。
我们这次使用了9台机器,两台机器负责负载均衡,域名实现负载访问,两台机器作为服务消费者,三台机器作为服务提供者,两台文件服务器。因为生产者/消费者的机器都会挂在文件服务器,我们决定在其中文件服务器上面建立一个发布目录,用shell脚本批量ssh各台服务器的子脚本,完成服务器的自动化升级。
因为子脚本大同小异,这里仅拿其中一个脚本为例:
#!/bin/bash
# author juque
# date 2022-11-11 16:33:00
GHOME=/data
APP_HOME=$GHOME/app/provider
PUBLISH_HOME=/sharedata/publish
APP_FILE=systemservice.jar
# 进入安装目录
cd $APP_HOME || exists
# 备份文件
if [[ -f $APP_FILE ]]; then
echo "备份程序文件..."
mv $APP_FILE backup/$APP_FILE.bak_"$(date +'%Y%m%d%H%M%S')"
else
echo "无需备份程序文件"
fi
# 拷贝安装包到当前目录
if [[ -f "$PUBLISH_HOME/$APP_FILE" ]]; then
echo "检测到文件:${PUBLISH_HOME}/${APP_FILE}"
else
echo "未能检测到文件,中止执行:${PUBLISH_HOME}/${APP_FILE}"
exit 1
fi
echo "开始拷贝应用: ${APP_FILE}..."
cp $PUBLISH_HOME/$APP_FILE .
LOG_FILE=${APP_HOME}/logs/nohup-p-"$(date +'%Y-%m-%d')".out
# 检查日志文件是否存在,存在则清空
if [[ -f "$LOG_FILE" ]]; then
echo "nohup文件已存在,即将清空文件"
cat /dev/null > "$LOG_FILE"
fi
# 启动应用
./start.sh restart
# 检查应用是否正常启动
COUNT=0
FLAG=0
while [[ (-f "$LOG_FILE") && "$COUNT" -lt 7 ]];
do
echo "正在检查服务状态"
GREP_CMD=$(grep -A 1 "Started SystemserviceApplication" "${LOG_FILE}")
if [[ -n $GREP_CMD ]]; then
echo "${GREP_CMD}"
FLAG=1
echo "start successful..."
break
else
sleep 5
((COUNT++))
fi
done
if [[ $FLAG == 0 ]]; then
echo "start fail..."
fi
注意以下要点:
- 启动应用的步骤(./start.sh restart)可以替换成自己应用的启动方式
- 检查应用是否正常启动的步骤是根据实际情况估算出来的轮询时长,博主的应用启动大概30s左右,所以估算了时间是35s。
- 文件备份的步骤可以根据实际情况删减,但是经常备份是个好习惯。
- 子脚本一般和应用包放在同一个目录,可以省去很多目录的操作;
下面是调度脚本:
#!/bin/bash
USER=juque
BHOME=/data/
PROVIDER_HOST=("192.168.132.24" "192.168.132.6" "192.168.132.48")
CUSUMER_HOST=("192.168.132.37" "192.168.132.42")
WEB_HOST=("192.168.132.25" "192.168.132.29")
# 执行Systemservice安装
installProvider() {
echo "provider begin..."
for HOST in "${PROVIDER_HOST[@]}"
do
read -p "Update ${HOST}? (Y/N)" flg
if [[ "Y" == "$flg" ]] || [[ "y" == "$flg" ]]; then
ssh ${USER}@${HOST} "source ~/.bash_profile && cd ${BHOME}/provider/ && sh install_p.sh"
else
echo "read ${HOST} fail..."
fi
done
}
#执行Conusmer端安装
installCustomer() {
echo "customer begin..."
for HOST in "${CUSUMER_HOST[@]}"
do
read -p "Update ${HOST}? (Y/N)" flg
if [[ "Y" == "$flg" ]] || [[ "y" == "$flg" ]]; then
ssh ${USER}@${HOST} "source ~/.bash_profile && cd ${BHOME}/customer/ && sh install_c.sh"
else
echo "read ${HOST} fail..."
fi
done
}
# 安装web
installWeb() {
echo "web begin..."
for HOST in "${WEB_HOST[@]}"
do
read -p "Update ${HOST}? (Y/N)" flg
if [[ "Y" == "$flg" ]] || [[ "y" == "$flg" ]]; then
scp ${BHOME}"/sharedata/publish/dist.zip" ${USER}@${HOST}":/${BHOME}/"
ssh ${USER}@${HOST} "cd ${BHOME}/app/ && sh install_web.sh"
else
echo "read ${HOST} fail..."
fi
done
}
ACTION=${1}
if [ "$ACTION" == "provider" ]; then
installProvider
elif [ "$ACTION" == "customer" ]; then
installCustomer
elif [ "$ACTION" == "web" ]; then
installWeb
elif [ "$ACTION" == "all" ]; then
installProvider
installCustomer
installWeb
else
echo "please input any arg:[provider, customer, web, all]"
fi
- 其中的ssh、scp指令会提示输入密码,交互很不友好,可以使用export处理下。