Standalone etcd Upgrade
To upgrade from v3.4.20 to v3.5.7 it is enough to replace the etcd binaries with the new version, but back up the data first just in case.
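Before touching anything, it can also help to record the current version and confirm the endpoint is healthy (a suggested pre-check; these commands were not part of the original run):
./etcd --version
./etcdctl endpoint health
./etcdctl endpoint status -w=table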
Before upgrading, write some test data so it can be verified after the upgrade.
[root@zhouyu etcd-v3.4.20-linux-amd64]# ./etcdctl put /test/version 3.4.20
OK
[root@zhouyu etcd-v3.4.20-linux-amd64]# ./etcdctl put /test/time 20230222-16:14
OK
[root@zhouyu etcd-v3.4.20-linux-amd64]# ./etcdctl get --prefix /
/test/time
20230222-16:14
/test/version
3.4.20
1. Back up the data first, just in case
[root@zhouyu etcd-v3.4.20-linux-amd64]# ./etcdctl snapshot save backup3-4-20.db
{"level":"info","ts":1677053760.5306118,"caller":"snapshot/v3_snapshot.go:119","msg":"created temporary db file","path":"backup3-4-20.db.part"}
2. Stop the etcd service
kill -9 $(pgrep etcd)
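A plain SIGTERM is usually preferable to SIGKILL here, since etcd shuts down cleanly on SIGTERM; something like:
kill $(pgrep etcd)   # graceful shutdown; fall back to kill -9 only if the process does not exit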
3. Replace the binaries with the new version
[root@zhouyu etcd-v3.4.20-linux-amd64]# cp /tmp/etcd-download-test/etcd* ./
4. Start the new etcd and check the data and version
[root@zhouyu etcd]# ./start.sh
[root@zhouyu etcd]# cd etcd-v3.4.20-linux-amd64/
[root@zhouyu etcd-v3.4.20-linux-amd64]# ./etcdctl get --prefix /
/test/time
20230222-16:14
/test/version
3.4.20
[root@zhouyu etcd-v3.4.20-linux-amd64]# ./etcdctl version
etcdctl version: 3.5.7
API version: 3.5
[root@zhouyu etcd-v3.4.20-linux-amd64]# ./etcd --version
etcd Version: 3.5.7
Git SHA: 215b53cf3
Go Version: go1.17.13
Go OS/Arch: linux/amd64
The standalone upgrade succeeded.
Scaling etcd Out to a Cluster
Plan three instances as follows (all on the same host, distinguished by port):

name | client_url | peer_url |
---|---|---|
instance1 | http://0.0.0.0:2379 | http://0.0.0.0:2380 |
instance2 | http://0.0.0.0:2479 | http://0.0.0.0:2480 |
instance3 | http://0.0.0.0:2579 | http://0.0.0.0:2580 |
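Since all three instances share one host, it is worth confirming that none of the planned ports are already in use (a suggested check, assuming `ss` is available):
ss -lntp | grep -E '2379|2380|2479|2480|2579|2580'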
The current directory layout, etcd config file, start script, and stop script are as follows:
[root@master(106.210) /homed/etcd]# ll
drwx------ 3 root root 4096 Feb 22 16:35 data
drwxr-xr-x 3 528287 89939 4096 Jan 20 18:16 etcd-v3.5.7-linux-amd64
-rw-r--r-- 1 root root 18458320 Feb 23 09:15 etcd-v3.5.7-linux-amd64.tar.gz
-rw-r--r-- 1 root root 791 Jan 3 11:18 etcd.yml
drwxr-xr-x 2 root root 4096 Nov 14 11:57 log
-rwxrwxrwx 1 root root 87 Nov 14 11:56 start.sh
-rwxrwxrwx 1 root root 22 Oct 12 11:15 stop.sh
[root@master(106.210) /homed/etcd]# cat start.sh
nohup ./etcd-v3.5.7-linux-amd64/etcd --config-file=etcd.yml > ./log/per.log 2>&1 &
[root@master(106.210) /homed/etcd]# cat stop.sh
kill -9 $(pgrep etcd)
[root@master(106.210) /homed/etcd]# cat etcd.yml
name: 'etcd-cfgcenter' # node name
data-dir: '/homed/etcd/data' # data directory
listen-client-urls: 'http://0.0.0.0:2379' # listen addresses in the form scheme://IP:port; multiple entries can be comma-separated; with http://0.0.0.0:2379 client access is not restricted to a specific address
advertise-client-urls: 'http://0.0.0.0:2379' # client URLs advertised to other etcd nodes for reaching this node; usually a subset of listen-client-urls, and may contain domain names
enable-v2: true
log-level: debug
logger: 'zap'
log-outputs: ['./log/ectd.log',]
auto-compaction-mode: periodic
auto-compaction-retention: 60m
# set the backend space quota to $((6*1024*1024*1024)) bytes; the default is 2 GB, the maximum 8 GB
quota-backend-bytes: 6442450944
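The quota value can be double-checked in the shell; 6 GiB is:
echo $((6*1024*1024*1024))   # 6442450944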
1. Back up the standalone instance's data first
[root@master(106.210) /homed/etcd/etcd-v3.5.7-linux-amd64]# ./etcdctl snapshot save backup_20230223.db
{"level":"info","ts":1677115055.0267951,"caller":"snapshot/v3_snapshot.go:119","msg":"created temporary db file","path":"backup_20230223.db.part"}
{"level":"info","ts":"2023-02-23T09:17:35.029+0800","caller":"clientv3/maintenance.go:200","msg":"opened snapshot stream; downloading"}
{"level":"info","ts":1677115055.0295746,"caller":"snapshot/v3_snapshot.go:127","msg":"fetching snapshot","endpoint":"127.0.0.1:2379"}
{"level":"info","ts":"2023-02-23T09:18:01.641+0800","caller":"clientv3/maintenance.go:208","msg":"completed snapshot read; closing"}
{"level":"info","ts":1677115088.5754817,"caller":"snapshot/v3_snapshot.go:142","msg":"fetched snapshot","endpoint":"127.0.0.1:2379","size":"2.1 GB","took":33.548306184}
{"level":"info","ts":1677115088.5756047,"caller":"snapshot/v3_snapshot.go:152","msg":"saved","path":"backup_20230223.db"}
Snapshot saved at backup_20230223.db
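As with the earlier backup, this snapshot can be verified before the migration continues; the v3.5 tarball also ships `etcdutl`, the non-deprecated equivalent (an optional check, not in the original run, executed from the directory where the snapshot was saved):
./etcdutl snapshot status backup_20230223.db --write-out=table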
2. Stop the standalone instance
kill -9 $(pgrep etcd)
3. Start instance1 of the new cluster
Adapt the start and stop scripts so they operate on a single instance.
[root@master(106.210) /homed/etcd]# mkdir instance1
[root@master(106.210) /homed/etcd]# cp etcd.yml instance1/
[root@master(106.210) /homed/etcd]# cd instance1/
[root@master(106.210) /homed/etcd/instance1]# cd ..
[root@master(106.210) /homed/etcd]# vim start_instance.sh
[root@master(106.210) /homed/etcd]# cat start_instance.sh
CUR_DIR=$(dirname $(readlink -f "$0"))
ETCD=$CUR_DIR"/etcd"
if [ -z "$1" ]; then
    echo "Please specify the instance to start"
else
    echo "Starting instance: $1"
    INSTANCE_DIR=$CUR_DIR"/$1"
    if [ ! -d "$INSTANCE_DIR" ]; then
        echo "Directory $INSTANCE_DIR does not exist"
        exit 1
    fi
    PRO_PID=$(ps aux | grep $1 | grep -v grep | awk '{print $2}')
    if [[ $PRO_PID ]]; then
        echo "Process already running: pid=$PRO_PID"
        exit 3
    fi
    CFG_FILE=$INSTANCE_DIR"/etcd.yml"
    if [ ! -f "$CFG_FILE" ]; then
        echo "Config file $CFG_FILE does not exist"
        exit 2
    fi
    echo "Config file: $CFG_FILE"
    DATA_DIR=$INSTANCE_DIR"/data"
    if [ ! -d "$DATA_DIR" ]; then
        echo "First start, creating data directory $DATA_DIR"
        mkdir $DATA_DIR
        chmod 700 $DATA_DIR
    fi
    LOG_DIR=$INSTANCE_DIR"/log"
    if [ ! -d "$LOG_DIR" ]; then
        echo "Creating log directory $LOG_DIR"
        mkdir $LOG_DIR
    fi
    START_LOG=$LOG_DIR"/start.log"
    echo "Startup log path: $START_LOG"
    nohup $ETCD --config-file=$CFG_FILE > $START_LOG 2>&1 &
    #$ETCD --config-file=$CFG_FILE > $START_LOG
fi
[root@master(106.210) /homed/etcd]# cat stop_instance.sh
if [ -z "$1" ]; then
echo "请输入需要停止的进程名"
else
echo "停止的进程: $1"
ps -ef |grep $1 | grep -v grep | awk '{print$2}' | xargs kill -9
fi
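One caveat with this stop script: if no matching process is found, `xargs` still runs `kill` once with no arguments, and `kill` prints its usage text (this shows up later in the transcript). A slightly safer sketch, assuming GNU xargs, or using `pkill` instead:
ps -ef | grep "$1" | grep -v grep | awk '{print $2}' | xargs -r kill -9
# or simply:
pkill -9 -f "$1"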
[root@master(106.210) /homed/etcd]# chmod 777 *.sh
[root@master(106.210) /homed/etcd]# ll
total 2099760
-rw------- 1 root root 2116444192 Feb 23 09:18 backup_20230223.db
drwx------ 3 root root 4096 Feb 22 16:35 data
drwxr-xr-x 3 528287 89939 4096 Jan 20 18:16 etcd-v3.5.7-linux-amd64
-rw-r--r-- 1 root root 18458320 Feb 23 09:15 etcd-v3.5.7-linux-amd64.tar.gz
-rw-r--r-- 1 root root 791 Jan 3 11:18 etcd.yml
drwxr-xr-x 2 root root 4096 Feb 23 09:25 instance1
drwxr-xr-x 2 root root 4096 Nov 14 11:57 log
-rwxrwxrwx 1 root root 1154 Feb 23 09:26 start_instance.sh
-rwxrwxrwx 1 root root 86 Feb 23 09:21 start.sh
-rwxrwxrwx 1 root root 201 Feb 23 09:26 stop_instance.sh
-rwxrwxrwx 1 root root 22 Oct 12 11:15 stop.sh
Adjust instance1's configuration, adding the cluster settings:
[root@master(106.210) /homed/etcd]# cat instance1/etcd.yml
name: 'instance1' # node name
data-dir: '/homed/etcd/instance1/data' # data directory
listen-client-urls: 'http://0.0.0.0:2379' # listen addresses in the form scheme://IP:port; multiple entries can be comma-separated; with http://0.0.0.0:2379 client access is not restricted to a specific address
advertise-client-urls: 'http://0.0.0.0:2379' # client URLs advertised to other etcd nodes for reaching this node; usually a subset of listen-client-urls, and may contain domain names
enable-v2: true
log-level: debug
logger: 'zap'
log-outputs: ['/homed/etcd/instance1/log/ectd.log',]
auto-compaction-mode: periodic
auto-compaction-retention: 60m
# set the backend space quota to $((6*1024*1024*1024)) bytes; the default is 2 GB, the maximum 8 GB
quota-backend-bytes: 6442450944
# additional settings added for the cluster expansion
listen-peer-urls: 'http://0.0.0.0:2380'
#initial-cluster: 'instance1=http://0.0.0.0:2380'
initial-advertise-peer-urls: 'http://0.0.0.0:2380'
initial-cluster-token: 'etcd-cluster'
initial-cluster: "instance1=http://localhost:2380"
initial-cluster-state: 'new'
force-new-cluster: true
Start instance1 and check that it is running.
[root@master(106.210) /homed/etcd]# ./start_instance.sh instance1
Starting instance: instance1
Config file: /r2/homed/etcd/instance1/etcd.yml
First start, creating data directory /r2/homed/etcd/instance1/data
Creating log directory /r2/homed/etcd/instance1/log
[root@master(106.210) /homed/etcd]# ./etcdctl member list
8e9e05c52164694d, started, etcd-cfgcenter, http://localhost:2380, http://0.0.0.0:2379, false
4. Restore the backup into instance1
[root@master(106.210) /homed/etcd]# ./etcdctl snapshot restore ./backup_20230223.db --data-dir="/homed/etcd/instance1/data/"
Deprecated: Use `etcdutl snapshot restore` instead.
Error: data-dir "/homed/etcd/instance1/data/" not empty or could not be read
The error says the data directory is not empty or cannot be read, so delete the directory and run the restore again.
[root@master(106.210) /homed/etcd]# rm -rf ./instance1/data
[root@master(106.210) /homed/etcd]# ./etcdctl snapshot restore ./backup_20230223.db --data-dir="/homed/etcd/instance1/data/"
Deprecated: Use `etcdutl snapshot restore` instead.
2023-02-23T10:15:47+08:00 info snapshot/v3_snapshot.go:248 restoring snapshot {"path": "./backup_20230223.db", "wal-dir": "/homed/etcd/instance1/data/member/wal", "data-dir": "/homed/etcd/instance1/data/", "snap-dir": "/homed/etcd/instance1/data/member/snap", "stack": "go.etcd.io/etcd/etcdutl/v3/snapshot.(*v3Manager).Restore\n\tgo.etcd.io/etcd/etcdutl/v3@v3.5.7/snapshot/v3_snapshot.go:254\ngo.etcd.io/etcd/etcdutl/v3/etcdutl.SnapshotRestoreCommandFunc\n\tgo.etcd.io/etcd/etcdutl/v3@v3.5.7/etcdutl/snapshot_command.go:147\ngo.etcd.io/etcd/etcdctl/v3/ctlv3/command.snapshotRestoreCommandFunc\n\tgo.etcd.io/etcd/etcdctl/v3/ctlv3/command/snapshot_command.go:129\ngithub.com/spf13/cobra.(*Command).execute\n\tgithub.com/spf13/cobra@v1.1.3/command.go:856\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\tgithub.com/spf13/cobra@v1.1.3/command.go:960\ngithub.com/spf13/cobra.(*Command).Execute\n\tgithub.com/spf13/cobra@v1.1.3/command.go:897\ngo.etcd.io/etcd/etcdctl/v3/ctlv3.Start\n\tgo.etcd.io/etcd/etcdctl/v3/ctlv3/ctl.go:107\ngo.etcd.io/etcd/etcdctl/v3/ctlv3.MustStart\n\tgo.etcd.io/etcd/etcdctl/v3/ctlv3/ctl.go:111\nmain.main\n\tgo.etcd.io/etcd/etcdctl/v3/main.go:59\nruntime.main\n\truntime/proc.go:255"}
2023-02-23T10:16:30+08:00 info membership/store.go:141 Trimming membership information from the backend...
2023-02-23T10:16:31+08:00 info membership/cluster.go:421 added member {"cluster-id": "cdf818194e3a8c32", "local-member-id": "0", "added-peer-id": "8e9e05c52164694d", "added-peer-peer-urls": ["http://localhost:2380"]}
2023-02-23T10:16:31+08:00 info snapshot/v3_snapshot.go:269 restored snapshot {"path": "./backup_20230223.db", "wal-dir": "/homed/etcd/instance1/data/member/wal", "data-dir": "/homed/etcd/instance1/data/", "snap-dir": "/homed/etcd/instance1/data/member/snap"}
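As a side note, `etcdctl snapshot restore` (or `etcdutl snapshot restore`) can also rewrite the membership metadata at restore time, which avoids relying on force-new-cluster later; a hedged sketch of that alternative for this instance layout:
./etcdctl snapshot restore ./backup_20230223.db \
  --name instance1 \
  --initial-cluster 'instance1=http://localhost:2380' \
  --initial-cluster-token 'etcd-cluster' \
  --initial-advertise-peer-urls 'http://localhost:2380' \
  --data-dir /homed/etcd/instance1/data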
5. Verify that the restored data is intact
[root@master(106.210) /homed/etcd]# ./etcdctl get --prefix /homedsrv/0755-91/2
^C
The read hangs with no response, so check the logs in the log directory.
[root@master(106.210) /homed/etcd]# cd instance1/log/
[root@master(106.210) /homed/etcd/instance1/log]# ll
total 16
-rw-r--r-- 1 root root 13634 Feb 23 10:15 ectd.log
-rw-r--r-- 1 root root 0 Feb 23 10:15 start.log
[root@master(106.210) /homed/etcd/instance1/log]# tail ectd.log
{"level":"info","ts":"2023-02-23T10:15:29.213+0800","caller":"zapgrpc/zapgrpc.go:174","msg":"[core] pickfirstBalancer: UpdateSubConnState: 0xc0002be920, {READY <nil>}"}
{"level":"info","ts":"2023-02-23T10:15:29.213+0800","caller":"zapgrpc/zapgrpc.go:174","msg":"[core] Channel Connectivity change to READY"}
{"level":"debug","ts":"2023-02-23T10:15:29.216+0800","caller":"etcdserver/server.go:2142","msg":"Applying entries","num-entries":1}
{"level":"debug","ts":"2023-02-23T10:15:29.216+0800","caller":"etcdserver/server.go:2145","msg":"Applying entry","index":4,"term":2,"type":"EntryNormal"}
{"level":"debug","ts":"2023-02-23T10:15:29.216+0800","caller":"etcdserver/server.go:2204","msg":"apply entry normal","consistent-index":3,"entry-index":4,"should-applyV3":true}
{"level":"debug","ts":"2023-02-23T10:15:29.216+0800","caller":"etcdserver/server.go:2227","msg":"applyEntryNormal","V2request":"ID:112456383074158852 Method:\"PUT\" Path:\"/0/version\" Val:\"3.5.0\" "}
{"level":"info","ts":"2023-02-23T10:15:29.216+0800","caller":"membership/cluster.go:584","msg":"set initial cluster version","cluster-id":"a0d2de0531db7884","local-member-id":"1c70f9bbb41018f","cluster-version":"3.5"}
{"level":"info","ts":"2023-02-23T10:15:29.216+0800","caller":"api/capability.go:75","msg":"enabled capabilities for version","cluster-version":"3.5"}
{"level":"info","ts":"2023-02-23T10:15:29.216+0800","caller":"etcdserver/server.go:2595","msg":"cluster version is updated","cluster-version":"3.5"}
{"level":"fatal","ts":"2023-02-23T10:15:58.684+0800","caller":"etcdserver/server.go:885","msg":"failed to purge wal file","error":"open /homed/etcd/instance1/data/member/wal: no such file or directory","stacktrace":"go.etcd.io/etcd/server/v3/etcdserver.(*EtcdServer).purgeFile\n\tgo.etcd.io/etcd/server/v3/etcdserver/server.go:885\ngo.etcd.io/etcd/server/v3/etcdserver.(*EtcdServer).GoAttach.func1\n\tgo.etcd.io/etcd/server/v3/etcdserver/server.go:2754"}
The error failed to purge wal file","error":"open /homed/etcd/instance1/data/member/wal: no such file or directory" means the data directory went missing; this was most likely caused by deleting the directory before the restore while the instance was still running. Now that the restore has recreated the directory, try restarting the service.
[root@master(106.210) /homed/etcd/instance1/log]# cd ../..
[root@master(106.210) /homed/etcd]# ./stop_instance.sh instance1
Stopping process: instance1
Usage:
 kill [options] <pid|name> [...]
Options:
 -a, --all              do not restrict the name-to-pid conversion to processes with the same uid as the current process
 -s, --signal <signal>  send the specified signal
 -q, --queue <signal>   use sigqueue(2) rather than kill(2)
 -p, --pid              print pids without sending them a signal
 -l, --list [=<signal>] list signal names, or convert a signal into a name
 -L, --table            list signal names and numbers
 -h, --help             display this help and exit
 -V, --version          output version information and exit
For more details see kill(1).
kill printed its usage here because instance1 had apparently already exited after the fatal "failed to purge wal file" error above, so the grep in stop_instance.sh matched no PID and xargs invoked kill with no arguments.
[root@master(106.210) /homed/etcd]# ./start_instance.sh instance1
Starting instance: instance1
Config file: /r2/homed/etcd/instance1/etcd.yml
Startup log path: /r2/homed/etcd/instance1/log/start.log
[root@master(106.210) /homed/etcd]# ./etcdctl get --prefix /homedsrv/0755-91/2
/homedsrv/0755-91/20900/attr
{"service_name": "ipys", "multi_instance": true, "net_type": 0, "api_type": 0, "loadbalance": 1}
/homedsrv/0755-91/22900/attr
{"service_name": "iaps", "multi_instance": true, "net_type": 0, "api_type": 0, "loadbalance": 1}
/homedsrv/0755-91/24900/attr
{"service_name": "iepgs", "multi_instance": true, "net_type": 0, "api_type": 0, "loadbalance": 1}
/homedsrv/0755-91/24901/attr
{"service_name": "iepgs", "multi_instance": true, "net_type": 0, "api_type": 0, "loadbalance": 1}
/homedsrv/0755-91/26900/attr
{"service_name": "iacs", "multi_instance": true, "net_type": 0, "api_type": 0, "loadbalance": 1}
/homedsrv/0755-91/26901/attr
{"service_name": "iacs", "multi_instance": true, "net_type": 0, "api_type": 0, "loadbalance": 1}
/homedsrv/0755-91/28900/attr
{"service_name": "dtvs", "multi_instance": false, "net_type": 0, "api_type": 1, "loadbalance": 1}
/homedsrv/0755-91/29900/attr
{"service_name": "iouts", "multi_instance": true, "net_type": 0, "api_type": 0, "loadbalance": 1}
[root@master(106.210) /homed/etcd]# ./etcdctl member list -w=table
+------------------+---------+-----------+-----------------------+---------------------+------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
+------------------+---------+-----------+-----------------------+---------------------+------------+
| 8e9e05c52164694d | started | instance1 | http://localhost:2380 | http://0.0.0.0:2379 | false |
+------------------+---------+-----------+-----------------------+---------------------+------------+
After the restart, reads return data normally; continue with the next step.
6. Add member instance2 to the cluster
[root@master(106.210) /homed/etcd]# ./etcdctl member add instance2 --peer-urls="http://0.0.0.0:2480"
Member b0cce666a47c30d2 added to cluster cdf818194e3a8c32
ETCD_NAME="instance2"
ETCD_INITIAL_CLUSTER="instance1=http://localhost:2380,instance2=http://0.0.0.0:2480"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://0.0.0.0:2480"
ETCD_INITIAL_CLUSTER_STATE="existing"
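At this point `member list` would show instance2 as unstarted until the new process actually joins (an optional intermediate check, not captured in the transcript):
./etcdctl member list -w=table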
7. Start instance2
Create the instance2 directory and adjust its configuration as follows:
[root@master(106.210) /homed/etcd]# mkdir instance2
[root@master(106.210) /homed/etcd]# cd instance2/
[root@master(106.210) /homed/etcd/instance2]# vim etcd.yml
[root@master(106.210) /homed/etcd]# cat instance2/etcd.yml
# node name
name: 'instance2'
# data directory
data-dir: '/homed/etcd/instance2/data'
# listen addresses in the form scheme://IP:port; multiple entries can be comma-separated; with http://0.0.0.0:2379 client access is not restricted to a specific address
listen-client-urls: 'http://0.0.0.0:2479'
# client URLs advertised to other etcd nodes for reaching this node; usually a subset of listen-client-urls, and may contain domain names
advertise-client-urls: 'http://0.0.0.0:2479'
enable-v2: true
log-level: debug
logger: 'zap'
log-outputs: ['/homed/etcd/instance2/log/ectd.log',]
# cluster settings
# peer URL advertised to the other cluster members
initial-advertise-peer-urls: 'http://0.0.0.0:2480'
# listen URL for peer-to-peer communication
listen-peer-urls: 'http://0.0.0.0:2480'
# all members of the cluster
initial-cluster: "instance1=http://localhost:2380,instance2=http://0.0.0.0:2480"
initial-cluster-state: 'existing'
Start instance2 and check the cluster.
[root@master(106.210) /homed/etcd]# ./start_instance.sh instance2
Starting instance: instance2
Config file: /r2/homed/etcd/instance2/etcd.yml
First start, creating data directory /r2/homed/etcd/instance2/data
Creating log directory /r2/homed/etcd/instance2/log
Startup log path: /r2/homed/etcd/instance2/log/start.log
[root@master(106.210) /homed/etcd]# ps aux| grep instance
root 180729 2.5 3.2 8071556 2134272 pts/37 Sl 10:35 0:01 /r2/homed/etcd/etcd --config-file=/r2/homed/etcd/instance1/etcd.yml
root 182672 0.5 0.0 11214468 22360 pts/37 Sl 10:36 0:00 /r2/homed/etcd/etcd --config-file=/r2/homed/etcd/instance2/etcd.yml
root 182723 0.0 0.0 112684 988 pts/37 S+ 10:36 0:00 grep --color=auto instance
[root@master(106.210) /homed/etcd]# ./etcdctl member list -w=table
+------------------+---------+-----------+-----------------------+---------------------+------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
+------------------+---------+-----------+-----------------------+---------------------+------------+
| 8e9e05c52164694d | started | instance1 | http://localhost:2380 | http://0.0.0.0:2379 | false |
| b0cce666a47c30d2 | started | instance2 | http://0.0.0.0:2480 | http://0.0.0.0:2479 | false |
+------------------+---------+-----------+-----------------------+---------------------+------------+
8. Add member instance3 to the cluster
[root@master(106.210) /homed/etcd]# ./etcdctl member add instance3 --peer-urls="http://0.0.0.0:2580"
Member f0b67ae931afbf5a added to cluster cdf818194e3a8c32
ETCD_NAME="instance3"
ETCD_INITIAL_CLUSTER="instance1=http://localhost:2380,instance2=http://0.0.0.0:2480,instance3=http://0.0.0.0:2580"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://0.0.0.0:2580"
ETCD_INITIAL_CLUSTER_STATE="existing"
9. Start instance3
Create the instance3 directory and adjust its configuration as follows:
[root@master(106.210) /homed/etcd]# mkdir instance3
[root@master(106.210) /homed/etcd]# vim instance3/etcd.yml
[root@master(106.210) /homed/etcd]# cat instance3/etcd.yml
# node name
name: 'instance3'
# data directory
data-dir: '/homed/etcd/instance3/data'
# listen addresses in the form scheme://IP:port; multiple entries can be comma-separated; with http://0.0.0.0:2379 client access is not restricted to a specific address
listen-client-urls: 'http://0.0.0.0:2579'
# client URLs advertised to other etcd nodes for reaching this node; usually a subset of listen-client-urls, and may contain domain names
advertise-client-urls: 'http://0.0.0.0:2579'
enable-v2: true
log-level: debug
logger: 'zap'
log-outputs: ['/homed/etcd/instance3/log/ectd.log',]
# cluster settings
# peer URL advertised to the other cluster members
initial-advertise-peer-urls: 'http://0.0.0.0:2580'
# listen URL for peer-to-peer communication
listen-peer-urls: 'http://0.0.0.0:2580'
# all members of the cluster
initial-cluster: "instance1=http://localhost:2380,instance2=http://0.0.0.0:2480,instance3=http://0.0.0.0:2580"
initial-cluster-state: 'existing'
Start instance3 and check the cluster.
[root@master(106.210) /homed/etcd]# ./start_instance.sh instance3
Starting instance: instance3
Config file: /r2/homed/etcd/instance3/etcd.yml
First start, creating data directory /r2/homed/etcd/instance3/data
Creating log directory /r2/homed/etcd/instance3/log
Startup log path: /r2/homed/etcd/instance3/log/start.log
[root@master(106.210) /homed/etcd]# ps aux| grep instance
root 180729 2.6 3.3 8139976 2158344 pts/37 Sl 10:35 0:10 /r2/homed/etcd/etcd --config-file=/r2/homed/etcd/instance1/etcd.yml
root 182672 2.2 3.2 11218180 2145232 pts/37 Sl 10:36 0:07 /r2/homed/etcd/etcd --config-file=/r2/homed/etcd/instance2/etcd.yml
root 189667 35.0 0.0 11216260 36352 pts/37 Sl 10:42 0:03 /r2/homed/etcd/etcd --config-file=/r2/homed/etcd/instance3/etcd.yml
root 190524 0.0 0.0 112684 988 pts/37 S+ 10:42 0:00 grep --color=auto instance
[root@master(106.210) /homed/etcd]# ./etcdctl member list -w=table
+------------------+---------+-----------+-----------------------+---------------------+------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
+------------------+---------+-----------+-----------------------+---------------------+------------+
| 8e9e05c52164694d | started | instance1 | http://localhost:2380 | http://0.0.0.0:2379 | false |
| b0cce666a47c30d2 | started | instance2 | http://0.0.0.0:2480 | http://0.0.0.0:2479 | false |
| f0b67ae931afbf5a | started | instance3 | http://0.0.0.0:2580 | http://0.0.0.0:2579 | false |
+------------------+---------+-----------+-----------------------+---------------------+------------+
10. Check the cluster data
[root@master(106.210) /homed/etcd]# ETCD_ENDPOINTS=0.0.0.0:2380,0.0.0.0:2480,0.0.0.0:2580
[root@master(106.210) /homed/etcd]# ./etcdctl --endpoints=$ETCD_ENDPOINTS get --prefix /homedsrv/0755-91/26
/homedsrv/0755-91/26900/attr
{"service_name": "iacs", "multi_instance": true, "net_type": 0, "api_type": 0, "loadbalance": 1}
/homedsrv/0755-91/26901/attr
{"service_name": "iacs", "multi_instance": true, "net_type": 0, "api_type": 0, "loadbalance": 1}
The data can be retrieved normally.
11. Final configuration adjustments
Change initial-cluster and initial-cluster-state in instance1's and instance2's etcd.yml to match instance3, and remove the force-new-cluster: true setting from instance1:
initial-cluster: "instance1=http://localhost:2380,instance2=http://0.0.0.0:2480,instance3=http://0.0.0.0:2580"
initial-cluster-state: 'existing'
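For reference, a sketch of the cluster section of instance1/etcd.yml after this adjustment (force-new-cluster removed):
listen-peer-urls: 'http://0.0.0.0:2380'
initial-advertise-peer-urls: 'http://0.0.0.0:2380'
initial-cluster-token: 'etcd-cluster'
initial-cluster: "instance1=http://localhost:2380,instance2=http://0.0.0.0:2480,instance3=http://0.0.0.0:2580"
initial-cluster-state: 'existing'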
12. Restart the cluster with the adjusted configuration
Before restarting the cluster, write a test key:
[root@master(106.210) /homed/etcd]# ./etcdctl --endpoints=$ETCD_ENDPOINTS put /test/clust instance1=http://localhost:2380,instance2=http://0.0.0.0:2480,instance3=http://0.0.0.0:2580
Perform the restart.
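Note that start.sh and stop.sh have evidently been rewritten by this point to drive the per-instance scripts; the updated versions are not shown in the transcript, but presumably look something like:
# start.sh
./start_instance.sh instance1
./start_instance.sh instance2
./start_instance.sh instance3
# stop.sh
./stop_instance.sh instance1
./stop_instance.sh instance2
./stop_instance.sh instance3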
[root@master(106.210) /homed/etcd]# ./stop.sh
Stopping process: instance1
Stopping process: instance2
Stopping process: instance3
[root@master(106.210) /homed/etcd]# ps aux| grep instanc
root 201568 0.0 0.0 112684 988 pts/37 S+ 10:50 0:00 grep --color=auto instanc
[root@master(106.210) /homed/etcd]# ./start.sh
Starting instance: instance1
Config file: /r2/homed/etcd/instance1/etcd.yml
Startup log path: /r2/homed/etcd/instance1/log/start.log
Starting instance: instance2
Config file: /r2/homed/etcd/instance2/etcd.yml
Startup log path: /r2/homed/etcd/instance2/log/start.log
Starting instance: instance3
Config file: /r2/homed/etcd/instance3/etcd.yml
Startup log path: /r2/homed/etcd/instance3/log/start.log
After the restart, read the data back and check the cluster state.
[root@master(106.210) /homed/etcd]# ./etcdctl --endpoints=$ETCD_ENDPOINTS get --prefix /test/clust
/test/clust
instance1=http://localhost:2380,instance2=http://0.0.0.0:2480,instance3=http://0.0.0.0:2580
[root@master(106.210) /homed/etcd]# ./etcdctl --endpoints=$ETCD_ENDPOINTS endpoint status -w=table
+--------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+--------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 0.0.0.0:2380 | 8e9e05c52164694d | 3.5.7 | 2.1 GB | false | false | 9 | 33 | 33 | |
| 0.0.0.0:2480 | b0cce666a47c30d2 | 3.5.7 | 2.1 GB | true | false | 9 | 33 | 33 | |
| 0.0.0.0:2580 | f0b67ae931afbf5a | 3.5.7 | 2.1 GB | false | false | 9 | 33 | 33 | |
+--------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
[root@master(106.210) /homed/etcd]# ./etcdctl --endpoints=$ETCD_ENDPOINTS endpoint health -w=table
+--------------+--------+-------------+-------+
| ENDPOINT | HEALTH | TOOK | ERROR |
+--------------+--------+-------------+-------+
| 0.0.0.0:2480 | true | 15.091262ms | |
| 0.0.0.0:2380 | true | 12.541677ms | |
| 0.0.0.0:2580 | true | 12.521246ms | |
+--------------+--------+-------------+-------+
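As an optional final check (not part of the original run), confirm that no alarms such as NOSPACE are active, since the database is already about 2.1 GB against the 6 GB quota:
./etcdctl --endpoints=$ETCD_ENDPOINTS alarm list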
The cluster upgrade and expansion is complete.