Background and Introduction
Official site: https://www.consul.io/
Download: https://releases.hashicorp.com/consul/1.3.1/
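Consul is an open-source service discovery and configuration management tool developed by HashiCorp and written in Go. It has built-in service registration and discovery, an implementation of a distributed consensus protocol, health checking, Key/Value storage, and multi-datacenter support, so it does not depend on other tools. Deployment is simple: Consul ships as a single executable binary. Every node runs an agent, which operates in one of two modes, server or client. The official recommendation is 3 or 5 server nodes per datacenter, to keep data safe and to ensure that leader election among the servers can proceed correctly. Similar tools include ZooKeeper and etcd.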
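The recommendation of 3 or 5 servers follows from how Raft needs a majority of servers to agree. The following is a minimal sketch of that arithmetic; the function names `quorum` and `fault_tolerance` are made up for illustration and are not part of Consul.

```shell
#!/usr/bin/env bash
# Raft needs a majority (quorum) of servers to elect a leader and commit writes.

# quorum N: smallest majority among N servers.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# fault_tolerance N: how many servers may fail while a quorum still survives.
fault_tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 2 3 4 5; do
  echo "servers=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

An even server count adds no extra failure tolerance over the odd count below it, which is why odd cluster sizes are the usual choice.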
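- Terminology
- client: the client mode of a Consul agent. In this mode, all services registered with the node are forwarded to the servers; the client itself does not persist this information.
- server: the server mode of a Consul agent. A server does everything a client does, with one difference: it persists all information to local disk, so the data survives a failure.
- server-leader: the server that has been elected leader. Unlike the other servers, it is responsible for replicating registration data to the other servers and for tracking the health of the nodes.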
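- raft: the consensus protocol that keeps data consistent across the server nodes. Consul uses Raft, as does etcd; ZooKeeper uses ZAB, a Paxos-like protocol.
- Service discovery protocols: Consul supports both HTTP and DNS; etcd offers only an API (HTTP in v2, gRPC in v3) and has no DNS interface.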
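- Service registration: Consul supports two ways to register a service. One is the service registration HTTP API, which the service calls itself. The other is a JSON configuration file that lists the services to register. Consul officially recommends the second approach.
- Service discovery: Consul supports two ways to discover services. One is to query the HTTP API for the available services. The other is the DNS server built into the Consul agent (port 8600), where names take the form NAME.service.consul and NAME is the service name from the service definition file. DNS answers take health checks into account, so services with failing checks are filtered out.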
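- Communication between agents: Consul uses the gossip protocol to manage membership and broadcast messages to the whole cluster. There are two kinds of gossip pool: the LAN pool, for communication within one datacenter, and the WAN pool, for communication across datacenters. Each datacenter has its own LAN pool, so there can be several of them, but there is only a single WAN pool.
- LAN Gossip: contains all nodes located in the same local network or datacenter.
- WAN Gossip: contains only servers. These servers are spread across different datacenters and usually communicate over the Internet or a wide area network.
- RPC: remote procedure call, a request/response mechanism that lets a client make requests to a server.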
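In short: a client plays roughly the role of what we usually call an LB (load balancer), forwarding requests to the servers. One of the servers is the leader, which is responsible for synchronizing and monitoring the server cluster. If nothing is specified, the leader is chosen by election among the servers; it can also be designated manually. When configuring ACLs, make sure the server leader stays the same one.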
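To make the registration and discovery items above concrete, here is a small sketch that prints a service definition in JSON and builds the matching DNS name. The service name `web`, port 80, the HTTP check URL, and the helper names `service_definition` and `service_dns_name` are all hypothetical examples, not values from this setup.

```shell
#!/usr/bin/env bash
# service_definition NAME PORT: print a minimal Consul service definition.
# The HTTP health check on localhost is an assumed example.
service_definition() {
  local name="$1" port="$2"
  cat <<EOF
{
  "service": {
    "name": "${name}",
    "port": ${port},
    "check": {
      "http": "http://localhost:${port}/",
      "interval": "10s"
    }
  }
}
EOF
}

# service_dns_name NAME: the DNS name under which Consul exposes the service.
service_dns_name() {
  printf '%s.service.consul\n' "$1"
}

service_definition web 80
service_dns_name web
```

Saving the JSON into the directory passed to -config-dir registers the service when the agent starts or reloads. The name can then be queried against the agent's DNS port, for example with `dig @127.0.0.1 -p 8600 web.service.consul`.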
Installing Consul on a Single Machine
Download Consul
$ sudo wget https://releases.hashicorp.com/consul/1.3.1/consul_1.3.1_linux_amd64.zip
$ sudo unzip consul_1.3.1_linux_amd64.zip
$ sudo mv consul /usr/bin/
$ consul --version
Consul v1.3.1
Protocol 2 spoken by default, understands 2 to 3 (agent will automatically use protocol >2 when speaking to compatible agents)
Start Consul
$ sudo mkdir -p /data/app/consul
$ sudo chown "$(whoami)": /data/app/consul/
$ sudo nohup /usr/bin/consul agent -server -data-dir=/data/app/consul -bootstrap -ui -advertise=10.208.1.10 -client=10.208.1.10 > /data/app/consul/consul.log 2>&1 &
$ tail -f /data/app/consul/consul.log
2019/01/24 20:16:10 [INFO] consul: Adding LAN server azr-sal1002 (Addr: tcp/10.208.1.10:8300) (DC: dc1)
2019/01/24 20:16:10 [WARN] agent/proxy: running as root, will not start managed proxies
2019/01/24 20:16:10 [INFO] agent: Started DNS server 127.0.0.1:8600 (udp)
2019/01/24 20:16:10 [INFO] agent: Started DNS server 127.0.0.1:8600 (tcp)
2019/01/24 20:16:10 [INFO] agent: Started HTTP server on 127.0.0.1:8500 (tcp)
2019/01/24 20:16:10 [INFO] agent: started state syncer
2019/01/24 20:16:17 [ERR] agent: failed to sync remote state: No cluster leader
2019/01/24 20:16:18 [WARN] raft: Heartbeat timeout from "" reached, starting election
2019/01/24 20:16:18 [INFO] raft: Node at 10.208.1.10:8300 [Candidate] entering Candidate state in term 3
2019/01/24 20:16:18 [INFO] raft: Election won. Tally: 1
2019/01/24 20:16:18 [INFO] raft: Node at 10.208.1.10:8300 [Leader] entering Leader state
2019/01/24 20:16:18 [INFO] consul: cluster leadership acquired
2019/01/24 20:16:18 [INFO] consul: New leader elected: azr-sal1002
2019/01/24 20:16:18 [INFO] agent: Synced node info
==> Newer Consul version available: 1.4.1 (currently running: 1.3.1)
Accessing the UI
Parameter Reference
Command-line flags
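-bind: the address this node binds to for internal cluster communication
-enable-script-checks=true: enables health checks that execute scripts
-join: joins an existing cluster
-server: runs the agent in server mode
-node: the name of this node in the cluster
-config-file: a configuration file to load
-config-dir: a directory of configuration files, such as service definitions; every file in it ending in .json (or .hcl) is loaded
-datacenter: the datacenter name; defaults to dc1 if not set
-client: the address the client interfaces (HTTP, DNS) bind to; defaults to 127.0.0.1
-ui: enables Consul's built-in web UI
-data-dir: the directory where Consul stores its data
-bootstrap: controls whether a server is in bootstrap mode. Only one server per datacenter may be in bootstrap mode; a server in bootstrap mode can elect itself as Raft leader.
-bootstrap-expect: the number of server nodes expected in the datacenter. When this value is given, Consul waits until that many servers are available before bootstrapping the cluster. This flag cannot be used together with -bootstrap.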
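These two flags are very important: use exactly one of them. If neither is used, the agent keeps reporting the following error even after it has been added to the cluster with join: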
2018/10/14 15:40:00 [ERR] agent: failed to sync remote state: No cluster leader
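One way to check for this condition without reading the logs is the HTTP API's /v1/status/leader endpoint. It returns the leader's address as a JSON string, and, as far as I know, an empty string when there is no leader. The sketch below only interprets such a response body; `has_leader` is a hypothetical helper name, and the address is the one used in this article.

```shell
#!/usr/bin/env bash
# has_leader BODY: succeed when BODY (the response of GET /v1/status/leader)
# names a leader. An empty body or an empty JSON string means no leader.
has_leader() {
  local body="$1"
  [ -n "$body" ] && [ "$body" != '""' ]
}

# Against a running agent (address assumed from this article):
#   has_leader "$(curl -s http://10.208.1.10:8500/v1/status/leader)"

if has_leader '"10.208.1.10:8300"'; then echo "leader elected"; fi
if ! has_leader '""'; then echo "no cluster leader"; fi
```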
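Configuration file parameters
ui: same as the -ui command-line flag
acl_token: the token the agent uses when making requests to the Consul servers
acl_ttl: the TTL of the ACL cache; default 30s
addresses: a nested object; the following keys can be set: dns, http, rpc
advertise_addr: same as -advertise
bootstrap: same as -bootstrap
bootstrap_expect: same as -bootstrap-expect
bind_addr: same as -bind
ca_file: path to the CA file used to verify client or server connections
cert_file: must be provided together with key_file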
check_update_interval:
client_addr: same as -client
datacenter: same as -datacenter
data_dir: same as -data-dir
disable_anonymous_signature: disables the anonymous signature when checking for updates
enable_debug: enables debug mode
enable_syslog: same as -syslog
encrypt: same as -encrypt
key_file: path to the private key
leave_on_terminate: default false; if true, the agent sends a leave message to the other nodes in the cluster when it receives a TERM signal
log_level: same as -log-level
node_name: same as -node
ports: a nested object; the following keys can be set: dns (DNS address: 8600), http (HTTP API address: 8500), rpc (RPC: 8400), serf_lan (LAN port: 8301), serf_wan (WAN port: 8302), server (server RPC: 8300)
protocol: same as -protocol
rejoin_after_leave: same as -rejoin
retry_join: same as -retry-join
retry_interval: same as -retry-interval
server: same as -server
syslog_facility: when enable_syslog is set, controls which syslog facility messages are sent to; default LOCAL0
ui_dir: same as -ui-dir
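Building a Cluster (Single Machine)
With no spare machines available, a pseudo-cluster is installed on a single machine here. With three real servers, the JSON configuration files are not needed and the agents can be started directly from the command line.
# Create the data directories for the nodes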
$ mkdir -pv /data/app/consul/{node1,node2,node3}
mkdir: created directory ‘/data/app/consul/node1’
mkdir: created directory ‘/data/app/consul/node2’
mkdir: created directory ‘/data/app/consul/node3’
Node 1 configuration
$ vim /data/app/consul/node1/basic.json
{
  "datacenter": "dc1",
  "data_dir": "/data/app/consul/node1",
  "log_level": "INFO",
  "server": true,
  "node_name": "node1",
  "ui": true,
  "bind_addr": "10.208.1.10",
  "client_addr": "10.208.1.10",
  "advertise_addr": "10.208.1.10",
  "bootstrap_expect": 3,
  "ports": {
    "http": 8500,
    "dns": 8600,
    "server": 8300,
    "serf_lan": 8301,
    "serf_wan": 8302
  }
}
$ nohup /usr/bin/consul agent -config-file=/data/app/consul/node1/basic.json > /data/app/consul/node1/consul.log 2>&1 &
$ tail -100f /data/app/consul/node1/consul.log
Node 2 configuration
$ vim /data/app/consul/node2/basic.json
{
  "datacenter": "dc1",
  "data_dir": "/data/app/consul/node2",
  "log_level": "INFO",
  "server": true,
  "node_name": "node2",
  "bind_addr": "10.208.1.10",
  "client_addr": "10.208.1.10",
  "advertise_addr": "10.208.1.10",
  "ports": {
    "http": 8510,
    "dns": 8610,
    "server": 8310,
    "serf_lan": 8311,
    "serf_wan": 8312
  }
}
$ nohup /usr/bin/consul agent -config-file=/data/app/consul/node2/basic.json -retry-join=10.208.1.10:8301 > /data/app/consul/node2/consul.log 2>&1 &
$ tail -100f /data/app/consul/node2/consul.log
Node 3 configuration
$ vim /data/app/consul/node3/basic.json
{
  "datacenter": "dc1",
  "data_dir": "/data/app/consul/node3",
  "log_level": "INFO",
  "server": true,
  "node_name": "node3",
  "bind_addr": "10.208.1.10",
  "client_addr": "10.208.1.10",
  "advertise_addr": "10.208.1.10",
  "ports": {
    "http": 8520,
    "dns": 8620,
    "server": 8320,
    "serf_lan": 8321,
    "serf_wan": 8322
  }
}
$ nohup /usr/bin/consul agent -config-file=/data/app/consul/node3/basic.json -retry-join=10.208.1.10:8301 > /data/app/consul/node3/consul.log 2>&1 &
$ tail -100f /data/app/consul/node3/consul.log
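The three configuration files above differ only in the node name, the data directory, and a port offset of 10 per node. As a sketch, they could be generated rather than written by hand. `node_config` is a hypothetical helper, and it emits only the fields common to all three nodes.

```shell
#!/usr/bin/env bash
# node_config INDEX IP: print the agent configuration for pseudo-cluster node
# INDEX. Ports follow the scheme used above: base port + (INDEX - 1) * 10.
node_config() {
  local idx="$1" ip="$2"
  local offset=$(( (idx - 1) * 10 ))
  cat <<EOF
{
  "datacenter": "dc1",
  "data_dir": "/data/app/consul/node${idx}",
  "log_level": "INFO",
  "server": true,
  "node_name": "node${idx}",
  "bind_addr": "${ip}",
  "client_addr": "${ip}",
  "advertise_addr": "${ip}",
  "ports": {
    "http": $(( 8500 + offset )),
    "dns": $(( 8600 + offset )),
    "server": $(( 8300 + offset )),
    "serf_lan": $(( 8301 + offset )),
    "serf_wan": $(( 8302 + offset ))
  }
}
EOF
}

node_config 2 10.208.1.10
```

Node 1 additionally sets "ui": true and "bootstrap_expect": 3, which this sketch leaves out; those two keys would still need to be added to its file.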
Log changes on node 1
2019/01/24 22:48:58 [INFO] serf: EventMemberJoin: node2.dc1 10.208.1.10
2019/01/24 22:49:59 [INFO] serf: EventMemberJoin: node3.dc1 10.208.1.10
...
2019/01/24 22:49:59 [INFO] consul: Found expected number of peers, attempting bootstrap: 10.208.1.10:8320,10.208.1.10:8300,10.208.1.10:8310
2019/01/24 22:49:59 [INFO] consul: Handled member-join event for server "node3.dc1" in area "wan"
2019/01/24 22:50:05 [WARN] raft: Heartbeat timeout from "" reached, starting election
2019/01/24 22:50:05 [INFO] raft: Node at 10.208.1.10:8300 [Candidate] entering Candidate state in term 2
2019/01/24 22:50:05 [INFO] raft: Election won. Tally: 2
2019/01/24 22:50:05 [INFO] raft: Node at 10.208.1.10:8300 [Leader] entering Leader state
2019/01/24 22:50:05 [INFO] raft: Added peer faa05ada-4e06-6d5a-f35b-286c57826231, starting replication
2019/01/24 22:50:05 [INFO] raft: Added peer be2837bd-3b87-07f9-a776-863ed5966ffb, starting replication
2019/01/24 22:50:05 [INFO] consul: cluster leadership acquired
2019/01/24 22:50:05 [INFO] consul: New leader elected: node1
2019/01/24 22:50:05 [WARN] raft: AppendEntries to {Voter be2837bd-3b87-07f9-a776-863ed5966ffb 10.208.1.10:8310} rejected, sending older logs (next: 1)
2019/01/24 22:50:05 [INFO] raft: pipelining replication to peer {Voter be2837bd-3b87-07f9-a776-863ed5966ffb 10.208.1.10:8310}
2019/01/24 22:50:05 [INFO] consul: member 'node1' joined, marking health alive
2019/01/24 22:50:05 [INFO] consul: member 'node2' joined, marking health alive
2019/01/24 22:50:05 [INFO] agent: Synced node info
2019/01/24 22:50:05 [INFO] consul: member 'node3' joined, marking health alive
2019/01/24 22:50:06 [WARN] raft: AppendEntries to {Voter faa05ada-4e06-6d5a-f35b-286c57826231 10.208.1.10:8320} rejected, sending older logs (next: 1)
2019/01/24 22:50:07 [INFO] raft: pipelining replication to peer {Voter faa05ada-4e06-6d5a-f35b-286c57826231 10.208.1.10:8320}
Accessing the UI
Viewing cluster information
$ /usr/bin/consul members -http-addr=10.208.1.10:8500
Node   Address           Status  Type    Build  Protocol  DC   Segment
node1  10.208.1.10:8301  alive   server  1.3.1  2         dc1  <all>
node2  10.208.1.10:8311  alive   server  1.3.1  2         dc1  <all>
node3  10.208.1.10:8321  alive   server  1.3.1  2         dc1  <all>
$ /usr/bin/consul info -http-addr=10.208.1.10:8500
agent:
check_monitors = 0
check_ttls = 0
checks = 0
services = 0
build:
prerelease =
revision = f2b13f30
version = 1.3.1
consul:
bootstrap = false
known_datacenters = 1
leader = true
leader_addr = 10.208.1.10:8300
server = true
raft:
applied_index = 80
commit_index = 80
fsm_pending = 0
last_contact = 0
last_log_index = 80
last_log_term = 2
last_snapshot_index = 0
last_snapshot_term = 0
latest_configuration = [{Suffrage:Voter ID:faa05ada-4e06-6d5a-f35b-286c57826231 Address:10.208.1.10:8320} {Suffrage:Voter ID:5aee898c-ead4-f844-0d70-37ee7d9e9fb3 Address:10.208.1.10:8300} {Suffrage:Voter ID:be2837bd-3b87-07f9-a776-863ed5966ffb Address:10.208.1.10:8310}]
latest_configuration_index = 1
num_peers = 2
protocol_version = 3
protocol_version_max = 3
protocol_version_min = 0
snapshot_version_max = 1
snapshot_version_min = 0
state = Leader
term = 2
runtime:
arch = amd64
cpu_count = 4
goroutines = 104
max_procs = 4
os = linux
version = go1.11.1
serf_lan:
coordinate_resets = 0
encrypted = false
event_queue = 0
event_time = 2
failed = 0
health_score = 0
intent_queue = 0
left = 0
member_time = 3
members = 3
query_queue = 0
query_time = 1
serf_wan:
coordinate_resets = 0
encrypted = false
event_queue = 0
event_time = 1
failed = 0
health_score = 0
intent_queue = 0
left = 0
member_time = 5
members = 3
query_queue = 0
query_time = 1
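The `consul info` output above is grouped into sections such as raft and serf_lan, each followed by key = value lines. For scripted checks, a single value can be pulled out with awk. This is a minimal sketch: `info_value` is a hypothetical helper, and it assumes the section/key layout shown above.

```shell
#!/usr/bin/env bash
# info_value SECTION KEY: read `consul info` output on stdin and print the
# value of KEY inside SECTION (for example: raft state).
info_value() {
  awk -v section="$1" -v key="$2" '
    /^[a-z_]+:$/ { current = substr($1, 1, length($1) - 1); next }
    current == section && $1 == key && $2 == "=" { print $3; exit }
  '
}

# Sample lines taken from the output above; with a live agent, pipe in
# `consul info -http-addr=10.208.1.10:8500` instead.
sample='consul:
leader = true
raft:
num_peers = 2
state = Leader
serf_lan:
members = 3'

printf '%s\n' "$sample" | info_value raft state
printf '%s\n' "$sample" | info_value serf_lan members
```

On a healthy three-server cluster, the raft state on the leader is Leader and num_peers is 2, which matches the output shown above.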