1 Overview
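In the article ElasticSearch Peer to Peer Recovery we noted that P2P recovery is followed by interactions around the primary role. Following those interactions requires some familiarity with ReplicationTracker.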
First, look at the comment in the class's source code:
This class is responsible for tracking the replication group with its progress and safety markers (local and global checkpoints). The global checkpoint is the highest sequence number for which all lower (or equal) sequence number have been processed on all shards that are currently active. Since shards count as "active" when the master starts them, and before this primary shard has been notified of this fact, we also include shards that have completed recovery. These shards have received all old operations via the recovery mechanism and are kept up to date by the various replications actions. The set of shards that are taken into account for the global checkpoint calculation are called the "in-sync shards". The global checkpoint is maintained by the primary shard and is replicated to all the replicas (via {@link GlobalCheckpointSyncAction}).
A rough paraphrase follows (it may not be exact):
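ReplicationTracker tracks the copies of a shard (the replication group), the current progress of each copy, and the safety markers. Progress here means the checkpoint position, and the safety markers are the local checkpoint of each copy and the global checkpoint. The global checkpoint is the highest sequence number such that every operation with a lower or equal sequence number has been processed on all currently active copies. A shard counts as active once the master starts it, which can happen before the primary has been notified, so copies that have completed recovery are also included. These copies have received all old operations through the recovery mechanism (old meaning operations performed before the copy was started) and are kept up to date by the various replication actions. The set of shards used when computing the global checkpoint is called the in-sync shards. The global checkpoint is maintained by the primary shard and replicated to all replicas via GlobalCheckpointSyncAction.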
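The relationship described in the comment can be sketched as follows: the global checkpoint is the minimum local checkpoint over the in-sync copies. This is only a minimal illustration and not the Elasticsearch implementation. The class GlobalCheckpointSketch, its CheckpointState holder, the allocation IDs, and the sentinel value used for "unknown" are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; names and structure are assumptions, not the Elasticsearch API.
class GlobalCheckpointSketch {

    // Assumed sentinel meaning "this copy has not reported a local checkpoint yet".
    static final long UNASSIGNED_SEQ_NO = -2L;

    // Per-copy tracking state: its local checkpoint and whether it is in-sync.
    static final class CheckpointState {
        final long localCheckpoint;
        final boolean inSync;

        CheckpointState(long localCheckpoint, boolean inSync) {
            this.localCheckpoint = localCheckpoint;
            this.inSync = inSync;
        }
    }

    // Returns the minimum local checkpoint across in-sync copies.
    // Copies that are tracked but not in-sync (for example, still recovering) are ignored.
    static long computeGlobalCheckpoint(Map<String, CheckpointState> copies) {
        long min = Long.MAX_VALUE;
        boolean sawInSync = false;
        for (CheckpointState state : copies.values()) {
            if (!state.inSync) {
                continue;
            }
            if (state.localCheckpoint == UNASSIGNED_SEQ_NO) {
                // An in-sync copy with unknown progress means no safe value can be claimed.
                return UNASSIGNED_SEQ_NO;
            }
            min = Math.min(min, state.localCheckpoint);
            sawInSync = true;
        }
        return sawInSync ? min : UNASSIGNED_SEQ_NO;
    }

    public static void main(String[] args) {
        Map<String, CheckpointState> copies = new HashMap<>();
        copies.put("primary", new CheckpointState(10L, true));
        copies.put("replica-1", new CheckpointState(7L, true));
        copies.put("replica-2", new CheckpointState(3L, false)); // still recovering
        System.out.println(computeGlobalCheckpoint(copies)); // prints 7
    }
}
```

In this example the recovering copy at sequence number 3 does not hold the global checkpoint back, because only in-sync copies take part in the calculation.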
》》While writing this post I realized that my understanding of checkpoints is not deep enough; I will update it once I understand more.》》