Ceph MDS States状态详解

MDS States

元数据服务器(MDS)在CephFS的正常操作过程中经历多个状态。例如,一些状态指示MDS从MDS的先前实例从故障转移中恢复。在这里,我们将记录所有这些状态,并包括状态图来可视化转换。

State Descriptions

Common states

状态 说明
up:active This is the normal operating state of the MDS. It indicates that the MDS and its rank in the file system is available.

这个状态是正常运行的状态。 这个表明该mds在rank中是可用的状态。
up:standby The MDS is available to takeover for a failed rank (see also :ref:mds-standby). The monitor will automatically assign an MDS in this state to a failed rank once available.

这个状态是灾备状态,用来接替主挂掉的情况。
up:standby_replay The MDS is following the journal of another up:active MDS. Should the active MDS fail, having a standby MDS in replay mode is desirable as the MDS is replaying the live journal and will more quickly takeover. A downside to having standby replay MDSs is that they are not available to takeover for any other MDS that fails, only the MDS they follow.

灾备守护进程就会持续读取某个处于 up 状态的 rank 的元数据日志。这样它就有元数据的热缓存,在负责这个 rank 的守护进程失效时,可加速故障切换。

一个正常运行的 rank 只能有一个灾备重放守护进程( standby replay daemon ),如果两个守护进程都设置成了灾备重放状态,那么其中任意一个会取胜,另一个会变为普通的、非重放灾备状态。

一旦某个守护进程进入灾备重放状态,它就只能为它那个 rank 提供灾备。如果有另外一个 rank 失效了,即使没有灾备可用,这个灾备重放守护进程也不会去顶替那个失效的。
up:boot This state is broadcast to the Ceph monitors during startup. This state is never visible as the Monitor immediately assign the MDS to an available rank or commands the MDS to operate as a standby. The state is documented here for completeness.

此状态在启动期间被广播到CEPH监视器。这种状态是不可见的,因为监视器立即将MDS分配给可用的秩或命令MDS作为备用操作。这里记录了完整性的状态。
up:creating The MDS is creating a new rank (perhaps rank 0) by constructing some per-rank metadata (like the journal) and entering the MDS cluster.
up:starting The MDS is restarting a stopped rank. It opens associated per-rank metadata and enters the MDS cluster.
up:stopping When a rank is stopped, the monitors command an active MDS to enter the up:stopping state. In this state, the MDS accepts no new client connections, migrates all subtrees to other ranks in the file system, flush its metadata journal, and, if the last rank (0), evict all clients and shutdown (see also :ref:cephfs-administration).
up:replay The MDS taking over a failed rank. This state represents that the MDS is recovering its journal and other metadata.

日志恢复阶段,他将日志内容读入内存后,在内存中进行回放操作。
up:resolve The MDS enters this state from up:replay if the Ceph file system has multiple ranks (including this one), i.e. it's not a single active MDS cluster. The MDS is resolving any uncommitted inter-MDS operations. All ranks in the file system must be in this state or later for progress to be made, i.e. no rank can be failed/damaged or up:replay.

用于解决跨多个mds出现权威元数据分歧的场景,对于服务端包括子树分布、Anchor表更新等功能,客户端包括rename、unlink等操作。
up:reconnect An MDS enters this state from up:replay or up:resolve. This state is to solicit reconnections from clients. Any client which had a session with this rank must reconnect during this time, configurable via mds_reconnect_timeout.

恢复的mds需要与之前的客户端重新建立连接,并且需要查询之前客户端发布的文件句柄,重新在mds的缓存中创建一致性功能和锁的状态。mds不会同步记录文件打开的信息,原因是需要避免在访问mds时产生多余的延迟,并且大多数文件是以只读方式打开。
up:rejoin The MDS enters this state from up:reconnect. In this state, the MDS is rejoining the MDS cluster cache. In particular, all inter-MDS locks on metadata are reestablished.
If there are no known client requests to be replayed, the MDS directly becomes up:active from this state.

把客户端的inode加载到mds cache
up:clientreplay The MDS may enter this state from up:rejoin. The MDS is replaying any client requests which were replied to but not yet durable (not journaled). Clients resend these requests during up:reconnect and the requests are replayed once again. The MDS enters up:active after completing replay.
down:failed No MDS actually holds this state. Instead, it is applied to the rank in the file system
down:damaged No MDS actually holds this state. Instead, it is applied to the rank in the file system
down:stopped No MDS actually holds this state. Instead, it is applied to the rank in the file system

主从切换流程:

  • handle_mds_map state change up:boot --> up:replay
  • handle_mds_map state change up:replay --> up:reconnect
  • handle_mds_map state change up:reconnect --> up:rejoin
  • handle_mds_map state change up:rejoin --> up:active

State Diagram

This state diagram shows the possible state transitions for the MDS/rank. The legend is as follows:

Color

  • 绿色: MDS是活跃的.
  • 橙色: MDS处于过渡临时状态,试图变得活跃.
  • 红色: MDS指示一个状态,该状态导致被标记为失败.
  • 紫色: MDS和rank为停止.
  • 红色: MDS指示一个状态,该状态导致被标记为损坏.

Shape

  • 圈:MDS保持这种状态.
  • 六边形:没有MDS保持这个状态.

Lines

  • A double-lined shape indicates the rank is "in".
image.png

参考:
https://github.com/ceph/ceph/blob/master/doc/cephfs/mds-states.rst

最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 219,753评论 6 508
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 93,668评论 3 396
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 166,090评论 0 356
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 59,010评论 1 295
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 68,054评论 6 395
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 51,806评论 1 308
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 40,484评论 3 420
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 39,380评论 0 276
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 45,873评论 1 319
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 38,021评论 3 338
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 40,158评论 1 352
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 35,838评论 5 346
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 41,499评论 3 331
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 32,044评论 0 22
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 33,159评论 1 272
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 48,449评论 3 374
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 45,136评论 2 356

推荐阅读更多精彩内容