PBFT详细分析n大于3f+1

Practical Byzantine Fault Tolerance

Suppose you have N replicas, f of which might crash (non-Byzantine failure)
What quorum size Q do you need to guarantee liveness and safety?

  • Liveness: (or pseudo-liveness, i.e., avoiding stuck states)
    There must be a non-failed quorum (quorum availability)
    Hence: Q <= N - f
  • Safety: Any two quorums must intersect at one or more nodes
    Otherwise, two quorums could independently accept operations, diverge
    This property is often known as the quorum intersection property
    Hence: 2Q - N > 0
    So: N < 2Q <= 2(N - f)
    Note highest possible f: N < 2N-2f; f < N/2
    And if N = 2f + 1, smallest Q is 2Q > 2f + 1; Q = f + 1

Now say we throw in Byzantine failures. One view...
Say you have N nodes, f of which might experience Byzantine failure.
First, how can Byzantine failures be worse than non-Byzantine?
Byzantine nodes can vote for both a statement and its contradiction
Make different statements to different nodes
Consequences
Risks driving non-failed nodes into divergent states
Risks driving non-failed nodes into "stuck states"
E.g., cause split vote on seemingly irrefutable statement
Paxos example: You think majority aborted some ballot b v
You vote to commit b' v' (where b' > b, v' != v)
Can't convince other nodes it is safe to vote for b'

What quorum size Q do we need in Byzantine setting?

  • Liveness: Q <= N - f
    As in non-Byzantine case, failed nodes might not reply
  • Safety: Quorum intersection must contain one non-faulty node
    Idea: out of f+1 nodes, at most one can be faulty
    Hence: 2Q - N > f (since f could be malicious)
    So: N + f < 2Q <= 2(N - f)
    Highest f: N+f < 2N-2f; 3f < N; f < N/3
    And if N = 3f + 1, the smallest Q is:
    N + f < 2Q; 3f + 1 + f < 2Q; 2f + 1/2 < Q; Q_min = 2f + 1

So how does PBFT protocol work?
Number replica cohorts 1, 2, 3, ..., 3f+1
Number requests with consecutive sequence numbers (not viewstamps)
System goes through a series of views
In view v, replica number v mod (3f+1) is designated the primary
Primary is responsible for selecting the order of operations
Assigns an increasing sequence number to each operation
In normal-case operation, use two-round protocol for request r:
Round 1 (pre-prepare, prepare) goal:
Ensure at least f+1 honest replicas agree that
If request r executes in view v, will execute with sequence no. n
Round 2 (commit) goal:
Ensure at least f+1 honest replicas agree that
Request r has executed in view v with sequence no. n

Protocol for normal-case operation
Let c be client
r_i be replica i, or p primary, b_i backup i
R set of all replicas

c -> p:  m = {REQUEST, o, t, c}_Kc
p -> R:  {PRE-PREPARE, v, n, d}_Kp, m     (note d = H(m))

b_i -> R: {PREPARE, v, n, d, i}_K{r_i}
[Note all messages signed, so will omit signatures and use < > henceforth.]

replica r_i now waits for PRE-PREPARE + 2f matching PREPARE messages
puts these messages in its log
then we say prepared(m, v, n, i) is TRUE

Note: If prepared(m, v, n, i) is TRUE for honest replica r_i
then prepared(m', v, n, j) where m' != m FALSE for any honest r_j
So no other operation can execute with view v sequence number n

Are we done? Just reply to client? No
Just because some other m' won't execute at (v,n) doesn't mean m will
Suppose r_i is compromised right after prepared(m, v, n, i)
Suppose no other replica received r_i's prepare message
Suppose f replicas are slow and never even received the PRE-PREPARE
No other honest replica will know the request prepared!
Particularly if p fails, request might not get executed!

So we say operation doesn't execute until
prepared(m, v, n, i) is TRUE for f+1 non-faulty replicas r_i
We say committed(m, v, n) is TRUE when this property holds

So how does a replica know committed(m, v, n) holds?
Add one more message:

r_i -> R: <COMMIT, v, n, d, i> (sent only after prepared(m,v,n,i))

replica r_i waits for 2f+1 identical COMMIT messages (including its own)
committed-local(m, v, n, i) is TRUE when:
prepared(m, v, n, i) is TRUE, and
r_i has 2f+1 matching commits in its log

Note: If committed-local(m, v, n, i) is TRUE for any non-faulty r_i
Then means committed(m, v, n) is TRUE.
r_i knows when committed-local is TRUE
So committed-local is a replica's way of knowing that committed is TRUE

r_i replies to client when committed-local(m, v, n, i) is TRUE
Client waits for f+1 matching replies, then returns to client
Why f+1 and not 2f+1?
Because of f+1, at least one replica r_i is non-faulty
So client knows committed-local(m, v, n, i)
Which in turn implies committed(m, v, n)
Note tentative reply optimization:
r_i can send tentative reply to client after prepared(m, v, n, i)
Client can accept result after 2f+1 matching tentative replies. Why?
f+1 of those replies must be from honest nodes
And at least 1 of those f+1 will be part of 2f+1 forming a new view
So that 1 node will make sure operation makes it to new view

Garbage collecting the message log
make periodic checkpoints
Broadcast <CHECKPOINT, n, d, i>, where d = digest of state
When 2f+1 signed CHECKPOINTs received
restrict sequence numbers are between h and H
h = sequence number of last stable checkpoint
H = h + k (e.g., k might be 2 * checkpoint interval of 100)
delete all messages below sequence number of stable checkpoint

View changes
When client doesn't get an answer, broadcasts message to all replicas
If a backup notices primary is slow/unresponsive:

  • broadcast <VIEW-CHANGE v+1, n, C, P, i>
    C is 2f+1 signed checkpoint messages for last stable checkpoint
    P = {P_m} where each P_m is signed PRE-PREPARE + 2f signed PREPARES
    i.e., P is set of all PREPAREd messages since checkpoint
    + proof that the messages really are prepared

When primary of view v+1 sees 2f signed VIEW-CHANGE messages from others

  • New primary broadcasts <NEW-VIEW, v+1, V, O>
    V is set of at lesat 2f+1 VIEW-CHANGE messages (including by new primary)
    O is a set of pre-prepare messages, for operations that are:
    - after last stable checkpoint
    - appear in the set P of one of the VIEW-CHANGE messages
    O also contains dummy messages to fill in sequence number gaps

Replicas may optain any missing state from each other
(e.g., stable checkpoint data, or missing operation, since
reissued pre-prepare messages only contain digest of request)

What happens if primary creates incorrect O in NEW-VIEW message?
E.g., might send null requests for operations that prepared
Other replicas can compute O from V, and can reject NEW-VIEW message
What happens if primary sends different V's to different backups?
Still okay, because any committed operation will be in 2f+1 VIEW-CHANGE msgs
of which f+1 must be honest, so at least one member of V will have operation
So new primary cannot cause committed operations to be dropped
Only operations for which client has not yet seen the answer

Discussion
what problem does BFS solve?

  • is IS going to run BFS to deal with byzantine failures?
  • what failures are we talking about?
    compromised servers
  • what about compromised clients?
    authentication and authorization
    how can we extend the system to allow for more than (n-1)/3
    failures over its lifetime?
  • detect failed replicas using proactive recovery
    • recover the system periodically, no matter what
    • makes bad nodes good again
  • tricky stuff
    • an attacker might steal compromised replica's keys
  • with how many replicas will BFS work reasonably well?
©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 213,254评论 6 492
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 90,875评论 3 387
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 158,682评论 0 348
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 56,896评论 1 285
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 66,015评论 6 385
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 50,152评论 1 291
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 39,208评论 3 412
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 37,962评论 0 268
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 44,388评论 1 304
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 36,700评论 2 327
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 38,867评论 1 341
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 34,551评论 4 335
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 40,186评论 3 317
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 30,901评论 0 21
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 32,142评论 1 267
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 46,689评论 2 362
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 43,757评论 2 351

推荐阅读更多精彩内容

  • rljs by sennchi Timeline of History Part One The Cognitiv...
    sennchi阅读 7,316评论 0 10
  • 今晚老公在青岛,下班后6点十几分刚进门,大宝说,妈妈,你可回来了,我们终于可以吃饭了,你不回来,奶奶不让吃饭...
    轩萌妈阅读 134评论 0 0
  • “他醒了。” ——ONE 屏幕上显出一团锯齿状的波峰与波谷,那其实是一团纤细、明亮的曲线,在主波外还衍生出二级与三...
    澄宝想睡觉阅读 868评论 0 2
  • 这是一个不经修饰的句子,略粗糙、略庸俗。 尽管,在快要放弃的时候,可以起到一定的正面作用,可这要加上太多的限定因素...
    Chang_Huan_Yu阅读 455评论 0 1
  • 11月14日读完此书。作者是英国作家彼得弗兰科潘。早在两个星期之前便看完了这部关于世界史的书,书评却一直拖到了今天...
    故纸旧人阅读 556评论 0 0