Distributed Systems CS


Presentation on theme: "Distributed Systems CS"— Presentation transcript:

1 Distributed Systems CS 15-440
Fault Tolerance – Part II
Lecture 22, November 29, 2017
Mohammad Hammoud

2 Today…
Last Session: Fault-tolerance – Part I
Today’s Session: Fault-tolerance – Part II
  Handling Byzantine failures
  Overview
Final Announcement: The final exam is tomorrow, Nov 30, from 4:00 to 7:00 PM in the classroom. It is open book, open notes.

3 Handling Byzantine Failures
Byzantine failures are hard to detect and mask
  A server may produce an output it should never have produced (due to an erroneous computation)
  Worse yet, a server may be working maliciously (either single-handedly or in collaboration with other servers) to produce intentionally wrong answers
Byzantine failures can typically be handled using agreement (or consensus) protocols

4 Consensus in Faulty Systems
Reaching a consensus in a distributed system is only possible in the following circumstances:
[Figure: grid of circumstances along four dimensions. Process Behavior: Synchronous / Asynchronous. Communication Delay: Bounded / Unbounded. Message Ordering: Unordered / Ordered. Message Transmission: Unicast / Multicast. One region of the grid is labeled "Lamport’s assumption".]

5 Byzantine Generals Problem
Lamport’s Assumptions:
  There are n processes (or generals)
  Each process i will send a value v_i (e.g., “attack”, “wait”, or something else) to every other process
  There are at most m faulty processes (or traitors)
  Each process will construct a vector V of length n
    If process i is non-faulty, V[i] = v_i
    Else, V[i] is undefined

6 Byzantine Generals Problem
Case I: n = 4 and m = 1 (process 3 is the faulty process)
Step 1: Each process sends its value to all others
  Processes 1, 2, and 4 send their values 1, 2, and 4; faulty process 3 sends x to process 1, y to process 2, and z to process 4
Step 2: Each process collects the values received in a vector
  Process 1 got (1, 2, x, 4)
  Process 2 got (1, 2, y, 4)
  Process 3 got (1, 2, 3, 4)
  Process 4 got (1, 2, z, 4)
Step 3: Every process passes its vector to every other process
  Process 1 got (1, 2, y, 4), (a, b, c, d), (1, 2, z, 4)
  Process 2 got (1, 2, x, 4), (e, f, g, h), (1, 2, z, 4)
  Process 4 got (1, 2, x, 4), (1, 2, y, 4), (i, j, k, l)
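Steps 1 and 2 for Case I can be reproduced with a short simulation. This is only a sketch under stated assumptions: the helper name `collect_vectors`, the choice that a non-faulty process i sends the value i, and the strings "x", "y", "z" standing in for the traitor's arbitrary values are illustrative, not part of the lecture.

```python
def collect_vectors(n, faulty, lie):
    """Steps 1-2: build the vector each process collects (ids are 1-based).

    Assumption: non-faulty process i sends v_i = i to everyone, while the
    faulty process sends lie(receiver), which may differ per receiver.
    """
    vectors = {}
    for receiver in range(1, n + 1):
        vector = []
        for sender in range(1, n + 1):
            if sender == faulty and receiver != faulty:
                vector.append(lie(receiver))  # traitor's arbitrary value
            else:
                vector.append(sender)  # honest value (or the process's own)
        vectors[receiver] = tuple(vector)
    return vectors


# Case I: n = 4; process 3 is faulty and tells 1, 2, 4 the values x, y, z.
lies = {1: "x", 2: "y", 4: "z"}
vectors = collect_vectors(4, faulty=3, lie=lies.get)
for pid, vec in vectors.items():
    print(pid, vec)
```

Each non-faulty process ends up with a vector that agrees with the others everywhere except at the traitor's position.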

7 Byzantine Generals Problem
Step 4: Each process examines the ith element of each of the newly received vectors
  If any value has a majority, that value is put into the result vector
  If a value has no majority, the corresponding element of the result vector is marked UNKNOWN
Process 1 got (1, 2, y, 4), (a, b, c, d), (1, 2, z, 4). Result Vector: (1, 2, UNKNOWN, 4)
Process 2 got (1, 2, x, 4), (e, f, g, h), (1, 2, z, 4). Result Vector: (1, 2, UNKNOWN, 4)
Process 4 got (1, 2, x, 4), (1, 2, y, 4), (i, j, k, l). Result Vector: (1, 2, UNKNOWN, 4)
The algorithm detected the traitor!
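The Step 4 vote can be sketched as a column-wise strict-majority count over the vectors a process received in Step 3. The function name `result_vector` and the use of placeholder strings for the traitor's arbitrary values (a, b, c, d, and so on) are assumptions made for illustration.

```python
from collections import Counter

UNKNOWN = "UNKNOWN"


def result_vector(received):
    """Step 4: strict-majority vote on each position of the received vectors."""
    result = []
    for column in zip(*received):
        value, count = Counter(column).most_common(1)[0]
        # Keep the value only if more than half of the vectors agree on it.
        result.append(value if count > len(column) / 2 else UNKNOWN)
    return tuple(result)


# Vectors received by process 1 in Case I (n = 4, m = 1).
received_by_p1 = [(1, 2, "y", 4), ("a", "b", "c", "d"), (1, 2, "z", 4)]
print(result_vector(received_by_p1))  # (1, 2, 'UNKNOWN', 4)

# Vectors received by process 1 in Case II (n = 3, m = 1).
received_by_p1_case2 = [(1, 2, "y"), ("a", "b", "c")]
print(result_vector(received_by_p1_case2))  # ('UNKNOWN', 'UNKNOWN', 'UNKNOWN')
```

With three received vectors, two honest copies outvote the traitor at every honest position; with only two received vectors, no position can reach a strict majority.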

8 Byzantine Generals Problem
Case II: n = 3 and m = 1 (process 3 is the faulty process)
Step 1: Each process sends its value to all others
  Processes 1 and 2 send their values 1 and 2; faulty process 3 sends x to process 1 and y to process 2
Step 2: Each process collects the values received in a vector
  Process 1 got (1, 2, x)
  Process 2 got (1, 2, y)
  Process 3 got (1, 2, 3)
Step 3: Every process passes its vector to every other process
  Process 1 got (1, 2, y), (a, b, c)
  Process 2 got (1, 2, x), (d, e, f)

9 Byzantine Generals Problem
Step 4: Each process examines the ith element of each of the newly received vectors
  If any value has a majority, that value is put into the result vector
  If a value has no majority, the corresponding element of the result vector is marked UNKNOWN
Process 1 got (1, 2, y), (a, b, c). Result Vector: (UNKNOWN, UNKNOWN, UNKNOWN)
Process 2 got (1, 2, x), (d, e, f). Result Vector: (UNKNOWN, UNKNOWN, UNKNOWN)
The algorithm failed to detect the traitor!

10 Concluding Remarks on the Byzantine Generals Problem
Lamport et al. (1982) proved that in a system with m faulty processes, agreement can be achieved only if 2m + 1 correctly functioning processes are present, for a total of 3m + 1 processes
BGP(n, m) is solvable iff n ≥ 3m + 1
In other words, consensus is possible only if more than two-thirds of the processes are working properly
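The bound is easy to check against the two cases worked through earlier. The helper names `min_processes` and `is_solvable` below are hypothetical and exist only to illustrate the inequality n ≥ 3m + 1.

```python
def min_processes(m):
    """Smallest n for which agreement is possible with m traitors."""
    return 3 * m + 1


def is_solvable(n, m):
    """True when n processes can tolerate m Byzantine faults (n >= 3m + 1)."""
    return n >= min_processes(m)


for n, m in [(4, 1), (3, 1)]:
    print(f"n={n}, m={m}: solvable={is_solvable(n, m)}")
```

Case I (n = 4, m = 1) satisfies the bound, while Case II (n = 3, m = 1) falls one process short, which matches the outcomes of the two examples.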

