Copyright 2006 Koren & Krishna ECE655/ByzGen.1 UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Fault Tolerant Computing ECE 655.
Byzantine Failures
Failure Types
Fail-Stop: The unit fails by stopping; no output is produced.
Consistent Output: All users see the same wrong output.
Byzantine Failure: No consistency is guaranteed; different users may see different values of the output.
Original Motivation
A computer system uses sensor inputs to control some process. Is there a way to ensure that all the functional processors see the output of a sensor consistently, even when the sensor exhibits Byzantine failure?
The problem was originally uncovered at NASA Langley and studied by contractors from SRI International in 1978.
Byzantine Generals Problem
The Byzantine army is besieging a city, with each division commanded by its own general. The commander-in-chief (c-in-c) coordinates the divisions. There could be traitors among these commanders.
The c-in-c has two alternatives: attack or retreat.
If the actions taken by the divisions are not consistent with one another, the army will be defeated.
Commands are sent in the form of two-party oral messages. Assume messages are not lost.
Requirements for Success
The decision algorithm must satisfy the following conditions of interactive consistency:
IC1. All loyal generals must agree on the same plan of action.
IC2. If the c-in-c is loyal, the loyal generals must obey his order.
It is NOT an aim of the algorithm to identify the traitors.
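The two conditions can be checked mechanically for a single run. The following is a minimal sketch, not part of the original slides; the function name, the argument names, and the convention of passing `None` when the c-in-c is a traitor are all assumptions made for illustration.

```python
from typing import Dict, Optional, Set


def check_interactive_consistency(
    decisions: Dict[str, str],
    loyal: Set[str],
    commander_order: Optional[str] = None,
) -> Dict[str, bool]:
    """Check IC1 and IC2 for one run of an agreement algorithm.

    decisions: action chosen by each divisional general.
    loyal: names of the loyal divisional generals.
    commander_order: the order issued by a loyal c-in-c, or None if the
        c-in-c is a traitor (IC2 then holds vacuously).
    """
    # Only the decisions of loyal generals matter for either condition.
    loyal_decisions = {decisions[g] for g in loyal}
    # IC1: all loyal generals chose the same action.
    ic1 = len(loyal_decisions) <= 1
    # IC2: if the c-in-c is loyal, every loyal general chose his order.
    ic2 = commander_order is None or loyal_decisions <= {commander_order}
    return {"IC1": ic1, "IC2": ic2}
```

Note that the decisions of traitorous generals are ignored, which matches the statement that identifying the traitors is not an aim of the algorithm.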
Traitors’ Aim
One or more of the generals (including the c-in-c) could be traitors. The aim of the traitors is to cause the Byzantine army to be defeated by making the loyal generals violate the interactive consistency conditions.
Impossibility Result
Try to solve the problem with the c-in-c and two divisional generals, A and B.
Case 1. The c-in-c is a traitor: he tells A to attack and B to retreat.
Case 2. The c-in-c is loyal but A is a traitor: the c-in-c tells both A and B to attack; however, A tells B that his orders from the c-in-c are to retreat.
What should B do? In both cases B hears one order from the c-in-c and the opposite order reported by A, and he cannot tell which of the two is the traitor.
Conclusion: The problem cannot be solved for a 3-node system with one Byzantine failure.
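A brute-force check can make the dilemma concrete. The sketch below is an illustration only, not a proof of the general result: it assumes a single exchange in which each divisional general decides from the pair (order from the c-in-c, order relayed by the other general) using a fixed deterministic rule. All names in it are hypothetical.

```python
from itertools import product

ORDERS = ("attack", "retreat")
# A view is (order received from the c-in-c, order relayed by the peer).
VIEWS = list(product(ORDERS, repeat=2))


def rule_pair_is_safe(rule_a, rule_b):
    """Return True if IC1 and IC2 hold in every one-traitor scenario.

    rule_a and rule_b map a view to the decision of A and B respectively.
    """
    # Loyal c-in-c sends v to both; the traitorous general may relay anything.
    for v, lie in product(ORDERS, repeat=2):
        if rule_b[(v, lie)] != v:  # A is the traitor, so B must obey (IC2)
            return False
        if rule_a[(v, lie)] != v:  # B is the traitor, so A must obey (IC2)
            return False
    # Traitorous c-in-c sends a to A and b to B; both relay honestly.
    for a, b in product(ORDERS, repeat=2):
        if rule_a[(a, b)] != rule_b[(b, a)]:  # loyal A and B must agree (IC1)
            return False
    return True


def count_safe_rule_pairs():
    """Count the (rule for A, rule for B) pairs that are always safe."""
    rules = [dict(zip(VIEWS, outs)) for outs in product(ORDERS, repeat=len(VIEWS))]
    return sum(rule_pair_is_safe(ra, rb) for ra, rb in product(rules, repeat=2))


print(count_safe_rule_pairs())  # prints 0
```

None of the 256 rule pairs survives: IC2 forces each general to obey the c-in-c whatever his peer reports, and a traitorous c-in-c then splits the two loyal generals.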
Byzantine Generals Algorithm
General approach: The c-in-c sends his order to each of the divisional generals. Each divisional general then recursively uses the Byzantine Generals algorithm to disseminate to his colleagues the order he received from the c-in-c.
The key algorithm parameter is m, the maximum number of traitors that the algorithm must tolerate.
Algorithm OM(0)
Step 1. The c-in-c sends his order to each divisional general.
Step 2. Each divisional general obeys the order he receives from the c-in-c.
Algorithm OM(m), m>0
Step 1. The c-in-c sends his order to each divisional general.
Step 2. Each divisional general uses OM(m-1), acting as the c-in-c, to disseminate to the other divisional generals the order he got from the c-in-c. At the end of this step, each divisional general has a vector containing:
(a) the order he received from the c-in-c, and
(b) the order disseminated by every other divisional general.
Step 3. Each divisional general follows the majority decision from the vector obtained in Step 2.
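The recursion in OM(m) can be sketched as a small simulation. This is an illustrative sketch rather than the authors' implementation. The function names, the `lie(sender, receiver, value)` callback that models what a traitor sends, and the choice of "retreat" as the tie-breaking default are assumptions introduced here.

```python
from collections import Counter


def majority(values, default="retreat"):
    """Return the most common value; fall back to `default` on a tie."""
    counts = Counter(values).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return default
    return counts[0][0]


def om(m, commander, generals, order, traitors, lie):
    """Simulate OM(m); return {general: value he settles on}.

    generals: the divisional generals receiving from `commander`.
    traitors: set of traitor names (may include the commander).
    lie: hypothetical callback giving the value a traitor sends.
    """
    # Step 1: the commander sends his order to each divisional general.
    received = {
        g: (lie(commander, g, order) if commander in traitors else order)
        for g in generals
    }
    if m == 0:
        return received  # OM(0): obey whatever was received
    # Step 2: each general relays what he received, using OM(m-1).
    relayed = {}
    for g in generals:
        others = [h for h in generals if h != g]
        relayed[g] = om(m - 1, g, others, received[g], traitors, lie)
    # Step 3: majority over own order plus the orders relayed by the others.
    decisions = {}
    for g in generals:
        votes = [received[g]] + [relayed[h][g] for h in generals if h != g]
        decisions[g] = majority(votes)
    return decisions


def flip(sender, receiver, value):
    """Example traitor behaviour: always send the opposite order."""
    return "retreat" if value == "attack" else "attack"


# N = 4, m = 1: loyal c-in-c orders "attack", divisional general C is a traitor.
result = om(1, "cic", ["A", "B", "C"], "attack", {"C"}, flip)
print(result["A"], result["B"])  # prints: attack attack
```

The returned dictionary also contains entries for traitorous generals; these are meaningless and should be ignored when checking IC1 and IC2.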
Claim
If the total number of generals is N>=3m+1, running OM(m) will ensure that the interactive consistency conditions are satisfied.
If the total number of generals is N<3m+1, no algorithm exists that can ensure the interactive consistency conditions are satisfied for this fault model.
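The bound translates directly into simple arithmetic. The helper names below are hypothetical and only restate N>=3m+1 in both directions.

```python
def min_generals_required(m: int) -> int:
    """Smallest total number of generals N that satisfies N >= 3m + 1."""
    return 3 * m + 1


def max_traitors_tolerated(n: int) -> int:
    """Largest m with n >= 3m + 1, for a total of n >= 1 generals."""
    return (n - 1) // 3


for n in (3, 4, 6, 7):
    print(n, max_traitors_tolerated(n))
```

For example, three generals tolerate no traitor at all, four generals tolerate one, and a second traitor can only be tolerated once there are seven.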
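Proof of Correctness
Induction proof: the induction basis, m=0, is obvious. Suppose the result holds for up to m=M, and consider what happens if m=M+1, given the condition N>=3m+1.
Case 1. The c-in-c is a traitor: There are up to m-1 traitors among the N-1 divisional generals. Since N-1>=3(m-1)+1, the induction hypothesis applies to each run of OM(m-1) in Step 2, so all loyal generals obtain the same vector and reach the same majority decision.
Case 2. The c-in-c is loyal: The c-in-c sent a consistent order to everyone. There are up to m traitors among the N-1 divisional generals, so at least N-1-m>=2m of them are loyal and relay the c-in-c's order. [A complete proof must also show that each loyal relay is received correctly even with m traitors taking part in OM(m-1); this needs an induction hypothesis stronger than the one stated here.]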
Modification: Signed Messages
Orders are sent by means of signed messages.
Assumptions about signed messages:
A loyal general’s signature cannot be forged.
Anyone can authenticate a general’s signature.
Signed Messages Algorithm
Step 1. The c-in-c sends a signed order to each divisional general.
Step 2. When a divisional general receives an order, he
adds it to V, his vector of received copies of the order, and,
if the order carries fewer than m distinct signatures of divisional generals (not counting the c-in-c's),
» adds his own signature to the order and
» sends the order (augmented with his signature) to every divisional general who has not signed it.
Step 3. After all generals have been heard from (or a timeout has passed), each divisional general decides on a course of action. [If the c-in-c is unmasked as a traitor, use a pre-agreed default action.]
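The signed-message steps can be sketched as a message-queue simulation. This is a minimal sketch under stated assumptions, not the authors' code: a signature chain is modelled as a tuple of signer names, unforgeability is modelled by never letting anyone alter an order or drop a signature, traitorous divisional generals are assumed to stay silent, and the default action is assumed to be "retreat". All names are hypothetical.

```python
from collections import deque


def sm(m, commander, generals, orders_sent, traitors, default="retreat"):
    """Simulate the signed-messages algorithm; return loyal generals' decisions.

    orders_sent: {general: order the c-in-c signs and sends to him}.
        A loyal c-in-c sends the same order to everyone; a traitor may not.
    traitors: traitorous divisional generals (assumed to stay silent).
    """
    seen = {g: set() for g in generals}  # distinct orders in each general's V
    # Step 1: every initial message carries only the c-in-c's signature.
    queue = deque((g, order, (commander,)) for g, order in orders_sent.items())
    while queue:
        receiver, order, signers = queue.popleft()
        # Step 2: record the order.
        seen[receiver].add(order)
        if receiver in traitors:
            continue  # assumed traitor behaviour: relay nothing
        # Count signatures of divisional generals, excluding the c-in-c's.
        if len(signers) - 1 < m:
            for h in generals:
                if h != receiver and h not in signers:
                    queue.append((h, order, signers + (receiver,)))
    # Step 3: one distinct order means obey it; conflicting orders signed
    # by the c-in-c (or no order at all) mean the default action.
    return {
        g: (next(iter(seen[g])) if len(seen[g]) == 1 else default)
        for g in generals
        if g not in traitors
    }


# Three generals, m = 1: a traitorous c-in-c sends conflicting signed orders.
print(sm(1, "cic", ["A", "B"], {"A": "attack", "B": "retreat"}, set()))
# prints: {'A': 'retreat', 'B': 'retreat'}
```

In this run both loyal generals end up holding two conflicting orders signed by the c-in-c, so both fall back to the default action and still agree, even though only three generals are involved.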