CSE 486/586, Spring 2014 CSE 486/586 Distributed Systems Byzantine Fault Tolerance --- 1 Steve Ko Computer Sciences and Engineering University at Buffalo.

Slides:



Advertisements
Similar presentations
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Leader Election Steve Ko Computer Sciences and Engineering University at Buffalo.
Advertisements

CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Consensus Steve Ko Computer Sciences and Engineering University at Buffalo.
Byzantine Generals. Outline r Byzantine generals problem.
1 Indranil Gupta (Indy) Lecture 8 Paxos February 12, 2015 CS 525 Advanced Distributed Systems Spring 2015 All Slides © IG 1.
CSE 486/586, Spring 2013 CSE 486/586 Distributed Systems Byzantine Fault Tolerance Steve Ko Computer Sciences and Engineering University at Buffalo.
The Byzantine Generals Problem Boon Thau Loo CS294-4.
The Byzantine Generals Problem Leslie Lamport, Robert Shostak, Marshall Pease Distributed Algorithms A1 Presented by: Anna Bendersky.
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Byzantine Fault Tolerance Steve Ko Computer Sciences and Engineering University at Buffalo.
Last Class: Weak Consistency
CS 425 / ECE 428 Distributed Systems Fall 2014 Indranil Gupta (Indy) Lecture 19: Paxos All slides © IG.
Distributed Algorithms – 2g1513 Lecture 9 – by Ali Ghodsi Fault-Tolerance in Distributed Systems.
Chapter 19 Recovery and Fault Tolerance Copyright © 2008.
CSE 486/586, Spring 2013 CSE 486/586 Distributed Systems Wrap-up Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Wrap-up Steve Ko Computer Sciences and Engineering University at Buffalo.
CS 425/ECE 428/CSE424 Distributed Systems (Fall 2009) Lecture 9 Consensus I Section Klara Nahrstedt.
CSE 60641: Operating Systems Implementing Fault-Tolerant Services Using the State Machine Approach: a tutorial Fred B. Schneider, ACM Computing Surveys.
CSE 486/586 CSE 486/586 Distributed Systems Byzantine Fault Tolerance Steve Ko Computer Sciences and Engineering University at Buffalo.
UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Replication Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 486/586, Spring 2013 CSE 486/586 Distributed Systems Byzantine Fault Tolerance Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 486/586, Spring 2014 CSE 486/586 Distributed Systems Google Spanner Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Paxos Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 486/586 CSE 486/586 Distributed Systems Leader Election Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 486/586, Spring 2013 CSE 486/586 Distributed Systems The Internet in 2 Hours: The First Hour Steve Ko Computer Sciences and Engineering University.
CSE 486/586, Spring 2014 CSE 486/586 Distributed Systems Paxos Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 486/586, Spring 2013 CSE 486/586 Distributed Systems Consistency Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 486/586 Distributed Systems Byzantine Fault Tolerance
reaching agreement in the presence of faults
Synchronizing Processes
COMP28112 – Lecture 15 Byzantine fault tolerance: dealing with arbitrary failures The Byzantine Generals’ problem (Byzantine Agreement) 26-Jan-18 COMP28112.
CSE 486/586 Distributed Systems Reliable Multicast --- 1
CSE 486/586 Distributed Systems Wrap-up
The consensus problem in distributed systems
CSE 486/586 Distributed Systems Leader Election
CS 525 Advanced Distributed Systems Spring 2013
CSE 486/586 Distributed Systems Mid-Semester Overview
Steve Ko Computer Sciences and Engineering University at Buffalo
CSE 486/586 Distributed Systems Paxos
CSE 486/586 Distributed Systems Consistency --- 1
Steve Ko Computer Sciences and Engineering University at Buffalo
CSE 486/586 Distributed Systems Wrap-up
COMP28112 – Lecture 14 Byzantine fault tolerance: dealing with arbitrary failures The Byzantine Generals’ problem (Byzantine Agreement) 13-Oct-18 COMP28112.
Dependability Dependability is the ability to avoid service failures that are more frequent or severe than desired. It is an important goal of distributed.
CSE 486/586 Distributed Systems Byzantine Fault Tolerance
COMP28112 – Lecture 13 Byzantine fault tolerance: dealing with arbitrary failures The Byzantine Generals’ problem (Byzantine Agreement) 19-Nov-18 COMP28112.
CS 525 Advanced Distributed Systems Spring 2018
Distributed Consensus
Distributed Systems, Consensus and Replicated State Machines
CSE 486/586 Distributed Systems Consistency --- 1
Distributed Systems CS
CS 425 / ECE 428 Distributed Systems Fall 2017 Indranil Gupta (Indy)
Byzantine Faults definition and problem statement impossibility
CSE 486/586 Distributed Systems Leader Election
CSE 486/586 Distributed Systems Concurrency Control --- 3
The Byzantine Generals Problem
COMP28112 – Lecture 13 Byzantine fault tolerance: dealing with arbitrary failures The Byzantine Generals’ problem (Byzantine Agreement) 22-Feb-19 COMP28112.
COMP28112 – Lecture 13 Byzantine fault tolerance: dealing with arbitrary failures The Byzantine Generals’ problem (Byzantine Agreement) 24-Apr-19 COMP28112.
CSE 486/586 Distributed Systems Global States
CSE 486/586 Distributed Systems Wrap-up
CSE 486/586 Distributed Systems Consistency --- 1
CSE 486/586 Distributed Systems Paxos
CSE 486/586 Distributed Systems Reliable Multicast --- 2
Steve Ko Computer Sciences and Engineering University at Buffalo
CSE 486/586 Distributed Systems Byzantine Fault Tolerance
CSE 486/586 Distributed Systems Concurrency Control --- 3
CSE 486/586 Distributed Systems Reliable Multicast --- 1
CSE 486/586 Distributed Systems Leader Election
CSE 486/586 Distributed Systems Consensus
CSE 486/586 Distributed Systems Mid-Semester Overview
Presentation transcript:

CSE 486/586, Spring 2014 CSE 486/586 Distributed Systems Byzantine Fault Tolerance Steve Ko Computer Sciences and Engineering University at Buffalo

CSE 486/586, Spring 2014 Recap Spanner –Geo-distributed database –Supports a relational data model with a SQL-like language –Supports distributed transactions with linearizability Transaction ordering for linearizability –Tight time synchronization –TrueTime-based timestamps –Principle: using a time value that is certain TrueTime –TT.now() returns an interval [earliest, latest]. –TT.after(t) is true if t has definitely passed. –TT.before(t) is true if t has definitely not arrived. 2

CSE 486/586, Spring 2014 Byzantine Fault Tolerance Fault categories –Benign: failures we’ve been talking about –Byzantine: arbitrary failures Benign –Fail-stop & crash: process halted –Omission: msg loss, send-omission, receive-omission –All entities still follow the protocol Byzantine –A broader category than benign failures –Process or channel exhibits arbitrary behavior. –May deviate from the protocol –Can be malicious (attacks, software bugs, etc.) 3

CSE 486/586, Spring 2014 Byzantine Fault Tolerance Result: with f faulty nodes, we need 3f + 1 nodes to tolerate their Byzantine behavior. –Fundamental limitation –Today’s goal is to understand this limitation. –Next lecture: a protocol that provides this guarantee. How about Paxos (that tolerates benign failures)? –With f faulty nodes, we need 2f + 1 to obtain the majority. 4

CSE 486/586, Spring 2014 “Byzantine” Leslie Lamport (again!) defined the problem & presented the result. “I have long felt that, because it was posed as a cute problem about philosophers seated around a table, Dijkstra's dining philosopher's problem received much more attention than it deserves.” “At the time, Albania was a completely closed society, and I felt it unlikely that there would be any Albanians around to object, so the original title of this paper was The Albanian Generals Problem.” “…The obviously more appropriate Byzantine generals then occurred to me.” 5

CSE 486/586, Spring 2014 Introducing the Byzantine Generals Imagine several divisions of the Byzantine army camped outside of a city Each division has a general. The generals can only communicate by a messenger. 6

CSE 486/586, Spring 2014 Introducing the Byzantine Generals They must decide on a common plan of action. –What is this problem? But, some of the generals can be traitors. 7 Attack Retreat Attack Attack/Retrea t

CSE 486/586, Spring 2014 Requirements All loyal generals decide upon the same plan of action (e.g., attack or retreat). A small number of traitors cannot cause the loyal generals to adopt a bad plan. There has to be a way to communicate one’s opinion to others correctly. 8

CSE 486/586, Spring 2014 The Byzantine Generals Problem The problem boils down to how a single general sends the general’s own value to the others. –Thus, we can simplify it in terms of a single commanding general sending an order to lieutenant generals. Byzantine Generals Problem: a commanding general must send an order to n-1 lieutenant generals such that –All loyal lieutenants obey the same order. –If the commanding general is loyal, then every loyal lieutenant obeys the order the commanding general sends. We’ll try a simple strategy and see if it works. –All-to-all communication: every general sends the opinion & repeatedly sends others’ opinions for reliability. –Majority: the final decision is the decision of the majority –Similar to reliable multicast 9

CSE 486/586, Spring 2014 CSE 486/586 Administrivia PA4 due next 1:59pm Final: 5/14, Wednesday, 3:30pm – 6:30pm –Norton 112 –Everything –No restroom use (this quickly becomes chaotic) –Bring an erasure, if you’d like. Important things about the final week –PA4 scores will be released by Wednesday. –Thursday and Friday office hours are for PA4. –No office hours from Monday to Wednesday –Scoring will hopefully be done by the end of the week. 10

CSE 486/586, Spring 2014 Question Can three generals agree on the plan of action? –One commander –Two lieutenants –One of them can be a traitor. –This means that we have 2f + 1 nodes. Protocol –Commander sends out an order (“attack”/“retreat”). –Lieutenants relay the order to each other for reliability. –Lieutenants follow the order of the commander. Can you come up with some scenarios where this protocol doesn’t work? 11

CSE 486/586, Spring 2014 Understanding the Problem 12 Commander (Traitor) Lieutenant 1Lieutenant 2 “attack” “retreat” “he said ‘retreat’”

CSE 486/586, Spring 2014 Understanding the Problem 13 Commander Lieutenant 1 Lieutenant 2 (Traitor) “attack” “he said ‘retreat’”

CSE 486/586, Spring 2014 Understanding the Problem With three generals, it is impossible to solve this problem with one traitor. Why not Paxos? –Paxos works with 2f + 1 nodes when f nodes are faulty. –In Paxos, f nodes can fail (or disappear) from the system, but they don’t lie and they are not malicious. In the Byzantine generals problem, f nodes might be alive and malicious. In general, you need 3f + 1 nodes to tolerate f faulty nodes in the Byzantine generals problem. Why? 14

CSE 486/586, Spring 2014 Intuition for the Result Problem setting –General question: how do we reach consensus in the presence of faulty (malicious) nodes? –Let’s say each honest node runs a deterministic algorithm that gives the same answer (yes/no). –We choose a quorum’s answer, since there can be malicious nodes that give a wrong answer intentionally. Question: how many votes do I need? –In Paxos, I need f + 1 votes (agreeing on either yes or no) out of 2f + 1 nodes, since that’s the majority. Will this work with Byzantine failures? –I.e., just like Paxos, let’s just collect f + 1 answers. –The principle is that the outcome should be determined by the answers of the honest nodes, not the malicious nodes. 15

CSE 486/586, Spring 2014 Intuition for the Result Let’s apply this to the Byzantine generals problem. –Principle: The outcome should be determined by the answers of the honest nodes, not the malicious nodes. –Let’s say we obtain f + 1 votes. –Up to f nodes can be malicious  getting f + 1 votes means that the result can contain up to f wrong answers. Example –2f + 1 nodes, and outcome by f + 1 votes. –f faulty nodes say no. –f non-faulty nodes say yes –1 non-faulty node says yes. –Ideal outcome? –Actual outcome? What do we need? 16

CSE 486/586, Spring 2014 Intuition for the Result We need more votes from the honest nodes than the faulty nodes, so the faulty nodes can’t influence the outcome. Unlike Paxos, we can’t simply collect f + 1 votes, since malicious nodes might give wrong answers. We need to obtain 2f + 1 answers. Then we have at least f + 1 votes from honest nodes, one more than the number of potential faulty nodes. Then we need to see if f + 1 votes say the same thing out of 2f + 1. This way, we can make sure that honest nodes determine the outcome. 17

CSE 486/586, Spring 2014 Intuition for the Result But, f nodes still might just simply fail, not reply at all. How do we get 2f + 1 replies when there are f failed nodes? Thus, we need at least 3f + 1 processes in total to tolerate f faulty processes. 18

CSE 486/586, Spring 2014 Summary Byzantine generals problem –They must decide on a common plan of action. –But, some of the generals can be traitors. Requirements –All loyal generals decide upon the same plan of action (e.g., attack or retreat). –A small number of traitors cannot cause the loyal generals to adopt a bad plan. Impossibility results –With three generals, it’s impossible to reach a consensus with one traitor –In general, with less than 3f + 1 nodes, we cannot tolerate f faulty nodes. 19

CSE 486/586, Spring Acknowledgements These slides contain material developed and copyrighted by Indranil Gupta (UIUC).