Byzantine Fault Tolerance CS 425: Distributed Systems Fall 2011 1 Material drived from slides by I. Gupta and N.Vaidya.

Slides:



Advertisements
Similar presentations
Consensus on Transaction Commit
Advertisements

NETWORK ALGORITHMS Presenter- Kurchi Subhra Hazra.
Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The.
CS 5204 – Operating Systems1 Paxos Student Presentation by Jeremy Trimble.
CSE 486/586, Spring 2013 CSE 486/586 Distributed Systems Byzantine Fault Tolerance Steve Ko Computer Sciences and Engineering University at Buffalo.
1 Attested Append-Only Memory: Making Adversaries Stick to their Word Byung-Gon Chun (ICSI) October 15, 2007 Joint work with Petros Maniatis (Intel Research,
1 Principles of Reliable Distributed Systems Lectures 11: Authenticated Byzantine Consensus Spring 2005 Dr. Idit Keidar.
Yee Jiun Song Cornell University. CS5410 Fall 2008.
CS 582 / CMPE 481 Distributed Systems
Computer Science Lecture 17, page 1 CS677: Distributed OS Last Class: Fault Tolerance Basic concepts and failure models Failure masking using redundancy.
EEC 693/793 Special Topics in Electrical Engineering Secure and Dependable Computing Lecture 16 Wenbing Zhao Department of Electrical and Computer Engineering.
EEC 693/793 Special Topics in Electrical Engineering Secure and Dependable Computing Lecture 15 Wenbing Zhao Department of Electrical and Computer Engineering.
Attested Append-only Memory: Making Adversaries Stick to their Word Distributed Storage Systems CS presented by: Hussam Abu-Libdeh.
EEC 693/793 Special Topics in Electrical Engineering Secure and Dependable Computing Lecture 16 Wenbing Zhao Department of Electrical and Computer Engineering.
Last Class: Weak Consistency
EEC 693/793 Special Topics in Electrical Engineering Secure and Dependable Computing Lecture 15 Wenbing Zhao Department of Electrical and Computer Engineering.
Practical Byzantine Fault Tolerance (The Byzantine Generals Problem)
EEC 688 Secure and Dependable Computing Lecture 16 Wenbing Zhao Department of Electrical and Computer Engineering Cleveland State University
Byzantine fault tolerance
EEC 688/788 Secure and Dependable Computing Lecture 14 Wenbing Zhao Department of Electrical and Computer Engineering Cleveland State University
HQ Replication: Efficient Quorum Agreement for Reliable Distributed Systems James Cowling 1, Daniel Myers 1, Barbara Liskov 1 Rodrigo Rodrigues 2, Liuba.
Practical Byzantine Fault Tolerance
Practical Byzantine Fault Tolerance Jayesh V. Salvi
Byzantine fault-tolerance COMP 413 Fall Overview Models –Synchronous vs. asynchronous systems –Byzantine failure model Secure storage with self-certifying.
From Viewstamped Replication to BFT Barbara Liskov MIT CSAIL November 2007.
1 ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE R.Kotla, L. Alvisi, M. Dahlin, A. Clement and E. Wong U. T. Austin Best Paper Award at SOSP 2007.
Byzantine fault tolerance
Practical Byzantine Fault Tolerance and Proactive Recovery
Paxos: Agreement for Replicated State Machines Brad Karp UCL Computer Science CS GZ03 / M st, 23 rd October, 2008.
Byzantine Fault Tolerance CS 425: Distributed Systems Fall 2012 Lecture 26 November 29, 2012 Presented By: Imranul Hoque 1.
EECS 262a Advanced Topics in Computer Systems Lecture 25 Byzantine Agreement November 28 th, 2012 John Kubiatowicz and Anthony D. Joseph Electrical Engineering.
CSE 60641: Operating Systems Implementing Fault-Tolerant Services Using the State Machine Approach: a tutorial Fred B. Schneider, ACM Computing Surveys.
EEC 688/788 Secure and Dependable Computing Lecture 15 Wenbing Zhao Department of Electrical and Computer Engineering Cleveland State University
UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department
Byzantine Fault Tolerance
Systems Research Barbara Liskov October Replication Goal: provide reliability and availability by storing information at several nodes.
BChain: High-Throughput BFT Protocols
Byzantine Fault Tolerance
Tolerating Latency in Replicated State Machines through Client Speculation April 22, 2009 Benjamin Wester1, James Cowling2, Edmund B. Nightingale3, Peter.
Secure Causal Atomic Broadcast, Revisited
Principles of Computer Security
View Change Protocols and Reconfiguration
Byzantine Fault Tolerance
Outline Distributed Mutual Exclusion Distributed Deadlock Detection
Principles of Computer Security
Jacob Gardner & Chuan Guo
EEC 688/788 Secure and Dependable Computing
EEC 688/788 Secure and Dependable Computing
EEC 688/788 Secure and Dependable Computing
IS 651: Distributed Systems Byzantine Fault Tolerance
Fault-tolerance techniques RSM, Paxos
EEC 688/788 Secure and Dependable Computing
EEC 688/788 Secure and Dependable Computing
From Viewstamped Replication to BFT
EEC 688/788 Secure and Dependable Computing
HW5 What’s a quorum? How can we use the concept of quorum?
EEC 688/788 Secure and Dependable Computing
EEC 688/788 Secure and Dependable Computing
EEC 688/788 Secure and Dependable Computing
Building Dependable Distributed Systems, Copyright Wenbing Zhao
View Change Protocols and Reconfiguration
John Kubiatowicz Electrical Engineering and Computer Sciences
EEC 688/788 Secure and Dependable Computing
EEC 688/788 Secure and Dependable Computing
EEC 688/788 Secure and Dependable Computing
EEC 688/788 Secure and Dependable Computing
EEC 688/788 Secure and Dependable Computing
EEC 688/788 Secure and Dependable Computing
Sisi Duan Assistant Professor Information Systems
Presentation transcript:

Byzantine Fault Tolerance CS 425: Distributed Systems Fall Material drived from slides by I. Gupta and N.Vaidya

Reading List L. Lamport, R. Shostak, M. Pease, “The Byzantine Generals Problem,” ACM ToPLaS M. Castro and B. Liskov, “Practical Byzantine Fault Tolerance,” OSDI

Byzantine Generals Problem A sender wants to send message to n-1 other peers Fault-free nodes must agree Sender fault-free  agree on its message Up to f failures

Byzantine Generals Problem A sender wants to send message to n-1 other peers Fault-free nodes must agree Sender fault-free  agree on its message Up to f failures

S v v v Byzantine Generals Algorithm 5 value v Faulty peer

S v v v 6 value v v v Byzantine Generals Algorithm

S v v v 7 value v v v ? ? Byzantine Generals Algorithm

S v v v 8 value v v v v ? v ? Byzantine Generals Algorithm

3 2 v v v 9 value v x v v ? v ? [v,v,?] S 1 Byzantine Generals Algorithm

3 2 v v v 10 value v x v v ? v ? v v S 1 Majority vote results in correct result at good peers Byzantine Generals Algorithm

S v w x 11 Faulty source Byzantine Generals Algorithm

S 3 2 v w x 12 w w 1 Byzantine Generals Algorithm

S 3 2 v w x 13 x w w v x v 1 Byzantine Generals Algorithm

S 3 2 v w x 14 x w w v x v 1 [v,w,x] Byzantine Generals Algorithm

S 3 2 v w x 15 x w w v x v 1 [v,w,x] Vote result identical at good peers Byzantine Generals Algorithm

Known Results Need 3f + 1 nodes to tolerate f failures Need Ω(n 2 ) messages in general 16

Ω(n 2 ) Message Complexity Each message at least 1 bit Ω(n 2 ) bits “communication complexity” to agree on just 1 bit value 17

Practical Byzantine Fault Tolerance Computer systems provide crucial services Computer systems fail – Crash-stop failure – Crash-recovery failure – Byzantine failure Example: natural disaster, malicious attack, hardware failure, software bug, etc. Need highly available service 18 Replicate to increase availability

Challenges 19 Request A Request B Client

Requirements All replicas must handle same requests despite failure. Replicas must handle requests in identical order despite failure. 20

Challenges 21 2: Request B 1: Request A Client

State Machine Replication 22 2: Request B 1: Request A 2: Request B 1: Request A 2: Request B 1: Request A 2: Request B 1: Request A Client How to assign sequence number to requests?

Primary Backup Mechanism 23 Client 2: Request B 1: Request A What if the primary is faulty? Agreeing on sequence number Agreeing on changing the primary (view change) What if the primary is faulty? Agreeing on sequence number Agreeing on changing the primary (view change) View 0

Normal Case Operation Three phase algorithm: – PRE-PREPARE picks order of requests – PREPARE ensures order within views – COMMIT ensures order across views Replicas remember messages in log Messages are authenticated – {.} σk denotes a message sent by k 24

Pre-prepare Phase Primary: Replica 0 Replica 1 Replica 2 Replica 3 Request: m {PRE-PREPARE, v, n, m} σ0 Fail 25

Prepare Phase Request: m PRE-PREPARE Primary: Replica 0 Replica 1 Replica 2 Replica 3 Fail Accepted PRE-PREPARE 26

Prepare Phase Request: m PRE-PREPARE Primary: Replica 0 Replica 1 Replica 2 Replica 3 Fail {PREPARE, v, n, D(m), 1} σ1 Accepted PRE-PREPARE 27

Prepare Phase Request: m PRE-PREPARE Primary: Replica 0 Replica 1 Replica 2 Replica 3 Fail {PREPARE, v, n, D(m), 1} σ1 Accepted PRE-PREPARE Collect PRE-PREPARE + 2f matching PREPARE 28

Commit Phase Request: m PRE-PREPARE Primary: Replica 0 Replica 1 Replica 2 Replica 3 Fail PREPARE {COMMIT, v, n, D(m)} σ2 29

Commit Phase (2) Request: m PRE-PREPARE Primary: Replica 0 Replica 1 Replica 2 Replica 3 Fail PREPARE COMMIT Collect 2f+1 matching COMMIT: execute and reply 30

View Change Provide liveness when primary fails – Timeouts trigger view changes – Select new primary (= view number mod 3f+1) Brief protocol – Replicas send VIEW-CHANGE message along with the requests they prepared so far – New primary collects 2f+1 VIEW-CHANGE messages – Constructs information about committed requests in previous views 31

View Change Safety Goal: No two different committed request with same sequence number across views Quorum for Committed Certificate (m, v, n) At least one correct replica has Prepared Certificate (m, v, n) At least one correct replica has Prepared Certificate (m, v, n) View Change Quorum View Change Quorum 32

Related Works Fault Tolerance Fail Stop Fault Tolerance Paxos 1989 (TR) VS Replication PODC 1988 Byzantine Fault Tolerance Byzantine Agreement Rampart TPDS 1995 SecureRing HICSS 1998 PBFT OSDI ‘99 BASE TOCS ‘03 Byzantine Quorums Malkhi-Reiter JDC 1998 Phalanx SRDS 1998 Fleet ToKDI ‘00 Q/U SOSP ‘05 Hybrid Quorum HQ Replication OSDI ‘06 33