1 ZYZZYVA: SPECULATIVE BYZANTINE FAULT TOLERANCE R. Kotla, L. Alvisi, M. Dahlin, A. Clement and E. Wong, U.T. Austin. Best Paper Award at SOSP 2007.


2 Motivation Why implement Byzantine Fault-Tolerant (BFT) replication? –Increasing value of data and decreasing cost of hardware –More non-fail-stop behaviors than previously believed –BFT is becoming cheaper –Cost of 3-way non-BFT replication is close to the cost of BFT replication

3 Zyzzyva (I) Uses speculation to reduce the cost of BFT replication –Primary replica proposes order of client requests to all secondary replicas (standard) –Secondary replicas speculatively execute the request without going through an agreement protocol to validate that order (new idea)

4 Zyzzyva (II) As a result –States of correct replicas may diverge –Replicas may send diverging replies to client Zyzzyva’s solution –Clients detect inconsistencies –Help convergence of correct replicas to a single total ordering of requests –Reject inconsistent replies

5 How? Clients observe a replicated state machine Replies contain enough information to let clients ascertain if the replies and the history are stable and guaranteed to be eventually committed Replicas have checkpoints

6 Byzantine agreement (I) No solution with fewer than four entities

7 Byzantine agreement (II) To achieve agreement in the presence of f failed nodes (“traitors”) we need –3f + 1 entities
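The 3f + 1 bound can be turned into a small calculation. The sketch below is illustrative only; the function names `min_replicas` and `max_faults` are hypothetical and do not come from the slides.

```python
def min_replicas(f: int) -> int:
    """Smallest number of entities needed to tolerate f Byzantine faults (3f + 1)."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 1


def max_faults(n: int) -> int:
    """Largest f such that 3f + 1 <= n, i.e. how many traitors n entities can tolerate."""
    if n < 1:
        raise ValueError("n must be positive")
    return (n - 1) // 3
```

With three entities `max_faults` returns 0, which matches the previous slide: no agreement is possible with fewer than four entities once a single traitor is present.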

8 Practical BFT (I) Practical Byzantine Fault-Tolerant protocol (PBFT) [Castro and Liskov 1999]

9 Practical BFT (II) Replicas decide on correct ordering

10 Practical BFT (III)
1. Client sends signed request to primary replica
2. Primary assigns a sequence number to the request and sends a PRE-PREPARE message to all other replicas
3. Secondary replicas validate the message and send a PREPARE message to all replicas
4. Replicas that can collect 2f PREPARE messages send a COMMIT message to all replicas
5. Replicas that can collect 2f + 1 COMMIT messages send a REPLY to the client
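The two thresholds in steps 4 and 5 can be sketched as counters kept by one replica for a single request slot. This is a minimal illustration, not PBFT's actual implementation: the class `SlotState` and its method names are hypothetical, message validation and signatures are assumed to have happened elsewhere, and requiring the PRE-PREPARE before sending COMMIT is an assumption added here for completeness.

```python
from dataclasses import dataclass, field


@dataclass
class SlotState:
    """Messages one replica has accepted for a single (view, sequence number) slot."""
    f: int
    pre_prepared: bool = False
    prepares: set = field(default_factory=set)  # ids of replicas with a matching PREPARE
    commits: set = field(default_factory=set)   # ids of replicas with a matching COMMIT

    def on_pre_prepare(self) -> None:
        self.pre_prepared = True

    def on_prepare(self, replica_id: int) -> None:
        # A set ignores duplicate messages from the same replica.
        self.prepares.add(replica_id)

    def on_commit(self, replica_id: int) -> None:
        self.commits.add(replica_id)

    def can_send_commit(self) -> bool:
        # Step 4: 2f matching PREPARE messages (plus the PRE-PREPARE, by assumption).
        return self.pre_prepared and len(self.prepares) >= 2 * self.f

    def can_reply(self) -> bool:
        # Step 5: 2f + 1 matching COMMIT messages.
        return self.can_send_commit() and len(self.commits) >= 2 * self.f + 1
```

Storing sender ids in sets rather than incrementing integers means a faulty replica cannot inflate a count by resending the same message.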

11 A shortened version Faster agreement is achieved thanks to a more complex view change protocol

12 The explanation (I) "No replicated service that uses the traditional view change protocol can be live without an agreement protocol that includes both the prepare and commit full exchanges." "The traditional view change protocol lets correct replicas commit to a view change and become silent in a view without any guarantee that their action will lead to the view change."

13 The explanation (II) Zyzzyva –Adds an extra phase to its view change protocol –Guarantees that a correct replica will not abandon a view unless every other correct replica does so too

14 Zyzzyva Agreement (I) Common case: no faulty replicas

15 Explanations Secondary replicas assume that –Primary replica gave the right ordering –All secondary replicas will participate in the transaction They then initiate speculative execution Client receives 3f + 1 mutually consistent responses

16 Zyzzyva Agreement (II) With a faulty replica

17 Explanations (I) Client receives only 3f mutually consistent responses Having gathered at least 2f + 1 mutually consistent responses, it distributes a commit certificate to the replicas Once at least 2f + 1 replicas acknowledge receiving the commit certificate, the client considers the request completed
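The client-side rule from the two agreement cases (3f + 1 matching responses with no faulty replicas, at least 2f + 1 otherwise) can be sketched as follows. This is a hedged illustration: `Outcome`, `classify_replies`, and `request_completed` are hypothetical names, replies are represented as a plain mapping from replica id to a reply digest, and the slides do not say here what the client does with fewer than 2f + 1 matching responses, so that case is only reported, not handled.

```python
from collections import Counter
from enum import Enum


class Outcome(Enum):
    COMPLETE = "complete"                    # 3f + 1 matching responses
    SEND_COMMIT_CERTIFICATE = "commit-cert"  # between 2f + 1 and 3f matching responses
    NOT_ENOUGH = "not-enough"                # fewer than 2f + 1 matching responses


def classify_replies(replies: dict, f: int) -> Outcome:
    """Decide the client's next step; `replies` maps replica id -> reply digest."""
    if not replies:
        return Outcome.NOT_ENOUGH
    # Size of the largest group of mutually consistent responses.
    _, matching = Counter(replies.values()).most_common(1)[0]
    if matching >= 3 * f + 1:
        return Outcome.COMPLETE
    if matching >= 2 * f + 1:
        return Outcome.SEND_COMMIT_CERTIFICATE
    return Outcome.NOT_ENOUGH


def request_completed(acks: set, f: int) -> bool:
    """After distributing a commit certificate: done once 2f + 1 replicas acknowledge it."""
    return len(acks) >= 2 * f + 1
```

For f = 1, four identical digests yield `Outcome.COMPLETE`, while three identical digests yield `Outcome.SEND_COMMIT_CERTIFICATE`, after which the request completes once three replicas have acknowledged the certificate.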

18 Explanations (II) If enough secondary replicas suspect that the primary replica is faulty, a view change is initiated and a new primary elected

19 Comparison with traditional solutions

20 State maintained at each replica

21 Explanations (I) Each replica maintains –A history of the requests it has executed –A copy of the max commit certificate it has received This lets it distinguish between committed history and speculative history
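The split between committed and speculative history can be sketched with a list and a single index. This is a simplified model under stated assumptions: `ReplicaState` and its fields are hypothetical names, sequence numbers are taken to be 1-based positions in the history, and the max commit certificate is reduced to the highest sequence number it covers.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicaState:
    history: list = field(default_factory=list)  # requests in execution order
    max_commit_seq: int = 0                      # highest sequence number covered by a commit certificate

    def execute(self, request: str) -> int:
        """Speculatively execute a request and return its 1-based sequence number."""
        self.history.append(request)
        return len(self.history)

    def on_commit_certificate(self, seq: int) -> None:
        """Keep only the certificate covering the longest prefix of the history."""
        if seq > len(self.history):
            raise ValueError("certificate refers to a request not yet executed")
        self.max_commit_seq = max(self.max_commit_seq, seq)

    def committed_history(self) -> list:
        return self.history[: self.max_commit_seq]

    def speculative_history(self) -> list:
        return self.history[self.max_commit_seq:]
```

Everything up to `max_commit_seq` is treated as committed; the remaining suffix is still speculative and is the part a view change may have to revisit.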

22 Explanations (II) Each replica constructs a checkpoint every CP_INTERVAL requests It maintains one stable checkpoint with a corresponding stable application state snapshot It might also have up to one speculative checkpoint with its corresponding speculative application state snapshot

23 Explanations (III) Checkpoints and application state become committed through a process similar to that of earlier BFT agreement protocols –Replicas send signed checkpoint messages to all replicas when they generate a tentative checkpoint –Replicas commit a checkpoint after they collect f + 1 signed matching checkpoint messages
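The checkpoint rule (one checkpoint every CP_INTERVAL requests, committed after f + 1 matching checkpoint messages) can be sketched as below. This is an illustration under assumptions: the value 128 for `CP_INTERVAL` is arbitrary because the slides give none, `CheckpointTracker` is a hypothetical name, a checkpoint is identified by a sequence number plus a state digest, and signature verification is assumed to have been done before a message reaches the tracker.

```python
from collections import defaultdict

CP_INTERVAL = 128  # illustrative value only; the slides do not specify one


def is_checkpoint_point(seq: int, interval: int = CP_INTERVAL) -> bool:
    """True when a replica should construct a checkpoint after executing request `seq`."""
    return seq > 0 and seq % interval == 0


class CheckpointTracker:
    def __init__(self, f: int) -> None:
        self.f = f
        self._senders = defaultdict(set)  # (seq, digest) -> ids of replicas that sent it
        self.stable_seq = 0               # sequence number of the latest committed checkpoint

    def on_checkpoint_message(self, sender: int, seq: int, digest: str) -> bool:
        """Record a verified checkpoint message; return True if it commits a new checkpoint."""
        senders = self._senders[(seq, digest)]
        senders.add(sender)
        if seq > self.stable_seq and len(senders) >= self.f + 1:
            self.stable_seq = seq
            return True
        return False
```

Messages are grouped by both sequence number and digest, so only replicas that agree on the same application state count toward the f + 1 threshold.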

24 View change sub-protocol (I)

25 Explanations Two-phase protocol Elects a new primary Guarantees that it will not introduce any changes in a history that has already completed at a correct client

26 Performance: throughput

27 Comments Zyzzyva-5 is a variant of Zyzzyva that requires more replicas (5f + 1 instead of 3f + 1) but has a lower overhead

28 Performance: latency

29 Scalability: peak throughputs

30 CONCLUSIONS Systematically exploiting speculative execution results in a protocol much faster than conventional BFT agreement protocols. Observe that Zyzzyva is optimized for the most frequent case but provides the correct result in all cases: a good rule to follow.