Distributed Systems CS 15-440 Case Study: Replication in Google Chubby Recitation 5, Oct 06, 2011 Majd F. Sakr, Vinay Kolar, Mohammad Hammoud.

Today…
- Last recitation session: Google Chubby Architecture
- Today's session: Consensus and Replication in Google Chubby
- Announcement: the Project 2 Interim Design Report is due soon

Overview
- Recap: Google Chubby
- Consensus in Chubby
- Paxos Algorithm

Recap: Google Data Center Architecture (to avoid clutter, the Ethernet connections are shown from only one of the clusters to the external links)

Chubby Overview
- A Chubby cell is the first level of the naming hierarchy inside Chubby (ls): /ls/chubby_cell/directory_name/…/file_name
- A Chubby instance is implemented as a small number of replicated servers (typically 5) with one designated master
- Replicas are placed at failure-independent sites; typically, they are placed within a cluster but not within the same rack
- The consistency of the replicated database is ensured through a consensus protocol that uses operation logs
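As a rough illustration of this naming scheme (not part of the real Chubby client API; the helper name is made up), a few lines of Python can split a node name into the cell that serves it and the path inside that cell:

    # Hypothetical helper, for illustration only: split a Chubby-style node name
    # of the form /ls/<cell>/<dir>/.../<file> into its cell and path components.
    def parse_chubby_name(name: str):
        parts = name.strip("/").split("/")
        if len(parts) < 3 or parts[0] != "ls":
            raise ValueError("expected a name of the form /ls/<cell>/<path>")
        return parts[1], parts[2:]

    cell, path = parse_chubby_name("/ls/chubby_cell/directory_name/file_name")
    # cell == "chubby_cell"; path == ["directory_name", "file_name"]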

Chubby Architecture Diagram

Consistency and Replication in Chubby
Challenges and assumptions for replicating data in the Google infrastructure:
1. Replica servers may run at arbitrary speed and may fail
2. Replica servers have access to stable, persistent storage that survives crashes
3. Messages may be lost, reordered, duplicated, or delayed
Google has implemented a consensus protocol, based on the Paxos algorithm, to ensure consistency. The protocol operates over a set of replicas with the goal of reaching agreement on an update to a common value.

Paxos Algorithm
- Another algorithm proposed by Lamport
- Paxos ensures correctness (safety), but not liveness
- Algorithm initiation and termination:
  - Any replica can submit a value with the goal of achieving consensus on a final value
  - In Chubby, consensus is achieved once all replicas have this value as the next entry in their update logs
- Paxos is guaranteed to achieve consensus if a majority of the replicas run for long enough with sufficient network stability

Paxos Approach Steps
1. Election: the group of replica servers elects a coordinator
2. Selection of a candidate value: the coordinator selects the final value and disseminates it to the group
3. Acceptance of the final value: the group accepts or rejects the value, which is then stored at all replicas
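Before walking through the three steps, here is a rough Python sketch (illustrative only, not Chubby's implementation) of the messages that flow between the coordinator and the replicas; the sequence number ties every later message back to the election round that produced it:

    from dataclasses import dataclass
    from typing import Any, Optional

    @dataclass
    class Propose:        # step 1: a candidate bids for coordinator
        seq: int

    @dataclass
    class Promise:        # step 1: reply; may carry a value the replica wants committed
        seq: int
        prior_value: Optional[Any] = None

    @dataclass
    class Accept:         # step 2: coordinator disseminates the chosen value
        seq: int
        value: Any

    @dataclass
    class Ack:            # step 2: replica acknowledges the accept message
        seq: int

    @dataclass
    class Commit:         # step 3: coordinator tells replicas to store the value
        seq: int
        value: Any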

1. Election
Approach:
- Each replica maintains the highest sequence number it has seen so far
- If a replica wants to bid for coordinator:
  - It picks a unique number that is higher than all sequence numbers it has seen so far
  - It broadcasts a "propose" message with this unique sequence number
- If the other replicas have not seen a higher sequence number, they reply with a "promise" message
  - The promise signifies that the replica will not make a promise to any other candidate with a lower sequence number
  - The promise message may include a value that the replica wants to commit
- The candidate replica that receives "promise" messages from a majority wins the election
Challenges:
- Multiple coordinators may co-exist
- Messages from old coordinators must be rejected (their lower sequence numbers make them recognizable)
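A minimal sketch of this step, under the same illustrative assumptions as above (the class and method names are made up, and the id-plus-multiple-of-n scheme is just one common way to keep sequence numbers unique across replicas):

    from typing import Any, Optional

    class Replica:
        def __init__(self, replica_id: int, num_replicas: int):
            self.replica_id = replica_id
            self.num_replicas = num_replicas
            self.highest_seq_seen = 0                 # highest sequence number seen so far
            self.pending_value: Optional[Any] = None  # value this replica wants committed, if any

        def next_proposal_seq(self) -> int:
            # Pick a unique number higher than anything seen so far: round up to the
            # next multiple of num_replicas, then add this replica's id.
            rounds = self.highest_seq_seen // self.num_replicas + 1
            return rounds * self.num_replicas + self.replica_id

        def on_propose(self, seq: int):
            # Promise only if no higher sequence number has been seen; otherwise the
            # proposal comes from a stale candidate and is ignored.
            if seq > self.highest_seq_seen:
                self.highest_seq_seen = seq
                return ("promise", seq, self.pending_value)
            return None

A candidate broadcasts next_proposal_seq() in a "propose" message and wins the election once it has collected promises from a majority (e.g. 3 out of 5 replicas).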

Message Exchanges in Election

2. Selection of a Candidate Value
Approach:
- The elected coordinator selects a value from all the promise messages it received
  - If the promise messages did not contain any value, then the coordinator is free to choose any value
- The coordinator sends an "accept" message (with the value) to the group of replicas
- Replicas should acknowledge the accept message
- The coordinator waits until a majority of the replicas answer
  - This wait may be indefinite (which is why Paxos does not guarantee liveness)
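A rough sketch of the coordinator's side of this step (illustrative only; on_accept is a hypothetical per-replica handler, and a real coordinator would gather acknowledgements asynchronously rather than in a loop):

    from typing import Any, List, Optional, Tuple

    def choose_value(promises: List[Tuple[int, Optional[Any]]], own_value: Any) -> Any:
        # promises: (prior_seq, prior_value) pairs carried by the promise messages.
        # If any promise carried a value, keep the one attached to the highest
        # sequence number (as in full Paxos); otherwise choose the coordinator's own value.
        carried = [(s, v) for (s, v) in promises if v is not None]
        if carried:
            return max(carried, key=lambda sv: sv[0])[1]
        return own_value

    def run_accept_phase(seq: int, value: Any, replicas, majority: int) -> bool:
        acks = 0
        for replica in replicas:
            if replica.on_accept(seq, value):   # hypothetical handler on each replica
                acks += 1
        # In the real protocol the coordinator may wait indefinitely for this majority.
        return acks >= majority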

Message Exchanges in Consensus

3. Commit the Value
Approach:
- If a majority of the replicas acknowledge, then the coordinator sends a "commit" message to all replicas
- Otherwise, the coordinator restarts the election process
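A final illustrative fragment (again not Chubby's code; the log-keeping replica is a toy stand-in) showing how the outcome of the accept phase decides between committing the value and restarting the election:

    from typing import Any

    class LogReplica:                        # toy stand-in for a replica and its update log
        def __init__(self):
            self.log = []
        def on_commit(self, seq: int, value: Any):
            self.log.append((seq, value))    # the agreed value becomes the next log entry

    def finish_round(seq: int, value: Any, acks: int, replicas, majority: int) -> bool:
        if acks >= majority:
            for replica in replicas:
                replica.on_commit(seq, value)
            return True                      # consensus reached; value stored at all replicas
        return False                         # no majority: restart the election with a higher seq

    replicas = [LogReplica() for _ in range(5)]
    finish_round(seq=7, value="next log entry", acks=3, replicas=replicas, majority=3)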

Message Exchanges in Commit

References “Paxos Made Live – An Engineering Perspective”, Tushar Chandra, Robert Griesemer, and Joshua Redstone, 26th ACM Symposium on Principles of Distributed Computing, PODC 2007