1 Chapter 14: Replication From Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edition 3, © Addison-Wesley 2001 Presentation.

Presentation transcript:

1 Chapter 14: Replication From Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edition 3, © Addison-Wesley 2001 Presentation based on slides by Coulouris et al; modified by Jens B Jorgensen and Jonas Thomsen, University of Aarhus

2 Replication – basics
- Goals:
  - Enhanced performance
    - Caching, which has a limit of effectiveness.
  - Increased availability
    - 1 - probability(all RMs failed) = 1 - p^n, where p is the probability that a single replica manager (RM) has failed and n is the number of RMs, assuming independent failures.
  - Fault tolerance
    - Guarantees strictly correct behavior.
- Requirements:
  - Replication transparency
    - Clients access one logical object and do not know about the physical objects.
  - Consistency
    - Application-dependent requirements.
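The availability figure on slide 2 can be checked numerically. The following is a minimal sketch, assuming every replica manager is down independently with the same probability p; the function name `availability` and the sample values are illustrative and do not come from the slides.

```python
def availability(p: float, n: int) -> float:
    """Probability that at least one of n replica managers is up.

    Assumption: each RM is down independently with probability p.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    if n < 1:
        raise ValueError("need at least one replica manager")
    # All n RMs are down with probability p^n; the service is
    # available in every other case.
    return 1.0 - p ** n


if __name__ == "__main__":
    # Illustrative values only: a 5% failure probability per RM.
    for n in (1, 2, 3):
        print(n, availability(0.05, n))
```

Each additional replica multiplies the probability of total failure by p, so availability improves quickly for small n and then flattens.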

3 Replication – basic architectural model
[Figure: clients (C) exchange requests and replies with front ends (FE), which communicate with the replica managers (RM) that make up the service.]
- One logical object is represented by several physical objects (replicas).
- Replica managers are state machines and can recover.
- Failure model: crashes only.
- Phases: request, coordination (ordering), execution, agreement, response.

4 Group communication – group membership services
[Figure: a process group whose members join, leave, and fail; the services shown are group address expansion, multicast communication (group send), and group membership management.]

5 Group communication – views and view delivery
- Views: ordered lists of current group members.
- For each group g, the group management service delivers to any member p in g a series of views:
  - v0(g), v1(g), v2(g), ...
- A member delivers a view when a membership change occurs, and the application is notified.
- An event e occurs in a view v(g) at process p if p has delivered v(g), but not the next view v'(g), when e occurs.

6 Group communication – view delivery requirements
- Order: If a process p delivers view v(g) and then v'(g), then no other process q delivers v'(g) before v(g).
- Integrity: If process p delivers view v(g), then p is in v(g).
- Non-triviality: If process q joins a group and is or becomes indefinitely reachable from process p, then eventually q is always in the views that p delivers (a converse condition is required when a group partition occurs).
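The Order and Integrity requirements can be expressed as checks over a recorded history of view deliveries. This is a minimal sketch under stated assumptions: the data layout (a dict from process name to its list of delivered view identifiers, and a dict from view identifier to member set) and the function names are hypothetical, and each process is assumed to deliver a given view at most once. Non-triviality is a liveness property and cannot be checked on a finite history, so it is left out.

```python
from itertools import combinations


def satisfies_order(deliveries):
    """Order: no two processes deliver two views in opposite orders.

    deliveries maps a process name to its list of view ids,
    in the order that process delivered them.
    """
    for p, q in combinations(deliveries, 2):
        pos_q = {view: i for i, view in enumerate(deliveries[q])}
        # Views delivered by both p and q, listed in p's delivery order.
        common = [view for view in deliveries[p] if view in pos_q]
        for earlier, later in combinations(common, 2):
            # p delivered `earlier` first, so q must not reverse the pair.
            if pos_q[earlier] > pos_q[later]:
                return False
    return True


def satisfies_integrity(deliveries, members):
    """Integrity: a process only delivers views that contain it.

    members maps a view id to the set of processes in that view.
    """
    return all(
        p in members[view]
        for p, views in deliveries.items()
        for view in views
    )


if __name__ == "__main__":
    members = {"v0": {"p", "q"}, "v1": {"p", "q", "r"}}
    history = {"p": ["v0", "v1"], "q": ["v0", "v1"], "r": ["v1"]}
    print(satisfies_order(history), satisfies_integrity(history, members))
```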

7 Group communication – purpose of view-synchronous communication
- Ensure that the delivery of a new view draws a conceptual line across the system.
- Every message that is delivered at all is consistently delivered on one side or the other of that line.
- Example: state transfer.

8 Group communication – requirements of view-synchronous communication
- Agreement: Correct processes deliver the same set of messages in any given view.
- Integrity: If a process p delivers message m, then it will not deliver m again. Furthermore, p is in group(m) and m was supplied to a multicast operation by sender(m).
- Validity (closed groups): Correct processes always deliver the messages that they send. If the system fails to deliver a message to any process q, then it notifies the surviving processes by delivering a new view with q excluded, immediately after the view in which any of them delivered the message.

9 Group communication – example of view-synchronous communication

10 Fault tolerance – introductory example
- Client 1:
  - setBalance(B,x,1)
  - setBalance(A,y,2)
- Client 2:
  - getBalance(A,y) -> 2
  - getBalance(A,x) -> 0

11 Fault tolerance – linearizability correctness criteria
- A replicated shared object service is said to be linearizable if for any execution there is some interleaving of the series of operations issued by all clients that satisfies the following two criteria:
  - The interleaved sequence of operations meets the specification of a (single) correct copy of the objects.
  - The order of operations in the interleaving is consistent with the real times at which the operations occurred in the actual execution.

12 Fault tolerance – sequential consistency correctness criteria
- A replicated shared object service is said to be sequentially consistent if for any execution there is some interleaving of the series of operations issued by all clients that satisfies the following two criteria:
  - The interleaved sequence of operations meets the specification of a (single) correct copy of the objects.
  - The order of operations in the interleaving is consistent with the program order in which each individual client executed them.

13 Fault tolerance – correctness criteria example
- Client 1:
  - setBalance(B,x,1)
  - setBalance(A,y,2)
- Client 2:
  - getBalance(A,y) -> 0
  - getBalance(A,x) -> 0
- Linearizable? Sequentially consistent?
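The two client histories on slides 10 and 13 are small enough to test for sequential consistency by brute force. The sketch below is illustrative only: it assumes every balance starts at 0, drops the replica name (A or B) because the criteria are stated against a single correct copy, and uses hypothetical helper names (`interleavings`, `meets_spec`, `sequentially_consistent`). It does not check linearizability, because the slides give no real-time information about when the operations occurred.

```python
def interleavings(a, b):
    """Yield every merge of lists a and b that keeps each list's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def meets_spec(ops):
    """Replay ops on a single copy; every get must see the latest set."""
    balances = {}  # assumption: an account that was never set reads as 0
    for kind, account, value in ops:
        if kind == "set":
            balances[account] = value
        elif balances.get(account, 0) != value:
            return False
    return True


def sequentially_consistent(client1, client2):
    """True if some program-order-preserving interleaving meets the spec."""
    return any(meets_spec(seq) for seq in interleavings(client1, client2))


if __name__ == "__main__":
    client1 = [("set", "x", 1), ("set", "y", 2)]
    slide10 = [("get", "y", 2), ("get", "x", 0)]
    slide13 = [("get", "y", 0), ("get", "x", 0)]
    print("slide 10:", sequentially_consistent(client1, slide10))
    print("slide 13:", sequentially_consistent(client1, slide13))
```

Under these assumptions the slide 10 history fails the check: reading y = 2 places that read after both updates, so the later read of x cannot return 0. The slide 13 history passes, because both reads can be ordered before both updates. Whether it is also linearizable depends on real time: if either update had completed before Client 2 issued its reads, no valid interleaving would respect the real-time order.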

14 Fault tolerance – the passive (primary-backup) model (1/3)
[Figure: clients (C) connect through front ends (FE) to the primary RM, which is linked to the backup RMs.]

15 Fault tolerance – the passive (primary-backup) model (2/3)
- Request:
  - FE → primary.
- Coordination:
  - None (the primary decides the order).
- Execution:
  - On the primary, which stores the response.
- Agreement:
  - The request is sent to the backups.
  - Their responses are awaited.
- Response:
  - The primary's response is sent to the FE.
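The five phases of the passive model can be traced in a small single-process simulation. This is a sketch under stated assumptions, not the protocol as implemented in any real system: direct method calls stand in for view-synchronous group communication, the class and method names are hypothetical, requests are simple balance updates, and the use of a request identifier to detect duplicates is an assumption about why the primary stores its response.

```python
class ReplicaManager:
    """A backup: applies updates pushed by the primary."""

    def __init__(self):
        self.balances = {}

    def apply(self, account, value):
        self.balances[account] = value
        return True  # acknowledgement sent back to the primary


class Primary(ReplicaManager):
    """The primary: executes requests and propagates them to the backups."""

    def __init__(self, backups):
        super().__init__()
        self.backups = backups
        self.responses = {}  # request id -> stored response (assumption)

    def handle(self, request_id, account, value):
        # Request: the front end has sent the request to the primary.
        # Coordination: none; the primary handles requests in arrival order.
        if request_id in self.responses:
            return self.responses[request_id]  # duplicate: resend response
        # Execution: apply the update locally and store the response.
        self.apply(account, value)
        response = ("ok", account, value)
        self.responses[request_id] = response
        # Agreement: send the update to every backup and await the acks.
        acks = [backup.apply(account, value) for backup in self.backups]
        if not all(acks):
            raise RuntimeError("a backup did not acknowledge the update")
        # Response: the primary's response goes back to the front end.
        return response


if __name__ == "__main__":
    backups = [ReplicaManager(), ReplicaManager()]
    primary = Primary(backups)
    print(primary.handle(1, "x", 1))
    print([b.balances for b in backups])
```

Because the primary replies only after every backup has acknowledged, any backup that later takes over has already seen every update the client was told about.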

16 Fault tolerance – the passive (primary-backup) model (3/3)
- Linearizable (the primary dictates the ordering).
- Linearizable on primary failure if:
  - the primary is replaced by a unique backup.
  - all backups agree on which operations had been performed.
- Solution:
  - Use view-synchronous group communication to send updates to the backups.

17 Fault tolerance – active replication (1/3)
[Figure: clients (C) connect through front ends (FE), which communicate with all RMs.]

18 Fault tolerance – active replication (2/3)
- Request:
  - The FE multicasts the request to all RMs.
- Coordination:
  - The group communication system delivers requests in total order.
- Execution:
  - Every RM executes the request.
- Agreement:
  - None.
- Response:
  - Each RM's response is sent to the FE, which decides what to deliver to the client.
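The active-replication phases can be sketched the same way. The code below is illustrative only: a toy in-process sequencer stands in for a totally ordered multicast, all class names are hypothetical, and the front end's policy of returning the first reply is an assumption that fits the crash-only failure model stated earlier (with arbitrary failures it would have to compare replies).

```python
class ReplicaManager:
    """A deterministic state machine over account balances."""

    def __init__(self, name):
        self.name = name
        self.balances = {}

    def execute(self, request):
        kind, account, value = request
        if kind == "set":
            self.balances[account] = value
            return "ok"
        return self.balances.get(account, 0)  # assumption: default balance 0


class TotalOrderMulticast:
    """Toy sequencer: each request reaches every RM before the next one."""

    def __init__(self, group):
        self.group = group
        self.next_seq = 0

    def multicast(self, request):
        seq = self.next_seq
        self.next_seq += 1
        # Coordination + execution: same request, same order, at every RM.
        return seq, [rm.execute(request) for rm in self.group]


class FrontEnd:
    def __init__(self, channel):
        self.channel = channel

    def request(self, req):
        # Request: multicast to the whole group.
        _, replies = self.channel.multicast(req)
        # Agreement: none needed.
        # Response: with crash failures only, the first reply suffices.
        return replies[0]


if __name__ == "__main__":
    rms = [ReplicaManager(n) for n in ("A", "B", "C")]
    fe = FrontEnd(TotalOrderMulticast(rms))
    print(fe.request(("set", "x", 1)))
    print(fe.request(("get", "x", None)))
```

Since every RM starts in the same state and executes the same requests in the same order, all replicas stay identical without an explicit agreement phase.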

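19 Fault tolerance – active replication (3/3)
- Sequentially consistent
  - The totally ordered multicast delivers requests to every RM in the same order, and each client's requests are processed in program order.
- Not linearizable
  - The total order does not necessarily match the real-time order of the requests.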

20 Summary
- Replication: what and why?
- Group communication:
  - Views.
  - View-synchronous communication.
- Fault tolerance:
  - Linearizability.
  - Sequential consistency.
  - Passive (primary-backup) replication.
  - Active replication.