Reliable multicast Tolerates process crashes. The additional requirements are: Only correct processes will receive multicasts from all correct processes.

Slides:



Advertisements
Similar presentations
COS 461 Fall 1997 Group Communication u communicate to a group of processes rather than point-to-point u uses –replicated service –efficient dissemination.
Advertisements

CS425/CSE424/ECE428 – Distributed Systems – Fall 2011 Material derived from slides by I. Gupta, M. Harandi, J. Hou, S. Mitra, K. Nahrstedt, N. Vaidya.
DISTRIBUTED SYSTEMS II FAULT-TOLERANT BROADCAST Prof Philippas Tsigas Distributed Computing and Systems Research Group.
CS542 Topics in Distributed Systems Diganta Goswami.
Lab 2 Group Communication Andreas Larsson
Distributed Systems Fall 2010 Replication Fall 20105DV0203 Outline Group communication Fault-tolerant services –Passive and active replication Highly.
CS 582 / CMPE 481 Distributed Systems Fault Tolerance.
Group Communications Group communication: one source process sending a message to a group of processes: Destination is a group rather than a single process.
Transis Efficient Message Ordering in Dynamic Networks PODC 1996 talk slides Idit Keidar and Danny Dolev The Hebrew University Transis Project.
Group Communication Phuong Hoai Ha & Yi Zhang Introduction to Lab. assignments March 24 th, 2004.
1 Principles of Reliable Distributed Systems Lecture 5: Failure Models, Fault-Tolerant Broadcasts and State-Machine Replication Spring 2005 Dr. Idit Keidar.
Distributed Systems Fall 2009 Replication Fall 20095DV0203 Outline Group communication Fault-tolerant services –Passive and active replication Highly.
Distributed Systems 2006 Virtual Synchrony* *With material adapted from Ken Birman.
Group Communication A group is a collection of users sharing some common interest.Group-based activities are steadily increasing. There are many types.
Dec 4, 2007 Reliable Multicast Group Neelofer T. CMSC 621.
Copyright © George Coulouris, Jean Dollimore, Tim Kindberg This material is made available for private study and for direct.
Group Communication A group is a collection of users sharing some common interest.Group-based activities are steadily increasing. There are many types.
Lab 2 Group Communication Farnaz Moradi Based on slides by Andreas Larsson 2012.
CSE 486/586, Spring 2013 CSE 486/586 Distributed Systems Replication with View Synchronous Group Communication Steve Ko Computer Sciences and Engineering.
Practical Byzantine Fault Tolerance
Group Communication Group oriented activities are steadily increasing. There are many types of groups:  Open and Closed groups  Peer-to-peer and hierarchical.
Farnaz Moradi Based on slides by Andreas Larsson 2013.
Dealing with open groups The view of a process is its current knowledge of the membership. It is important that all processes have identical views. Inconsistent.
Dealing with open groups The view of a process is its current knowledge of the membership. It is important that all processes have identical views. Inconsistent.
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Replication Steve Ko Computer Sciences and Engineering University at Buffalo.
Hwajung Lee. A group is a collection of users sharing some common interest.Group-based activities are steadily increasing. There are many types of groups:
Totally Ordered Broadcast in the face of Network Partitions [Keidar and Dolev,2000] INF5360 Student Presentation 4/3-08 Miran Damjanovic
EEC 688/788 Secure and Dependable Computing Lecture 10 Wenbing Zhao Department of Electrical and Computer Engineering Cleveland State University
Fault Tolerant Services
Fault Tolerance. Basic Concepts Availability The system is ready to work immediately Reliability The system can run continuously Safety When the system.
The Totem Single-Ring Ordering and Membership Protocol Y. Amir, L. E. Moser, P. M Melliar-Smith, D. A. Agarwal, P. Ciarfella.
Reliable Communication Smita Hiremath CSC Reliable Client-Server Communication Point-to-Point communication Established by TCP Masks omission failure,
V1.7Fault Tolerance1. V1.7Fault Tolerance2 A characteristic of Distributed Systems is that they are tolerant of partial failures within the distributed.
Building Dependable Distributed Systems, Copyright Wenbing Zhao
Replication and Group Communication. Management of Replicated Data FE Requests and replies C Replica C Service Clients Front ends managers RM FE RM Instructor’s.
Hwajung Lee.  Improves reliability  Improves availability ( What good is a reliable system if it is not available?)  Replication must be transparent.
Replication Improves reliability Improves availability ( What good is a reliable system if it is not available?) Replication must be transparent and create.
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Replication Steve Ko Computer Sciences and Engineering University at Buffalo.
Fault Tolerance (2). Topics r Reliable Group Communication.
Group Communication A group is a collection of users sharing some common interest.Group-based activities are steadily increasing. There are many types.
Distributed Computing Systems Replication Dr. Sunny Jeong. Mr. Colin Zhang With Thanks to Prof. G. Coulouris,
Replication Chapter Katherine Dawicki. Motivations Performance enhancement Increased availability Fault Tolerance.
EEC 688/788 Secure and Dependable Computing Lecture 10 Wenbing Zhao Department of Electrical and Computer Engineering Cleveland State University
CSE 486/586 Distributed Systems Reliable Multicast --- 1
Fault Tolerance Chap 7.
Primary-Backup Replication
6.4 Data and File Replication
Reliable group communication
Computer Science 425 Distributed Systems CS 425 / ECE 428 Fall 2013
Reliable Multicast Group
Outline Announcements Fault Tolerance.
7.1. CONSISTENCY AND REPLICATION INTRODUCTION
Replication Improves reliability Improves availability
Advanced Operating System
Active replication for fault tolerance
EEC 688/788 Secure and Dependable Computing
Group Communication A group is a collection of users sharing some common interest.Group-based activities are steadily increasing. There are many types.
EEC 688/788 Secure and Dependable Computing
Lecture 21: Replication Control
EEC 688/788 Secure and Dependable Computing
DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S
EEC 688/788 Secure and Dependable Computing
EEC 688/788 Secure and Dependable Computing
EEC 688/788 Secure and Dependable Computing
EEC 688/788 Secure and Dependable Computing
EEC 688/788 Secure and Dependable Computing
EEC 688/788 Secure and Dependable Computing
Lecture 21: Replication Control
Abstractions for Fault Tolerance
CSE 486/586 Distributed Systems Reliable Multicast --- 1
Presentation transcript:

Reliable multicast Tolerates process crashes. The additional requirements are: Only correct processes will receive multicasts from all correct processes in the group. Multicasts by faulty processes will be received either by every correct process, or by none at all.

A theorem on reliable multicast In an asynchronous distributed system, total order reliable multicasts cannot be implemented when even a single process undergoes a crash failure. Why? Since it will violate the FLP impossibility result.

Scalable Reliable Multicast IP multicast or application layer multicast has to detect the loss of messages and use retransmission for achieving reliability. For large groups (like distance learning applications) scalability is a major problem.

Scalable Reliable Multicast Difficult to scale: Sender state explosion Message implosion State: receiver 1, receiver 2, … receiver n

Scalable Reliable Multicast If omission failures are rare, then receivers will only report the non- receipt of messages using NACK, It only triggers selective point- to-point retransmission. The reduction of acknowledgements is the underlying principle of Scalable Reliable Multicasts (SRM). If several members of a group fail to receive a message, then each such member waits for a random period of time before sending its NACK. This helps to suppress redundant NACKs. Sender multicasts the missing copy only once.

Dealing with open groups Processes may join or leave an open group. Life will be simpler, if everyone has a consistent view of the current membership. (view = current membership) What problems can arise if members do not have identical views?

Membership service A group membership service looks after the following: Joining and leaving groups. Updating all members about the latest view of the group Failure detection

Dealing with open groups Views should propagate in the same order to all. Example. Current view v0(g) = {0, 1, 2, 3}. Let 1, 2 leave and 4 join the group concurrently. This view change can be serialized in many ways: {0,1,2,3}, {0,1,3} {0,3,4}, OR {0,1,2,3}, {0,2,3}, {0,3}, {0,3,4}, OR {0,1,2,3}, {0,3}, {0,3,4} Send these changes by total order multicast.

View propagation {Process 0}: v 0 (g); v 0 (g) = {0.1,2,3}, send m1,... ; v 1 (g); send m2, send m3; v 1 (g) = {0,1,3}, v 2 (g) ; {Process 1}: v 2 (g) = {0,3,4} v 0 (g); send m4, send m5; v 1 (g); send m6; v 2 (g)...;

View-synchronous communication With respect to each message, all correct processes have the same view. m sent in view V  m received in view V

View delivery guidelines If a process j joins and thereafter continues its membership in a group g that already contains a process i, then eventually j appears in all views delivered by process i. If a process j permanently leaves a group g that contains a process i, then eventually j is excluded from all views delivered by process i.

View-synchronous communication Agreement. If a correct process k delivers a message m in v i (g ) before delivering the next view v i+1 (g), then every correct process j  v i (g)  v i+1 (g) must deliver m before delivering v i+1 (g). Integrity. If a process j delivers a view v i (g ), then v i (g ) must include j. Validity. If a process k delivers a message m in view v i (g) and another process j  v i (g) does not deliver that message m, then the next view v i+1 (g) delivered by k must exclude j.

Example Let process 1 deliver m and then crash. Possibility 1. No one delivers m, but each delivers the new view {0,2,3}. Possibility 2. Processes 0, 2, 3 deliver m and then deliver the new view {0,2,3} Possibility 3. Processes 2, 3 deliver m and then deliver the new view {0,2,3} but process 0 first delivers the view {0,2,3} and then delivers m. Are these acceptable? {0,1,2,3}{0,2,3} m m m

Overview of Transis Group communication system developed by Danny Dolev at the Hebrew University of Jerusalem. Deals with open group Supports scalable reliable multicast Tolerates network partition

Overview of Transis IP multicast (or ethernet LAN) used to support high bandwidth multicast. Acks are piggybacked and message loss is detected transparently, leading to selective retransmission The sequence of messages P 1, P 2, p 2 Q 1, Q 2, q 3 R 1, … received by a member i  {P,Q,R,S} shows the recipient did not receive the message Q3.

Overview of Transis Causal mode (maintains causal order) Agreed mode (maintains total order that does not conflict with the causal order) Safe mode (Delivers a message only when the lower levels of the system have acknowledged its reception at all the destination machines. All messages are delivered relative to a safe message)

Overview of Transis Each partition assumes that the machines in the other partition have failed, and maintains consistency within its own partition only. After repair, consistency is restored in the entire system. Dealing with partition

Replication Improves reliability Improves availability ( What good is a reliable system if it is not available?) Replication must be transparent and create the illusion of a single copy.

Updating replicated data F AliceBob F’F’ ’ BobAlice Update and consistency are primary issues.

Passive replication At most one replica can be the primary server Each client maintains a variable L (leader) that specifies the replica to which it will send requests. Requests are queued at the primary server. Backup servers ignore client requests. primary backup clients

Primary-backup protocol Receive. Receive the request from the client and update the state if appropriate. Broadcast. Broadcast an update of the state to all other replicas. Reply. Send a response to the client. reqreply update client primary backup ?

Primary-backup protocol If the client fails to get a response due the crash of the primary, then the request is retransmitted until a backup is promoted to the primary, Failover time is the duration when there is no primary server. reqreply update client primary backup ?