Reliable Group Communication Quanzeng You & Haoliang Wang.


Topics
– Reliable Multicasting
– Scalable Multicasting
– Atomic Multicasting
– Epidemic Multicasting

Reliable Multicasting
A message sent to a process group should, ideally, be delivered to every member of that group.
Problems
– A process joins the group during the communication: should the newly joined process receive the message?
– What happens if a process crashes during the communication?

What Is Reliable Communication?
In the presence of faulty processes
– All nonfaulty group members receive the message
When all processes operate correctly
– Every message is delivered to each current group member

Basic Reliable-Multicasting Schemes (BRMS)
Assumptions
– Processes do not fail
– Processes do not join or leave the group
– The multicast channel is unreliable, but messages are received in the order they are sent
Retransmission choices
1. A receiver that detects a missing message sends a retransmission request to the sender
2. The sender automatically retransmits a message that has not been acknowledged within a certain time
Design trade-offs: point-to-point retransmission, piggybacked acknowledgements
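The receiver-driven variant of this scheme can be sketched as follows. This is a minimal illustration, not the protocol from any particular system: the class names `Sender` and `Receiver`, the per-message sequence numbers, and the rule that a receiver buffers out-of-order messages while requesting the missing ones are all assumptions made for the example.

```python
class Sender:
    """Keeps each message in a history buffer until every receiver has ACKed it."""

    def __init__(self, receivers):
        self.receivers = set(receivers)
        self.next_seq = 0
        self.history = {}   # seq -> payload
        self.acked = {}     # seq -> receivers that acknowledged it

    def multicast(self, payload):
        seq = self.next_seq
        self.next_seq += 1
        self.history[seq] = payload
        self.acked[seq] = set()
        return seq, payload

    def on_ack(self, receiver_id, seq):
        if seq not in self.history:
            return
        self.acked[seq].add(receiver_id)
        if self.acked[seq] == self.receivers:
            # Everyone has the message, so it can leave the history buffer.
            del self.history[seq]
            del self.acked[seq]

    def retransmit(self, seq):
        return seq, self.history[seq]


class Receiver:
    """Delivers messages in sequence order and reports gaps."""

    def __init__(self):
        self.expected = 0
        self.pending = {}    # out-of-order messages waiting for a gap to close
        self.delivered = []

    def on_message(self, seq, payload):
        """Return the sequence numbers to request again (empty if no gap)."""
        if seq < self.expected or seq in self.pending:
            return []        # duplicate
        self.pending[seq] = payload
        while self.expected in self.pending:
            self.delivered.append(self.pending.pop(self.expected))
            self.expected += 1
        if not self.pending:
            return []
        return [s for s in range(self.expected, max(self.pending))
                if s not in self.pending]
```

For example, if a receiver gets message 1 before message 0, `on_message` returns `[0]`; after the sender's `retransmit(0)` arrives, both messages are delivered in order.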

Scalability in Reliable Multicasting
Issues with BRMS
– The sender must keep each message in a history buffer until every receiver has returned an ACK
– It cannot support large numbers of receivers
Solution
– Receivers return feedback only when they miss a message

Nonhierarchical Feedback Control
Key: reduce the number of feedback messages (feedback suppression)
Features
– Successful receipt of a multicast message is never acknowledged
– Only a missed message is reported (NACK)
– Detecting a missing message is left to the application
– Retransmissions are assumed to always be multicast to the entire group

Nonhierarchical Feedback Control The first retransmission request leads to the suppression of others.
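Feedback suppression can be illustrated with a small simulation. The sketch below assumes that every receiver that missed the message schedules its NACK after a uniformly random delay, and that a multicast NACK reaches all other receivers after a fixed propagation time; the function name, parameters, and timing model are assumptions for the example.

```python
import random


def simulate_suppression(num_missing, max_delay, propagation, seed=0):
    """Return how many NACKs are actually sent for one missing message.

    Each of the num_missing receivers picks a random timer in [0, max_delay].
    The earliest timer fires and its NACK is multicast; any receiver whose
    timer fires before that NACK arrives (propagation seconds later) also
    sends one. All later timers are cancelled (suppressed).
    """
    rng = random.Random(seed)
    timers = sorted(rng.uniform(0, max_delay) for _ in range(num_missing))
    first = timers[0]
    return 1 + sum(1 for t in timers[1:] if t < first + propagation)
```

With zero propagation delay exactly one NACK is sent. As the propagation delay grows relative to the timer range, more receivers fire before they hear the first NACK, which is why accurate scheduling is hard across a wide-area network.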

Issues
– A history buffer is still needed, and the sender may be forced to keep a message forever
– Ensuring that only one retransmission request is sent requires accurate scheduling of feedback messages at each receiver, which is not easy across a wide-area network
– NACKs and retransmissions interrupt processes that have already received the message
Solutions
– Dynamically group the processes that have not received a message into a separate multicast group
– Group processes that tend to miss the same messages so that they share one multicast channel

Hierarchical Feedback Control
Improving the scalability of SRM (Scalable Reliable Multicasting)
– Assistance from receivers
A hierarchical solution
– Scales to large groups of receivers

Hierarchical Feedback Control
– Each local coordinator has its own history buffer
– A coordinator that misses a message requests it from the coordinator of its parent group
Problem
– The tree must be constructed dynamically, for example by using the underlying network structure

Reliable Multicasting in the Presence of Process Failures
– A message is delivered either to all processes or to none at all
– Virtual synchrony

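Communication Layer
– Process failures are defined in terms of process groups and changes to group membership
– The communication layer sends and receives messages
– Received messages are buffered locally in the communication layer until they can be delivered to the application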

Virtual Synchrony: Basic Definitions
– Group view: the group membership at the time the sender sent message m; each process in the view has the same view
– View change: a change in group membership, announced by multicasting a view-change message vc

Requirement
Two multicast messages are simultaneously in transit: m and vc
– All or nothing: m is either delivered to all processes in the group view G before vc is delivered, or m is not delivered at all
Requirement for a reliable multicast protocol
– m is allowed to fail in only one case: the group membership change is due to the sender of m crashing

Virtually Synchronous Multicast
– If the sender crashes during the multicast, the message is either delivered to all remaining processes or ignored by each of them
– A view change acts as a barrier across which no multicast can pass
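The barrier rule alone can be sketched as below. This is a deliberately reduced illustration under stated assumptions: each multicast is tagged with the identifier of the view in which it was sent, and the hypothetical `GroupMember` class ignores any message whose tag does not match its installed view. A real implementation also needs a flush step before a new view is installed, so that all surviving members reach the same decision about each in-transit message; that step is not shown.

```python
class GroupMember:
    """Illustrates only the rule that no multicast crosses a view change."""

    def __init__(self, members):
        self.view_id = 0
        self.view = frozenset(members)
        self.delivered = []

    def on_multicast(self, sent_in_view, payload):
        """Deliver a message only if it was sent in the currently installed view."""
        if sent_in_view != self.view_id:
            return False          # message from another view: ignored
        self.delivered.append(payload)
        return True

    def on_view_change(self, members):
        """Install the next view; later messages from the old view are ignored."""
        self.view_id += 1
        self.view = frozenset(members)
```

A message tagged with view 0 that arrives after the member has installed view 1 is ignored, while messages tagged with view 1 are delivered normally.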

Message Ordering
Four different orderings
– Unordered, FIFO-ordered, causally ordered, and totally ordered multicast
Unordered multicast

Message Ordering
FIFO-ordered multicast
Causally ordered multicast
– Causality between different messages is preserved
– Implemented using vector timestamps
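Causally ordered delivery with vector timestamps can be sketched as follows. The example assumes the usual delivery rule: a message from sender j with timestamp ts is deliverable at a process with clock VC when ts[j] = VC[j] + 1 and ts[k] <= VC[k] for every other k. The class name `CausalProcess` and the message tuple layout are assumptions for the example.

```python
class CausalProcess:
    """Holds back messages until all causally preceding messages are delivered."""

    def __init__(self, pid, n):
        self.pid = pid
        self.clock = [0] * n     # clock[k] = messages from process k delivered here
        self.held = []           # received but not yet deliverable
        self.delivered = []

    def send(self, payload):
        """Stamp an outgoing multicast with the sender's vector clock."""
        self.clock[self.pid] += 1
        return (self.pid, list(self.clock), payload)

    def _deliverable(self, sender, ts):
        next_from_sender = ts[sender] == self.clock[sender] + 1
        seen_dependencies = all(ts[k] <= self.clock[k]
                                for k in range(len(ts)) if k != sender)
        return next_from_sender and seen_dependencies

    def receive(self, msg):
        """Buffer the message, then deliver everything that has become deliverable."""
        self.held.append(msg)
        progress = True
        while progress:
            progress = False
            for m in list(self.held):
                sender, ts, payload = m
                if self._deliverable(sender, ts):
                    self.clock[sender] = ts[sender]
                    self.delivered.append(payload)
                    self.held.remove(m)
                    progress = True
```

If process 1 sends a reply after delivering a message from process 0, a third process that receives the reply first holds it back until the original message has arrived.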

Different versions of virtual synchrony

Implementation of Virtual Synchrony
Assumptions
– Two consecutive views differ by at most one process
– No process fails while a new view change is being announced

Scalability Challenges
– Large-scale distributed systems
– Mundane transient problems
– Both SRM and virtual synchrony have poor scalability

Scalability Challenges - SRM
Request and retransmission storms
– Overhead grows linearly with system size, or even quadratically in the worst case

Scalability Challenges - Virtual Synchrony
Throughput instability
– Performance decreases as the perturbation rate and the group size increase

Scalability Challenges - Virtual Synchrony
Micropartitions
– To sustain stable throughput, failure detection is configured aggressively
– As a result, healthy processes are frequently excluded from the group
– Leaving and rejoining are costly

Scalability Challenges - Virtual Synchrony
Convoys
– Transmission bursts in a tree-based system
– Traffic becomes increasingly bursty layer by layer
– Poor utilization of network bandwidth

Scalability Challenges
Goal
– Guaranteed scalability, performance, and stable throughput, even under stress and even at a significant rate of packet loss
Solution
– Epidemic protocols

Epidemic Protocol
Analogy of epidemic or rumor spreading (gossip protocol)

Epidemic Protocol

Binomial Distribution

Epidemic Protocol: Propagation Time
– Time to complete infection: O(log n) gossip rounds for a group of n processes
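A small simulation can make the logarithmic propagation time concrete. The sketch assumes a simple push model that is not spelled out in the slides: in each round, every infected process contacts one process chosen uniformly at random and infects it. The function name and the seeded random generator are assumptions for the example.

```python
import random


def rounds_to_infect_all(n, seed=0):
    """Count push-gossip rounds until all n processes hold the update."""
    rng = random.Random(seed)
    infected = {0}               # process 0 starts with the update
    rounds = 0
    while len(infected) < n:
        # Every infected process pushes to one uniformly random process.
        targets = {rng.randrange(n) for _ in infected}
        infected |= targets
        rounds += 1
    return rounds
```

Because the infected set can at most double per round, at least log2(n) rounds are always needed; in this model the observed count stays within a small multiple of log n as n grows.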

Update Propagation Model
Anti-entropy
– Monotonicity
– Order preservation
Implementation
– Ordered update logs are maintained at each node
– Each update is assigned a (timestamp, node id) pair
– Incoming updates are compared with the log to decide whether to merge, roll back and merge, or discard
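The log comparison can be sketched as a merge keyed on the (timestamp, node id) pair. This is an assumed, simplified reading of the slide: an incoming update already in the log is discarded, a newer one is appended, and an older one is inserted at its place in the order, which stands in for "roll back and merge". The function name `merge_logs` and the tuple layout are assumptions for the example.

```python
def merge_logs(local, incoming):
    """Merge two update logs of (timestamp, node_id, value) tuples.

    Updates are identified and ordered by (timestamp, node_id). Duplicates
    are discarded and the result is returned in log order.
    """
    merged = {(ts, node): value for ts, node, value in local}
    for ts, node, value in incoming:
        # setdefault keeps the existing entry, so a duplicate is discarded.
        merged.setdefault((ts, node), value)
    return [(ts, node, merged[(ts, node)]) for ts, node in sorted(merged)]
```

The merge is order-independent and idempotent, so two nodes that exchange logs in either direction end up with the same ordered log.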

Optimization
Unreliable multicast
– Rapidly distributes messages, but some may be lost (gaps)
Gap repair
– Each process periodically gossips with a randomly chosen process, exchanging digests of the messages received so far, and repairs gaps

Start by using unreliable multicast to rapidly distribute the message.

Periodically (e.g. every 100ms) each process sends a digest describing its state to a randomly selected group member.

The recipient checks the gossip digest against its own history and solicits any missing message from the process that sent the gossip.

Processes respond to solicitations received and retransmit the requested message.
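The digest, solicitation, and retransmission steps above can be sketched as one gossip exchange between two nodes. The class `GossipNode`, its method names, and the use of a set of message ids as the digest are assumptions for the example; the initial unreliable multicast and the periodic timer are left out.

```python
class GossipNode:
    """Holds received messages and repairs gaps from a peer's digest."""

    def __init__(self, name):
        self.name = name
        self.store = {}          # message id -> payload

    def digest(self):
        """Summarize local state as the set of message ids held."""
        return set(self.store)

    def on_digest(self, digest):
        """Return the ids the peer advertises that this node is missing."""
        return digest - set(self.store)

    def on_solicit(self, wanted):
        """Answer a solicitation with the requested messages that are held."""
        return {mid: self.store[mid] for mid in wanted if mid in self.store}

    def on_retransmit(self, messages):
        self.store.update(messages)


def gossip_round(sender, recipient):
    """sender gossips its digest; recipient solicits and receives what it lacks."""
    wanted = recipient.on_digest(sender.digest())
    recipient.on_retransmit(sender.on_solicit(wanted))
```

Only the recipient of the digest is repaired in a given exchange; the sender's own gaps are filled when it receives a digest from some other process in a later round.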

Optimization
Bounded overhead of gossiping
– For a given process, the amount of data retransmitted per round is bounded, and excess requests are ignored
– A hash scheme is used to spread the buffering load around the system
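The retransmission bound can be sketched as a per-round byte budget. The function name, the byte-based budget, and the choice to stop at the first request that does not fit are assumptions for the example; the slides only state that the amount is bounded and that excess requests are ignored.

```python
def serve_requests(requests, store, budget_bytes):
    """Return the message ids retransmitted this round within the byte budget.

    requests: message ids in the order they were solicited.
    store: message id -> payload (bytes) held by this process.
    """
    served = []
    used = 0
    for mid in requests:
        payload = store.get(mid)
        if payload is None:
            continue             # not held here, nothing to retransmit
        if used + len(payload) > budget_bytes:
            break                # budget exhausted: remaining requests are ignored
        served.append(mid)
        used += len(payload)
    return served
```

Requests that are ignored in one round are not lost for good, since the requester can solicit the same message again after a later digest.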

Optimization
Hierarchical gossip
– Gossip targets are weighted so that nearby processes reachable over low-latency links are preferred
– Each node maintains only a subset of the full system membership
– The rate of gossip is increased to compensate for the increasing propagation delays
– The weight of each node is adjusted to sustain a constant load on routers

Scalability
– Cost per process of each gossip round: 1 message sent + 1 message received (with high probability) + retransmission of a bounded amount of data
– The load on each node is constant, which in principle means almost unlimited scalability
– In reality, scalability is limited by propagation latency and group membership tracking

Scalability

Reliability
– Reliability is tunable
– Messages in the buffer are replicated across the system
– Reliability increases with the length of time a message is kept before it is garbage collected

Summary
– SRM is a best-effort group communication protocol; reliability is not guaranteed
– Virtual synchrony is a reliable group communication protocol
– Neither SRM nor virtual synchrony scales well
– Gossip-based protocols can provide good scalability while providing probabilistic reliability guarantees

References
– Kenneth P. Birman et al., Bimodal Multicast.
– Kenneth P. Birman et al., Spinglass: Secure and Scalable Communication Tools for Mission-Critical Computing.
– Andrew S. Tanenbaum et al., Distributed Systems: Principles and Paradigms.