Optimistic Virtual Synchrony
Jeremy Sussman, IBM T.J. Watson
Idit Keidar, MIT LCS
Keith Marzullo, UCSD CS Dept.


Slide 2: Overview of Talk
- Group Communication Services (GCS)
  - Group Membership and Reliable Group Multicast, and why some properties force processes to block
- Optimistic Virtual Synchrony (OVS)
  - Concept
  - Evaluation
- Related Work
- Conclusions

Slide 3: Group Communication Systems (1)
- Group Membership
  - Processes are organized into groups
  - Particular memberships are stamped as views
    - In a partitionable system, views can be concurrent
    - Views provide a form of Concurrent Common Knowledge about the system
[Figure: timeline of processes p1, p2, p3 showing views V1 = {p1, p2, p3}, V2 = {p1, p2}, V3 = {p3}, V5 = {p1, p2, p3}]

Slide 4: Group Communication Systems (2)
- Reliable Multicast
  - Messages are sent to the group
  - Same View Delivery
    - If p1 delivers m while in v1 and p2 delivers m, then p2 delivers m in v1
  - Sending View Delivery
    - If p1 sends a message m while in v1, then m is delivered in v1
[Figure: timeline of processes p1, p2, p3 showing views V1 = {p1, p2, p3}, V2 = {p1, p2}]
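The two delivery properties on this slide can be read as checks over a recorded trace. The following is a minimal sketch, not part of the talk: the trace format (a list of `(process, message, view)` delivery records plus a map from each message to the view it was sent in) and both function names are assumptions made for illustration.

```python
def same_view_delivery_holds(deliveries):
    """Every process that delivers a message delivers it in the same view.

    deliveries: list of (process, message, view_id) tuples (assumed format).
    """
    view_of = {}
    for _process, msg, view in deliveries:
        # Remember the first view a message was delivered in; any later
        # delivery of that message in a different view is a violation.
        if view_of.setdefault(msg, view) != view:
            return False
    return True


def sending_view_delivery_holds(sends, deliveries):
    """Every message is delivered in the view in which it was sent.

    sends: dict mapping message -> view_id it was sent in (assumed format).
    Assumes every delivered message appears in `sends`.
    """
    return all(sends[msg] == view for _process, msg, view in deliveries)


# Hypothetical traces: m is sent in V1; in the second trace p2 delivers it in V2.
sends = {"m": "V1"}
trace_ok = [("p1", "m", "V1"), ("p2", "m", "V1")]
trace_bad = [("p1", "m", "V1"), ("p2", "m", "V2")]
```

On the second trace both checks fail: p1 and p2 disagree on the view, and p2 delivers m outside the view it was sent in.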

Slide 5: Why Properties Imply Blocking
- Assume that message delivery takes d
- Let processes send a message every t < d
- When can a new view be installed...
  - without violating Sending View Delivery?
  - without dropping messages?
[Figure: timeline of processes p1, p2, p3 in view V1 = {p1, p2, p3}]
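The question on this slide can be made concrete with a little arithmetic. The sketch below assumes a single sender that sends at times 0, t, 2t, ... and that every message is delivered exactly d after it is sent; the function name and the integer time units are illustrative assumptions. It lists the messages that are still in flight at time T, which is what a view change at T would have to either deliver in the wrong view or drop.

```python
def in_flight(T, t, d):
    """Send times of messages sent at 0, t, 2t, ... that are sent at or
    before time T but not yet delivered at T (delivery happens at send + d)."""
    pending = []
    k = 0
    while k * t <= T:
        if k * t + d > T:
            pending.append(k * t)
        k += 1
    return pending


# With t = 3 < d = 10 there is never a quiet moment for a view change.
always_busy = all(in_flight(T, 3, 10) for T in range(0, 100))

# With t = 10 > d = 3 the channel is empty at, for example, T = 5.
quiet_moment = in_flight(5, 10, 3)
```

Under these assumptions, whenever t < d the most recently sent message is always still undelivered, so a new view can only be installed safely if senders stop sending for a while, which is the blocking the slide refers to.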

Slide 6: Observations
- Many problems are caused by stale views
  - Processes block when a view is stale
  - A sending process cannot know whether the view will be stale before a message is delivered
  - For many applications, a message delivered in a stale view is useless
- Many applications do not require the exact semantics of Sending View Delivery
  - State transfer is not required for splitting
  - Leader election is not required for joins

Slide 7: OVS: The Idea
- Inform processes of stale views
  - Give an "educated guess" of the subsequent view (called the optimistic view)
- Allow processes to send optimistic messages that will be delivered in the subsequent view
  - But deliver or drop each message based on a predicate that the application provides

Slide 8: OVS Implementation
- Sender side:
  - All messages sent in a view that has become stale are sent optimistically
    - Enhanced by a MessageCondition predicate
- Receiver side:
  - Store in a queue all optimistic messages received before a view change
  - On a new view, deliver all optimistic messages for which the MessageCondition is true
    - If more optimistic messages are received, treat them as if they were at the end of the queue
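The receiver-side steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the Transis-based implementation from the talk: the class and method names are hypothetical, each optimistic message is assumed to carry the view it was sent in, and a message from any other view is simply dropped.

```python
from collections import deque


class OptimisticReceiver:
    """Hypothetical receiver that queues optimistic messages until a view change."""

    def __init__(self, view):
        self.view = frozenset(view)
        self.last_change = None  # (previous, subsequent, optimistic) views
        self.queue = deque()     # optimistic messages awaiting the next view
        self.delivered = []      # (payload, view it was delivered in)
        self.dropped = []

    def _evaluate(self, payload, condition):
        previous, subsequent, optimistic = self.last_change
        if condition(previous, subsequent, optimistic):
            self.delivered.append((payload, subsequent))
        else:
            self.dropped.append(payload)

    def receive_optimistic(self, payload, sent_in, condition):
        sent_in = frozenset(sent_in)
        if sent_in == self.view:
            # Received before the view change: hold it in the queue.
            self.queue.append((payload, condition))
        elif self.last_change is not None and sent_in == self.last_change[0]:
            # Late arrival from the previous view: the queue is already
            # drained, so evaluating now puts it "at the end of the queue".
            self._evaluate(payload, condition)
        else:
            self.dropped.append(payload)  # assumption: unknown view, drop

    def install_view(self, subsequent, optimistic):
        self.last_change = (self.view, frozenset(subsequent), frozenset(optimistic))
        self.view = frozenset(subsequent)
        while self.queue:
            self._evaluate(*self.queue.popleft())


V1 = {"p1", "p2", "p3"}
r = OptimisticReceiver(V1)
r.receive_optimistic("a", V1, lambda prev, sub, opt: True)
r.receive_optimistic("b", V1, lambda prev, sub, opt: sub == opt)
r.install_view({"p1", "p2"}, optimistic=V1)                 # the guess was wrong
r.receive_optimistic("c", V1, lambda prev, sub, opt: True)  # late arrival
```

In this run "a" and "c" are delivered in the new view in that order, and "b" is dropped because its condition required the optimistic view to match the view actually installed.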

Slide 9: Message Conditions
- Separation of mechanism and policy
  - Provided by the application
  - Expressed as a predicate over the previous view, subsequent view, and optimistic view
- Examples:
  - Leader election (leader in subsequent_view)
  - Need for state transfer (previous_view subset subsequent_view)
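The two example predicates can be written out as ordinary functions over the three views. This is a sketch only: representing views as Python sets and the function names are assumptions, and the second function encodes just the set relation shown on the slide without deciding how an application would act on it.

```python
def leader_in_subsequent(leader):
    """Build a condition that is true when `leader` is still a member of
    the subsequent view (the slide's leader-election example)."""
    def condition(previous_view, subsequent_view, optimistic_view):
        return leader in subsequent_view
    return condition


def previous_subset_of_subsequent(previous_view, subsequent_view, optimistic_view):
    """The set relation from the slide's state-transfer example:
    previous_view is a subset of subsequent_view."""
    return previous_view <= subsequent_view


# Hypothetical views taken from the membership figure earlier in the talk.
V1 = {"p1", "p2", "p3"}
V2 = {"p1", "p2"}

keeps_leader = leader_in_subsequent("p1")(V1, V2, V1)
loses_leader = leader_in_subsequent("p3")(V1, V2, V1)
split_case = previous_subset_of_subsequent(V1, V2, V1)
merge_case = previous_subset_of_subsequent(V2, V1, V1)
```

Both predicates take all three views even when they only inspect some of them, so the delivery mechanism can call any application-supplied condition the same way.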

Slide 10: Evaluation of OVS
- Implementation of OVS on top of Transis
  - Comparison to a blocking system and to one that does not provide Sending View Delivery
  - Measurement of the overhead of OVS
- Examination of applications that can benefit from the OVS semantics
  - (see proceedings)

Slide 11: Message Lifecycle
Message transmission time:
  Pre-send       ~90 microseconds
  Wire time      ~1000 microseconds
  Pre-delivery   ~40 microseconds
  Total          ~1130 microseconds
[Figure: timeline of p1, server, and p2 marking the pre-send and pre-delivery stages]

Slide 12: OVS Overhead
[Figure: pre-delivery processing time in microseconds, sender side and receiver side, for regular messages and for optimistic messages with retransmission]

Slide 13: OVS Performance Benefits
[Figure: average time to deliver messages after a view change; axes: time in microseconds, messages delivered]

Slide 14: Related Work
- Optimistic Atomic Broadcast [Pedone, Schiper]
  + Uses optimism for total order
  + Complementary to our approach
- Non-blocking light-weight groups [Amir et al.; Dolev, Malki]
  + Scales well
  + Provides fast view delivery
  - Does not provide the Sending View Delivery property
  - Often allows messages to span more than one view (problematic for state transfer and other applications)
- Weak Virtual Synchrony [Friedman, van Renesse]
  + Eliminates blocking
  + Optimized for membership translation
  - Does not provide the same level of policy/mechanism split
  - May require extra views to be delivered by the system

Slide 15: Conclusions
- Optimistic Virtual Synchrony
  - Uses a very simple form of optimism
    - Receiving processes never need to roll back from optimistic messages
    - The sending process is informed of dropped messages
  - Provides applications with useful properties
    - Policy/mechanism split on delivery semantics
      - Ranges from Sending View Delivery to all messages being delivered in the subsequent view
  - Has low overhead

Slide 16: Common Properties of GCSs
- Virtual Synchrony
  - If p1 and p2 are both in v1 and v2, and p1 delivered a message m in v1, then p2 delivered m in v1
- Guaranteed Self-Delivery
  - If p1 sends a message m, then p1 will eventually deliver m or crash
- Sending View Delivery
  - If p1 sends a message m while in v1, then m is delivered in v1
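The Virtual Synchrony property can also be checked against a recorded trace. The sketch below makes two assumptions beyond the slide: v2 is taken to be the view installed directly after v1, and the trace format (per-process view sequences plus `(process, message, view)` delivery records) and function names are hypothetical.

```python
from itertools import combinations


def delivered_in(deliveries, process, view):
    """Set of messages that `process` delivered while in `view`."""
    return {m for p, m, v in deliveries if p == process and v == view}


def virtual_synchrony_holds(view_sequences, deliveries):
    """Processes that move together from v1 to v2 must have delivered the
    same set of messages in v1.

    view_sequences: dict process -> list of view ids in installation order.
    deliveries: list of (process, message, view_id) tuples.
    """
    for p, q in combinations(view_sequences, 2):
        p_steps = set(zip(view_sequences[p], view_sequences[p][1:]))
        q_steps = set(zip(view_sequences[q], view_sequences[q][1:]))
        for v1, _v2 in p_steps & q_steps:
            if delivered_in(deliveries, p, v1) != delivered_in(deliveries, q, v1):
                return False
    return True


# Hypothetical trace: p1 and p2 move from V1 to V2 together, p3 goes to V3.
views = {"p1": ["V1", "V2"], "p2": ["V1", "V2"], "p3": ["V1", "V3"]}
agree = [("p1", "m", "V1"), ("p2", "m", "V1")]
disagree = [("p1", "m", "V1")]
```

In the first trace p3 never delivers m, which is allowed because p3 does not continue into V2 with the others; in the second trace p2 misses m even though it moves to V2 with p1, so the check fails.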