1 Message Logging Pessimistic & Optimistic CS717 Lecture 10/16/01-10/18/01 Kamen Yotov

Presentation transcript:

1 Message Logging Pessimistic & Optimistic CS717 Lecture 10/16/01-10/18/01 Kamen Yotov

2 Introduction Context & Applications Check-pointing Message Logging Pessimistic (failure-free mode suffers) Optimistic (good for failure-free mode) Causal (to be discussed in next lectures...) Main problems Consistency Orphans

3 Fault Tolerance “Why”s Flow of events Check-point Log messages Crash Restore Replay
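The flow of events on this slide (check-point, log messages, crash, restore, replay) can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the lecture: the class name `LoggedProcess`, the toy state (a running sum of delivered integers), and the in-memory fields standing in for stable storage are all hypothetical.

```python
class LoggedProcess:
    """Toy deterministic process: its state is the sum of delivered integers."""

    def __init__(self):
        self.state = 0
        self.checkpoint = 0   # stands in for a check-point on stable storage
        self.log = []         # messages delivered since the last check-point

    def deliver(self, msg):
        self.log.append(msg)  # log the message, then apply it
        self.state += msg

    def take_checkpoint(self):
        self.checkpoint = self.state
        self.log.clear()      # messages before the check-point are no longer needed

    def crash(self):
        self.state = None     # volatile state is lost

    def recover(self):
        self.state = self.checkpoint   # restore
        for msg in self.log:           # replay in the original delivery order
            self.state += msg


p = LoggedProcess()
p.deliver(3)
p.take_checkpoint()
p.deliver(4)
p.deliver(5)
state_before_crash = p.state
p.crash()
p.recover()
state_after_recovery = p.state
```

Replay reproduces the pre-crash state only because the toy process is deterministic, which is the same assumption message logging relies on.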

4 Common Assumptions Fail-stop model Failure eventually detectable by all Channels Asynchronous Reliable FIFO Unbounded message delivery Failures Transiently dropping No duplication and/or corruption Stable storage Spare processing capacity

5 Common goals Application independence Application transparency Simple Independent evolution Handles preexisting programs High throughput Failure-free model with little overhead Maximum fault-tolerance Any number of failures

6 Formal Terminology Delivery (as opposed to receipt) Non-faulty processes eventually deliver all messages that they have received Receive sequence number If p delivers m and m.rsn = l, then m is the l-th message that p delivers Run Sequence of system states Asynchronous Only one process changes state at once

7 Formal Terminology (cont.) Properties: Logical expressions over runs; □ – always, ◊ – eventually. Message determinant #m (m.data and m.dest not essential). Logging determinants vs. actual messages. Other notation: N – set of all processes; C – set of failed processes; Log(m) – set of processes possessing a copy of #m; Depend(m) – set of processes that depend on m

8 Orphan Properties Before failure, by definition #m  Log(m). #m is lost if $\mathrm{Log}(m) \subseteq C$. stable(m) if #m cannot be lost. p is an orphan of C if: p did not fail, $p \in \mathrm{Depend}(m)$, and #m is lost
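The orphan definition can be checked mechanically with sets. The sketch below assumes that "#m is lost" means every process in Log(m) has failed, and that processes are represented as plain strings; the function names `is_lost` and `orphans` are hypothetical.

```python
def is_lost(log_m, failed):
    """#m is lost when every process holding a copy of it has failed."""
    return log_m <= failed            # Log(m) is a subset of C


def orphans(all_procs, failed, log_m, depend_m):
    """Surviving processes that depend on m although #m is lost."""
    if not is_lost(log_m, failed):
        return set()
    return (all_procs - failed) & depend_m


N = {"p", "q", "r"}
C = {"q"}

# Only q logged #m, and q failed: r survives but depends on m, so r is an orphan.
orphans_when_lost = orphans(N, C, log_m={"q"}, depend_m={"q", "r"})

# p also holds a copy of #m, so the determinant survives and nobody is an orphan.
orphans_when_logged = orphans(N, C, log_m={"q", "p"}, depend_m={"q", "r"})
```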

9 Orphan Properties (cont.)

10 Performance Metrics Number of forced roll-backs Time spent blocking Number of messages Size of messages

11 Got to the real-world stuff! No additional messages Any number of failures (including total) No assumptions about the logging protocol Pessimistic doesn’t require that generality

12 The Model Process states State interval Instantiates a new one on each message received State interval index (auto increment) [Figure: timelines of processes p1, p2, p3 divided into labeled state intervals]

13 The Model Process states (cont.) [Figure: timelines of processes p1, p2, p3 divided into labeled state intervals] Dependencies between process states ($p_i$ depends on $p_j$): maximum index of any interval of $p_j$ on which $p_i$ depends. Inside a process each interval depends on the previous one. Dependence vector $d_i = \langle \delta_* \rangle = \langle \delta_1, \ldots, \delta_n \rangle$, $\delta_k \in \{\bot, 0, 1, \ldots\}$
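One way to maintain such a dependence vector is sketched below. The slide does not say how the vector is updated, so the scheme here is an assumption: each message carries the sender's current state interval index, and the receiver starts a new interval and records the maximum index seen per sender. The class `Proc`, the tuple-shaped tag, and the use of -1 for "no dependency" are all hypothetical choices.

```python
BOTTOM = -1  # assumption: -1 stands in for "no dependency", below every real index


class Proc:
    def __init__(self, pid, n):
        self.pid = pid
        self.dep = [BOTTOM] * n   # dependence vector, one entry per process
        self.dep[pid] = 0         # own entry: current state interval index

    def send(self):
        # Tag the message with the sender id and its current interval index.
        return (self.pid, self.dep[self.pid])

    def receive(self, tag):
        sender, index = tag
        self.dep[self.pid] += 1   # every received message starts a new interval
        self.dep[sender] = max(self.dep[sender], index)


p0, p1 = Proc(0, 2), Proc(1, 2)
p1.receive(p0.send())   # p1 enters interval 1, depends on interval 0 of p0
p0.receive(p1.send())   # p0 enters interval 1, depends on interval 1 of p1
p1.receive(p0.send())   # p1 enters interval 2, depends on interval 1 of p0
```

Only direct dependencies are recorded in this sketch; transitive ones are left to whatever later combines the vectors into a system state.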

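14 The Model System states Process state – dependence vector $d_i = \langle \delta_* \rangle = \langle \delta_1, \ldots, \delta_n \rangle$, $\delta_k \in \{\bot, 0, 1, \ldots\}$. System state – dependence matrix $n \times n$. Row i – process state for $p_i$. Diagonal – current state intervals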

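15 The Model System states (cont.) S – set of all system states. $A = [\alpha_{**}] \in S$ and $B = [\beta_{**}] \in S$: $A \preceq B \iff \forall i = 1..n: \alpha_{ii} \le \beta_{ii}$. Partial order different than Lamport's: Orders system states vs. events; Only events are state intervals. Lattice: $A \sqcup B = [\varphi_{**}]$, $\varphi_{ik} = (\alpha_{ii} \ge \beta_{ii}\ ?\ \alpha_{ik} : \beta_{ik})$; $A \sqcap B = [\varphi_{**}]$, $\varphi_{ik} = (\alpha_{ii} \le \beta_{ii}\ ?\ \alpha_{ik} : \beta_{ik})$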
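The ordering and the two lattice operations on dependence matrices translate directly into code. The sketch assumes a system state is a list of rows (one dependence vector per process) and that -1 encodes "no dependency"; the function names `leq`, `join`, and `meet` are hypothetical.

```python
BOTTOM = -1  # assumption: -1 stands in for "no dependency"


def leq(A, B):
    """A precedes B when every diagonal entry of A is at most that of B."""
    return all(A[i][i] <= B[i][i] for i in range(len(A)))


def join(A, B):
    """Row i comes from whichever state has the larger diagonal entry."""
    return [list(A[i]) if A[i][i] >= B[i][i] else list(B[i]) for i in range(len(A))]


def meet(A, B):
    """Row i comes from whichever state has the smaller diagonal entry."""
    return [list(A[i]) if A[i][i] <= B[i][i] else list(B[i]) for i in range(len(A))]


A = [[2, 0], [1, 1]]
B = [[1, BOTTOM], [1, 3]]
J = join(A, B)
M = meet(A, B)
```

Here `A` and `B` are incomparable (neither diagonal dominates the other), yet both lie between `M` and `J`, which is what makes the set of system states a lattice.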

16 The Model Consistent System states Consistent state: all received messages were either sent in the current state of the sender or can be deterministically sent in the future. Messages not yet received are not a problem. Definition: $D = [\delta_{**}] \in S$ is consistent iff $\forall i, k = 1..n: \delta_{ik} \le \delta_{kk}$. A process cannot depend on a state interval of another process that has not been reached yet. $C = \{ D \in S \mid D \text{ is consistent} \}$. C is a sub-lattice of S – proof straightforward!
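The consistency definition is a single comparison of every matrix entry against the diagonal entry of its column. A minimal sketch, assuming the matrix is a list of rows and -1 encodes "no dependency" (the function name `is_consistent` is hypothetical):

```python
BOTTOM = -1  # assumption: -1 stands in for "no dependency"


def is_consistent(D):
    """No process depends on an interval that another process has not reached."""
    n = len(D)
    return all(D[i][k] <= D[k][k] for i in range(n) for k in range(n))


# Process 1 is in interval 2 and depends on interval 1 of process 0,
# which process 0 has reached.
consistent_state = [[1, 0], [1, 2]]

# Process 0 depends on interval 3 of process 1, but process 1 is only in interval 2.
inconsistent_state = [[1, 3], [BOTTOM, 2]]
```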

17 The Model Logging and Stability logged(i, σ): the message that started state interval σ of process i has been logged on stable storage. checkpoint(i, σ): there exists a check-point on stable storage that contains the state of process i in state interval σ. checkpoint(i, 0) is implicit. The effective check-point for σ on i is checkpoint(i, ε) with ε ≤ σ, ε maximal. $\mathrm{stable}(i, \sigma) \iff \forall \alpha: \epsilon < \alpha \le \sigma \Rightarrow \mathrm{logged}(i, \alpha)$
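The stability predicate for a single process can be sketched as follows. The representation is an assumption: `logged` and `checkpoints` are sets of state interval indices for one process, and the implicit check-point of interval 0 is added inside the helper. The function names are hypothetical.

```python
def effective_checkpoint(checkpoints, sigma):
    """Largest check-pointed interval index not above sigma (0 is implicit)."""
    return max(c for c in checkpoints | {0} if c <= sigma)


def stable(logged, checkpoints, sigma):
    """Every interval after the effective check-point, up to sigma, is logged."""
    eps = effective_checkpoint(checkpoints, sigma)
    return all(alpha in logged for alpha in range(eps + 1, sigma + 1))


logged = {1, 2, 4}      # intervals whose starting message is on stable storage
checkpoints = {3}       # intervals with a check-point on stable storage

results = [stable(logged, checkpoints, s) for s in range(6)]
```

Interval 3 is stable even though its starting message was never logged, because the check-point covers it; interval 5 is not, because its starting message is missing from the log.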

18 The Model Recoverable System states Recoverable system state: the system state is consistent and all current process states are stable. $D = [\delta_{**}] \in S$: $\mathrm{recoverable}(D) \iff D \in C \land \forall i: \mathrm{stable}(i, \delta_{ii})$. $R = \{ D \in S \mid \mathrm{recoverable}(D) \}$. R is a sub-lattice of S – proof straightforward! Theorem: A single maximum recoverable state exists! Proof: $R \subseteq S$; $A \sqcup B \in R$ if $A, B \in R$; $A, B \preceq A \sqcup B$. Therefore the maximum is $\bigsqcup_{D \in R} D$, obviously unique!
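Recoverability combines the two conditions on this slide: consistency of the matrix and stability of every diagonal entry. The sketch assumes stability is supplied as a set of `(process, interval)` pairs and that -1 encodes "no dependency"; the function name `recoverable` and the sample states are hypothetical.

```python
BOTTOM = -1  # assumption: -1 stands in for "no dependency"


def recoverable(D, stable_intervals):
    """Consistent, and the current interval of every process is stable."""
    n = len(D)
    consistent = all(D[i][k] <= D[k][k] for i in range(n) for k in range(n))
    all_stable = all((i, D[i][i]) in stable_intervals for i in range(n))
    return consistent and all_stable


stable_intervals = {(0, 0), (0, 1), (1, 0)}

D_ok = [[1, 0], [BOTTOM, 0]]             # consistent, both diagonal intervals stable
D_unstable = [[1, 0], [1, 1]]            # consistent, but (1, 1) is not stable
D_inconsistent = [[1, 1], [BOTTOM, 0]]   # process 0 depends on an unreached interval
```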

19 The Model Recoverable System states (cont.) Current recovery state: the maximum recoverable state at any time. Never decreases: with $D = [\delta_{**}]$, no state interval σ of a process i with $\sigma \le \delta_{ii}$ is ever rolled back. Proof: D will always remain consistent; $\delta_{ii}$ will always remain stable. Since R is a lattice, any new state formed after D will be greater than D. In any new current recovery state, $\delta_{ii} \le$ the state interval index of each process i. Therefore, no state interval $\sigma \le \delta_{ii}$ of any process i will ever need to be rolled back!

20 The Model Wrapup… Corollary 1: If all messages received are eventually logged, no domino effect occurs. If $D = [\delta_{**}]$ is the current recovery state: Corollary 2: Any messages sent by process i from a state interval $\sigma \le \delta_{ii}$ may be committed. With $\epsilon_i$ being the effective checkpoint of $\delta_{ii}$: Corollary 3: All previous checkpoints of process i may be discarded. Corollary 4: All messages that begin state intervals prior to $\epsilon_i$ may be discarded
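Corollaries 3 and 4 amount to a garbage-collection rule for one process. The sketch below is an illustration under assumptions: check-points and logged messages are sets of state interval indices, `recovery_index` is that process's diagonal entry in the current recovery state, and the function name `garbage_collect` is hypothetical. It reads "prior to" strictly, so the message that begins the effective check-point's own interval is kept.

```python
def garbage_collect(checkpoints, logged, recovery_index):
    """Return the check-points and logged intervals that must still be kept."""
    # Effective check-point of the recovery index (interval 0 is implicit).
    eps = max(c for c in checkpoints | {0} if c <= recovery_index)
    kept_checkpoints = {c for c in checkpoints if c >= eps}   # Corollary 3
    kept_log = {a for a in logged if a >= eps}                # Corollary 4
    return eps, kept_checkpoints, kept_log


eps, kept_checkpoints, kept_log = garbage_collect(
    checkpoints={2, 5, 8},
    logged=set(range(1, 10)),
    recovery_index=6,
)
```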

21 The Algorithm Overview Keep a current recovery state. On each new interval σ of some process k becoming stable, try to improve the current recovery state such that: the state of process k advances to σ; more state intervals from other processes are added to maintain consistency. Succeed if all such included intervals are stable

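22 The Algorithm Basic implementation
Notation: $D = [\delta_{**}]$ – the current recovery state; $\sigma$ – state interval of process k becoming stable; $d_k = \langle \delta_* \rangle = \langle \delta_1, \ldots, \delta_n \rangle$, $\delta_j \in \{\bot, 0, 1, \ldots\}$ – state of process k (dependence vector)
Algorithm:
if ($\sigma > \delta_{kk}$) {
    $\forall i: \delta_{ki} \leftarrow \delta_i$ // update row of D
    while ($\exists i, j: \delta_{ij} > \delta_{jj}$)
        if ($\exists \alpha \ge \delta_{ij}: \mathrm{stable}(\alpha)$) // α - an interval for j
            $\forall i: \delta_{ji} \leftarrow \alpha_i$ // update row of D with $d_j$ for α
        else fail
}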
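A runnable sketch of the loop on this slide is given below. It is not the implementation from the paper; it assumes that `dep[j][a]` holds the dependence vector of state interval `a` of process `j`, that `is_stable(j, a)` is a caller-supplied predicate, that -1 encodes "no dependency", and that the smallest stable candidate is tried for each violation. The function name `advance` is hypothetical.

```python
BOTTOM = -1  # assumption: -1 stands in for "no dependency"


def advance(D, k, sigma, dep, is_stable):
    """Try to move process k to interval sigma; return the new state or None."""
    if sigma <= D[k][k]:
        return D                          # nothing to improve
    n = len(D)
    new = [row[:] for row in D]
    new[k] = list(dep[k][sigma])          # row k becomes the vector of sigma
    while True:
        # Look for a dependency on an interval that process j has not reached.
        bad = next(((i, j) for i in range(n) for j in range(n)
                    if new[i][j] > new[j][j]), None)
        if bad is None:
            return new                    # consistent again
        i, j = bad
        # Smallest stable interval of process j that satisfies the dependency.
        alpha = next((a for a in range(new[i][j], len(dep[j]))
                      if is_stable(j, a)), None)
        if alpha is None:
            return None                   # fail: the old recovery state stands
        new[j] = list(dep[j][alpha])      # update row j with the vector of alpha


# Two processes; interval 1 of process 0 depends on interval 1 of process 1.
dep = [
    [[0, BOTTOM], [1, 1]],
    [[BOTTOM, 0], [0, 1]],
]
D0 = [[0, BOTTOM], [BOTTOM, 0]]

stable_now = {(0, 0), (1, 0), (0, 1)}
blocked = advance(D0, 0, 1, dep, lambda j, a: (j, a) in stable_now)

stable_now.add((1, 1))
advanced = advance(D0, 0, 1, dep, lambda j, a: (j, a) in stable_now)
```

The loop terminates because each repair strictly raises one diagonal entry, and the entries are bounded by the number of known intervals per process.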

23 The Algorithm Some details The chosen α should be the minimum stable state interval with $\alpha \ge \delta_{ij}$. The comparisons $\delta_{ij} > \delta_{jj}$ can be made in any order without affecting the final result. When state interval σ of process k becomes stable, the algorithm finds some recoverable D with $\delta_{kk} = \sigma$. No stable process state interval σ that was not suitable should be checked again before advancing the current recovery state. Corollary: When the recovery state advances from some D to D', the rejected σ's above that need to be rechecked are those with a direct dependency on some α on any process i such that $\delta_{ii} < \alpha < \delta'_{ii}$

24 The Algorithm Proof of Correctness The algorithm presented always finds the current recovery state of the system Only finds recoverable system states Any such state found is greater Following the observations stated before, all possible new states are considered Therefore, the correct one is always found!

25 The Algorithm Optimizations & Implementation Optimization considerations Keeping work list of rows to update D Keep only the one with max index Keeping only the diagonal of D Implementation Provided in the paper Follows everything said till now Takes advantage of some specifics

26 Conclusions General Model and Algorithm Work for both pessimistic and optimistic protocols Does not need the generality for the pessimistic case Optimistic logging is desirable from performance standpoint in low failure environments Unifies existing approaches to fault tolerance Check-pointing Message Logging Results Existence of unique maximum recoverable state Never decreases (progress is being made) Domino effect cannot occur

27 Future work list… Address non-determinism Switch between check-pointing for the non-deterministic part Check-pointing + message logging elsewhere Output-driven optimistic message logging and check-pointing Pay attention to communication of the results Application specific knowledge