Persistent Linda 3.0
Peter Wyckoff, New York University

Roadmap
 Goals
 Model
 Simple example
 Transactions
 Tuning transactions
 Checkpointing
 Experiments
 Summary

Goals
 Utilize networks of workstations for parallel applications
 Low-cost fault tolerance mechanisms for parallel applications with state
 Provably good algorithms for minimizing fault tolerance overhead
 A robust system with "good" performance on real applications

Example: Matrix Multiplication

Sample program (C with Linda tuple-space operations; a "?" field is a
formal that gets bound when a matching tuple is consumed):

    void Multiply(float A[size][size], float B[size][size], float C[size][size]) {
        for (int i = 0; i < numHelpers; i++)
            create_helper("helper");
        output("B Matrix", B);               /* publish B once          */
        for (int i = 0; i < size; i++)
            output("Row of A", i, A[i]);     /* one work tuple per row  */
        for (int i = 0; i < size; i++)
            input("Row of C", i, ?C[i]);     /* collect the results     */
    }

    void helper() {
        float B[size][size];
        float A[size], C[size];
        int i;
        read("B Matrix", ?B);                /* non-destructive read    */
        while (!done) {
            input("Row of A", ?i, ?A);       /* grab one row of A       */
            mult(A, B, C);                   /* compute one row of C    */
            output("Row of C", i, C);
        }
    }

Fault Tolerance
 Problems:
   – Process resiliency
   – Process local consistency
   – Global consistency

Solution: Transactions
 Good points
   – Well-defined behavior
   – All or nothing
   – Retains global consistency
   – Durable
 Bad points
   – Does not address local consistency
   – Does not address process resiliency
   – Expensive

Example: Matrix Multiplication with Traditional Transactions

Sample program:

    void Multiply(float A[size][size], float B[size][size], float C[size][size]) {
        beginTransaction();
        for (int i = 0; i < numHelpers; i++)
            create_helper("helper");
        output("B Matrix", B);
        for (int i = 0; i < size; i++)
            output("Row of A", i, A[i]);
        endTransaction();

        beginTransaction();
        for (int i = 0; i < size; i++)
            input("Row of C", i, ?C[i]);
        endTransaction();
    }

    void helper() {
        float B[size][size];
        float A[size], C[size];
        int i;
        read("B Matrix", ?B);
        while (!done) {
            beginTransaction();              /* each task is atomic     */
            input("Row of A", ?i, ?A);
            mult(A, B, C);
            output("Row of C", i, C);
            endTransaction();
        }
    }

Continuations
 Encode the process state (its live variables) in an architecture-dependent encoding
 Save it to stable store at the end of each transaction
 Apply transactional semantics to this information, so it is updated only if the transaction commits (see the sketch below)
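As a concrete illustration of the all-or-nothing update, here is a minimal sketch, not PLinda's actual mechanism: the struct layout and file names are invented. The live variables are written to a scratch file and installed with an atomic rename, so the stored continuation only ever changes as a whole.

    #include <stdio.h>

    /* Hypothetical continuation: the live variables of Multiply()   */
    /* plus the number of the last committed transaction.            */
    typedef struct {
        int   tranNumber;
        float A[64][64], B[64][64], C[64][64];
    } Continuation;

    /* Write to a scratch file, then rename.  rename() is atomic on  */
    /* POSIX, so a crash leaves either the old or the new image,     */
    /* never a torn one.  A real system would also fsync() before    */
    /* the rename.                                                   */
    int save_continuation(const Continuation *c, const char *path) {
        char tmp[256];
        snprintf(tmp, sizeof tmp, "%s.tmp", path);
        FILE *f = fopen(tmp, "wb");
        if (!f) return -1;
        if (fwrite(c, sizeof *c, 1, f) != 1) { fclose(f); return -1; }
        if (fclose(f) != 0) return -1;
        return rename(tmp, path);            /* the commit point     */
    }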

Matrix Multiplication with Transactions and Continuations

Sample program (the master numbers its transactions; each endTransaction
also commits the live variables as a continuation, and recoverIfNeeded
restores them and skips already-committed transactions after a failure):

    void Multiply(float A[size][size], float B[size][size], float C[size][size]) {
        int tranNumber = 0;
        recoverIfNeeded(&tranNumber, A, B, C);

        beginTransaction(0);
        for (int i = 0; i < numHelpers; i++)
            create_helper("helper");
        output("B Matrix", B);
        for (int i = 0; i < size; i++)
            output("Row of A", i, A[i]);
        endTransaction(++tranNumber, A, B, C);   /* commits state too */

        beginTransaction(1);
        for (int i = 0; i < size; i++)
            input("Row of C", i, ?C[i]);
        endTransaction(++tranNumber, A, B, C);
    }

    void helper() {
        float B[size][size];
        float A[size], C[size];
        int i;
        read("B Matrix", ?B);
        while (!done) {
            beginTransaction();
            input("Row of A", ?i, ?A);
            mult(A, B, C);
            output("Row of C", i, C);
            endTransaction();
        }
    }

Runtime System with Durable Transactions

Cost of Transactions
 Do not address local consistency
 Do not address process resiliency
 Expensive:
   – Durability is achieved by committing to stable store
   – The continuations are sent to the server(s) over the network
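To put a rough number on why stable-store commits are the expensive part, here is a small self-contained timing sketch (POSIX; the file name is arbitrary and the record size is invented). It contrasts buffered writes with writes forced to stable storage; on ordinary disks the fsync()'d loop is orders of magnitude slower.

    #include <fcntl.h>
    #include <stdio.h>
    #include <time.h>
    #include <unistd.h>

    /* Time n small commit records, optionally forcing each one to    */
    /* stable store with fsync().                                     */
    static double commit_loop(const char *path, int n, int durable) {
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        char rec[128] = {0};
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < n; i++) {
            if (write(fd, rec, sizeof rec) < 0) break;
            if (durable) fsync(fd);          /* the expensive part    */
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);
        close(fd);
        return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    }

    int main(void) {
        printf("buffered commits: %.3f s\n", commit_loop("commit.log", 1000, 0));
        printf("durable commits:  %.3f s\n", commit_loop("commit.log", 1000, 1));
        return 0;
    }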

Lite Transactions
 Do not commit to stable store
 Transactions are still durable from the client's perspective
 Commit is to memory at the server, and therefore fast
 Periodically checkpoint the server
 On server failure (which is rare), roll back to the last checkpoint
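A toy, single-threaded sketch of the lite scheme; all names and sizes are invented. A commit is just a memory write at the server, so the client's acknowledgment is fast, and a background policy occasionally installs a whole-image checkpoint that the server falls back to if it crashes.

    #include <stdio.h>
    #include <string.h>
    #include <time.h>

    static char   space[1 << 16];   /* committed tuple data, in RAM   */
    static size_t used;
    static time_t last_ckpt;

    /* Commit touches only memory -- no disk I/O on the client's      */
    /* critical path.                                                 */
    void commit(const void *tuple, size_t len) {
        if (used + len > sizeof space) return;   /* sketch: no growth */
        memcpy(space + used, tuple, len);
        used += len;
    }

    /* Periodically flush the committed image to stable store.  After */
    /* a (rare) server crash, recovery reloads the last image and     */
    /* rolls back whatever was committed since.                       */
    void maybe_checkpoint(int interval_sec) {
        if (time(NULL) - last_ckpt < interval_sec) return;
        FILE *f = fopen("server.ckpt.tmp", "wb");
        if (!f) return;
        fwrite(space, 1, used, f);
        fclose(f);
        rename("server.ckpt.tmp", "server.ckpt");   /* atomic install */
        last_ckpt = time(NULL);
    }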

Continuation Committing Revisited (1)
 Does the continuation really need to be sent to the server at each and every transaction commit?
 What if we only send the continuations to the server every n minutes and checkpoint at the same time?
   – On process failure, roll back to the previous checkpoint
   – Very low fault tolerance overhead
   – A single failure leads to rollback

Continuation Committing Revisited (2)
 What if we replicate each process?
 Only send the continuation when checkpointing, or when the other replica fails, to create a replacement replica
   – Low-overhead fault tolerance
   – Can recover quickly from one failure
   – Massive rollback in the general failure case

Continuation Committing Mechanisms
 Commit consistent
 Periodic with replication
 Periodic with message logging and replay
 Periodic with undo log
 Periodic only
 Others?

Use the Mode that Best Suits Each Process
 Each mode trades failure-free overhead against recovery time differently
 Each mode is good for processes with different characteristics:
   – Commit consistent is great for processes with fairly small state running on unstable machines
   – Message logging and replay is good for large-state processes that don't communicate too much
 Can have the end user or programmer decide which mode an application should use
 Or use all modes at the same time, with algorithms that decide which mode to use for each transaction based on the process and machine characteristics (a sketch follows this list)
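One way such a per-process decision could look, as a hedged sketch: the thresholds and profile fields are invented for illustration, not PLinda's actual heuristic.

    #include <stddef.h>

    typedef enum {
        COMMIT_CONSISTENT,        /* ship the continuation at every commit */
        PERIODIC_REPLICATION,
        PERIODIC_LOG_REPLAY,
        PERIODIC_UNDO_LOG,
        PERIODIC_ONLY
    } CommitMode;

    typedef struct {
        size_t state_bytes;       /* size of the process's continuation   */
        double msgs_per_sec;      /* how chatty the process is            */
        double failures_per_hour; /* how unstable its machine is          */
    } ProcessProfile;

    /* Pick a committing mode from process and machine characteristics. */
    CommitMode choose_mode(const ProcessProfile *p) {
        if (p->state_bytes < 64 * 1024 && p->failures_per_hour > 0.1)
            return COMMIT_CONSISTENT;     /* small state on a flaky host  */
        if (p->state_bytes > 64 * 1024 * 1024 && p->msgs_per_sec < 10.0)
            return PERIODIC_LOG_REPLAY;   /* big state, little chatter    */
        return PERIODIC_ONLY;             /* cheap default                */
    }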

Runtime System with Different Committing Modes

Challenges
 Checkpointing: algorithms exist for each mechanism in isolation, but here
   – some processes are inconsistent
   – the checkpoint must not block clients
 Choosing the best mechanism for each process in an application at a particular time
 Keeping consistency is complicated when processes use different modes

Single Server Checkpointing
 Always keep the two latest checkpoints (a rotation sketch follows this list)
 Flush committed memory to stable store
 Flush consistent process continuations to stable store
 Request continuations from all inconsistent processes
 Continue servicing all requests from consistent processes
 Continue servicing all but commit requests from inconsistent processes
 Provably correct
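A minimal sketch of the "two latest checkpoints" rotation; the file names are invented, and a real server would also fsync and record which image is newest. The new image goes to a scratch file first, so a crash in mid-checkpoint still leaves intact older images to recover from.

    #include <stdio.h>

    int write_checkpoint(const void *image, size_t len) {
        FILE *f = fopen("ckpt.new", "wb");
        if (!f) return -1;
        if (fwrite(image, 1, len, f) != len) { fclose(f); return -1; }
        if (fclose(f) != 0) return -1;
        remove("ckpt.1");                 /* drop the oldest image        */
        rename("ckpt.0", "ckpt.1");       /* latest becomes second-latest */
        return rename("ckpt.new", "ckpt.0");
    }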

Experiments

Current and Future Work
 How to replicate the server(s) to provide availability
 Algorithms for minimizing fault tolerance overhead
 Predicting idle times
 Combining the flexibility of PLinda’s programming model with the ease of Calypso’s programming model