“Distributed Algorithms” by Nancy A. Lynch SHARED MEMORY vs NETWORKS Presented By: Sumit Sukhramani Kent State University.



Transformation from the Shared Memory Model to the Network Model  Correctness conditions satisfied by the transformations  Non-fault-tolerant strategies  Fault-tolerant strategies

Correctness Conditions The general problem is to design an asynchronous send/receive network system B, with processes P i, 1 <= i <= n, that is a simulation of A. The correctness conditions are: α and α′ are indistinguishable to U. For each i, a stop i occurs in α only if a stop i occurs in α′.

Strategies Assuming No Failures Classification of simple strategies, based on the number of copies of each shared variable:  Single-copy schemes  Multi-copy schemes

Single-Copy Schemes SimpleShVarSim algorithm. [Figure: architecture of SimpleShVarSim, showing queues Q1 and Q2 and registers R x,1, R x,2, R y,1, R y,2.]

Strategies Assuming No Failures (contd.) Issues: location of the shared variables, fault tolerance, busy waiting. Multiple-copy schemes: multi-writer registers, single-writer registers.  MajorityVotingObject algorithm

MajorityVotingObject algorithm Lemma 17.2: The MajorityVotingObject algorithm is a read/write atomic object.  The write operations obtain tags 1, 2, …, in the order of their serialization points.  Each read or embedded read obtains the largest tag that has been written by a write operation serialized before it, together with the accompanying value. Theorem 17.3: Suppose that A uses read/write shared variables. Then the MajorityVoting algorithm based on A is a 0-simulation of A.

Lack of Fault Tolerance in MajorityVotingObject The standard transaction implementations of an atomic object x are not fault tolerant. Example: a process performing a read transaction might send out messages asking for a majority of the copies to become locked, and then fail without releasing its locks, which would prevent any later write transaction from ever obtaining the locks it requires. Solution: use timeouts to detect process failures.

Algorithm Tolerating Process Failures Assumptions: a majority of the processes do not fail (n > 2f); the network is reliable; only single-writer/multi-reader read/write shared memory is considered. The implementation provides, for each read/write shared variable x, a read/write atomic object guaranteeing f-failure termination, with operations assumed to occur on specific ports. The central idea is that the result of each write is stored at a majority of the nodes in the network before the write completes.

ABDObject algorithm (Write) Each of the processes maintains a copy of x together with a tag. When the writer process wants to perform a write(v) on x, it lets t be the smallest tag that it has not yet assigned to any write. It then sets its local copy of x and its local tag to v and t, respectively, and sends (“write”, v, t) messages to all the other processes. A process receiving this message updates its copy of x and its tag (if the incoming tag is greater than the current tag), and then sends an acknowledgment to the writer. When the writer knows that a majority of the processes have tag values equal to t, it returns ack.

ABDObject algorithm (Read) When a process wants to read x, it sends “read” messages to all the other processes and also reads its own copy of x and its own tag. A process receiving this message responds with its latest value of x and its tag. When the reader has learned the x and tag values of a majority of the processes, it selects the value of x carrying the largest tag t, and updates its own copy of x and its tag accordingly. Before returning that value, it propagates the value and tag to the other processes; each process updates its copy of x and its tag and sends back an ack.
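The write and read protocols above can be sketched as follows. This is a minimal, sequential in-process simulation of the idea, not the asynchronous message-passing algorithm itself: class and method names are mine, and message delivery and acknowledgments are modeled by direct method calls on replica objects.

```python
class Replica:
    """One process's copy of x together with its tag."""
    def __init__(self):
        self.value, self.tag = None, 0

    def on_write(self, value, tag):
        # Update only if the incoming tag is newer, then acknowledge.
        if tag > self.tag:
            self.value, self.tag = value, tag
        return "ack"

    def on_read(self):
        return self.value, self.tag


class ABDRegister:
    """Single-writer/multi-reader register over n replicas, n > 2f."""
    def __init__(self, n):
        self.replicas = [Replica() for _ in range(n)]
        self.majority = n // 2 + 1
        self.next_tag = 0  # single writer: tags generated locally

    def write(self, value):
        # Send ("write", v, t) and return once a majority acknowledged.
        self.next_tag += 1
        acks = 0
        for r in self.replicas:
            if r.on_write(value, self.next_tag) == "ack":
                acks += 1
            if acks >= self.majority:
                return "ack"

    def read(self):
        # Collect (value, tag) from a majority and pick the largest tag.
        replies = [r.on_read() for r in self.replicas[: self.majority]]
        value, tag = max(replies, key=lambda vt: vt[1])
        # Write-back phase: propagate (v, t) so that later reads obtain
        # a tag at least as large before this read returns.
        for r in self.replicas:
            r.on_write(value, tag)
        return value
```

A usage sketch: `reg = ABDRegister(5); reg.write("v")` stores the value at a majority of the five replicas before returning, and a subsequent `reg.read()` is guaranteed to contact at least one replica that saw the write.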

Theorem 17.4 The ABDObject algorithm for n > 2f is a read/write atomic object guaranteeing f-failure termination. The algorithm is well formed and has f-failure termination because each operation requires responses from only a majority of the processes. Serialization can be shown from the following properties:  If a write π with tag t completes before a read φ is invoked, then φ obtains a tag that is at least as large as t.  If a read π completes before a read φ is invoked, then the tag obtained by φ is at least as great as that obtained by π.
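Both serialization properties rest on the fact that, when n > 2f, any two majorities of the n processes intersect, so a read's quorum always contains at least one process that participated in the preceding write's quorum. The following small check illustrates this for small n (illustrative only; it is not the book's proof):

```python
from itertools import combinations

def majorities(n):
    """All subsets of {0, ..., n-1} of size floor(n/2) + 1."""
    size = n // 2 + 1
    return [set(c) for c in combinations(range(n), size)]

# Every pair of majorities shares at least one process.
for n in range(1, 8):
    for q1 in majorities(n):
        for q2 in majorities(n):
            assert q1 & q2, "two majorities must intersect"
```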

Impossibility result for n/2 failures The ABD algorithm does not tolerate f failures if n <= 2f, because the failure of that many processes makes the remaining processes permanently unable to secure majorities. Theorem 17.6: Let n = m + p, where m, p >= 1, and suppose that n <= 2f. Then there is no algorithm in the asynchronous broadcast model that implements a read/write atomic object with m writers and p readers, guaranteeing f-failure termination. The proof is by contradiction. The theorem implies that, for any fixed n and f with n >= 2 and f >= n/2, there can be no general method for producing f-simulations of n-process shared memory algorithms, even if the underlying shared variables are restricted to be single-writer/single-reader registers.

Transformations from the Network to the Shared Memory Model  In this direction there is no special requirement on the number of failures; the transformation works even if n <= 2f, and the constructions are simpler.  The reason is that the asynchronous shared memory model is more powerful than the asynchronous network model, because of the availability of reliable shared memory.

Send/Receive Systems (single-writer/single-reader registers) The general problem is to produce a shared memory system B with n processes, using single-writer/single-reader registers, that simulates A. SimpleSRSim algorithm  B includes a single-writer/single-reader shared variable x(i,j), writable by process i and readable by process j, containing a queue of messages, initially empty. Process i only adds messages to the queue; no removals occur.  From time to time, process i checks all of its incoming variables x(j,i) to see whether any new messages have been placed there.  Finally, process i handles the messages as P i does.
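The queue variable x(i,j) can be sketched as follows. This is a minimal sketch under simplifying assumptions: the register is modeled as an append-only Python list, and the reader's cursor, which in the algorithm is local state of process j, is kept alongside the queue for brevity; all names are mine.

```python
class SRChannel:
    """Models x(i,j): a queue writable only by process i, readable only by j."""
    def __init__(self):
        self._queue = []   # i only appends; messages are never removed
        self._read = 0     # j's private count of messages already handled

    def send(self, m):
        # Process i's step: write a longer queue into x(i,j).
        self._queue.append(m)

    def poll(self):
        # Process j's step: pick up any messages placed since the last check.
        new = self._queue[self._read:]
        self._read = len(self._queue)
        return new
```

A full SimpleSRSim instance would hold one such channel for every ordered pair (i, j), with each process periodically polling its n-1 incoming channels.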

Broadcast Systems (single-writer/multi-reader registers)  SimpleBcastSim algorithm  B includes a single-writer/multi-reader shared variable x(i), writable by process i and readable by all processes, containing a queue of messages, initially empty.  To broadcast m, process i adds the message m to the end of the queue in variable x(i). From time to time, process i checks all the variables x(j), including x(i), to see whether any new messages have been placed there.  Finally, process i handles the messages as P i does.
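The broadcast variant can be sketched the same way. Again this is a minimal in-process sketch with names of my own choosing: each x(i) is an append-only list, and each reader j keeps a cursor per register recording how many of its messages it has already handled (in the algorithm, that cursor is local state of j).

```python
class BcastSim:
    """n processes; x[i] is a single-writer/multi-reader message queue."""
    def __init__(self, n):
        self.x = [[] for _ in range(n)]          # x[i]: writable by i only
        self.seen = [[0] * n for _ in range(n)]  # seen[j][i]: j's cursor into x[i]

    def bcast(self, i, m):
        # Process i broadcasts m by appending it to its own register.
        self.x[i].append(m)

    def poll(self, j):
        # Process j scans every x[i] (including its own) for new messages.
        new = []
        for i, q in enumerate(self.x):
            new.extend(q[self.seen[j][i]:])
            self.seen[j][i] = len(q)
        return new
```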