Distributed Systems CS 15-440
Distributed File Systems – Part II
Lecture 21, Dec 3, 2012
Majd F. Sakr and Mohammad Hammoud

Today…
Last session: Distributed File Systems – Part I
Today's session: Distributed File Systems – Part II
Announcement: PS3 grades are out.

Discussion on Distributed File Systems
  Distributed File Systems (DFSs) Basics
  DFS Aspects

DFS Aspects

Aspect                       Description
Architecture                 How are DFSs generally organized?
Processes                    Who are the cooperating processes? Are processes stateful or stateless?
Communication                What is the typical communication paradigm followed by DFSs? How do processes in DFSs communicate?
Naming                       How is naming often handled in DFSs?
Synchronization              What are the file sharing semantics adopted by DFSs?
Consistency and Replication  What are the various features of client-side caching as well as server-side replication?
Fault Tolerance              How is fault tolerance handled in DFSs?

Naming In NFS
The fundamental idea underlying the NFS naming model is to provide clients with complete transparency.
Transparency in NFS is achieved by allowing a client to mount a remote file system into its own local file system.
However, instead of mounting an entire file system, NFS allows clients to mount only part of a file system.
A server is said to export a directory when it makes that directory, and its entries, available for clients to mount into their own name spaces.

Mounting in NFS
[Figure: the server exports the "steen" subdirectory of /users; Client A mounts it at /remote/vu while Client B mounts it at /work/vu.]
Sharing files becomes harder: the same file is named /remote/vu/mbox at Client A but /work/vu/mbox at Client B.

Sharing Files In NFS
A common solution for sharing files in NFS is to provide each client with a name space that is partly standardized.
For example, each client may use the local directory /usr/bin to mount a file system.
A remote file system can then be mounted in the same manner for each user.

Example
[Figure: both Client A and Client B mount the server's exported "steen" subdirectory at /usr/bin.]
Sharing files becomes easier: the file is named /usr/bin/mbox at both Client A and Client B.
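The effect of mount points on naming can be sketched with a toy mount-table resolver (the helper `resolve` and the table layout are illustrative assumptions, not real NFS code):

```python
# Toy mount-table resolver illustrating NFS-style naming. Each client
# maps a local mount point to a directory exported by a server.

def resolve(mount_table, local_path):
    """Translate a client-local path into (server, path-on-server)."""
    for mount_point, (server, exported_dir) in mount_table.items():
        if local_path.startswith(mount_point + "/"):
            return server, exported_dir + local_path[len(mount_point):]
    return None  # purely local path

# Non-standardized mount points: the same remote file has different names.
client_a = {"/remote/vu": ("server", "/users/steen")}
client_b = {"/work/vu": ("server", "/users/steen")}
assert resolve(client_a, "/remote/vu/mbox") == ("server", "/users/steen/mbox")
assert resolve(client_b, "/work/vu/mbox") == ("server", "/users/steen/mbox")

# Partly standardized name space: both clients mount at /usr/bin, so the
# file has the same name, /usr/bin/mbox, on every client.
client_a = {"/usr/bin": ("server", "/users/steen")}
client_b = {"/usr/bin": ("server", "/users/steen")}
assert resolve(client_a, "/usr/bin/mbox") == resolve(client_b, "/usr/bin/mbox")
```

The same server path resolves identically on both clients only when the mount points agree, which is exactly what the partly standardized name space buys.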

Mounting Nested Directories In NFSv3
An NFS server, S, can itself mount directories, Ds, that are exported by other servers.
However, in NFSv3, S is not allowed to export Ds to its own clients; instead, a client of S has to mount Ds explicitly.
If S were allowed to export Ds, it would have to return to its clients file handles that include identifiers for the exporting servers. NFSv4 solves this problem.
A file handle is a reference to a file within a file system, independent of the name of the file it refers to. It is created, when the file is created, by the server hosting the file system, and it is unique with respect to all file systems exported by that server. The client is kept ignorant of the actual content of a file handle; it is completely opaque. File handles were 32 bytes in NFS version 2, variable up to 64 bytes in version 3, and 128 bytes in version 4.

Mounting Nested Directories in NFS
[Figure: the client imports a directory from server A; server A imports a subdirectory from server B; the client needs to explicitly import that subdirectory from server B.]

NFS: Mounting Upon Logging In (1)
Another problem with the NFS naming model has to do with deciding when a remote file system should be mounted.
Example: assume a large system with thousands of users, where each user has a local directory /home that is used to mount the home directories of other users.
Alice's (a user) home directory is made locally available to her as /home/alice. This directory can be automatically mounted when Alice logs into her workstation.
In addition, Alice may have access to Bob's (another user) public files by accessing Bob's directory through /home/bob.

NFS: Mounting Upon Logging In (2)
Example (cont'd): the question, however, is whether Bob's home directory should also be mounted automatically when Alice logs in.
The benefit of automatic mounting is that mounting file systems becomes completely transparent upon logging in. But if automatic mounting were performed for every user:
  Logging in could incur a lot of communication and administrative overhead.
  All users would have to be known in advance.
A better approach is to transparently mount another user's home directory on demand.

On-Demand Mounting In NFS
On-demand mounting of a remote file system is handled in NFS by an automounter, which runs as a separate process on the client's machine.
[Figure: (1) the NFS client looks up "/home/alice"; (2) the automounter creates the subdirectory "alice" in the local file system; (3) it issues a mount request to the server; (4) the server's "alice" subdirectory is mounted.]
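The automounter's role can be sketched as follows (a toy model with invented class names, not the real NFS automounter): it intercepts the first lookup under /home, mounts the corresponding remote directory on demand, and caches the mount so later lookups avoid the server round trip.

```python
# Toy automounter: mounts a user's remote home directory on first
# access under /home and remembers the mount for later lookups.

class FakeServer:
    def __init__(self):
        self.mount_requests = 0
    def mount(self, user):
        self.mount_requests += 1            # one RPC per actual mount
        return f"nfs://server/users/{user}"

class Automounter:
    def __init__(self, server):
        self.server = server
        self.mounted = {}                   # user -> mounted directory

    def lookup(self, path):
        parts = path.split("/")             # "/home/alice/mbox"
        assert parts[1] == "home", "this automounter only manages /home"
        user = parts[2]
        if user not in self.mounted:        # steps 2-4: mount on demand
            self.mounted[user] = self.server.mount(user)
        return self.mounted[user], "/".join(parts[3:])

srv = FakeServer()
am = Automounter(srv)
am.lookup("/home/alice/mbox")
am.lookup("/home/alice/notes")   # cached: no second mount request
assert srv.mount_requests == 1
am.lookup("/home/bob/public")    # bob's directory mounted only when used
assert srv.mount_requests == 2
```

Mounting happens only for directories actually touched, avoiding the cost of mounting every user's home directory at login.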

DFS Aspects

Synchronization In DFSs
  File Sharing Semantics
  Lock Management

Unix Semantics In Single-Processor Systems
Synchronization for file systems would not be an issue if files were not shared.
When two or more users share the same file at the same time, it is necessary to define the semantics of reading and writing.
In single-processor systems, a read operation after a write returns the value just written. Such a model is referred to as Unix semantics.
[Figure: on a single machine, process A appends "c" to a file containing "ab"; a subsequent read by process B gets "abc".]

Unix Semantics In DFSs
In a DFS, Unix semantics can be achieved easily if there is only one file server and clients do not cache files.
All reads and writes then go directly to the file server, which processes them strictly sequentially.
This approach provides Unix semantics; however, performance may degrade because all file requests must go to a single server.

Caching and Unix Semantics
The performance of a DFS with a single file server and Unix semantics can be improved by caching.
However, if a client locally modifies a cached file and shortly afterwards another client reads the file from the server, the second client gets an obsolete file.
[Figure: client machine #1 reads "ab" and appends "c" to its cached copy; client machine #2 then reads from the server and still gets "ab".]
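The staleness problem in the figure can be reproduced with a minimal sketch (toy classes, not a real DFS): writes go only to the local cache, so another client reading from the server sees old data.

```python
# Why client-side caching breaks Unix semantics: a write that stays in
# one client's cache is invisible at the server and at other clients.

class Server:
    def __init__(self, data):
        self.data = data
    def read(self):
        return self.data

class CachingClient:
    def __init__(self, server):
        self.server = server
        self.cache = server.read()   # fetch the file on open
    def write(self, suffix):
        self.cache += suffix         # modifies only the local copy

server = Server("ab")
c1 = CachingClient(server)
c1.write("c")
c2 = CachingClient(server)           # c2 now reads from the server
assert c1.cache == "abc"             # c1 sees its own write...
assert c2.cache == "ab"              # ...but c2 gets the obsolete file
```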

Session Semantics (1)
One way out of getting an obsolete file is to propagate all changes to cached files back to the server immediately; implementing such an approach is very difficult.
An alternative solution is to relax the semantics of file sharing.
Session semantics: changes to an open file are initially visible only to the process that modified the file; only when the file is closed are the changes made visible to other processes.

Session Semantics (2)
Using session semantics raises the question of what happens if two or more clients are simultaneously caching and modifying the same file.
One solution: as each file is closed in turn, its value is sent back to the server, so the final result depends on whose close request is processed last by the server.
A less pleasant solution, but one that is easier to implement, is to say that the final result is one of the candidates and leave the choice of the candidate unspecified.
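The "last close wins" behavior can be sketched by extending the toy caching model: each client works on a private copy, and changes reach the server only on close.

```python
# Session-semantics sketch: private copy per open session; the file's
# new value is published only at close(), and the last close wins.

class Server:
    def __init__(self, data):
        self.data = data

class SessionClient:
    def __init__(self, server):
        self.server = server
        self.copy = server.data          # private copy taken at open
    def write(self, suffix):
        self.copy += suffix
    def close(self):
        self.server.data = self.copy     # changes become visible here

srv = Server("ab")
a, b = SessionClient(srv), SessionClient(srv)
a.write("c")
b.write("d")
assert srv.data == "ab"    # nothing is visible before either close
a.close()
assert srv.data == "abc"
b.close()                  # b's close is processed last...
assert srv.data == "abd"   # ...so b's version wins and a's "c" is lost
```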

Immutable Semantics (1)
A different approach to the semantics of file sharing in DFSs is to make all files immutable.
With immutable semantics there is no way to open a file for writing; what is possible is to create an entirely new file.
Hence, the problem of how to deal with two processes, one writing and the other reading, simply disappears.

Immutable Semantics (2)
However, what happens if two processes try to replace the same file?
  Allow one of the new files to replace the old one (either the last one, or non-deterministically).
What to do if a file is replaced while another process is busy reading it?
  Solution 1: arrange for the reader to continue using the old file.
  Solution 2: detect that the file has changed and make subsequent attempts to read from it fail.
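Solution 1 falls out naturally if "files" are versioned snapshots, as in this toy sketch (invented class, not any particular DFS): a reader keeps a reference to the version it opened, while a "replace" just appends a new immutable version.

```python
# Immutable-files sketch: writes never modify a file in place; they
# create a new version. A reader keeps using the version it opened.

class ImmutableStore:
    def __init__(self):
        self.versions = {}                  # name -> list of versions
    def create(self, name, data):
        self.versions.setdefault(name, []).append(data)
    def open(self, name):
        return self.versions[name][-1]      # snapshot of latest version

store = ImmutableStore()
store.create("f", "v1")
snapshot = store.open("f")        # reader opens version v1
store.create("f", "v2")           # writer "replaces" f with a new file
assert snapshot == "v1"           # Solution 1: reader keeps the old file
assert store.open("f") == "v2"    # new opens see the replacement
```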

Atomic Transactions
A different approach to the semantics of file sharing in DFSs is to use atomic transactions, where all changes occur atomically.
A key property is that all calls contained in a transaction are carried out in order:
1. A process first executes some type of BEGIN_TRANSACTION primitive to signal that what follows must be executed indivisibly.
2. Then come system calls to read and write one or more files.
3. When done, an END_TRANSACTION primitive is executed.
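The three steps above can be sketched with a toy transactional file store (single-threaded, no concurrency control or abort handling, invented names): writes between BEGIN and END are buffered and applied all at once.

```python
# Transaction sketch: writes are staged between BEGIN_TRANSACTION and
# END_TRANSACTION and become visible atomically at END.

class TxFileSystem:
    def __init__(self):
        self.files = {}
        self.staged = None
    def begin_transaction(self):
        self.staged = {}
    def write(self, name, data):
        assert self.staged is not None, "write outside a transaction"
        self.staged[name] = data             # buffered, not yet visible
    def end_transaction(self):
        self.files.update(self.staged)       # all changes appear at once
        self.staged = None

fs = TxFileSystem()
fs.begin_transaction()
fs.write("a", "1")
fs.write("b", "2")
assert fs.files == {}                        # invisible mid-transaction
fs.end_transaction()
assert fs.files == {"a": "1", "b": "2"}      # both writes appear together
```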

Semantics of File Sharing: Summary
There are four ways of dealing with shared files in a DFS:

Method             Comment
Unix semantics     Every operation on a file is instantly visible to all processes
Session semantics  No changes are visible to other processes until the file is closed
Immutable files    No updates are possible; simplifies sharing and replication
Transactions       All changes occur atomically

Synchronization In DFSs
  File Sharing Semantics
  Lock Management

Central Lock Manager
In client-server architectures (especially with stateless servers), additional facilities for synchronizing accesses to shared files are required.
A central lock manager can be deployed, where accesses to a shared resource are synchronized by granting and denying access permissions.
[Figure: processes P0, P1, and P2 send lock requests to the central lock manager; one request is granted (a lease is obtained) while the others are queued or denied; a queued request is granted once the holder releases the lock or its lease expires.]
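The grant/queue/release cycle in the figure can be sketched as follows (a toy manager with no leases or timeouts; the class and method names are illustrative):

```python
# Central lock manager sketch: one holder at a time; later requests
# are queued and granted in FIFO order when the holder releases.

from collections import deque

class LockManager:
    def __init__(self):
        self.holder = None
        self.queue = deque()
    def request(self, pid):
        if self.holder is None:
            self.holder = pid
            return "granted"
        self.queue.append(pid)
        return "queued"
    def release(self, pid):
        assert self.holder == pid, "only the holder may release"
        self.holder = self.queue.popleft() if self.queue else None

lm = LockManager()
assert lm.request("P0") == "granted"
assert lm.request("P1") == "queued"
assert lm.request("P2") == "queued"
lm.release("P0")
assert lm.holder == "P1"      # next process in the queue gets the lock
```

A real deployment would add leases so that a crashed holder cannot block the queue forever, which is exactly what the lease expiry in the figure provides.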

File Locking In NFSv4
NFSv4 distinguishes read locks from write locks:
  Multiple clients can simultaneously access the same part of a file, provided they only read data.
  A write lock is needed to obtain exclusive access to modify part of a file.
NFSv4 operations related to file locking:

Operation  Description
Lock       Create a lock for a range of bytes
Lockt      Test whether a conflicting lock has been granted
Locku      Remove a lock from a range of bytes
Renew      Renew the lease on a specified lock
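The read/write compatibility rule can be sketched with toy byte-range locks in the spirit of Lock/Lockt/Locku (no leases, owners, or blocking are modeled; this is not the NFSv4 wire protocol):

```python
# Byte-range locks: read locks may overlap; a write lock conflicts
# with any overlapping lock.

class RangeLocks:
    def __init__(self):
        self.locks = []                      # (start, end, mode) tuples
    def _conflicts(self, start, end, mode):
        for s, e, m in self.locks:
            overlap = start < e and s < end
            if overlap and ("write" in (mode, m)):
                return True
        return False
    def lock(self, start, end, mode):        # cf. NFSv4 Lock
        if self._conflicts(start, end, mode):
            return False
        self.locks.append((start, end, mode))
        return True
    def lockt(self, start, end, mode):       # cf. NFSv4 Lockt: test only
        return not self._conflicts(start, end, mode)
    def locku(self, start, end, mode):       # cf. NFSv4 Locku
        self.locks.remove((start, end, mode))

f = RangeLocks()
assert f.lock(0, 100, "read")
assert f.lock(50, 150, "read")        # overlapping read locks coexist
assert not f.lock(0, 10, "write")     # write conflicts with readers
f.locku(0, 100, "read")
f.locku(50, 150, "read")
assert f.lock(0, 10, "write")         # exclusive once readers are gone
```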

Sharing Files in Coda
When a client successfully opens a file f, an entire copy of f is transferred to the client's machine, and the server records that the client has a copy of f.
If client A has opened f for writing and another client B wants to open f (for reading or writing) as well, B's open will fail.
If client A has opened f for reading:
  An attempt by client B to open f for reading succeeds.
  An attempt by client B to open f for writing succeeds as well.

Transactional Behavior of Sharing Files in Coda
[Figure: client A's read session SA (open(RD) … close) on file f overlaps client B's write session SB (open(RW) … close); when B's update arrives, the server invalidates A's copy.]
From a transactional point of view, this is okay, since SA is considered to have been scheduled before SB.

DFS Aspects

Consistency and Replication In DFSs
  Client-Side Caching
  Server-Side Replication

Client-Side Caching In Coda
Caching and replication play important roles in DFSs, most notably when they are designed to operate over WANs.
To see how client-side caching is deployed in practice, we discuss client-side caching in Coda.
Clients in Coda always cache entire files, regardless of whether a file is opened for reading or writing.
Cache coherence in Coda is maintained by means of callbacks.

Callback Promise and Break
For each file, the server from which a client cached the file keeps track of which clients have a copy of that file; the server is then said to record a callback promise.
When a client updates its local copy of a file, it notifies the server.
Subsequently, the server sends an invalidation message to the other clients; such an invalidation message is called a callback break.

Using Cached Copies in Coda
The interesting aspect of client-side caching in Coda is that as long as a client knows it has an outstanding callback promise at the server, it can safely access the file locally.
[Figure: client A fetches f and holds a callback promise, so a later session can reopen the cached copy with no file transfer (OK); after a write session by client B triggers a callback break at A, A's next open must transfer a fresh copy of f (not OK to use the cache).]
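The promise/break interplay can be sketched with a toy Coda-style server (invented classes; real Coda also handles disconnection, leases, and whole-volume state):

```python
# Callback promise / break sketch: the server remembers which clients
# cache a file (promise); on an update it sends the other clients a
# callback break, so their next open must refetch the file.

class CodaServer:
    def __init__(self, data):
        self.data = data
        self.promises = set()
    def fetch(self, client):
        self.promises.add(client)           # record a callback promise
        client.cache, client.valid = self.data, True
        client.transfers += 1
    def update(self, writer, data):
        self.data = data
        for c in list(self.promises):       # callback break to others
            if c is not writer:
                c.valid = False
                self.promises.discard(c)

class CodaClient:
    def __init__(self):
        self.cache, self.valid, self.transfers = None, False, 0
    def open(self, server):
        if not self.valid:                  # no outstanding promise
            server.fetch(self)              # must transfer the file
        return self.cache                   # else: use cache, no transfer

srv = CodaServer("v1")
a, b = CodaClient(), CodaClient()
a.open(srv)
a.open(srv)
assert a.transfers == 1        # second open reused the cached copy
srv.update(b, "v2")            # b's update causes a callback break at a
assert a.open(srv) == "v2"
assert a.transfers == 2        # the break forced a fresh file transfer
```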

Consistency and Replication In DFSs
  Client-Side Caching
  Server-Side Replication

Server-Side Replication
Server-side replication in DFSs is applied (as usual) for fault-tolerance and performance reasons.
However, a problem with server-side replication is that the combination of a high degree of replication and a low read/write ratio may degrade performance:
  For an N-fold replicated file, a single update request leads to an N-fold increase in update operations.
  Concurrent updates need to be synchronized.

Storage Groups in Coda
The unit of replication in Coda is a collection of files called a volume.
The collection of Coda servers that have a copy of a volume is known as that volume's Volume Storage Group (VSG).
In the presence of failures, a client may not have access to all servers in a volume's VSG.
A client's Accessible Volume Storage Group (AVSG) for a volume consists of those servers in the volume's VSG that the client can currently reach.
If the AVSG is empty, the client is said to be disconnected.

Maintaining Consistency in Coda
Coda uses a variant of Read-One, Write-All (ROWA) to maintain consistency of a replicated volume:
  When a client needs to read a file, it contacts one of the members in its AVSG of the volume to which that file belongs.
  When closing a session on an updated file, the client transfers the file in parallel to each member of the AVSG.
This scheme works fine as long as there are no failures, i.e., as long as each client's AVSG of a volume equals the volume's VSG.

A Consistency Example (1)
Consider a volume that is replicated across three servers: S1, S2, and S3.
For client A, assume its AVSG covers servers S1 and S2, whereas client B has access only to server S3.
[Figure: a broken network separates servers S1 and S2, reachable by client A, from server S3, reachable by client B.]

A Consistency Example (2)
Coda uses an optimistic strategy for file replication. It allows both clients, A and B, to:
  Open a replicated file f for writing.
  Update their respective copies.
  Transfer their copies back to the members of their AVSGs.
Obviously, there will then be different versions of f stored in the VSG.
The question is how this inconsistency can be detected and resolved. The solution adopted by Coda is to deploy a versioning scheme.

Versioning in Coda (1)
The versioning scheme in Coda entails that each server Si in a VSG maintains a Coda Version Vector, CVVi(f), for each file f contained in the VSG:
  If CVVi(f)[j] = k, then server Si knows that server Sj has seen at least version k of file f.
  CVVi(f)[i] is the number of the current version of f stored at server Si.
  An update of f at server Si leads to an increment of CVVi(f)[i].

Versioning in Coda (2)
[Figure: the same partitioned setup as before.]
Initially, CVV1(f) = CVV2(f) = CVV3(f) = [1, 1, 1].
When client A reads f from one of the servers in its AVSG, say S1, it also receives CVV1(f).
After updating f, client A multicasts f to each server in its AVSG (i.e., S1 and S2).

Versioning in Coda (3)
[Figure: the same partitioned setup as before.]
S1 and S2 then record that their respective copies have been updated, but not that of S3: CVV1(f) = CVV2(f) = [2, 2, 1].
Meanwhile, client B is allowed to open a session in which it receives a copy of f from server S3, and it may subsequently update f.

Versioning in Coda (4)
[Figure: the same partitioned setup as before.]
Say client B then closes its session and transfers its update to S3; S3 updates its version vector to CVV3(f) = [1, 1, 2].
When the partition is healed, the three servers will notice that a conflict has occurred and needs to be repaired (i.e., the inconsistency is detected).
How the inconsistency is resolved is discussed by Kumar and Satyanarayanan (1995).
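The conflict detection in this example can be sketched directly with version vectors (toy code; the dominance test is the standard version-vector comparison, not Coda's actual repair machinery):

```python
# Coda Version Vector (CVV) sketch for the partitioned example: an
# update bumps the entries of every server in the writer's AVSG; two
# vectors conflict when neither dominates the other componentwise.

def dominates(v, w):
    return all(a >= b for a, b in zip(v, w))

def conflict(v, w):
    return not dominates(v, w) and not dominates(w, v)

# Initially every server holds version [1, 1, 1] of file f.
cvv = {"S1": [1, 1, 1], "S2": [1, 1, 1], "S3": [1, 1, 1]}

# Client A (AVSG = {S1, S2}) updates f: S1 and S2 record that both
# of their copies advanced, but know nothing about S3.
for s in ("S1", "S2"):
    cvv[s][0] += 1
    cvv[s][1] += 1
assert cvv["S1"] == [2, 2, 1] and cvv["S2"] == [2, 2, 1]

# Client B (AVSG = {S3}) updates f on the other side of the partition.
cvv["S3"][2] += 1
assert cvv["S3"] == [1, 1, 2]

# When the partition heals, the servers compare vectors: neither
# [2, 2, 1] nor [1, 1, 2] dominates the other, so a conflict is
# detected and must be repaired.
assert conflict(cvv["S1"], cvv["S3"])
```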

DFS Aspects

Fault Tolerance In DFSs
Fault tolerance in DFSs is typically handled according to the principles we discussed in the fault-tolerance lectures.
Hence, we concentrate mainly on some issues specific to fault tolerance in DFSs.

Handling Byzantine Failures
One problem that is often ignored when dealing with fault tolerance in DFSs is that servers may exhibit Byzantine failures.
To achieve protection against Byzantine failures, the server group must consist of at least 3k+1 processes, assuming that at most k processes fail at once.
In practical settings, such protection can be achieved only if the non-faulty processes are ensured to execute all operations in the same order (Castro and Liskov, 2002).
Each process request can be assigned a sequence number, and a single coordinator can be used to serialize all operations.

Quorum Mechanism (1)
But: the coordinator itself can fail.
Therefore, processes go through a series of views, where:
  In each view, the members of the group agree on the non-faulty processes.
  A member initiates a view change when the current coordinator appears to be failing.
To this end, a quorum mechanism is used.

Quorum Mechanism (2)
The quorum mechanism goes as follows:
  Whenever a process P receives a request to execute an operation o with a sequence number n in a view v, it sends this to all other processes.
  P waits until it has received a confirmation from at least 2k other processes that have seen the same request.
  Such a confirmation is called a quorum certificate; a quorum of size 2k+1 (P plus 2k others) is said to be obtained.
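The certificate check itself is simple to sketch (a toy predicate, not a full Castro-Liskov implementation): with at most k Byzantine faults out of n = 3k+1 processes, a process accepts a request once 2k other processes confirm the same (operation, sequence number, view) triple.

```python
# Quorum-certificate sketch: count confirmations that match the
# request exactly; 2k matches plus the process itself form a quorum
# of size 2k+1.

def quorum_reached(k, confirmations, request):
    """confirmations: replies received from other processes."""
    matching = sum(1 for c in confirmations if c == request)
    return matching >= 2 * k

k = 1                                  # tolerate 1 fault, so n = 4
request = ("write", 7, 0)              # (operation, sequence number, view)

assert not quorum_reached(k, [request], request)         # only 1 match
assert quorum_reached(k, [request, request], request)    # 2k matches
# A faulty process sending a different triple does not help the quorum:
assert not quorum_reached(k, [request, ("write", 9, 0)], request)
```

Matching on the full (operation, n, v) triple is what guarantees that all non-faulty processes agree on the same operation in the same slot of the same view.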

Quorum Mechanism (3)
The whole quorum mechanism consists of five phases:
1. Request: the client sends a request.
2. Pre-prepare: the master multicasts a sequence number.
3. Prepare: each replica multicasts its acceptance to the others.
4. Commit: agreement is reached; all processes inform each other and execute the operation.
5. Reply: the client is given the result.
[Figure: the client and the master plus three replicas exchanging messages across the five phases.]

Next Class Virtualization