Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 DISTRIBUTED SYSTEMS.

DISTRIBUTED SYSTEMS: Principles and Paradigms, Second Edition
ANDREW S. TANENBAUM, MAARTEN VAN STEEN
Chapter 7: Consistency and Replication (Consistency Protocols)

CONSISTENCY PROTOCOLS
- Continuous consistency
  - Bounding numerical deviation
  - Bounding staleness deviations
  - Bounding ordering deviations
- Sequential consistency
  - Primary-based protocols
    - Remote-write protocols
    - Local-write protocols
  - Replicated-write protocols
    - Active replication
    - Majority voting (quorum-based)
- Cache-coherence protocols
- Implementing client-centric consistency

Consistency protocol
A consistency protocol describes an implementation of a specific consistency model.

Continuous Consistency: Bounding Numerical Deviation
A solution for keeping the numerical deviation within bounds:
- We concentrate on writes to a single data item $x$.
- Each write $W(x)$ has an associated weight, denoted $weight(W)$, that represents the numerical value by which $x$ is updated. For simplicity, we assume that $weight(W) > 0$.
- Each write $W$ is initially submitted to one of the $N$ available replica servers, which then becomes the write's origin, denoted $origin(W)$.
- Each server $S_i$ keeps a log $L_i$ of the writes that it has performed on its own local copy of $x$.

Continuous consistency: Numerical deviation
- Consider a data item $x$ and let $weight(W)$ denote the numerical change in its value after a write operation $W$. Assume that $weight(W) > 0$.
- $W$ is initially forwarded to one of the $N$ replicas, denoted $origin(W)$.
- $TW[i,j]$ is the total weight of the writes executed by server $S_i$ that originated from $S_j$, where $L_i$ is the log of $S_i$:
  $$TW[i,j] = \sum \{\, weight(W) \mid origin(W) = S_j \wedge W \in L_i \,\}$$
- The goal is, at any time $t$, to keep the current value $v_i$ at server $S_i$ within bounds of the actual value $v(t)$ of $x$. This actual value is completely determined by all submitted writes.

Continuous consistency: Numerical deviation
Actual value $v(t)$ of $x$:
$$v(t) = v(0) + \sum_{k=1}^{N} TW[k,k]$$
Value $v_i$ of $x$ at replica $i$:
$$v_i = v(0) + \sum_{k=1}^{N} TW[i,k]$$
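
The two sums can be checked on a small example. The following Python sketch is illustrative only: the `Write` record, the helper names, and the three-server logs are assumptions made for this example and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    origin: int    # index j of the origin server S_j
    weight: float  # weight(W) > 0


def tw(logs, i, j):
    """TW[i, j]: total weight of writes in S_i's log that originated at S_j."""
    return sum(w.weight for w in logs[i] if w.origin == j)


def actual_value(logs, v0):
    """v(t) = v(0) + sum over k of TW[k, k]."""
    return v0 + sum(tw(logs, k, k) for k in range(len(logs)))


def local_value(logs, i, v0):
    """v_i = v(0) + sum over k of TW[i, k]."""
    return v0 + sum(tw(logs, i, k) for k in range(len(logs)))


# Hypothetical state: one write was submitted at each of three servers.
w0, w1, w2 = Write(0, 2.0), Write(1, 5.0), Write(2, 1.0)
logs = [
    [w0, w1],  # S_0 has applied its own write and the one from S_1
    [w1],      # S_1 has applied only its own write
    [w2, w0],  # S_2 has applied its own write and the one from S_0
]

print(actual_value(logs, 0.0))                        # 8.0
print([local_value(logs, i, 0.0) for i in range(3)])  # [7.0, 5.0, 3.0]
```

Because every weight is positive, each local value lags behind the actual value and never overshoots it; here $S_2$ is furthest behind, with a deviation of 5.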

Continuous consistency: Numerical deviation
- With every server $S_i$ we associate an upper bound $\delta_i$, and we need to enforce $v(t) - v_i \le \delta_i$.
- When a server $S_i$ propagates a write originating from $S_j$ to $S_k$, the latter learns the value of $TW[i,j]$ at the time the write was sent.
- $S_k$ maintains a view $TW_k[i,j]$ of what it believes $S_i$ has as its value for $TW[i,j]$.

Continuous consistency: Numerical deviation
Solution: $S_k$ sends operations from its log to $S_i$ when it sees that $TW_k[i,k]$ is getting too far from $TW[k,k]$, in particular when
$$TW[k,k] - TW_k[i,k] > \frac{\delta_i}{N-1}$$

Continuous consistency: Numerical deviation
- When server $S_k$ notices that $S_i$ has not kept pace with the updates that have been submitted to $S_k$, it forwards writes from its log to $S_i$.
- This forwarding effectively advances the view $TW_k[i,k]$ that $S_k$ has of $TW[i,k]$, making the deviation $TW[i,k] - TW_k[i,k]$ smaller.
- In particular, $S_k$ advances its view of $TW[i,k]$ when an application submits a new write that would increase $TW[k,k] - TW_k[i,k]$ beyond $\delta_i/(N-1)$.
- Advancement always ensures that $v(t) - v_i \le \delta_i$.
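
The push rule can be sketched in a few lines of Python. The bound is divided by $N-1$ because each of the other $N-1$ origin servers may let $S_i$ lag by at most that share, so the shares add up to at most $\delta_i$. This is a minimal single-process sketch under stated assumptions: the class name, the `deliver` callback, and the choice to log a write before forwarding it are all hypothetical, and message loss and concurrency are ignored.

```python
class OriginServer:
    """Origin S_k pushing its own writes so that every S_i stays within delta_i."""

    def __init__(self, k, deltas):
        self.k = k
        self.n = len(deltas)
        self.deltas = deltas           # deltas[i] is the bound delta_i of S_i
        self.log = []                  # weights of the writes submitted to S_k
        self.total = 0.0               # TW[k, k]
        self.view = [0.0] * self.n     # view[i] models TW_k[i, k]
        self.sent = [0] * self.n       # number of log entries forwarded to S_i

    def submit(self, weight, deliver):
        """Accept a write, then forward log entries wherever the bound breaks."""
        assert weight > 0
        self.log.append(weight)
        self.total += weight
        for i in range(self.n):
            if i == self.k:
                continue
            if self.total - self.view[i] > self.deltas[i] / (self.n - 1):
                deliver(i, self.log[self.sent[i]:])  # hypothetical transport
                self.sent[i] = len(self.log)
                self.view[i] = self.total            # S_i is now caught up


received = {1: [], 2: []}
origin = OriginServer(k=0, deltas=[4.0, 4.0, 4.0])   # threshold 4 / (3 - 1) = 2
for weight in (1.0, 1.0, 1.0):
    origin.submit(weight, lambda i, ws: received[i].extend(ws))

print(received)      # {1: [1.0, 1.0, 1.0], 2: [1.0, 1.0, 1.0]}
print(origin.view)   # [0.0, 3.0, 3.0]
```

The first two writes stay within the threshold of 2, so nothing is sent; the third write pushes the gap to 3 and triggers forwarding of the whole backlog.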

Bounding Staleness Deviations
- There are many ways to keep the staleness of replicas within specified bounds.
- One approach is to let the origin server timestamp each write submitted to it.
- Let server $S_k$ keep a real-time vector clock $RVC_k$, where $RVC_k[i] = T(i)$ means that $S_k$ has seen all writes that have been submitted to $S_i$ up to time $T(i)$.
- $T(i)$ denotes the time local to $S_i$.

Bounding Staleness Deviations
- If the clocks of the replica servers are loosely synchronized, then whenever server $S_k$ notes that $T(k) - RVC_k[i]$ is about to exceed a specified limit, it simply starts pulling in writes that originated from $S_i$ with a timestamp later than $RVC_k[i]$.
- A replica server is thus responsible for keeping its own copy of $x$ up to date regarding writes that have been issued elsewhere.
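
The pull rule might look as follows in Python. This is a sketch under simplifying assumptions: time is passed in explicitly instead of being read from a clock, "about to exceed" is modeled as reaching the limit, and setting `rvc[i]` to the replica's own time relies on the loosely synchronized clocks mentioned above. The class and method names are hypothetical.

```python
class Origin:
    """Origin server S_i that timestamps every write submitted to it."""

    def __init__(self):
        self.writes = []                       # list of (timestamp, value)

    def submit(self, timestamp, value):
        self.writes.append((timestamp, value))

    def writes_after(self, timestamp):
        return [w for w in self.writes if w[0] > timestamp]


class Replica:
    """Replica S_k that pulls from origins before its copy becomes too stale."""

    def __init__(self, origins, limit):
        self.origins = origins
        self.limit = limit
        self.rvc = [0.0] * len(origins)        # rvc[i] models RVC_k[i]
        self.applied = []

    def tick(self, now):
        """Called with the local time T(k); pulls where the limit is reached."""
        for i, origin in enumerate(self.origins):
            if now - self.rvc[i] >= self.limit:
                self.applied.extend(origin.writes_after(self.rvc[i]))
                self.rvc[i] = now              # assumes loosely synchronized clocks


s_i = Origin()
s_i.submit(1.0, "a")
s_i.submit(3.0, "b")
s_k = Replica([s_i], limit=5.0)

s_k.tick(4.0)                                  # staleness 4 < 5: no pull yet
s_k.tick(5.0)                                  # limit reached: pulls "a" and "b"
s_i.submit(7.0, "c")
s_k.tick(8.0)                                  # staleness 3 < 5: no pull
s_k.tick(10.0)                                 # limit reached: pulls "c"
print(s_k.applied)   # [(1.0, 'a'), (3.0, 'b'), (7.0, 'c')]
```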

Numerical vs. Staleness Bounds
- Numerical bounds follow a push approach: an origin server keeps replicas up to date by forwarding writes.
- Staleness bounds follow a pull approach: replica servers pull in updates from origin servers.

Bounding Ordering Deviations
- Ordering deviations in continuous consistency arise because a replica server tentatively applies the updates that have been submitted to it.
- Each server has a local queue of tentative writes for which the actual order in which they are to be applied to the local copy of x still needs to be determined.
- The ordering deviation is bounded by specifying the maximal length of the queue of tentative writes.
- A globally consistent ordering of tentative writes must be enforced, using primary-based or quorum-based protocols.
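
A bounded queue of tentative writes can be sketched as below. Two things here are assumptions rather than statements from the slides: the queue is committed as soon as a new write would exceed the maximal length, and the `agree_on_order` callback stands in for whatever primary-based or quorum-based protocol fixes the global order (the example simply uses `sorted`).

```python
class TentativeWrites:
    """Local queue of tentative writes with a bounded length."""

    def __init__(self, max_len, agree_on_order):
        self.max_len = max_len
        self.agree_on_order = agree_on_order   # hypothetical ordering protocol
        self.tentative = []                    # order not yet final
        self.committed = []                    # globally agreed order

    def submit(self, write):
        # Commit first if accepting this write would exceed the bound.
        if len(self.tentative) >= self.max_len:
            self.commit()
        self.tentative.append(write)

    def commit(self):
        self.committed.extend(self.agree_on_order(self.tentative))
        self.tentative = []


queue = TentativeWrites(max_len=2, agree_on_order=sorted)
for w in ("b", "a", "c"):
    queue.submit(w)

print(queue.committed, queue.tentative)   # ['a', 'b'] ['c']
```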

Primary-Based Protocols
- Each data item x in the data store has an associated primary, which is responsible for coordinating write operations on x.
- These protocols provide a straightforward implementation of sequential consistency: all processes see all write operations in the same order, no matter which backup server they use to perform read operations.
- A distinction can be made as to whether:
  - the primary is fixed at a remote server, or
  - write operations can be carried out locally after moving the primary to the process where the write operation is initiated.

Remote-Write Protocols
- All write operations need to be forwarded to a fixed single server.
- Read operations can be carried out locally.
- Also called primary-backup protocols (Fig. 7-20).
- Disadvantage: because an update is implemented as a blocking operation, it may take a relatively long time before the process that initiated the update is allowed to continue.
- Alternative: a nonblocking approach.

Remote-Write Protocols
Figure 7-20. The principle of a primary-backup protocol.

Remote-Write Protocols
Nonblocking approach:
- As soon as the primary has updated its local copy of x, it returns an acknowledgment. After that, it tells the backup servers to perform the update as well.
- Advantage: write operations may speed up considerably.
- Disadvantage: fault tolerance suffers, because an acknowledged update may not yet be backed up by other servers.
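
The difference between the blocking and the nonblocking variant comes down to when the acknowledgment is returned. The sketch below is a single-process illustration with hypothetical class and method names; a real system would send messages over a network and run propagation in the background.

```python
class Backup:
    def __init__(self):
        self.store = {}

    def apply(self, key, value):
        self.store[key] = value

    def read(self, key):
        return self.store.get(key)             # reads are served locally


class Primary:
    def __init__(self, backups):
        self.store = {}
        self.backups = backups
        self.pending = []                      # updates not yet sent to backups

    def write_blocking(self, key, value):
        """Acknowledge only after every backup has applied the update."""
        self.store[key] = value
        for backup in self.backups:
            backup.apply(key, value)
        return "ack"

    def write_nonblocking(self, key, value):
        """Acknowledge right after the local update; backups follow later."""
        self.store[key] = value
        self.pending.append((key, value))
        return "ack"

    def propagate(self):
        for key, value in self.pending:
            for backup in self.backups:
                backup.apply(key, value)
        self.pending.clear()


backup = Backup()
primary = Primary([backup])

primary.write_blocking("x", 1)
print(backup.read("x"))      # 1

primary.write_nonblocking("x", 2)
print(backup.read("x"))      # 1 (acknowledged, but not yet backed up)
primary.propagate()
print(backup.read("x"))      # 2
```

The middle read shows the fault-tolerance gap: if the primary failed before `propagate()` ran, the acknowledged value 2 would be lost.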

Local-Write Protocols
- When a process wants to update data item x, it locates the primary copy of x and subsequently moves it to its own location (Fig. 7-21).
- Advantage (in the nonblocking protocol only): multiple successive write operations can be carried out locally, while reading processes can still access their local copy.
- Updates are propagated to the replicas after the primary has finished performing the updates locally.

Local-Write Protocols
Figure 7-21. Primary-backup protocol in which the primary migrates to the process wanting to perform an update.

Local-Write Protocols
Example of a primary-backup protocol with local writes: mobile computing in disconnected mode.
- All relevant files are shipped to the user before disconnecting and are updated later on.
- While the user is disconnected, all update operations are carried out locally, and other processes can still perform read operations (but no updates).
- Later, when connecting again, updates are propagated from the primary to the backups, bringing the data store into a consistent state again.

Local-Write Protocols
Example of a primary-backup protocol with local writes: distributed file systems that require a high degree of fault tolerance.
- There is a fixed central server through which all write operations normally take place.
- The server temporarily allows one of the replicas to perform a series of local updates to speed up performance.
- When the replica server is done, the updates are propagated to the central server, from where they are distributed to the other replica servers.

Replicated-Write Protocols
- Write operations can be carried out at multiple replicas instead of only one.
- A distinction can be made between:
  - active replication, in which an operation is forwarded to all replicas, and
  - majority voting (quorum-based protocols).

Active Replication
- Each replica has an associated process that carries out update operations.
- Operations need to be carried out in the same order everywhere, which requires a totally ordered multicast mechanism.
- Such a multicast can be implemented using Lamport's logical clocks. Disadvantage: this implementation of multicasting does not scale well in large distributed systems.
- Alternatively, total ordering can be achieved using a central coordinator, also called a sequencer.

Active Replication
- Each operation is first forwarded to the sequencer, which assigns it a unique sequence number and subsequently forwards the operation to all replicas.
- Operations are carried out in the order of their sequence numbers.
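
A sequencer-based total order can be sketched as follows. The sketch separates stamping from delivery so that out-of-order arrival can be simulated; the hold-back buffer, the class names, and the example operations are assumptions made for illustration.

```python
import itertools


class Replica:
    """Applies operations strictly in sequence-number order."""

    def __init__(self):
        self.value = 0
        self.next_seq = 1
        self.held_back = {}                    # seq -> operation that arrived early

    def receive(self, seq, op):
        self.held_back[seq] = op
        # Apply every operation whose turn has come.
        while self.next_seq in self.held_back:
            self.value = self.held_back.pop(self.next_seq)(self.value)
            self.next_seq += 1


class Sequencer:
    """Central coordinator handing out unique, increasing sequence numbers."""

    def __init__(self):
        self._counter = itertools.count(1)

    def stamp(self, op):
        return next(self._counter), op


sequencer = Sequencer()
m1 = sequencer.stamp(lambda v: v + 10)         # sequence number 1
m2 = sequencer.stamp(lambda v: v * 2)          # sequence number 2

a, b = Replica(), Replica()
for seq, op in (m1, m2):                       # a receives the messages in order
    a.receive(seq, op)
for seq, op in (m2, m1):                       # b receives them out of order
    b.receive(seq, op)

print(a.value, b.value)   # 20 20
```

Without the sequence numbers, replica `b` would have computed (0 * 2) + 10 = 10 and diverged from `a`.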

Quorum-Based Protocols
- A different approach to supporting replicated writes is to use voting.
- Clients must request and acquire the permission of multiple servers before either reading or writing a replicated data item.

Quorum-Based Protocols
How does the algorithm work?
- A file is replicated on N servers.
- To update a file, a client must first contact at least half the servers plus one (a majority) and get them to agree to do the update.
- Once they have agreed, the file is changed and a new version number is associated with the new file.

Quorum-Based Protocols
How does the algorithm work?
- To read a replicated file, a client must also contact at least half the servers plus one and ask them to send the version numbers associated with the file.
- If all the version numbers are the same, this must be the most recent version.
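
The majority-voting scheme described above can be sketched in Python. The sketch assumes that a reader takes the copy with the highest version number among the servers it contacted, which covers the case where the version numbers differ; the function names and the in-memory `Replica` class are hypothetical, and failures and concurrent clients are ignored.

```python
class Replica:
    def __init__(self):
        self.version = 0
        self.data = None


def majority(n):
    """Half the servers plus one."""
    return n // 2 + 1


def write(replicas, contacted, data):
    if len(contacted) < majority(len(replicas)):
        raise ValueError("a write needs a majority of the servers")
    new_version = max(replicas[i].version for i in contacted) + 1
    for i in contacted:
        replicas[i].version = new_version
        replicas[i].data = data
    return new_version


def read(replicas, contacted):
    if len(contacted) < majority(len(replicas)):
        raise ValueError("a read needs a majority of the servers")
    newest = max((replicas[i] for i in contacted), key=lambda r: r.version)
    return newest.version, newest.data


replicas = [Replica() for _ in range(5)]
write(replicas, [0, 1, 2], "draft")            # version 1 on servers 0, 1, 2
write(replicas, [2, 3, 4], "final")            # version 2 on servers 2, 3, 4
print(read(replicas, [0, 1, 2]))               # (2, 'final')
```

Any two majorities of five servers share at least one server, which is why the reader sees version 2 even though servers 0 and 1 still hold version 1.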

Quorum-Based Protocols
- To read a file of which $N$ replicas exist, a client needs to assemble a read quorum, an arbitrary collection of any $N_R$ servers or more.
- To modify a file, a write quorum of at least $N_W$ servers is required.
- The values of $N_R$ and $N_W$ are subject to the following two constraints:
  $$N_R + N_W > N \quad \text{and} \quad N_W > N/2$$
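
The two constraints are easy to check mechanically. The first one forces every read quorum to overlap every write quorum, which prevents read-write conflicts; the second forces any two write quorums to overlap, which prevents write-write conflicts. The helper name and the three configurations with N = 12 are illustrative assumptions.

```python
def valid_quorum(n, n_r, n_w):
    """True when N_R + N_W > N and N_W > N / 2."""
    return n_r + n_w > n and 2 * n_w > n


for n_r, n_w in [(3, 10), (7, 6), (1, 12)]:
    print(n_r, n_w, valid_quorum(12, n_r, n_w))
# 3 10 True
# 7 6 False
# 1 12 True
```

The middle configuration fails because two disjoint groups of six servers could each accept a different write. The last one is read one, write all (ROWA).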

Quorum-Based Protocols
Figure 7-22. Three examples of the voting algorithm. (a) A correct choice of read and write set. (b) A choice that may lead to write-write conflicts. (c) A correct choice, known as ROWA (read one, write all).

Cache-Coherence Protocols
- Cache-coherence protocols ensure that a cache is consistent with the server-initiated replicas.
- Coherence detection strategy: when inconsistencies are detected.
- Coherence enforcement strategy: how caches are kept consistent.

Cache-Coherence Protocols
- Caches are controlled by clients instead of servers.
- Much research has gone into the design and implementation of caches.
- Caching solutions differ in their coherence detection strategy, that is, when inconsistencies are detected.
- Dynamic solutions: inconsistencies are detected at runtime. For example, a check is made with the server to see whether the cached data have been modified since they were cached.
- In the case of distributed databases, dynamic detection-based protocols can be further classified by considering exactly when during a transaction the detection is done.

Implementing Client-Centric Consistency
A naive implementation:
- Each write operation W is assigned a globally unique identifier. The identifier is assigned by the server to which the write was submitted (the origin of W).
- For each client, we keep track of two sets of writes:
  - The read set consists of the writes relevant for the read operations performed by the client.
  - The write set consists of the identifiers of the writes performed by the client.

Implementing Client-Centric Consistency
Monotonic-read consistency:
- When a client performs a read operation at a server, that server is handed the client's read set to check whether all the identified writes have taken place locally. (The size of the set may introduce a performance problem.)
- If not, the server contacts the other servers to ensure that it is brought up to date before carrying out the read operation.
- Alternatively, the read operation is forwarded to a server where the write operations have already taken place.
- After the read operation is performed, the write operations that have taken place at the selected server and that are relevant for the read operation are added to the client's read set.
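
The read-set check can be sketched as below. The sketch makes simplifying assumptions that are not in the slides: write identifiers are integers whose numeric order is the order in which writes must be applied, a write is "relevant" for a read when it touched the same key, and servers fetch missing writes directly from each other's in-memory state. All names are hypothetical.

```python
class Server:
    def __init__(self):
        self.applied = {}                      # write id -> (key, value)
        self.store = {}

    def apply(self, wid, key, value):
        if wid not in self.applied:
            self.applied[wid] = (key, value)
            self.store[key] = value

    def read(self, key, read_set, others):
        """Monotonic read: catch up on the client's read set, then read."""
        for wid in sorted(read_set - set(self.applied)):
            for other in others:
                if wid in other.applied:
                    self.apply(wid, *other.applied[wid])
                    break
        # Writes at this server that are relevant for the read join the read set.
        relevant = {w for w, (k, _) in self.applied.items() if k == key}
        return self.store.get(key), read_set | relevant


s1, s2 = Server(), Server()
s1.apply(1, "x", "old")
s2.apply(1, "x", "old")
s1.apply(2, "x", "new")                        # not yet propagated to s2

value, read_set = s1.read("x", set(), [s2])
value2, read_set = s2.read("x", read_set, [s1])
print(value, value2, sorted(read_set))         # new new [1, 2]
```

Without the read set, the second read at `s2` would have returned the older value, which is exactly what monotonic-read consistency rules out.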

Implementing Client-Centric Consistency
Monotonic-write consistency:
- When a client initiates a new write operation at a server, the server is handed the client's write set.
- The server then ensures that the identified write operations are performed first and in the correct order.
- After the new operation has been performed, its write identifier is added to the write set.
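
A corresponding sketch for monotonic writes follows. It assumes that the client keeps its write set as an ordered list, so that "the correct order" is simply list order, and that a missing write can always be fetched from some other server (the `next(...)` call raises `StopIteration` otherwise). The class and identifiers are hypothetical.

```python
class Server:
    def __init__(self):
        self.applied = {}                      # write id -> (key, value), in order
        self.store = {}

    def apply(self, wid, key, value):
        if wid not in self.applied:
            self.applied[wid] = (key, value)
            self.store[key] = value

    def write(self, wid, key, value, write_set, others):
        """Monotonic write: apply the client's earlier writes first, in order."""
        for prev in write_set:
            if prev in self.applied:
                continue
            source = next(o for o in others if prev in o.applied)
            self.apply(prev, *source.applied[prev])
        self.apply(wid, key, value)
        return write_set + [wid]               # the new write joins the write set


s1, s2 = Server(), Server()
write_set = s1.write("w1", "x", "first", [], [s2])
write_set = s2.write("w2", "x", "second", write_set, [s1])
print(list(s2.applied), s2.store["x"])         # ['w1', 'w2'] second
```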

Implementing Client-Centric Consistency
Read-your-writes consistency:
- The server where the read operation is performed must have seen all the write operations in the client's write set.
- The writes can be fetched from other servers before the read operation is performed.

Implementing Client-Centric Consistency
Writes-follow-reads consistency:
- The selected server is first brought up to date with the write operations in the client's read set.
- Afterwards, the identifier of the new write operation is added to the write set, along with the identifiers in the read set.

Improving Efficiency
- The read set and write set associated with each client can become very large.
- Solution: a client's read and write operations are grouped into sessions.
- A session is typically associated with an application: it is opened when the application starts and is closed when it exits.
- Whenever a client closes a session, the sets are cleared.

Summary
- Consistency protocols describe specific implementations of consistency models.
- A distinction can be made between primary-based protocols and replicated-write protocols.
- In primary-based protocols, all update operations are forwarded to a primary copy that subsequently ensures the update is properly ordered and forwarded.
- In replicated-write protocols, an update is forwarded to several replicas at the same time.
