
1 Consistency And Replication
Chapter 10 Consistency And Replication

2 Topics
- Motivation
- Data-centric consistency models
- Client-centric consistency models
- Distribution protocols
- Consistency protocols

3 Motivation
Make copies of services on multiple sites to improve:
- Reliability (by redundancy): if the primary FS crashes, the standby FS still works
- Performance: increase processing power, reduce communication delays
- Scalability: prevent overloading a single server (size scalability), avoid communication latencies (geographic scale)
However, updates become more complex: when, who, where and how should updates be propagated?

4 Concurrency Control on Remote Objects
(a) A remote object capable of handling concurrent invocations on its own.
(b) A remote object for which an object adapter is required to handle concurrent invocations.

5 Object Replication
(a) A distributed system for replication-aware distributed objects.
(b) A distributed system responsible for replica management.

6 Distributed Data Store
Client's point of view: the data store appears as a single store capable of holding its data.

7 Distributed Data Store
Data store's point of view: general organization of a logical data store, physically distributed and replicated across multiple processes.

8 Operations on a Data Store
Read: ri(x)b — client/process Pi performs a read on data item x, and it returns the value b.
Write: wi(x)a — client/process Pi performs a write on data item x, setting it to the new value a.
Operations are not instantaneous; distinguish:
- Time of issue (when the request is sent by the client)
- Time of execution (when the request is executed at a replica)
- Time of completion (when the reply is received by the client)
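
To make the notation concrete, here is a minimal Python sketch of these read/write operations against a single replica; the class name and timing fields are illustrative only, not part of the original slides.

    import time

    class Replica:
        """One physical copy of the logical data store (illustrative)."""
        def __init__(self):
            self.store = {}                  # data item -> value

        def write(self, pid, x, a):          # w_i(x)a
            t_issue = time.time()            # time of issue (request sent)
            self.store[x] = a                # time of execution (at replica)
            t_completion = time.time()       # time of completion (reply back)
            return t_issue, t_completion

        def read(self, pid, x):              # r_i(x)b: returns the value b
            return self.store.get(x)

    r = Replica()
    r.write(1, "x", "a")                     # P1 performs w1(x)a
    assert r.read(2, "x") == "a"             # P2 performs r2(x)a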

9 Example

10 Consistency Models
Defines which interleavings of operations are valid (admissible).
There are different levels of consistency: strong (strict, tight) and weak (loose).
A consistency model:
- is concerned with the consistency of a data store
- specifies the characteristics of valid orderings of operations
A data store that implements a particular consistency model provides a total ordering of operations that is valid according to this model.

11 Consistency Models
Data-centric models:
- Describe the consistency experienced by all clients
- Clients P1, P2, P3, … see the same kind of ordering
Client-centric models:
- Describe the consistency seen only by clients who request it
- Clients P1, P2, P3 may see different kinds of orderings

12 Data-Centric Consistency Models
Strong ordering: strict consistency, linear consistency, sequential consistency, causal consistency, FIFO consistency
Weak ordering: weak consistency, release consistency, entry consistency

13 Strict Consistency
Definition: A DDS (distributed data store) is strictly consistent if any read on a data item x of the DDS returns the value corresponding to the result of the most recent write on x, regardless of the location of the processes doing the read or write.
Analysis:
1. In a single-processor system strict consistency comes for free; it is exactly the behavior of local shared memory with atomic reads/writes.
2. However, it is hard to establish a global time to determine what the "most recent" write is.
3. Due to message transfer delays this model is not achievable in a distributed system.

14 Example
Behavior of two processes operating on the same data item.
(a) A strictly consistent store. (b) A store that is not strictly consistent.

15 Strict Consistency Problems
Assumption: y = 0 is stored on node 2; P1 and P2 are processes on nodes 1 and 2.
Due to message delays, r(y) at t = t2 may return 0 or 1, and at t = t4 it may return 0, 1 or 2.
Furthermore: if y migrates to node 1 between t2 and t3, then r(y) issued at time t2 may even return the value 2 (i.e. "back to the future").

16 Sequential Consistency (1)
Definition: A DDS offers sequential consistency if all processes see the same order of accesses to the DDS, whereby reads/writes of each individual process occur in program order, and reads/writes of different processes are performed in some sequential order.
Analysis:
1. Sequential consistency is weaker than strict consistency.
2. Any valid permutation of accesses is allowed iff all tasks see the same permutation → two runs of a distributed application may produce different results.
3. No global time ordering is required.
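
The second point can be demonstrated with a small brute-force sketch (hypothetical code, not from the slides): it enumerates every interleaving of two tiny programs that preserves each process's program order, replays each against one store, and shows that more than one outcome is sequentially consistent.

    # P1: w1(x)1          P2: r2(x)?, w2(x)2
    P1 = [("w", "x", 1)]
    P2 = [("r", "x", None), ("w", "x", 2)]

    def interleavings(a, b):
        # all merges of a and b that keep each list's internal order
        if not a: yield list(b); return
        if not b: yield list(a); return
        for rest in interleavings(a[1:], b): yield [a[0]] + rest
        for rest in interleavings(a, b[1:]): yield [b[0]] + rest

    results = set()
    for run in interleavings(P1, P2):
        store, reads = {"x": 0}, []
        for op, x, v in run:
            if op == "w": store[x] = v
            else: reads.append(store[x])
        results.add(tuple(reads))
    print(results)   # {(0,), (1,)} -> two runs may yield different results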

17 Example
Each task sees all writes in the same order, even though the store is not strictly consistent.

18 Non-Sequential Consistency

19 Linear Consistency
Definition: A DDS is said to be linearly consistent (linearizable) when each operation is time-stamped and the following holds: the result of each execution is the same as if the (read and write) operations by all processes on the DDS were executed in some sequential order, and the operations of each individual process appear in this sequence in the order specified by its program. In addition, if TS(OP1(x)) < TS(OP2(y)), then operation OP1(x) should precede OP2(y) in this sequence.

20 Assumption
Each operation is assumed to receive a time stamp from a globally available clock with only finite precision, e.g. a set of loosely synchronized local clocks.
Linear consistency is stricter than sequential consistency, i.e. a linearly consistent DDS is also sequentially consistent.
With linear consistency, not every valid interleaving of reads and writes is allowed any more: the ordering must also obey the order implied by the time stamps of these operations.

21 Causal Consistency (1)
Definition: A DDS provides causal consistency if the following condition holds: writes that are potentially causally related* must be seen by all tasks in the same order. Concurrent writes may be seen in a different order on different machines.
* If event B is caused or influenced by an earlier event A, causality requires that everyone else also sees first A, and then B.

22 Causal Consistency (2)
Definition: write2 is potentially dependent on write1 when there is a read between these two writes that may have influenced write2.
Corollary: If write2 is potentially dependent on write1, the only correct sequence is: write1 before write2.

23 Causal Consistency: Example
This sequence is allowed with a causally-consistent store, but not with a sequentially or strictly consistent store.

24 Causal Consistency: Example
(a) A violation of a causally-consistent store.
(b) A correct sequence of events in a causally-consistent store.

25 Implementation
Implementing causal consistency requires keeping track of which processes have seen which writes: construction and maintenance of a dependency graph expressing which operations are causally related (using vector time stamps).
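
A common building block here is the vector time stamp. The following Python sketch (illustrative names only) shows how a vector clock decides whether two writes are causally related or concurrent.

    class VectorClock:
        def __init__(self, n, pid):
            self.v = [0] * n                  # one counter per process
            self.pid = pid

        def stamp_write(self):                # called when issuing a write
            self.v[self.pid] += 1
            return list(self.v)

        def observe(self, other):             # called after reading a remote write
            self.v = [max(a, b) for a, b in zip(self.v, other)]

    def causally_before(ts1, ts2):
        # write1 -> write2 iff ts1 <= ts2 componentwise and ts1 != ts2;
        # otherwise the two writes are concurrent and may be reordered
        return all(a <= b for a, b in zip(ts1, ts2)) and ts1 != ts2

    c1, c2 = VectorClock(2, 0), VectorClock(2, 1)
    w1 = c1.stamp_write()            # P1 writes: stamp [1, 0]
    c2.observe(w1)                   # P2 reads that write
    w2 = c2.stamp_write()            # P2 then writes: stamp [1, 1]
    assert causally_before(w1, w2)   # all replicas must order w1 before w2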

26 FIFO or PRAM Consistency
Definition: A DDS implements FIFO consistency when all writes of one process are seen in the same order by all other processes, i.e. they are received by all other processes in the order they were issued. However, writes from different processes may be seen in a different order by different processes.
Corollary: Writes of different processes are treated as concurrent.
Implementation: Tag each write operation of every process with (PID, sequence number), as in the sketch below.
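
A minimal sketch of that implementation idea, assuming reliable but possibly reordered delivery (all names hypothetical): each replica applies the writes of every sender strictly in sequence-number order, buffering anything that arrives early.

    class FifoReplica:
        def __init__(self):
            self.store = {}
            self.next_seq = {}   # pid -> next expected sequence number
            self.pending = {}    # pid -> {seq: (data item, value)}

        def deliver(self, pid, seq, x, value):
            self.pending.setdefault(pid, {})[seq] = (x, value)
            expected = self.next_seq.setdefault(pid, 0)
            # apply every buffered write that is next in this sender's order
            while expected in self.pending[pid]:
                item, val = self.pending[pid].pop(expected)
                self.store[item] = val
                expected += 1
            self.next_seq[pid] = expected

    r = FifoReplica()
    r.deliver(1, 1, "x", "b")    # arrives out of order: buffered
    r.deliver(1, 0, "x", "a")    # now both are applied, a before b
    assert r.store["x"] == "b"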

27 Example
Both writes are seen by processes P3 and P4 in a different order. They still obey FIFO consistency, but not causal consistency, because write2 is dependent on write1.

28 Example (2)
Two concurrent processes with variables x = y = 0:

    Process P1:              Process P2:
      x = 1;                   y = 1;
      if (y == 0)              if (x == 0)
        print("A");              print("B");

Possible results: A, B, Nil, AB?

29 Synchronization Variable
Background: it is not necessary to propagate intermediate writes.
A synchronization variable S:
- is associated with a single operation, synchronize(S)
- synchronizes all local copies of the data store

30 Compilation Optimization

    int a, b, c, d, e, x, y;     /* variables */
    int *p, *q;                  /* pointers */
    int f(int *p, int *q);       /* function prototype */

    a = x * x;                   /* a stored in register */
    b = y * y;                   /* b as well */
    c = a*a*a + b*b + a*b;       /* used later */
    d = a * a * c;               /* used later */
    p = &a;                      /* p gets address of a */
    q = &b;                      /* q gets address of b */
    e = f(p, q);                 /* function call */

A program fragment in which some variables may be kept in registers.

31 Weak Consistency
Definition: A DDS implements weak consistency if the following hold:
1. Accesses to synchronization variables obey sequential consistency.
2. No access to a synchronization variable is allowed to be performed until all previous writes have completed everywhere.
3. No data access (read or write) is allowed to be performed until all previous accesses to synchronization variables have been performed.

32 Interpretation
A synchronization variable S supports just one operation, synchronize(S), which is responsible for all local replicas of the data store.
Whenever a process calls synchronize(S), its local updates are propagated to all replicas of the DDS, and the updates of the other processes are brought into its local replica of the DDS.
All tasks see all accesses to synchronization variables in the same order.
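
A toy sketch of this behavior (illustrative names; a real DDS would also have to order the synchronize operations themselves):

    class WeakReplica:
        def __init__(self):
            self.store = {}       # this replica's view of the data
            self.pending = {}     # local writes not yet propagated
            self.peers = []

        def write(self, x, v):    # intermediate writes stay local
            self.store[x] = v
            self.pending[x] = v

        def synchronize(self):    # synchronize(S)
            for p in self.peers:
                p.store.update(self.pending)   # push own updates out ...
                self.store.update(p.pending)   # ... and pull theirs in
            self.pending.clear()

    a, b = WeakReplica(), WeakReplica()
    a.peers, b.peers = [b], [a]
    a.write("x", 1)
    print(b.store.get("x"))       # None: b has not been synchronized yet
    a.synchronize()
    print(b.store["x"])           # 1: the update is now visible everywhere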

33 Interpretation (2)
No data access is allowed until all previous accesses to synchronization variables have been completed.
By doing a synchronize before reading shared data, a task can be sure of getting up-to-date values.
Unlike the previous consistency models, weak consistency forces the programmer to group critical operations together.

34 Example
Via synchronization you can enforce that you get up-to-date values. Each process must synchronize if its writes are to be seen by the others. A process issuing a read without any synchronization may get out-of-date values.

35 Non-weak Consistency

36 Release Consistency
Problem with weak consistency: when a synchronization variable is accessed, the DDS does not know whether this is done because the process has finished writing the shared variables or because it is about to read them. It must therefore take the actions required in both cases: making sure that all locally initiated writes have been completed (i.e. propagated to all other machines), as well as gathering in all writes from other machines.
Solution: provide two separate operations, acquire and release.

37 Details
Idea: Distinguish between memory accesses in front of a critical section (acquire) and those behind a critical section (release).
Implementation: When a release is done, all protected data that have been updated within the critical section are propagated to all replicas.

38 Definition
Definition: A DDS offers release consistency if the following three conditions hold:
1. Before a read or write operation on shared data is performed, all previous acquires done by the process must have completed successfully.
2. Before a release is allowed to be performed, all previous reads and writes by the process must have been completed.
3. Accesses to synchronization variables are FIFO consistent.
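
The following Python sketch illustrates the eager variant under these rules (hypothetical names; a real implementation would also handle lock contention and condition 3):

    class RCReplica:
        def __init__(self):
            self.store = {}
            self.dirty = {}            # protected data written in the CS
            self.in_cs = False

        def acquire(self, peers):
            # condition 1: bring the protected data up to date first
            for p in peers:
                self.store.update(p.store)
            self.in_cs = True

        def write(self, x, v):
            assert self.in_cs          # writes happen inside the CS
            self.store[x] = v
            self.dirty[x] = v

        def release(self, peers):
            # condition 2: eagerly push all protected updates out
            for p in peers:
                p.store.update(self.dirty)
            self.dirty.clear()
            self.in_cs = False

    r1, r2 = RCReplica(), RCReplica()
    r1.acquire([r2]); r1.write("x", 1); r1.release([r2])
    assert r2.store["x"] == 1          # visible to r2 after r1's release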

39 Example
A valid event sequence for release consistency, even though P3 failed to use acquire and release.
Remark: Acquire is more than a lock or enter_critical_section; it waits until all updates on the protected data from other nodes have been propagated to its local replica before it enters the critical section.

40 Lazy Release Consistency
Problem with "eager" release consistency: when a release is done, the process doing the release pushes out all the modified data to all processes that already have a copy and thus might potentially read them in the future. There is no way to tell whether any of the target machines will ever use these updated values → the eager solution is rather inefficient and causes too much overhead.

41 Details
With "lazy" release consistency nothing is done at a release.
However, at the next acquire the processor determines whether it already has all the data it needs. Only when it needs updated data does it send messages to those places where the data were changed in the past. Time stamps help to decide whether a data item is outdated.

42 Entry Consistency
Unlike release consistency, entry consistency requires each ordinary shared variable to be protected by a synchronization variable.
When an acquire is done on a synchronization variable, only those ordinary shared variables guarded by that synchronization variable are made consistent.
A list of shared variables may be assigned to a single synchronization variable (to reduce overhead).

43 How to Synchronize?
Every synch-variable has a current owner:
- The owner may enter and leave critical sections protected by this synchronization variable as often as needed, without sending any coordination message to the others.
- A process wanting to acquire a synchronization variable has to send a message to the current owner.
- The current owner hands over the synch-variable together with the updated values of its previous writes.
- Multiple concurrent reads are possible in non-exclusive (read) mode.
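
A sketch of this ownership protocol (names are illustrative): the synch-variable travels to the acquiring process together with the current values of the variables it guards.

    class SyncVariable:
        def __init__(self, owner, guarded):
            self.owner = owner            # pid of the current owner
            self.guarded = dict(guarded)  # the shared variables it protects

        def acquire(self, requester):
            if requester != self.owner:
                # one message to the current owner; the reply hands over
                # ownership together with the up-to-date guarded values
                self.owner = requester
            return dict(self.guarded)

        def release(self, pid, updates):
            assert pid == self.owner
            self.guarded.update(updates)  # updates stay with the variable

    s = SyncVariable(owner=1, guarded={"x": 0})
    view = s.acquire(2)                   # ownership migrates from P1 to P2
    s.release(2, {"x": 42})               # P2's writes travel with s
    assert s.acquire(1)["x"] == 42        # P1 sees them on its next acquire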

44 Example A valid event sequence for entry consistency

45 Summary of Consistency Models

(a) Consistency models not using synchronization operations:
Consistency     | Description
Strict          | Absolute time ordering of all shared accesses matters.
Linearizability | All processes must see all shared accesses in the same order. Accesses are furthermore ordered according to a (nonunique) global timestamp.
Sequential      | All processes see all shared accesses in the same order. Accesses are not ordered in time.
Causal          | All processes see causally-related shared accesses in the same order.
FIFO            | All processes see writes from each other in the order they were issued. Writes from different processes may not always be seen in that order.

(b) Models with synchronization operations:
Weak            | Shared data can be counted on to be consistent only after a synchronization is done.
Release         | Shared data are made consistent when a critical region is exited.
Entry           | Shared data pertaining to a critical region are made consistent when a critical region is entered.

46 Up to Now
- System-wide consistent view of the DDS, independent of the number of involved processes
- Mutually exclusive atomic operations on the DDS
- Processes access only local copies; propagation of updates has to take place whenever it is necessary to fulfill the requirements of the consistency model
Are there still weaker consistency models?

47 Client-Centric Consistency
Provides guarantees about the ordering of operations only for a single client:
- The effect of an operation depends on the client performing it
- The effect also depends on the history of the client's operations
- Applied only when requested by the client
- No guarantees concerning concurrent accesses by different clients
Assumption: clients can access different replicas, e.g. mobile users.

48 Mobile Users The principle of a mobile user accessing different replicas of a distributed database.

49 Eventual Consistency
If no updates occur for a long period of time, all replicas gradually become consistent.
Requirements:
- Few read/write conflicts
- No write/write conflicts
- Clients can accept temporary inconsistency
Examples:
- DNS: updates propagate slowly (1–2 days) to all caches
- WWW: few write/write conflicts; mirrors are eventually updated; cached copies (browser or proxy) are eventually replaced

50 Client-Centric Consistency Models
- Monotonic Reads
- Monotonic Writes
- Read Your Writes
- Writes Follow Reads

51 Monotonic Reads
Definition: A DDS provides monotonic-read consistency if the following holds: if a process P reads the value of a data item x, any successive read operation on x by that process will always return the same value or a more recent one (independently of the replica at location L where this new read is done).

52 Example System
A distributed database with distributed and replicated user mailboxes: e-mails can be inserted at any location, but updates are propagated in a lazy (i.e. on-demand) fashion.

53 Example
The read operations performed by a single process P at two different local copies of the same data store.
(a) A monotonic-read consistent data store.
(b) A data store that does not provide monotonic reads.

54 Monotonic Writes
Definition: A DDS provides monotonic-write consistency if the following holds: a write operation by process P on data item x is completed before any successive write operation on x by the same process P can take place.
Remark: Monotonic writes ~ FIFO consistency, but it only applies to the writes of one client process P. Different clients, not requiring monotonic writes, may see the writes of process P in any order.

55 Example
The write operations performed by a single process P at two different local copies of the same data store.
(a) A monotonic-write consistent data store.
(b) A data store that does not provide monotonic-write consistency.

56 Read Your Writes
Definition: A DDS provides read-your-writes consistency if the following holds: the effect of a write operation by a process P on a data item x at a location L will always be seen by a successive read operation by the same process.
Examples of missing read-your-writes consistency:
- Updating a website with an editor: if you want to view your updated website, you have to refresh the page, otherwise the browser shows the old cached content.
- Updating passwords.

57 Example
(a) A data store that provides read-your-writes consistency.
(b) A data store that does not.

58 Writes Follow Reads
Definition: A DDS provides writes-follow-reads consistency if the following holds: a write operation by a process P on a data item x, following a previous read by the same process, is guaranteed to take place on the same or a more recent value of x than the one that was read before.

59 Example
(a) A writes-follow-reads consistent data store.
(b) A data store that does not provide writes-follow-reads consistency.

60 Implementing Client-Centric Consistency
Naive implementation (ignoring performance):
- Each write gets a globally unique identifier (WID).
- The identifier is assigned by the server that accepts the write operation for the first time.
- For each client C, two sets of write identifiers are maintained:
  Read set RS(C): the write IDs relevant for the reads of client C
  Write set WS(C): the write IDs of the writes performed by client C

61 Implementing Monotonic Reads
When a client C performs a read at server S, that server is handed the client's read set RS(C) to check whether all identified writes have already taken place locally at server S. If not, the server has to be updated before reading! A sketch follows below.
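
A naive sketch of that check (hypothetical names; fetch_missing stands for some way of obtaining a write from wherever it was originally performed):

    class Server:
        def __init__(self):
            self.applied = set()          # write IDs already performed here
            self.store = {}

        def apply(self, wid, x, v):
            self.applied.add(wid)
            self.store[x] = v

    def monotonic_read(server, read_set, x, fetch_missing):
        # bring the server up to date before serving the read
        for wid in read_set - server.applied:
            server.apply(wid, *fetch_missing(wid))
        return server.store.get(x)

    s1, s2 = Server(), Server()
    s1.apply("w1", "x", "new")
    rs = {"w1"}                           # client C has read the result of w1
    # later, the same client reads at a different replica:
    value = monotonic_read(s2, rs, "x", lambda wid: ("x", "new"))
    assert value == "new"                 # never older than what C saw before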

62 Implementing Monotonic Writes
If a client initiates a write at a server S, this server is handed the client's write set WS(C) in order to bring S up to date first. The new write is then performed according to the time-stamped WIDs, and afterwards the client's write set is extended with this new write.
The response time of a client may thus increase with an ever-growing write set. So what do we do when the read and write sets of a client grow larger and larger?

63 Improving Efficiency with RS and WS
Major drawback: the potential sizes of the read and write sets.
Solution: group all read and write operations of a client into a so-called session (usually associated with an application). Every time a client closes its current session, all updates are propagated and the sets are deleted afterwards.

64 Summary on Consistency Models
Choosing the right consistency model requires an analysis of the following trade-offs:
Consistency and redundancy:
- All replicas must be consistent
- All replicas must contain the full state
- Reduced consistency → reduced reliability
Consistency and performance:
- Consistency requires extra work and extra communication
- This may result in a loss of overall performance

65 Distribution Protocols
Replica placement: permanent replicas, server-initiated replicas, client-initiated replicas
Update propagation: state versus operations, pull versus push protocols, unicasting versus multicasting
Epidemic protocols: update propagation models, removing data

66 Replica Placement The logical organization of different kinds of copies of a data store into three concentric rings.

67 Replica Placement
Permanent replicas:
- The initial set of replicas, created and maintained by the DDS owner(s)
- Writes are allowed
- E.g., web mirrors
Server-initiated replicas:
- Enhance performance
- Not maintained by the owner of the DDS
- Placed close to groups of clients, either manually or dynamically
Client-initiated replicas:
- Client caches; temporary
- The owner is not aware of the replica
- Placed closest to a client and maintained by its host (often the client itself)

68 Update Propagation

69 What to Be Propagated?
Propagate only a notification of an update ("invalidation"):
- Typical for invalidation protocols
- May include information about which part of the DDS has been updated
- Works best when the read-to-write ratio is low
Propagate the updated data from one replica to another:
- Works best when the read-to-write ratio is high
- Several updates may be aggregated before sending them across the network
Propagate the update operation to the other replicas ("active replication"):
- Works well if the size of the parameters associated with each operation is small compared to the updated data

70 Pull versus Push Protocols
Push protocols: updates are propagated to other replicas without those replicas having asked for them.
- Used between permanent and server-initiated replicas, i.e. to achieve a relatively high degree of consistency.
Pull protocols: a server (or a client) asks another server to provide updates.
- Used by client caches: e.g., when a client requests a web page that has not been refreshed for a longer period of time, the cache may check with the original web site whether updates have been made.
- Efficient when the read-to-write ratio is relatively low.

71 Pull versus Push Protocols

Issue                   | Push-based                                | Pull-based
State of server         | List of client replicas and caches        | None
Messages sent           | Update (and possibly fetch update later)  | Poll and update
Response time at client | Immediate (or fetch-update time)          | Fetch-update time

A comparison between push-based and pull-based protocols in the case of multiple-client, single-server systems.

72 Unicasting
Potential overhead with unicasting in a LAN. Good for a pull-based approach.

73 Multicasting
With multicasting an update message can be propagated more efficiently across a LAN. Good for a push-based approach.

74 Epidemic Protocols
Implementations of eventual consistency may rely on epidemic protocols. No guarantees of absolute consistency are given, but after some time epidemic protocols will have spread the updates to all replicas.
Notions:
- An infective server holds a replica with an update that it is willing to spread to other servers.
- A susceptible server is one that has not yet been infected, i.e. updated.
- A removed server is one that is not willing to propagate the update any further.

75 Anti-Entropy Protocol
Server P picks another server Q at random and subsequently exchanges updates with Q. There are three approaches to exchanging updates:
- P only pushes its own updates to Q
- P only pulls in new updates from Q
- P and Q exchange their updates with each other, i.e. a push-pull approach (sketched below)
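
A minimal simulation of the push-pull variant (illustrative names): after enough random rounds, every server ends up with every update, which is exactly the eventual-consistency guarantee.

    import random

    class GossipServer:
        def __init__(self):
            self.updates = {}             # data item -> latest value seen

    def anti_entropy_round(servers):
        p = random.choice(servers)
        q = random.choice([s for s in servers if s is not p])
        merged = {**p.updates, **q.updates}   # push-pull: both directions
        p.updates, q.updates = dict(merged), dict(merged)

    servers = [GossipServer() for _ in range(8)]
    servers[0].updates["x"] = 42          # one server starts out infective
    while any("x" not in s.updates for s in servers):
        anti_entropy_round(servers)       # terminates with probability 1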

76 Gossip Protocols
Rumor spreading or gossiping works as follows: if server P has just been updated for data item x, it contacts another arbitrary server Q and tries to push its new update of x to Q. However, if Q has already received this update from some other server, P is so disappointed that it stops gossiping with probability 1/k.

77 Gossip Protocols (2)
Although gossiping works quite well on average, you cannot guarantee that every server will be updated. In a DDS with a "large" number of replicas, the fraction s of servers remaining ignorant of an update (i.e., still susceptible) satisfies:

    s = e^(-(k+1)(1-s))
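
The fixed point of this equation is easy to evaluate numerically; a quick sketch:

    import math

    # Solve s = e^(-(k+1)(1-s)) by fixed-point iteration (it converges for
    # these k because the derivative (k+1)*s is < 1 at the small root).
    for k in (1, 2, 3, 4, 5):
        s = 0.5
        for _ in range(1000):
            s = math.exp(-(k + 1) * (1 - s))
        print(k, round(s, 5))
    # k=1 leaves about 20% of the servers susceptible; k=4 about 0.7%;
    # k=5 already drops the fraction to roughly 0.25%.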

78 Analysis of Epidemic Protocols
Advantage: scalability, due to the limited number of update messages.
Disadvantage: spreading the deletion of a data item is quite cumbersome, due to an unwanted side effect: suppose you have deleted data item x on server S; you may nevertheless receive an old copy of x again from some other server due to still-ongoing gossiping.

79 Consistency Protocols
Primary-based protocols: remote-write protocols, local-write protocols
Replicated-write protocols: active replication, quorum-based protocols

80 Primary-Based Protocols
Each data item x of a DDS has an associated primary server responsible for coordinating write operations on x. The primary can be:
- Fixed, i.e. a specific remote server (remote-write protocols)
- Dynamic, i.e. the primary migrates to the place of the next write (local-write protocols)

81 Remote-Write Protocols (1)
Primary-based remote-write protocol with a fixed server to which all read and write operations are forwarded.

82 Remote-Write Protocols (2)
The principle of the primary-backup protocol.

83 Local-Write Protocols (1)
Primary-based local-write protocol in which a single copy is migrated between processes.

84 Local-Write Protocols (2)
Primary-backup protocol in which the primary migrates to the process wanting to perform an update.

85 Replicated-Write Protocols
Writes can take place at multiple replicas, instead of only at a specific primary server.
Active replication:
- The operation is forwarded to all replicas
- Problem: making sure that all operations are carried out in the same order everywhere
- Further issues: scalability, replicated invocation
Majority voting:
- Before reading or writing, ask a subset of all replicas (quorum-based protocols)

86 Replicated Invocation for Active Replication

87 Solutions
(a) Forwarding an invocation request from a replicated object.
(b) Returning a reply to a replicated object.

88 Quorum-Based Protocols
Preliminaries: if a client wants to read or write, it must first request and acquire the permission of multiple servers.
Example: a DFS with file F replicated on N servers.
- If an update has to be made, the client must first contact half of the servers plus one and get them to agree to the update. Once they have agreed, file F gets a new version number F(x.y).
- To read file F, a client must also contact at least half of the servers and ask them to hand out the current version number of F.

89 Gifford's Quorum-Based Protocol
To read a file F, a client must assemble a read quorum, an arbitrary subset of NR servers. To write a file F, a write quorum of at least NW servers is required.
The following must hold:
A) NR + NW > N
B) NW > N/2
Condition A) prevents read-write conflicts; condition B) prevents write-write conflicts.
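
The constraints are easy to check mechanically. A small sketch; the concrete NR/NW values below follow the classic textbook example with N = 12 and match the three cases on the next slide.

    def valid_quorum(n, nr, nw):
        # A) NR + NW > N : every read quorum overlaps every write quorum
        # B) NW > N/2    : any two write quorums overlap
        return nr + nw > n and nw > n / 2

    N = 12
    print(valid_quorum(N, 3, 10))   # True:  a correct choice
    print(valid_quorum(N, 7, 6))    # False: NW <= N/2, write-write conflicts
    print(valid_quorum(N, 1, 12))   # True:  ROWA (read one, write all)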

90 Examples
Three examples of the voting algorithm:
(a) A correct choice of read and write sets
(b) A choice that may lead to write-write conflicts
(c) A correct choice, known as ROWA (read one, write all)

