Distributed Systems 2006 Retrofitting Reliability* *With material adapted from Ken Birman.


1 Distributed Systems 2006 Retrofitting Reliability* *With material adapted from Ken Birman

2 Plan
Tracking group membership: We’ll base it on 2PC and 3PC
Fault-tolerant multicast: We’ll use membership
Ordered multicast: We’ll base it on fault-tolerant multicast
Tools for solving practical replication and availability problems: We’ll base them on ordered multicast
Robust Web Services: We’ll build them with these tools
2PC and 3PC: Our first “tools” (lowest layer)

3 Distributed Shared Memory
A new goal: software Distributed Shared Memory (DSM)
– Looks like a memory-mapped file (cf. Linux mmap)
Data is automatically replicated, so all distributed processes see identical memory content

4 So what’s the model?
Application “maps” a region of memory
While running, it sometimes
– Acquires a read or write lock for memory
– Then for a period of time reads or writes some part of the DSM (some “pages”)
– Then releases the lock
(This is just our distributed replication model in a new form…)
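The acquire, access, release cycle above can be sketched in a few lines of Python. This is only a local stand-in under stated assumptions: the `Region` class, its `write_lock` method, and the 4096-byte `PAGE_SIZE` are hypothetical names chosen for illustration, and a plain thread lock takes the place of the distributed lock. No replication happens here.

```python
import threading
from contextlib import contextmanager

PAGE_SIZE = 4096  # assumed page size, for illustration only


class Region:
    """Local stand-in for a mapped DSM region (hypothetical, no replication)."""

    def __init__(self, num_pages):
        self.memory = bytearray(num_pages * PAGE_SIZE)
        self._lock = threading.Lock()  # stands in for the distributed lock

    @contextmanager
    def write_lock(self):
        # Acquire the lock, hand the memory to the caller, always release.
        self._lock.acquire()
        try:
            yield self.memory
        finally:
            # In a real DSM, release is the point where updates would be sent.
            self._lock.release()


region = Region(num_pages=2)
with region.write_lock() as mem:
    mem[0:5] = b"hello"  # write to part of the first page while holding the lock
```

The context manager makes the release unconditional, which matters because release is the moment a DSM would propagate updates.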

5 To implement this DSM…
We need a way to
– Implement the mapping
– Detect that a page has become dirty
– Invoke our communication primitives when a lock is requested or released
Idea for Linux
– Use the Linux mapped file primitives and build a DSM “daemon” to send updates
– Intercept Linux semaphore operations for synchronization
So, for us, it reduces to how to handle replication…
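One simple way to detect that a page has become dirty is to compare the region against a snapshot taken when the lock was acquired. The slides do not say how detection is done, so this is an assumption made for illustration; `dirty_pages` and `PAGE_SIZE` are hypothetical names.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def dirty_pages(snapshot, current, page_size=PAGE_SIZE):
    """Return indices of pages whose bytes differ between snapshot and current."""
    if len(snapshot) != len(current):
        raise ValueError("snapshot and current must have the same length")
    dirty = []
    for start in range(0, len(current), page_size):
        if snapshot[start:start + page_size] != current[start:start + page_size]:
            dirty.append(start // page_size)
    return dirty


memory = bytearray(3 * PAGE_SIZE)
snapshot = bytes(memory)        # taken when the lock is acquired
memory[PAGE_SIZE + 10] = 0xFF   # a write that lands in page 1
print(dirty_pages(snapshot, memory))  # [1]
```

Snapshot comparison costs a full copy per lock acquisition; it is shown here only because it is easy to follow.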

6 DSM with a daemon (DSMD)
mmap creates shared memory regions.
Wrapper intercepts mmap and semaphore operations and redirects those associated with the shared memory region to the DSMD.
We’ll assume that the developer comes up with a sensible convention for associating semaphores either with entire mapped regions, or with pages of them.
The DSMD will multicast the contents of a page when the associated semaphore lock is released.
Properties of the multicast and of the locking “protocol” determine the DSM properties seen by the programmer.
The programmer doesn’t use multicast directly.
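The release-triggered update can be simulated in a single process. In this sketch the `Replica` and `Daemon` classes are hypothetical, a loop over peers stands in for the reliable multicast, and the page size is shrunk to 4 bytes so the example stays readable. It assumes one semaphore per page.

```python
PAGE_SIZE = 4  # tiny page size so the example is easy to follow


class Replica:
    """One process's copy of the shared region."""

    def __init__(self, num_pages):
        self.memory = bytearray(num_pages * PAGE_SIZE)

    def deliver(self, page_no, data):
        # Overwrite the whole page with the contents received from the daemon.
        start = page_no * PAGE_SIZE
        self.memory[start:start + PAGE_SIZE] = data


class Daemon:
    """Hypothetical DSMD: sends a page to all peers when its lock is released."""

    def __init__(self, local, peers):
        self.local = local
        self.peers = peers

    def release(self, page_no):
        start = page_no * PAGE_SIZE
        data = bytes(self.local.memory[start:start + PAGE_SIZE])
        for peer in self.peers:  # stand-in for a reliable multicast
            peer.deliver(page_no, data)


local = Replica(num_pages=2)
peers = [Replica(num_pages=2), Replica(num_pages=2)]
daemon = Daemon(local, peers)

local.memory[4:8] = b"abcd"  # write to page 1 while holding its semaphore
daemon.release(1)            # releasing the semaphore propagates page 1
```

The application code only writes memory and releases the lock; the multicast stays hidden inside the daemon, matching the point that the programmer does not use multicast directly.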

7 Design choices?
Must pick a memory coherency model, i.e., type of consistency for DSM
Strong consistency
– The DSM behaves like a single non-replicated memory
Weak consistency
– The DSM can be highly inconsistent
– Updates propagate after an unspecified and possibly long delay, and copies of the mapped region may differ
Release consistency
– Locking for mutual exclusion; consistent as long as locking is used
Causal consistency
– If DSM accesses a → b (a happens before b), then b will observe the results of a

8 Best choice?
We should probably pick release consistency or causal consistency
– Strong consistency very expensive
Can implement using our protocol for replication via asynchronous cbcast
– If we obtain read as well as write locks, we get causal consistency
The updates end up totally ordered along mutual exclusion paths
– The primitive is strong enough to maintain this delivery ordering at all copies
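The slides name cbcast without spelling out its delivery rule. The sketch below uses the usual vector-timestamp rule for causal delivery: a message from sender j is delivered once it is the next message expected from j and every message it causally depends on has already been delivered. The `Process` class and its method names are hypothetical, and message transport is simulated by passing tuples around by hand.

```python
class Process:
    """Hypothetical process that delivers messages in causal order."""

    def __init__(self, pid, n):
        self.pid = pid
        self.clock = [0] * n   # clock[k] = messages from k delivered so far
        self.pending = []      # received but not yet deliverable
        self.delivered = []    # payloads in delivery order

    def send(self, payload):
        # Count our own message and stamp it with a copy of the clock.
        self.clock[self.pid] += 1
        return (self.pid, list(self.clock), payload)

    def _deliverable(self, sender, ts):
        next_from_sender = ts[sender] == self.clock[sender] + 1
        deps_satisfied = all(
            ts[k] <= self.clock[k] for k in range(len(ts)) if k != sender
        )
        return next_from_sender and deps_satisfied

    def receive(self, msg):
        self.pending.append(msg)
        progress = True
        while progress:  # one delivery may unblock other pending messages
            progress = False
            for m in list(self.pending):
                sender, ts, payload = m
                if self._deliverable(sender, ts):
                    self.pending.remove(m)
                    self.clock[sender] = ts[sender]
                    self.delivered.append(payload)
                    progress = True


p0, p1, p2 = Process(0, 3), Process(1, 3), Process(2, 3)
a = p0.send("a")
p1.receive(a)
b = p1.send("b")   # b causally follows a
p2.receive(b)      # arrives first, but is held back
p2.receive(a)      # a is delivered, which then releases b
print(p2.delivered)  # ['a', 'b']
```

Sending never blocks on the receivers, which is the sense in which the primitive is asynchronous.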

9 False sharing
Suppose multiple independent objects map to the same page but have distinct locks
– One issue the designer must worry about
In a traditional hardware DSM, the page ends up “ping-ponging” between the machines, which leads to thrashing
– In our solution, this just won’t work because of overhead
Our mechanism requires that there be one lock per unit of memory transmitted (cf. partial overwrite)
Let application developers avoid false sharing
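Since avoiding false sharing is left to application developers, a layout check can help. The sketch below flags pages covered by objects that are guarded by different locks. The function name, the tuple layout, and the object and lock names are all hypothetical, and object sizes are assumed to be positive.

```python
from collections import defaultdict

PAGE_SIZE = 4096  # assumed page size, for illustration only


def falsely_shared_pages(objects, page_size=PAGE_SIZE):
    """Map each page guarded by more than one lock to its sorted lock ids.

    objects: iterable of (name, offset, size, lock_id) tuples, size > 0.
    """
    locks_by_page = defaultdict(set)
    for _name, offset, size, lock_id in objects:
        first = offset // page_size
        last = (offset + size - 1) // page_size
        for page in range(first, last + 1):
            locks_by_page[page].add(lock_id)
    return {
        page: sorted(locks)
        for page, locks in locks_by_page.items()
        if len(locks) > 1
    }


layout = [
    ("counter", 0, 8, "lock_a"),
    ("queue", 64, 128, "lock_b"),     # shares page 0 with counter
    ("table", 4096, 4096, "lock_c"),  # has page 1 to itself
]
print(falsely_shared_pages(layout))  # {0: ['lock_a', 'lock_b']}
```

Padding each object to a page boundary, or grouping objects that share a lock onto the same pages, would make the result empty.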

10 Summary
We have looked at ways of using reliability technologies in existing systems
– Wrappers and toolkits
– Wrapping RPC servers
– Unbreakable stream connections
– Reliable DSM
Among other things, this course skips security, which is essential
– Other course(s) handle this...
– Read Chapter 22 of [Birman, 2005]

