
1 A Multiversion Update-Serializable Protocol for Genuine Partial Data Replication
Sebastiano Peluso, Pedro Ruivo, Paolo Romano, Francesco Quaglia and Luís Rodrigues
Euro-TM Workshop on Transactional Memory (WTM 2012), Bern, Switzerland

2 Distributed STMs
STMs are being employed in new scenarios:
- Database caches in three-tier web apps (FénixEDU)
- HPC programming languages (X10)
- In-memory cloud data grids (Coherence, Infinispan)
New challenges:
- Scalability
- Fault-tolerance
Callout: REPLICATION

3 Partial Replication
Each site stores a partial copy of the data.
Genuine partial replication schemes maximize scalability by ensuring that:
- Only the sites that replicate data items read or written by a transaction T exchange messages for executing/committing T.
Existing 1-Copy Serializable implementations enforce distributed validation of read-only transactions [SRDS10]:
- considerable overheads in typical workloads.
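The genuineness property can be illustrated with a small sketch. It only shows how the set of involved sites follows from the items a transaction touches; the placement map, the site names and the function `participants` are hypothetical and do not come from the slides.

```python
# Hypothetical placement map: data item -> sites that hold a replica of it.
PLACEMENT = {
    "x": {"n1", "n2"},
    "y": {"n2", "n3"},
    "z": {"n4"},
}


def participants(read_set, write_set, placement=PLACEMENT):
    """Return the sites that must exchange messages for a transaction.

    Under genuine partial replication these are exactly the sites that
    replicate at least one item read or written by the transaction.
    """
    involved = set()
    for item in set(read_set) | set(write_set):
        involved |= placement[item]
    return involved


# A transaction that reads x and writes y never contacts n4.
sites_for_t = participants(read_set={"x"}, write_set={"y"})
print(sorted(sites_for_t))
```

A non-genuine scheme would instead contact every site in the system, whatever the transaction accessed.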

4 Issues with Partial Replication
Extending existing local multiversion (MV) STMs is not enough.
Local MV STMs rely on a single global counter to track version advancement.
Problem:
- Committing a transaction would have to involve ALL NODES.
- NO GENUINENESS = POOR SCALABILITY

5 GMU: Genuine Multiversion Update-Serializable Replication [ICDCS12]
- In the execution/commit phase of a transaction T, ONLY the nodes that store data items accessed by T are involved.
- It uses multiple versions for each data item.
- It builds visible snapshots = freshest consistent snapshots, taking into account:
  1. causal dependencies vs. previously committed transactions at the time a transaction began,
  2. previous reads executed by the same transaction.
- Vector clocks are used to establish visible snapshots.

6 High Level Overview (i)
Transactions commit using a vector clock. Each node stores a log of committed vector clocks.
Initial view of the visible snapshot:
- When a transaction T begins on node N, it acquires the most recent vector clock in N's commit log.
View extension of the visible snapshot:
- When T reads on a node N, T's vector clock can be modified according to N's commit log.
- Three reading rules are applied using T's vector clock.
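The two steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class `CommitLog` and the functions `begin` and `extend` are hypothetical names, and the sketch assumes that view extension is an entry-wise maximum, which the slide does not state.

```python
class CommitLog:
    """Per-node log of committed vector clocks, oldest first (hypothetical)."""

    def __init__(self, num_nodes):
        # Start with an all-zero clock so the log is never empty.
        self.entries = [tuple([0] * num_nodes)]

    def append(self, vc):
        self.entries.append(tuple(vc))

    def most_recent(self):
        return self.entries[-1]


def begin(commit_log):
    """Initial view: copy the most recent vector clock of the local node."""
    return list(commit_log.most_recent())


def extend(tx_vc, observed_vc):
    """View extension, assumed here to be an entry-wise maximum."""
    return [max(a, b) for a, b in zip(tx_vc, observed_vc)]


log_n0 = CommitLog(num_nodes=3)
log_n0.append((1, 1, 1))
t_vc = begin(log_n0)                  # transaction starts on node 0
extended = extend(t_vc, (1, 2, 2))    # a later read observes a newer clock
print(t_vc, extended)
```

The clock values mirror those on the Rule 1 slide, where T1.VC appears first as (1,1,1) and later as (1,2,2).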

7 High Level Overview (ii)
Write operation:
- When a transaction T writes value V on data item O, the pair (O, V) is inserted into T's write-set.
Commit operation:
- Read-only transactions always commit.
- Update transactions run a genuine 2-Phase Commit:
  - On receiving a prepare message (participant side): acquire read/write locks, validate the read-set, and send back a tentative commit vector clock.
  - If all replies are positive (coordinator side): multicast the write-set and the final commit vector clock.
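The control flow of the write and commit operations can be sketched as below. Locking, read-set validation and messaging are deliberately hidden behind a hypothetical `prepare` callable that returns one vote per participant; `Transaction` and `commit` are illustrative names, not part of the slides.

```python
class Transaction:
    """Client-side transaction state (hypothetical sketch)."""

    def __init__(self):
        self.read_set = set()
        self.write_set = {}  # data item -> buffered value

    def write(self, item, value):
        # Writes are only buffered; nothing is sent until commit time.
        self.write_set[item] = value

    def is_read_only(self):
        return not self.write_set


def commit(tx, prepare):
    """Return True if the transaction commits.

    `prepare` stands in for the first phase of the 2-Phase Commit and
    returns an iterable of booleans, one vote per participant.
    """
    if tx.is_read_only():
        return True  # read-only transactions always commit
    return all(prepare(tx))


reader = Transaction()
writer = Transaction()
writer.write("O", "V")

reader_ok = commit(reader, prepare=lambda tx: [False])
writer_ok = commit(writer, prepare=lambda tx: [True, True])
writer_aborted = not commit(writer, prepare=lambda tx: [True, False])
print(reader_ok, writer_ok, writer_aborted)
```

Note that the read-only path never calls `prepare`, which reflects the slide's statement that such transactions commit without a validation round.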

8 Rule 1: Reading Lower Bound
[Figure: timeline over Node 0, Node 1 (stores X) and Node 2 (stores Y). Operations shown: T0:W(X,v), T0:W(Y,w), T0:Commit; T1:R(X), T1:R(Y). Versions shown: X(2), Y(2). Vector clocks shown: (1,1,1) and (1,2,2). Legend: most recent VC in VCLog; T1.VC.]

9 Rule 2: Reading Upper Bound
[Figure: timeline over Node 0, Node 1 (stores X) and Node 2 (stores Y). Operations shown: T0:W(X,v), T0:W(Y,w), T0:Commit; T1:R(X), T1:R(Y), T1:Commit. Versions shown: X(1), X(3), Y(1), Y(2), Y(3). Vector clocks shown: (1,1,1), (1,1,2) and (1,3,3). Legend: most recent VC in VCLog; T1.VC.]

10 Rule 3: Selection of Data Versions
Informally: observe the most recent consistent version of data item id on node i, based on T's history (previous reads).
Formally: iterate over the versions of id and return the most recent one s.t. id.version.VN <= T.VC[i]
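The formal rule translates almost directly into code. In this sketch the `Version` record and the function `select_version` are hypothetical names, and the version list is assumed to be ordered from oldest to newest.

```python
from collections import namedtuple

# vn is the version number (VN on the slide); value is the stored payload.
Version = namedtuple("Version", ["vn", "value"])


def select_version(versions, tx_vc, node_index):
    """Return the most recent version with vn <= tx_vc[node_index].

    `versions` is assumed to be ordered oldest to newest. Returns None
    if no version is old enough to be visible.
    """
    for version in reversed(versions):
        if version.vn <= tx_vc[node_index]:
            return version
    return None


versions_of_y = [Version(1, "y1"), Version(2, "y2"), Version(3, "y3")]
# A transaction whose clock has entry 2 for node 2 must skip version 3.
chosen = select_version(versions_of_y, tx_vc=[1, 1, 2], node_index=2)
print(chosen)
```

With T.VC = (1,1,2) the read on node 2 returns version 2 of Y, as in the Rule 2 figure.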

11 Building the Commit Vector Clock
Based on a variant of Skeen's total order multicast algorithm [SKEEN85].
Intuition: serialize all-and-only conflicting transactions, tracking direct and transitive conflict dependencies (causal relationships).
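In Skeen's algorithm each destination proposes a timestamp and the sender picks the maximum as the final one. The sketch below carries that idea over to vector clocks under two assumptions that the slides do not confirm: each participant proposes its latest clock with its own entry advanced by one, and the coordinator merges proposals with an entry-wise maximum. Any further steps of the GMU variant are not covered, and `propose` and `final_commit_vc` are hypothetical names.

```python
def propose(node_index, latest_vc):
    """Participant side: tentative clock with the local entry advanced by one."""
    proposal = list(latest_vc)
    proposal[node_index] += 1
    return proposal


def final_commit_vc(proposals):
    """Coordinator side: entry-wise maximum over all tentative clocks."""
    return [max(entries) for entries in zip(*proposals)]


# Nodes 1 and 2 both start from (1,1,1) and vote on the same transaction.
proposals = [propose(1, (1, 1, 1)), propose(2, (1, 1, 1))]
commit_vc = final_commit_vc(proposals)
print(proposals, commit_vc)
```

With these inputs the merged clock is (1,2,2), the commit clock shown for T0 in the Rule 1 figure.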

12 Consistency Criterion
GMU ensures Extended Update Serializability.
Update Serializability [ICDT86] ensures:
- 1-Copy Serializability (1CS) on the history restricted to committed update transactions;
- 1CS on the history restricted to committed update transactions and any single read-only transaction.
But it can admit non-1CS histories containing at least 2 read-only transactions.
Extended Update Serializability [Adya99]:
- ensures the US property also for executing transactions;
- analogous to opacity in STMs.

13 Experiments on private cluster
- 8-core physical nodes
- TPC-C workload:
  - 90% read-only transactions
  - 10% update transactions
  - 4 threads per node
  - moderate contention (15% abort rate at 20 nodes)

14 Thanks for the attention

15 References
[Adya99] A. Adya. "Weak Consistency: A Generalized Theory and Optimistic Implementations for Distributed Transactions". PhD Thesis, Massachusetts Institute of Technology, 1999.
[ICDCS12] Sebastiano Peluso, Pedro Ruivo, Paolo Romano, Francesco Quaglia, Luís Rodrigues. "When Scalability Meets Consistency: Genuine Multiversion Update-Serializable Partial Replication". The IEEE 32nd International Conference on Distributed Computing Systems, June 2012.
[ICDT86] R. C. Hansdah and L. M. Patnaik. "Update Serializability in Locking". International Conference on Database Theory, vol. 243 of Lecture Notes in Computer Science, pp. 171–185, Springer Berlin / Heidelberg, 1986.
[SKEEN85] D. Skeen. "Unpublished communication", 1985. Referenced in K. Birman, T. Joseph. "Reliable Communication in the Presence of Failures". ACM Trans. on Computer Systems, pp. 47-76, 1987.
[SRDS10] Nicolas Schiper, Pierre Sutra, Fernando Pedone. "P-Store: Genuine Partial Replication in Wide Area Networks". Proc. of the 29th Symposium on Reliable Distributed Systems, 2010.

