Timed Quorum Systems … for large-scale and dynamic environments Vincent Gramoli, Michel Raynal.


1 Timed Quorum Systems … for large-scale and dynamic environments Vincent Gramoli, Michel Raynal

2 OPODIS’07, Gramoli, Raynal. Context: Large-scale dynamic distributed systems.

3 Context: Large-scale dynamic distributed systems. Nodes communicate through message passing.

4 Context: Large-scale dynamic distributed systems. Nodes communicate through message passing. Nodes join and leave the system at any time.

5 Goal: To emulate a shared memory in this context (figure labels: read, write).

6 Goal: To emulate a shared memory in this context, providing atomic (i.e., linearizable) read/write operations (figure labels: read, write).

7 Roadmap: 1. Model and preliminary definitions 2. Related work 3. Timed Quorum System (TQS) 4. An efficient implementation of TQS 5. Conclusion

8 Simple model: System of n interconnected nodes with unique IDs. Asynchronous communication with neighbors (nodes whose ID is known). Dynamism intensity (i.e., churn) c. We consider a single object (atomicity is a local property).

9 Quorum System: Quorums are sets (of nodes) that mutually intersect. A Quorum System (QS) is a set of quorums. Example: 3 quorums Q1, Q2, Q3 of size q = 2, with Q1 ∩ Q2 ≠ Ø, Q1 ∩ Q3 ≠ Ø, Q2 ∩ Q3 ≠ Ø.
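
The mutual-intersection property on this slide can be checked mechanically. The following is a minimal sketch, not part of the talk: the function name `is_quorum_system` and the node IDs 1 to 3 are assumptions chosen to mirror the example of three quorums of size q = 2.

```python
from itertools import combinations

def is_quorum_system(quorums):
    """Return True if every pair of quorums shares at least one node."""
    return all(a & b for a, b in combinations(quorums, 2))

# Hypothetical node IDs 1..3; three quorums of size q = 2, as in the slide.
Q1, Q2, Q3 = {1, 2}, {2, 3}, {1, 3}
print(is_quorum_system([Q1, Q2, Q3]))        # every pair intersects
print(is_quorum_system([{1, 2}, {3, 4}]))    # disjoint pair, so not a quorum system
```

Representing quorums as Python sets keeps the check close to the set notation used on the slide.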

10 Operations: Atomic quorum-based operations for static settings [Attiya, Bar-Noy, Dolev, JACM 1996]. Each node of the quorums maintains: – A local value v of the object – A unique tag t, the version number of this value

11 Operations: 1) Reading a value. Phase 1: Consult the most up-to-date value v (figure: quorums Q1, Q2, Q3; query "value? tag?"; reply v1, t1).

12 Operations: 1) Reading a value. Phase 1: Consult the most up-to-date value v. Phase 2: Propagate the consulted value (figure: quorums Q1, Q2, Q3; v1, t1).

13 Operations: 1) Reading a value. Phase 1: Consult the most up-to-date value v. Phase 2: Propagate the consulted value. Theorem of Attiya and Welch (1998): «Read must write» to prevent new/old inversions for an unbounded number of readers.

14 Operations: 1) Reading a value. Phase 1: Consult the most up-to-date value v. Phase 2: Propagate the consulted value. Output: v1.

15 Operations: 2) Writing a value v2. Input: v2 (figure: quorums Q1, Q2, Q3).

16 Operations: 2) Writing a value v2. Phase 1: Consult the current version (tag) of the value and choose a new, strictly larger one (figure: query "max tag?"; reply t1).

17 Operations: 2) Writing a value v2. Phase 1: Consult the current version (tag) of the value and choose a new, strictly larger one. Phase 2: Propagate the new value together with its new version (figure: v2, t2 with t2 > t1).
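
The two-phase read and write described on slides 10 to 17 can be sketched as follows. This is an in-memory illustration under strong assumptions, not the algorithm's actual message-passing form: there is no network, no failure, and no concurrency; the names `Node`, `consult`, `propagate`, `read`, `write`, and `pick` are hypothetical; and tags are modeled as (counter, writer_id) pairs so that they stay unique, which is a common convention rather than something the slides specify.

```python
import random

class Node:
    """A replica holding a value and its tag (version)."""
    def __init__(self):
        self.tag = (0, 0)      # (counter, writer_id); the writer_id tie-break is an assumption
        self.value = None

    def query(self):
        return self.tag, self.value

    def update(self, tag, value):
        # Adopt the pair only if it is newer than the local one.
        if tag > self.tag:
            self.tag, self.value = tag, value

def consult(quorum):
    """Phase 1: return the (tag, value) pair with the largest tag in the quorum."""
    return max((n.query() for n in quorum), key=lambda tv: tv[0])

def propagate(quorum, tag, value):
    """Phase 2: push the pair to every node of the quorum."""
    for n in quorum:
        n.update(tag, value)

def read(pick_quorum):
    tag, value = consult(pick_quorum())
    propagate(pick_quorum(), tag, value)   # "read must write"
    return value

def write(pick_quorum, writer_id, value):
    (counter, _), _ = consult(pick_quorum())
    propagate(pick_quorum(), (counter + 1, writer_id), value)

# Three nodes and three mutually intersecting quorums of size 2.
nodes = [Node() for _ in range(3)]
quorums = [[nodes[0], nodes[1]], [nodes[1], nodes[2]], [nodes[0], nodes[2]]]
pick = lambda: random.choice(quorums)

write(pick, writer_id=1, value="v1")
write(pick, writer_id=2, value="v2")
print(read(pick))
```

Because any two quorums intersect, the consultation phase of each operation always meets at least one node updated by the previous propagation phase, whichever quorums `pick` happens to select.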

18 Dynamic Solutions: Reconfigurable storage: a failing QS is replaced by a new one. – RAMBO: Shvartsman, Lynch, DISC’02 – RDS: Chockler et al., OPODIS’05. Structured dynamic quorums: failed servers are replaced by new ones. – AM05: Abraham, Malkhi, Dist. Comp. 2005 – NN05: Nadav, Naor, DISC’05 – SQUARE: Gramoli, Anceaume, Virgillito, SAC’07

19 Dynamic Solutions: Reconfigurable storage: a failing QS is replaced by a new one. – RAMBO: Shvartsman, Lynch, DISC’02 – RDS: Chockler et al., OPODIS’05. Structured dynamic quorums: failed servers are replaced by new ones. – AM05: Abraham, Malkhi, Dist. Comp. 2005 – NN05: Nadav, Naor, DISC’05 – SQUARE: Gramoli, Anceaume, Virgillito, SAC’07. All solutions require bounded churn during any finite period.

20 Dynamic Solutions: Reconfiguration complexity vs. operation latency tradeoff (figure: RAMBO, RDS, SQUARE, AM05, and NN05 placed on the axes "reconfiguration complexity" and "operation latency"). Prevents scalability!

21 Timed Quorum System: Dynamic quorum systems should be: – Probabilistic: the number of failures is not necessarily bounded – Timed: no property can hold forever

22 Timed Quorum System: Timed access strategy ω: a mapping from any time t to a probability distribution on the possible quorums. Δ-Timed Quorum System (TQS): For any Q1 and Q2 accessed respectively with ω(t1) and ω(t2), if |t2 - t1| ≤ Δ, then Q1 ∩ Q2 ≠ Ø with high probability.

23 Timed Quorum System: Δ-Timed Quorum System (TQS): For any Q(t1) and Q(t2) accessed respectively with ω(t1) and ω(t2), if |t2 - t1| ≤ Δ, then Q(t1) ∩ Q(t2) ≠ Ø with high probability. Example of a TQS: {Q(t1), Q(t2), Q(t3), Q(t4), Q(t5)} (figure: a time axis with a window of length Δ, showing the intersections Q(t1) ∩ Q(t2), Q(t2) ∩ Q(t3), Q(t3) ∩ Q(t4), Q(t3) ∩ Q(t5)).
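
To make the timed intersection requirement concrete, the sketch below checks a recorded trace of quorum accesses against a window Δ. It is only an illustration: the definition above is probabilistic (intersection holds with high probability), whereas this check is a deterministic test on one trace, and the function name `timed_violations` and the node IDs in `trace` are hypothetical.

```python
from itertools import combinations

def timed_violations(accesses, delta):
    """Return pairs of access times within delta whose quorums are disjoint.

    accesses: list of (time, quorum) pairs, where quorum is a set of node IDs.
    """
    return [
        (t1, t2)
        for (t1, q1), (t2, q2) in combinations(accesses, 2)
        if abs(t2 - t1) <= delta and not (q1 & q2)
    ]

# Hypothetical trace: quorums drift over time as nodes join and leave.
trace = [(0, {1, 2, 3}), (1, {3, 4, 5}), (2, {5, 6, 7}), (3, {7, 8, 9})]
print(timed_violations(trace, delta=1))   # quorums accessed one time unit apart all intersect
print(timed_violations(trace, delta=2))   # quorums accessed two time units apart are disjoint
```

The trace shows why the requirement is timed: quorums accessed far apart in time need not intersect, only those accessed within Δ of each other.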

24 Consistency: Probabilistic Atomicity: – In the real-time sequence of operations, any operation either satisfies atomicity with respect to all preceding successful operations, in which case it is said to be successful, or it is said to be unsuccessful – Any operation is successful with high probability

25 Consistency: Probabilistic Atomicity: – In the real-time sequence of operations, any operation either satisfies atomicity with respect to all preceding successful operations, in which case it is said to be successful, or it is said to be unsuccessful – Any operation is successful with high probability. Theorem 1: If at least one quorum is accessed in every period of length Δ, then a Δ-TQS implements probabilistic atomicity.

26 Some observations: Replication is necessary for data persistence. In large-scale systems, operations are frequent. The «read must write» theorem of Attiya and Welch indicates that some information must be replicated in any operation.

27 Efficient TQS Implementation: Underlying gossip-based shuffle of the neighborhood: – Each node constantly has a new random set of neighbors. Classical quorum-based operations: – Consulting v and t at some quorum – Choosing v’ and t’ to propagate – Propagating v’ and t’ to some quorum

28 Efficient TQS Implementation: Disseminate until q = O(√n) nodes are contacted (figure: dissemination starting from the Client; labels 1, k, l).
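
The dissemination step can be approximated with a simple count of forwarding hops. This is an idealized sketch, not the authors' protocol: it assumes that each contacted node forwards to exactly k neighbors that were not contacted before, that the constant hidden in q = O(√n) is 1, and the name `hops_to_contact` and the values of n and k are hypothetical.

```python
import math

def hops_to_contact(q, k):
    """Number of forwarding hops until at least q nodes are contacted,
    assuming each contacted node forwards to k fresh neighbors (no overlap)."""
    contacted, frontier, hops = 0, 1, 0   # the client itself is not counted
    while contacted < q:
        frontier *= k          # nodes newly reached at this hop
        contacted += frontier
        hops += 1
    return hops

n, k = 10_000, 4               # hypothetical system size and fan-out
q = math.isqrt(n)              # q = sqrt(n) = 100, constant factor taken as 1
print(hops_to_contact(q, k))   # grows roughly like log_k(q)
```

Under these assumptions the hop count grows logarithmically in q with base k, which is the shape of the latency bound stated in the results of this talk.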

29 Efficient TQS Implementation: Assumptions: – neighbors are chosen uniformly at random – at least one operation succeeds every Δ time – c = rate of arrival = rate of departure ∈ [0,1). Results: – This algorithm implements a TQS

30 Efficient TQS Implementation: Assumptions: – neighbors are chosen uniformly at random – at least one operation succeeds every Δ time – c = rate of arrival = rate of departure ∈ [0,1). Results: – This algorithm implements a TQS – Replication is piggybacked into operations

31 Efficient TQS Implementation: Assumptions: – neighbors are chosen uniformly at random – at least one operation succeeds every Δ time – c = rate of arrival = rate of departure ∈ [0,1). Results: – This algorithm implements a TQS – Replication is piggybacked into operations – The quorum size is O(√(nD)), where D = (1-c)^(-Δ)

32 Efficient TQS Implementation: Assumptions: – neighbors are chosen uniformly at random – at least one operation succeeds every Δ time – c = rate of arrival = rate of departure ∈ [0,1). Results: – This algorithm implements a TQS – Replication is piggybacked into operations – The quorum size is O(√(nD)), where D = (1-c)^(-Δ) – The operation latency is O(log_k √(nD)) message delays, where D = (1-c)^(-Δ)

33 Efficient TQS Implementation: Assumptions: – neighbors are chosen uniformly at random – at least one operation succeeds every Δ time – c = rate of arrival = rate of departure ∈ [0,1). Results: – This algorithm implements a TQS – Replication is piggybacked into operations – The quorum size is O(√(nD)), where D = (1-c)^(-Δ) – The operation latency is O(log_k √(nD)) message delays, where D = (1-c)^(-Δ) – When D = O(1), the quorum size is O(√n), the smallest quorum size for static systems; cf. [Malkhi, Reiter, Wool, Wright, Inf. and Comp. Journal 2001]
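
A short numeric sketch shows how churn inflates the quorum size through D = (1-c)^(-Δ). The function names, the parameter `beta` (a stand-in for the constant hidden in the O-notation), and the values of n, c, and Δ are all assumptions made for illustration; the slides give only the asymptotic form.

```python
import math

def dynamic_parameter(c, delta):
    """D = (1 - c)^(-delta), with churn c in [0, 1)."""
    return (1.0 - c) ** (-delta)

def quorum_size(n, c, delta, beta=1.0):
    """q = beta * sqrt(n * D); beta stands in for the hidden constant (assumption)."""
    return math.ceil(beta * math.sqrt(n * dynamic_parameter(c, delta)))

n = 10_000
print(quorum_size(n, c=0.0, delta=5))   # static system: D = 1, so q = sqrt(n)
print(quorum_size(n, c=0.1, delta=5))   # with churn, D > 1 and the quorum grows
```

With c = 0 the parameter D equals 1 and the size reduces to √n, which is the static case mentioned in the last result above.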

34 Conclusion: We defined the Timed Quorum System, which: Is inherently dynamic: – NO underlying structure – Timely intersection requirement. Ensures Probabilistic Atomicity. Scales well: – O(√(nD)) messages per operation – O(log_k √(nD)) time per operation. Is optimal: – When D = O(1), translates into the best known static result: O(√n)

35 Open Issue: TQS in Mobile Sensor Networks: – Consultation phase: gather motes to consult t and v; scatter motes to make t and v likely visible – Propagation phase: gather motes to propagate t’ and v’; scatter motes to make t’ and v’ likely visible

