Relative Power of Models in Distributed Computing Petr Kuznetsov TU Berlin/DT-Labs.

What makes distributed computing special?
  Failures and lack of synchrony
  Computing units are unreliable and unsynchronized (otherwise ≅ centralized system)
  [Image: Blind Men and the Elephant]

What makes distributed systems hard?
  - Multitude of abstractions and models
  - Lack of categorization, complexity classes
  - Unnatural?
  [Image: Blind Men and the Elephant]

4 Distributed modeling jumble
  CAS, LL/SC? RW shared memory? Message passing? Snapshot memory? Sub-consensus objects? Clouds, data centers…? t-resilience?

5 Today
  - t-resilience ≅ wait-freedom [BG simulation]
      A (t+1)-process wait-free system is equivalent to an n-process t-resilient system (t<n):
      a colorless task T is solvable t-resiliently iff T is solvable wait-free by t+1 processes
  - Non-uniform fault models ≅ wait-freedom
      Set consensus power of an adversary
      Colorless tasks
      Colored conjectures

6 Model
  - n processes p_1,…,p_n
  - Read-write shared memory
  - Atomic snapshots [AADGMS90]
  - Crash failures
  [Figure: example run of p1, p2, p3 with writes W(R1,1), W(R2,1), W(R3,1) and Snapshot(R) operations returning (1,0,0), (1,1,1), (1,0,1)]

7 Distributed tasks
  Functions in distributed computing
  A task (I,O,Δ):
  - I: set of input vectors
  - O: set of output vectors
  - Task specification Δ: I → 2^O

8 k-set consensus
  Processes start with inputs in V (|V|>k)
  Safety: the set of outputs is a subset of the inputs, of size at most k
  Liveness: every correct process eventually outputs (wait-freedom)
  - k=1: consensus
  - wait-free (k+1)-process k-set consensus is impossible
  - Colorless: a process is free to adopt inputs or outputs
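
The safety condition above can be checked mechanically for one finished run. The following is a minimal sketch; the function name and the dictionary representation of a run are assumptions made for illustration, not part of the slides.

```python
def satisfies_k_set_consensus(inputs, outputs, k):
    """Check the safety property of k-set consensus for one finished run.

    inputs  -- dict: process id -> proposed value
    outputs -- dict: process id -> decided value (only processes that decided)
    """
    decided = set(outputs.values())
    validity = decided <= set(inputs.values())   # every output is some input
    agreement = len(decided) <= k                # at most k distinct outputs
    return validity and agreement


inputs = {"p1": "a", "p2": "b", "p3": "c"}
print(satisfies_k_set_consensus(inputs, {"p1": "a", "p2": "a", "p3": "c"}, 2))  # True
print(satisfies_k_set_consensus(inputs, {"p1": "a", "p2": "b", "p3": "c"}, 2))  # False
print(satisfies_k_set_consensus(inputs, {"p1": "a", "p2": "d"}, 2))             # False
```

Liveness is not covered here: it is a property of all runs, not of a single output vector.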

9 The wait-free model: 2 processes
  [Figure: P and Q; cases: P reads before Q writes, P reads after Q writes, Q reads after P writes, Q reads before P writes]
  Full-information protocol:
    while not done
      write(view)
      view := snapshot(memory)
  Wait-free consensus is impossible!
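
To make the two-process picture concrete, the sketch below enumerates every interleaving of one round of the full-information protocol (one write followed by one snapshot per process) and collects the resulting pairs of views. The representation of a view as the set of process names seen is an assumption made for illustration.

```python
from itertools import permutations


def one_round_views():
    """Enumerate all interleavings of one write and one snapshot per process."""
    ops = [("P", "write"), ("P", "snapshot"), ("Q", "write"), ("Q", "snapshot")]
    outcomes = set()
    for order in permutations(ops):
        # A process must write before it takes its snapshot.
        if order.index(("P", "write")) > order.index(("P", "snapshot")):
            continue
        if order.index(("Q", "write")) > order.index(("Q", "snapshot")):
            continue
        memory = set()   # names of the processes that have written so far
        views = {}
        for proc, op in order:
            if op == "write":
                memory.add(proc)
            else:
                views[proc] = frozenset(memory)
        outcomes.add((views["P"], views["Q"]))
    return outcomes


for p_view, q_view in sorted(one_round_views(), key=lambda o: (len(o[0]), -len(o[1]))):
    print("P sees", sorted(p_view), "| Q sees", sorted(q_view))
```

Only three outcomes are possible: P sees only itself, both see each other, or Q sees only itself. Adjacent outcomes share one process's view, so that process cannot tell them apart. The outcomes therefore form a connected path between the two solo runs, which is the intuition behind the impossibility claim on the slide.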

10 The wait-free model: 3 processes
  [Figure: vertices labeled P, Q, R]

11 The wait-free model: 3 processes
  [Figure: vertices labeled P, Q, R]

12 The wait-free model: 3 processes
  [Figure: vertices labeled P, Q, R]
  Sperner’s Lemma: wait-free (k+1)-process k-set consensus is impossible [BG93,SZ93,HS93]

13 What about k-resilience?
  - Assume only 1 out of a million processes may fail
  - Can we solve consensus?
  - Can we solve k-set consensus in a k-resilient system for some n>k?
  No! (Otherwise we could do it wait-free)

14 BG agreement Part I

15 BG agreement [Borowsky, Gafni, 1993]
  Safety, as in consensus:
  - Every output is an input
  - No two outputs are different
  Liveness:
  - Every correct process outputs, if no participating process fails

16 BG-agreement: protocol
  Code for p_i:
    write(A_i, input_i)
    S := snapshot(A)
    write(B_i, S)
    wait until for all p_j in S, B_j ≠ ⊥
    decide on the smallest input in the smallest B_j
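
The protocol above can be exercised with a small step-by-step simulation. This is a minimal sketch under stated assumptions: each `yield` stands for one atomic shared-memory step, `None` plays the role of the initial value ⊥, and the `run` scheduler and its schedule format are hypothetical helpers, not part of the slides.

```python
def bg_agreement(i, value, A, B):
    """Process i of BG-agreement; each `yield` ends one shared-memory step."""
    A[i] = value                                                  # write(A_i, input_i)
    yield
    S = frozenset(j for j, v in enumerate(A) if v is not None)    # S := snapshot(A)
    yield
    B[i] = S                                                      # write(B_i, S)
    yield
    while any(B[j] is None for j in S):                           # wait for all p_j in S
        yield
    smallest = min((B[j] for j in S), key=len)                    # the smallest B_j
    return min(A[j] for j in smallest)                            # its smallest input


def run(inputs, schedule):
    """Let process i take one step for every occurrence of i in `schedule`."""
    n = len(inputs)
    A, B = [None] * n, [None] * n
    procs = [bg_agreement(i, inputs[i], A, B) for i in range(n)]
    decisions = {}
    for i in schedule:
        if i in decisions:
            continue
        try:
            next(procs[i])
        except StopIteration as done:
            decisions[i] = done.value
    return decisions


# Round-robin: everybody finishes the three steps, so everybody decides.
print(run([5, 3, 7], [0, 1, 2] * 4))           # {0: 3, 1: 3, 2: 3}
# p1 stops after its snapshot, before writing B[1]: the others block.
print(run([5, 3, 7], [1, 1] + [0, 2] * 5))     # {}
```

The second run illustrates why liveness is conditional: a participant that stops between its two writes blocks everyone whose snapshot contains it.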

17 BG agreement: correctness
  Liveness: suppose each participant takes 3 steps; then every wait terminates
    (if a participant “dies” between the writes, the others block)
  Safety: consider p_t that wrote the smallest snapshot S to B_t
  - for all B_j ≠ ⊥, p_t is in B_j
  - every p_i waits until p_t writes
  - every p_i decides on the smallest input in S

18 BG simulation
  - k+1 simulators q_1,…,q_{k+1}
  - n simulated processes p_1,…,p_n (n>k)
  [Figure: simulators q_1,…,q_{k+1} and simulated processes p_1,…,p_n]

19 Simulation
  - Every simulator q_i takes a snapshot to get the “most recent” view of p_j
  - Run 3 steps of BG-agreement to agree on the view of p_j
  - If the view is decided, register the view
  - If not, proceed to the next process in round-robin
  - Safe: the simulators agree on the simulated views
  - Live? No! What if a BG-agreement blocks?

20 Simulation order
  - Run the BG-agreements in round-robin
  - When done with the “write-phase” of a BG-agreement for a step of p_j, proceed to p_{j+1 mod n}
  [Figure: simulators q1, q2, q3 and simulated processes p1, p2, p3, p4]

21 Progress
  - A faulty simulator may block at most one process
  - At most k simulators can fail
  - At most k simulated processes fail: a k-resilient run!

22 Application: colorless tasks
  Suppose T=(I,O,Δ) is solvable k-resiliently; let A be the corresponding (full-information) protocol
  - Each q_i starts with an input in V_I
  - The first view of each p_j is its input (each q_i proposes its own input)
  - In the resulting k-resilient run of A, some p_j outputs
  - q_i adopts the first output value it sees
  k-resilient k-set consensus is impossible!

23 Works both ways
  - n simulators p_1,…,p_n can k-resiliently simulate a wait-free run on q_1,…,q_{k+1}
  - What matters is the number of failures, not the number of simulators
  All k-resilient systems are equivalent! (with respect to colorless tasks)

24 On Non-Uniform Fault Models Part II

25 Uniform fault models
  - Process failures are IID
      P(p_i fails) = ε
      P(no faults) = (1-ε)^n
      P(t or fewer faults) = I_{1-ε}(n-t, t+1)   (regularized incomplete beta function)
  - t-resilience: at most t faults
  - wait-freedom: at most n-1 faults
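
The probability of at most t faults is the binomial cumulative distribution, which is what the incomplete-beta expression evaluates to. A minimal sketch that computes it as a direct sum; the function name and the sample values of n and ε are illustrative assumptions.

```python
from math import comb


def prob_at_most_t_faults(n, t, eps):
    """P(at most t of n processes fail), each failing independently with probability eps."""
    return sum(comb(n, f) * eps**f * (1 - eps) ** (n - f) for f in range(t + 1))


n, eps = 10, 0.01
print("no faults:        ", prob_at_most_t_faults(n, 0, eps))
print("1-resilient run:  ", prob_at_most_t_faults(n, 1, eps))
print("wait-free (n-1):  ", prob_at_most_t_faults(n, n - 1, eps))
```

For t = 0 the sum reduces to (1-ε)^n, matching the "no faults" line of the slide.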

26 But…
  Processes may fail
  - in a correlated way
  - in a non-identical way

27 Non-identical faults
  - Processes p, q, r
      p and r fail independently
      q is unlikely to fail
  - Possible runs:
      pqr (no faults)
      pq (r fails)
      qr (p fails)
      q (p and r fail)
  [Figure: processes p, q, r]

28 Correlated faults
  - Processes p, q, r
      p and q share unreliable hardware
      q and r share unreliable software
      It is unlikely that both hardware and software fail
  - Possible runs:
      pqr (no faults)
      p (software fault)
      r (hardware fault)
  [Figure: processes p, q, r]

29 Generic adversaries [Delporte et al., 2009]
  A: a set of process subsets
  The model = all runs whose correct sets are in A
  Example: A = {p, qr, rs}
  [Figure: processes p, q, r, s]

30 Hitting sets
  A = {p, qr, rs}
  [Figure: processes p, q, r, s; label: Hitting set of A]
  A can solve 2-set consensus
  Can it solve consensus?
  Yes: for all S in A, h(A_S)=1
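
The numbers on this slide can be reproduced by brute force. The sketch below assumes that h(A) denotes the size of a smallest hitting set of A (a set of processes intersecting every set in A), which the slide does not spell out. A_S is taken to be the sets of A that are subsets of S, as defined on the setcon slide. Function names are illustrative, and every set in the adversary is assumed to be non-empty.

```python
from itertools import combinations


def min_hitting_set_size(adversary):
    """Size of a smallest set of processes that intersects every set in the adversary."""
    sets = [frozenset(s) for s in adversary]
    if not sets:
        return 0
    universe = sorted(set().union(*sets))
    for size in range(1, len(universe) + 1):
        for candidate in combinations(universe, size):
            if all(s & set(candidate) for s in sets):
                return size


def restrict(adversary, S):
    """A_S: the sets of the adversary that are subsets of S."""
    return [s for s in adversary if set(s) <= set(S)]


A = ["p", "qr", "rs"]                      # each string is a set of one-letter process names
print(min_hitting_set_size(A))             # 2, for example {p, r}
print([min_hitting_set_size(restrict(A, S)) for S in A])   # [1, 1, 1]
```

The exhaustive search is exponential in the number of processes, which is fine for slide-sized examples only.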

31 Commit-adopt [Gafni, 1998]
  Liveness:
  - Every correct process returns
  Safety:
  - Return (adopt,v) or (commit,v) where v was proposed
  - If only one value is proposed, only (commit,*) is returned
  - If one returns (commit,v), then only (*,v) is returned

32 Commit-adopt: wait-free protocol
  Code for p_i:
    write(A_i, input_i)
    S := collect(A)
    if |vals(S)|=1 then write(B_i, input_i)
    else write(B_i, fail)
    S := collect(B)
    if there are no fails in S then return (commit, input_i)
    if for some j, input_j is in S then return (adopt, input_j)
    return (adopt, input_i)
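
A minimal sketch of the commit-adopt protocol as a step-by-step simulation, with an exhaustive check of the safety properties over all two-process schedules. Assumptions made for illustration: each `yield` is one register access, a collect reads the registers one at a time, `None` stands for an unwritten register, a process that sees no committed value falls back to adopting its own input, and the `run` and `check_all_two_process_schedules` helpers are hypothetical.

```python
from itertools import combinations

FAIL = "fail"


def commit_adopt(i, value, A, B):
    """Process i of commit-adopt; each `yield` ends one register access."""
    A[i] = value
    yield
    seen = []
    for j in range(len(A)):                 # collect(A), one read per step
        seen.append(A[j])
        yield
    proposals = {v for v in seen if v is not None}
    B[i] = value if len(proposals) == 1 else FAIL
    yield
    seen = []
    for j in range(len(B)):                 # collect(B), one read per step
        seen.append(B[j])
        yield
    written = [v for v in seen if v is not None]
    if FAIL not in written:
        return ("commit", value)
    for v in written:
        if v != FAIL:
            return ("adopt", v)
    return ("adopt", value)                 # nobody can have committed: keep own input


def run(inputs, schedule):
    n = len(inputs)
    A, B = [None] * n, [None] * n
    procs = [commit_adopt(i, inputs[i], A, B) for i in range(n)]
    results = {}
    for i in schedule:
        if i in results:
            continue
        try:
            next(procs[i])
        except StopIteration as done:
            results[i] = done.value
    return results


def check_all_two_process_schedules(inputs):
    steps = 2 * 2 + 3                       # next() calls a process needs when n = 2
    for slots in combinations(range(2 * steps), steps):
        results = run(inputs, [0 if t in slots else 1 for t in range(2 * steps)])
        returned = {v for _, v in results.values()}
        committed = {v for tag, v in results.values() if tag == "commit"}
        if len(results) != 2 or not returned <= set(inputs):
            return False
        if committed and len(returned) != 1:
            return False
    return True


print(run(["x", "y"], [0] * 7 + [1] * 7))   # {0: ('commit', 'x'), 1: ('adopt', 'x')}
print(check_all_two_process_schedules(["x", "y"]))
```

The check covers only two processes and the interleavings at the granularity chosen here; it is a sanity test of the pseudocode, not a proof.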

33 Commit-adopt: correctness
  Liveness: immediate
  Safety:
  - If all proposals are the same, every process commits
  - At most one non-fail value in B
  - If a process commits, the value is seen in B by every terminated process

34 Leader-based consensus
    v_i := input_i
    while true
      r++
      (u, v_i) := CommitAdopt_r(v_i)
      if u = commit then return v_i
      v_i := get the estimate from Leader_i
  Safety provided by Commit-Adopt
  Liveness if, eventually, the same correct leader is elected

35 Electing a leader using A
  for all S in A, h(A_S)=1
    shared C[1],…,C[n]   // shared counters
    shared R[1],…,R[n]   // the most recent rounds
    while true
      r++
      R[i] := r
      wait until for some S in A: for all j in S, R[j] ≥ r
      for all j not in S: C[j]++   // increment the counter
      Leader_i := argmin(C[1],…,C[n])

36 Set consensus number of A
  setcon(A) =
  - 0, if A is empty
  - max_{S in A} min_{a in S} setcon(A_{S,a}) + 1, otherwise
  A_S: all S’ in A that are subsets of S
  A_{S,a}: all S’ in A_S not containing a
  Example: A = {pqr, pq, pr, p, q, r}, setcon(A)=2: for S = pqr and a = p, we have A_{S,a} = {q, r} and setcon(A_{S,a}) = 1

37 A = {pqr, pq, pr, p, q, r}
  For S=pqr, a=p: A_{S,a} = {q,r} and setcon(A_{S,a}) = 1
  setcon(A) = setcon(A_{S,a}) + 1 = 2
  A_1 = {pqr,pq,pr,p}, A_2 = {q,r}
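
The recursive definition of setcon translates directly into code. A minimal sketch, assuming an adversary is given as a collection of non-empty process sets (here strings of one-letter process names); the helper name `_setcon` and the memoization are implementation choices, not part of the slides.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def _setcon(family):
    """setcon of a frozenset of frozensets, following the recursive definition."""
    if not family:
        return 0
    best = 0
    for S in family:
        A_S = [T for T in family if T <= S]                   # sets of A inside S
        worst = min(
            _setcon(frozenset(T for T in A_S if a not in T))  # A_{S,a}
            for a in S
        )
        best = max(best, worst + 1)
    return best


def setcon(adversary):
    return _setcon(frozenset(frozenset(S) for S in adversary))


print(setcon(["pqr", "pq", "pr", "p", "q", "r"]))        # 2, the example above
print(setcon(["p", "qr", "rs"]))                         # 1
print(setcon(["p", "q", "r", "pq", "pr", "qr", "pqr"]))  # 3, wait-free on 3 processes
print(setcon(["pqr", "pq", "pr", "qr"]))                 # 2, 1-resilient on 3 processes
```

The recursion terminates because A_{S,a} never contains S itself, so every recursive call works on a strictly smaller family.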

38 Partitioned adversary
  A = A_1,…,A_k
  setcon(A_i)=1
  there exists S in A such that for all a in S: A_{S,a} is in A_{i+1},…,A_k
  setcon(A_{S,a}) = k-i

39 Characterizing setcon
  - setcon(A)=k if and only if A solves k-set consensus but not (k-1)-set consensus
  - A with setcon(A)=k solves a colorless task T if and only if T is solvable (k-1)-resiliently
  - n-process system with A ≅ k-process wait-free

40 Sufficiency
  Solving k-set consensus with A, setcon(A)=k:
  - Split A into A_1,…,A_k, each of setcon 1
  - Run k parallel leader-based consensuses: at least one terminates
  - Adopt the first returned value

41 Necessity
  Suppose A can solve (k-1)-set consensus
  k processes solve (k-1)-set consensus as follows:
  - “Leveled” BG-simulation, starting from level k
  - At current level L, simulate steps of S such that setcon(A_S)=L
  - If blocked simulating a step of p on level L, go to simulating S’ in A_{S,p} such that setcon(A_{S’})=L-1
  - If a higher level “unblocks”, return to it
  [Figure: levels k, k-1, …, 2, 1]

42 Asymmetric progress conditions [Imbs et al., DISC 2010]
  - Make progress if p_1 participates (wait-free for p_1), or at most one process is eventually up (obstruction-free for the rest)
  - A = {p_1*, p_2,…,p_n}
      setcon(A)=2
      A_1 = {p_1*}, A_2 = {p_2,…,p_n}

43 What if we use stronger objects?
  - Suppose we can use k-process consensus objects
  - What is the min k such that consensus is solvable?
  - Suppose k≤n-1
  - 2 processes solve wait-free consensus as follows:
      BG-simulation (a slow simulator blocks up to n-1 simulated processes)
      Start with simulating all in round-robin
      If blocked, simulate the first unblocked process
      If blocked, go back to simulating all
  - Eventually, either all advance, or exactly one runs solo

44 Summary
  - t-resilience ≅ wait-freedom [BG93]
  - Adversaries ≅ wait-freedom [GK10a]
  - What about complexity?
      Task solvability undecidable for >2 processes [HR97,GK99]
  - What about colored tasks?
      Extended to generic tasks [GK10b]

45 It’s wait-free!

46 References
  - E. Borowsky and E. Gafni. Generalized FLP impossibility result for t-resilient asynchronous computations. STOC 1993.
  - E. Gafni and P. Kuznetsov. Turning Adversaries into Friends: Simplified, Made Constructive, and Extended. OPODIS 2010.
  - E. Gafni and P. Kuznetsov. Relating L-Resilience and Wait-Freedom via Hitting Sets. ICDCN 2011.
  - D. Imbs, M. Raynal, G. Taubenfeld. On Asymmetric Progress Conditions. DISC 2010.

47 QUESTIONS?

48 Colorless distributed tasks
  V_I: set of input values
  V_O: set of output values
  Val(U) denotes the set of values in vector U
  If
  - Out is in Δ(In)
  - Val(In) is a subset of Val(In’)
  - Val(Out’) is a subset of Val(Out)
  then Out’ is in Δ(In’)
