
Two Techniques for Proving Lower Bounds. Hagit Attiya, Technion


1 Two Techniques for Proving Lower Bounds. Hagit Attiya, Technion

2 Goal of this Presentation
Describe two common techniques for proving lower bounds in distributed computing:
▫ Information theory arguments
▫ Covering
Variations and applications

3 My always first slide…
[Diagram labels: problem, algorithm, implementation, real system architecture, nicer system architecture]

4 Part I Information Theory Arguments

5 Overview
Bound the flow of information among processes (and memory)
Show that information takes a long time to acquire
Argue that solving a particular problem requires information about many processes
Usually applies to:
▫ Shared memory systems
▫ Synchronous executions (which imply lower bounds for asynchronous executions as well)
Details depend on the primitives used

6 Single-writer registers: Possible argument
Need to read from each process
The state of a process can be found only in its own register
Hence, the first process must read n registers

7 Not really
When processes take steps together, the first process doubles its information in the 2nd step
But it can’t do better than that

8 More Refined Argument
Consider synchronized executions
▫ Processes take steps in rounds
▫ All reads appear before all writes
$\mathrm{INF}(p_i, t-1)$: the set of inputs influencing process $p_i$ at the start of round $t$
▫ For $t = 1$, $\mathrm{INF}(p_i, t-1) = \{p_i\}$
▫ For $t > 1$, if $p_i$ reads a value written by $p_j$, $\mathrm{INF}(p_i, t) = \mathrm{INF}(p_i, t-1) \cup \mathrm{INF}(p_j, t-1)$
▫ For $t > 1$, if $p_i$ writes, $\mathrm{INF}(p_i, t) = \mathrm{INF}(p_i, t-1)$

9 INF determines the state
$\mathrm{INF}(p_i, t-1)$: the set of inputs influencing process $p_i$ at the start of round $t$
▫ For $t = 1$, $\mathrm{INF}(p_i, t-1) = \{p_i\}$
▫ For $t > 1$, if $p_i$ reads a value written by $p_j$, $\mathrm{INF}(p_i, t) = \mathrm{INF}(p_i, t-1) \cup \mathrm{INF}(p_j, t-1)$
▫ For $t > 1$, if $p_i$ writes, $\mathrm{INF}(p_i, t) = \mathrm{INF}(p_i, t-1)$
Lemma: If the states of the processes in $\mathrm{INF}(p_i, t-1)$ are the same in configurations $C$ and $C'$, then $p_i$ takes the same steps in a $t$-round execution from $C$ and from $C'$
Proof by case analysis

10 Size of INF
$\mathrm{INF}(p_i, t-1)$: the set of inputs influencing process $p_i$ at the start of round $t$
▫ For $t = 1$, $\mathrm{INF}(p_i, t-1) = \{p_i\}$
▫ For $t > 1$, if $p_i$ reads a value written by $p_j$, $\mathrm{INF}(p_i, t) = \mathrm{INF}(p_i, t-1) \cup \mathrm{INF}(p_j, t-1)$
▫ For $t > 1$, if $p_i$ writes, $\mathrm{INF}(p_i, t) = \mathrm{INF}(p_i, t-1)$
$I(t) = \max_i |\mathrm{INF}(p_i, t)|$
Lemma: $I(0) = 1$, and $I(t) \le 2\,I(t-1)$
$\Rightarrow I(t) \le 2^t$
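
The doubling bound can be checked with a small simulation. This is an illustrative sketch, not part of the original slides: the names `simulate_influence` and `doubling_pattern` are hypothetical, and each process is assumed to read one single-writer register per round, with reads seeing the values written at the end of the previous round.

```python
from typing import Callable, List, Set


def simulate_influence(n: int, rounds: int,
                       reads: Callable[[int, int], int]) -> List[Set[int]]:
    """Return the influence set of every process after `rounds` rounds.

    reads(i, t) names the process whose register p_i reads in round t.
    """
    inf = [{i} for i in range(n)]            # INF(p_i, 0) = {p_i}
    for t in range(1, rounds + 1):
        prev = [set(s) for s in inf]         # snapshot of the previous round
        for i in range(n):
            j = reads(i, t)
            inf[i] = prev[i] | prev[j]       # union of the two influence sets
    return inf


def doubling_pattern(n: int) -> Callable[[int, int], int]:
    """Assumed read pattern: in round t, p_i reads p_{(i + 2^(t-1)) mod n}."""
    return lambda i, t: (i + 2 ** (t - 1)) % n


n = 16
for t in range(5):
    sizes = [len(s) for s in simulate_influence(n, t, doubling_pattern(n))]
    assert max(sizes) <= 2 ** t              # I(t) <= 2^t
    print(t, max(sizes))
```

With this particular read pattern the influence sets double every round, so the bound is met with equality until every process is covered.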

11 Simple application: Computing OR
Consider the input configuration $C_0 = (0, 0, \ldots, 0, \ldots, 0)$
The size of the influence set of a process is $< n$ in all rounds $< \log n$
Some process $p_i$ is not in $\mathrm{INF}(p_1, \log n - 1)$
$\Rightarrow$ By the lemma, $p_1$ returns the same value in $C_0$ and in $C_1 = (0, 0, \ldots, 1, \ldots, 0)$, where the 1 is the input of $p_i$
$\Rightarrow$ A contradiction

12 Application: Approximate agreement
For a small $\varepsilon > 0$
Processes start with inputs in $[0,1]$
Each must decide on an output in $[0,1]$ such that
▫ All outputs are within $\varepsilon$ of each other (agreement)
▫ If all inputs are $v$, the output is $v$ (validity)
The system is asynchronous, and a process must decide even if it runs by itself (solo termination)

13 Application: Approximate agreement [Attiya, Shavit, Lynch]
Consider the input configuration $C_0 = (0, 0, \ldots, 0)$
Run all processes to completion from $C_0$: they must decide 0
If the number of rounds $T < \log n$ $\Rightarrow$ $I(T) < n$ $\Rightarrow$ $\exists$ process $p_i \notin \mathrm{INF}(p_1, T)$

14 Approximate agreement (cont.)
Consider two input configurations
$C_0 = (0, \ldots, 0)$
$C_1 = (0, \ldots, 1, \ldots, 0)$, where the 1 is the input of $p_i$
From $C_1$, run $p_i$ to completion: it must decide 1
$p_i \notin \mathrm{INF}(p_1, T)$ $\Rightarrow$ $p_1$ still decides 0 when running from this configuration, contradicting agreement
Theorem: Solo-terminating approximate agreement requires $\Omega(\log n)$ rounds in a synchronous failure-free run

15 Approximate agreement (cont.)
Consider two input configurations
$C_0 = (0, \ldots, 0)$
$C_1 = (0, \ldots, 1, \ldots, 0)$, where the 1 is the input of $p_i$
From $C_1$, run $p_i$ to completion: it must decide 1
$p_i \notin \mathrm{INF}(p_1, T)$ $\Rightarrow$ $p_1$ still decides 0 when running from this configuration, contradicting agreement
Theorem: Solo-terminating approximate agreement requires $\Omega(\log n)$ rounds in a synchronous failure-free run
This is the overhead of solo termination in “nice” runs: without that requirement, a synchronous algorithm can solve the problem in one round.

16 With multi-writer registers
The previous theorem does not hold
There is a wait-free approximate agreement algorithm that takes O(1) rounds in “nice” executions [Schenk]
Even simpler: an O(1) OR algorithm

17 With multi-writer registers
The previous theorem does not hold
There is a wait-free approximate agreement algorithm that takes O(1) rounds in “nice” executions [Schenk]
Even simpler: an O(1) OR algorithm
▫ Only a few initial configurations to distinguish between
▫ Can you find it?
Overhead of single-writer registers: this separates single-writer and multi-writer registers

18 Information flow with multi-writer registers
The previous argument does not hold
Instead, consider how learning more information allows a process to differentiate between input configurations
Capture this as a partitioning of process states and memory values [Beame]
[Figure: example input configurations $(0, \ldots, 0)$, $(0, \ldots, 1, \ldots, 0)$, $(1, \ldots, 1, \ldots, 0)$, $(0, \ldots, 0, \ldots, 1)$]

19 Multi-writer registers: Ordering events
Within each round, put all reads, then put all writes
$\Rightarrow$ Reads obtain the value written at the end of the previous round

20 Partitioning into equivalence classes
For process $p$ and round $t$, two input configurations are in the same equivalence class of $P(p,t)$ if $p$ is in the same state after $t$ rounds from both (in a synchronous failure-free execution)
$P(t)$: the number of classes after $t$ rounds (max over $p$)
$V(R,t)$, $V(t)$: defined similarly for locations $R$
Lemma: $P(t) \le P(t-1)\,V(t-1)$ and $V(t) \le n\,P(t-1) + V(t-1)$
$\Rightarrow P(t), V(t) \le (4n+2)^{2^{t-2}}$

21 Application: The collect problem
update(v) stores v as the latest value of a process
collect() returns a set of values (one per process)
When each process initially stores one of two values
$\Rightarrow$ There are $2^n$ possible input configurations, each leading to a different output
The previous lemma implies $(4n+2)^{2^{t-2}} \ge P(t) \ge 2^n$
$\Rightarrow$ Must have $\Omega(\log n)$ rounds
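
The last implication can be spelled out as follows. This is a worked step added for clarity, assuming logarithms base 2 and $t \ge 2$ so that the exponent $2^{t-2}$ is an integer:

```latex
\begin{align*}
(4n+2)^{2^{t-2}} \ge 2^{n}
  &\iff 2^{t-2}\,\log_2(4n+2) \ge n \\
  &\iff t \ge 2 + \log_2 \frac{n}{\log_2(4n+2)} .
\end{align*}
```

Since $\log_2(4n+2) = O(\log n)$, the right-hand side is $\log_2 n - O(\log\log n)$, which is $\Omega(\log n)$.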

22 Also for other primitives (CAS)
Non-reading CAS:
CAS(R, old, new) {
  if R == old then
    R = new
    return success
  else
    return fail
}
Reading CAS returns the old value (can be handled, but we won’t do that)
Can also extend to non-reading kCAS

23 Careful with CAS
More information flow in a sequence of steps
Initially, R == 0
cas(R,0,1) cas(R,1,2) ... cas(R,n−1,n)
On the other hand
cas(R,n−1,n) cas(R,n−2,n−1) ... cas(R,0,1)
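
The difference between the two orders can be seen by running them. The sketch below is an illustration only: `run_cas_sequence` is a hypothetical helper that applies non-reading CAS operations one after another to a single register.

```python
from typing import List, Tuple


def run_cas_sequence(initial: int,
                     ops: List[Tuple[int, int]]) -> Tuple[int, List[bool]]:
    """Apply non-reading CAS operations (old, new) in order to one register."""
    value = initial
    results = []
    for old, new in ops:
        if value == old:
            value = new
            results.append(True)     # success
        else:
            results.append(False)    # fail, register unchanged
    return value, results


n = 5
ascending = [(k, k + 1) for k in range(n)]   # cas(R,0,1), cas(R,1,2), ..., cas(R,n-1,n)
descending = list(reversed(ascending))       # cas(R,n-1,n), ..., cas(R,0,1)

final_up, results_up = run_cas_sequence(0, ascending)
final_down, results_down = run_cas_sequence(0, descending)
print(final_up, results_up)
print(final_down, results_down)
```

In the ascending order every operation succeeds, so the final value depends on all n operations. In the descending order only cas(R,0,1) succeeds.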

24 Ordering events within a round
Put all reads first.
Put all writes last.
For every register R whose current value is v, consider all CAS events:
▫ Put all events with old $\ne$ v: all fail
▫ Put all events with old == v: only the first succeeds (assumes operations are non-degenerate)
This allows proving a lemma analogous to the one for multi-writer registers (with different constants)
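
A minimal sketch of this ordering rule for the CAS events on one register, assuming the events are scheduled in the order the two bullets list them. The event encoding `(process name, old, new)` and the function name `order_and_apply` are hypothetical, and operations are assumed non-degenerate (old differs from new).

```python
from typing import List, Tuple

Cas = Tuple[str, int, int]   # (process name, old, new): hypothetical encoding


def order_and_apply(v: int, events: List[Cas]) -> Tuple[int, List[Tuple[str, bool]]]:
    """Order one round's CAS events on a register holding v, then apply them."""
    failing = [e for e in events if e[1] != v]    # old != v: scheduled first, all fail
    matching = [e for e in events if e[1] == v]   # old == v: scheduled afterwards
    value = v
    outcome = []
    for name, old, new in failing + matching:
        ok = (value == old)
        if ok:
            value = new
        outcome.append((name, ok))
    return value, outcome


final, outcome = order_and_apply(0, [("p1", 0, 1), ("p2", 1, 2), ("p3", 0, 3)])
print(final, outcome)
```

Under this order at most one CAS per register succeeds in a round, which keeps a chain such as cas(R,0,1), cas(R,1,2) from passing information along within a single round.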

25 Information Flow with Bounded Fan-In
Arbitrary objects, but bounded contention
▫ Not too many processes access the same base object simultaneously
Isolate processes in a Q-independent execution
▫ Only processes in Q take steps
▫ Access only objects not modified by processes in Q
$\Rightarrow$ For a process $p \in Q$, a Q-independent execution is indistinguishable from a p-solo execution

26 Constructing independent executions
Lemma: For any algorithm using only objects with contention $\le w$ and every $t \ge 0$, there is a $t$-round $Q_t$-independent execution, with $|Q_t| \ge n/(w+2)^t$
Proof by induction, with a trivial base case.
Induction step: consider a $Q_t$-independent execution. Look at the next steps that processes in $Q_t$ are about to perform, and construct an undirected graph $(V,E)$
We use the following result from graph theory.
Turán's theorem: Any graph $(V,E)$ has an independent set of size $|V|^2/(|V|+2|E|)$

27 Induction step: The graph
$V = Q_t$
$E$ contains an edge $\{p_i, p_j\}$ if
▫ $p_i$ and $p_j$ access the same object, or
▫ $p_i$ is about to read an object modified by $p_j$, or
▫ $p_j$ is about to read an object modified by $p_i$
$|E| \le |Q_t|(w+1)/2$
Turán's theorem and the inductive hypothesis $\Rightarrow$ there is an independent set $Q_{t+1}$ of size $\ge |Q_t|^2/(|Q_t| + |Q_t|(w+1)) = |Q_t|/(w+2) \ge n/(w+2)^{t+1}$
Omit all steps of $Q_t \setminus Q_{t+1}$ from the execution to get a $Q_{t+1}$-independent execution
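
The slides use Turán's theorem only as an existence statement. As an illustration, the sketch below finds an independent set with the standard min-degree greedy heuristic and checks the bound $|V|^2/(|V|+2|E|)$ on one example. The function name and the adjacency-dictionary encoding are assumptions, and the check is an assertion on this example, not a proof.

```python
from typing import Dict, Set


def greedy_independent_set(adj: Dict[int, Set[int]]) -> Set[int]:
    """Min-degree greedy: repeatedly pick a vertex of minimum degree,
    then delete it and its neighbours."""
    graph = {v: set(nbrs) for v, nbrs in adj.items()}      # work on a copy
    chosen: Set[int] = set()
    while graph:
        v = min(graph, key=lambda u: (len(graph[u]), u))   # ties broken by label
        chosen.add(v)
        removed = graph[v] | {v}
        for u in removed:
            graph.pop(u, None)
        for nbrs in graph.values():
            nbrs -= removed
    return chosen


# Example: a 6-cycle has |V| = 6 and |E| = 6, so the bound is 36 / 18 = 2.
cycle = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
s = greedy_independent_set(cycle)
num_v = len(cycle)
num_e = sum(len(nbrs) for nbrs in cycle.values()) // 2
assert all(cycle[u].isdisjoint(s) for u in s)              # s is independent
assert len(s) * (num_v + 2 * num_e) >= num_v ** 2          # |S| >= |V|^2 / (|V| + 2|E|)
```

In the lower-bound proof the vertices would be the processes of $Q_t$ and the chosen set would play the role of $Q_{t+1}$.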

28 Application: Weak Test&Set
Weak test&set: like test&set, but at most one success
Take $t$ such that $(w+2)^t < n$
The lemma gives a $t$-round $\{p_i, p_j\}$-independent execution
Each of $p_i$ and $p_j$ seems to be running solo $\Rightarrow$ each must succeed $\Rightarrow$ Contradiction
Theorem: The solo step complexity of weak test&set is $\Omega(\log n / \log w)$
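
To get a feel for the bound, the sketch below computes the largest $t$ with $(w+2)^t < n$ and prints it next to $\log n / \log(w+2)$. The helper name `independent_rounds` is hypothetical, and $n \ge 2$ is assumed.

```python
import math


def independent_rounds(n: int, w: int) -> int:
    """Largest t with (w + 2)**t < n, so that n / (w + 2)**t > 1 (assumes n >= 2)."""
    t = 0
    while (w + 2) ** (t + 1) < n:
        t += 1
    return t


for n, w in [(1000, 8), (1024, 2), (10 ** 6, 2)]:
    t = independent_rounds(n, w)
    print(n, w, t, math.log(n) / math.log(w + 2))
```

For such a $t$ the lemma leaves more than one process in $Q_t$, hence at least two, which is what the contradiction needs.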

29 Part II Covering

30 Covering: The basic idea
Several processes write to the same location
Writes by early processes are lost if there is no read in between
$\Rightarrow$ Must write to distinct locations
$\Rightarrow$ Other processes must read these locations
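
A toy illustration of a lost write, assuming memory is modelled as a dictionary from location names to values. The `(process, location, value)` encoding and the helper name `apply_writes` are hypothetical.

```python
from typing import Dict, List, Tuple

Write = Tuple[str, str, int]   # (process, location, value): hypothetical encoding


def apply_writes(writes: List[Write]) -> Dict[str, int]:
    """Apply writes in order with no reads in between; return the final memory."""
    memory: Dict[str, int] = {}
    for _process, location, value in writes:
        memory[location] = value       # a later write overwrites an earlier one
    return memory


with_p1 = apply_writes([("p1", "R", 7), ("p2", "R", 9)])
without_p1 = apply_writes([("p2", "R", 9)])
assert with_p1 == without_p1            # p1's write to R left no trace

distinct = apply_writes([("p1", "R1", 7), ("p2", "R2", 9)])
assert distinct == {"R1": 7, "R2": 9}   # both writes survive
```

A later reader of the first memory cannot tell whether p1 ever took a step, which is the indistinguishability that covering arguments exploit.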

31 Max Register
WriteMax(v,R) operation
ReadMax operation op returns the maximal value written by a WriteMax operation that
▫ completed before op started, or
▫ overlaps op
Special case of a linearizable object

32 Lower bound for ReadMax operation [Jayanti, Tan, Toueg]
Theorem: ReadMax must read n different registers.
The proof is constructive
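
For comparison, here is a simple implementation in which ReadMax reads exactly n registers. It is a sequential sketch that is not taken from the slides: the class name `CollectMaxRegister` is hypothetical, values are assumed to be non-negative integers, and concurrency is not modelled.

```python
from typing import List


class CollectMaxRegister:
    """Sequential sketch: one single-writer register per process."""

    def __init__(self, n: int) -> None:
        self.registers: List[int] = [0] * n   # register i is written only by p_i
        self.reads = 0                        # counts register reads made by read_max

    def write_max(self, i: int, v: int) -> None:
        """WriteMax by process p_i: keep the largest value p_i has written."""
        if v > self.registers[i]:
            self.registers[i] = v

    def read_max(self) -> int:
        """ReadMax: read all n registers and return the largest value."""
        best = 0
        for value in self.registers:
            self.reads += 1
            best = max(best, value)
        return best


reg = CollectMaxRegister(4)
reg.write_max(2, 5)
reg.write_max(0, 3)
print(reg.read_max(), reg.reads)
```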

33 Construction for the lower bound
Proof by induction on k = 0, …, n: there is an execution $\alpha_k \beta_k \gamma_k$ in which
▫ $\alpha_k$: $p_1 \ldots p_k$ perform WriteMax operations
▫ $\beta_k$: writes by $p_1 \ldots p_k$ to $R_1 \ldots R_k$
▫ $\gamma_k$: $p_n$ performs a ReadMax operation, reads $R_1 \ldots R_k$
Base case is simple
Taking k = n yields the result

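34 Inductive Step
[Diagram, first execution] $\alpha_k$: $p_1 \ldots p_k$ perform WriteMax operations; $\beta_k$: writes by $p_1 \ldots p_k$ to $R_1 \ldots R_k$; $\gamma_k$: $p_n$ performs a ReadMax operation
[Diagram, second execution] $\alpha_k$; then $p_{k+1}$ performs WriteMax operations; $\beta_k$: writes by $p_1 \ldots p_k$ to $R_1 \ldots R_k$; $\gamma_k$: $p_n$ performs a ReadMax operation
▫ $p_{k+1}$ must write to some $R \notin \{R_1, \ldots, R_k\}$
▫ Otherwise $p_n$ does not observe $p_{k+1}$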

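35 Inductive Step
[Diagram, first execution] $\alpha_k$: $p_1 \ldots p_k$ perform WriteMax operations; $\beta_k$: writes by $p_1 \ldots p_k$ to $R_1 \ldots R_k$; $\gamma_k$: $p_n$ performs a ReadMax operation
[Diagram, second execution] $\alpha_k$; $\pi_k$: $p_{k+1}$ performs WriteMax operations; $\beta_k$: writes by $p_1 \ldots p_k$ to $R_1 \ldots R_k$; $\gamma_k$: $p_n$ performs a ReadMax operation
▫ $p_{k+1}$ must write to some $R \notin \{R_1, \ldots, R_k\}$
▫ $p_n$ must read some $R \notin \{R_1, \ldots, R_k\}$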

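36 Inductive Step
[Diagram, first execution] $\alpha_k$: $p_1 \ldots p_k$ perform WriteMax operations; $\beta_k$: writes by $p_1 \ldots p_k$ to $R_1 \ldots R_k$; $\gamma_k$: $p_n$ performs a ReadMax operation
[Diagram, second execution] $\alpha_k$; $\pi_k$: $p_{k+1}$ performs WriteMax operations; $\beta_k$: writes by $p_1 \ldots p_k$ to $R_1 \ldots R_k$; $\gamma_k$: $p_n$ performs a ReadMax operation
▫ $p_{k+1}$: write to $R_{k+1}$
Claim follows with $R_1 \ldots R_k, R_{k+1}$ and $\alpha_{k+1} = \alpha_k \pi_k$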

37 Swap objects
The theorem holds for other primitives and objects, e.g., (register-to-memory) swap
swap(R, v) {
  tmp = R
  R = v
  return tmp
}
Need some care in constructing $\pi_k$, $\gamma_k$

38 Result holds also for other objects
E.g., counters
The constructed execution contains many increment operations
Better algorithms when
▫ Few increment operations
▫ Max register holds bounded values
[Aspnes, Attiya, Censor-Hillel]

39 Counters with CAS
Counters can be implemented with a single location R, and a single CAS per operation:
To increment, simply:
▫ read the previous value from R
▫ CAS that value + 1 into R
To read the counter, simply read R
Lots of contention on R!
This is inevitable
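
A minimal sketch of such a counter. Python has no hardware CAS, so `CasRegister` is a hypothetical class that simulates an atomic compare-and-swap with a lock. The retry loop in `increment` is an assumption added here: the slide only lists the read and the CAS, and a failed CAS has to be retried for the increment to take effect.

```python
import threading


class CasRegister:
    """Register with an atomic compare-and-swap, simulated with a lock."""

    def __init__(self, value: int = 0) -> None:
        self._value = value
        self._lock = threading.Lock()

    def read(self) -> int:
        with self._lock:
            return self._value

    def cas(self, old: int, new: int) -> bool:
        with self._lock:
            if self._value == old:
                self._value = new
                return True
            return False


def increment(r: CasRegister) -> None:
    while True:                        # retry until the CAS succeeds (assumption)
        seen = r.read()                # read the previous value from R
        if r.cas(seen, seen + 1):      # CAS that value + 1 into R
            return


def read_counter(r: CasRegister) -> int:
    return r.read()


def increment_many(r: CasRegister, times: int) -> None:
    for _ in range(times):
        increment(r)


counter = CasRegister()
workers = [threading.Thread(target=increment_many, args=(counter, 500))
           for _ in range(4)]
for th in workers:
    th.start()
for th in workers:
    th.join()
assert read_counter(counter) == 4 * 500
```

Every increment and every read touches the same register R, which is the contention the slide points at.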

40 The memory stalls measure [Dwork, Herlihy, Waarts]
If k processes access (or modify) the same location at the same configuration
▫ The first process incurs one step, and no stalls
▫ The second process incurs one step, and one stall
▫ …
▫ The k’th process incurs one step, and k-1 stalls
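
The accounting above is easy to tabulate. In this sketch the function names `stalls_per_process` and `total_cost` are hypothetical. They charge the i-th process i − 1 stalls and sum steps plus stalls over all k processes.

```python
from typing import List


def stalls_per_process(k: int) -> List[int]:
    """Stalls charged to the 1st, 2nd, ..., k-th process on one location."""
    return list(range(k))                     # the i-th process incurs i - 1 stalls


def total_cost(k: int) -> int:
    """Total steps plus stalls over all k processes."""
    return k + sum(stalls_per_process(k))     # k steps + k(k-1)/2 stalls


for k in (1, 2, 4, 8):
    print(k, stalls_per_process(k), total_cost(k))
```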

41 Lower bound on number of stalls
Theorem: ReadCounter must incur n stalls + steps.
Similar construction as in the previous theorem:
▫ $p_1 \ldots p_k$ perform Increment operations
▫ $p_1 \ldots p_k$ poised on $R_1 \ldots R_m$, $m \le k$
▫ $p_n$ performs a ReadCounter operation, accesses $R_1 \ldots R_m$

42 Lower bound on number of stalls
Theorem: ReadCounter must incur n stalls + steps.
Similar construction as in the previous theorem:
▫ $p_1 \ldots p_k$ perform Increment operations
▫ $p_1 \ldots p_k$ poised on $R_1 \ldots R_m$, $m \le k$
▫ $p_n$ performs a ReadCounter operation, accesses $R_1 \ldots R_k$, incurs $k$ stalls + steps

43 Wrap-up
There are many lower bound results
But fewer techniques…
Some results & techniques are relevant to questions asked in Transform
Material is based on a monograph-in-writing with Faith Ellen
▫ Let me know if you want to proof-read it!

