CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS Spring 2014 Prof. Jennifer Welch CSCE 668 Set 19: Asynchronous Solvability 1.

1 CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS Spring 2014 Prof. Jennifer Welch CSCE 668 Set 19: Asynchronous Solvability 1

2 Problems Solvable in Failure-Prone Asynchronous Systems
• Although consensus is not solvable in failure-prone asynchronous systems (with either message passing or read/write shared memory), some interesting problems are solvable:
  - set consensus and approximate agreement: weakenings of consensus
  - renaming: the "opposite" of consensus
  - k-exclusion: a fault-tolerant variant of mutex

3 Model Assumptions
• asynchronous
• shared memory with read/write registers
  - heavy use of atomic snapshot objects
• at most f crash failures of procs.
• results can be translated to message passing if f < n/2 (cf. Chapter 10)
• may be a few asides into message passing

4 Set Consensus Motivation
• By judiciously weakening the definition of the consensus problem, we can overcome the asynchronous impossibility.
• We've already seen a weakening of consensus:
  - weaker termination condition for randomized algorithms
• How about weakening the agreement condition?
• One weakening is to allow more than one decision value:
  - allow a set of decisions

5 Set Consensus Definition
Termination: Eventually, each nonfaulty processor decides.
k-Agreement: The number of different values decided on by nonfaulty processors is at most k.
Validity: Every nonfaulty processor decides on a value that is the input of some processor.

6 Set Consensus Algorithm
• Uses a shared atomic snapshot object X
  - can be implemented with read/write registers
• update your segment of X with your input
• repeatedly scan X until there are at least n - f nonempty segments
• decide on the minimum value appearing in any segment
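The steps above can be sketched as a small single-threaded simulation. This is only an illustration under stated assumptions, not the course's implementation: the snapshot object is modeled as a plain Python list whose updates and scans are atomic because one step runs at a time, the scheduler is a seeded random choice, and crashed procs are modeled (hypothetically) as procs that never take a step. The names `set_consensus`, `crashed`, and `seed` are invented for this sketch.

```python
import random

def set_consensus(inputs, f, crashed=frozenset(), seed=0):
    """Simulate one execution; return {proc index: decision} for the live procs."""
    n = len(inputs)
    assert len(crashed) <= f, "the algorithm tolerates at most f crashes"
    rng = random.Random(seed)
    X = [None] * n                      # snapshot object; None marks an empty segment
    live = [i for i in range(n) if i not in crashed]
    decided = {}
    while len(decided) < len(live):
        # The scheduler lets any live, undecided proc take one atomic step.
        i = rng.choice([p for p in live if p not in decided])
        if X[i] is None:
            X[i] = inputs[i]            # update: write own input to own segment
        else:
            scan = [v for v in X if v is not None]   # atomic scan of X
            if len(scan) >= n - f:
                decided[i] = min(scan)  # decide the minimum value seen
    return decided
```

With f = 2, every simulated execution should produce at most f + 1 = 3 distinct decisions, each of which is some proc's input.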

7 Correctness of Set Consensus Algorithm
• Termination: at most f procs crash, so at least n - f segments of X eventually become nonempty.
• Validity: every decision is some proc's input.
• Why does k-agreement hold?
  - We'll show it does as long as k > f.
  - Sanity check: When k = 1, we have standard consensus. As long as there are fewer than 1 failures (that is, none), we can solve the problem.

8 k-Set Agreement Condition
• Let S be the set of minimum values in the final scans of the nonfaulty procs; these are the nonfaulty decisions.
• Suppose in contradiction |S| > f + 1.
• Let v be the largest value in S, the decision of p_i.
• Then p_i's final scan misses at least f + 1 values (the smaller elements of S), so it contains fewer than n - f nonempty segments, contradicting the code.
• Hence |S| ≤ f + 1 ≤ k.

9 Synchronous vs. Asynchronous?
• How does the previous asynchronous algorithm compare to the synchronous (message-passing) algorithm for k-set consensus from the Chapter 5 homework?
• Recall that the synchronous algorithm runs in f/k + 1 rounds.

10 Set Consensus Lower Bound
Theorem: There is no asynchronous algorithm for solving k-set consensus in the presence of f failures, if f ≥ k.
• Straightforward extensions of the consensus impossibility result fail; even proving the existence of an initial bivalent configuration is quite involved.
• The original proof of set-consensus impossibility used concepts from algebraic topology.
• The textbook's proof uses more elementary machinery, but is still very involved.

11 Approximate Agreement Motivation
• An alternative way to weaken the agreement condition for consensus:
  - Require that the decisions be close to each other, but not necessarily equal
• Seems appropriate for continuous-valued problems (as opposed to discrete)

12 Approximate Agreement Definition
Termination: Eventually, each nonfaulty processor decides.
ε-Agreement: All nonfaulty decisions are within ε of each other.
Validity: Every nonfaulty decision is within the range of the input values.

13 Approximate Agreement Algorithm
• Assume procs know the range from which input values are drawn:
  - let D be the length of this range
• wait-free: up to n - 1 procs can fail
• algorithm is structured as a series of "asynchronous rounds":
  - exchange values via a snapshot object, one per round
  - compute midpoint for next round
• continue until spread of values is within ε, which requires about log_2(D/ε) rounds

14 Approximate Agreement Algorithm
Shared atomic snapshot objects ASO[1], ASO[2], ...
Initially local variable v = p_i's input
Initially local variable r = 1
while true do
  1. update p_i's segment of ASO[r] to be v
  2. let scan be set of values obtained by scanning ASO[r]
  3. v := midpoint(scan)
  4. if r = ⌈log_2(D/ε)⌉ + 1 then decide v and terminate
  5. else r++
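A minimal simulation of the numbered steps above, offered as a sketch rather than a reference implementation. Assumptions that are not in the slides: `midpoint` is taken to be the midpoint of the range spanned by the scanned values, the round bound is read as a ceiling, D ≥ ε so that the bound is at least 1, no proc crashes, and the interleaving is chosen by a seeded random scheduler. The names `approx_agreement`, `wrote`, and `seed` are invented here.

```python
import math
import random

def midpoint(values):
    # Assumption: midpoint of the range spanned by the values.
    return (min(values) + max(values)) / 2

def approx_agreement(inputs, D, eps, seed=0):
    """Simulate one failure-free execution; return {proc index: decision}."""
    n = len(inputs)
    M = math.ceil(math.log2(D / eps)) + 1       # number of asynchronous rounds
    rng = random.Random(seed)
    aso = [[None] * n for _ in range(M + 1)]    # aso[1..M]; index 0 is unused
    v = list(inputs)                            # current estimate of each proc
    r = [1] * n                                 # current round of each proc
    wrote = [False] * n                         # has proc i updated aso[r[i]] yet?
    decided = {}
    while len(decided) < n:
        i = rng.choice([p for p in range(n) if p not in decided])
        if not wrote[i]:
            aso[r[i]][i] = v[i]                 # step 1: update own segment
            wrote[i] = True
        else:
            scan = [x for x in aso[r[i]] if x is not None]   # step 2: scan
            v[i] = midpoint(scan)               # step 3
            if r[i] == M:                       # step 4: last round, decide
                decided[i] = v[i]
            else:                               # step 5: move to next round
                r[i] += 1
                wrote[i] = False
    return decided
```

For example, `approx_agreement([0, 100, 37, 64], D=100, eps=0.5)` runs 9 rounds per proc, and the decisions it returns should lie in [0, 100] and within 0.5 of each other.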

15 Analysis of Approx. Agreement Alg.
Definitions w.r.t. a particular execution:
• M = ⌈log_2(D/ε)⌉ + 1
• U_0 = set of input values
• U_r = set of all values ever written to ASO[r]

16 Helpful Lemma
Lemma (16.8): Consider any round r < M. Let u be the first value written to ASO[r]. Then the values written to ASO[r+1] are in the range [(min(U_r) + u)/2, (max(U_r) + u)/2].
[Figure: number line marking min(U_r), (min(U_r) + u)/2, u, (max(U_r) + u)/2, and max(U_r); the elements of U_{r+1} lie between the two midpoints.]
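One way to see why the lemma holds, sketched under the assumption that midpoint(scan) means (min(scan) + max(scan))/2: every proc scans ASO[r] only after its own update to ASO[r], so every scan contains u, the first value written, and every scanned value belongs to U_r. The numbers in the example are illustrative and do not come from the slides.

```latex
% Every scan contains u and is a subset of U_r:
\min(U_r) \le \min(\mathit{scan}) \le u \le \max(\mathit{scan}) \le \max(U_r)

% Hence the midpoint written to ASO[r+1] satisfies:
\frac{\min(U_r) + u}{2}
  \;\le\; \frac{\min(\mathit{scan}) + \max(\mathit{scan})}{2}
  \;\le\; \frac{u + \max(U_r)}{2}

% Example: U_r = \{0, 4, 10\},\ u = 4 gives the interval [2, 7],
% whose length 5 is half of \mathrm{spread}(U_r) = 10.
```

The interval always has length (max(U_r) - min(U_r))/2, whatever the value of u.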

17 Implications of Lemma
• The range of values written to the ASO object for round r + 1 is contained within the range of values written to the ASO object for round r:
  - range(U_{r+1}) ⊆ range(U_r)
• The spread (max - min) of values written to the ASO object for round r + 1 is at most half the spread of values written to the ASO object for round r:
  - spread(U_{r+1}) ≤ spread(U_r)/2

18 Correctness of Algorithm
• Termination: Each proc executes M asynchronous rounds.
• Validity: The range at each round is contained in the range at the previous round.
• ε-Agreement: spread(U_M) ≤ spread(U_0)/2^M ≤ D/2^M ≤ ε
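The last inequality in the ε-agreement chain follows from the choice of M. A short derivation, assuming M is defined with a ceiling as M = ⌈log_2(D/ε)⌉ + 1; the values D = 100 and ε = 0.5 are illustrative only:

```latex
M = \left\lceil \log_2 \frac{D}{\varepsilon} \right\rceil + 1
  \;\ge\; \log_2 \frac{D}{\varepsilon} + 1
\;\Longrightarrow\;
2^{M} \ge \frac{2D}{\varepsilon}
\;\Longrightarrow\;
\frac{D}{2^{M}} \le \frac{\varepsilon}{2} \le \varepsilon

% Example: D = 100,\ \varepsilon = 0.5:
% M = \lceil \log_2 200 \rceil + 1 = 8 + 1 = 9, \qquad
% D / 2^{M} = 100 / 512 \approx 0.195 \le 0.5
```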

19 Handling Unknown Input Range
• Range might not be known.
• Actual range in an execution might be much smaller than maximum possible range.
• First idea: have a preprocessing phase in which procs try to determine the input range
  - but asynchrony and possible failures make this approach problematic
• Instead…

20 Handling Unknown Input Range
• Use just one atomic snapshot object
• Dynamically recalculate how many rounds are needed as more inputs are revealed
• Skip over rounds to try to catch up to maximum observed round
• Only consider values associated with maximum observed round
• Still use midpoint

21 Unknown Input Range Algorithm
shared atomic snapshot object A; initially all segments hold ⊥
update_i(A, [x, 1, x]), where x is p_i's input   // [original input, rd#, current estimate]
repeat
  scan A
  let S be spread of all inputs from scan (ignore ⊥ segments)
  if S = 0 then maxRound := 0 else maxRound := log_2(S/ε)
  let r_max be largest round from scan (ignore ⊥ segments)
  let values be set of estimates in segments with round number r_max
  update_i(A, [x, r_max + 1, midpoint(values)])
until r_max ≥ maxRound
decide midpoint(values)
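The local computation performed in one pass of the repeat loop can be sketched as a pure function of a scan result. This covers only the per-iteration arithmetic, not the full concurrent algorithm. Assumptions that are not in the slides: segments are modeled as `None` (for ⊥) or tuples `(input, round, estimate)`, `midpoint` is the midpoint of the range of the values, and the names `loop_body`, `new_segment`, and `done` are invented for this sketch.

```python
import math

def midpoint(values):
    # Assumption: midpoint of the range spanned by the values.
    return (min(values) + max(values)) / 2

def loop_body(scan, x, eps):
    """One loop iteration for a proc with input x, given its scan of A.

    Returns (segment to write back, whether the until-condition holds).
    The scan is assumed to include the proc's own non-empty segment.
    """
    segs = [seg for seg in scan if seg is not None]        # ignore ⊥ segments
    inputs = [inp for inp, _, _ in segs]
    S = max(inputs) - min(inputs)                          # spread of observed inputs
    max_round = 0 if S == 0 else math.log2(S / eps)
    r_max = max(rd for _, rd, _ in segs)                   # largest observed round
    values = [est for _, rd, est in segs if rd == r_max]   # estimates at round r_max
    new_segment = (x, r_max + 1, midpoint(values))
    done = r_max >= max_round
    return new_segment, done
```

For instance, a proc with input 8 that scans `[(0, 2, 0), (8, 2, 4), None]` with ε = 1 observes spread 8, so maxRound is 3; it writes back round 3 with estimate 2.0 and keeps looping because r_max = 2 is below 3.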

22 Analysis of Unknown Input Range Algorithm
Definitions w.r.t. a particular execution:
• U_0 = set of all input values
• U_r = set of all values ever written to A with round number r
• M = largest r s.t. U_r is not empty
With these changes, the correctness proof is similar to that for the known input range algorithm.

23 Key Differences in Proof
• Why does termination hold?
  - A proc's local maxRound variable can only increase if another proc wakes up and increases the spread of the observable inputs. This can happen at most n - 1 times.
• Why does ε-agreement hold?
  - If p_i's input is observed by p_j the last time p_j computes its maxRound, same argument as before.
  - Otherwise, when p_i wakes up, it ignores its own input and uses values from maxRound or later.

24 Renaming
• Procs start with unique names from a large domain
• Procs should pick new names that are still distinct but that are from a smaller domain
• Motivation: Suppose original names are serial numbers (many digits), but we'd like the procs to do some kind of time slicing based on their ids

25 Renaming Problem Definition
Termination: Eventually every nonfaulty proc p_i decides on a new name y_i.
Uniqueness: If p_i and p_j are distinct nonfaulty procs, then y_i ≠ y_j.
We are interested in anonymous algorithms: procs don't have access to their indices, just to their original names. Code depends only on your original name.

26 Performance of Renaming Algorithm
• New names should be drawn from {1, 2, …, M}.
• We would like M to be as small as possible.
• Uniqueness implies M must be at least n.
• Due to the possibility of failures, M will actually be larger than n.

27 Renaming Results
• Algorithm for wait-free case (f = n - 1) with M = n + f = 2n - 1.
• Algorithm for general f with M = n + f.
• Lower bound that M must be at least n + 1, for wait-free case.
  - Proof similar to impossibility of wait-free consensus
• Stronger lower bound that M must be at least 2n - 1, for wait-free case, if n satisfies a certain number-theoretic property
  - If n does not satisfy the property, there is a wait-free algorithm with M = 2n - 2. (includes n = 6, 10, 14, ...)

28 Wait-Free Renaming Algorithm
Shared atomic snapshot object A; initially all segments hold ⊥
s := 1   // suggestion for p_i's new name
while true do
  update p_i's segment of A to be [x, s], where x is p_i's original name
  scan A
  if s is also someone else's suggestion (per scan result) then
    let r be rank of x among original names of non-⊥ segments
    let s be r-th smallest positive integer not currently suggested by another proc
  else decide on s for new name and terminate
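The loop above can be exercised with a small single-threaded simulation. This is a sketch under stated assumptions, not the course's code: updates and scans are atomic because only one step runs at a time, the scheduler is a seeded random choice, no proc crashes, and segments are tuples `(original name, suggestion)` with `None` standing for ⊥. The list index is used only to locate a proc's own segment; ranks are computed from original names alone. The names `rename`, `rth_free`, and `must_update` are invented here.

```python
import random

def rth_free(r, taken):
    """Return the r-th smallest positive integer that is not in `taken`."""
    candidate, found = 0, 0
    while found < r:
        candidate += 1
        if candidate not in taken:
            found += 1
    return candidate

def rename(names, seed=0):
    """Simulate one failure-free execution; return {original name: new name}."""
    n = len(names)
    rng = random.Random(seed)
    A = [None] * n                # snapshot object; None stands for ⊥
    s = [1] * n                   # each proc's current suggestion
    must_update = [True] * n      # next step of proc i is an update (else a scan)
    decided = {}
    while len(decided) < n:
        i = rng.choice([p for p in range(n) if p not in decided])
        if must_update[i]:
            A[i] = (names[i], s[i])
            must_update[i] = False
        else:
            scan = list(A)        # atomic scan
            others = {seg[1] for j, seg in enumerate(scan)
                      if seg is not None and j != i}
            if s[i] in others:
                seen = sorted(seg[0] for seg in scan if seg is not None)
                r = seen.index(names[i]) + 1      # rank of own original name
                s[i] = rth_free(r, others)
                must_update[i] = True
            else:
                decided[i] = s[i]                 # no conflict: decide
    return {names[i]: decided[i] for i in range(n)}
```

Calling `rename([901, 17, 5000, 42, 7])` should return five distinct new names, all in {1, …, 9}, that is, within {1, …, 2n - 1} for n = 5.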

29 Analysis of Renaming Algorithm
Uniqueness: Suppose in contradiction p_i and p_j choose the same new name, s. Without loss of generality, p_i's last scan before deciding s precedes p_j's last scan before deciding s. p_i's last update before deciding, which suggests s, precedes p_i's last scan and therefore also p_j's last scan. So p_j's last scan sees s as p_i's suggestion, and p_j doesn't decide s, a contradiction.

30 Analysis of Renaming Algorithm
• New name space is {1, …, 2n - 1}.
• Why?
  - rank of a proc p_i's original name is at most n (the largest one)
  - worst case is when each of the n - 1 other procs has suggested a different new name for itself, so suggested names are {1, …, n - 1}.
  - Then p_i suggests n + n - 1 = 2n - 1.
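The worst-case argument above can be checked by brute force for small n. The sketch below enumerates every rank from 1 to n and every set of at most n - 1 names suggested by the other procs, then reports the largest name a proc could suggest. The helper names `rth_free` and `worst_case_name` are invented for this illustration, and restricting the other procs' suggestions to {1, …, 2n - 1} is an assumption that loses nothing, since larger suggestions never block a smaller free name.

```python
from itertools import combinations

def rth_free(r, taken):
    """Return the r-th smallest positive integer that is not in `taken`."""
    candidate, found = 0, 0
    while found < r:
        candidate += 1
        if candidate not in taken:
            found += 1
    return candidate

def worst_case_name(n):
    """Largest suggestion over all ranks 1..n and all sets of <= n-1 other suggestions."""
    universe = range(1, 2 * n)                 # candidate suggestions 1 .. 2n-1
    worst = 0
    for size in range(n):                      # the other procs suggest 0 .. n-1 names
        for taken in combinations(universe, size):
            for r in range(1, n + 1):
                worst = max(worst, rth_free(r, set(taken)))
    return worst
```

For n = 4, the maximum is reached with rank 4 and the other suggestions {1, 2, 3}: the free names are 4, 5, 6, 7, …, and the fourth of them is 7 = 2n - 1.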

31 Analysis of Renaming Algorithm
Termination: Suppose in contradiction some set T of nonfaulty procs never decide in some execution.
• Consider the suffix α of the execution in which
  - each proc in T has already done at least one update and
  - only procs in T take steps (others have either already crashed or decided).

32 Analysis of Renaming Algorithm
• Let F be the set of new names that are free (not suggested at the beginning of α by any proc not in T)
  - the trying procs need to choose new names from this set.
• Let z_1, z_2, … be the names in F in order.
• By the definition of α, no proc wakes up during α and reveals an additional original name, so all procs in T are working with the same set of original names during α.
• Let p_i be the proc in T whose original name has smallest rank (among this set of original names). Let r be this rank.

33 Analysis of Renaming Algorithm
• Eventually procs other than p_i stop suggesting z_r as a new name:
  - After α starts, every scan indicates a set of free names that is no larger than F.
  - Every trying proc other than p_i has a larger rank and thus continually suggests a new name for itself that is larger than z_r, once it does the first scan in α.

34 Analysis of Renaming Algorithm
• Eventually p_i does suggest z_r as its new name:
  - By choice of z_r as the r-th smallest free new name, and the fact that eventually other trying procs stop suggesting z_1 through z_r, eventually p_i sees z_r as the free name with r-th smallest rank.
• Contradicts assumption that p_i is trying (i.e., stuck).
• So termination holds.

35 General Renaming
• Suppose we know that at most f procs will fail, where f is not necessarily n - 1.
• We can use the wait-free algorithm, but it is wasteful in the size of the new name space, 2n - 1, if f < n - 1.
• We can do better (if f < n - 1) with a slightly different algorithm:
  - keep track in the snapshot object of whether you have decided
  - an undecided proc suggests a new name only if its original name is among the f + 1 lowest names of procs that have not yet decided.

36 k-Exclusion Problem
• A fault-tolerant version of mutual exclusion.
• Processors can fail by crashing, even in the critical section (stay there forever).
• Allow up to k processors to be in the critical section simultaneously.
• If fewer than k processors fail, then any nonfaulty processor that wishes to enter the critical section eventually does so.

37 k-Exclusion Algorithm
cf. paper by Afek et al. [5].

38 k-Assignment Problem
• A specialization of k-Exclusion to include:
  - Uniqueness: Each proc in the critical section has a variable called slot, which is an integer between 1 and m. If p_i and p_j are in the C.S. concurrently, then they have different slots.
• Models the situation in which there is a pool of identical resources, each of which must be used exclusively:
  - k is the number of procs that can be in the pool concurrently
  - m is the number of resources
• To handle failures, m should be larger than k

39 k-Assignment Algorithm Schema
k-assignment entry section:
  k-exclusion entry section
  renaming using m = 2k - 1 names
k-assignment exit section:
  k-exclusion exit section

40 k-Assignment Algorithm Schema
k-assignment entry section:
  k-exclusion entry section
  request-name for long-lived renaming using m = 2k - 1 names
k-assignment exit section:
  release-name for long-lived renaming using m = 2k - 1 names
  k-exclusion exit section

