Self-stabilizing Overlay Networks. Sukumar Ghosh, University of Iowa. Work in progress, jointly with Andrew Berns and Sriram Pemmaraju. (Talk at Michigan Technological University)

1 Self-stabilizing Overlay Networks. Sukumar Ghosh, University of Iowa. Work in progress, jointly with Andrew Berns and Sriram Pemmaraju. (Talk at Michigan Technological University)

2

3 On Thursday, 16 August 2007, Skype had an outage. (Skype is known to be a “self-healing” overlay network.) Skype’s explanation: The disruption was triggered by a massive restart of users’ computers across the globe within a very short timeframe, as they re-booted after receiving a routine set of patches through Windows Update.

4 Overlay Network. A logical network laid on top of the Internet. [Figure: nodes A, B, and C attached to the Internet, joined by logical link AB and logical link BC.]

5 The Formal Model. Let V be a set of nodes. The function id : V → Z⁺ assigns a unique id to each node in V, and the function rs : V → {0, 1}* assigns a random bit string to each node in V. A family of overlay networks is a function ON : F → G, where F is the set of all triples λ = (V, id, rs) and G is the set of all directed graphs. The family of overlay networks associates a unique directed graph ON(λ) ∈ G with each labeled set λ = (V, id, rs) of nodes.

6 Structured vs. Unstructured Overlay Networks. Unstructured: no restriction on network topology. Examples: Gnutella, Kazaa, BitTorrent, Skype, etc. Structured: network topology satisfies specific invariants. Examples: Chord, CAN, Pastry, Skip Graph, etc.

7 The Challenge Can an overlay network restore its correct functionality from an arbitrary initial configuration? Bad configurations can be caused by failures, perturbations, selfish actions, malicious attacks.

8 Autonomic Systems Self-management is the holy grail of all complex dynamic systems.

9 Self-stabilizing systems (Convergence) Recover from any arbitrary initial configuration to a legal configuration in a bounded number of steps, and (Closure) remain in the legal configuration thereafter, until another failure or perturbation occurs.

10 Self-stabilizing Overlay Networks Can an overlay network restore its topology from an arbitrary initial configuration? Does it make sense in unstructured networks? Does it make sense in structured networks?

11 Related work. Self-stabilizing and Byzantine-tolerant overlay network, OPODIS 2007 [Dolev, Hoch, van Renesse]. A distributed polylog time algorithm for self-stabilizing SKIP graph, PODC 2009 [Jacob, Richa, Scheideler et al.]. Linearization: locally self-stabilizing sorting in graphs, ALENEX 2007 [Onus, Richa, Scheideler].

12 Example: Linearization. The ideal topology is a sorted list. The goal is to spontaneously recover to the ideal topology from an arbitrary connected topology (Onus, Richa, Scheideler, ALENEX 2007). [Figure: an arbitrary connected topology over numbered nodes and, below it, the nodes arranged as a sorted list.]

13 Self-stabilizing algorithm: Linearization. Left and right neighbors: w is a left neighbor of node u if {u, w} ∈ E and w < u; w is a right neighbor of node u if {u, w} ∈ E and u < w. [Figure: node u = 10 with left neighbors w1 = 2, w2 = 3, w3 = 6, w4 = 8 and right neighbors v1 = 19, v2 = 28, v3 = 30, v4 = 35.]

14 Self-stabilizing algorithm: Linearization. (The Algorithm) In each round, each node converts its left neighbors into a sorted list and converts its right neighbors into a sorted list. Takes at most (n-2) rounds. [Figure: node u = 10 with left neighbors 2, 3, 6, 8 and right neighbors 19, 28, 30, 35.] Slide borrowed from Onus et al.
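
The round described on this slide can be sketched in Python. This is a simplified synchronous variant written for illustration, not the authors' code: the function names `linearize_round` and `linearize`, the edge-set representation, and the `max_rounds` cap are assumptions. In each round every node chains its sorted left neighbors, itself, and its sorted right neighbors, and the union of those chains becomes the next edge set. The number of rounds this variant needs may differ from the (n-2) bound quoted on the slide.

```python
def linearize_round(edges):
    """One synchronous round: every node turns its left and right
    neighbors into sorted chains, all acting on the same snapshot."""
    nbrs = {}
    for a, b in edges:
        nbrs.setdefault(a, set()).add(b)
        nbrs.setdefault(b, set()).add(a)
    new_edges = set()
    for u, ns in nbrs.items():
        left = sorted(w for w in ns if w < u)
        right = sorted(v for v in ns if v > u)
        chain = left + [u] + right  # w1 < ... < wk < u < v1 < ... < vm
        # Keep only edges between consecutive members of the chain.
        new_edges.update(zip(chain, chain[1:]))
    return new_edges


def linearize(edges, max_rounds=1000):
    """Repeat rounds until the edge set stops changing (assumed cap)."""
    current = {tuple(sorted(e)) for e in edges}
    for _ in range(max_rounds):
        nxt = linearize_round(current)
        if nxt == current:
            return current
        current = nxt
    raise RuntimeError("did not reach a fixed point within max_rounds")


# Hypothetical connected start: the path 1 - 5 - 2 - 4 - 3.
result = linearize([(1, 5), (5, 2), (2, 4), (4, 3)])
```

Because every new edge joins two nodes that shared a common neighbor, a connected topology stays connected from round to round in this sketch.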

15 Evolution of Skip Graph (Aspnes, Shah, SODA 2003). [Figure: the nodes arranged in a single linked list.] Search time is O(n) hops.

16 SKIP Graph. Node degree = O(log n), diameter = O(log n), number of levels = O(log n). Search time now is O(log n) hops. [Figure: a skip graph with Level 0, Level 1, and Level 2; each node carries a bit string, and the lists are labeled 0 and 1 at Level 1 and 00, 01, 10, 11 at Level 2.]
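
A small Python sketch of how the levels can be derived from the bit strings, assuming the usual skip graph rule that nodes sharing the first i bits form one list at level i, linked in id order. The function names `skip_graph_levels` and `level_edges` and the sample ids and bit strings are hypothetical, not taken from the slide.

```python
from collections import defaultdict


def skip_graph_levels(rs, num_levels):
    """rs maps node id -> bit string. Returns level -> {prefix: sorted ids}."""
    levels = {}
    for i in range(num_levels):
        groups = defaultdict(list)
        for node in sorted(rs):
            # Nodes whose bit strings agree on the first i bits share a list.
            groups[rs[node][:i]].append(node)
        levels[i] = dict(groups)
    return levels


def level_edges(groups):
    """Link consecutive ids inside each list of one level."""
    edges = set()
    for members in groups.values():
        edges.update(zip(members, members[1:]))
    return edges


# Hypothetical ids and bit strings, for illustration only.
rs = {4: "00", 15: "11", 23: "01", 29: "10", 34: "00", 63: "11"}
levels = skip_graph_levels(rs, 3)
```

Level 0 is the single sorted list of all nodes; each higher level splits every list roughly in half, which is where the O(log n) number of levels comes from.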

17 SKIP Graph: the question Can we have a self-stabilizing skip graph that can spontaneously restore its topology starting from any “connected” initial configuration?

18 Why local checking is important. Unless bad configurations are detected via local checking, periodic global snapshots are needed, and these are disruptive for the system.

19 SKIP Graph is NOT locally checkable Self-stabilization requires local detection of errors, but certain failures are not locally checkable

20 SKIP+ graph. Jacob, Richa, Scheideler et al. (PODC 2009) proposed a locally checkable version of the SKIP Graph by adding a few extra edges to an existing skip graph. They called it a SKIP+ graph. They presented an algorithm that stabilizes such a topology in O(log² n) rounds with high probability. The algorithm is quite cumbersome. We try to devise a simpler and better solution.

21 Detectors. Our first step. [Figure: a network in which one node is labeled as a detector.]

22 Detector diameter. The detector diameter of G is the maximum hop distance in G between any node and its closest detector.
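
The definition can be computed with a multi-source breadth-first search. The sketch below assumes an undirected topology given as an adjacency dictionary and a known set of detectors; the name `detector_diameter` and the sample path are hypothetical.

```python
from collections import deque


def detector_diameter(adj, detectors):
    """Maximum, over all nodes, of the hop distance to the closest detector."""
    dist = {d: 0 for d in detectors}
    queue = deque(dist)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    if len(dist) < len(adj):
        raise ValueError("some node cannot reach any detector")
    return max(dist.values())


# Hypothetical path a - b - c - d - e.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "e"}, "e": {"d"}}
```

With the single detector `a` the farthest node is four hops away; with the detector in the middle of the path the value drops to two.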

23 Transitive Closure Framework. Because of the local checkability property, any faulty configuration has at least one detector.

24 Transitive Closure Framework Theorem For a SKIP+ graph, the detector diameter D = O(log n)

25 Transitive Closure Framework

26 The neighbors of each detector become detectors in the next round. In O(log n) rounds, every node becomes a detector, and these detectors initiate the transitive closure process. After an additional O(log n) rounds, all nodes become connected with one another, and the topology becomes completely connected.
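
The closure phase can be sketched as follows, assuming every node is already a detector, the topology is undirected and connected, and in each round every node links to all neighbors of its neighbors. The names `transitive_closure_round` and `rounds_to_clique` are hypothetical, and the sketch leaves out the detector-propagation phase.

```python
def transitive_closure_round(adj):
    """Each node adds an edge to every neighbor of each of its neighbors."""
    new_adj = {}
    for u, nbrs in adj.items():
        reach = set(nbrs)
        for v in nbrs:
            reach |= adj[v]
        reach.discard(u)  # no self-loops
        new_adj[u] = reach
    return new_adj


def rounds_to_clique(adj):
    """Count rounds until every node is adjacent to every other node."""
    n = len(adj)
    rounds = 0
    while any(len(nbrs) < n - 1 for nbrs in adj.values()):
        nxt = transitive_closure_round(adj)
        if nxt == adj:
            raise ValueError("topology is not connected")
        adj = nxt
        rounds += 1
    return rounds


# Hypothetical path on 9 nodes (diameter 8).
path = {i: {j for j in (i - 1, i + 1) if 0 <= j <= 8} for i in range(9)}
```

Each round doubles the hop distance that a node's links cover, so the number of rounds grows with the logarithm of the diameter of the starting topology.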

27 Transitive Closure Framework. After all nodes become detectors and the topology eventually becomes completely connected, the nodes rebuild the correct topology using a REPAIR subroutine. REPAIR takes only one round.

28 The Repair Process. Lemma: If the network is completely connected and all nodes are detectors in round i, a legal overlay network will be built in round (i + 1), and no node will be a detector. Compare with the results of Jacob et al.

29 Local checkability. Let L define a correct configuration of an overlay network. Then the network is locally checkable when L = p₀ ∧ p₁ ∧ p₂ ∧ … ∧ pₙ₋₁, where pᵢ is a local predicate involving process i and its immediate neighbors only. Most real-life networks are NOT locally checkable.

30 Example: a clique. Theorem. A completely connected topology is locally checkable. [Figure: a clique on nodes a, b, c.]
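
One possible local predicate for the clique, sketched under two assumptions that are not stated on the slide: the topology is connected, and each node can read the neighbor lists of its immediate neighbors. A node is satisfied when every neighbor has the same closed neighborhood as itself; in a connected graph this can hold everywhere only if the graph is a clique. The names `closed_neighborhood`, `local_predicate`, and `detectors` are hypothetical.

```python
def closed_neighborhood(adj, u):
    """The node u together with its immediate neighbors."""
    return adj[u] | {u}


def local_predicate(adj, u):
    """True when every neighbor of u sees exactly the same node set as u."""
    mine = closed_neighborhood(adj, u)
    return all(closed_neighborhood(adj, v) == mine for v in adj[u])


def detectors(adj):
    """Nodes whose local predicate fails."""
    return {u for u in adj if not local_predicate(adj, u)}


# Hypothetical examples: a triangle (legal) and a path a - b - c (faulty).
triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
```

On the triangle no node is a detector, while on the path every node detects the fault.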

32 Chord is not locally checkable. [Figure: a Chord ring alongside a loopy Chord ring.]
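
A toy illustration of the loopy ring, using a simplified model that tracks successor pointers only (no finger tables), so it is a sketch of the idea rather than of full Chord. The helper names `ring_from_order` and `wrap_points` and the eight node ids are hypothetical. A wrap point is a node whose successor has a smaller id; a legal ring has exactly one, a loopy ring has more, yet each wrap-point node on its own looks like the largest node of a legal ring.

```python
def ring_from_order(order):
    """Successor pointers for nodes visited in the given cyclic order."""
    return {u: order[(i + 1) % len(order)] for i, u in enumerate(order)}


def wrap_points(succ):
    """Nodes whose successor has a smaller id (the ring wraps there)."""
    return sorted(u for u, v in succ.items() if v < u)


# Legal ring: ids increase once around the circle.
good = ring_from_order([0, 1, 2, 3, 4, 5, 6, 7])
# Loopy ring: the successor chain passes through the id space twice.
loopy = ring_from_order([0, 2, 4, 6, 1, 3, 5, 7])
```

Counting the wrap points needs a view of the whole ring, which is the kind of global information a local predicate does not have.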

33 CAN is not locally checkable. Content Addressable Network (CAN) on a 2D torus. Replace the black edges by the red edges, and each column becomes a loopy Chord ring.

34 LCON: a locally checkable overlay network in a circular key space. [Figure: a ring with key space N = 64 and nodes 0, 3, 5, 7, 18, 23, 25, 32, 37, 40, 50, 54, 59.]

35 LCON: a locally checkable overlay network in a circular key space. S-links for node u: one edge to each node in the range (u to u+s mod N). D-links for node u: Succ(u+s mod N), Succ(u+2s mod N), …, Succ(u+(d-1)s mod N). N_max = s × d. Let s = 16, d = 4. [Figure: the ring with N = 64 and nodes 0, 3, 5, 7, 18, 23, 25, 32, 37, 40, 50, 54, 59.]
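
The link rules can be tried out on the node set from the figure. The sketch below makes two assumptions that the slide does not spell out: the S-link range excludes both u and the key u+s, and Succ(x) is the first node at or after key x going clockwise. The function names `s_links`, `succ`, and `d_links` are hypothetical.

```python
def s_links(u, nodes, s, N):
    """Nodes strictly between u and u+s on the ring (assumed open range)."""
    near = [v for v in nodes if 0 < (v - u) % N < s]
    return sorted(near, key=lambda v: (v - u) % N)


def succ(x, nodes, N):
    """First node at or after key x, moving clockwise (assumed definition)."""
    return min(nodes, key=lambda v: (v - x) % N)


def d_links(u, nodes, s, d, N):
    """Succ(u + k*s mod N) for k = 1 .. d-1."""
    return [succ((u + k * s) % N, nodes, N) for k in range(1, d)]


# Node ids and parameters as given on the slide.
nodes = [0, 3, 5, 7, 18, 23, 25, 32, 37, 40, 50, 54, 59]
N, s, d = 64, 16, 4
```

Under these assumptions node 0 gets S-links to 3, 5, 7 and D-links to 18, 32, 50, and node 54 gets S-links to 59, 0, 3, 5 and D-links to 7, 23, 40.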

36 Observations. Observation: Each node in LCON has (d+s-2) neighbors. When d = s, the size of the neighborhood is O(√N). Theorem: The detector diameter of LCON is at most two.

37 Some properties of LCON. Theorem. LCON is locally checkable. Main idea. Case 1: If the diameter is two, then every node can “see” every other node and check whether the topology is correct. Case 2: We show that if the diameter is greater than two, then there is at least one detector.

38 Self-stabilization of LCON The Transitive Closure Framework (TCF) will stabilize LCON in O(log N) time. But it may be a sledgehammer. What is the space complexity of stabilization using TCF?

39 Self-stabilization of LCON We have an algorithm customized for LCON that stabilizes LCON in polylog time, while the space complexity does not skyrocket to O(n)

40 Generalization of LCON. Main idea: Consider a CAN-like topology on a d-dimensional torus. Convert the “ring” in each dimension into an LCON ring. The figure shows this only partially, on a 2-dimensional torus. Each node has O(d · N^(1/(2d))) neighbors.

41 Conclusion. A new problem of growing interest: we need efficient algorithms for stabilizing a variety of overlay topologies. The initial topology must be connected; stabilization from a partitioned topology is impossible. Also, for a given (V, id, rs) the legal topology should be unique; otherwise there will be an additional step for distributed consensus. Working on extending this to more fragile networks.

42 Questions?

