

1 Self-organized fault-tolerant routing in P2P overlays
Wojciech Galuba, Karl Aberer (EPFL, Switzerland)
Zoran Despotovic, Wolfgang Kellerer (Docomo Euro-Labs, Munich, Germany)

2 What are P2P overlays?
- Underlying network (blue in the figure), e.g. TCP/IP
- Peers (red) come and go
- Peers form an overlay network (red links)
© 2009 EPFL, Docomo Euro-Labs

3 Routing in P2P overlays
- Overlays (usually) have their own address space
- Goal: provide point-to-point connectivity
  - or rather, point-to-service connectivity
[Figure: overlay route from source to destination]

4 What is the problem?
- Failures in large-scale systems are the norm, not the exception
- Permanent failures
  - Well understood
  - Overlay maintenance algorithms
- Intermittent failures
  - Transient network connectivity problems
  - Peer overload, resource exhaustion
  - Cannot be addressed in the same way as permanent failures

5 Existing solutions - multipath
- Multiple paths
- Goal: at least one path reaches the destination
[Figure: multiple paths from source to destination; legend marks lossy peers]

6 Existing solutions - iterative routing
- The source controls the routing process
- It successively asks nodes for their neighbors
- High redundancy -> if one node fails, use others
[Figure: source querying nodes on the way to the destination; legend marks lossy peers]

7 Existing solutions - problems
- Rely heavily on message redundancy
  - High bandwidth cost
- Do not learn from failures
  - Likely to repeat the same routing mistakes

8 Forward feedback protocol (FFP)
- The requestor determines the quality of the provided service
  - The decision is binary: good or bad
- Feedback follows the same path as the request
- Feedback is obligatory: no feedback = bad feedback
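
The "no feedback = bad feedback" rule can be illustrated with a minimal Python sketch. The function name `effective_feedback`, its parameters, the 3-second default (borrowed from the service-request timeout used in the evaluation), and the choice to treat late feedback as bad are all assumptions made for illustration; the slides do not specify them.

```python
from typing import Optional


def effective_feedback(received: Optional[bool],
                       elapsed_s: float,
                       timeout_s: float = 3.0) -> Optional[bool]:
    """Feedback a peer should act on for one forwarded request.

    received:  True (good), False (bad), or None if nothing has arrived yet.
    elapsed_s: seconds since the request was forwarded (hypothetical clock).
    timeout_s: assumed feedback deadline; the value is illustrative only.
    Returns None while the peer is still waiting within the deadline.
    """
    if elapsed_s > timeout_s:
        # Assumption: missing or late feedback counts as negative feedback.
        return False
    return received
```

With this rule a peer never needs to distinguish a lost request from lost feedback: both end up as a negative observation once the deadline passes.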

9 A peer on the path
- Knows only its overlay neighbors
- Based on feedback, learns which neighbors are reliable
- Associates a success estimator with each (j, dz) pair:
  - j: neighbor address
  - dz: destination zone
- A success estimator is an exponentially averaged success rate in [0, 1]
  - Initially 0.5
  - Increased on positive feedback
  - Decreased on negative feedback or feedback timeout
[Figure: nodes labeled ph, peer, nh]
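
The per-(j, dz) success estimator described above could be kept as an exponential moving average. The following is a sketch under stated assumptions: the class name `SuccessEstimators`, the method `update`, and the smoothing factor `alpha` are hypothetical, and the slides do not give the actual averaging constant.

```python
from collections import defaultdict


class SuccessEstimators:
    """Exponentially averaged success rate per (neighbor, destination zone)."""

    def __init__(self, alpha: float = 0.1, initial: float = 0.5):
        self.alpha = alpha  # assumed smoothing factor, not from the slides
        # Every (j, dz) pair starts at the initial value of 0.5.
        self.values = defaultdict(lambda: initial)

    def update(self, neighbor, zone, positive: bool) -> float:
        """Fold one feedback observation into the estimator for (j, dz).

        A feedback timeout should be passed in as positive=False.
        """
        key = (neighbor, zone)
        outcome = 1.0 if positive else 0.0
        self.values[key] = (1 - self.alpha) * self.values[key] + self.alpha * outcome
        return self.values[key]
```

Because each update is a convex combination of the old value and an outcome in {0, 1}, the estimator stays within [0, 1] without explicit clamping.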

10 Next hop selection
- Based on the state of the success estimators
- Pick the neighbor j whose success estimator currently has the highest value
  - i.e. maximize the probability of success based on performance history
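
Next hop selection then reduces to an argmax over the estimators for the request's destination zone. In this sketch the function name `pick_next_hop`, the dictionary layout, and the random tie-breaking are assumptions; the slides only state that the neighbor with the highest estimator is picked.

```python
import random


def pick_next_hop(estimators, neighbors, zone, rng=random):
    """Return the neighbor with the highest success estimator for `zone`.

    estimators: dict mapping (neighbor, zone) -> value in [0, 1].
    Pairs without an entry count as 0.5, the initial estimator value.
    Ties are broken at random (an assumption made for this sketch).
    """
    def score(j):
        return estimators.get((j, zone), 0.5)

    best = max(score(j) for j in neighbors)
    candidates = [j for j in neighbors if score(j) == best]
    return rng.choice(candidates)
```

For example, with `estimators = {("a", 1): 0.9, ("b", 1): 0.4}` and neighbors `["a", "b", "c"]`, zone 1 selects `"a"`, while an unseen zone picks any of the three at random.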

11 The FFP protocol in action
- nh1 has a history of success but starts failing
- The peer switches to nh2
[Figure: nodes labeled ph, peer, nh1, nh2, with + and - feedback marks]

12 Cumulative effect
- The root cause of the failure receives the most negative feedback
- The links to the faulty peer are avoided by its neighbors
[Figure: overlay with the lossy peer marked]

13 Scalability through destination zoning
- O(log N) zones and O(log N) neighbors
- Total state at each node: O(log² N)
[Figure: destination zones 0, 1, 2, 3; labels: increasing overlay distance to destination, increasing destination zone number, exponentially decreasing zone size]
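
One possible way to obtain O(log N) zones is to bucket destinations by the logarithm of their overlay distance. The sketch below assumes a Chord-like ring of 2^id_bits identifiers and assigns zone z to clockwise distances in [2^z, 2^(z+1)); the exact partition used by FFP is not given in the slides, and the names `destination_zone` and `estimator_count` are hypothetical.

```python
def destination_zone(own_id: int, dest_id: int, id_bits: int) -> int:
    """Zone index of dest_id as seen from own_id on a ring of 2**id_bits ids.

    Assumed partition: zone z covers clockwise distances [2**z, 2**(z+1)),
    so zones nearer the peer are exponentially smaller.
    """
    ring = 1 << id_bits
    distance = (dest_id - own_id) % ring
    if distance == 0:
        raise ValueError("destination is the peer itself")
    # floor(log2(distance)) computed with integer arithmetic
    return distance.bit_length() - 1


def estimator_count(num_zones: int, num_neighbors: int) -> int:
    """Estimators per node: one for each (neighbor, zone) pair.

    With O(log N) zones and O(log N) neighbors this is O(log^2 N).
    """
    return num_zones * num_neighbors
```

Under these assumptions a node in a 2^20-identifier ring with 20 neighbors would hold at most 20 * 20 = 400 estimators, regardless of how many destinations it routes to.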

14 Evaluation
- PlanetLab: a planetary-scale testbed
- 350 peers
- Conditions:
  - Median system load: 5.3
  - Unpredictable delays and loss are "natural" on PlanetLab
- Challenge:
  - Introduce loss and delays in a Chord-like DHT
  - Place a tight 3 s timeout on service requests
  - See if the protocols can route around faulty peers
- Workload: multi-source, multi-destination

15 The line-up
- BASE: baseline, no fault-tolerance mechanisms
- MULTI4: 4-way multipath routing
- ITER4: Kademlia-based iterative routing, 4 parallel RPCs
- FFP

16 Every 5 minutes, a new 10% of peers become droppers.
Droppers drop all requests.

18 Every 5 minutes, a new 10% of peers become delayers.
Delayers delay all messages by 100-2000 ms.

19 At 300 s, 25% of droppers arrive.
Convergence time depends on the traffic pattern.

20 Topology-oblivious routing
- Starts with all success estimators = 0.5
  - Empty routing tables
- Learn by trial and error
  - Which neighbors are good forwarders for which destinations
- Routing tables are entirely emergent
- Initially random walks -> converge to reliable routes
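
How a routing table can emerge from an empty state is easiest to see at a single peer and a single destination zone. The toy loop below is a sketch, not the evaluated system: the function `learn_routes`, the fixed set of `reliable` neighbors that always succeed, the smoothing factor `alpha`, and the random tie-breaking are all assumptions.

```python
import random


def learn_routes(neighbors, reliable, rounds=20, alpha=0.1, seed=0):
    """Trial-and-error learning at one peer for one destination zone.

    neighbors: candidate next hops; reliable: the subset whose forwarding
    always earns positive feedback (all others always earn negative).
    Returns the emergent table: neighbor -> success estimator.
    """
    rng = random.Random(seed)
    table = {}  # empty routing table; missing entries count as 0.5
    for _ in range(rounds):
        best = max(table.get(j, 0.5) for j in neighbors)
        # With all estimators equal, this is a random choice (random walk).
        choice = rng.choice([j for j in neighbors if table.get(j, 0.5) == best])
        outcome = 1.0 if choice in reliable else 0.0
        table[choice] = (1 - alpha) * table.get(choice, 0.5) + alpha * outcome
    return table
```

Each failed attempt pushes an unreliable neighbor below 0.5, so it is not tried again while an untested or reliable neighbor remains; once a reliable neighbor is found, its estimator rises above 0.5 and it keeps being selected.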

21 Warmup
- Initially use the original Chord routing tables
- After some time, switch to FFP routing tables

22 Summary
- FFP uses 2-5 times less bandwidth than MULTI and ITER
- Same or higher fault tolerance
- More suitable for workloads:
  - that are high-rate
  - with fewer src-dest pairs

23 Benefits of the self-organized approach
- Decentralized -> scalability
- Topology-oblivious
  - Applicable to many networks
- Agnostic to the causes of failures
  - Robust to many failure scenarios
  - Even those it was not designed for

24 FFP used for secure routing in MANETs
- Additional crypto to prevent feedback forgery
- No PKI!
- Tech report: http://tinyurl.com/ffp-manet

25 FFP: a signaling meta-protocol
- Feedback is binary
- FFP can be used to signal any Boolean property of the routing path
  - Service provisioning success (currently)
  - Congestion on the path (ECN bit in IP)
    - Congestion control
  - Delay exceeding thresholds
    - Latency-minimizing routing?
- What about non-Boolean?

26 At 800 s, 40% of peers become droppers.
FFP's performance is not affected by churn.

27 Loop-freedom
- Requests stuck in a loop -> negative feedback
- If requests exit a loop, they have already accumulated a large delay -> potentially negative feedback
- All in all, the
© 2007 EPFL, DoCoMo Euro-Labs

28 Compared to ant algorithms
- FFP is designed with a focus on scalability
- FFP is designed for highly dynamic systems:
  - P2P overlays
  - MANETs
- Not exactly an ant algorithm:
  - we found "evaporation" to degrade performance

