Overlays and DHTs
Presented by Dong Wang and Farhana Ashraf

Schedule
- Review of Overlays and DHTs
- RON
- Pastry
- Kelips

Review of Overlays and DHTs
Overlay
- A network built on top of another network
- Nodes are connected to each other by logical/virtual links
- Improves Internet routing and is easy to deploy
- Examples: P2P systems (Gnutella, Chord, ...), RON
DHT
- Lets you look up, insert, and delete objects by key in a distributed setting (a toy sketch of this interface follows)
- Performance concerns:
  - Load balancing
  - Fault tolerance
  - Efficiency of lookups and inserts
  - Locality
- CAN, Chord, Pastry, and Tapestry are all DHTs
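To make the lookup/insert/delete interface concrete, here is a minimal Python sketch of the common DHT idea: hash node names and keys into one ID space and store each key at the node whose ID follows it. The `Ring` class and `sha1_id` helper are illustrative, not any particular system's API, and this toy node sees the whole ring, whereas the DHTs covered here keep only O(log N) or O(√n) state per node.

```python
import hashlib
from bisect import bisect_left

def sha1_id(name: str) -> int:
    """Hash a node name or object key into a 160-bit ID space."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16)

class Ring:
    """Toy DHT: each key lives at the first node clockwise from its ID."""
    def __init__(self, node_names):
        self.ids = sorted(sha1_id(n) for n in node_names)
        self.data = {i: {} for i in self.ids}  # node ID -> local key/value store

    def _home(self, key_id: int) -> int:
        """Successor of key_id on the ring (wrapping past the top)."""
        i = bisect_left(self.ids, key_id)
        return self.ids[i % len(self.ids)]

    def insert(self, key: str, value) -> None:
        self.data[self._home(sha1_id(key))][key] = value

    def lookup(self, key: str):
        return self.data[self._home(sha1_id(key))].get(key)

    def delete(self, key: str) -> None:
        self.data[self._home(sha1_id(key))].pop(key, None)

ring = Ring(["node-a", "node-b", "node-c"])
ring.insert("song.mp3", "10.0.0.7")
assert ring.lookup("song.mp3") == "10.0.0.7"
```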

Resilient Overlay Networks
David G. Andersen et al., MIT
SOSP 2001
Acknowledgment: slides adapted from previous CS525 courses

RON: Resilient Overlay Networks
- Motivation
- Goals
- Design
- Evaluation
- Discussion

RON: Motivation
The current Internet backbone is NOT able to:
- Detect failed paths and recover quickly (BGP takes several minutes to recover from faults)
- Detect flooding and congestion effectively
- Leverage redundant paths efficiently
- Express fine-grained policies/metrics

RON: Basic Idea
- The triangle inequality often fails in the Internet: for latency, it is possible that AB + BC < AC, i.e., relaying from A to C through B beats the direct path
- RON exploits the Internet's underlying path redundancy to provide better paths and to route around failures
- RON is an end-to-end solution: packets are simply encapsulated and sent normally
(Figure: overlay nodes A, B, C, D.)

RON: Main Goals
- Fast failure detection and recovery
  - Average detect-and-recover delay < 20 s
- Tighter integration of routing/path selection with the application
  - Applications can specify metrics that affect routing
- Expressive policy routing
  - Fine-grained, aimed at users and hosts

RON: Design
- Overlay
  - An old idea in networking
  - Easily deployed; lets the Internet focus on scalability
  - Keeps functionality only between active peers
- Approach
  - Aggressively probe all inter-RON-node paths
  - Exchange routing information
  - Route along the best path (from the end-to-end view) consistent with the routing policy

RON Design: Architecture
- Probe between nodes to detect path quality
- Store path qualities in a performance database
- Run a link-state routing protocol among the nodes
- Data are handled by an application-specific conduit and forwarded over UDP

RON Design: Routing and Lookup
- Policy routing
  - Classify traffic by policy
  - Generate one table per policy
- Metric optimization
  - The application tags each packet with its preferred metric
  - Generate one table per metric
- Multi-level routing table with a 3-stage lookup: policy -> metric -> next hop (see the sketch below)
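One plausible way to picture the three-stage table is as nested maps; the policy tags, metrics, and node names below are invented for illustration and are not RON's actual data structures.

```python
# Hypothetical three-stage RON forwarding table: policy -> metric -> next hop.
routing_tables = {
    "no-commercial-over-internet2": {        # policy tag carried by the packet
        "latency":    {"dstA": "nodeB"},     # per-metric table: destination -> next hop
        "loss":       {"dstA": "nodeC"},
        "throughput": {"dstA": "nodeB"},
    },
}

def next_hop(policy: str, metric: str, dst: str) -> str:
    """Three-stage lookup: policy first, then the packet's metric, then the destination."""
    return routing_tables[policy][metric][dst]

assert next_hop("no-commercial-over-internet2", "loss", "dstA") == "nodeC"
```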

RON Design: Probing and Outage Detection
- Probe every PROBE_INTERVAL (12 s)
- With 3 packets, both participants obtain an RTT and reachability information without synchronized clocks
- If a probe is lost, send the next one immediately, up to 3 more probes (PROBE_TIMEOUT = 3 s)
- Declare an outage after 4 consecutive probe losses
- Outage detection time is 19 s on average (a back-of-the-envelope check follows)
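The ~19 s average can be sanity-checked with a small timing model of this schedule; this is a back-of-the-envelope sketch, not RON's implementation.

```python
PROBE_INTERVAL = 12.0   # seconds between routine probes
PROBE_TIMEOUT = 3.0     # timeout used once a loss is suspected
OUTAGE_THRESHOLD = 4    # consecutive losses that declare an outage

def detection_delay(failure_offset: float) -> float:
    """Time from a path failing (failure_offset seconds after the last
    routine probe) until 4 consecutive probe losses declare an outage."""
    wait = PROBE_INTERVAL - (failure_offset % PROBE_INTERVAL)  # next routine probe
    timeouts = OUTAGE_THRESHOLD * PROBE_TIMEOUT               # 4 probes each time out
    return wait + timeouts

# A failure lands mid-interval on average, giving 12/2 + 4*3 = 18 s,
# in line with the ~19 s average detection time reported on the slide.
print(detection_delay(6.0))   # 18.0
```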

RON Design: Policy Routing
- Lets users define which types of traffic are allowed on particular network links
- Packets carry policy tags, from which policy-based routing tables are built
- Two policy mechanisms:
  - Exclusive cliques
  - General policies

RON: Evaluation
Two main datasets from an Internet deployment:
- RON1: N = 12 nodes, 132 distinct paths, traversing 36 ASes and 74 inter-AS paths
- RON2: N = 16 nodes, 240 distinct paths, traversing 50 ASes and 118 inter-AS paths
Policy: prohibit sending traffic to or from commercial sites over Internet2

RON Evaluation: Major Results
- Increases the resilience of the overlay network
  - RON takes ~10 s to route around a failure, compared to BGP's several minutes
  - Many Internet outages are avoidable
- Improves performance: loss rate, latency, TCP throughput
- Single-hop indirect routing works well
- The overhead is reasonable and acceptable

RON vs. the Internet: 30-Minute Loss Rates
- RON1: able to route around all outages
- RON2: about 60% of outages are overcome

Performance: Loss Rate

Performance: Latency

Performance: TCP Throughput
- Performance improvement: RON employs application-specific metric optimization to select paths

Scalability

Conclusions
- RON improves network reliability and performance
- The overlay approach is attractive for resiliency: easy development, fewer nodes, a simple substrate
- Single-hop indirection in RON works well
- RON also introduces additional probing and update traffic into the network

RON: Discussion
- Aggressiveness
  - RON never backs off the way TCP does
  - Can it coexist with current traffic on the Internet?
  - What happens if everyone starts to use RON?
  - Is it possible to modify RON to achieve good behavior in a global sense?
- Scalability
  - Trades scalability for improved reliability
  - Many RONs coexisting in the Internet
  - A hierarchical structure of RON networks?

Distributed Hash Table

Problem
- Route a message with key K to the node Z whose ID is closest to K
- Not scalable if the routing table contains all the nodes
- Tradeoffs:
  - Memory per node
  - Lookup latency
  - Number of messages
(Figure: node A = 65a1fc routes a message with key d46a1c past nodes d13da3, d4213f, and d462ba toward d467c4 and d471f1.)

Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems
Antony Rowstron and Peter Druschel
Middleware 2001
Acknowledgment: slides adapted from previous CS525 courses

Motivation
- Node IDs are assigned randomly
  - With high probability, nodes with adjacent IDs are diverse (in geography, ownership, etc.)
- Considers network locality
  - Seeks to minimize the distance messages travel
- Uses a scalar proximity metric, e.g.:
  - Number of IP routing hops
  - RTT

Pastry: Node Soft State (Overview)
- Immediate neighbors in ID space, used for routing (similar to Chord's successor and predecessor)
- Nodes closest according to locality, used to update the routing table
- Routing table entries, used for routing (similar to Chord's finger table entries)
- Storage requirement at each node = O(log N)

Pastry: Node Soft State
- Leaf set
  - Contains L nodes, the closest in ID space
- Neighborhood set
  - Contains M nodes, the closest according to the proximity metric
- Routing table
  - Entries in row n share exactly the first n digits with the local node's ID
  - Among the candidates for an entry, nodes are chosen according to the proximity metric

Pastry: Routing
- Case I: key within the leaf set
  - Route to the node in the leaf set whose ID is closest to the key
- Case II (prefix routing): key not within the leaf set
  - Route to a node in the routing table that shares one more digit with the key than the local node does
- Case III: key not within the leaf set, and Case II not possible
  - Route to a node that shares at least as many digits with the key as the local node but is numerically closer to the key
(A sketch of this decision follows.)
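A compact Python sketch of the three cases; the hex-string ID encoding, the flat (row, digit) table layout, and the lack of ring wraparound in the leaf-set test are simplifications of ours, not the authors' code.

```python
def prefix_len(a: str, b: str) -> int:
    """Number of leading digits two IDs share."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return n

def route(key: str, me: str, leaf_set: list, table: dict) -> str:
    """One Pastry forwarding decision. IDs are hex digit strings (b = 4);
    table maps (row, next_digit) -> node ID when such an entry exists."""
    k = int(key, 16)
    dist = lambda node: abs(int(node, 16) - k)
    # Case I: key lies within the leaf set's range -> closest leaf is the target.
    lo = min(int(n, 16) for n in leaf_set)
    hi = max(int(n, 16) for n in leaf_set)
    if lo <= k <= hi:
        return min(leaf_set, key=dist)
    # Case II: prefix routing -- forward to a node sharing one more digit.
    p = prefix_len(key, me)
    if p < len(key) and (p, key[p]) in table:
        return table[(p, key[p])]
    # Case III: rare -- any known node no worse by shared prefix but
    # numerically closer to the key than this node.
    known = list(leaf_set) + list(table.values())
    candidates = [n for n in known if prefix_len(key, n) >= p and dist(n) < dist(me)]
    return min(candidates, key=dist) if candidates else me
```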

Routing Example
- Each routing step cuts the candidate ID space down by a factor of 2^b
- The number of hops needed is log_{2^b} N (a quick numeric check follows)
(Figure: lookup(d46a1c) issued at node 65a1fc hops through d13da3, d4213f, and d462ba toward d467c4 and d471f1.)
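A quick numeric check of the hop bound; b = 4 and N = 10^6 are illustrative values, not figures from the paper:

```latex
\[
\text{hops} \approx \log_{2^b} N,
\qquad
b = 4,\; N = 10^{6}:\quad
\log_{16} 10^{6} = \frac{6}{\log_{10} 16} \approx 4.98 \approx 5.
\]
```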

Self-Organization: Node Join
- New node X = d46a1c contacts a nearby node A = 65a1fc (A is X's neighbor by proximity)
- A routes a join message with key d46a1c on X's behalf; it arrives at Z = d467c4, the live node with the ID numerically closest to X's
(Figure: the join message travels from A through d13da3, d4213f, and d462ba to Z; d471f1 is Z's neighbor in ID space.)

Pastry: Node State Initialization
- Leaf set of X = leaf set of Z (Z is numerically closest to X)
- Neighborhood set of X = neighborhood set of A (A is physically closest to X)
- Routing table
  - Row zero of X = row zero of A
  - Row one of X = row one of B (B = d13da3, the second node on the join route)
  - ... and so on along the route from A to Z (a sketch follows)
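The copying rule can be written down directly; `Node` and `init_state` are hypothetical helpers that mirror the slide's assignments, taking row i of X's routing table from the i-th node on the join route A, B, ..., Z.

```python
from dataclasses import dataclass

@dataclass
class Node:
    leaf_set: list
    neighborhood_set: list
    routing_table: list        # routing_table[i] is row i

def init_state(join_route: list):
    """State for new node X after its join message traveled A -> B -> ... -> Z.
    The i-th node on the route shares at least i leading digits with X's ID,
    so its row i is a valid row i for X."""
    A, Z = join_route[0], join_route[-1]
    leaf_set = list(Z.leaf_set)                  # Z is numerically closest to X
    neighborhood_set = list(A.neighborhood_set)  # A is physically closest to X
    routing_table = [list(node.routing_table[i])
                     for i, node in enumerate(join_route)
                     if i < len(node.routing_table)]
    return leaf_set, neighborhood_set, routing_table
```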

Pastry: Node State Update
- X informs any nodes that need to be aware of its arrival
- X also improves its table's locality by requesting the neighborhood sets of all the nodes it knows
- In practice: an optimistic approach

Pastry: Node Departure (Failure)
- Leaf set repair (eager, runs all the time):
  - Leaf set members exchange keep-alive messages
  - On a failure, request the set from the furthest live node in the set
- Routing table repair (lazy, upon failure):
  - Get a replacement entry from peers in the same row; if none is found, from higher rows
- Neighborhood set repair (eager)

Routing Performance
(Figures: the average number of hops grows as log N; Pastry uses locality information.)

Kelips: Building an Efficient and Stable P2P DHT through Increased Memory and Background Overhead
Indranil Gupta, Ken Birman, Prakash Linga, Al Demers, and Robert van Renesse
IPTPS 2003
Acknowledgment: slides adapted from previous CS525 courses

Motivation
- For n = … nodes and 20 bytes per entry, the storage requirement at a Pastry node is only 120 bytes
- Memory is not being used to its full advantage
- How can we achieve O(1) lookup latency? Increase the memory usage per node

Design
- Consists of k virtual affinity groups; each node is a member of one affinity group
- Soft state (each entry also carries an RTT estimate and a heartbeat count):
  - Affinity group view: a (partial) set of the other nodes lying in the same affinity group
  - Contacts: a constant-sized set of nodes lying in each of the foreign affinity groups
  - Filetuples: a (partial) set of (filename, homenode IP address) pairs, where the homenode lies in the same affinity group
(A sketch of this state follows.)
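As a concrete picture, one node's soft state might look like the following; the addresses, counters, and field names are invented for illustration.

```python
import math

n = 10_000                      # system size (illustrative)
k = round(math.sqrt(n))         # k = O(sqrt(n)) affinity groups
my_group = 7

affinity_view = {               # partial view of my own group
    "129.5.9.1": {"heartbeat": 1055, "rtt_ms": 23},
    "129.5.9.2": {"heartbeat": 1043, "rtt_ms": 67},
}
contacts = {                    # constant-sized entry per foreign group
    0: ["18.1.1.9"],
    1: ["15.0.3.2"],            # ... one small list for every other group
}
filetuples = {                  # files whose homenode is in my group
    "p2p.txt": "129.5.9.2",     # filename -> homenode address
}
```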

Kelips: Node Soft State
(Figure: across affinity groups #0 to #k, a node's soft state comprises its affinity group view with per-node id, heartbeat count, and RTT (e.g., 67 ms); contact nodes for each foreign group (e.g., group 0 -> [129, 30, ...], group 1 -> [15, 160, ...]); and filetuples such as p2p.txt -> homenode [18, 167, ...].)

Storage at a Kelips Node
- S(k, n) = n/k + c(k-1) + F/k entries (affinity group view + contacts + filetuples)
- Minimized at k = sqrt((n + F) / c) (a one-line calculus check follows)
- Assuming F is proportional to n and c is fixed, the optimal k = O(sqrt(n))
- For n = … and F = 10 million files, the total memory requirement is < 20 MB
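The slide's minimizer follows from a one-line calculus check, labeling the three terms by the state they count:

```latex
\[
S(k, n)
= \underbrace{\tfrac{n}{k}}_{\text{group view}}
+ \underbrace{c\,(k-1)}_{\text{contacts}}
+ \underbrace{\tfrac{F}{k}}_{\text{filetuples}},
\qquad
\frac{\partial S}{\partial k} = -\frac{n+F}{k^{2}} + c = 0
\;\Longrightarrow\;
k = \sqrt{\frac{n+F}{c}}.
\]
```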

Algorithm: Lookup
Lookup of key D at node A:
- The affinity group G of D's homenode = hash(D)
- A sends a message to the closest node X in its contact set for affinity group G
- X finds D's homenode in its filetuple set
- Result: an O(1) lookup (see the sketch below)
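A sketch of the two-step lookup, matching the soft-state layout sketched earlier; `rpc` stands in for whatever message transport a real node would use.

```python
import hashlib

def group_of(name: str, k: int) -> int:
    """Affinity group of a file's homenode: hash the filename into 0..k-1."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % k

def lookup(filename: str, k: int, my_group: int, contacts, filetuples, rpc):
    g = group_of(filename, k)
    if g == my_group:                 # the homenode's filetuple may be held locally
        return filetuples.get(filename)
    contact = contacts[g][0]          # one hop to a contact in group g (Kelips picks the closest) ...
    return rpc(contact, filename)     # ... which answers from its own filetuples
```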

Maintaining Soft State
- Heartbeat mechanism
  - Soft-state entries are refreshed periodically, within and across groups
- Each node periodically selects a few nodes as gossip targets and sends them partial soft-state information (a sketch follows)
- Uses a constant gossip message size
- O(log n) complexity for gossiping
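A minimal gossip round under these rules might look as follows; the fanout and message-size constants are placeholders, not Kelips' tuned values.

```python
import random

GOSSIP_FANOUT = 3      # targets per round (placeholder value)
MAX_ENTRIES = 10       # constant-size message: a bounded slice of the state

def gossip_round(view: dict, peers: list, send) -> None:
    """Push a constant-size random slice of the soft state to a few peers."""
    entries = random.sample(list(view.items()), min(MAX_ENTRIES, len(view)))
    for peer in random.sample(peers, min(GOSSIP_FANOUT, len(peers))):
        send(peer, dict(entries))

def merge(view: dict, payload: dict) -> None:
    """Receiver side: keep whichever copy of an entry has the fresher heartbeat."""
    for node, info in payload.items():
        if node not in view or info["heartbeat"] > view[node]["heartbeat"]:
            view[node] = info
```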

Load Balance
(Figures: N = 1500 nodes with 38 affinity groups; N = 1000 nodes with 30 affinity groups.)

Discussion Points
- What happens when the triangle inequality does not hold for a proximity metric?
- What happens under a high churn rate in Pastry and Kelips?
- What was the intuition behind affinity groups?
- Can we use Kelips in an Internet-scale network?

Conclusion
- A DHT tradeoff: storage requirement vs. lookup latency

                       Chord      Pastry     Kelips
  Storage requirement  O(log N)   O(log N)   O(√n)
  Lookup latency       O(log N)   O(log N)   O(1)

- Going one step further: one-hop lookups for P2P overlays