Handling Churn in a DHT USENIX Annual Technical Conference June 29, 2004 Sean Rhea, Dennis Geels, Timothy Roscoe, and John Kubiatowicz UC Berkeley and Intel Research Berkeley


Handling Churn in a DHT USENIX Annual Technical Conference June 29, 2004 Sean Rhea, Dennis Geels, Timothy Roscoe, and John Kubiatowicz UC Berkeley and Intel Research Berkeley

What’s a DHT?
Distributed Hash Table
– Peer-to-peer algorithm offering a put/get interface
– Associative map for peer-to-peer applications
More generally, provides lookup functionality
– Maps application-provided hash values to nodes
– (Just as local hash tables map hashes to memory locations)
– Put/get is then constructed above lookup
Many proposed applications
– File sharing, end-system multicast, aggregation trees
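The layering described above (put/get constructed above lookup) can be sketched as follows. The `Ring` class, `node_id` hash helper, 16-bit ID space, and absolute-difference closeness metric are all illustrative assumptions, not the implementation from the talk.

```python
import hashlib

def node_id(name: str, bits: int = 16) -> int:
    """Hash an arbitrary name (node name or key) into the shared ID space."""
    digest = hashlib.sha1(name.encode()).hexdigest()
    return int(digest, 16) % (1 << bits)

class Ring:
    """Toy DHT: lookup maps a key's hash to the node with the closest ID."""

    def __init__(self, node_names):
        # Each node gets a pseudo-random ID in the same space as keys.
        self.nodes = {node_id(n): {} for n in node_names}

    def lookup(self, key: str) -> int:
        """Return the ID of the node responsible for key (closest-ID rule,
        simplified to absolute difference rather than true ring distance)."""
        k = node_id(key)
        return min(self.nodes, key=lambda nid: abs(nid - k))

    # put/get are thin wrappers constructed above lookup.
    def put(self, key, value):
        self.nodes[self.lookup(key)][key] = value

    def get(self, key):
        return self.nodes[self.lookup(key)].get(key)

ring = Ring(["node-a", "node-b", "node-c", "node-d"])
ring.put("song.mp3", b"data")
```

Because put and get resolve the same key through the same lookup, they land on the same node, which is all the associative-map behavior requires.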

How Does Lookup Work?
[Diagram: a lookup for an ID is routed from the source through nodes matching prefixes 0…, 10…, 110… to the responsible node 111…, which sends the response]
Assign IDs to nodes
– Map hash values to the node with the closest ID
Leaf set is successors and predecessors
– All that’s needed for correctness
Routing table matches successively longer prefixes
– Allows efficient lookups
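The prefix-matching idea above can be sketched as a toy routing loop over 4-bit IDs. The greedy `route` helper is a hypothetical simplification in which every hop can see all nodes; a real DHT node consults only its own routing table and leaf set, but the invariant is the same: each hop shares a strictly longer ID prefix with the target.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading bits two ID strings have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def route(source: str, target: str, nodes: list) -> list:
    """Greedy prefix routing: each hop goes to a node matching at least
    one more bit of the target's ID, so a lookup takes O(log N) hops."""
    path = [source]
    current = source
    while current != target:
        p = shared_prefix_len(current, target)
        # In a real DHT this choice comes from the node's routing table.
        candidates = [n for n in nodes if shared_prefix_len(n, target) > p]
        current = candidates[0]
        path.append(current)
    return path

nodes = ["0000", "0110", "1010", "1100", "1111"]
path = route("0000", "1111", nodes)
```

Each hop resolves at least one more prefix bit, which is why the hop count is logarithmic in the size of the ID space.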

Why Focus on Churn?

Authors   Systems Observed     Session Time
SGG02     Gnutella, Napster    50% < 60 minutes
CLL02     Gnutella, Napster    31% < 10 minutes
SW02      FastTrack            50% < 1 minute
BSV03     Overnet              50% < 60 minutes
GDS03     Kazaa                50% < 2.4 minutes

Chord is a “scalable protocol for lookup in a dynamic peer-to-peer system with frequent node arrivals and departures” – Stoica et al., 2001

A Simple Lookup Test
Start up 1,000 DHT nodes on ModelNet network
– Emulates a 10,000-node, AS-level topology
– Unlike simulations, models cross traffic and packet loss
– Unlike PlanetLab, gives reproducible results
Churn nodes at some rate
– Poisson arrival of new nodes
– Random node departs on every new arrival
– Exponentially distributed session times
Each node does 1 lookup every 10 seconds
– Log results, process them after test
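The churn model above (Poisson arrivals with one random departure per arrival, which yields exponentially distributed session times and a constant population) can be sketched in a few lines; all parameter values here are assumptions chosen for illustration, not the experiment's settings.

```python
import random

def simulate_churn(n_nodes=100, arrival_rate=1.0, n_events=20000, seed=42):
    """Simulate churn: Poisson arrivals, random departure per arrival.
    Returns the list of observed session times."""
    rng = random.Random(seed)
    now = 0.0
    join_time = {i: 0.0 for i in range(n_nodes)}  # node id -> join time
    sessions = []
    next_id = n_nodes
    for _ in range(n_events):
        now += rng.expovariate(arrival_rate)      # Poisson arrival process
        victim = rng.choice(list(join_time))      # random node departs
        sessions.append(now - join_time.pop(victim))
        join_time[next_id] = now                  # newcomer takes its place
        next_id += 1
    return sessions

sessions = simulate_churn()
mean_session = sum(sessions) / len(sessions)
# With N nodes and arrival rate lambda, mean session time tends to N/lambda.
```

Under this model a node survives a geometric number of arrivals (mean N), so with the assumed N=100 and rate 1.0 the mean session time should come out near 100 time units, modulo the startup transient.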

Early Test Results
Tapestry (the OceanStore DHT) falls over completely
– Worked great in simulations, but not on a more realistic network
– Despite sharing almost all code between the two
And the problem isn’t limited to Tapestry:

Handling Churn in a DHT
Forget about comparing different implementations
– Too many differing factors
– Hard to isolate the effects of any one feature
Implement all relevant features in one DHT
– Using Bamboo (similar to Pastry)
Isolate important issues in handling churn
1. Recovering from failures
2. Routing around suspected failures
3. Proximity neighbor selection

Recovering From Failures
For correctness, maintain the leaf set during churn
– Also the routing table, but it is not needed for correctness
The basics
– Ping new nodes before adding them
– Periodically ping neighbors
– Remove nodes that don’t respond
Simple algorithm
– After every change in leaf set, send it to all neighbors
– Called reactive recovery
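A minimal sketch of the reactive scheme just described; `ReactiveNode` and its ping callback are hypothetical stand-ins for Bamboo's actual machinery, reduced to the three basics: ping before adding, ping periodically, and push the leaf set to all neighbors after every change.

```python
class ReactiveNode:
    def __init__(self, name, ping):
        self.name = name
        self.leaf_set = set()
        self.ping = ping   # callback: returns True if the neighbor responds
        self.sent = []     # record of (destination, leaf set) messages

    def maybe_add(self, neighbor):
        # Ping new nodes before adding them to the leaf set.
        if self.ping(neighbor):
            self.leaf_set.add(neighbor)
            self.broadcast_change()

    def periodic_ping(self):
        # Periodically ping neighbors; remove nodes that don't respond.
        dead = {n for n in self.leaf_set if not self.ping(n)}
        if dead:
            self.leaf_set -= dead
            self.broadcast_change()

    def broadcast_change(self):
        # Reactive recovery: every leaf-set change is sent to all neighbors.
        for n in self.leaf_set:
            self.sent.append((n, frozenset(self.leaf_set)))

alive = {"b", "c"}
node = ReactiveNode("a", lambda n: n in alive)
node.maybe_add("b")
node.maybe_add("c")
node.maybe_add("x")   # "x" is down, so the ping fails and it is never added
alive.discard("c")    # "c" fails...
node.periodic_ping()  # ...and is detected and removed on the next ping round
```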

The Problem With Reactive Recovery
Under churn, many pings and change messages
– If bandwidth-limited, they interfere with each other
– Lots of dropped pings look like a failure
Respond to failure by sending more messages
– Probability of drop goes up
– We have a positive feedback cycle (squelch)
Can break the cycle in two ways
1. Limit the probability of “false suspicions of failure”
2. Recover periodically

Periodic Recovery
Periodically send whole leaf set to a random member
– Breaks the feedback loop
– Converges in O(log N)
Back off the period on message loss
– Makes a negative feedback cycle (damping)
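The periodic scheme with back-off might look like the following sketch; the period constants and the doubling policy are assumptions for illustration, not Bamboo's tuned values.

```python
import random

class PeriodicRecovery:
    """Share the whole leaf set with one random member each period;
    lengthen the period when messages are lost (damping)."""

    def __init__(self, base_period=5.0, max_period=80.0):
        self.base = base_period
        self.max = max_period
        self.period = base_period

    def on_loss(self):
        # Message loss suggests congestion: back off the recovery rate.
        # This is the negative feedback that replaces reactive squelch.
        self.period = min(self.period * 2, self.max)

    def on_success(self):
        self.period = self.base

    def pick_target(self, leaf_set, rng=random):
        # One random leaf-set member per period, not a full broadcast.
        return rng.choice(sorted(leaf_set))

r = PeriodicRecovery()
r.on_loss()
r.on_loss()       # period has now doubled twice: 5 -> 10 -> 20
r.on_success()    # a delivered message resets the period to the base rate
```

Compared with reactive recovery, the send rate here is bounded by the period regardless of how fast the leaf set is changing, which is what breaks the positive feedback cycle.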

Routing Around Failures
Being conservative increases latency
– Original next hop may have left the network forever
– Don’t want to stall lookups
DHT has many possible routes
– But retrying too soon leads to a packet explosion
Goal:
1. Know for sure that the packet is lost
2. Then resend along a different path

Calculating Good Timeouts
Use TCP-style timers
– Keep a past history of latencies
– Use it to compute timeouts for new requests
Works fine for recursive lookups
– Only talk to neighbors, so history is small and current
[Diagram: recursive vs. iterative lookup]
In iterative lookups, the source directs the entire lookup
– Must potentially have a good timeout for any node
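The TCP-style timer referred to above is the standard Jacobson/Karels calculation: smooth the RTT, track its variance, and set the timeout to SRTT + 4·RTTVAR with the usual 1/8 and 1/4 gains. This sketch applies it to per-neighbor lookup timeouts; the class name and sample values are illustrative.

```python
class RttEstimator:
    """Jacobson/Karels-style RTT estimator (one instance per neighbor)."""

    def __init__(self):
        self.srtt = None     # smoothed round-trip time
        self.rttvar = None   # smoothed mean deviation

    def observe(self, rtt: float) -> None:
        if self.srtt is None:
            # First sample initializes both estimators, as in TCP.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # Standard gains: beta = 1/4 for variance, alpha = 1/8 for SRTT.
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - rtt)
            self.srtt = 0.875 * self.srtt + 0.125 * rtt

    def timeout(self) -> float:
        # RTO = SRTT + 4 * RTTVAR
        return self.srtt + 4 * self.rttvar

est = RttEstimator()
for sample_ms in [100.0, 110.0, 90.0, 105.0]:
    est.observe(sample_ms)
```

This works well recursively because each node keeps an estimator only for its own neighbors, to which it talks constantly; iteratively, the source would need a fresh estimate for nodes it may never have contacted.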

Virtual Coordinates
Machine-learning algorithm to estimate latencies
– Distance between coordinates is proportional to latency
– Called Vivaldi; used by the MIT Chord implementation
Compare with TCP-style under recursive routing
– Insight into the cost of iterative routing due to timeouts
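A minimal 2-D sketch of a Vivaldi-style coordinate update: each RTT sample nudges a node's coordinate so that Euclidean distance better predicts measured latency. The real algorithm also maintains confidence weights and per-node error estimates, omitted here, and the step size `delta` is an assumed constant.

```python
import math

def vivaldi_update(xy, other_xy, rtt, delta=0.25):
    """Move xy toward or away from other_xy to shrink prediction error."""
    dx = xy[0] - other_xy[0]
    dy = xy[1] - other_xy[1]
    dist = math.hypot(dx, dy) or 1e-9      # predicted latency
    error = rtt - dist                     # positive: too close, push away
    ux, uy = dx / dist, dy / dist          # unit vector away from the peer
    # Spring relaxation: step proportional to the prediction error.
    return (xy[0] + delta * error * ux, xy[1] + delta * error * uy)

a, b = (0.0, 0.0), (3.0, 4.0)   # predicted latency: distance 5
a = vivaldi_update(a, b, rtt=10.0)
predicted = math.hypot(a[0] - b[0], a[1] - b[1])
# a moved away from b, so the predicted latency grew toward the measured 10
```

Once coordinates have converged, a node can compute a timeout for any node from coordinates alone, which is exactly what iterative routing needs.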

Proximity Neighbor Selection (PNS)
For each neighbor, there may be many candidates
– Choosing the closest with the right prefix is called PNS
– One of the most researched areas in DHTs
– Can we achieve good PNS under churn?
Remember:
– leaf set for correctness
– routing table for efficiency?
Insight: extend this philosophy
– Any routing table gives O(log N) lookup hops
– Treat PNS as an optimization only
– Find close neighbors by simple random sampling
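Random-sampling PNS as described above can be sketched as: among the many candidates already valid for a routing-table slot (right prefix), probe a few chosen at random and keep the lowest-latency one. The function names, sample size, and latency table are illustrative assumptions.

```python
import random

def pns_random_sample(candidates, latency, sample_size=4, rng=random):
    """Pick a neighbor for one routing-table slot by random sampling.

    candidates: node IDs with the correct prefix for this slot
    latency:    callable mapping a node ID to its measured RTT
    Any choice preserves O(log N) hops; sampling only trims latency.
    """
    sample = rng.sample(candidates, min(sample_size, len(candidates)))
    return min(sample, key=latency)

rng = random.Random(1)
candidates = [f"n{i}" for i in range(20)]
rtt = {n: 10 + 5 * i for i, n in enumerate(candidates)}  # assumed RTTs
best = pns_random_sample(candidates, rtt.get, rng=rng)
```

Because correctness never depends on which candidate is chosen, this optimization can run lazily in the background and simply be skipped when churn leaves no time for probing.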

PNS Results (very abbreviated; see paper for more)
Random sampling is almost as good as everything else
– 24% latency improvement for free
– 42% improvement for 40% more bandwidth
– Compare to 68%-84% improvement from using good timeouts
Other algorithms are more complicated, not much better

Related Work
Liben-Nowell et al.
– Analytical lower bound on maintenance costs
Mahajan et al.
– Simulation-based study of Pastry under churn
– Automatic tuning of maintenance rate
– Suggest increasing the rate on failures!
Other simulations
– Li et al.
– Lam and Liu
Zhuang
– Cooperative failure detection in DHTs
Dabek et al.
– Throughput and latency improvements without churn

Future Work
Continue study of iterative routing
– Have shown virtual coordinates are good for timeouts
– How does congestion control work under churn?
Broaden methodology
– Better network and churn models
Move beyond the lookup layer
– Study put/get and multicast algorithms under churn

Conclusions/Recommendations
Avoid positive feedback cycles in recovery
– Beware of “false suspicions of failure”
– Recover periodically rather than reactively
Route around potential failures early
– Don’t wait to conclude definite failure
– TCP-style timeouts are quickest for recursive routing
– Virtual-coordinate-based timeouts are not prohibitive
PNS can be cheap and effective
– Only simple random sampling is needed

For code and more information: bamboo-dht.org