Distributed hash tables Protocols and applications Jinyang Li

Peer-to-peer systems
Symmetric in node functionality:
– Anybody can join and leave
– Everybody gives and takes
Great potential:
– Huge aggregate network capacity, storage, etc.
– Resilient to single points of failure and attack
E.g. Napster, Gnutella, BitTorrent, Skype, Joost

Motivation: the lookup problem
How do we locate something given its name? Examples:
– File sharing
– VoIP
– Content distribution networks (CDNs)

Strawman #1: a central search service
– Central service poses a single point of failure or attack
– Not scalable(?)
– Central service needs $$$

Improved #1: hierarchical lookup
– Performs hierarchical lookup (like DNS)
– More scalable than a central site
– The root of the hierarchy is still a single point of vulnerability

Strawman #2: flooding
– Symmetric design: no special servers needed
– Too much overhead traffic
– Limited in scale
– No guarantee of results
(Diagram: a query "Where is item K?" is flooded to every peer.)

Improved #2: super peers
– E.g. Kazaa, Skype
– Lookups only flood the super peers
– Still no guarantee of lookup results

DHT approach: routing
Structured:
– put(key, value); get(key)
– Maintain routing state for fast lookup
Symmetric:
– All nodes have identical roles
Many protocols:
– Chord, Kademlia, Pastry, Tapestry, CAN, Viceroy
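As a concrete illustration of this interface, here is a minimal single-process sketch of put/get; the ToyDHT class and its internal dict are stand-ins for a real overlay, not part of any of the protocols listed above.

```python
import hashlib

class ToyDHT:
    """Single-process stand-in for the put/get interface a DHT exposes."""

    def __init__(self):
        self.store = {}

    def _key_id(self, key: str) -> int:
        # Hash the key into a flat 160-bit identifier space.
        return int(hashlib.sha1(key.encode()).hexdigest(), 16)

    def put(self, key: str, value):
        # In a real DHT the pair is routed to the node responsible for the key ID.
        self.store.setdefault(self._key_id(key), []).append(value)

    def get(self, key: str):
        return self.store.get(self._key_id(key), [])

dht = ToyDHT()
dht.put("song.mp3", "198.51.100.7")
print(dht.get("song.mp3"))
```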

DHT ideas
– Scalable routing needs a notion of "closeness"
– Ring structure [Chord]

Different notions of "closeness"
– Tree-like structure [Pastry, Kademlia, Tapestry]
– Distance is measured as the length of the longest prefix matching the lookup key
(Diagram: example node IDs 011001, 011010, 011000 arranged in a prefix tree.)
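A small sketch of prefix-based closeness, assuming fixed-length binary IDs like those in the diagram; the function name is illustrative.

```python
def matching_prefix_len(id_a: str, id_b: str) -> int:
    """Length of the longest common prefix of two equal-length binary IDs."""
    n = 0
    for a, b in zip(id_a, id_b):
        if a != b:
            break
        n += 1
    return n

# A longer matching prefix means "closer" to the lookup key.
key = "011001"
print(sorted(["011010", "011000", "100110"],
             key=lambda node: -matching_prefix_len(node, key)))
```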

Different notions of "closeness"
– Cartesian coordinate space [CAN]: the unit square from (0,0) to (1,1) is partitioned into zones, one per node
– Distance is measured as geometric (Euclidean) distance
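A comparable sketch of CAN's notion of closeness, assuming each node owns a zone of the unit square and is represented here by its zone center; the coordinates are made up for illustration.

```python
import math

def can_distance(p, q):
    """Euclidean distance between two points in CAN's coordinate space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

key_point = (0.7, 0.2)            # a key hashed to a point in the unit square
zone_centers = [(0.25, 0.75), (0.75, 0.75), (0.75, 0.25), (0.25, 0.25)]
# The node whose zone center is geometrically closest handles the key.
print(min(zone_centers, key=lambda c: can_distance(c, key_point)))
```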

DHT: complete protocol design
1. Name nodes and data items
2. Define the data-item-to-node assignment
3. Define per-node routing table contents
4. Handle churn: How do new nodes join? How do nodes handle others' failures?

Chord: naming
– Each node has a unique flat ID, e.g. SHA-1(IP address)
– Each data item has a unique flat ID (key), e.g. SHA-1(data content)
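A sketch of how these flat IDs could be computed with SHA-1, as the slide suggests; the sample IP address and data are arbitrary.

```python
import hashlib

def node_id(ip_address: str) -> int:
    # Node ID: SHA-1 of the node's IP address, a 160-bit integer.
    return int(hashlib.sha1(ip_address.encode()).hexdigest(), 16)

def key_id(data: bytes) -> int:
    # Key: SHA-1 of the data content.
    return int(hashlib.sha1(data).hexdigest(), 16)

print(hex(node_id("192.0.2.10")))
print(hex(key_id(b"hello world")))
```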

Data-to-node assignment
– Consistent hashing on a circular ID space
– A key's predecessor is the node whose ID most closely precedes the key on the ring
– A key's successor (the first node clockwise from the key) is responsible for storing it
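A minimal consistent-hashing sketch on a toy 6-bit ring (real deployments use the full 160-bit space); successor() is a hypothetical helper showing which node is responsible for a key.

```python
RING_BITS = 6
RING_SIZE = 2 ** RING_BITS          # 64-slot ring, small for readability

def successor(key: int, node_ids: list[int]) -> int:
    """First node clockwise from the key; it is responsible for the key."""
    key %= RING_SIZE
    candidates = sorted(node_ids)
    for n in candidates:
        if n >= key:
            return n
    return candidates[0]            # wrap around the ring

nodes = [5, 20, 33, 48, 60]
print(successor(25, nodes))         # -> 33 stores key 25
print(successor(62, nodes))         # -> 5 (wrap-around)
```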

Key-based lookup (routing)
– Correctness: each node must know its correct successor

Key-based lookup
– Fast lookup: each node keeps O(log n) fingers at exponentially increasing distances
– Node x's i-th finger points to the node at or after x + 2^i on the ring
– Why is lookup fast? Each hop via the closest preceding finger at least halves the remaining distance to the key, so a lookup takes O(log n) hops
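A sketch of finger-based lookup on the same toy 6-bit ring, assuming a global view of membership for simplicity; it shows each hop forwarding to the closest preceding finger, which is why the hop count stays logarithmic.

```python
RING_BITS = 6
RING = 2 ** RING_BITS               # tiny 64-slot ring for readability
NODES = sorted([5, 20, 33, 48, 60])

def successor(x):
    """First node at or clockwise after position x on the ring."""
    for n in NODES:
        if n >= x % RING:
            return n
    return NODES[0]

def between(x, a, b):
    """True if x lies in the half-open clockwise interval (a, b]."""
    x, a, b = x % RING, a % RING, b % RING
    return (a < x <= b) if a < b else (x > a or x <= b)

def fingers(node):
    """i-th finger points at the successor of node + 2^i."""
    return [successor(node + 2 ** i) for i in range(RING_BITS)]

def lookup(start, key):
    node, hops = start, 0
    # Stop once the key falls between the current node and its successor.
    while not between(key, node, successor(node + 1)):
        # Forward to the finger that most closely precedes the key.
        node = max((f for f in fingers(node) if between(f, node, key)),
                   key=lambda f: (f - node) % RING)
        hops += 1
    return successor(key), hops

print(lookup(5, 42))                # -> (48, 1): responsible node and hop count
```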

Handling churn: join
A new node w joins an existing network:
– Issue a lookup to find its successor x
– Set its successor: w.succ = x
– Set its predecessor: w.pred = x.pred
– Obtain the subset of responsible items from x
– Notify the predecessor: x.pred = w

Handling churn: stabilization
Each node periodically fixes its state:
– Node w asks its successor x for x.pred
– If x.pred lies between w and x on the ring (i.e. is a "closer" successor), set w.succ = x.pred
Starting from any connected graph, stabilization eventually gives every node its correct successor
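A sketch of the stabilize/notify exchange, under the assumptions of a toy 6-bit ring and in-process Node objects; real nodes would do this over RPC.

```python
RING = 2 ** 6                        # tiny 6-bit ID space for readability

class Node:
    """Skeleton node holding only the pointers that stabilization repairs."""
    def __init__(self, node_id):
        self.id = node_id
        self.succ = self             # a lone node is its own successor
        self.pred = None

def between(x, a, b):
    """True if x lies strictly inside the clockwise ring interval (a, b)."""
    x, a, b = x % RING, a % RING, b % RING
    return (a < x < b) if a < b else (x > a or x < b)

def stabilize(w: Node):
    # Ask the successor for its predecessor; adopt it if it sits between us.
    x = w.succ.pred
    if x is not None and between(x.id, w.id, w.succ.id):
        w.succ = x
    notify(w.succ, w)

def notify(x: Node, w: Node):
    # x adopts w as its predecessor if w is closer than the current one.
    if x.pred is None or between(w.id, x.pred.id, x.id):
        x.pred = w

# Example: node 20 joined with succ = 33, but node 5 still points at 33.
# One round of stabilize() at node 5 pulls node 20 into its view of the ring.
n5, n20, n33 = Node(5), Node(20), Node(33)
n5.succ, n20.succ, n33.pred = n33, n33, n20
stabilize(n5)
print(n5.succ.id)                    # -> 20
```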

Handling churn: failures
Nodes may leave without notice:
– Each node keeps a list of s (e.g. 16) successors
– Keys are replicated on those s successors

Handling churn: fixing fingers
Fingers are much easier to maintain than successors:
– Any node at distance [2^i, 2^(i+1)] away will do as the i-th finger
– Geographically closer fingers → lower latency
– Periodically flush dead fingers

Using DHTs in real systems: Amazon's Dynamo
Key-value storage [SOSP'07], serving as persistent state for applications:
– Shopping carts, user preferences
How does it apply DHT ideas?
– Assigns key-value pairs to responsible nodes
– Recovers automatically from churn
New challenges?
– Managing read-write data consistency

Using DHTs in real systems: keyword search for file sharing
– E.g. eMule, Azureus
How to apply a DHT?
– A user has file 1f3d… named "jingle bell Britney"
– Insert the mappings SHA-1(jingle) → 1f3d, SHA-1(bell) → 1f3d, SHA-1(britney) → 1f3d
– How do we answer the queries "jingle bell" and "Britney"?
Challenges?
– Some keywords are much more popular than others
– What if the RIAA inserts a billion junk "Britney spears xyz" mappings?
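A sketch of how such a keyword index might be layered on put/get, using a local dictionary as a stand-in for the DHT; the file hash 1f3d is taken from the slide's example.

```python
import hashlib
from collections import defaultdict

index = defaultdict(set)            # stands in for DHT storage

def dht_key(word: str) -> str:
    return hashlib.sha1(word.lower().encode()).hexdigest()

def publish(file_hash: str, name: str):
    # Insert one mapping per keyword: SHA-1(keyword) -> file hash.
    for word in name.split():
        index[dht_key(word)].add(file_hash)

def search(query: str) -> set:
    # Intersect the result sets of the query's keywords.
    sets = [index[dht_key(w)] for w in query.split()]
    return set.intersection(*sets) if sets else set()

publish("1f3d", "jingle bell Britney")
print(search("jingle bell"))        # -> {'1f3d'}
print(search("Britney"))            # -> {'1f3d'}
```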

Using DHTs in real systems: CoralCDN

Motivation: alleviating flash crowds
(Diagram: browsers → HTTP proxies → origin server.)
– Proxies handle most client requests
– Proxies cooperate to fetch content from each other

Getting content with CoralCDN
1. Server selection: which CDN node should I use? Clients use CoralCDN via a modified domain name: nytimes.com/file → nytimes.com.nyud.net/file
2. Lookup(URL): which nodes are caching the URL?
3. Content transmission: from which caching nodes should I download the file?

Coral design goals
– Don't control data placement: nodes cache based on access patterns
– Reduce load at the origin server: serve files from cached copies whenever possible
– Low end-to-end latency: lookup and cache downloads optimize for locality
– Scalable and robust lookups

Given a URL:
– Where is the data cached?
– Map the name to locations: URL → {IP1, IP2, IP3, IP4}
– lookup(URL) → get the IPs of nearby caching nodes
– insert(URL, myIP, TTL) → add me as a node caching this URL for TTL seconds
Isn't this what a distributed hash table is for?

A straightforward use of a DHT
(Diagram: SHA-1(URL1) maps to one node, which stores URL1 = {IP1, IP2, IP3, IP4}; every proxy calls insert(URL1, myIP) at that node.)
Problems:
– No load balance for a single URL
– All inserts and lookups go to the same node (which cannot be close to everyone)

#1 Solving the load balance problem: relax hash-table semantics
DHTs are designed for hash-table semantics:
– Insert and replace: URL → IP_last
Coral instead inserts and appends: URL → {IP1, IP2, IP3, IP4}
Each Coral lookup needs only a few values:
– lookup(URL) → {IP2, IP4}
– Preferably ones that are close in the network
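A sketch of the relaxed insert-and-append semantics with TTL expiry, again as a single-process stand-in; the class and parameter names are illustrative rather than Coral's actual API.

```python
import time
from collections import defaultdict

class AppendStore:
    """Keeps multiple (value, expiry) entries per key instead of replacing."""
    def __init__(self):
        self.entries = defaultdict(list)

    def insert(self, url: str, ip: str, ttl: float):
        self.entries[url].append((ip, time.time() + ttl))

    def lookup(self, url: str, want: int = 2):
        now = time.time()
        fresh = [ip for ip, exp in self.entries[url] if exp > now]
        return fresh[:want]          # a few values suffice; ideally nearby ones

store = AppendStore()
for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3"):
    store.insert("example.com/file", ip, ttl=60)
print(store.lookup("example.com/file"))
```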

Preventing hotspots in the index
Routes converge toward the root node (the node whose ID is closest to the key):
– O(b) nodes are 1 hop from the root
– O(b^2) nodes are 2 hops from the root, and so on
(Diagram: lookup paths from leaf nodes with distant IDs converge over 1, 2, 3 hops onto the root.)

Preventing hotspots in the index
– Request load therefore increases exponentially toward the root, which stores URL = {IP1, IP2, IP3, IP4}

Rate-limiting requests
– A node refuses to cache a new location if it already holds the maximum number of "fresh" IPs for the URL
– Locations of popular items are therefore pushed down the tree
(Diagram: the root holds URL = {IP1, IP2, IP3, IP4}; a node one hop down holds URL = {IP3, IP4}; a leaf holds URL = {IP5}.)

Rate-limiting requests
– A node refuses to cache a new location if it already holds the maximum number of "fresh" IPs for the URL; locations of popular items are pushed down the tree
– Except that each node leaks through at most β inserts per minute per URL
– This bounds the rate of inserts toward the root, yet keeps the root's entries fresh
(Diagram: the root's entry is refreshed to URL = {IP1, IP2, IP6, IP4} while lower nodes still hold URL = {IP3, IP4} and URL = {IP5}.)
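A sketch combining the two cut-offs described above: stop caching once a node holds its maximum of fresh IPs for a URL, but leak at most β inserts per minute per URL toward the root. The constants and the final "dropped" fallback are illustrative simplifications, not Coral's exact behavior.

```python
import time
from collections import defaultdict

MAX_FRESH = 4        # max fresh locations cached per URL at one node
BETA = 12            # inserts leaked toward the root, per minute per URL

class RateLimitedNode:
    def __init__(self):
        self.cache = defaultdict(list)          # url -> [(ip, expiry)]
        self.leaked = defaultdict(list)         # url -> recent leak timestamps

    def handle_insert(self, url, ip, ttl=60.0):
        now = time.time()
        fresh = [(i, e) for i, e in self.cache[url] if e > now]
        if len(fresh) < MAX_FRESH:
            fresh.append((ip, now + ttl))
            self.cache[url] = fresh
            return "cached here"                # popular items stop lower in the tree
        self.leaked[url] = [t for t in self.leaked[url] if now - t < 60.0]
        if len(self.leaked[url]) < BETA:
            self.leaked[url].append(now)
            return "forwarded toward root"      # bounded leak keeps the root fresh
        return "dropped"                        # over both limits (sketch simplification)

node = RateLimitedNode()
print([node.handle_insert("example.com/file", f"10.0.0.{i}") for i in range(6)])
```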

Load balance results (494 nodes on PlanetLab)
– Aggregate request rate: ~12 million/min
– Rate limit per node (β): 12/min
– The root has fan-in from only 7 other nodes (per-link insert rates on the order of a few β)

#2 Solving the lookup locality problem
– Cluster nodes hierarchically based on RTT
– A lookup traverses up the hierarchy, routing to the "closest" node at each level

Preserving locality through the hierarchy
– Cluster levels are defined by RTT thresholds: < 20 ms, < 60 ms, and a global level with no threshold
– This minimizes lookup latency: lookups prefer values stored by nodes within the faster clusters
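A small sketch of assigning a peer to a cluster level from its measured RTT, using the thresholds on the slide (20 ms, 60 ms, then a global level); the level numbering is an assumption of this sketch.

```python
# RTT thresholds from the slide: fastest cluster < 20 ms, regional < 60 ms,
# everything else falls into the global (level-0) cluster.
LEVELS = [(2, 0.020), (1, 0.060), (0, float("inf"))]

def cluster_level(rtt_seconds: float) -> int:
    for level, threshold in LEVELS:
        if rtt_seconds < threshold:
            return level
    return 0

print(cluster_level(0.008))   # -> 2 (local cluster)
print(cluster_level(0.045))   # -> 1 (regional cluster)
print(cluster_level(0.200))   # -> 0 (global cluster)
```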

Clustering reduces lookup latency
– Clustering reduces median lookup latency by roughly a factor of 4

Putting it all together: Coral reduces server load
– Local disk caches begin to handle most requests
– Most remaining hits are served within the 20-ms Coral cluster
– Few requests reach the origin server
– 400+ nodes provide 32 Mbps, about 100x the capacity of the origin