Distributed Hash Tables

Distributed Hash Tables
Mike Freedman
COS 461: Computer Networks
http://www.cs.princeton.edu/courses/archive/spr14/cos461/

Scalable algorithms for discovery
- If many nodes are available to cache, to which one should a file be assigned?
- If content is cached on some node, how can we discover where it is located, avoiding a centralized directory or all-to-all communication?
- [Figure: origin server and several CDN servers]
- Akamai CDN: hashing to assign responsibility within a cluster
- Today: what if you don't know the complete set of nodes?

Partitioning Problem
- Consider the problem of data partitioning: given document X, choose one of k servers to use
- Suppose we use modulo hashing:
  - Number the servers 1..k
  - Place X on server i = (X mod k)
  - Problem? Data may not be uniformly distributed
- Place X on server i = hash(X) mod k
  - Problem? What happens if a server fails or joins (k -> k±1)?
  - Problem? What if different clients have different estimates of k?
- Answer: all entries get remapped to new nodes!
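
As a quick illustration of why this remapping hurts (a sketch with hypothetical document names, not part of the original slides), the snippet below counts how many keys change servers under hash(X) mod k when k goes from 10 to 11:

    import hashlib

    def server_for(key, k):
        # Hash the key, then take it modulo the number of servers.
        h = int(hashlib.sha1(key.encode()).hexdigest(), 16)
        return h % k

    keys = [f"doc-{i}" for i in range(10_000)]    # hypothetical document names
    moved = sum(server_for(x, 10) != server_for(x, 11) for x in keys)
    print(f"{moved} of {len(keys)} keys remapped when k went from 10 to 11")
    # Roughly 10/11 of the keys move, versus only ~1/11 under consistent hashing.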

Consistent Hashing
- Consistent hashing partitions the key-space among nodes
- Contact the appropriate node to lookup/store a key
- [Figure: nodes holding key1, key2, key3; lookup(key1) and insert(key1, value) messages]
- Blue node determines that the red node is responsible for key1
- Blue node sends the lookup or insert to the red node

Consistent Hashing
- Partitioning the key-space among nodes
- [Figure: ID ring with node IDs 0000, 0010, 0110, 1010, 1100, 1110, 1111 and keys URL1, URL2, URL3 at positions 0001, 0100, 1011]
- Nodes choose random identifiers: e.g., hash(IP)
- Keys are randomly distributed in the ID-space: e.g., hash(URL)
- Keys are assigned to the node "nearest" in ID-space
- Spreads ownership of keys evenly across nodes

Consistent Hashing Construction
- Assign n hash buckets to random points on a mod 2^k circle; hash key size = k
- Map each object to a random position on the circle
- Hash of object = closest clockwise bucket: successor(key) -> bucket
- [Figure: circle with buckets at positions 4, 8, 12, 14]
- Desired features:
  - Balanced: no bucket has a disproportionate number of objects
  - Smoothness: addition/removal of a bucket does not cause movement among existing buckets (only immediate buckets)
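
The same construction as a minimal Python sketch (the bucket names, the 32-bit toy circle, and the helper names are illustrative assumptions, not from the slides):

    import hashlib
    from bisect import bisect_left

    K = 32                                    # toy circle: IDs are mod 2^K

    def h(s, k=K):
        # Hash a string onto the identifier circle.
        return int(hashlib.sha1(s.encode()).hexdigest(), 16) % (2 ** k)

    class HashRing:
        def __init__(self, buckets):
            # Sorted (position, bucket) pairs: the random points on the circle.
            self.ring = sorted((h(b), b) for b in buckets)

        def successor(self, key):
            # Closest clockwise bucket, wrapping past the largest position.
            i = bisect_left(self.ring, (h(key), ""))
            return self.ring[i % len(self.ring)][1]

    ring = HashRing(["cache-a", "cache-b", "cache-c"])        # hypothetical buckets
    print(ring.successor("http://example.com/page1"))

Removing one bucket from the ring only remaps the keys whose closest clockwise bucket it was, onto the next bucket clockwise, which is exactly the smoothness property listed above.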

Consistent hashing and failures
- Consider a network of n nodes; if each node has 1 bucket:
  - It owns 1/n of the keyspace in expectation
  - This says nothing of request load per bucket
- Quiz #1: If a node fails:
  (A) Nobody owns its keyspace
  (B) Its keyspace is assigned to a random node
  (C) Its successor owns its keyspace
  (D) Its predecessor owns its keyspace
- Quiz #2: After a node fails:
  (A) Load is equally balanced over all nodes
  (B) Some node has disproportionate load compared to others
- [Figure: circle with buckets at positions 4, 8, 12, 14]
- Answer #1: (C)   Answer #2: (B)

Consistent hashing and failures
- Consider a network of n nodes; if each node has 1 bucket:
  - It owns 1/n of the keyspace in expectation
  - This says nothing of request load per bucket
- If a node fails, its successor takes over its bucket
  - Achieves the smoothness goal: only a localized shift, not O(n)
  - But now the successor owns 2 buckets: keyspace of size 2/n
- Instead, each node maintains v random node IDs, not 1
  - "Virtual" nodes spread over the ID space, each of size 1/(vn)
  - Upon failure, v successors take over, each now storing (v+1)/(vn)
- [Figure: circle with buckets at positions 4, 8, 12, 14]
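
Extending the earlier sketch with virtual nodes (again illustrative, with made-up node names): each physical node is hashed onto the circle v times, so when one fails its keyspace is absorbed by several different successors instead of a single neighbor.

    import hashlib
    from bisect import bisect_left

    K = 32

    def h(s, k=K):
        return int(hashlib.sha1(s.encode()).hexdigest(), 16) % (2 ** k)

    class VirtualRing:
        def __init__(self, nodes, v=8):
            # Each physical node appears v times on the circle, as "name#i".
            self.ring = sorted((h(f"{n}#{i}"), n) for n in nodes for i in range(v))

        def owner(self, key):
            i = bisect_left(self.ring, (h(key), ""))
            return self.ring[i % len(self.ring)][1]

    nodes = ["n1", "n2", "n3", "n4"]                  # hypothetical node names
    ring_before = VirtualRing(nodes)
    ring_after = VirtualRing([n for n in nodes if n != "n3"])   # n3 fails

    keys = [f"key-{i}" for i in range(10_000)]
    moved_to = {ring_after.owner(k) for k in keys if ring_before.owner(k) == "n3"}
    print("n3's keys were absorbed by:", moved_to)    # typically more than one node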

Consistent hashing vs. DHTs

                                  Consistent Hashing    Distributed Hash Tables
    Routing table size            O(n)                  O(log n)
    Lookup / Routing              O(1)                  O(log n)
    Join/leave: Routing updates
    Join/leave: Key movement

Distributed Hash Table
- [Figure: binary ID space with node IDs 0000, 0010, 0110, 1010, 1100, 1110, 1111 and keys at 0001, 0100, 1011]
- Nodes' neighbors are selected from a particular distribution
- Visualize the keyspace as a tree, organized by distance from a node

Distributed Hash Table
- [Figure: keyspace drawn as a binary tree over node IDs 0000, 0010, 0110, 1010, 1100, 1110, 1111]
- Nodes' neighbors are selected from a particular distribution
- Visualize the keyspace as a tree, organized by distance from a node
- At least one neighbor is known per subtree of increasing size/distance from the node

Distributed Hash Table
- [Figure: keyspace drawn as a binary tree over node IDs 0000, 0010, 0110, 1010, 1100, 1110, 1111]
- Nodes' neighbors are selected from a particular distribution
- Visualize the keyspace as a tree, organized by distance from a node
- At least one neighbor is known per subtree of increasing size/distance from the node
- Route greedily towards the desired key via overlay hops

The Chord DHT
- Chord ring: ID space mod 2^160
  - nodeid = SHA1(IP address, i) for i = 1..v virtual IDs
  - keyid = SHA1(name)
- Routing correctness:
  - Each node knows its successor and predecessor on the ring
- Routing efficiency:
  - Each node knows O(log n) well-distributed neighbors
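
A small illustration of this ID assignment (the exact encoding of the (IP address, i) pair before hashing is an assumption; the slide only says SHA1 is applied):

    import hashlib

    RING_BITS = 160                      # Chord ID space is mod 2^160

    def sha1_id(data: str) -> int:
        # SHA-1 already yields 160 bits, so the mod is a no-op kept for clarity.
        return int(hashlib.sha1(data.encode()).hexdigest(), 16) % (2 ** RING_BITS)

    def node_ids(ip: str, v: int):
        # One virtual ID per (IP address, i) pair, i = 1..v.
        return [sha1_id(f"{ip}:{i}") for i in range(1, v + 1)]

    def key_id(name: str) -> int:
        return sha1_id(name)

    print(node_ids("10.0.0.1", 3))       # hypothetical node with 3 virtual IDs
    print(key_id("example-file.txt"))    # the key is stored at this ID's successor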

Basic lookup in Chord
- Route hop by hop via successors
- O(n) hops to find the destination id

    lookup(id):
      if ( id > pred.id && id <= my.id )
        return my.id;
      else
        return succ.lookup(id);

Efficient lookup in Chord
- Route greedily via distant "finger" nodes
- O(log n) hops to find the destination id

    lookup(id):
      if ( id > pred.id && id <= my.id )
        return my.id;
      else
        // fingers() by decreasing distance
        for finger in fingers():
          if id >= finger.id
            return finger.lookup(id);
        return succ.lookup(id);

Building routing tables
- For i in 1..log n: finger[i] = successor( (my.id + 2^i) mod 2^160 )
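
To connect the finger formula with the lookup on the previous slides, here is a small self-contained simulation in Python (a toy 16-bit ID space and made-up node IDs rather than the real 2^160 space): it builds a finger table for each node and performs the greedy, farthest-finger-first lookup, counting overlay hops.

    from bisect import bisect_left

    BITS = 16
    MOD = 2 ** BITS

    class ChordSim:
        def __init__(self, node_ids):
            self.nodes = sorted(node_ids)
            # finger[i] of node n = successor of (n + 2^i) mod 2^BITS
            self.fingers = {
                n: [self.successor((n + 2 ** i) % MOD) for i in range(BITS)]
                for n in self.nodes
            }

        def successor(self, ident):
            i = bisect_left(self.nodes, ident)
            return self.nodes[i % len(self.nodes)]

        def predecessor(self, n):
            return self.nodes[self.nodes.index(n) - 1]

        def between(self, ident, lo, hi):
            # True if ident lies in the ring interval (lo, hi].
            if lo < hi:
                return lo < ident <= hi
            return ident > lo or ident <= hi

        def lookup(self, start, key):
            node, hops = start, 0
            while not self.between(key, self.predecessor(node), node):
                nxt = self.successor((node + 1) % MOD)    # default: immediate successor
                # Greedy step: farthest finger that does not overshoot the key.
                for f in reversed(self.fingers[node]):
                    if self.between(f, node, key):
                        nxt = f
                        break
                node, hops = nxt, hops + 1
            return node, hops

    sim = ChordSim([37, 812, 2044, 9000, 15000, 30111, 45678, 60000])   # toy node IDs
    print(sim.lookup(start=37, key=46000))    # -> (60000, 2) on this node set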

Joining and managing routing
- Choose a nodeid
- Lookup(my.id) to find your place on the ring
  - During the lookup, discover your future successor
- Learn your predecessor from your successor
- Update succ and pred that you joined
- Find fingers by lookup( (my.id + 2^i) mod 2^160 )
- Monitor: if a neighbor doesn't respond for some time, find a new one
- Leave: just go, already! (Warn your neighbors if you feel like it)
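
A minimal sketch of the splice-in part of joining (illustrative only: it uses the O(n) successor-only lookup from the "Basic lookup" slide, and omits finger construction and failure monitoring):

    def in_interval(x, lo, hi):
        # True if x lies in the ring interval (lo, hi].
        if lo < hi:
            return lo < x <= hi
        return x > lo or x <= hi          # wrapped interval (or a single node)

    class RingNode:
        def __init__(self, ident):
            self.id = ident
            self.succ = self              # a lone node is its own successor
            self.pred = self              # ...and predecessor

        def lookup(self, key):
            # Hop successor-to-successor until we reach the key's owner.
            node = self
            while not in_interval(key, node.pred.id, node.id):
                node = node.succ
            return node

        def join(self, bootstrap):
            # Lookup(my.id) finds our future successor; learn the predecessor
            # from it, then tell both neighbors that we joined.
            self.succ = bootstrap.lookup(self.id)
            self.pred = self.succ.pred
            self.pred.succ = self
            self.succ.pred = self

    a, b, c = RingNode(10), RingNode(200), RingNode(100)
    b.join(a)
    c.join(a)
    print(a.lookup(150).id)               # -> 200, the successor of 150 on the ring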

Performance optimizations
- [Figure: keyspace drawn as a binary tree over node IDs 0000, 0010, 0110, 1010, 1100, 1110, 1111]
- Routing entries need not be drawn from the strict distribution of the finger algorithm shown
- Choose the candidate node with the lowest latency to you
- It will still get you ~½ closer to the destination
- There is less flexibility in choice as you get closer to the destination

DHT Design Goals
- An "overlay" network with:
  - Flexible mapping of keys to physical nodes
  - Small network diameter
  - Small degree (fanout)
  - Local routing decisions
  - Robustness to churn
  - Routing flexibility
  - Decent locality (low "stretch")
- Different "storage" mechanisms considered:
  - Persistence with additional mechanisms for fault recovery
  - Best-effort caching and maintenance via soft state

Storage models
- Store only on the key's immediate successor
  - Churn, routing issues, and packet loss make lookup failure more likely
- Store on the k successors
  - When nodes detect that a successor/predecessor failed, they re-replicate
- Use erasure coding: can recover with j-out-of-k "chunks" of the file, each chunk smaller than a full replica
- Cache along the reverse lookup path
  - Provided the data is immutable
  - ...and responses are returned recursively along the path
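
A toy sketch of the "store on k successors" idea (the node set and k are hypothetical): given the sorted ring, a key is replicated on its successor and the k-1 nodes that follow it clockwise.

    from bisect import bisect_left

    def replica_set(nodes, key_id, k=3):
        # nodes: node IDs on the ring; the key is stored on its successor
        # plus the next k-1 nodes clockwise.
        nodes = sorted(nodes)
        i = bisect_left(nodes, key_id)
        return [nodes[(i + j) % len(nodes)] for j in range(k)]

    ring = [37, 812, 2044, 9000, 15000, 30111, 45678, 60000]
    print(replica_set(ring, key_id=46000))    # -> [60000, 37, 812]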

Summary
- Peer-to-peer systems
  - Unstructured systems (next Monday): finding hay, performing keyword search
  - Structured systems (DHTs): finding needles, exact match
- Distributed hash tables
  - Based around consistent hashing with views of O(log n)
  - Chord, Pastry, CAN, Koorde, Kademlia, Tapestry, Viceroy, ...
- Lots of systems issues
  - Heterogeneity, storage models, locality, churn management, underlay issues, ...
- DHTs deployed in the wild: Vuze (Kademlia) has 1M+ active users