Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan
MIT Laboratory for Computer Science, Proceedings of ACM SIGCOMM 2001

Introduction
- Dynamo stores objects associated with a key through a simple interface: get(), put()
- It should be possible to scale Dynamo incrementally
- This requires the ability to partition data over the set of nodes (storage hosts)
- Dynamo relies on a concept called consistent hashing
  - The approach they used is similar to that found in Chord

Distributed Hash Tables (DHT)
- Operationally like standard hash tables
- Stores (key, value) pairs
  - The key is like a filename
  - The value can be file contents or a pointer to a location
- Goal: efficiently insert/lookup/delete (key, value) pairs
- Each peer stores a subset of the (key, value) pairs in the system

DHT
- Core operation: find the node responsible for a key
  - Map key to node
  - Efficiently route insert/lookup/delete requests to this node
- Allow for frequent node arrivals and departures
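
To see why the last requirement is the hard one, here is a minimal sketch (illustrative, not from the slides) of the naive key-to-node mapping, hashing a key modulo the current node count:

```python
import hashlib

def naive_key_to_node(key: str, nodes: list[str]) -> str:
    """Map a key to a node by hashing it modulo the node count."""
    h = int(hashlib.sha1(key.encode()).hexdigest(), 16)
    return nodes[h % len(nodes)]

nodes = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
print(naive_key_to_node("LetItBe", nodes))
# If one node joins or leaves, len(nodes) changes and almost every key
# maps to a different node -- the churn problem consistent hashing avoids.
```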

DHT
- Introduce a hash function to map the object being searched for to a unique global identifier:
  - e.g., h("NGC'02 Tutorial Notes") → 8045
- Distribute the range of the hash function among all nodes in the network
- Each node must "know about" at least one copy of each object that hashes within its range (when one exists)

DHT: Desirable Properties
- Key ID space (search space) is uniformly populated
  - Mapping of keys to IDs using (consistent) hashing
- A node is responsible for indexing all the keys in a certain subspace of the ID space
- Nodes have only partial knowledge of other nodes' responsibilities
- Messages should be routed to a node efficiently (small number of hops)
- Node arrival/departure should only affect a few nodes

Consistent Hashing
- The main idea: map both keys and nodes (node IPs) to the same (metric) ID space
- The ring is just one possibility; any metric space will do
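
A minimal sketch of the idea, with SHA-1 IDs truncated to a small m for readability: nodes and keys hash into one space, and each key belongs to the first node at or after it on the ring (its successor). The addresses, keys, and the choice m = 6 are all illustrative:

```python
import hashlib
from bisect import bisect_left

M = 6  # identifier bits; IDs live on a ring of 2**M positions

def chord_id(s: str) -> int:
    """Hash a node address or a key into the shared m-bit ID space."""
    return int(hashlib.sha1(s.encode()).hexdigest(), 16) % (2 ** M)

def successor(key_id: int, node_ids: list[int]) -> int:
    """First node ID clockwise from key_id, wrapping around the ring."""
    ids = sorted(node_ids)
    i = bisect_left(ids, key_id)
    return ids[i % len(ids)]  # past the largest ID, wrap to the smallest

node_ids = [chord_id(ip) for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3")]
k = chord_id("LetItBe")
print(f"key {k} is stored at node {successor(k, node_ids)}")
```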

Consistent Hashing
- With high probability, the hash function balances load (all nodes receive roughly the same number of keys)
- With high probability, when a node joins (or leaves) the network, only an O(1/N) fraction of the keys are moved to a different location
  - This is clearly the minimum necessary to maintain a balanced load

Consistent Hashing
- The consistent hash function assigns each node and key an m-bit identifier using SHA-1 as a base hash function
- A node's identifier is chosen by hashing the node's IP address
- A key identifier is produced by hashing the key
- For more info see:
  - D. R. Karger, E. Lehman, F. Leighton, M. Levine, D. Lewin, and R. Panigrahy, "Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the World Wide Web," in Proc. 29th ACM Symp. Theory of Computing, El Paso, TX, May 1997, pp. 654-663.
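
For instance, with the full 160-bit SHA-1 output as the identifier space, the two hashes might look like this (the address and key are illustrative):

```python
import hashlib

def node_id(ip: str) -> int:
    """Node identifier: SHA-1 of the node's IP address (m = 160 bits)."""
    return int(hashlib.sha1(ip.encode()).hexdigest(), 16)

def key_id(key: str) -> int:
    """Key identifier: SHA-1 of the key itself."""
    return int(hashlib.sha1(key.encode()).hexdigest(), 16)

print(hex(node_id("192.168.1.7")))
print(hex(key_id("LetItBe")))
```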

P2P Middleware: Differences
- Different P2P middleware systems differ in:
  - The choice of the ID space
  - The structure of their network of nodes (i.e., how each node chooses its neighbors)
  - For each object, node(s) whose range(s) cover that object must be reachable via a "short" path
- This is a major research topic

Chord
- m-bit identifier space for both keys and nodes
- Key identifier = SHA-1(key)
  - Key = "LetItBe" → ID = 50
  - Key = " " → ID = 70
- How do we assign keys to nodes?

Chord
- Nodes are organized in an identifier circle based on their node identifiers
- Keys are assigned to their successor node in the identifier circle, i.e., the node with the next-higher ID

Chord
- The hash function ensures an even distribution of nodes and keys on the circle
- The range covered by a node runs from the previous node's ID up to its own ID
- Assume an N-node network

Chord: Search Possibilities
- Tradeoff: routing table size vs. search cost
- Every peer knows every other peer: O(N) routing table size, one-hop search
- Every peer knows only its successor: O(1) routing table size, O(N) search time
- The compromise: each peer maintains a small routing table, the finger table of the next slide, giving O(log N) state and O(log N) search time
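
To make the tradeoff concrete (illustrative numbers, not from the slides): with N = 2^20 ≈ 10^6 peers, full knowledge means about a million routing entries per node but one-hop lookups; successor-only means a single entry but up to a million hops; the finger table introduced next needs only about log2 N = 20 distinct entries and resolves lookups in about 20 hops.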

Finger Table
- Let m be the number of bits in the key/node identifiers
- Each node n maintains a routing table with at most m entries, called the finger table
- The i-th entry in the table at node n contains the identity of the first node, s, that succeeds n by at least 2^(i-1)
  - s = successor(n + 2^(i-1))
  - s is called the i-th finger of node n
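
A sketch of computing one node's finger table, with a small illustrative m and an example ring (the node IDs below are illustrative):

```python
from bisect import bisect_left

M = 6  # identifier bits, so IDs live in [0, 2**M)

def successor(x: int, ids: list[int]) -> int:
    """First node ID clockwise from x on the identifier circle."""
    s = sorted(ids)
    i = bisect_left(s, x % (2 ** M))
    return s[i % len(s)]

def finger_table(n: int, ids: list[int]) -> list[int]:
    """finger[i] = successor(n + 2**(i-1)) for i = 1..m."""
    return [successor(n + 2 ** (i - 1), ids) for i in range(1, M + 1)]

ring = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]  # example node IDs
print(finger_table(8, ring))
# -> [14, 14, 14, 21, 32, 42]: successors of 9, 10, 12, 16, 24, 40
```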

Chord: Finger Table
Finger table: finger[i] = successor(n + 2^(i-1)), where 1 ≤ i ≤ m
O(log N) table size

The Chord algorithm – Scalable node localization

Chord: Search
- Assume node n is searching for key k. Node n does the following:
  - Find the i-th finger table entry of node n such that k ∈ [finger[i].start, finger[i+1].start)
  - If no such entry exists, forward the query to the node in the last entry of the finger table
  - Repeat these two steps until the condition in the first step is satisfied
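
A sketch of one common way to realize this search: forward each query to the closest finger preceding the key until the key falls between a node and its successor. The single-process setup and global-knowledge finger construction are illustrative simplifications:

```python
M = 6
RING = 2 ** M

def between(x: int, a: int, b: int) -> bool:
    """True if x lies in the half-open ring interval (a, b]."""
    if a == b:
        return True                  # a single-node ring owns every key
    if a < b:
        return a < x <= b
    return x > a or x <= b           # the interval wraps past zero

class Node:
    def __init__(self, nid: int, all_ids: list[int]):
        # Fingers come from global knowledge here for brevity; a real node
        # learns them through the join and stabilization protocols.
        self.id = nid
        self.succ = self._successor(nid + 1, all_ids)
        self.finger = [self._successor(nid + 2 ** i, all_ids) for i in range(M)]

    @staticmethod
    def _successor(x: int, ids: list[int]) -> int:
        x %= RING
        s = sorted(ids)
        return next((i for i in s if i >= x), s[0])

def lookup(nodes: dict, start: int, key: int) -> int:
    """Route a query for key from node start; return the responsible node."""
    if key in nodes:                 # a node's own ID is its key's home
        return key
    n = nodes[start]
    while not between(key, n.id, n.succ):
        # forward to the closest finger preceding the key, else the successor
        n = nodes[next((f for f in reversed(n.finger) if between(f, n.id, key)), n.succ)]
    return n.succ

ring = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
nodes = {i: Node(i, ring) for i in ring}
print(lookup(nodes, start=8, key=54))    # -> 56, after a few finger hops
```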

Chord: Join
- Nodes can join (and leave) at any time
- Challenge: preserving the ability to locate every key in the network
- Chord must preserve the following:
  - Each node's successor is correctly maintained
  - For every key k, node successor(k) is responsible for k
- For lookups to be fast, it is also desirable for the finger tables to be correct

Chord: Join Implementation
- Each node in Chord maintains a predecessor pointer
  - This consists of the Chord ID and IP address of the immediate predecessor of that node
  - It can be used to walk counterclockwise around the identifier circle
- The new node learns the identity of an existing Chord node by some external mechanism

Chord: Join Initialization Steps
- Assume n is the node to join
- Find any existing node n'
- Find the successor of n by querying n'. Label this successor(n)
- Ask successor(n) for its predecessor. This is labelled predecessor(successor(n))
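
A runnable sketch of these steps under a toy, single-process model (the Node class, the in_half_open helper, and the O(N) successor walk are illustrative simplifications, not the paper's RPC interface):

```python
class Node:
    def __init__(self, nid: int):
        self.id = nid
        self.successor = self       # a lone node is its own successor
        self.predecessor = self

    def find_successor(self, key: int) -> "Node":
        """Walk clockwise until key falls in (node, successor] (O(N) here)."""
        n = self
        while not in_half_open(key, n.id, n.successor.id):
            n = n.successor
        return n.successor

def in_half_open(x: int, a: int, b: int) -> bool:
    """x in (a, b] on the identifier circle."""
    if a == b:
        return True
    if a < b:
        return a < x <= b
    return x > a or x <= b

def join(n: Node, existing: Node) -> None:
    """The slide's steps: find successor(n), then ask it for its predecessor."""
    n.successor = existing.find_successor(n.id)
    n.predecessor = n.successor.predecessor

# Hand-built ring N8 -> N21 -> N32; N26 joins via N8.
n8, n21, n32 = Node(8), Node(21), Node(32)
for a, b in ((n8, n21), (n21, n32), (n32, n8)):
    a.successor, b.predecessor = b, a
n26 = Node(26)
join(n26, n8)
print(n26.successor.id, n26.predecessor.id)   # 32 21
```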

Chord: Join Example
- Assume N26 wants to join; it finds N8
- N8's finger table suggests that N26 will lie "between" N21 and N32

Chord: Join (Initialize Finger Table)
- Node n needs to have its finger table initialized
- Node n can ask its predecessor for a copy of its finger table to use as a starting point

Chord: Join (Changing Existing Finger Tables)
- Node n needs to be entered into the finger tables of some existing nodes
- Node n becomes the i-th finger of node p iff:
  - p precedes n by at least 2^(i-1); and
  - the i-th finger of node p succeeds n
- The first node p that satisfies these conditions is the immediate predecessor of n - 2^(i-1)
- For a given n, the algorithm starts with the i-th finger of node n and then continues to walk counterclockwise on the identifier circle until it encounters a node whose i-th finger precedes n
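
A brute-force, global-knowledge sketch of the same condition: recompute every node's fingers with and without the new node and report what changed (the real protocol instead walks predecessor pointers counterclockwise). The ring IDs follow the example on the next slides:

```python
M = 6

def successor(x: int, ids: list[int]) -> int:
    """First node ID clockwise from x on the identifier circle."""
    x %= 2 ** M
    s = sorted(ids)
    return next((i for i in s if i >= x), s[0])

def fingers(p: int, ids: list[int]) -> list[int]:
    return [successor(p + 2 ** (i - 1), ids) for i in range(1, M + 1)]

def finger_updates_for_join(n: int, ids: list[int]) -> list[tuple]:
    """(node p, index i, old finger) entries that must now point at new node n."""
    changes = []
    for p in ids:
        old, new = fingers(p, ids), fingers(p, ids + [n])
        changes += [(p, i + 1, old[i]) for i in range(M) if old[i] != new[i]]
    return changes

ring = [8, 14, 21, 32, 38, 48, 56]
print(finger_updates_for_join(26, ring))
# -> [(8, 5, 32), (14, 4, 32), (21, 1, 32), (21, 2, 32), (21, 3, 32), (56, 6, 32)]
```

Note that the sweep also catches N56 (whose 6th finger, start 56 + 32 mod 64 = 24, now points at N26), which the slides' "etc." elides.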

Chord: Join Example (add N26)

N21 (old finger table):
  N21+1 → N32, N21+2 → N32, N21+4 → N32, N21+8 → N32, N21+16 → N38, N21+32 → N56
N21 (new finger table):
  N21+1 → N26, N21+2 → N26, N21+4 → N26, N21+8 → N32, N21+16 → N38, N21+32 → N56

- i=1: Does N21 precede N26 by at least 1 (2^0)? Yes: N21+1 becomes N26
- i=2: Does N21 precede N26 by at least 2? Yes: N21+2 becomes N26
- i=3: Does N21 precede N26 by at least 4? Yes: N21+4 becomes N26
- i=4: Does N21 precede N26 by at least 8? No: evaluate N14

Chord: Join Example (add N26)

N14 (old finger table):
  N14+1 → N21, N14+2 → N21, N14+4 → N21, N14+8 → N32, N14+16 → N32, N14+32 → N48
N14 (new finger table):
  N14+1 → N21, N14+2 → N21, N14+4 → N21, N14+8 → N26, N14+16 → N32, N14+32 → N48

- i=4: Does N14 precede N26 by at least 8? Yes: N14+8 becomes N26
- i=5: Does N14 precede N26 by at least 16? No: evaluate N8, etc.

Chord: Join (Transferring Keys)
- Move responsibility for all the keys for which node n is now the successor
- Typically this involves moving the data associated with each key to the new node
- Node n can become the successor only for keys that were previously the responsibility of the node immediately following n
- So node n only needs to contact one node to transfer responsibility for all relevant keys

Chord: Join
- The previous discussion on join focuses on a single node joining
- What if there are multiple simultaneous node joins?
- Join requires that each node's successor is correctly maintained

Chord: Stabilization Protocol
- The successor/predecessor links are rebuilt by periodic stabilize notification messages
  - Sent by each node to its successor to inform it of the (possibly new) identity of the predecessor
- The successor pointers are then used to verify and correct finger table entries
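
A sketch of the stabilize/notify pair under the same toy single-process model (the method names follow the paper's pseudocode; the setup below replays the N21/N26/N32 example from the next slides):

```python
class Node:
    def __init__(self, nid: int):
        self.id = nid
        self.successor = self
        self.predecessor = None

def between_open(x: int, a: int, b: int) -> bool:
    """x in the open ring interval (a, b)."""
    if a == b:
        return x != a
    if a < b:
        return a < x < b
    return x > a or x < b

def stabilize(n: Node) -> None:
    """Periodic task: adopt any node that slipped in between n and its successor."""
    x = n.successor.predecessor
    if x is not None and between_open(x.id, n.id, n.successor.id):
        n.successor = x
    notify(n.successor, n)

def notify(s: Node, n: Node) -> None:
    """n tells s: 'I might be your predecessor.'"""
    if s.predecessor is None or between_open(n.id, s.predecessor.id, s.id):
        s.predecessor = n

n21, n32, n26 = Node(21), Node(32), Node(26)
n21.successor, n32.predecessor = n32, n21
n26.successor = n32                    # set during join
stabilize(n26)                         # N32 learns predecessor N26
stabilize(n21)                         # N21 learns successor N26, notifies it
print(n21.successor.id, n26.predecessor.id, n32.predecessor.id)  # 26 21 26
```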

Chord: Join/Stabilize Example

- N26 joins the system
- N26 acquires N32 as its successor
- N26 notifies N32
- N32 acquires N26 as its predecessor

Chord: Join/Stabilize Example
- N26 copies keys
- N21 runs stabilize() and asks its successor N32 for its predecessor, which is N26

Chord: Join/Stabilize Example
- N21 acquires N26 as its successor

Chord Stabilization
- Pointers and finger tables may be in a state of flux
- Is it possible that data will not be found?
  - Yes
- Recovery: try again

Chord: Node Failure
- Example (figure): ring with nodes N10, N80, N85, N102, N113, N120; N10 issues Lookup(90)
- N80 doesn't know its correct successor, so the lookup is incorrect

Chord: Node Failure
- Solution: use successor lists
  - Each node knows its r immediate successors
  - After a failure, a node will know its first live successor
  - Stabilize messages then correct the finger tables
- Replicas of the data associated with a key might be kept at the r successor nodes
  - Application dependent
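
A minimal sketch of successor-list failover, assuming a node can test liveness (the list contents and the alive set are illustrative, echoing the previous slide's figure):

```python
def first_live_successor(successor_list: list[int], alive: set[int]) -> int:
    """After a failure, route to the first live entry in the r-long successor list."""
    for s in successor_list:
        if s in alive:
            return s
    raise RuntimeError("all r successors failed simultaneously")

n80_successors = [85, 102, 113]        # r = 3 successors of N80
alive = {10, 102, 113, 120}            # N85 has failed
print(first_live_successor(n80_successors, alive))   # -> 102
```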

Chord Properties
- In a system with N nodes and K keys, with high probability:
  - each node is responsible for roughly K/N keys
  - each node maintains information about O(log N) other nodes
  - lookups are resolved with O(log N) hops
  - insertions cost O(log² N) messages
- The developers of Chord validated this through simulation studies
- Limitations: no consistency among replicas; hops have poor network locality

Chord: Network Locality
- Nodes close on the ring can be far apart in the underlying network (figure: N20, N40, N41, N80)