Chord: A scalable peer-to-peer lookup service for Internet applications. Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan.


1 Chord: A scalable peer-to-peer lookup service for Internet applications
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan
Kathleen Ting, 23 November 2004

2 The Chord project aims to build scalable, robust, distributed systems using peer-to-peer ideas.
- Its basis is the Chord distributed hash lookup primitive.
- Chord is completely decentralized and symmetric, and can find data using only log(N) messages, where N is the number of nodes in the system.
- Chord's lookup mechanism is provably robust in the face of frequent node failures and re-joins.

3 Goal: Better Peer-to-Peer Storage
- Lookup is the key problem
- Lookup is not easy:
  - Gnutella scales badly
  - Freenet is imprecise
- Chord lookup provides:
  - Good naming semantics and efficiency
  - Elegant base for layered features

4 Chord Architecture
- Interface: lookup(DocumentID) → NodeID, IP-Address
- Chord consists of:
  - Consistent hashing
  - Small routing tables: log(n)
  - Fast join/leave protocol
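The consistent-hashing part of this architecture can be sketched in a few lines of Python. This is only an illustration, not the Chord implementation: the names chord_id and successor, the 16-bit identifier size, and the sorted global list of node IDs are assumptions made for readability (a real Chord node never holds the full membership list).

```python
import hashlib
from bisect import bisect_left
from typing import List

M = 16  # number of identifier bits (assumption: kept small for readability)


def chord_id(name: str, m: int = M) -> int:
    """Hash a name (document ID or node address) onto the 2**m identifier circle."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(node_ids: List[int], key_id: int) -> int:
    """Return the first node ID at or clockwise after key_id.

    node_ids must be sorted; the modulo wraps past the highest ID back to the lowest.
    """
    i = bisect_left(node_ids, key_id)
    return node_ids[i % len(node_ids)]
```

With a hypothetical ring of nodes 10, 80, and 200, key 90 is assigned to node 200 and key 201 wraps around to node 10.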

5 Chord Uses log(N) “Fingers”
[Figure: circular 7-bit ID space; the fingers of node N80 span ½, ¼, 1/8, 1/16, 1/32, 1/64, and 1/128 of the ring.]
N80 knows of only seven other nodes.
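The finger layout for N80 can be worked through numerically. The sketch below assumes the usual rule that finger i aims at identifier n + 2^i (mod 2^m); the function names and the ring of node IDs are made up for illustration and do not come from the slides.

```python
from bisect import bisect_left


def finger_starts(n, m):
    """Identifiers n + 2**i (mod 2**m) for i = 0..m-1: the points each finger aims at."""
    return [(n + 2 ** i) % (2 ** m) for i in range(m)]


def finger_table(n, m, node_ids):
    """Resolve each finger start to the first live node at or clockwise after it."""
    ring = sorted(node_ids)
    return [ring[bisect_left(ring, s) % len(ring)] for s in finger_starts(n, m)]


# Hypothetical node IDs on a 7-bit ring (only N80 appears in the slide).
HYPOTHETICAL_RING = [5, 20, 32, 60, 80, 87, 99, 110]

print(finger_starts(80, 7))                     # [81, 82, 84, 88, 96, 112, 16]
print(finger_table(80, 7, HYPOTHETICAL_RING))   # [87, 87, 87, 99, 99, 5, 20]
```

The seven finger targets sit 1/128, 1/64, ..., ½ of the way around the ring from N80. On a sparse ring several fingers can resolve to the same node, so seven entries is an upper bound on the distinct nodes N80 knows.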

6 Contributions from the Chord paper
1. Protocol that solves the lookup problem
   1. Addition and deletion of Chord server nodes
   2. Insert, update, and lookup of unstructured key/value pairs
2. Simple system that uses it for storing information
3. Evaluation of Chord protocol and system
   1. Theoretical proofs
   2. Simulation results based on 10,000 nodes
   3. Measurement of actual implementation of Chord system

7 The Chord protocol supports just one operation
- Determine the node in a distributed system that stores the value for a given key
- The Chord protocol uses a variant of consistent hashing to assign keys to Chord server nodes
  - Load tends to be balanced
  - When the Nth node joins (or leaves) the network, only an O(1/N) fraction of the key/value pairs are moved to a different location
- Previous research on consistent hashing → impractical to scale because nodes know about every other node in the network
- Chord → each node maintains information about only O(log N) nodes, resolves lookups using only O(log N) messages, and updates require only O(log² N) messages when a node joins or leaves
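The claim that a join moves only a small fraction of the keys can be checked on a toy ring. The following sketch is an assumption-laden illustration: it uses a global sorted list of node IDs and evenly spaced hypothetical nodes, whereas real Chord IDs come from a hash function.

```python
from bisect import bisect_left


def successor(node_ids, key_id):
    """First node ID at or clockwise after key_id (node_ids must be sorted)."""
    i = bisect_left(node_ids, key_id)
    return node_ids[i % len(node_ids)]


def keys_moved_by_join(node_ids, new_id, keys):
    """Return the keys whose responsible node changes when new_id joins."""
    before = sorted(node_ids)
    after = sorted(before + [new_id])
    return [k for k in keys if successor(before, k) != successor(after, k)]


# Hypothetical 8-bit ring with four nodes; node 32 joins.
moved = keys_moved_by_join([0, 64, 128, 192], 32, range(256))
print(len(moved), "of 256 keys moved")  # 32 of 256 keys moved
```

Only the keys in the interval (0, 32], previously held by node 64, change owner; every other node keeps its keys untouched.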

8 Benefits of Chord
- Decentralized
- Automatically adapts when hosts leave and join
- Scalable
- Guarantees that queries make a logarithmic number of hops and that keys are well balanced
- Uses heuristic to achieve network proximity
- Doesn’t require the availability of geographic-location information
- Prevents single points of failure

9 Chord vs. DNS
- Similarities
  - Map names to values
- Differences
  - No special root servers
  - No restrictions on the format and meaning of names, as Chord names are just the keys of key/value pairs
  - No attempt to solve administrative problems

10 Chord vs. Freenet
- Similarities
  - Decentralized, symmetric, and automatically adapts when hosts leave and join
- Differences
  - Queries always result in success or definitive failure
  - Scalable
  - The costs of inserting and retrieving values, and of adding and removing hosts, grow slowly with the total number of hosts and key/value pairs

11 API of the Chord system
Function               Description
insert(key, value)     Inserts a key/value binding at r distinct nodes.
lookup(key)            Returns the value associated with the key.
update(key, newval)    Inserts the key/newval binding at r nodes. Under stable conditions, exactly r nodes contain the key/newval binding.
join(n)                Causes a node to add itself as a server to the Chord system that node n is part of. Returns success or failure.
leave()                Leaves the Chord system. No return value.

12 Chord system
- Implemented as an application-layer overlay network of Chord server nodes
- Each node maintains a subset of the key/value pairs as well as routing table entries that point to a subset of carefully chosen Chord servers
- Chord clients may, but don’t have to, run on the same hosts as Chord server nodes

13 Chord system design goals
- Scalability
- Availability
- Load-balanced operation
- Dynamism
- Updatability
- Locating according to proximity

14 Scalable key location
- Each node stores information about only a small number of other nodes
- The amount of information maintained about other nodes falls off exponentially with the distance in key-space between the two nodes
- The finger table of a node may not contain enough information to determine the successor of an arbitrary key k

15 Scalable key location

16 What happens when a node n doesn’t know the successor of a key k? Theorem: With high probability, the number of nodes that must be contacted to resolve a successor query in an N-node network is O(log N). Each recursive call to find the successor halves the distance to the target identifier.
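A finger-table lookup of this kind can be sketched as follows. This is a simplified simulation under stated assumptions, not the paper's code: finger tables are built from global knowledge by a hypothetical build_ring helper, the loop is iterative rather than recursive, and the ring of node IDs is made up. Each pass through the loop stands for one forwarded message.

```python
from bisect import bisect_left


def in_open(x, a, b):
    """True if x lies in the circular interval (a, b)."""
    return a < x < b if a < b else (x > a or x < b)


def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    return a < x <= b if a < b else (x > a or x <= b)


class Node:
    def __init__(self, node_id):
        self.id = node_id
        self.fingers = []  # fingers[i] is the node succeeding id + 2**i

    @property
    def successor(self):
        return self.fingers[0]


def build_ring(ids, m):
    """Build correct finger tables from global knowledge (simulation shortcut)."""
    ids = sorted(ids)
    nodes = {i: Node(i) for i in ids}
    for n in nodes.values():
        for i in range(m):
            start = (n.id + 2 ** i) % (2 ** m)
            n.fingers.append(nodes[ids[bisect_left(ids, start) % len(ids)]])
    return nodes


def closest_preceding_finger(n, key_id):
    """Farthest finger of n that still lies strictly between n and the key."""
    for f in reversed(n.fingers):
        if in_open(f.id, n.id, key_id):
            return f
    return n


def find_successor(start, key_id):
    """Return (node responsible for key_id, number of forwarding hops)."""
    n, hops = start, 0
    while not in_half_open(key_id, n.id, n.successor.id):
        n = closest_preceding_finger(n, key_id)
        hops += 1
    return n.successor, hops


nodes = build_ring([5, 20, 32, 60, 80, 87, 99, 110], 7)  # hypothetical ring
owner, hops = find_successor(nodes[80], 19)
print(owner.id, hops)  # 20 1
```

In this run N80 does not know the successor of key 19, but its finger pointing half-way around the ring reaches node 5, whose immediate successor (node 20) is responsible for the key.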

17 Resolve successor query

18 Node joins and departures
- In a dynamic network, nodes can join and leave at any time.
- Preserve the ability to locate every key in the network:
  - Each node’s finger table is correctly filled
  - Each key k is stored at node successor(k)
- If a node is not the immediate predecessor of the key, then its finger table will hold a node closer to the key, to which the query will be forwarded, until the key’s successor node is reached.
- Theorem: With high probability, any node joining or leaving an N-node Chord network will use O(log² N) messages to re-establish the Chord routing invariants.
- Predecessor pointer
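The role of the predecessor pointer during a join can be illustrated with a minimal sketch. It is deliberately simplified and rests on assumptions: it maintains only successor and predecessor pointers (no finger tables, so lookups here walk the ring linearly), it runs in a single process, and the method names join, stabilize, and notify are choices made for this sketch. It does not demonstrate the O(log² N) message bound.

```python
def in_open(x, a, b):
    """True if x lies in the circular interval (a, b)."""
    return a < x < b if a < b else (x > a or x < b)


def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    return a < x <= b if a < b else (x > a or x <= b)


class Node:
    def __init__(self, node_id):
        self.id = node_id
        self.successor = self      # a lone node is its own successor
        self.predecessor = None

    def find_successor(self, key_id):
        """Walk successor pointers until the key's owner is found."""
        n = self
        while not in_half_open(key_id, n.id, n.successor.id):
            n = n.successor
        return n.successor

    def join(self, known):
        """Join the ring that `known` belongs to by looking up our own ID."""
        self.predecessor = None
        self.successor = known.find_successor(self.id)

    def stabilize(self):
        """Adopt a closer successor if one has appeared, then announce ourselves."""
        x = self.successor.predecessor
        if x is not None and in_open(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate):
        """Accept `candidate` as predecessor if it is closer than the current one."""
        if self.predecessor is None or in_open(
            candidate.id, self.predecessor.id, self.id
        ):
            self.predecessor = candidate


def run_rounds(ring_nodes, rounds=3):
    """Run a few stabilization rounds over all nodes (simulation helper)."""
    for _ in range(rounds):
        for n in ring_nodes:
            n.stabilize()
```

For example, starting from a lone node 10, joining node 80 and then node 40 and running a few stabilization rounds after each join yields the ring 10 → 40 → 80 → 10 with matching predecessor pointers.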

19 Update fingers and predecessors of existing nodes

20 Chord lookup cost is O(log N)
[Figure: average messages per lookup versus number of nodes.]
Constant is ½

21 Chord properties
- As long as every node knows its immediate predecessor and successor, no lookup will stall anywhere except at the node responsible for a key.
- Any other node will know of at least one node (its successor) that is closer to the key than itself, and will forward the query to that closer node.

22 Chord properties
- Log(n) lookup messages and table space.
- Well-defined location for each ID.
- No search required.
- Natural load balance.
- No name structure imposed.
- Minimal join/leave disruption.
- Does not store documents…

23 Conclusion
- Intended to be used by decentralized, large-scale distributed applications
- Because many distributed applications need to determine the node that stores a data item
- Given a key, Chord will determine the node responsible for storing the key’s value
- Maintains routing information for O(log N) nodes
- Resolves lookups with O(log N) messages
- Updates with O(log² N) messages

24 Conclusion
- Chord provides distributed lookup
- Efficient, low-impact join and leave
- Flat key space allows flexible extensions
- Good foundation for peer-to-peer systems
http://www.pdos.lcs.mit.edu/chord

25 Have a Happy Thanksgiving!

