“Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications” Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M. Frans Kaashoek, Frank Dabek, Hari Balakrishnan
Background Peer-to-peer systems: completely decentralized, no hierarchical control, each node with equivalent behavior. Numerous applications: Napster, Freenet, Gnutella, etc. The core operation is the lookup of data items in a distributed environment (distributed hash tables).
What Does Chord Provide? Chord is a protocol that provides one operation: mapping a key to a node. Applications may store the value associated with each key on the node the key maps to. Efficient and scalable: each node stores routing state for only O(log N) other nodes, and a lookup takes O(log N) messages, where N = the number of nodes. Load stays balanced even in the presence of node joins and leaves. Flexible naming: no constraint on the keys used; the keyspace is flat.
How is Chord Used? Chord is a library that is linked to the application using it. It provides a lookup(key) function that returns the IP address of the node responsible for the key. It notifies the application when the set of keys the node is responsible for changes.
Example of Chord-based storage system [Figure: three hosts, each running a Storage App layered on a Distributed Hashtable layered on Chord; host labels: Client, Server]
Chord Protocol: Consistent Hashing The heart of the Chord protocol is a “consistent hashing” mechanism that maps keys to nodes so that load is balanced across all nodes with high probability. Each node and key is assigned an m-bit identifier produced by a base hash function (SHA-1 is used). Consistent hashing assigns keys to nodes: –All identifiers (nodes and keys) are ordered on an identifier circle modulo 2^m. –Key k is assigned to the first node whose identifier is equal to or follows the identifier of k on the circle; this node is the successor node of k, denoted successor(k).
The Chord Ring [Figure: identifier circle with m = 6, n = 10 nodes (N1, N8, N14, N21, N32, N38, N42, N48, N51, N56) and k = 5 keys (K10, K24, K30, K38, K54)] With random identifiers for both nodes and keys, balanced load is achieved: with high probability, each node is responsible for at most (1 + ε)k/n keys. Key movement is minimal: when a node joins or leaves, only O(k/n) keys are transferred, and only between that node and one other node.
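The key-to-node assignment on this ring can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the names `chord_id` and `successor` are hypothetical, the sorted list stands in for the identifier circle, and reducing the SHA-1 digest modulo 2^m is an assumption made here for brevity.

```python
import hashlib
from bisect import bisect_left

M = 6  # identifier bits, as in the ring example (m = 6)
NODES = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]  # node identifiers, sorted


def chord_id(name, m=M):
    """Hash an arbitrary name to an m-bit identifier.

    Assumption: the SHA-1 digest is reduced modulo 2^m.
    """
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(key_id, nodes=NODES, m=M):
    """Return the first node whose identifier is equal to or follows key_id."""
    i = bisect_left(nodes, key_id % (2 ** m))
    return nodes[i % len(nodes)]  # wrap around past the largest identifier


# Assign the five keys from the ring example to their successor nodes.
assignment = {k: successor(k) for k in [10, 24, 30, 38, 54]}
print(assignment)
```

With the identifiers from the slide, K10 lands on N14, K24 and K30 on N32, K38 on N38, and K54 on N56.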
Simple Lookup
// ask node n to find the successor of id
n.find_successor(id)
  if (id ∈ (n, successor])
    return successor;
  else
    // forward the query around the circle
    return successor.find_successor(id);
This requires only that each node know its current successor, which is all that is needed for correctness. The cost is that a lookup may have to visit a number of nodes linear in N.
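To make the cost of successor-only routing concrete, here is a small Python sketch that walks the example ring and counts forwarding hops. The `Node` class, the `in_half_open` helper, and the iterative loop are assumptions made for illustration; the slide's pseudocode is recursive.

```python
def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b  # interval wraps past zero (a == b: whole circle)


class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self  # filled in when the ring is wired up


def build_ring(ids):
    """Link nodes into a ring in identifier order (hypothetical helper)."""
    nodes = [Node(i) for i in sorted(ids)]
    for i, node in enumerate(nodes):
        node.successor = nodes[(i + 1) % len(nodes)]
    return {node.id: node for node in nodes}


def simple_lookup(start, ident):
    """Follow successor pointers until ident falls in (node, successor]."""
    node, hops = start, 0
    while not in_half_open(ident, node.id, node.successor.id):
        node = node.successor  # forward the query around the circle
        hops += 1
    return node.successor, hops


ring = build_ring([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])
owner, hops = simple_lookup(ring[8], 54)
print(owner.id, hops)
```

Starting at N8, the query for key 54 is forwarded seven times before N51 answers with N56, which is why the next slides add more routing state.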
Efficient Lookup Keep more routing information per node in the form of a finger table. The finger table consists of up to m entries, where the i-th entry in the table at node n is the first node s that succeeds n by at least 2^(i-1) on the identifier circle: finger[i] = successor((n + 2^(i-1)) mod 2^m), 1 ≤ i ≤ m. Note that finger[1] is the immediate successor of n.
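As a worked example of the finger definition, the following Python sketch computes the finger table of a node on the example ring (m = 6). The global sorted node list is an assumption that stands in for real lookups; in the protocol, each entry is obtained by asking the ring for the successor of the finger's start.

```python
from bisect import bisect_left

M = 6
NODES = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]  # sorted node identifiers


def successor(ident):
    """First node whose identifier is equal to or follows ident on the circle."""
    i = bisect_left(NODES, ident % (2 ** M))
    return NODES[i % len(NODES)]


def finger_table(n):
    """Entries 1..m, where entry i is successor((n + 2^(i-1)) mod 2^m)."""
    return [successor((n + 2 ** (i - 1)) % (2 ** M)) for i in range(1, M + 1)]


print(finger_table(8))
print(finger_table(42))
```

For N8 the finger starts are 9, 10, 12, 16, 24, and 40, so the table is N14, N14, N14, N21, N32, N42. The first three entries coincide, which illustrates why far fewer than m distinct entries are typically stored.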
Efficient Lookup
// search the local table for the highest
// predecessor of id
n.closest_preceding_node(id)
  for i = m downto 1
    if (finger[i] ∈ (n, id))
      return finger[i];
  return n;

// ask node n to find the successor of id
n.find_successor(id)
  if (id ∈ (n, successor])
    return successor;
  else
    n’ = closest_preceding_node(id);
    return n’.find_successor(id);
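The finger-based lookup can be exercised on the example ring with the Python sketch below. It assumes every finger table is already correct (built here from a global view by the hypothetical `build_ring` helper) and adds a hop counter that the pseudocode does not have.

```python
from bisect import bisect_left

M = 6
IDS = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]  # sorted node identifiers


def between(x, a, b, inclusive_right=False):
    """True if x is in the circular interval (a, b), or (a, b] if requested."""
    if inclusive_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b  # interval wraps past zero


class Node:
    def __init__(self, ident):
        self.id = ident
        self.finger = []  # finger[0] corresponds to finger[1] on the slide

    @property
    def successor(self):
        return self.finger[0]

    def closest_preceding_node(self, ident):
        # Scan from the farthest finger down to the nearest.
        for f in reversed(self.finger):
            if between(f.id, self.id, ident):
                return f
        return self

    def find_successor(self, ident, hops=0):
        if between(ident, self.id, self.successor.id, inclusive_right=True):
            return self.successor, hops
        nxt = self.closest_preceding_node(ident)
        return nxt.find_successor(ident, hops + 1)


def build_ring(ids):
    """Create nodes with fully correct finger tables (global-view shortcut)."""
    nodes = {i: Node(i) for i in ids}
    for n in nodes.values():
        for k in range(M):
            start = (n.id + 2 ** k) % (2 ** M)
            j = bisect_left(ids, start) % len(ids)
            n.finger.append(nodes[ids[j]])
    return nodes


ring = build_ring(IDS)
owner, hops = ring[8].find_successor(54)
print(owner.id, hops)
```

From N8 the query for key 54 jumps to N42, then to N51, which returns N56: two forwarding hops, compared with seven when only successor pointers are followed.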
Efficient Lookup Properties Use of the finger table requires each node to store only a small amount of information. –It is claimed that only O(log N) entries need to be stored. Each node knows more about nodes close to it on the identifier circle than about those farther away. Intuitively, the entries in a node’s finger table lie at power-of-two intervals around the identifier circle, so a node can always cover at least half the remaining distance between itself and the target identifier. –With high probability, the number of nodes that must be contacted in a lookup is O(log N).
Joins and Stabilization Correctness is maintained by correct successor pointers. –As long as existing successor pointers are intact, lookups can occur concurrently with joins. A stabilization protocol is used to update successor pointers and finger tables.
Joins and Stabilization
// create a new Chord ring
n.create()
  predecessor = nil;
  successor = n;

// join a Chord ring containing node n’
n.join(n’)
  predecessor = nil;
  successor = n’.find_successor(n);

// called periodically. verifies n’s immediate
// successor and tells the successor about n
n.stabilize()
  x = successor.predecessor;
  if (x ∈ (n, successor))
    successor = x;
  successor.notify(n);

// n’ thinks it might be our predecessor
n.notify(n’)
  if (predecessor is nil or n’ ∈ (predecessor, n))
    predecessor = n’;

// called periodically. refreshes finger table entries.
// next stores the index of the next finger to fix
n.fix_fingers()
  next = next + 1;
  if (next > m)
    next = log(successor – n) + 1;
  finger[next] = find_successor(n + 2^(next-1));

// called periodically. checks whether predecessor
// has failed.
n.check_predecessor()
  if (predecessor has failed)
    predecessor = nil;
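The interplay of join, stabilize, and notify is easier to follow in a small simulation. The Python sketch below is a single-process toy under stated assumptions: calls are direct method invocations rather than network messages, finger tables and failure checks are omitted, `find_successor` walks successor pointers only, and the round-based scheduling and the `ring_order` helper are hypothetical.

```python
def between(x, a, b, inclusive_right=False):
    """True if x is in the circular interval (a, b), or (a, b] if requested."""
    if inclusive_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b  # wraps past zero; a == b covers the whole circle


class Node:
    def __init__(self, ident):
        # Equivalent to create(): a one-node ring.
        self.id = ident
        self.successor = self
        self.predecessor = None

    def find_successor(self, ident):
        node = self
        while not between(ident, node.id, node.successor.id, inclusive_right=True):
            node = node.successor
        return node.successor

    def join(self, known):
        self.predecessor = None
        self.successor = known.find_successor(self.id)

    def stabilize(self):
        x = self.successor.predecessor
        if x is not None and between(x.id, self.id, self.successor.id):
            self.successor = x  # a closer successor has appeared
        self.successor.notify(self)

    def notify(self, candidate):
        if self.predecessor is None or between(
            candidate.id, self.predecessor.id, self.id
        ):
            self.predecessor = candidate


def ring_order(start, limit=100):
    """Identifiers seen when following successor pointers once around."""
    order, node = [start.id], start.successor
    while node is not start and len(order) < limit:
        order.append(node.id)
        node = node.successor
    return order


first = Node(8)
others = [Node(i) for i in (14, 21, 32)]
for n in others:
    n.join(first)  # every newcomer initially points at N8
nodes = [first] + others
for _ in range(10):  # periodic stabilization rounds
    for n in nodes:
        n.stabilize()
print(ring_order(first))
```

All three newcomers start with N8 as their successor, and repeated stabilization rounds sort the pointers into the ring N8, N14, N21, N32 without any node ever having a global view.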
Impact of Joins on Lookup Efficiency Three scenarios: –If a node joins and both successor pointers and finger table entries have been updated, a lookup completes in O(log N) steps. –If a node joins and successor pointers have been updated but finger table entries have not, lookups that use the old finger entries may initially undershoot, but they will still find the correct node because the successor pointers are correct. Lookups may be slower. –If a node joins and successor pointers are still incorrect, or keys have not yet migrated to the new node, the lookup may fail.
Failures Failures break the condition that all successor pointers are correct. For example, if nodes 14, 21, and 32 fail, node 8 cannot look up key 30. The solution is to maintain a successor list of each node’s first r successors. –Modify the stabilize code to reconcile successor lists with the successor. –Modify closest_preceding_node() to also look in the successor list. –Modify the lookup code to handle node failures with timeouts and then try the next best predecessor.
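The failure example above can be reproduced with a short Python sketch. It is a simplification under several assumptions: successor lists are computed from a global view as they stood before the failures, failure detection is a set-membership test rather than a timeout, and routing follows successor lists only (no fingers). The names `successor_list` and `lookup` are hypothetical.

```python
RING = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]  # node identifiers, sorted


def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b


def successor_list(node, r):
    """The r nodes that follow `node` on the ring, as known before any failure."""
    i = RING.index(node)
    return [RING[(i + j) % len(RING)] for j in range(1, r + 1)]


def lookup(start, key, failed, r):
    """Route via the first live successor-list entry; None if all r have failed."""
    node = start
    while True:
        live = [s for s in successor_list(node, r) if s not in failed]
        if not live:
            return None  # every known successor has failed
        nxt = live[0]
        if in_half_open(key, node, nxt):
            return nxt
        node = nxt


failed = {14, 21, 32}
print(lookup(8, 30, failed, r=1))  # only the immediate successor is known
print(lookup(8, 30, failed, r=4))  # the list reaches past the failed nodes
```

With r = 1, node 8 knows only the failed N14 and the lookup cannot proceed. With r = 4, its list reaches N38, which is the live successor of key 30.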
Simulations The Chord protocol can be implemented iteratively or recursively. Simulations were performed with the iterative style, under the simplifying assumption that data movement by the application occurs without disrupting lookups.
Future Work Security (availability) of the lookup protocol. Resilience against network partitions.
References Presentation content and some diagrams and graphs were taken from: Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M. Frans Kaashoek, Frank Dabek, Hari Balakrishnan, Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications. To Appear in IEEE/ACM Transactions on Networking