1 “Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications” Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M. Frans Kaashoek, Frank Dabek, Hari Balakrishnan
2 Background
- Peer-to-peer systems: completely decentralized, no hierarchical control, each node with equivalent behavior
- Numerous applications: Napster, Freenet, Gnutella, etc.
- Core operation is the lookup of data items in a distributed environment (distributed hash tables)
3 What Does Chord Provide?
- Chord is a protocol that provides one operation: mapping a key to a node
- Applications may store the value associated with each key on the node the key maps to
- Scalable: each node keeps O(log N) routing state and a lookup takes O(log N) messages, where N is the number of nodes
- Load balanced, even in the presence of node joins and leaves
- Flexible naming: no constraints on the keys used; the keyspace is flat
4 How is Chord Used?
- Chord is a library that is linked into the application using it
- Provides a lookup(key) function that returns the IP address of the node responsible for the key
- Notifies the application when the set of keys the node is responsible for changes
5 Example of Chord-based storage system
[Figure: three stacks, each labeled Storage App, Distributed Hashtable, and Chord; one stack is marked Client and two are marked Server]
6 Chord Protocol: Consistent Hashing
- The heart of the Chord protocol is a “consistent hashing” mechanism that maps keys to nodes so that, with high probability, load is balanced across all nodes
- Each node and key is assigned an m-bit identifier produced by a base hash function (SHA-1 is used)
- Consistent hashing assigns keys to nodes:
  - All identifiers (nodes and keys) are ordered on an identifier circle modulo 2^m
  - Key k is assigned to the first node whose identifier is equal to or follows the identifier of k; that node is the successor node of k, denoted successor(k)
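The mapping on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: reducing the SHA-1 digest modulo 2^m is an assumption (the slide only names SHA-1 as the base hash), and the names chord_id and successor are hypothetical helpers.

```python
import hashlib
from bisect import bisect_left

M = 6  # identifier bits; deliberately tiny (assumption, for illustration only)

def chord_id(name, m=M):
    """Hash a string with SHA-1 and reduce it to an m-bit identifier."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)

def successor(key_id, node_ids):
    """First node identifier equal to or following key_id on the circle."""
    ring = sorted(node_ids)
    idx = bisect_left(ring, key_id)   # first node id >= key_id
    return ring[idx % len(ring)]      # wrap around past the largest id

# Keys are hashed the same way as nodes, then assigned to their successor.
key = chord_id("some-file-name")
print(key, successor(key, [1, 8, 14, 21, 32]))
```

The wrap-around in successor is what makes the identifier space a circle rather than a line.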
7 The Chord Ring
- With random identifiers for both nodes and keys, balanced load is achieved.
- With high probability, each node is responsible for at most (1 + ε)k/n keys, where k is the number of keys and n the number of nodes.
- Key movement is minimal: when a node joins or leaves, only O(k/n) keys move between it and one other node.
[Figure: identifier circle with m = 6, n = 10 nodes (N1, N8, N14, N21, N32, N38, N42, N48, N51, N56) and k = 5 keys (K10, K24, K30, K38, K54)]
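As a worked example, the sketch below assigns the five keys from the figure to the ten nodes and then shows which keys move when one more node joins. The joining node N26 is a hypothetical addition that does not appear on the slide, and assign and successor are illustrative helper names.

```python
from bisect import bisect_left

NODES = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]  # node ids from the figure
KEYS = [10, 24, 30, 38, 54]                     # key ids from the figure

def successor(key_id, node_ids):
    """First node identifier equal to or following key_id on the circle."""
    ring = sorted(node_ids)
    return ring[bisect_left(ring, key_id) % len(ring)]

def assign(keys, node_ids):
    """Map every key to the node responsible for it."""
    return {k: successor(k, node_ids) for k in keys}

before = assign(KEYS, NODES)
after = assign(KEYS, NODES + [26])  # hypothetical node N26 joins
moved = {k: (before[k], after[k]) for k in KEYS if before[k] != after[k]}

print(before)  # {10: 14, 24: 32, 30: 32, 38: 38, 54: 56}
print(moved)   # {24: (32, 26)}
```

Only K24 changes hands, and only between the joining node and its successor N32, which is the minimal-movement property stated above.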
8 Simple Lookup
- Requires only that each node knows its current successor. This is all that is needed for correctness.
// ask node n to find the successor of id
n.find_successor(id)
  if (id ∈ (n, successor])
    return successor;
  else
    // forward the query around the circle
    return successor.find_successor(id);
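A runnable sketch of this successor-only lookup, using the example ring with m = 6. The Node class, build_ring, and the hop counter are assumptions added for illustration; the interval test treats (a, b] as a circular interval.

```python
SIZE = 2 ** 6  # identifier space for m = 6

def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    if a == b:                      # single-node ring: interval is the whole circle
        return True
    return 0 < (x - a) % SIZE <= (b - a) % SIZE

class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self

    def find_successor(self, ident, hops=0):
        """Return (responsible node, number of times the query was forwarded)."""
        if in_half_open(ident, self.id, self.successor.id):
            return self.successor, hops
        # forward the query around the circle
        return self.successor.find_successor(ident, hops + 1)

def build_ring(ids):
    """Link nodes so that each one points at the next id on the circle."""
    nodes = [Node(i) for i in sorted(ids)]
    for a, b in zip(nodes, nodes[1:] + nodes[:1]):
        a.successor = b
    return {n.id: n for n in nodes}

ring = build_ring([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])
owner, hops = ring[8].find_successor(54)
print(owner.id, hops)  # 56 7
```

The query started at N8 is forwarded seven times before N51 answers with N56, which shows why the number of hops grows linearly with the number of nodes.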
9 Efficient Lookup
- Keep more routing information per node, in the form of a finger table
- The finger table has up to m entries; the i-th entry at node n is the first node s that succeeds n by at least 2^(i-1) on the identifier circle
- finger[i] = successor((n + 2^(i-1)) mod 2^m), 1 ≤ i ≤ m
- Note that finger[1] is the successor of n
- The extra routing information is not needed for correctness, only for efficiency.
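The finger-table definition can be checked against the example ring (m = 6). This is a small sketch; finger_table and successor are hypothetical helper names, and the returned list is 0-indexed, so position 0 holds finger[1].

```python
from bisect import bisect_left

M = 6
NODES = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]  # example ring

def successor(ident, node_ids=NODES):
    """First node identifier equal to or following ident on the circle."""
    ring = sorted(node_ids)
    return ring[bisect_left(ring, ident) % len(ring)]

def finger_table(n, m=M, node_ids=NODES):
    """finger[i] = successor((n + 2^(i-1)) mod 2^m) for i = 1..m."""
    return [successor((n + 2 ** (i - 1)) % 2 ** m, node_ids)
            for i in range(1, m + 1)]

print(finger_table(8))   # [14, 14, 14, 21, 32, 42]
print(finger_table(42))  # [48, 48, 48, 51, 1, 14]
```

The first entry is always the node's immediate successor, and several low-index entries often repeat it, which is why far fewer than m distinct nodes are usually stored.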
10 Efficient Lookup
// search the local table for the highest predecessor of id
n.closest_preceding_node(id)
  for i = m downto 1
    if (finger[i] ∈ (n, id))
      return finger[i];
  return n;
// ask node n to find the successor of id
n.find_successor(id)
  if (id ∈ (n, successor])
    return successor;
  else
    n’ = closest_preceding_node(id);
    return n’.find_successor(id);
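The pseudocode above can be exercised on the example ring as follows. This is a sketch under stated assumptions: finger tables are built from global knowledge of the ring (a real node learns them through the protocol), the path bookkeeping is added for illustration, and the list self.finger is 0-indexed.

```python
from bisect import bisect_left

M = 6
SIZE = 2 ** M
NODE_IDS = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]

def in_open(x, a, b):
    """True if x lies in the circular interval (a, b)."""
    if a == b:
        return x != a
    return 0 < (x - a) % SIZE < (b - a) % SIZE

def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    if a == b:
        return True
    return 0 < (x - a) % SIZE <= (b - a) % SIZE

class Node:
    def __init__(self, ident):
        self.id = ident
        self.finger = []            # finger[0] here is finger[1] on the slide

    @property
    def successor(self):
        return self.finger[0]

    def closest_preceding_node(self, ident):
        for f in reversed(self.finger):          # i = m downto 1
            if in_open(f.id, self.id, ident):
                return f
        return self

    def find_successor(self, ident, path=()):
        """Return (id of responsible node, ids of nodes that handled the query)."""
        path = path + (self.id,)
        if in_half_open(ident, self.id, self.successor.id):
            return self.successor.id, list(path)
        return self.closest_preceding_node(ident).find_successor(ident, path)

def build_ring(ids):
    ring = sorted(ids)
    nodes = {i: Node(i) for i in ring}
    for n in nodes.values():
        for i in range(1, M + 1):
            start = (n.id + 2 ** (i - 1)) % SIZE
            n.finger.append(nodes[ring[bisect_left(ring, start) % len(ring)]])
    return nodes

nodes = build_ring(NODE_IDS)
print(nodes[8].find_successor(54))  # (56, [8, 42, 51])
```

Starting at N8, the query for identifier 54 visits only N42 and N51 before N56 is returned, compared with a walk through every intermediate node when only successor pointers are used.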
11 Efficient Lookup Properties
- Use of the finger table requires each node to store only a small amount of information
- It is claimed that only O(log N) entries need to be stored
- Each node knows more about nodes close to it on the identifier circle than about those farther away
- Intuitively, all entries in a node’s finger table are at power-of-two intervals around the identifier circle, so a node can always cover at least half the remaining distance between itself and the target identifier
- With high probability, the number of nodes that must be contacted in a lookup is O(log N)
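The halving intuition can be written out as a short derivation. The notation is introduced here for illustration and is not from the slides: d(a, b) is the clockwise distance from a to b, p is the node immediately preceding the target identifier, and f is the i-th finger of the current node n.

```latex
\begin{align*}
d(a,b) &= (b - a) \bmod 2^{m}
  && \text{(clockwise distance on the circle)}\\
2^{i-1} &\le d(n,p) < 2^{i}
  && \text{(choose $i$ for the current node $n \ne p$)}\\
d(n,f) &\ge 2^{i-1}
  && \text{($f = \mathrm{finger}[i]$ is the first node at or after $n + 2^{i-1}$, so it is no later than $p$)}\\
d(f,p) &= d(n,p) - d(n,f) \;\le\; d(n,p) - 2^{i-1} \;<\; \tfrac{1}{2}\, d(n,p)
\end{align*}
```

Forwarding to this finger, or to a higher finger that still precedes the target, therefore at least halves the remaining distance, so a lookup takes at most m hops. The O(log N) bound stated above additionally relies on node identifiers being spread randomly around the circle.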
12 Joins and Stabilization
- Correctness is maintained by correct successor pointers
- As long as existing successor pointers are intact, lookups can occur concurrently with joins
- A stabilization protocol is used to update successor pointers and finger tables
13 Joins and Stabilization
// create a new Chord ring
n.create()
  predecessor = nil;
  successor = n;
// join a Chord ring containing node n’
n.join(n’)
  successor = n’.find_successor(n);
// called periodically. verifies n’s immediate
// successor and tells the successor about n
n.stabilize()
  x = successor.predecessor;
  if (x ∈ (n, successor))
    successor = x;
  successor.notify(n);
// n’ thinks it might be our predecessor
n.notify(n’)
  if (predecessor is nil or n’ ∈ (predecessor, n))
    predecessor = n’;
// called periodically. refreshes finger table entries.
// next stores the index of the next finger to fix
n.fix_fingers()
  next = next + 1;
  if (next > m)
    next = log(successor - n) + 1;
  finger[next] = find_successor(n + 2^(next-1));
// called periodically. checks whether predecessor
// has failed.
n.check_predecessor()
  if (predecessor has failed)
    predecessor = nil;
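The create, join, stabilize, and notify steps can be simulated in a single process to watch the pointers converge. This is a minimal sketch with several assumptions: finger tables, fix_fingers, and check_predecessor are left out, "periodically" is modeled as a fixed number of rounds, a nil check on x is added in stabilize, and the run helper is hypothetical.

```python
SIZE = 2 ** 6  # identifier space for m = 6

def in_open(x, a, b):
    """True if x lies in the circular interval (a, b)."""
    if a == b:
        return x != a
    return 0 < (x - a) % SIZE < (b - a) % SIZE

def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    if a == b:
        return True
    return 0 < (x - a) % SIZE <= (b - a) % SIZE

class Node:
    def __init__(self, ident):      # plays the role of n.create()
        self.id = ident
        self.predecessor = None
        self.successor = self

    def find_successor(self, ident):
        if in_half_open(ident, self.id, self.successor.id):
            return self.successor
        return self.successor.find_successor(ident)

    def join(self, known):
        self.successor = known.find_successor(self.id)

    def stabilize(self):
        x = self.successor.predecessor
        if x is not None and in_open(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate):
        if self.predecessor is None or in_open(
                candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate

def run(ids, rounds=10):
    """Join nodes one at a time through the first node, stabilizing in between."""
    first = Node(ids[0])
    nodes = [first]
    for ident in ids[1:]:
        node = Node(ident)
        node.join(first)
        nodes.append(node)
        for _ in range(rounds):
            for peer in nodes:
                peer.stabilize()
    return {n.id: (n.predecessor.id, n.successor.id) for n in nodes}

# Each value is (predecessor id, successor id) after stabilization.
print(run([8, 32, 56, 21]))
```

Note that join only sets the new node's successor; the other pointers are repaired by later stabilize and notify calls, which is why lookups can keep running while a join is in progress.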
14 Impact of Joins on Lookup Efficiency
Three scenarios:
1. A node joins, and successor pointers and finger table entries have been updated: lookups operate in O(log N) steps.
2. A node joins and successor pointers have been updated, but finger table entries have not: lookups that use the old finger entries may initially undershoot, but will eventually find the correct node because the successor pointers are correct. Lookups may be slower.
3. A node joins, and successor pointers are incorrect or keys have not yet migrated to the new node: the result is a lookup failure. This case can arise from multiple joins.
15 Failures
- Failures break the condition that all successor pointers are correct
- If nodes 14, 21, and 32 fail, node 8 cannot look up key 30
- The solution is to maintain a successor list of the node’s first r successors
- Modify the stabilize code to reconcile the node’s successor list with that of its successor
- Modify closest_preceding_node() to also look in the successor list
- Modify the code to handle node failures with timeouts that try the next-best predecessor
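The failure example above can be replayed with a small sketch of successor-list failover. Assumptions: the node ids come from the example ring with m = 6, successor lists are computed from global knowledge rather than maintained by stabilization, failure detection is modeled as a set of failed ids instead of timeouts, and the values of r are chosen only for illustration.

```python
SIZE = 2 ** 6
NODES = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]  # example ring, sorted

def successor_list(n, r):
    """The first r successors of node n on the ring."""
    i = NODES.index(n)
    return [NODES[(i + j) % len(NODES)] for j in range(1, r + 1)]

def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    if a == b:
        return True
    return 0 < (x - a) % SIZE <= (b - a) % SIZE

def lookup(start, key, failed, r):
    """Walk live successors only; return None if a node loses all r successors."""
    n = start
    while True:
        live = next((s for s in successor_list(n, r) if s not in failed), None)
        if live is None:
            return None             # every known successor has failed
        if in_half_open(key, n, live):
            return live
        n = live

failed = {14, 21, 32}
print(lookup(8, 30, failed, r=4))  # 38
print(lookup(8, 30, failed, r=3))  # None
```

With r = 4, node 8 still knows N38 and the lookup for key 30 succeeds; with r = 3 all three of its known successors are among the failed nodes and the lookup cannot proceed.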
16 Simulations
- The Chord protocol can be implemented iteratively or recursively
- Simulations were performed in the iterative style
- Simplifying assumption: data movement by the application occurs without disrupting lookups
19 Future Work
- Security (availability) of the lookup protocol
- Resilience against network partitions
20 References
Presentation content and some diagrams and graphs were taken from:
Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M. Frans Kaashoek, Frank Dabek, Hari Balakrishnan. “Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications.” To appear in IEEE/ACM Transactions on Networking.