
1 Distributed Hash Tables

2 Overview
- Objective
  - A distributed lookup service
  - Data items are distributed among n parties
  - Anyone in the network can find an item efficiently
  - No central node or infrastructure
- Applications:
  - P2P file sharing without a central node (items are files)
  - Large-scale distributed or grid computation (items are computational tasks)

3 Chord
- Uses consistent hashing to map keys to nodes.
- Enables
  - Lookup
  - Node joining the network
  - Node leaving the network
- Each node maintains routing information on the network.

4 Consistent hashing
- A consistent hash function assigns each node and key an m-bit identifier.
- A node is a member of the network: a processing and communication unit.
- A key identifies a data item:
  - Example: movie name
- A node's identifier can be defined by, e.g., hashing the node's IP address.
- An identifier for a node or key can be produced by hashing:
  - ID(node) = hash(IP address)
  - ID(key) = hash(key)

5 Consistent hashing (cont.)
- Properties of consistent hashing:
  - Load balancing: given N nodes, with high probability O(1/N) of the keys are stored in each node.
  - When the N-th node joins, with high probability O(1/N) of the keys are moved.
- Practical problems with identifiers
  - IP is not always a good identifier: NAT, DHCP
  - Key is not always well defined (e.g. file name).

6 Chord ring
- Chord orders all m-bit identifiers (both nodes and keys) on a ring of 2^m elements.
- Order is defined modulo 2^m (2^m - 1 < 0, i.e. identifier 0 follows 2^m - 1 on the ring).
- Identifier space for keys: {0, …, 2^m - 1}
  - Not all keys need be in use concurrently
- Identifier space for nodes: {0, …, 2^m - 1}
  - Not all nodes are necessarily active concurrently
- Key k is assigned to node n if
  - id(n) = min { id(n') : id(n') ≥ id(k) } (≥ is interpreted modulo 2^m, i.e. n is the first node clockwise from k).
- n is called successor(k)

7 Successor Nodes
[Figure: identifier circle with m = 3 (identifiers 0 to 7), showing nodes 0, 1 and 3 and keys 1, 2 and 6.]
- successor(1) = 1
- successor(2) = 3
- successor(6) = 0
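
The successor rule can be checked against the values on this slide with a few lines of Python. This is only a sketch: the function name `successor` and the use of a sorted list of node identifiers are assumptions made for illustration, since a real Chord node never sees the full membership list.

```python
from bisect import bisect_left

def successor(key_id, node_ids, m):
    """Return the first node identifier clockwise from key_id on a ring of 2**m ids."""
    ring = sorted(node_ids)
    i = bisect_left(ring, key_id % (2 ** m))
    # Past the largest node identifier the ring wraps around to the smallest one.
    return ring[i] if i < len(ring) else ring[0]

nodes = [0, 1, 3]  # the three nodes of the m = 3 example
for key in (1, 2, 6):
    print(f"successor({key}) = {successor(key, nodes, 3)}")
```

For the example ring this reproduces successor(1) = 1, successor(2) = 3 and successor(6) = 0.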

8 More on hashing
- Desirable properties for the hash function:
  - Collision resistant on nodes (mandatory)
  - Collision resistant on keys (hopefully)
  - Load balancing
  - Good adversarial behavior.
- Chord suggestion: SHA-1
  - ID(node) = SHA-1(IP address)
  - ID(key) = SHA-1(key)
- Drawback
  - Full SHA-1 implies m = 160
  - Truncated SHA-1 may not be collision-resistant
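
A minimal sketch of deriving identifiers with SHA-1 follows, using Python's standard `hashlib`. The helper name `chord_id` is hypothetical, and keeping the top m bits of the digest is just one possible truncation, assumed here for illustration; the slides do not say how truncation should be done.

```python
import hashlib

def chord_id(value: str, m: int = 160) -> int:
    """Map a string (IP address or key) to an m-bit identifier via SHA-1."""
    if not 1 <= m <= 160:
        raise ValueError("m must be between 1 and 160 for SHA-1")
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    full = int.from_bytes(digest, "big")  # the full 160-bit value
    return full >> (160 - m)              # assumption: keep the top m bits

node_id = chord_id("192.0.2.17", m=16)      # ID(node) = SHA-1(IP address), truncated
key_id = chord_id("some movie name", m=16)  # ID(key) = SHA-1(key), truncated
print(node_id, key_id)
```

With a small m, distinct inputs can easily map to the same identifier, which is the collision concern raised on the slide.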

9 Key lookup
- Suppose each node stores successor(node).
- We could use the following pseudo-code:

    n.find_successor(id)
      if (id ∈ (n, successor])
        return successor;
      else
        // forward the query around the circle
        return successor.find_successor(id);

- Example of execution
- What is the time/message complexity?
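
To get a feel for the cost of successor-only lookup, the pseudo-code can be simulated while counting forwarded queries. This is a sketch under stated assumptions: the ring is a plain dictionary mapping each node identifier to its successor, and `in_half_open` and `linear_lookup` are names invented for this example.

```python
def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b  # the interval wraps past zero (or covers the whole ring)

def linear_lookup(start, key_id, succ):
    """Follow successor pointers from start; return (successor(key_id), hops)."""
    node, hops = start, 0
    while not in_half_open(key_id, node, succ[node]):
        node = succ[node]  # forward the query around the circle
        hops += 1
    return succ[node], hops

succ = {0: 1, 1: 3, 3: 0}  # nodes 0, 1, 3 on a ring with m = 3
print(linear_lookup(3, 2, succ))
print(linear_lookup(1, 6, succ))
```

In this sketch a query may have to walk almost the entire ring, so the number of messages grows linearly with the number of nodes in the worst case.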

10 Key lookup in O(m)
- Each node n holds additional routing information: a finger table
  - m entries (at most)
  - Each entry is called a "finger"
- Data in each entry
  - finger[i].node = successor((n + 2^(i-1)) mod 2^m), i = 1, …, m
  - finger[i] includes TCP/IP access information such as IP and port.
- Denote the i-th finger by node.finger(i)

11 Example – finger tables
[Figure: identifier circle with m = 3 and nodes 0, 1 and 3, each shown with its finger table and stored keys.]

Node 0 (keys: 6)
    start   int.    succ.
    1       [1,2)   1
    2       [2,4)   3
    4       [4,0)   0

Node 1 (keys: 1)
    start   int.    succ.
    2       [2,3)   3
    3       [3,5)   3
    5       [5,1)   0

Node 3 (keys: 2)
    start   int.    succ.
    4       [4,5)   0
    5       [5,7)   0
    7       [7,3)   0
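
The finger-table rule, finger[i] = successor((n + 2^(i-1)) mod 2^m), can be evaluated directly for this example. The sketch below assumes global knowledge of the node identifiers, which a real node does not have; `finger_table` and `successor` are illustrative names only.

```python
from bisect import bisect_left

def successor(ident, ring):
    """First node identifier clockwise from ident; ring is a sorted list."""
    i = bisect_left(ring, ident)
    return ring[i % len(ring)]  # wrap to the smallest id past the end

def finger_table(n, node_ids, m):
    """Return [(start, successor(start))] for i = 1..m at node n."""
    ring = sorted(node_ids)
    table = []
    for i in range(1, m + 1):
        start = (n + 2 ** (i - 1)) % (2 ** m)
        table.append((start, successor(start, ring)))
    return table

for n in (0, 1, 3):
    print(n, finger_table(n, [0, 1, 3], 3))
```

The (start, succ.) pairs printed for nodes 0, 1 and 3 agree with the example tables.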

12 Example – fast lookup

13 Fast lookup – pseudo code

    // ask node n to find the successor of id
    n.find_successor(id)
      if (id ∈ (n, successor])
        return successor;
      else
        n' = closest_preceding_node(id);
        return n'.find_successor(id);

    // search the local table for the highest predecessor of id
    n.closest_preceding_node(id)
      for i = m downto 1
        if (finger[i] ∈ (n, id))
          return finger[i];
      return n;
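
The pseudo-code above translates fairly directly into Python. The following is a sketch, not a network implementation: nodes are in-memory objects, remote calls are ordinary method calls, and `build_ring` (a hypothetical helper) fills the finger tables from global knowledge so that lookup can be tried in isolation.

```python
from bisect import bisect_left

def in_open(x, a, b):
    """True if x lies in the circular interval (a, b)."""
    return a < x < b if a < b else (x > a or x < b)

def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    return a < x <= b if a < b else (x > a or x <= b)

class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = None
        self.finger = []  # finger[0] corresponds to i = 1 on the slides

    def find_successor(self, ident, hops=0):
        if in_half_open(ident, self.id, self.successor.id):
            return self.successor, hops
        nxt = self.closest_preceding_node(ident)
        if nxt is self:  # defensive guard against stale fingers
            return self.successor, hops
        return nxt.find_successor(ident, hops + 1)

    def closest_preceding_node(self, ident):
        for f in reversed(self.finger):  # i = m downto 1
            if in_open(f.id, self.id, ident):
                return f
        return self

def build_ring(ids, m):
    """Hypothetical helper: create nodes and fill fingers using global knowledge."""
    ring = sorted(ids)
    nodes = {i: Node(i) for i in ring}
    def succ(x):
        return nodes[ring[bisect_left(ring, x % 2 ** m) % len(ring)]]
    for n in nodes.values():
        n.finger = [succ(n.id + 2 ** (i - 1)) for i in range(1, m + 1)]
        n.successor = n.finger[0]
    return nodes

nodes = build_ring([0, 1, 3], m=3)
owner, hops = nodes[3].find_successor(1)
print(owner.id, hops)
```

Starting at node 3 and looking up identifier 1, the query is forwarded once (to node 0) before node 1 is returned.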

14 Fast lookup – analysis
- Time complexity at each node: O(m)
- Theorem: the number of nodes in a search path is O(m) in the worst case.
- The number of messages is equal to the number of nodes in the search path.
- Theorem: if the N nodes are distributed uniformly among the identifiers, then the number of nodes in the search path is O(log N) with high probability.

15 Fast lookup – analysis (cont.)
- X_i: indicator that the i-th node is in the last interval of 2^m/N identifiers in the search path.
- X = Σ X_i
- μ = E[X] = 1
- Chernoff bound:
  - Pr[X > (1+δ)μ] < ( e^δ / (1+δ)^(1+δ) )^μ
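
To see how quickly the Chernoff bound shrinks as δ grows, its right-hand side can simply be evaluated. This is only a numerical illustration of the formula; the function name `chernoff_upper` is made up for this sketch, and μ = 1 is taken from the slide.

```python
import math

def chernoff_upper(delta, mu=1.0):
    """Right-hand side of the bound: (e^delta / (1 + delta)^(1 + delta))^mu."""
    return (math.exp(delta) / (1 + delta) ** (1 + delta)) ** mu

for delta in (1, 3, 7, 15):
    print(delta, chernoff_upper(delta))
```

Since μ = 1, the bound on Pr[X > 1 + δ] falls faster than exponentially in δ.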

16 Node Joins and Stabilizations
- Nodes join dynamically.
- The network maintains two invariants:
  - Each node's successor is up to date.
  - For every key k, successor(k) is responsible for k.
- When node n joins:
  - n.successor and n.finger[i] are updated.
  - Successor and finger tables of other nodes are updated.
  - The correct keys are transferred to n.
- Each node runs a "stabilization" protocol periodically in the background to update its successor pointer and finger table.

17 Node Joins and Stabilizations
- The "stabilization" protocol contains 6 functions:
  - create()
  - join()
  - stabilize()
  - notify()
  - fix_fingers()
  - check_predecessor()
- Each node has both successor and predecessor pointers.

18 Node Joins – join()
- When node n first starts, it calls n.join(n'), where n' is any known Chord node.
- The join() function asks n' to find the immediate successor of n.
- join() does not make the rest of the network aware of n.

19 Node Joins – join()

    // create a new Chord ring.
    n.create()
      predecessor = nil;
      successor = n;

    // join a Chord ring containing node n'.
    n.join(n')
      predecessor = nil;
      successor = n'.find_successor(n);

20 Node Joins – stabilize()
- Each time node n runs stabilize(), it asks its successor for the successor's predecessor p, and decides whether p should be n's successor instead.
- stabilize() notifies node n's successor of n's existence, giving the successor the chance to change its predecessor to n.
- The successor does this only if it knows of no closer predecessor than n.

21 Node Joins – stabilize()

    // called periodically. verifies n's immediate
    // successor, and tells the successor about n.
    n.stabilize()
      x = successor.predecessor;
      if (x ∈ (n, successor))
        successor = x;
      successor.notify(n);

    // n' thinks it might be our predecessor.
    n.notify(n')
      if (predecessor is nil or n' ∈ (predecessor, n))
        predecessor = n';

22 Node Joins – Join and Stabilization
[Figure: node n joins between n_p and n_s. Before: succ(n_p) = n_s, pred(n_s) = n_p, and n's predecessor is nil. After: pred(n_s) = n, succ(n_p) = n.]
- n joins
  - predecessor = nil
  - n acquires n_s as its successor via some n'
- n runs stabilize
  - n notifies n_s that n is its new predecessor
  - n_s acquires n as its predecessor
- n_p runs stabilize
  - n_p asks n_s for its predecessor (now n)
  - n_p acquires n as its successor
  - n_p notifies n
  - n acquires n_p as its predecessor
- All predecessor and successor pointers are now correct.
- Fingers still need to be fixed, but old fingers will still work.
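
The join-and-stabilize walkthrough can be replayed with a small in-memory model of create(), join(), stabilize() and notify(). Several things here are assumptions made for the sketch: the identifiers 10, 20 and 30 stand in for n_p, n and n_s, remote calls are plain method calls, and find_successor uses a simple successor walk rather than finger tables.

```python
def in_open(x, a, b):
    """True if x lies in the circular interval (a, b)."""
    return a < x < b if a < b else (x > a or x < b)

def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    return a < x <= b if a < b else (x > a or x <= b)

class Node:
    def __init__(self, ident):
        self.id = ident
        self.create()

    def create(self):
        self.predecessor = None
        self.successor = self

    def find_successor(self, ident):
        node = self  # successor walk, kept simple for this sketch
        while not in_half_open(ident, node.id, node.successor.id):
            node = node.successor
        return node.successor

    def join(self, known):
        self.predecessor = None
        self.successor = known.find_successor(self.id)

    def stabilize(self):
        x = self.successor.predecessor
        if x is not None and in_open(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate):
        if self.predecessor is None or in_open(candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate

# Build the initial two-node ring: n_p = 10, n_s = 30.
n_p, n_s = Node(10), Node(30)
n_s.join(n_p)
n_s.stabilize()
n_p.stabilize()

# Node n = 20 joins via n_p, then n and n_p each run stabilize once.
n = Node(20)
n.join(n_p)
n.stabilize()    # n_s learns about n and adopts it as predecessor
n_p.stabilize()  # n_p adopts n as successor and notifies it
print(n_p.successor.id, n.successor.id, n_s.predecessor.id, n.predecessor.id)
```

After the two stabilize() calls the pointers match the end state described on the slide: succ(n_p) = n, pred(n_s) = n, and n has n_p as predecessor and n_s as successor.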

23 Node Joins – fix_fingers()
- Each node periodically calls fix_fingers() to make sure its finger table entries are correct.
- New nodes initialize their finger tables using fix_fingers().
- Existing nodes incorporate new nodes into their finger tables using fix_fingers().
- Each node maintains a pointer next into its finger table.

24 Node Joins – fix_fingers()

    // called periodically. refreshes finger table entries.
    n.fix_fingers()
      next = next + 1;
      if (next > m)
        next = 1;
      finger[next].node = find_successor(n + 2^(next-1));

    // checks whether predecessor has failed.
    n.check_predecessor()
      if (predecessor has failed)
        predecessor = nil;

- What is the complexity?
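
A sketch of the round-robin refresh performed by fix_fingers() is given below. To keep it self-contained, find_successor is replaced by a stand-in that consults a sorted list of live identifiers (an assumption for illustration; a real node would route the query through the ring), and the liveness check is passed in as a hypothetical `is_alive` callback.

```python
from bisect import bisect_left

class Node:
    def __init__(self, ident, m, live_ids):
        self.id = ident
        self.m = m
        self.live_ids = sorted(live_ids)  # stand-in for the network
        self.finger = [None] * (m + 1)    # 1-based; index 0 is unused
        self.next = 0
        self.predecessor = None

    def find_successor(self, ident):
        # Stand-in lookup using global knowledge instead of routing.
        i = bisect_left(self.live_ids, ident % 2 ** self.m)
        return self.live_ids[i % len(self.live_ids)]

    def fix_fingers(self):
        """Refresh one finger per call, cycling through i = 1..m."""
        self.next += 1
        if self.next > self.m:
            self.next = 1
        self.finger[self.next] = self.find_successor(self.id + 2 ** (self.next - 1))

    def check_predecessor(self, is_alive):
        """Clear the predecessor pointer if the predecessor has failed."""
        if self.predecessor is not None and not is_alive(self.predecessor):
            self.predecessor = None

node = Node(0, m=3, live_ids=[0, 1, 3])
for _ in range(3):
    node.fix_fingers()
print(node.finger[1:])

node.live_ids = [0, 1, 2, 3]  # node 2 joins; the table catches up over later calls
for _ in range(3):
    node.fix_fingers()
print(node.finger[1:])
```

Each call performs a single find_successor, so by the earlier analysis it should cost O(log N) messages with high probability when nodes are uniformly distributed, and a full refresh of the table takes m calls.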

25 Node Failures
- The key step in failure recovery is maintaining correct successor pointers.
- To help achieve this, each node maintains a successor list of its r nearest successors on the ring.
- If node n notices that its successor has failed, it replaces it with the first live entry in the list.
- Successor lists are stabilized as follows:
  - Node n reconciles its list with its successor s by copying s's successor list, removing its last entry, and prepending s to it.
  - If node n notices that its successor has failed, it replaces it with the first live entry in its successor list and reconciles its successor list with its new successor.
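
The successor-list bookkeeping described above can be sketched with two small functions. The names `reconcile` and `first_live` are invented for this example, node identifiers are plain integers, and the `is_alive` callback is a placeholder for whatever failure detection a deployment would use.

```python
def reconcile(s, successor_list_of_s, r):
    """Build n's new list: copy s's list, drop its last entry, prepend s."""
    new_list = [s] + list(successor_list_of_s[:-1])
    return new_list[:r]  # never keep more than r entries

def first_live(successor_list, is_alive):
    """Return the first live entry of the list, or None if all have failed."""
    for candidate in successor_list:
        if is_alive(candidate):
            return candidate
    return None

# Node n has successor 3, whose own successor list is [5, 7, 9] (r = 3).
my_list = reconcile(3, [5, 7, 9], r=3)
print(my_list)

# If node 3 fails, n falls back to the next live entry.
failed = {3}
print(first_live(my_list, lambda node: node not in failed))
```

After picking the replacement successor, n would call `reconcile` again with that node's list, as the last bullet describes.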

