Distributed Hash-based Lookup for Peer-to-Peer Systems Mohammed Junaid Azad 09305050 Gopal Krishnan 09305915 Mtech1,CSE.


1 Distributed Hash-based Lookup for Peer-to-Peer Systems Mohammed Junaid Azad 09305050 Gopal Krishnan 09305915 Mtech1,CSE

2 Agenda Peer-to-Peer Systems Initial Approaches to Peer-to-Peer Systems Their Limitations Distributed Hash Tables – CAN (Content-Addressable Network) – Chord

3 Peer-to-Peer Systems Distributed and decentralized architecture No centralized server (unlike the client-server architecture) Any peer can act as a server

4 Napster P2P file sharing system A central server stores the index of all the files available on the network To retrieve a file, the central server is contacted to obtain the location of the desired file Not a completely decentralized system The central directory is not scalable Single point of failure

5 Gnutella P2P file sharing system No central server stores an index of the files available on the network The file-location process is decentralized as well Requests for files are flooded over the network No single point of failure Flooding on every request is not scalable

6 File Systems for P2P systems The file system would store files and their metadata across nodes in the P2P network The nodes containing blocks of files could be located using hash based lookup The blocks would then be fetched from those nodes

7 Scalable File Indexing Mechanism In any P2P system, the file transfer process is inherently scalable However, the indexing scheme that maps file names to locations is crucial for scalability Solution: Distributed Hash Table

8 Distributed Hash Tables Traditional name and location services provide a direct mapping between keys and values What are examples of values? A value can be an address, a document, or an arbitrary data item Distributed hash tables such as CAN and Chord implement a distributed service for storing and retrieving key/value pairs

9 DNS vs. Chord/CAN DNS provides a host-name to IP-address mapping, relies on a set of special root servers, has names that reflect administrative boundaries, and is specialized to finding named hosts or services Chord can provide the same service (name = key, value = IP), requires no special servers, imposes no naming structure, and can also be used to find data objects that are not tied to certain machines

10 Example Application using Chord: Cooperative Mirroring The highest layer provides a file-like interface to the user, including user-friendly naming and authentication This file system maps operations to lower-level block operations Block storage uses Chord to identify the node responsible for storing a block and then talks to the block storage server on that node (Figure: client stack File System / Block Store / Chord communicating with server stacks Block Store / Chord)

11 CAN

12 What is CAN ? CAN is a distributed infrastructure that provides hash-table-like functionality CAN is composed of many individual nodes Each CAN node stores a chunk (zone) of the entire hash table A request for a particular key is routed through intermediate CAN nodes to the node whose zone contains that key The design can be implemented at the application level (no changes to the kernel required)

13 Co-ordinate space in CAN

14 Design Of CAN Involves a virtual d-dimensional Cartesian co-ordinate space – The co-ordinate space is completely logical – Lookup keys are hashed into this space The co-ordinate space is partitioned into zones among all nodes in the system – Every node in the system owns a distinct zone The distribution of zones across nodes forms an overlay network

15 Design of CAN (..continued) To store (key, value) pairs, keys are mapped deterministically onto a point P in the co-ordinate space using a hash function The (key, value) pair is then stored at the node that owns the zone containing P To retrieve the entry corresponding to key K, the same hash function is applied to map K to the point P The retrieval request is routed from the requesting node to the node owning the zone containing P
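As a rough illustration of this deterministic mapping, here is a minimal Python sketch; the dimension count, the unit co-ordinate range, and the use of SHA-1 digest slices are illustrative assumptions (the CAN design only requires a uniform hash into the space):

```python
import hashlib

D = 2               # dimensions of the co-ordinate space (assumption)
SIDE = 1.0          # each co-ordinate lies in [0, SIDE)

def key_to_point(key: str, d: int = D) -> tuple:
    """Deterministically hash a key to a point P in the d-dimensional
    co-ordinate space: one co-ordinate per dimension, derived from
    independent 4-byte slices of a SHA-1 digest."""
    digest = hashlib.sha1(key.encode()).digest()
    coords = []
    for i in range(d):
        chunk = digest[4 * i: 4 * i + 4]          # 4 bytes per dimension
        coords.append(int.from_bytes(chunk, "big") / 2**32 * SIDE)
    return tuple(coords)

p = key_to_point("song.mp3")   # same key always maps to the same point
```

The node whose zone contains `p` is then responsible for storing the pair.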

16 Routing in CAN Every CAN node holds the IP address and virtual co-ordinates of each of its neighbours Every message to be routed holds the destination co-ordinates Using its neighbours' co-ordinate sets, a node routes a message towards the neighbour with co-ordinates closest to the destination co-ordinates Progress: how much closer the message gets to the destination after being routed to one of the neighbours
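The greedy forwarding rule can be sketched as follows; the neighbour map keyed by IP address and the use of plain Euclidean distance (ignoring wrap-around at the space boundary) are simplifying assumptions:

```python
def dist(a, b):
    # Euclidean distance in the co-ordinate space (wrap-around at the
    # boundary is ignored to keep the sketch short)
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def next_hop(my_coords, neighbours, dest):
    """Greedy CAN forwarding: pick the neighbour whose co-ordinates are
    closest to the destination point. `neighbours` maps IP -> coords."""
    best_ip = min(neighbours, key=lambda ip: dist(neighbours[ip], dest))
    # forward only if it makes progress towards the destination
    if dist(neighbours[best_ip], dest) < dist(my_coords, dest):
        return best_ip
    return None  # no neighbour is closer: we own the zone containing dest
```

A node repeats this per message until the message arrives at the zone owner.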

17 Routing in CAN (continued…) For a d-dimensional space partitioned into n equal zones, routing path length = O(d·n^(1/d)) hops – As the number of nodes increases, routing path length grows as O(n^(1/d)) Every node has 2d neighbours – As the number of nodes increases, per-node state does not change

18 Before a node joins CAN

19 After a Node Joins

20 Allocation of a new node to a zone 1. First the new node must find a node already in CAN (using bootstrap nodes) 2. The new node randomly chooses a point P in the co-ordinate space 3. It sends a JOIN request to point P via any existing CAN node 4. The request is forwarded using the CAN routing mechanism to the node D owning the zone containing P 5. D then splits its zone in half and assigns one half to the new node 6. The new neighbour information is determined for both nodes
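Step 5 above, splitting the owner's zone in half, can be sketched like this; representing a zone as a list of per-dimension intervals and splitting along the longest dimension are simplifications of this sketch (real CAN cycles through the dimensions in a fixed order):

```python
def split_zone(zone):
    """Split a rectangular zone in half, returning (kept_half, new_half).
    A zone is a list of (lo, hi) intervals, one per dimension; here we
    split along the dimension with the largest extent."""
    dim = max(range(len(zone)), key=lambda i: zone[i][1] - zone[i][0])
    lo, hi = zone[dim]
    mid = (lo + hi) / 2
    kept, new = list(zone), list(zone)
    kept[dim] = (lo, mid)   # D keeps the lower half
    new[dim] = (mid, hi)    # the joining node receives the upper half
    return kept, new
```

The (key, value) pairs whose points fall in the new half are handed over along with the zone.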

21 Failure of node Even if one of the neighbours fails, messages can be routed through other neighbours in that direction If a node leaves CAN, the zone it occupies is taken over by the remaining nodes – If a node leaves voluntarily, it can hand over its database to some other node – When a node simply becomes unreachable, the database of the failed node is lost To recover lost data, CAN depends on the original sources to resubmit it

22 CHORD
23 Features CHORD is a distributed hash table implementation Addresses a fundamental problem in P2P  Efficient location of the node that stores desired data item  One operation: Given a key, maps it onto a node  Data location by associating a key with each data item Adapts Efficiently  Dynamic with frequent node arrivals and departures  Automatically adjusts internal tables to ensure availability Uses Consistent Hashing  Load balancing in assigning keys to nodes  Little movement of keys when nodes join and leave

24 Features (continued) Efficient Routing  Distributed routing table  Maintains information about only O(logN) nodes  Resolves lookups via O(logN) messages Scalable  Communication cost and state maintained at each node scales logarithmically with number of nodes Flexible Naming  Flat key-space gives applications flexibility to map their own names to Chord keys Decentralized

25 Some Terminology Key  Hash key or its image under the hash function, as per context  m-bit identifier, using SHA-1 as a base hash function Node  Actual node or its identifier under the hash function  m chosen large enough that the probability of a hash collision is low Chord Ring  The identifier circle ordering the 2^m node identifiers Successor Node  First node whose identifier is equal to or follows key k in the identifier space Virtual Node  Introduced to limit the bound on keys per node to K/N  Each real node runs Ω(log N) virtual nodes, each with its own identifier

26 Chord Ring

27 Consistent Hashing A consistent hash function is one that changes minimally with changes in the range of keys, so a total remapping is not required Desirable properties  High probability that the hash function balances load  Minimum disruption: only O(1/N) of the keys move when a node joins or leaves  Every node need not know about every other node, only a small amount of "routing" information m-bit identifier for each node and key Key k is assigned to its successor node
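The successor-node assignment can be sketched in a few lines; the identifier width m and the concrete node identifiers used below are illustrative choices, not values from the deck:

```python
import bisect
import hashlib

M = 6  # identifier bits (small, for illustration)

def ident(name: str) -> int:
    # m-bit identifier derived from SHA-1, per the terminology slide
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big") % 2**M

def successor(node_ids, k):
    """First node whose identifier is equal to or follows key k on the ring."""
    ids = sorted(node_ids)
    i = bisect.bisect_left(ids, k)
    return ids[i % len(ids)]  # wrap around past the highest identifier
```

When a node joins, only the keys between its predecessor and itself move to it, which is how the O(1/N) disruption bound arises.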

28 Simple Key Location

29 Example

30 Scalable Key Location A very small amount of routing information suffices to implement consistent hashing in a distributed environment Each node need only be aware of its successor node on the circle Queries for a given identifier can be passed around the circle via these successor pointers This resolution scheme is correct, BUT inefficient: it may require traversing all N nodes!
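The successor-pointer-only scheme can be sketched as follows, using the small example ring from the finger-table slides (nodes 0, 1, 3 with m = 3); the hop counter makes the O(N) worst case visible:

```python
def between(x, a, b):
    """Is x in the half-open ring interval (a, b]?"""
    if a < b:
        return a < x <= b
    return x > a or x <= b  # interval wraps past zero

def linear_lookup(start, succ, k):
    """Follow successor pointers until k falls in (n, successor(n)];
    correct, but may traverse all N nodes. `succ` maps id -> successor id."""
    n, hops = start, 0
    while not between(k, n, succ[n]):
        n = succ[n]
        hops += 1
    return succ[n], hops
```

For example, resolving key 6 from node 1 walks 1 → 3 and answers node 0, the key's successor.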

31 Acceleration of Lookups Lookups are accelerated by maintaining additional routing information Each node maintains a routing table with (at most) m entries (where 2^m is the size of the identifier space) called the finger table The i-th entry in the table at node n contains the identity of the first node, s, that succeeds n by at least 2^(i-1) on the identifier circle (clarification on next slide) s = successor((n + 2^(i-1)) mod 2^m) s is called the i-th finger of node n, denoted by n.finger(i).node
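Using the 8-identifier example ring of the next slide (m = 3, nodes 0, 1 and 3), the rule s = successor((n + 2^(i-1)) mod 2^m) can be computed directly:

```python
M = 3                      # identifier bits, so the ring has 2**M = 8 ids
NODES = sorted([0, 1, 3])  # example ring (illustrative)

def successor(k):
    """First node id equal to or following k on the ring."""
    for n in NODES:
        if n >= k:
            return n
    return NODES[0]  # wrap around

def finger_table(n):
    """finger[i] = successor((n + 2**(i-1)) mod 2**M), for i = 1..M."""
    return [successor((n + 2**(i - 1)) % 2**M) for i in range(1, M + 1)]
```

Node 1's fingers start at 2, 3 and 5 and resolve to nodes 3, 3 and 0, matching the finger-table figure.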

32 Finger Tables (1) Example ring with m = 3 (identifiers 0–7), nodes 0, 1 and 3, and keys 1, 2 and 6: Node 0 — start: 1, 2, 4; interval: [1,2), [2,4), [4,0); successor: 1, 3, 0; stores key 6 Node 1 — start: 2, 3, 5; interval: [2,3), [3,5), [5,1); successor: 3, 3, 0; stores key 1 Node 3 — start: 4, 5, 7; interval: [4,5), [5,7), [7,3); successor: 0, 0, 0; stores key 2

33 Finger Tables (2) - characteristics Each node stores information about only a small number of other nodes, and knows more about nodes closely following it than about nodes farther away A node's finger table generally does not contain enough information to determine the successor of an arbitrary key k Repeated queries to nodes that immediately precede the given key will eventually lead to the key's successor
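The resulting O(log N) lookup can be sketched as an iterative form of the paper's find_successor / closest_preceding_finger pair; this sketch assumes correct successor pointers and finger tables, passed in as plain dicts:

```python
def between(x, a, b):
    """x in the half-open ring interval (a, b]?"""
    return a < x <= b if a < b else x > a or x <= b

def in_open(x, a, b):
    """x in the open ring interval (a, b)?"""
    return a < x < b if a < b else x > a or x < b

def find_successor(n, k, succ, fingers):
    """Resolve key k starting at node n by jumping through finger tables;
    each hop roughly halves the remaining ring distance, giving O(log N)
    hops. `fingers[n]` lists n's finger nodes, closest first."""
    while not between(k, n, succ[n]):
        # closest preceding finger: highest finger strictly inside (n, k)
        n = next((f for f in reversed(fingers[n]) if in_open(f, n, k)),
                 succ[n])
    return succ[n]
```

On the example ring (nodes 0, 1, 3), a query for key 1 starting at node 3 first jumps to node 0 via its finger table, then answers node 1.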

34 Pseudo code

35 Example

36 Node Joins – with Finger Tables Example: node 6 joins the ring of nodes 0, 1 and 3 (m = 3) Node 6 builds its own finger table (start: 7, 0, 2; interval: [7,0), [0,2), [2,6); successor: 0, 0, 3) Existing entries that should now point to 6 are updated: node 0's third finger and node 1's third finger become 6, and node 3's first and second fingers become 6 Key 6 moves from node 0 to node 6

37 Node Departures – with Finger Tables Example: node 1 leaves the ring of nodes 0, 1, 3 and 6 (m = 3) Key 1 moves to node 1's successor, node 3 Node 0's first finger, which pointed to node 1, is updated to node 3; the remaining finger entries are unaffected

38 Source of Inconsistencies: Concurrent Operations and Failures A basic "stabilization" protocol is used to keep nodes' successor pointers up to date, which is sufficient to guarantee correctness of lookups Those successor pointers can then be used to verify the finger table entries Every node runs stabilize periodically to find newly joined nodes
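A minimal sketch of the periodic stabilize/notify exchange; this is a simplified, single-process rendition of the paper's pseudocode, with local node objects standing in for remote nodes:

```python
def in_open(x, a, b):
    """x in the open ring interval (a, b)? (a == b means the whole ring)"""
    if a == b:
        return x != a
    return a < x < b if a < b else x > a or x < b

class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self      # a lone node is its own successor
        self.predecessor = None    # nil

    def stabilize(self):
        """Run periodically: learn of any node that joined between us and
        our successor, then tell the successor about ourselves."""
        x = self.successor.predecessor
        if x is not None and in_open(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, n):
        """n thinks it might be our predecessor."""
        if self.predecessor is None or in_open(n.id, self.predecessor.id,
                                               self.id):
            self.predecessor = n
```

Replaying the join scenario of the stabilization slide (n_p = 0, n_s = 3, new node n = 1) with these two calls restores all successor and predecessor pointers.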

39 Pseudo code

40 Pseudo Code(Continue..)

41 Stabilization after Join Initially succ(n_p) = n_s and pred(n_s) = n_p  n joins: predecessor = nil; n acquires n_s as its successor via some n'; n notifies n_s that it is the new predecessor; n_s acquires n as its predecessor  n_p runs stabilize: n_p asks n_s for its predecessor (now n); n_p acquires n as its successor; n_p notifies n; n acquires n_p as its predecessor  All predecessor and successor pointers are now correct: pred(n_s) = n, succ(n_p) = n  Fingers still need to be fixed, but old fingers will still work

42 Failure Recovery The key step in failure recovery is maintaining correct successor pointers To help achieve this, each node maintains a successor list of its r nearest successors on the ring If node n notices that its successor has failed, it replaces it with the first live entry in the list stabilize will correct finger table entries and successor-list entries pointing to the failed node Performance is sensitive to the frequency of node joins and leaves versus the frequency at which the stabilization protocol is invoked
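The first-live-entry replacement rule is small but worth stating precisely; `is_alive` here stands in for whatever liveness probe (e.g. a timed-out RPC) an implementation would actually use:

```python
def live_successor(successor_list, is_alive):
    """Replace a failed successor with the first live entry in the
    r-entry successor list; None means all r successors have failed."""
    for s in successor_list:
        if is_alive(s):
            return s
    return None
```

With r chosen as O(log N), the probability that all r successors fail simultaneously is vanishingly small.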

43 Impact of Node Joins on Lookups: Correctness For a lookup issued before stabilization has finished: 1. Case 1: All finger table entries involved in the lookup are reasonably current; the lookup finds the correct successor in O(log N) steps 2. Case 2: Successor pointers are correct, but finger pointers are inaccurate; this yields correct lookups, but they may be slower 3. Case 3: Successor pointers are incorrect, or keys have not yet migrated to newly joined nodes; the lookup may fail, with the option of retrying after a quick pause, during which stabilization fixes the successor pointers

44 Impact of Node Joins on Lookups: Performance After stabilization, no effect other than increasing the value of N in O(logN) Before stabilization is complete Possibly incorrect finger table entries Does not significantly affect lookup speed, since distance halving property depends only on ID-space distance If new nodes’ IDs are between the target predecessor and the target, then lookup speed is influenced Still takes O(logN) time for N new nodes

45 Handling Failures Problem: what if a node does not know who its new successor is after the failure of its old successor The answer may lie in a gap in the finger table; Chord would be stuck! Maintain a successor list of size r, containing the node's first r successors If the immediate successor does not respond, substitute the next entry in the successor list A modified version of the stabilize protocol maintains the successor list closest_preceding_node is modified to search not only the finger table but also the successor list for the most immediate predecessor If find_successor fails, retry after some time Voluntary Node Departures Transfer keys to the successor before departure Notify predecessor p and successor s before leaving

46 Theorems Theorem IV.3: Inconsistencies in successor pointers are transient; if any sequence of join operations is executed interleaved with stabilizations, then at some time after the last join the successor pointers will form a cycle over all the nodes in the network Theorem IV.4: Lookups take O(log N) time with high probability even if N nodes join a stable N-node network, once successor pointers are correct, even if finger pointers are not yet updated Theorem IV.6: If the network is initially stable, even if every node fails with probability ½, the expected time to execute find_successor is O(log N)

47 Simulation Implements the iterative style (the other is the recursive style): the node resolving a lookup initiates all communication, unlike the recursive style, where intermediate nodes forward the request Optimizations During stabilization, a node updates its immediate successor and 1 other entry in its successor list or finger table Each of the k unique entries is thus refreshed once every k stabilization rounds Size of the successor list is 1 Immediate notification of a predecessor change to the old predecessor, without waiting for the next stabilization round

48 Parameters Mean packet delay is 50 ms Round trip time is 500 ms Number of nodes is 10^4 Number of keys varies from 10^4 to 10^6

49 Load Balance Tests the ability of consistent hashing to allocate keys to nodes evenly The number of keys per node exhibits large variations, which increase linearly with the number of keys Associating keys with virtual nodes makes the number of keys per node more uniform and significantly improves load balance The asymptotic value of the query path length is not affected much The total identifier space covered remains the same on average The worst-case number of queries does not change Not much increase in the routing state maintained The asymptotic number of control messages is not affected
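The load-balancing effect of virtual nodes can be reproduced with a small simulation; the node count, key count, and the choice of 8 virtual nodes per real node are arbitrary illustrative values, not the paper's parameters:

```python
import bisect
import hashlib
import statistics

SPACE = 2**32

def h(s):
    """Ring point for a name (SHA-1, truncated to the identifier space)."""
    return int.from_bytes(hashlib.sha1(s.encode()).digest(), "big") % SPACE

def load(points, n_keys=5000):
    """Assign n_keys keys to their successor points; return keys per
    real node. `points` maps ring_point -> owning real node."""
    ring = sorted(points)
    count = {owner: 0 for owner in points.values()}
    for i in range(n_keys):
        k = h(f"key{i}")
        p = ring[bisect.bisect_left(ring, k) % len(ring)]  # successor point
        count[points[p]] += 1
    return count

real = [f"node{i}" for i in range(20)]
plain = {h(n): n for n in real}                            # one point per node
virt = {h(f"{n}#{v}"): n for n in real for v in range(8)}  # 8 virtual nodes each
# keys per node is markedly more uniform with virtual nodes:
spread_plain = statistics.pstdev(load(plain).values())
spread_virt = statistics.pstdev(load(virt).values())
```

Each real node now owns several small arcs of the ring instead of one large one, which is exactly the mechanism the slide describes.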

50 In the Absence of Virtual Nodes

51 In Presence of Virtual Nodes

52 Path Length The number of nodes that must be visited to resolve a query, measured as the query path length As per Theorem IV.2, the number of nodes that must be contacted to find a successor in an N-node network is O(log N) Observed results: the mean query path length increases logarithmically with the number of nodes, and the average matches the expected average query path length

53 Path Length Simulator Parameters A network with N = 2^k nodes Number of keys = 100 × 2^k k varied between 3 and 14, and path length is measured

54 Graph

55 Future Work Resilience against network partitions Detect and heal partitions For every node, keep a set of initial nodes Maintain a long-term memory of a random set of nodes, likely to include nodes from the other partition Handle threats to availability of data Malicious participants could present an incorrect view of the data Periodic global consistency checks by each node Better efficiency O(log N) messages per lookup may be too many for some apps Increase the number of fingers

56 References I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, H. Balakrishnan. Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications. In Proc. ACM SIGCOMM 2001; expanded version in IEEE/ACM Trans. Networking, 11(1), February 2003. S. Ratnasamy, P. Francis, M. Handley, R. Karp, S. Shenker. A Scalable Content-Addressable Network. In Proc. ACM SIGCOMM 2001. R. Huebsch, J. M. Hellerstein, N. Lanham, B. T. Loo, S. Shenker, I. Stoica. Querying the Internet with PIER. In Proc. VLDB 2003.

57 Thank You !

58 Any Questions?
