Peer to Peer and Distributed Hash Tables

1 Peer to Peer and Distributed Hash Tables
CS 271

2 Distributed Hash Tables
Challenge: to design and implement a robust and scalable distributed system composed of inexpensive, individually unreliable computers in unrelated administrative domains.
(Partial thanks: Idit Keidar)

3 Searching for distributed data
Goal: make billions of objects available to millions of concurrent users (e.g., music files).
Need a distributed data structure to keep track of objects on different sites: map each object to its locations.
Basic operations: Insert(key), Lookup(key)

4 Searching
[Figure: nodes N1 to N6 connected through the Internet. A publisher holds Key=“title”, Value=MP3 data…; a client issues Lookup(“title”) and has to find the node that stores it.]
1000s of nodes. Set of nodes may change…

5 Simple Solution: First There Was Napster
Centralized server/database for lookup.
Only file-sharing is peer-to-peer; lookup is not.
Launched in 1999, peaked at 1.5 million simultaneous users, and shut down in July 2001.
Pros: low overhead, no duplicates, the server knows all nodes that have a certain file, 100% success.
Cons: single point of failure; load on the central DB; requires heavy infrastructure; if the central DB is partitioned, the success rate is no longer 100%.

6 Napster: Publish
[Figure: a peer announces “I have X, Y, and Z!” and publishes to the central server with insert(X, ...).]

7 Napster: Search
[Figure: a client sends the query search(A) (“Where is file A?”) to the central server, receives a reply, and then fetches the file from a peer.]
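
The publish and search steps on the two Napster slides can be sketched as a single in-memory directory. This is only an illustration under assumptions: the class name `CentralIndex`, its methods, and the peer names are hypothetical and do not describe Napster's actual protocol.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical stand-in for a Napster-style central lookup server."""

    def __init__(self):
        # file name -> set of peers that announced the file
        self._holders = defaultdict(set)

    def publish(self, peer, files):
        """A peer announces the files it is willing to serve."""
        for name in files:
            self._holders[name].add(peer)

    def search(self, name):
        """Return a copy of the set of peers that announced `name`."""
        return set(self._holders.get(name, set()))


index = CentralIndex()
index.publish("peer-1", ["X", "Y", "Z"])   # "I have X, Y, and Z!"
index.publish("peer-2", ["A"])
print(index.search("A"))   # the client would then fetch A directly from a listed peer
```

Only the lookup goes through the server; the file transfer itself stays between peers, which is why the directory is both the cheap part and the single point of failure.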

8 Overlay Networks
A virtual structure imposed over the physical network (e.g., the Internet).
A graph, with hosts as nodes, and some edges.
A node does not need to know all other nodes.
[Figure: keys and node IDs are each mapped by a hash function into the overlay network.]

9 Unstructured Approach: Gnutella
Build a decentralized unstructured overlay.
Each node has several neighbors and holds several keys in its local database.
When asked to find a key X:
Check the local database to see whether X is known.
If yes, return; if not, ask your neighbors.
Use a limiting threshold for propagation (flooding). In practice: limit the TTL of the flood.
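
A TTL-limited flood can be sketched as a breadth-first search that stops expanding once the hop budget runs out. This is a centralized simulation under assumptions: the function name `flood_search`, the dict-based overlay, and the five-node chain are hypothetical, and a real deployment would deduplicate queries per node rather than with one shared `seen` set.

```python
from collections import deque


def flood_search(neighbors, stores, start, key, ttl):
    """Flood a query from `start`; return the nodes holding `key`
    that are reachable within `ttl` hops."""
    found = set()
    seen = {start}
    frontier = deque([(start, ttl)])
    while frontier:
        node, remaining = frontier.popleft()
        if key in stores.get(node, ()):
            found.add(node)          # key known locally: answer, do not forward
            continue
        if remaining == 0:
            continue                 # TTL exhausted: stop propagating
        for peer in neighbors.get(node, ()):
            if peer not in seen:     # do not query the same node twice
                seen.add(peer)
                frontier.append((peer, remaining - 1))
    return found


# Hypothetical overlay: a chain n1 - n2 - n3 - n4 - n5, with file A at n4.
neighbors = {"n1": ["n2"], "n2": ["n1", "n3"], "n3": ["n2", "n4"],
             "n4": ["n3", "n5"], "n5": ["n4"]}
stores = {"n4": {"A"}}

print(flood_search(neighbors, stores, "n1", "A", ttl=2))  # n4 is 3 hops away
print(flood_search(neighbors, stores, "n1", "A", ttl=3))
```

With `ttl=2` the query dies at n3 and the file is not found even though it exists, which illustrates why flooding gives no lookup guarantee.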

10 Gnutella: Search
[Figure: the query “Where is file A?” is passed from neighbor to neighbor until a node answers “I have file A.” and sends a reply.]

11 Structured vs. Unstructured
The examples we described are unstructured:
There is no systematic rule for how edges are chosen; each node “knows some” other nodes.
Any node can store any data, so the data being searched for might reside at any node.
Structured overlay:
The edges are chosen according to some rule.
Data is stored at a pre-defined place.
Tables define the next hop for lookup.

12 Hashing
Data structure supporting the operations: void insert(key, item); item search(key).
Implementation uses a hash function for mapping keys to array cells.
Expected search time is O(1), provided that there are few collisions.

13 Distributed Hash Tables (DHTs)
Nodes store table entries.
lookup(key) returns the location of the node currently responsible for this key.
We will mainly discuss Chord (Stoica, Morris, Karger, Kaashoek, and Balakrishnan, SIGCOMM 2001).
Other examples: CAN (Berkeley), Tapestry (Berkeley), Pastry (Microsoft Cambridge), etc.
Same spec, not distributed

14 CAN [Ratnasamy, et al]
Map nodes and keys to coordinates in a multi-dimensional Cartesian space.
Routing follows the shortest Euclidean path.
For d dimensions, routing takes $O(d\,n^{1/d})$ hops.
[Figure: a coordinate space divided into zones, with a route from the source to the key.]

15 Chord Logical Structure (MIT)
m-bit ID space ($2^m$ IDs), usually m=160.
Nodes organized in a logical ring according to their IDs.
[Figure: ring with nodes N1, N8, N10, N14, N21, N30, N38, N42, N48, N51, N56.]

16 DHT: Consistent Hashing
[Figure: circular ID space with nodes N32, N90, N105 and keys K5, K20, K80 (key 5 is written K5, node 105 is written N105).]
A key is stored at its successor: the node with the next higher ID.
Thanks to CMU for the animation.
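
The successor rule can be sketched with a sorted list and a binary search. The node and key IDs below are the ones that appear on the slide; the function name `successor` and the extra key K110 (added to show wrap-around) are assumptions for illustration. A key whose ID equals a node ID is assigned to that node, following the usual Chord convention.

```python
from bisect import bisect_left


def successor(node_ids, key):
    """Return the first node ID >= key, wrapping to the smallest ID."""
    ring = sorted(node_ids)
    i = bisect_left(ring, key)
    return ring[i % len(ring)]


nodes = [32, 90, 105]             # N32, N90, N105
for key in (5, 20, 80, 110):      # K5, K20, K80, plus a hypothetical K110
    print(f"K{key} -> N{successor(nodes, key)}")
```

K110 lies past the highest node ID, so the search index runs off the end of the list and the modulo wraps it back to N32.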

17 Consistent Hashing Guarantees
For any set of N nodes and K keys:
A node is responsible for at most $(1 + \epsilon)K/N$ keys.
When an (N + 1)st node joins or leaves, responsibility for O(K/N) keys changes hands.
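
The second guarantee can be checked empirically: when one node joins, the only keys that change owner are the ones the new node takes over. The sketch below assumes SHA-1 hashing truncated to a 32-bit ID space and made-up node and file names; none of these choices come from the slides.

```python
import hashlib
from bisect import bisect_left

M = 32  # hypothetical ID width in bits, kept small for the demo


def ident(name):
    """Hash a name to an M-bit identifier using SHA-1."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** M)


def assign(keys, node_names):
    """Map each key to the name of its successor node on the ring."""
    ring = sorted((ident(n), n) for n in node_names)
    ids = [i for i, _ in ring]
    return {k: ring[bisect_left(ids, ident(k)) % len(ring)][1] for k in keys}


keys = [f"file-{i}" for i in range(1000)]
nodes = [f"node-{i}" for i in range(10)]

before = assign(keys, nodes)
after = assign(keys, nodes + ["node-new"])

moved = [k for k in keys if before[k] != after[k]]
print(len(moved), "of", len(keys), "keys moved")
print(all(after[k] == "node-new" for k in moved))
```

With a plain `hash(key) % N` placement, by contrast, changing N would reassign most keys, which is the problem consistent hashing avoids.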

18 DHT: Chord Basic Lookup
[Figure: ring with nodes N10, N32, N60, N90, N105, N120 and key K80. The lookup “Where is key 80?” is answered with “N90 has K80”.]
Each node knows only its successor.
Routing goes around the circle, one node at a time.
Just need to make progress, and not overshoot. Will talk about initialization later. And robustness. Now, how about speed?
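
Successor-only lookup can be sketched as a walk around the ring that stops when the key falls between the current node and its successor. The node IDs are the ones on the slide; starting the query at N10, the helper names, and the hop counting are assumptions for illustration.

```python
def in_half_open(x, a, b):
    """True if x lies in the ring interval (a, b], walking clockwise from a."""
    if a < b:
        return a < x <= b
    return x > a or x <= b  # the interval wraps past zero


def basic_lookup(ring, start, key):
    """Follow successor pointers from `start` until the key's owner is found.
    `ring` must be sorted. Returns (owner, number of forwarding hops)."""
    succ = {n: ring[(i + 1) % len(ring)] for i, n in enumerate(ring)}
    node, hops = start, 0
    while not in_half_open(key, node, succ[node]):
        node = succ[node]
        hops += 1
    return succ[node], hops


ring = [10, 32, 60, 90, 105, 120]
print(basic_lookup(ring, 10, 80))   # query for key 80 starting at N10
```

The hop count grows linearly with the number of nodes between the start and the owner, which is the speed problem that finger tables address.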

19 DHT: Chord “Finger Table”
[Figure: the fingers of N80 span 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, and 1/128 of the ring.]
Small tables, but multi-hop lookup.
Table entries: IP address and Chord ID.
Navigate in ID space, route queries closer to successor.
Log(n) tables, log(n) hops.
Route to a document between ¼ and ½ …
Entry i in the finger table of node n is the first node that succeeds or equals $n + 2^i$.
In other words, for an m-bit ID space, the ith finger points $1/2^{m-i}$ of the way around the ring.
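
The finger-table definition can be turned into a few lines of code. The sketch assumes a 3-bit ID space and the node set n0, n1, n2, n6 that appears in the join slides; the function names are hypothetical, and the printed rows are computed from the definition rather than copied from the slides.

```python
from bisect import bisect_left

M = 3  # hypothetical 3-bit ID space: identifiers 0..7


def successor(ring, ident):
    """First node whose ID succeeds or equals `ident`, wrapping around."""
    i = bisect_left(ring, ident)
    return ring[i % len(ring)]


def finger_table(ring, n):
    """Rows (i, (n + 2**i) mod 2**M, successor of that target)."""
    rows = []
    for i in range(M):
        target = (n + 2 ** i) % (2 ** M)
        rows.append((i, target, successor(ring, target)))
    return rows


ring = [0, 1, 2, 6]  # sorted node IDs
for n in ring:
    print(f"node {n}:", finger_table(ring, n))
```

Each table has only M rows regardless of how many nodes are on the ring, which is what keeps the per-node state logarithmic in the size of the ID space.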

20 DHT: Chord Join
Assume an identifier space [0..7] (8 identifiers).
Node n1 joins.
[Figure: identifier ring with n1's successor table (columns: i, id+2^i, succ).]

21 DHT: Chord Join
Node n2 joins.
[Figure: identifier ring with the successor tables of n1 and n2 (columns: i, id+2^i, succ).]

22 DHT: Chord Join
Nodes n0, n6 join.
[Figure: identifier ring with the successor tables of all four nodes (columns: i, id+2^i, succ).]

23 DHT: Chord Join
Nodes: n1, n2, n0, n6
Items: f7, f1
[Figure: identifier ring with the successor tables of all four nodes (columns: i, id+2^i, succ) and the items stored at them.]

24 DHT: Chord Routing
Upon receiving a query for item id, a node:
Checks whether it stores the item locally.
If not, forwards the query to the largest node in its successor table that does not exceed id.
[Figure: the ring from the join example, with query(7) being routed between nodes.]
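
The routing rule can be sketched end to end on a small ring. The sketch assumes a 3-bit ID space with nodes n0, n1, n2, n6, and it uses Chord's closest-preceding-finger formulation of the forwarding step, which is close to but not word for word the slide's rule. All names are hypothetical.

```python
from bisect import bisect_left

M = 3                     # hypothetical 3-bit ID space (IDs 0..7)
SIZE = 2 ** M
RING = [0, 1, 2, 6]       # sorted node IDs: n0, n1, n2, n6


def successor(ident):
    """First node whose ID succeeds or equals `ident`, wrapping around."""
    return RING[bisect_left(RING, ident) % len(RING)]


# FINGER[n][i] = successor((n + 2**i) mod 2**M); FINGER[n][0] is n's successor.
FINGER = {n: [successor((n + 2 ** i) % SIZE) for i in range(M)] for n in RING}
PRED = {n: RING[i - 1] for i, n in enumerate(RING)}   # predecessor on the ring


def in_open(x, a, b):
    """True if x lies strictly between a and b going clockwise."""
    return a < x < b if a < b else (x > a or x < b)


def in_half_open(x, a, b):
    """True if x lies in the clockwise interval (a, b]."""
    return a < x <= b if a < b else (x > a or x <= b)


def closest_preceding_finger(n, key):
    """Farthest finger of n that still lies before `key` on the ring."""
    for f in reversed(FINGER[n]):
        if in_open(f, n, key):
            return f
    return n


def route(start, key):
    """Return (node responsible for key, list of nodes the query visited)."""
    node, path = start, [start]
    while True:
        if in_half_open(key, PRED[node], node):
            return node, path                 # item is stored locally
        if in_half_open(key, node, FINGER[node][0]):
            return FINGER[node][0], path      # the successor stores it
        node = closest_preceding_finger(node, key)
        path.append(node)


print(route(1, 7))   # query(7) issued at n1
```

From n1, the farthest finger that does not pass ID 7 is n6, and n6's successor n0 is responsible for key 7, so the query is resolved after one forwarding step.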

25 Chord Data Structures
Finger table
First finger is successor
Predecessor
What if each node knows all other nodes?
O(1) routing
Expensive updates

26 Routing Time Node n looks up a key stored at node p
p is in n’s ith interval: $p \in ((n+2^{i-1}) \bmod 2^m,\ (n+2^i) \bmod 2^m]$
n contacts f = finger[i].
The interval is not empty, so: $f \in ((n+2^{i-1}) \bmod 2^m,\ (n+2^i) \bmod 2^m]$
f is at least $2^{i-1}$ away from n.
p is at most $2^{i-1}$ away from f.
The distance is halved at each hop.
[Figure: ring segment showing, in order, n, $n+2^{i-1}$, f = finger[i], p, and $n+2^i$.]
m steps worst case; O(log N) whp under load balance assumptions.

27 Routing Time
Assuming uniform node distribution around the circle, the number of nodes in the search space is halved at each step.
Expected number of steps: log N
Note that m = 160, whereas for 1,000,000 nodes, log N ≈ 20.
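
The gap between the worst-case bound of m steps and the expected log N steps is easy to see numerically. The sketch below uses base-2 logarithms, since the search space is halved at each step; the node counts other than 1,000,000 are illustrative assumptions.

```python
import math

M_BITS = 160   # ID width from the slide: worst case is m steps


def expected_steps(n_nodes):
    """log2 of the node count: the halving argument's step estimate."""
    return math.log2(n_nodes)


for n_nodes in (1_000, 1_000_000, 1_000_000_000):
    steps = expected_steps(n_nodes)
    print(f"N = {n_nodes:>13,}: log2 N = {steps:5.2f} (worst case {M_BITS} steps)")
```

Even at a billion nodes the estimate stays far below the 160-step worst case, because the number of nodes is tiny compared with the $2^{160}$ possible IDs.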

28 P2P Lessons
Decentralized architecture. Avoid centralization.
Flooding can work.
Logical overlay structures provide strong performance guarantees.
Churn is a problem.
Useful in many distributed contexts.
