
1 File Sharing: Hash/Lookup – Yossi Shasho (HW in last slide). Based on "Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications". Partially based on "The Impact of DHT Routing Geometry on Resilience and Proximity". Partially based on "Building a Low-latency, Proximity-aware DHT-Based P2P Network" (http://www.computer.org/portal/web/csdl/doi/10.1109/KSE.2009.49). Some slides liberally borrowed from: Carnegie Mellon Peer-2-Peer 15-411; Petar Maymounkov and David Mazières' Kademlia talk, New York University.

2 Peer-2-Peer – Distributed systems without any centralized control or hierarchical organization. – Long list of applications: redundant storage, permanence, selection of nearby servers, anonymity, search, authentication, hierarchical naming and more. – The core operation in most p2p systems is efficient location of data items.

3 Outline: 1. The Problem 2. Distributed hash tables (DHT) 3. Chord: a DHT scheme (Geometry, Lookup, Node Joins, Performance) 4. Extras

4 1. The Problem 2. Distributed hash tables (DHT) 3. Chord: a DHT scheme (Geometry, Lookup, Node Joins, Performance) 4. Extras

5 Think Big – /home/google/: one namespace, thousands of servers. – Map each key (= filename) to a value (= server). – Hash table? Think again: what if a new server joins? A server fails? How do we keep track of all the servers? What about redundancy? And proximity? A single centralized table is not scalable and not fault tolerant. Lots of new problems to come up…

6 1. The Problem 2. Distributed hash tables (DHT) 3. Chord: a DHT scheme (Geometry, Lookup, Node Joins, Performance) 4. Extras

7 DHT: Overview – Abstraction: a distributed "hash table" (DHT) data structure: put(id, item); item = get(id). Scalable, decentralized, fault tolerant. Implementation: the nodes in the system form a distributed data structure – it can be a ring, tree, hypercube, skip list, butterfly network, ...
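
To make the abstraction concrete, here is a minimal, non-distributed sketch of the put/get interface in Python. The ToyDHT class, the key_id helper and the 16-bit identifier space are illustrative assumptions, not the API of any real DHT.

```python
import hashlib

def key_id(key: str, bits: int = 16) -> int:
    """Hash an application-level key onto the identifier space [0, 2**bits)."""
    digest = hashlib.sha1(key.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)

class ToyDHT:
    """Centralized stand-in for a real DHT; it only demonstrates the interface."""
    def __init__(self):
        self.store = {}

    def put(self, key: str, item) -> None:
        self.store[key_id(key)] = item

    def get(self, key: str):
        return self.store.get(key_id(key))

dht = ToyDHT()
dht.put("/home/google/readme.txt", "server-42")
print(dht.get("/home/google/readme.txt"))   # -> server-42
```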

8 DHT: Overview (2) – Many DHTs (comparison figure on the slide).

9 DHT: Overview (3) – Good properties: distributed construction/maintenance, load-balanced with uniform identifiers, O(log n) hops / neighbors per node, provides underlying network proximity.

10 Consistent Hashing – When adding rows (servers) to the hash table, we don't want all keys to change their mappings. When adding the Nth row, we want ~1/N of the keys to change their mappings. Is this achievable? Yes.
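
A small simulation of this property is sketched below, assuming a 2^32 identifier ring, SHA-1 as the hash and made-up server/key names; owner() is the "first server clockwise" rule that the Chord slides formalize as the successor. Adding a tenth server should move roughly 1/10 of the keys.

```python
import bisect
import hashlib

RING_BITS = 32

def h(name: str) -> int:
    """Map a name onto the [0, 2**RING_BITS) identifier ring."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big") % (2 ** RING_BITS)

def owner(key: str, servers: list[str]) -> str:
    """A key is owned by the first server clockwise from its hash (its successor)."""
    points = sorted((h(s), s) for s in servers)
    ids = [p for p, _ in points]
    i = bisect.bisect_right(ids, h(key)) % len(points)
    return points[i][1]

servers = [f"server-{i}" for i in range(9)]
keys = [f"/home/google/file-{i}" for i in range(10_000)]
before = {k: owner(k, servers) for k in keys}
after = {k: owner(k, servers + ["server-9"]) for k in keys}   # add the 10th server
moved = sum(before[k] != after[k] for k in keys)
print(moved / len(keys))   # roughly 0.1: ~1/N of the keys changed their mapping
```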

11 1. The Problem 2. Distributed hash tables (DHT) 3. Chord: a DHT scheme (Geometry, Lookup, Node Joins, Performance) 4. Extras

14 Chord: Overview – Just one operation: item = get(id). Each node needs routing info about only a few other nodes. O(log N) for lookup, O(log² N) messages for join/leave. Simple, with provable correctness and provable performance. Apps built on top of it do the rest.

15 Chord: Geometry – Identifier space [1, N], example: binary strings. Keys (filenames) and values (server IPs) live on the same identifier space, and keys & values are evenly distributed. Now, put this identifier space on a circle. Consistent hashing: a key is stored at its successor.

16 Chord: Geometry (2) – A key is stored at its successor: the node with the next higher ID. Example (circular ID space with nodes N32, N90, N105 and keys K5, K20, K80): get(5) = N32, get(20) = N32, get(80) = N90. Who maps to N105? Nobody.
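
Below is a tiny sketch of the successor rule using the IDs from this slide (nodes 32, 90, 105 and keys 5, 20, 80); the helper is illustrative and not Chord's actual interface.

```python
def successor(key: int, nodes: list[int]) -> int:
    """First node clockwise from the key's ID, wrapping around the circle."""
    candidates = [n for n in sorted(nodes) if n >= key]
    return candidates[0] if candidates else min(nodes)

nodes = [32, 90, 105]
for k in (5, 20, 80):
    print(f"get({k}) = N{successor(k, nodes)}")   # N32, N32, N90
```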

17 Chord: Back to Consistent Hashing – "When adding the Nth row, we want ~1/N of the keys to change their mappings" (the problem, a few slides back). In the example, nodes N15 and N50 join the ring of N32, N90, N105: now get(5) = N15 (previously N32), while get(20) = N32 and get(80) = N90 are unchanged, and still nobody maps to N105. Only the keys between a new node and its predecessor move.

18 1. The Problem 2. Distributed hash tables (DHT) 3. Chord: a DHT scheme (Geometry, Lookup, Node Joins, Performance) 4. Extras

19 Chord: Basic Lookup – Example ring: N10, N32, N60, N90, N105, N120. N10 asks "Where is key 80?" and the query is forwarded from successor to successor until N90 answers "N90 has K80". Each node remembers only its next node, so lookup takes O(N) time – no good! Pseudocode: get(k): if (I have k) return "ME"; else P ← next node; return P.get(k).
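
A runnable sketch of this O(N) successor-chasing lookup, using the slide's example ring; the Node class and its direct object references are stand-ins for real remote calls.

```python
class Node:
    def __init__(self, node_id: int):
        self.id = node_id
        self.keys: set[int] = set()
        self.successor: "Node" = self           # fixed up once the ring is built

    def get(self, k: int, hops: int = 0):
        if k in self.keys:
            return self.id, hops
        return self.successor.get(k, hops + 1)  # forward around the ring

ids = [10, 32, 60, 90, 105, 120]
ring = [Node(i) for i in ids]
for a, b in zip(ring, ring[1:] + ring[:1]):     # close the circle
    a.successor = b
ring[3].keys.add(80)                            # N90 stores K80
print(ring[0].get(80))                          # (90, 3): found at N90 after 3 hops
```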

20 Chord: "Finger Table" – The previous lookup was O(N); we want O(log N). Entry i in the finger table of node n is the first node n' such that n' ≥ n + 2^i. In other words, successive fingers of n point 1/2, 1/4, 1/8, ..., 1/128 of the way around the ring. Finger table of N80: i=0: 80+2^0 = 81 → succ __; i=1: 80+2^1 = 82 → __; i=2: 80+2^2 = 84 → __.
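
A sketch of how such a finger table is built, assuming a 7-bit ring and a made-up node set (the slide leaves the successor column blank): finger[i] is simply successor(n + 2^i).

```python
M = 7                                              # assumed number of ID bits (ring size 128)
nodes = sorted([1, 12, 25, 44, 63, 80, 97, 110])   # hypothetical node IDs

def successor(ident: int) -> int:
    ident %= 2 ** M
    return next((n for n in nodes if n >= ident), nodes[0])   # wrap around

def finger_table(n: int) -> list[tuple[int, int]]:
    """finger[i] = successor(n + 2**i) for i = 0 .. M-1."""
    return [((n + 2 ** i) % 2 ** M, successor(n + 2 ** i)) for i in range(M)]

for start, succ in finger_table(80):
    print(f"start {start:3d} -> succ N{succ}")
```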

21 Chord: "Finger Table" Lookups – Same finger table as before, but lookups now use it. Pseudocode: get(k): if (I have k) return "ME"; else P ← closest finger ≤ k; return P.get(k).

22 Chord: "Finger Table" Lookups – Worked example on the ring N2, N9, N19, N31, N49, N65, N74, N81, N90: N65 asks "Where is key 40?". N65's finger table (i: 65+2^i → succ): 0: 66 → N74; 1: 67 → N74; ...; 6: 29 → N19. So N65 forwards the query to N19, its closest finger that does not pass key 40. N19's finger table: 0: 20 → N31; 1: 21 → N31; ...; 4: 35 → N49. The query reaches N49, the successor of 40, which answers "40!". Pseudocode: get(k): if (I have k) return "ME"; else P ← closest finger ≤ k; return P.get(k).
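
A self-contained sketch of this finger-table lookup, assuming a global view of the ring and a 7-bit (mod 128) identifier space, so the intermediate hops won't exactly match the figure; the result, key 40's successor N49, is the same.

```python
M = 7
RING = 2 ** M
nodes = sorted([2, 9, 19, 31, 49, 65, 74, 81, 90])

def successor(ident: int) -> int:
    ident %= RING
    return next((n for n in nodes if n >= ident), nodes[0])

def node_successor(n: int) -> int:
    return nodes[(nodes.index(n) + 1) % len(nodes)]

def in_interval(x: int, a: int, b: int) -> bool:
    """True if x lies strictly between a and b going clockwise on the ring."""
    return (a < x < b) if a < b else (x > a or x < b)

def fingers(n: int) -> list[int]:
    return [successor(n + 2 ** i) for i in range(M)]

def closest_preceding_finger(n: int, key: int) -> int:
    for f in reversed(fingers(n)):          # try the farthest finger first
        if in_interval(f, n, key):
            return f
    return n

def lookup(start: int, key: int):
    n, path = start, [start]
    while not (in_interval(key, n, node_successor(n)) or key == node_successor(n)):
        n = closest_preceding_finger(n, key)
        path.append(n)
    return node_successor(n), path

print(lookup(65, 40))   # (49, [...]): N49 is the successor of key 40
```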

23 Chord: Example – Assume an identifier space [0..8), i.e. ring positions 0–7. Node n1 joins and is responsible for all keys. Its successor table (Succ. == successor): i=0: 1+2^0 = 2 → succ 1; i=1: 1+2^1 = 3 → 1; i=2: 1+2^2 = 5 → 1.

24 Chord: Example – Node n2 joins. n1's successor table becomes: i=0: 2 → succ 2 (was 1); i=1: 3 → 1; i=2: 5 → 1. n2's successor table: i=0: 3 → 1; i=1: 4 → 1; i=2: 6 → 1.

25 Chord: Example – Nodes n0 and n6 join. n1's table: i=0: 2 → 2; i=1: 3 → 6; i=2: 5 → 6. n2's table: i=0: 3 → 6; i=1: 4 → 6; i=2: 6 → 6. n0's table: i=0: 1 → 1; i=1: 2 → 2; i=2: 4 → 0. n6's table: i=0: 7 → 0; i=1: 0 → 0; i=2: 2 → 2.

26 Chord: Example – Nodes: n1, n2, n0, n6. Items: 1 and 7. Item 1 is stored at its successor n1, item 7 at its successor n0; the successor tables are as on the previous slide.

27 Chord: Routing – Upon receiving a query for item id, a node: 1. Checks if it stores the item locally. 2. If not, forwards the query to the largest node i in its finger table such that i ≤ id. Example: query(7) is forwarded this way until it reaches n0, which stores item 7.

28 1. The Problem 2. Distributed hash tables (DHT) 3. Chord: a DHT scheme (Geometry, Lookup, Node Joins, Performance) 4. Extras

29 Chord: Node Join – Node n joins; it needs one existing node n' in hand. 1. Initialize the fingers of n: ask n' to look them up (log N fingers to init). 2. Update the fingers of the rest: only a few nodes need to be updated; look them up and tell them n is new in town. 3. Transfer keys: n takes over the keys it is now the successor of.
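
A toy, global-view sketch of these three steps; the node list, the key store and the helpers are assumptions standing in for Chord's real RPC-based join protocol.

```python
M = 7
RING = 2 ** M

def successor(ident: int, nodes: list[int]) -> int:
    ident %= RING
    live = sorted(nodes)
    return next((n for n in live if n >= ident), live[0])

def finger_table(n: int, nodes: list[int]) -> list[int]:
    return [successor(n + 2 ** i, nodes) for i in range(M)]

def join(new: int, nodes: list[int], store: dict[int, set[int]]):
    """The slide's three steps: init fingers, update fingers, transfer keys."""
    nodes.append(new)
    store.setdefault(new, set())
    # Steps 1 + 2: with a global view we just rebuild every table; a real node
    # asks n' to look up its log N fingers and updates only the affected nodes.
    tables = {n: finger_table(n, nodes) for n in nodes}
    # Step 3: take over the keys the new node is now the successor of.
    succ = successor(new + 1, nodes)
    moved = {k for k in store[succ] if successor(k, nodes) == new}
    store[succ] -= moved
    store[new] |= moved
    return tables

nodes = [1, 12, 44, 80, 110]
store = {n: set() for n in nodes}
store[44] = {20, 30, 40}
join(25, nodes, store)
print(store[25], store[44])   # {20} {30, 40}: key 20's successor is now node 25
```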

30 Chord: Improvements – Every 30 s, ask your successor for its predecessor and fix your own successor based on the answer (stabilization). Also, pick and verify a random finger, rebuilding finger-table entries this way. Keep a successor list of r successors to deal with unexpected node failures; these can also be used to replicate data.
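
A sketch of this periodic maintenance, assuming a toy Node class with direct object references; the names stabilize and notify follow the Chord paper, but everything else here is illustrative only.

```python
import random

class Node:
    def __init__(self, node_id: int, m: int = 7):
        self.id, self.m = node_id, m
        self.successor: "Node" = self
        self.predecessor: "Node | None" = None
        self.fingers: list["Node"] = [self] * m

    @staticmethod
    def between(x: int, a: int, b: int) -> bool:
        return (a < x < b) if a < b else (x > a or x < b)

    def stabilize(self) -> None:
        """Ask the successor for its predecessor; adopt it if it sits between us."""
        p = self.successor.predecessor
        if p is not None and self.between(p.id, self.id, self.successor.id):
            self.successor = p
        self.successor.notify(self)

    def notify(self, candidate: "Node") -> None:
        """Another node thinks it might be our predecessor."""
        if self.predecessor is None or self.between(candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate

    def fix_random_finger(self, lookup) -> None:
        """Re-verify one random finger; lookup(id) is assumed to return a Node."""
        i = random.randrange(self.m)
        self.fingers[i] = lookup((self.id + 2 ** i) % 2 ** self.m)

a, b = Node(10), Node(80)
a.successor, b.successor = b, a             # bootstrap a two-node ring
a.stabilize(); b.stabilize(); a.stabilize()
print(a.successor.id, a.predecessor.id)     # 80 80
```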

31 1. The Problem 2. Distributed hash tables (DHT) 3. Chord: a DHT scheme (Geometry, Lookup, Node Joins, Performance) 4. Extras

32 Chord: Performance – Routing table size? log N fingers. Routing time? Each hop is expected to halve the distance to the desired id => expect O(log N) hops. Node joins: querying for the fingers => O(log N); updating other nodes' fingers => O(log² N).
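
A quick back-of-the-envelope check of these bounds for a hypothetical network of one million nodes:

```python
import math

N = 1_000_000
hops = math.ceil(math.log2(N))
print(hops)          # 20: ~log N fingers per node and ~O(log N) hops per lookup
print(hops ** 2)     # 400: order of the messages needed to update fingers on a join
```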

33 Chord: Performance (2) – [Plot] Measured (real-time) lookup latency versus the number of nodes.

34 Chord: Performance (3) – [Plot] Comparison to other DHTs.

35 Chord: Performance (4) – Chord promises few (O(log N)) hops on the overlay, but on the physical network a single overlay hop can be quite far. [Figure] A Chord network with N=8 nodes and an m=8-bit key space.

36 1. The Problem 2. Distributed hash tables (DHT) 3. Chord: a DHT scheme (Geometry, Lookup, Node Joins, Performance) 4. Extras

37 Applications employing DHTs – eMule (KAD implements Kademlia, a DHT). An anonymous network (≥ 2 million downloads to date). BitTorrent (≥ 4.1.2 beta): trackerless BitTorrent, allows anonymity (thank god). 1. Clients A & B handshake. 2. A: "I have DHT, it's on port X." 3. B pings port X of A. 4. B gets a reply => starts adjusting its nodes, rows…

38 Kademlia (KAD) – The distance between A and B is A XOR B. Nodes are treated as leaves in a binary tree; a node's position in A's tree is determined by the longest prefix it shares with A. Example: A's ID: 010010101, B's ID: 101000101 (the first bit already differs, so B falls in the half of A's tree with no shared prefix).
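
The XOR metric applied to the two example IDs from this slide; taking the position of the most significant differing bit is the standard Kademlia rule for deciding which subtree (bucket) the other node falls into, shown here only for these made-up 9-bit IDs.

```python
A = 0b010010101        # our node's ID (from the slide)
B = 0b101000101        # the other node's ID (from the slide)

d = A ^ B
print(bin(d))               # 0b111010000: the bits where A and B differ
print(d.bit_length() - 1)   # 8: the most significant bit differs, so B lies in
                            # the farthest subtree of A's tree (no shared prefix)
```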

39 Kademlia: Prefix Tree – A node's position in A's tree is determined by the longest prefix it shares with A (=> log N subtrees). [Figure] Our node's tree splits into subtrees by shared prefix: common prefix 001, common prefix 00, common prefix 0, and no common prefix.

40 Kademlia: Lookup – Consider a query for ID 111010… initiated by node 0011100… [Figure] The lookup proceeds by contacting nodes that are progressively closer (by XOR) to the target ID.

41 Kademlia: K-Buckets – Consider the routing table for a node with prefix 0011. Its binary tree is divided into a series of subtrees (covering IDs from 00…00 up to 11…11), and the routing table is composed of one k-bucket corresponding to each of these subtrees. A contact consists of an <IP address, UDP port, node ID> triple. In a 2-bucket example (k = 2), each bucket will have at least 2 contacts for its key range.
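
A minimal k-bucket sketch with k = 2, matching the slide's 2-bucket example; the 8-bit IDs and the contacts are made up, and a real implementation would ping the oldest contact instead of silently ignoring new ones when a bucket is full.

```python
K = 2
BITS = 8
MY_ID = 0b00110101                     # a hypothetical node with prefix 0011

def bucket_index(other_id: int) -> int:
    """Bucket = position of the most significant bit where the two IDs differ."""
    return (MY_ID ^ other_id).bit_length() - 1

buckets: list[list[int]] = [[] for _ in range(BITS)]

def add_contact(other_id: int) -> None:
    if other_id == MY_ID:
        return
    bucket = buckets[bucket_index(other_id)]
    if other_id not in bucket and len(bucket) < K:
        bucket.append(other_id)        # keep at most K contacts per key range

for contact in (0b00111111, 0b00100001, 0b11010010, 0b01101100):
    add_contact(contact)
print([(i, b) for i, b in enumerate(buckets) if b])
```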

42 Summary: 1. The Problem 2. Distributed hash tables (DHT) 3. Chord: a DHT scheme (Geometry, Lookup, Node Joins, Performance) 4. Extras

43 Homework – Load balance is achieved when all servers in the Chord network are responsible for (roughly) the same number of keys. Still, with some probability, one server can be responsible for significantly more keys. How can we lower the upper bound on the number of keys assigned to a server? Hint: simulation.

