
1 CS 347: Parallel and Distributed Data Management. Notes 08: P2P Systems

2 CS 347Notes082

3 Peer To Peer Systems
Distributed application where nodes are:
– Autonomous
– Very loosely coupled
– Equal in role or functionality
– Share & exchange resources with each other

4 CS 347Notes084 Related Terms File Sharing Grid Computing Autonomic Computing

5 Search in a P2P System. Three nodes, each with its own resources: R1,1, R1,2, ...; R2,1, R2,2, ...; R3,1, R3,2, ... One node issues the query: Who has X?

6 Search in a P2P System (continued). Same three nodes and query. Nodes that hold X send answers back to the querying node.

7 Search in a P2P System (continued). Same three nodes and query. Once the answers arrive, the querying node sends "request resource" to an answering node and then receives the resource.

8 Distributed Lookup. Have (k, v) pairs. Given k, find matching values.
k v
1 a
1 b
7 c
4 d
1 a
3 a
4 a
lookup(4) = {a, d}

9 CS 347Notes089 Data Distributed Over Nodes N nodes Each holds some data
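
A minimal sketch of the lookup from slides 8 and 9, assuming the seven (k, v) pairs are spread over three nodes. The placement of pairs on nodes and the names `NODES` and `lookup` are hypothetical; the slides do not say which node holds which pair.

```python
# Hypothetical placement of the slide's (k, v) pairs on three nodes.
NODES = {
    "node1": [(1, "a"), (1, "b"), (7, "c")],
    "node2": [(4, "d"), (1, "a")],
    "node3": [(3, "a"), (4, "a")],
}

def lookup(k, nodes=NODES):
    """Return the set of all values stored under key k on any node."""
    return {v for pairs in nodes.values() for key, v in pairs if key == k}

# The slide's example: lookup(4) yields the set containing a and d.
result = lookup(4)
```

Without any placement rule, every node has to be asked; the hashing schemes on the following slides exist to avoid exactly that.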

10 Distributed Hashing: Chord, Replicated HT. Chord paper: Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M. Frans Kaashoek, Frank Dabek, Hari Balakrishnan, "Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications," IEEE/ACM Transactions on Networking. http://pdos.csail.mit.edu/chord/papers/paper-ton.pdf

11 Hashing Values and Nodes. H(v) is an m-bit number (v is a value). H(X) is an m-bit number (X is a node id). The hash function is “good”. Identifiers range from 0 to 2^m - 1. Figure: H(X), H(Y), and H(k) are marked on the identifier line; the pair (k, v) is stored at Y.

12 CS 347Notes0812 Chord Circle N56 N51 N48 N42 N38 N32 N21 N14 N8 N1 K54 K38 K24, K30 K10 m=6

13 Rule. Consider nodes X, Y such that Y follows X clockwise. Node Y is responsible for keys k such that H(k) in ( H(X), H(Y) ]. Example: with N54 followed clockwise by N3, N3 stores K55, K56, ..., K3. Node and key labels use hashed values, e.g., N54 is the node whose id hashes to 54.
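
The rule can be sketched as a search over the sorted node identifiers. This is an illustration only: the node ids are the ones drawn on the Chord circle slide (m = 6), and the function name `responsible_node` is hypothetical.

```python
from bisect import bisect_left

M = 6  # identifier bits, as on the Chord circle slide
NODES = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]  # hashed node ids, sorted

def responsible_node(key_id, nodes=NODES):
    """Return the first node at or clockwise after key_id.

    That node Y satisfies H(k) in (H(X), H(Y)] for its predecessor X.
    """
    i = bisect_left(nodes, key_id % (2 ** M))
    return nodes[i % len(nodes)]  # index len(nodes) wraps around to the first node
```

With these ids, K10 lands on N14, K24 and K30 on N32, K38 on N38, and K54 on N56, which matches the key placement drawn on the circle.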

14 CS 347Notes0814 Succ, pred links N56 N51 N48 N42 N38 N32 N21 N14 N8 N1 N1.succ N1.pred m=6

15 CS 347Notes0815 Search using succ links N56 N51 N48 N42 N38 N32 N21 N14 N8 N1 N14.find_succ(K52) N51.find_succ(K52) =N56 m=6
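
A sketch of the search that uses only succ links, assuming the ring from the slide. The helper names `in_half_open` and `find_succ_linear` are hypothetical, and the successor map is built locally rather than by exchanging messages.

```python
NODES = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]

def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b  # the interval wraps past zero

def find_succ_linear(start, key_id, nodes=NODES):
    """Walk succ links from start until the key falls in (n, n.succ]."""
    succ = {n: nodes[(i + 1) % len(nodes)] for i, n in enumerate(nodes)}
    n, path = start, [start]
    while not in_half_open(key_id, n, succ[n]):
        n = succ[n]
        path.append(n)
    return succ[n], path

owner, path = find_succ_linear(14, 52)
```

Starting at N14, the query for K52 is passed node by node until N51, which answers N56. The number of hops grows linearly with the number of nodes, which motivates the finger table.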

16 Finger Table. Ring: N56 N51 N48 N42 N38 N32 N21 N14 N8 N1, m=6. Finger table for N8:
N8+1 -> N14
N8+2 -> N14
N8+4 -> N14
N8+8 -> N21
N8+16 -> N32
N8+32 -> N42

17 Finger Table (continued). Same ring and table. Each entry is the node responsible for the given key, e.g., the entry N8+32 is the node responsible for key 8+32 = 40, which is N42.
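
The table for N8 can be reproduced with a short sketch. It assumes entry i points at the node responsible for (n + 2^i) mod 2^m, as in the Chord paper; the function names are hypothetical.

```python
from bisect import bisect_left

M = 6
NODES = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]  # sorted node ids from the circle

def successor(ident, nodes=NODES):
    """First node at or clockwise after ident."""
    return nodes[bisect_left(nodes, ident) % len(nodes)]

def finger_table(n, nodes=NODES, m=M):
    """Return (start, node) pairs, where start = (n + 2**i) mod 2**m."""
    starts = [(n + 2 ** i) % 2 ** m for i in range(m)]
    return [(s, successor(s, nodes)) for s in starts]

for start, node in finger_table(8):
    print(f"N8 finger start {start} -> N{node}")
```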

18 CS 347Notes0818 Example N56 N51 N48 N42 N38 N32 N21 N14 N8 N1 find_succ(K54)

19 CS 347Notes0819 Example N56 N51 N48 N42 N38 N32 N21 N14 N8 N1 find_succ(K54)

20 CS 347Notes0820 Example N56 N51 N48 N42 N38 N32 N21 N14 N8 N1 find_succ(K54) K54
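
The find_succ(K54) example can be traced with a sketch of finger-based routing. This follows the lookup procedure described in the Chord paper (forward to the closest finger that precedes the key), with the assumption that the query starts at N8; the slides do not state the starting node in text. Fingers are recomputed from a global node list here, whereas a real node would keep its own table.

```python
from bisect import bisect_left

M = 6
NODES = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]

def successor(ident):
    """First node at or clockwise after ident (mod 2**M)."""
    return NODES[bisect_left(NODES, ident % 2 ** M) % len(NODES)]

def fingers(n):
    return [successor(n + 2 ** i) for i in range(M)]

def in_open(x, a, b):
    """True if x lies in the circular open interval (a, b)."""
    return a < x < b if a < b else (x > a or x < b)

def find_succ(n, key_id):
    """Return (responsible node, list of nodes the query visited)."""
    path = [n]
    while True:
        succ = successor(n + 1)
        if key_id == succ or in_open(key_id, n, succ):  # key in (n, succ]
            return succ, path
        # Farthest finger that still precedes the key; fall back to succ.
        n = next((f for f in reversed(fingers(n)) if in_open(f, n, key_id)), succ)
        path.append(n)

owner, path = find_succ(8, 54)
```

Under these assumptions the query visits N8, N42, and N51, and N51 reports N56 as the node holding K54.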

21 Adding Nodes to Circle. Ring: N56 N51 N48 N42 N38 N32 N21 N14 N8 N1, with keys K54, K38, K24, K30, K10. N26 is the joining node. For now, assume nodes never die. Need to: 1. update links; 2. move data.

22 CS 347Notes0822 Join Example Np Nx Ns

23 CS 347Notes0823 Join Example Np Nx Ns nil after join:

24 CS 347Notes0824 Join Example Np Nx Ns nil after Nx.stabilize: Y= Np

25 CS 347Notes0825 Join Example Np Nx Ns after Np.stabilize: Y= Nx
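
The join example on slides 22 to 25 can be replayed with a small sketch of join, stabilize, and notify as described in the Chord paper. The ids 21, 26, and 32 for Np, Nx, and Ns are borrowed from the "Adding Nodes to Circle" slide as an assumption, and `join` takes the successor directly instead of locating it with find_succ.

```python
def between(x, a, b):
    """True if x lies in the circular open interval (a, b)."""
    return a < x < b if a < b else (x > a or x < b)

class Node:
    def __init__(self, ident):
        self.id = ident
        self.succ = self   # a lone node is its own successor
        self.pred = None   # nil

    def join(self, successor):
        # Simplification: the successor is handed in by the caller.
        self.pred = None
        self.succ = successor

    def stabilize(self):
        y = self.succ.pred
        if y is not None and between(y.id, self.id, self.succ.id):
            self.succ = y              # a closer successor has appeared
        self.succ.notify(self)

    def notify(self, candidate):
        if self.pred is None or between(candidate.id, self.pred.id, self.id):
            self.pred = candidate

# Two-node ring Np <-> Ns, then Nx joins between them.
np_, ns = Node(21), Node(32)
np_.succ, np_.pred = ns, ns
ns.succ, ns.pred = np_, np_
nx = Node(26)
nx.join(ns)        # after join: Nx.succ = Ns, Nx.pred = nil
nx.stabilize()     # Y = Np, so Nx keeps Ns; Ns.pred becomes Nx
np_.stabilize()    # Y = Nx, so Np.succ becomes Nx; Nx.pred becomes Np
```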

26 CS 347Notes0826 Moving Data: When? Np Nx Ns keys in (Np, Ns] keys in (Np, Nx]

27 CS 347Notes0827 Move Data After Ns.notify(Nx) Np Nx Ns nil send all keys in range (Np, Nx] when Ns.prev is updated...
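
A sketch of the data move, assuming Ns hands over keys at the moment its predecessor pointer changes from Np to Nx. The function names are hypothetical; the ids reuse N21, N26, N32 and the keys K24, K30 from the "Adding Nodes to Circle" slide.

```python
def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    return a < x <= b if a < b else (x > a or x <= b)

def keys_to_hand_over(keys, old_pred, new_pred):
    """Keys in (old_pred, new_pred] that now belong to the new predecessor."""
    return sorted(k for k in keys if in_half_open(k, old_pred, new_pred))

# N32 held K24 and K30; after N26 joins behind N21, only K24 moves.
moved = keys_to_hand_over([24, 30], old_pred=21, new_pred=26)
```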

28 CS 347Notes0828 Lookup May be at Wrong Node! Np Nx Ns nil resp. for (Nx, Ns] resp. for (Np, Nx] lookup for k in (Np, Nx] directed to Ns!

29 Results for N Node System
– With high probability, the number of nodes that must be contacted to find a successor is O(log N).
– Although the finger table has room for m entries, only O(log N) need to be stored.
– Experimental results show the average lookup time is (log N)/2.

30 CS 347Notes0830 Node Failures Np Nx Ns k8 v8 k9 v9 k7 v7 data at Nx: Nx dies!! - links screwed up - data lost

31 CS 347Notes0831 Failure Example Np Nx Ns Initially... Nx dies... backup succ link (r=2)

32 CS 347Notes0832 Failure Example Np Ns After Ns.check_pred nil

33 CS 347Notes0833 Failure Example Np Ns After Np discovers Nx down... nil

34 CS 347Notes0834 Failure Example Np Ns After stabilization...

35 CS 347Notes0835 Protecting Against Data Loss One idea: robust node (see notes on replicated data) k8 v8 k9 v9 k7 v7 node X.1 k8 v8 k9 v9 k7 v7 node X.2 k8 v8 k9 v9 k7 v7 node X.3 robust node X backup protocols takeover protocols

36 CS 347Notes0836 Replicated Hash Table node N0 data for keys that hash to 0 node N1 data for keys that hash to 1 node N2 data for keys that hash to 2 node N3 data for keys that hash to 3

37 CS 347Notes0837 Node Joins node N0 data for keys that hash to 0,1 node N2 data for keys that hash to 2,3 N0 overloaded, asks N1 for help...

38 CS 347Notes0838 Node Joins node N0 data for keys that hash to 0,1 node N2 data for keys that hash to 2,3 First, set up N1... node N1

39 CS 347Notes0839 Node Joins node N0 data for keys that hash to 0,1 node N2 data for keys that hash to 2,3 Copy data to N1... node N1 data for keys that hash to 1 data copy

40 CS 347Notes0840 Node Joins node N0 data for keys that hash to 0,1 node N2 data for keys that hash to 2,3 Change control... node N1 data for keys that hash to 1 N1 Which do we update first, N0 or N1??

41 CS 347Notes0841 Node Joins node N0 data for keys that hash to 0 node N2 data for keys that hash to 2,3 Drop data at N0... node N1 data for keys that hash to 1 N1
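
The join steps on slides 37 to 41 (set up N1, copy data, change control, drop data at N0) can be sketched as follows. This is one possible ordering, not a resolution of the slide's open question about which node to update first. The hash `h`, the class `Cluster`, and the single shared `owner` table are illustrative assumptions; in the scheme on the slides each node keeps its own copy of the routing table.

```python
NUM_BUCKETS = 4

def h(key):
    """Placeholder hash: integer keys map to buckets 0..3."""
    return key % NUM_BUCKETS

class Cluster:
    def __init__(self):
        # Before the join: N0 serves buckets 0,1 and N2 serves buckets 2,3.
        self.owner = {0: "N0", 1: "N0", 2: "N2", 3: "N2"}
        self.data = {"N0": {}, "N2": {}}

    def insert(self, key, value):
        self.data[self.owner[h(key)]].setdefault(key, set()).add(value)

    def lookup(self, key):
        # One table access, then one node: O(1) lookups.
        return self.data[self.owner[h(key)]].get(key, set())

    def split(self, bucket, old, new):
        self.data.setdefault(new, {})                  # 1. set up the new node
        for k, vs in self.data[old].items():           # 2. copy data
            if h(k) == bucket:
                self.data[new][k] = set(vs)
        self.owner[bucket] = new                       # 3. change control
        for k in [k for k in self.data[old] if h(k) == bucket]:
            del self.data[old][k]                      # 4. drop data at the old node

c = Cluster()
c.insert(5, "a")   # bucket 1, initially at N0
c.insert(4, "b")   # bucket 0, stays at N0
c.split(1, "N0", "N1")
```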

42 CS 347Notes0842 Update Other Nodes node N0 data for keys that hash to 0,1 node N2 data for keys that hash to 2,3 node N1 data for keys that hash to 1 N1 How do other nodes get updated? Eagerly by N0? Lazily by future lookups?

43 CS 347Notes0843 What about inserts? node N0 data for keys that hash to 0,1 node N2 data for keys that hash to 2,3 node N1 data for keys that hash to 1 insert arrives at this point... apply at N0 and then copy? re-direct to N1? data copy

44 CS 347Notes0844 Chord vs Replicated HT Which code is simpler? Lookups O(log N) vs O(1) Impact of caching Routing table: Log N vs N Anonymity? Bootstrapping

45 Neighborhood Search (Gnutella). Each node stores its own data and searches nearby nodes. Key/val tables at the three nodes in the figure: {12 a, 7 b, 13 c, 25 a}; {47 f, 12 d, 51 x, 9 y}; {41 g, 99 c, 14 d}.

46 Example. Node N1 issues N1.DTlookup(13), TTL = 4. Key/val tables at the nodes in the figure: {12 a, 7 b, 13 c, 25 a}; {47 f, 12 d, 25 x, 9 y}; {41 g, 13 c, 14 d}; {41 g, 13 x, 14 d}; {41 g, 13 c, 14 d}; {41 g, 13 f, 14 d}; {13 f, 12 c}; {15 f, 12 c}; {13 d, 12 c}; {44 s, 05 a}; {13 t, 01 d}.

47 Example (same network and query). The query is forwarded with ttl=3, then ttl=2, then ttl=1. Answer so far = {c, f, x}.

48 Example (same network and query). Answer = {c, f, x, d} (no t!). The value t stored under key 13 is not returned.
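
A sketch of TTL-limited flooding in the spirit of the example above. The topology cannot be recovered from the transcript, so the chain `GRAPH` and its `DATA` are hypothetical. The sketch also assumes that a node receiving the query with ttl=1 answers it but does not forward it.

```python
# Hypothetical chain of nodes; each node stores its own (key, val) pairs.
GRAPH = {"N1": ["A"], "A": ["N1", "B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
DATA = {
    "N1": [(12, "a")],
    "A": [(13, "c")],
    "B": [(13, "f")],
    "C": [(13, "x")],
    "D": [(13, "t")],
}

def dt_lookup(graph, data, node, key, ttl, came_from=None):
    """Collect local matches, then forward to neighbors with ttl - 1."""
    answers = {v for k, v in data.get(node, []) if k == key}
    if ttl > 1:
        for nb in graph[node]:
            if nb != came_from:          # do not bounce the query straight back
                answers |= dt_lookup(graph, data, nb, key, ttl - 1, node)
    return answers

found = dt_lookup(GRAPH, DATA, "N1", 13, ttl=4)
```

In this toy network the value t sits one hop beyond the TTL horizon, so it is missed, which mirrors the "no t!" remark on the slide.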

49 CS 347Notes0849 Optimization Queries have unique identifier Nodes keep cache of recent queries (query id plus TTL)

50 Example (same network as slide 46). N1.DTlookup(13), TTL = 4, Qid=77. The query is forwarded with ttl=3, ttl=2, ttl=1, and the nodes along the way cache [77,4], [77,3], [77,2], [77,1]. A node that sees Qid=77 again avoids it.

51 Example (same network, continued). N1.DTlookup(13), TTL = 4, Qid=77. Cached entries: [77,4], [77,3], [77,2], [77,1]. Duplicate copies of the query arriving with ttl=3, ttl=2, ttl=1 are avoided because Qid=77 is already cached.
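
A sketch of the query cache, assuming each node remembers the highest TTL with which it has already processed a query id and ignores later copies that arrive with an equal or lower TTL. The graph is hypothetical and `flood_with_cache` is an illustrative name.

```python
from collections import deque

# Hypothetical network with a cycle (A, B, C) and one extra node D.
GRAPH = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"], "D": ["C"]}

def flood_with_cache(graph, start, ttl, qid=77):
    """Flood a query; return (nodes that received it, messages sent)."""
    seen = {}                      # node -> {qid: highest ttl processed}
    messages = 0
    queue = deque([(start, ttl)])
    while queue:
        node, t = queue.popleft()
        cache = seen.setdefault(node, {})
        if cache.get(qid, 0) >= t:
            continue               # cached [qid, ttl] entry: avoid this copy
        cache[qid] = t
        if t > 1:
            for nb in graph[node]:
                messages += 1
                queue.append((nb, t - 1))
    return set(seen), messages

reached, messages = flood_with_cache(GRAPH, "A", ttl=3)
```

Storing the TTL next to the query id matters: a copy that arrives later with a larger TTL can still reach nodes the earlier copy could not, so it is processed again.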

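52 Bootstrap. The server keeps a set of known nodes: S = X1, X2, X3, ... Figure: the server with nodes X5, X1, X2, X7. A node contacts the server to get neighbors and is added to S; if a node gives no response, it is removed from S. The server is sometimes called a “pong server”.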
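
A sketch of the bootstrap server's bookkeeping, assuming it hands out a random sample of known nodes and prunes S with a liveness check. The class, method names, and the `responds` callback are hypothetical; the slide only names the three actions.

```python
import random

class BootstrapServer:
    def __init__(self, seed=None):
        self.known = set()                 # the set S of known nodes
        self._rng = random.Random(seed)

    def get_neighbors(self, newcomer, count=2):
        """Return up to count known nodes, then add the newcomer to S."""
        candidates = sorted(self.known - {newcomer})
        picks = self._rng.sample(candidates, min(count, len(candidates)))
        self.known.add(newcomer)
        return picks

    def sweep(self, responds):
        """Remove from S every node for which responds(node) is False."""
        self.known = {n for n in self.known if responds(n)}

server = BootstrapServer(seed=1)
first = server.get_neighbors("X1")      # nobody known yet
second = server.get_neighbors("X2")     # only X1 is known
third = server.get_neighbors("X5")      # both X1 and X2 are returned
server.sweep(lambda n: n != "X2")       # X2 gives no response
```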

53 Problems with Neighborhood Search
Unnecessary messages
High load and traffic
– Example: if nodes have M new neighbors, the number of messages is M^TTL
Low capacity nodes are a bottleneck
Do not find all answers
(Figure: network area searched, bounded by TTL.)
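
The slide's traffic estimate is easy to evaluate. The function name is hypothetical and the formula is the slide's own approximation, M^TTL, not an exact count for any particular topology.

```python
def flood_message_estimate(m, ttl):
    """Slide's estimate: each hop multiplies the message count by m."""
    return m ** ttl

# Four new neighbors per node and TTL = 7 already gives 16384 messages.
estimate = flood_message_estimate(4, 7)
```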

54 Why is Neighborhood Search Good?
Can pose complex queries
Simple robust algorithm
Works well if data is highly replicated
(Figure: network area searched within TTL, containing sites that have latest Justin Bieber song.)

55 CS 347Notes0855 Super-Nodes Regular nodes index their content at super-nodes Super-nodes run neighborhood search SN

56 CS 347Notes0856 Motivation for Super-Nodes Take advantage of powerful nodes Searching larger index better than searching many smaller ones SN

57 CS 347Notes0857 Open Problems Performance Correctness Participation Efficiency Load-balancing Authentic Services Prevention of DoS Incentives Functionality Applications Anonymity

58 CS 347Notes0858 Open Problems: “Bad Guys” Availability (e.g., coping with DOS attacks) Authenticity Anonymity Access Control (e.g., IP protection, payments,...)

59 CS 347Notes0859 Authenticity title: origin of species author: charles darwin date: 1859 body: In an island far, far away...... ?

60 Authenticity title: origin of species author: charles darwin date: 1859 body: In an island far, far away...... ?

