Presentation is loading. Please wait.

Presentation is loading. Please wait.

LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

Similar presentations


Presentation on theme: "LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment"— Presentation transcript:

1 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
CERIA Laboratory LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment W. LITWIN T. SCHWARZ H.YAKOUBEN Paris Dauphine University Santa Clara University (USA) Paris Dauphine University

2 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Plan Objective Overview: SDDS & P2P LH*RSP2P Architecture Addressing Properties Churn Management Conclusion June 8th, 2007 2 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

3 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Objective Design a new SDDS for a structured P2P environment A High available Data Structure and treatment of CHURN LH*RSP2P key search requires at most one forwarding message Very Large Scalable Files High availability to deal with churn At most one forwarding message for key search or insert or scan (fastest known performance) June 8th, 2007 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

4 SDDS (1993) A File of records identified by keys
SDDS client nodes face the applications and send queries to SDDS server nodes No centralized addressing Servers contain application or parity data In buckets Overflowing servers split on new servers Servers do not notify clients about splits

5 SDDS (1993) Clients use images of the file state for addressing
Key based Range queries Scans Images get adjusted towards the file state during queries by Image Adjustment Messages Triggered by incorrect addressing by the client IAMs reflect the file evolution by splits or, rarely, merges. IAMs reflect also the location changes because of failures and recovery

6 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
SDDS Typology Data Structures SDDS(1993) Classics Hash Tree Structured P2P Schemes CHORD... BATON VBI-Tree 1-dimensional d-dimensional 1-d Tree m-d Tree LH*, DDH, EH*, IH*… k-RP*, SD-Rtree, DRT*, RP*, High Availability Security linked also to RP* because of Alg. Sign. LH*m LH*g k-Availability Security LH*sa LH*s Alg. Sign… LH*rs June 8th, 2007 6 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment 6

7 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
SDDS Expansion Growth through splits under inserts New Peer Peer Clients June 8th, 2007 7 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment 7

8 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
SDDS Client Image Evolution Image Adjustment Messaging Clients June 8th, 2007 8 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment 8

9 SDDS 2007 Prototype Available at CERIA site Announced at DbWorld
Managing LH* RS and RP* files In distributed RAM Uner Windows Over 1gbs Ethernet Various functions Response time reaching 30 microsec Up to 300 times faster than disk files

10 P2P (1995 ?) Autonomous nodes store and search data
By flooding in early systems Freenet, Napster, Gnutella… Structured P2P reduce the flooding Using decentralized data structures Distributed Hash Table (DHT) especially Few folks know the concept is due to B. Devine FODO 93 Chord, P-tree, VBI, Baton… Structured P2P schemes are specific SDDS schemes

11 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
LH*RSP2P Addressing Global Addressing Rule a  hi(C ) ; /* a is the address of peer destination of the key C*/ if a < n then a  hi+1(C ) ; /* (i, n) state of an SDDS file, they are only known to the file coordinator node hi (C ) = C mod 2i Client Address Calculus a’  hi’(C ) ; /* a’ is the address of peer destination of the key C*/ if a’ < n’ then a  hi’+1(C ) ; June 8th, 2007 11 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment 11

12 LH*RSP2P File Expansion
File starts with i = 0 and n = 0 and a single data bucket 0 Every bucket m keeps the bucket level j of hash function hi last used to split, j = 0 initially. Overflowing bucket m alerts the coordinator Coordinator notifies bucket n to split Bucket n applies hi + 1 About half of keys migrates to new bucket n + 2i Bucket n and the new one set j = j + 1 Coordinator performs n = n + 1 if n = 2i then i = i + 1 and n = 0

13 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Architecture based on LH*RS LH*RSP2P Peer LH*RS Client DB Candidate Peer Client & Spare Storage LH*RSP2P Peer j i’ n’ Client Part Server Part LH*P2P Peer LH*RS Client LH*RS PB Pupils LH*RSP2P Peer June 8th, 2007 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

14 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Peer & Pupil Image Adjustment During Peer Split i’ = j  ; /* j value before the split n‘ = a +1  /* a is the splitting bucket if n’ = 2i’ then i’ = j + 1 ;  n’ = 0 ; June 8th, 2007 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

15 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Example i’= j =1; n’= m+1= 1+1; If n’=21 then n’=0; i’= i’+1 and (i’, n’)= (2,0) After splitting j=2 i’=2 n’=0 CP P0 i’=1 n’=1 P2 P1 i=2 n=0 P3 Before splitting Coordinator Peer (CP) P0 j=2 i’=1 n’=1 P2 P1 i=1 n=1 j=1 n’=0 June 8th, 2007 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

16 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Server Address Calculus a’  hj (C ) ; if a’= a then exit /* Bucket a is the correct one else send C to bucket a /* Forwarding to bucket a’ exit; Hanafi, please note the new alg. Simpler and faster than for LH* As only one forwarding is possible June 8th, 2007 16 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment 16

17 Peer Image Adjustment by IAM
IAM comes from the correct bucket Bucket a is the forwarding one Bucket level j is that of the correct bucket 0f the forwarding one as well i’ j - 1, n’ a + 1 ; if n’ = 2i’ then n’ 0 ; i’ i’ + 1 ; Erreur dans l’algo (> au lieu de =) Same algorithm as for the adjustment of the local client and of pupils after a split June 8th, 2007 17 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment 17

18 Peer Image Adjustment by IAM
Checking and forward the key using A2 9 IAM a = 1 j = 4 PC i =3 n=2 P0 j=4 i’=3 n’=1 Pairs P4 j=3 i’=2 P9 n’=2 P1 9 June 8th, 2007 18 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment 18

19 Peer Image Adjustment by IAM
9 IAM a = 1 j = 4 P1 j=4 i’=3 n’=2 P9 j=4 i’=3 n’=2 Pairs j=4 j=3 Hanafi: corriger les chiffres qui restent en Italiques. Idem sur les autres diapos. Puis, noter les corrections des erreurs dans i’ et n’ (voir les valeurs dans la presentation originale) i’=3 n’=1 i’= 3 n’= 2 9 i =3 n=2 P0 P4 PC June 8th, 2007 19 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment 19

20 LH*RSP2P Example of the File Expansion j=3 i’=2 n’=1 j=2 i’=1 n’=2 j=3
TUTOR, Update Pupil Example of the File Expansion PC i=2 n=2 P0 j=3 i’=2 n’=1 Peers P2 j=2 i’=1 P5 n’=2 P6 j=3 i’=2 n’=3 i’=2 n’=3 j=3 Assign a Tutor for Candidate Peer: LH-hash of the client IP Address Candidate Peer i’=0 n’=0 Pupil i’=2 n’=1 i=2 n=3 June 8th, 2007 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

21 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Properties of LH*RSP2P : The maximal number of forwarding messages for the key search is one. The maximal number of rounds for the scan search can be two. The worst case addressing performance of LH*RSP2P as defined by Property 1 is the fastest possible for any SDDS or a practical structured P2P addressing scheme. June 8th, 2007 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

22 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Proof Property 1 Case 1 : i’ = i and n’ < n Peer a addresses peer a’, using its image (i’,n’) from last split No IAM came since. j = i’+1 j = i’ a n a’ 2i’ n+2i’ a+2i’ No forwarding June 8th, 2007 22 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

23 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Proof Property 1 Case 1 : i’ = i and n’ < n Peer a addresses peer a’, using its image (i’,n’) from last split No IAM came since. j = i’+1 j = i’ a n 2i’ n+2i’ a’ a+2i’ Forwarding possible for any address a’ between (a, n) June 8th, 2007 23 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

24 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Proof Property 1 Case 2 : i = i’ + 1 and n < n’ Peer a addresses peer a’, using its image (i’,n’) from last split No IAM came since. j = i’+2 j = i’+1 j = i’ n a 2i’ 2i’+1 n+2i’+1 a’ Forwarding possible for any address a’ beyond [n, a] June 8th, 2007 24 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

25 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Proof Property 2 Peer a sends the scan to all buckets in its image Including its image (i’, n’) Receiving peer a’ can have bucket level j as in the image j (a) = j’ (a) No forwarding of the scan Or, bucket a’ split Once and only once j (a) = j’ (a) + 1 See the figs for the key address calculus June 8th, 2007 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

26 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Proof Property 2 Peer a’ forwards the scan to its (only) child No child can have a child Peer a would first need to split again as well Every peer gets thus the scan and only once There at worst two rounds June 8th, 2007 26 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

27 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Proof Property 2 The only faster worst case performance is zero forwarding messages Every split has to be notified then to every peer It would be against the scalability goal of every SDDS & structured P2P scheme June 8th, 2007 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

28 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
LH*RSP2P Churn Management Bucket reliability group with k parity buckets protect against up to k bucket failures per group 5 4 3 2 1 Parity Peer Data Peer Rank Parity Record Data Record Tutoring records June 8th, 2007 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

29 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
LH*RSP2P Churn Management Peer leaves with notice Say that’s OK j j j … … Coordinator Peer i’,n’ i’,n’ i’,n’ i’,n’ Notification P0 Pl Pm Candidate Peer June 8th, 2007 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

30 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
LH*RSP2P Churn Management Peer leaves without notice or fails LH*RS Bucket Recovery j j j Forward Coordinator Peer i’,n’ i’,n’ i’,n’ i’,n’ Pl-1 Pl Pm Parity Peer Query June 8th, 2007 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

31 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
LH*RSP2P Churn Management Peer leaves without notice or fails LH*RS Bucket Recovery j j j i’,n’ Pl Coordinator Peer i’,n’ i’,n’ i’,n’ Answer Pl-1 Pm Parity Peer June 8th, 2007 31 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

32 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
LH*RSP2P Churn Management j Sure Search : Protects against outdated server read (transient communication or peer failure) Answer i’,n’ j j j Pl Coordinator Peer i’,n’ i’,n’ i’,n’ i’,n’ Pl-1 Pl Pm Parity Peer Query June 8th, 2007 32 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

33 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Conclusion LH*RSP2P require at most one forward message when addressing error occur Is the fastest known SDDS and P2P key based addressing algorithm Protects efficiently against churn Allows to manage very large scalable files Should have numerous applications June 8th, 2007 33 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment 33

34 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
Current & Future Work Implementation of the peer node architecture and of tutoring functions Using existing LH*RS prototype Created by Rim Moussa & shown at VLDB 2004 Performance Analysis Variants June 8th, 2007 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

35 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
END Thank you for Your Attention Work partly funded by the IST eGov-Bus project June 8th, 2007 35 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment 35

36 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
References [1] Adina Crainiceanu, Prakash Linga, Johannes Gehrke, and Jayavel Shanmugasundaram. Querying Peer-to-Peer Networks Using P-Trees. In Proceedings of the Seventh International Workshop on the Web and Databases (WebDB 2004). , June 2004. [2] Bolosky W. J, Douceur J. R, Howell J. The Farsite Project: A Retrospective. Operating System Review, April 2007, p.17-26 [3] Devine R. Design and Implementation of DDH: A Distributed Dynamic Hashing Algorithm, Proc. Of the 4th Intl. Foundation of Data Organisation and Algorithms –FODO, 1993. [4] Litwin, W. Neimat, M-A., Schneider, D. LH*: Linear Hashing for Distributed Files. ACM- SIGMOD Int. Conf. On Management of Data, 93. [5] Litwin, W., Neimat, M-A., Schneider, D. LH*: A Scalable Distributed Data Structure. ACM- TODS, (Dec., 1996). [6] Litwin, W., Neimat, M-A. High Availability LH* Schemes with Mirroring, Intl. Conf on Cooperating systems, , IEEE Press 1996. [7] Litwin, W. Moussa R, Schwarz T. LH*rs- A Highly Available Distributed Data Storage. Proc of 30th VLDB Conference, , 2004. [8] Litwin, W. Moussa R, Schwarz T. LH*rs- A Highly Available Scalable Distributed Data Structure. ACM-TODS, Sept 2005. [9] Steven D. Gribble, Eric A. Brewer, Joseph M. Hellerstein, and David Culler. Scalable, Distributed Data Structures for Internet Service Construction, Proceedings of the Fourth Symposium on Operating Systems Design and Implementation (OSDI 2000) [10] Stoica, Morris, Karger, Kaashoek, Balakrishma. CHORD : A scalable Peer to Peer Lookup Service for Internet Application. SIGCOMM’O, August 27-31, 2001, June 8th, 2007 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment

37 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment
June 8th, 2007 37 LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment 37


Download ppt "LH*RSP2P: A Scalable Distributed Data Structure for P2P Environment"

Similar presentations


Ads by Google