Presentation is loading. Please wait.

Presentation is loading. Please wait.

PeerDB : A P2P-based System for Distributed Data Sharing DB Lab. M.S. 3 LEE MIN YOUNG Wee Siong Ng, Beng Chin Ooi, Kian-Lee Tan, Aoying Zhou, 19th International.

Similar presentations


Presentation on theme: "PeerDB : A P2P-based System for Distributed Data Sharing DB Lab. M.S. 3 LEE MIN YOUNG Wee Siong Ng, Beng Chin Ooi, Kian-Lee Tan, Aoying Zhou, 19th International."— Presentation transcript:

1 PeerDB : A P2P-based System for Distributed Data Sharing DB Lab. M.S. 3 LEE MIN YOUNG Wee Siong Ng, Beng Chin Ooi, Kian-Lee Tan, Aoying Zhou, 19th International Conference on Data Engineering, 2003

2 Overview  Introdution  BestPeer  PeerDB Implementation  Performance Study  Conclusion

3 Introduction

4 Background: P2P vs DDBS  P2P  DDBS CLIENT DBMS CLIENT Ad-hocMembershipStable nodes No global schemaSchemaShared schema IncompleteQuery result setComplete “Word of mouth”Content locationShared catalog

5 P2P DDBS  Limitation of existed P2P system Coarse- grained granularity (file level sharing) Limited in extensibility and flexibility  PeerDB Each node is a DBMS No global schema Incomplete answers possible Fine granularity, content searching

6 BestPeer

7  A general purpose peer-to-peer platform  Key features: LIGLO server (Location-Independent Global Names Lookup Server) Self-configurable neighborhood maintenance policy Agent-based service mechanism  Share the resource Data and information Storage power Computational power

8 BestPeer - LIGLO Server

9 BestPeer - Self-configurable Network

10 PeerDB Implementation

11 PeerDB - Architecture Local Data (schema, keywords) Database Sharable Data Data Management System : MySql meta-data stored in Local Dictionary and Export Dictionary DBAgent : for mobile agent Master agent that manages the queries Dispatches worker agents to neighboring nodes

12 PeerDB – no global schema  Problems with supporting SQL queries: No global schema information Different nodes could name the same table/attribute differently ( “ len ”, “ length ” )  Solution User supplies metadata for each relation name and attribute P1 P4 P2 P3 SeqID Length proteinSeq Kinases ID seqLen ProteinKLen SeqNo Len sequence Protein Name char Protein ID sequence ProteinKSeq Protein, kinases,length Number, identifier Length Protein, sequence Number, identifier Sequence ProteinKLen ID seqLen ProteinKSeq ID sequence Protein, human Key, identifier, ID Length Sequence Kinases SeqID Length proteinSeq Local/Export Dictionary Protein number, identifier Length Sequence Protein SeqNo Len sequence Protein, kinase name Features, functons Protein name char

13 Local Query Processing  Parse Query to Extract the list of relations and attributes name  Check Local Dictionary for matching relations Promising relations Return to the user  flood to all neighbors :query, IP, TTL  Wait for responses Display results to user as they arrive  User selects the relations he/she wants Create a “Data Retrieval Agent” Rewrite query in terms of new relations  If local, submit SQL to local db  Contact remote nodes directly to access the data Phase 1 Phase 2

14 Local Query Processing P1 P4 P2 P3 Protein, human Key, identifier, ID Length Sequence Kinases SeqID Length proteinSeq Local Dictionary P5 P6 User 1. query2. query, IP, TTL 3. schema 4.Show shcemas P1 : Kinases P2: Protein P3: ProteinKLen P4. Protein P1 1. SQL2. result SeqID Length proteinSeq

15 Remote Query Processing  Find relations Relation Matching Agents flood with TTL Check Export Dictionary for a match  Return matches directly  Get data Data Retrieval Agent submits SQL to DBMS Return data to the requesting node directly Run further data processing before returning Phase 1 Phase 2

16 Remote Query Processing 1. query P1 P4 P2 P3 P5 P6 User 2. query, IP, TTL 3. schema 4.Show shcemas P1 : Kinases P2: Protein P3: ProteinKLen P4. Protein P2 1. SQL 2. result Protein SeqNo Len sequence Protein number, identifier Length Sequence Export Dictionary SeqNo Len sequence Protein

17 PeerDB  Statistics Reconfiguration based on the policy selected by the user Relation information, number of answer objects returned  Caches Cache all query results locally LRU replacement Users choose which copy they want

18 Performance Study

19 Rate of Returning Answers Client/Server :Returns via the search path PDMR :after each query, PDMR reconfigure itselt :reach out to more promising nodes directly

20 Performance Study First search query % answers returned

21 Performance Study Completion time vs Data size Communication Overhead

22 Conclusion  PeerDB Each node is a DBMS No global schema Incomplete answers possible Fine granularity, content searching


Download ppt "PeerDB : A P2P-based System for Distributed Data Sharing DB Lab. M.S. 3 LEE MIN YOUNG Wee Siong Ng, Beng Chin Ooi, Kian-Lee Tan, Aoying Zhou, 19th International."

Similar presentations


Ads by Google