Research Practicum (Forschungspraktikum): Analysis of different Replication and Caching Strategies in the P2P Search Engine Minerva. Daniel Dahrendorf. Oberseminar D5: Databases.


1 Research Practicum (Forschungspraktikum): Analysis of different Replication and Caching Strategies in the P2P Search Engine Minerva. Daniel Dahrendorf. Oberseminar D5: Databases and Information Systems. Supervisor: Dipl.-Inf. Christian Zimmer. 27.02.2007

2 Caching and replication are used in many areas of computer science, for example: Internet applications, hard disks, CPUs.

3 Google uses caching to keep up with the immense workload, and replication for reliability.

4 Problems in MINERVA: Problem 1. Each query is computed from scratch, although another peer already has the result for the query.

5 Problems in MINERVA: Problem 2. If a peer disconnects from the network, information is lost (documents and PeerLists). Only the data is lost; the system keeps running.

6 Question: How can these problems be solved in MINERVA?

7 Approaches. Caching: use result caching to avoid duplicated work; improve cached results to get better results; approximate non-cached queries to save traffic. Replication: replicate data and PeerLists for system stability.

8 Caching

9 CachedResult: ResultList, term statistics, document statistics, list of peers already asked.
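
The four parts of a CachedResult listed on this slide could be modelled as a small record. This is only a sketch: the slide names the parts but not their types, so the field types and the example values below are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class CachedResult:
    """Hypothetical container for the four parts named on the slide."""
    # Ranked (document id, score) pairs, best first (assumed layout).
    result_list: list = field(default_factory=list)
    # Per-term statistics, e.g. term -> document frequency (assumed).
    term_statistics: dict = field(default_factory=dict)
    # Per-document statistics, e.g. document id -> length (assumed).
    document_statistics: dict = field(default_factory=dict)
    # Ids of the peers whose DocumentLists already went into the result.
    asked_peers: set = field(default_factory=set)


# Made-up example values, for illustration only.
example = CachedResult(
    result_list=[("doc7", 0.9), ("doc2", 0.4)],
    term_statistics={"caching": 120},
    document_statistics={"doc7": 350},
    asked_peers={"peer3"},
)
```

Keeping the set of asked peers inside the record is what later allows a querying peer to skip peers that have already contributed.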

10 Organization of the cache in Minerva. A global but decentralized cache is used: each CachedResult is stored by the peer that manages the lexicographically smallest term of the query. Example: Peer a stores the CachedResults for the queries abc, ab, ac and a; Peer b for bc and b; Peer c for c.
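
The placement rule can be sketched in a few lines. The function names and the `term_to_peer` mapping are hypothetical stand-ins for the distributed directory; the sketch also assumes that "lexicographically smallest" means plain string ordering.

```python
def responsible_term(query_terms):
    """Return the lexicographically smallest term of a query."""
    if not query_terms:
        raise ValueError("a query needs at least one term")
    return min(query_terms)


def cache_peer(query_terms, term_to_peer):
    """Return the peer that stores the CachedResult for this query.

    term_to_peer is an assumed lookup table (term -> managing peer);
    in a real P2P system this would be a directory lookup.
    """
    return term_to_peer[responsible_term(query_terms)]


# Hypothetical directory: each term is managed by one peer.
term_to_peer = {"a": "Peer a", "b": "Peer b", "c": "Peer c"}
```

With this mapping, `cache_peer(["c", "b"], term_to_peer)` yields `"Peer b"`, because `b` is the smallest term of the query.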

11 Result caching: 1. Send the terms and the whole query to the responsible peers. 2. Get the PeerLists (and the CachedResult). 3. Get the DocumentLists and compute the result. 4. Send the CachedResult to the responsible peer.
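
The four steps can be traced with a toy, single-process sketch. Everything here is an assumption made for illustration: the directory and the cache are plain dictionaries, `fetch_documents` is a hypothetical callback standing in for a network request, and document scores from different peers are merged by taking the maximum.

```python
def answer_query(terms, peer_lists, cache, fetch_documents, k=10):
    """Answer a query, reusing a cached result when one exists."""
    key = frozenset(terms)
    # Steps 1-2: look up the PeerLists for all terms and any CachedResult.
    peers = sorted({p for t in terms for p in peer_lists.get(t, [])})
    if key in cache:
        return cache[key]
    # Step 3: collect DocumentLists from the peers and compute the result.
    scores = {}
    for peer in peers:
        for doc_id, score in fetch_documents(peer, terms):
            scores[doc_id] = max(score, scores.get(doc_id, 0.0))
    result = sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:k]
    # Step 4: hand the result back so that later queries can reuse it.
    cache[key] = result
    return result
```

A second call with the same terms, in any order, returns the stored result without contacting any peer again.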

12 Improving CachedResults: 1. Send the terms and the whole query to the responsible peers. 2. Get the PeerLists and the CachedResult. 3. Get DocumentLists from peers that have not been queried yet. 4. Compute the result and update the CachedResult. 5. Send the CachedResult to the responsible peer.
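
Steps 3 and 4 might look as follows. This is a hedged sketch, not the presented implementation: the CachedResult is reduced to a dictionary with a score map and a set of asked peers, `fetch_documents` is a hypothetical callback, and the limit `max_new_peers` and the max-score merge are assumptions.

```python
def improve_cached_result(cached, peer_list, fetch_documents,
                          max_new_peers=5, k=10):
    """Extend a cached result with documents from peers not asked before."""
    # Step 3: only contact peers that have not contributed yet.
    new_peers = [p for p in peer_list
                 if p not in cached["asked_peers"]][:max_new_peers]
    scores = dict(cached["results"])
    for peer in new_peers:
        for doc_id, score in fetch_documents(peer):
            scores[doc_id] = max(score, scores.get(doc_id, 0.0))
    # Step 4: recompute the top-k and remember who has been asked.
    top = sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:k]
    return {
        "results": dict(top),
        "asked_peers": cached["asked_peers"] | set(new_peers),
    }
```

The function returns a new record instead of changing the old one, so the caller decides when to send the update to the responsible peer (step 5).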

13 Approximation of results. Idea: union of cached ResultLists. Approximation is only used if enough subCachedResults are stored in the network. It is called approximation because PeerLists and ResultLists are cut off.
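
One possible reading of this idea is sketched below. It assumes that a subCachedResult is a cached result for a proper subset of the query terms, that "enough" means a fixed threshold (`min_sub_results`, a made-up parameter), and that scores are merged by taking the maximum; none of these details are given on the slide.

```python
def approximate_result(query_terms, cache, min_sub_results=2, k=10):
    """Approximate a non-cached query from cached sub-query results.

    cache maps frozenset(terms) -> list of (doc_id, score) pairs.
    Returns None when too few sub-results are available.
    """
    query = frozenset(query_terms)
    # Cached results whose terms are a proper subset of the query.
    subs = [results for terms, results in cache.items() if terms < query]
    if len(subs) < min_sub_results:
        return None
    # Union of the cached ResultLists, keeping the best score per document.
    scores = {}
    for results in subs:
        for doc_id, score in results:
            scores[doc_id] = max(score, scores.get(doc_id, 0.0))
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:k]
```

When the function returns `None`, the caller would fall back to the normal query procedure.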

14 Experimental setting for caching: 250,000 documents; 50 peers with 2000 documents (no overlap); 20 different queries (2 to 4 terms). Each experiment consists of 10 rounds; in each round 5 more peers are queried. Relative recall is used to determine the quality.
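
The slide does not define relative recall. A common definition, assumed here, is the fraction of a reference result list (for example, the top-k of a centralized index over all documents) that also appears in the result obtained from the peers.

```python
def relative_recall(retrieved, reference):
    """Fraction of the reference documents that were also retrieved.

    Assumed definition: |retrieved ∩ reference| / |reference|.
    """
    reference_set = set(reference)
    if not reference_set:
        raise ValueError("the reference result list must not be empty")
    return len(reference_set & set(retrieved)) / len(reference_set)
```

For example, if the reference list holds four documents and two of them are retrieved, the relative recall is 0.5.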

15 Experiments: Improving results through caching

16 Experiments: Approximation results

17 Replication

18 Ideas for replication in Minerva. Data replication: if a peer has computed the ResultList for a query, it downloads the top-k documents for this query. Directory replication: the successors of each peer also store the PeerLists; a second Chord ring is used to build up a parallel directory.
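
The successor-based variant of directory replication can be illustrated on a simplified ring. This sketch assumes integer peer ids, a responsible peer defined as the first peer whose id is not smaller than the key id (wrapping around), and a made-up replication degree `replicas`; it does not model Chord's routing.

```python
from bisect import bisect_left


def replica_holders(ring, key_id, replicas=2):
    """Return the responsible peer followed by its next successors.

    ring is a collection of integer peer ids; replicas is the assumed
    number of successors that keep a copy of the PeerList.
    """
    ring = sorted(ring)
    # First peer with id >= key_id; wrap to the start if none exists.
    start = bisect_left(ring, key_id) % len(ring)
    count = min(replicas + 1, len(ring))
    return [ring[(start + i) % len(ring)] for i in range(count)]
```

On a ring with the peers 10, 20, 30 and 40, a key with id 25 is managed by peer 30 and, with two replicas, copied to peers 40 and 10.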

19 Outlook: Ensure that a CachedResult is up to date (using a time to live and updating the CachedResult automatically). Prefetch CachedResults. Improve the replacement algorithms for the cache. Combine replication and caching.
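
A time-to-live check of the kind mentioned in the outlook could be as small as the following sketch. The function name, the use of seconds, and the idea of storing a timestamp with each CachedResult are assumptions; the outlook only names the concept.

```python
import time


def is_fresh(stored_at, ttl_seconds, now=None):
    """Return True while a cached entry is younger than its time to live.

    stored_at and now are timestamps in seconds; now defaults to the
    current time so that tests can pass a fixed value.
    """
    now = time.time() if now is None else now
    return (now - stored_at) < ttl_seconds
```

An entry that is no longer fresh would be recomputed or refreshed before it is returned.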

20 Conclusions: Caching reduces traffic. Improving CachedResults yields better results at the same cost. Approximation is a further method to reduce traffic, but it needs enough CachedResults. Combining replication and caching promises an increased effect.

