Top-k Query Evaluation with Probabilistic Guarantees By Martin Theobald, Gerald Weikum, Ralf Schenkel.


1 Top-k Query Evaluation with Probabilistic Guarantees By Martin Theobald, Gerald Weikum, Ralf Schenkel

2 Content: Problem; Past algorithms; Contribution in this paper; Approach (differences); Results, observations, and conclusion

3 Relevance Searching: Users are interested in only one or a few relevant and novel data items/links. The user may not care if some of the links are not that useful. Precision: the fraction of the returned top-k that is actually in the true top-k.
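
The precision measure on this slide can be sketched in a few lines. This is a minimal illustration, not code from the paper; the function name `precision_at_k` and the list-based inputs are assumptions made for the example.

```python
def precision_at_k(approx_top_k, true_top_k):
    """Fraction of the returned top-k items that also appear in the true top-k.

    Both arguments are sequences of item identifiers (hypothetical inputs).
    """
    if not approx_top_k:
        return 0.0
    true_set = set(true_top_k)
    hits = sum(1 for item in approx_top_k if item in true_set)
    return hits / len(approx_top_k)


# Two of the four returned items are in the true top-4.
example = precision_at_k(["a", "b", "c", "d"], ["a", "b", "x", "y"])
```

An approximate algorithm that returns exactly the true top-k has precision 1.0; the probabilistic methods in the later slides accept values below that in return for less work.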


5 Algorithms we have learned: Fagin’s TA algorithm. TA-Random: the problem is that random accesses are expensive. TA-Sorted: the problem is that sorted indices may not always be available.
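
As a reminder of how the threshold algorithm with random accesses works, here is a minimal sketch. It assumes summation as the aggregation function, non-negative scores, and index lists given as Python lists of `(doc_id, score)` pairs sorted by descending score; the function name and data layout are illustrative, not taken from the slides.

```python
import heapq


def threshold_algorithm(lists, k):
    """Sketch of TA with random accesses, assuming sum aggregation.

    lists: one list per query dimension, each holding (doc_id, score)
           pairs sorted by descending score. A document missing from a
           list is assumed to score 0.0 there.
    Returns the top-k as (total_score, doc_id) pairs, best first.
    """
    lookup = [dict(lst) for lst in lists]  # stands in for random-access indexes
    seen = set()
    top = []  # min-heap holding the current top-k as (total_score, doc_id)
    depth = max((len(lst) for lst in lists), default=0)
    for pos in range(depth):
        last_seen = []
        for lst in lists:
            if pos >= len(lst):
                last_seen.append(0.0)  # list exhausted: nothing unseen remains
                continue
            doc, score = lst[pos]  # sorted access
            last_seen.append(score)
            if doc not in seen:
                seen.add(doc)
                # Random accesses: fetch the score in every dimension.
                total = sum(index.get(doc, 0.0) for index in lookup)
                if len(top) < k:
                    heapq.heappush(top, (total, doc))
                elif total > top[0][0]:
                    heapq.heapreplace(top, (total, doc))
        # No unseen document can score higher than this threshold.
        threshold = sum(last_seen)
        if len(top) == k and top[0][0] >= threshold:
            break
    return sorted(top, reverse=True)


result = threshold_algorithm(
    [
        [("a", 0.9), ("b", 0.8), ("c", 0.1)],
        [("b", 0.9), ("a", 0.3), ("c", 0.2)],
    ],
    k=1,
)
```

Every newly seen document triggers one lookup per dimension, which is the random-access cost the slide refers to.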


7 Contribution: A probabilistic threshold test p(d). Looking at the part of the score seen so far, it asks: “What is the probability that the tuple can still be in the final top-k?”
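
One way to read the test is as a comparison of p(d) against a cutoff. The sketch below assumes a candidate is kept only if the probability that its unseen score closes the gap to min-k is at least some `epsilon`; the names `passes_threshold_test`, `uniform_exceed_prob`, and `epsilon`, and the single-unseen-list uniform model, are assumptions made for illustration rather than the paper's exact formulation.

```python
def uniform_exceed_prob(high):
    """Return a function giving P[S > delta] for S uniform on [0, high].

    Models a single unseen list whose scores are bounded by `high`
    (a simplifying assumption for this example).
    """
    def exceed(delta):
        if high <= 0:
            return 0.0
        return min(1.0, max(0.0, 1.0 - delta / high))
    return exceed


def passes_threshold_test(seen_score, min_k, exceed_prob, epsilon):
    """Keep the candidate if p(d) = P[unseen score > min_k - seen_score] >= epsilon."""
    delta = min_k - seen_score
    if delta < 0:
        return True  # the seen part alone already beats min-k
    return exceed_prob(delta) >= epsilon


predictor = uniform_exceed_prob(high=0.8)
keep_strong = passes_threshold_test(0.5, 0.9, predictor, epsilon=0.1)
keep_weak = passes_threshold_test(0.15, 0.9, predictor, epsilon=0.1)
```

Passing the predictor as a function keeps the test independent of how the unseen score is modelled, so a histogram-based or other estimator can be substituted without changing the test itself.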



10 Approach: Probabilistic score prediction using uniform distributions, histograms, or Poisson distributions. The Poisson approximation is computationally cheaper than histograms.

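11 Histogram: Probability per bucket over value ranges; the bucket probabilities sum to 1 (∑ Probability = 1). [Figure: histogram of probability against score value ranges]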
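
To predict the sum of several unseen scores from per-list histograms, the histograms can be convolved. The following is a rough sketch under stated assumptions: all histograms are equi-width with the same bucket width, buckets start at 0, each bucket is represented by its midpoint, and the lists are independent. The function names are hypothetical.

```python
def convolve(p, q):
    """Distribution of the sum of two independent bucket-index variables."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out


def tail_probability(histograms, width, delta):
    """Estimate P[sum of unseen scores > delta] from equi-width histograms.

    histograms: list of probability lists, one per unseen list; entry i of
                a histogram covers [i * width, (i + 1) * width).
    """
    combined = [1.0]  # point mass: the sum of zero variables
    for hist in histograms:
        combined = convolve(combined, hist)
    m = len(histograms)
    # Index s is a sum of m bucket indices; with midpoints as representatives
    # it stands for the value (s + m / 2) * width.
    return sum(prob for s, prob in enumerate(combined)
               if (s + m / 2) * width > delta)


# Two unseen lists, each with two buckets [0, 0.5) and [0.5, 1.0).
hist = [0.5, 0.5]
p_exceed = tail_probability([hist, hist], width=0.5, delta=0.9)
```

The nested loop in `convolve` makes the cost grow with the product of the bucket counts, which is one reason a cheaper approximation can be attractive.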

12 Algorithms: Conservative, Aggressive, Progressive, and Smart.

13 Conservative Algorithm: Simply predicts the score of each candidate object in every step. Maintains a priority queue for each group of candidates that share the same unseen part of the score. Incurs very high overhead for the probabilistic threshold test.

14 Aggressive Algorithm: If the score of an object falls below the threshold min-k, the algorithm stops immediately. Overhead is minimal, but result precision is low.

15 Progressive Algorithm: A middle ground between conservative and aggressive. Tracks changes in the best score at uniform intervals. Maintains a single priority queue.

16 Smart Algorithm: Rebuilding the entire queue is costly when the queue is large, as with big datasets. Maintains only a bounded priority queue: whenever it is rebuilt, only the best b elements are kept.
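
The bounded rebuild can be sketched with Python's `heapq` module. This assumes candidates are held in a dictionary from document id to best possible score; that layout and the function name `rebuild_bounded_queue` are illustrative assumptions, not details from the slides.

```python
import heapq


def rebuild_bounded_queue(candidates, b):
    """Rebuild the candidate queue, keeping only the b best elements.

    candidates: dict mapping doc_id -> best possible score (assumed layout).
    Returns a heap of (-score, doc_id) so the best candidate pops first,
    since heapq implements a min-heap.
    """
    best = heapq.nlargest(b, candidates.items(), key=lambda kv: kv[1])
    queue = [(-score, doc) for doc, score in best]
    heapq.heapify(queue)
    return queue


queue = rebuild_bounded_queue({"a": 0.9, "b": 0.2, "c": 0.7, "d": 0.5}, b=2)
neg_score, best_doc = heapq.heappop(queue)
```

Candidates outside the best b are simply dropped at rebuild time, which bounds the queue size at the price of possibly discarding an item that would have reached the final top-k.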


18 Experiment

19 Conclusion: Probabilistic score predictions can substantially reduce execution time in exchange for some loss of top-k result quality.

