Published by Gloria Long. Modified over 8 years ago.
1
On Random Sampling over Joins
Surajit Chaudhuri, Rajeev Motwani, Vivek Narasayya
Presented by: Srikantha Nema
2
Outline
Semantics of Sample
Difficulty of join sampling
Algorithms for sampling
Sampling strategies
New strategies for join sampling
Experimental evaluation
Conclusions
3
Terminology
SAMPLE(R, f) is an SQL operation.
R is the relation obtained when a query Q is evaluated.
f is the fraction of R to be sampled.
4
Semantics of Sample
Sampling with Replacement (WR)
Sampling without Replacement (WoR)
Independent Coin Flips (CF)
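The three semantics can be contrasted with a minimal Python sketch. The function names, the in-memory list standing in for a relation, and the choice to round f·n to an integer sample size are assumptions made for illustration, not part of the slides.

```python
import random

def sample_wr(rows, f, rng):
    """WR: draw f*n tuples uniformly and independently; duplicates may occur."""
    k = round(f * len(rows))
    return [rng.choice(rows) for _ in range(k)]

def sample_wor(rows, f, rng):
    """WoR: draw f*n tuples from distinct positions; no position is picked twice."""
    k = round(f * len(rows))
    return rng.sample(rows, k)

def sample_cf(rows, f, rng):
    """CF: keep each tuple independently with probability f; size is random."""
    return [t for t in rows if rng.random() < f]

if __name__ == "__main__":
    rng = random.Random(0)
    rows = list(range(100))
    print(len(sample_wr(rows, 0.1, rng)))   # sample size fixed by f
    print(len(sample_wor(rows, 0.1, rng)))  # sample size fixed by f
    print(len(sample_cf(rows, 0.1, rng)))   # sample size varies from run to run
```

WR and WoR fix the sample size in advance, while CF only fixes its expectation at f·n.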
5
Difficulty of Join Sampling
6
Classification of Join Sampling problem
Case A: No information is available for either R1 or R2
Case B: No information is available for R1, but indexes and/or statistics are available for R2
Case C: Indexes/statistics are available for R1 and R2
7
Algorithms for Sampling
Unweighted Sequential WR Sampling: Black-Box U1, Black-Box U2
Weighted Sequential WR Sampling: Black-Box WR1, Black-Box WR2
8
Unweighted Sequential WR Sampling
Black-Box U1
Black-Box U2
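The slide gives only the names of the two black boxes. The sketch below shows one plausible reading, under these assumptions: U1 knows the relation size n in advance and allocates copies to each tuple with a binomial draw, while U2 does not know n and keeps a reservoir of r slots. The function names and the Bernoulli-sum `binomial` helper are illustrative choices, not taken from the slides.

```python
import random

def binomial(x, p, rng):
    """Draw from Binomial(x, p) by summing x Bernoulli trials (simple, not fast)."""
    return sum(rng.random() < p for _ in range(x))

def black_box_u1(stream, n, r, rng):
    """One-pass WR sample of size r when the number of tuples n is known.

    For the i-th tuple (0-based), the number of output copies is
    Binomial(x, 1/(n - i)), where x is the number of samples still owed.
    """
    out = []
    x = r
    for i, t in enumerate(stream):
        if x == 0:
            break
        copies = binomial(x, 1.0 / (n - i), rng)
        out.extend([t] * copies)
        x -= copies
    return out

def black_box_u2(stream, r, rng):
    """One-pass WR sample of size r when n is unknown.

    Each of the r slots independently takes the N-th tuple with probability 1/N.
    Slots stay None if the stream is empty.
    """
    reservoir = [None] * r
    seen = 0
    for t in stream:
        seen += 1
        for j in range(r):
            if rng.random() < 1.0 / seen:
                reservoir[j] = t
    return reservoir

if __name__ == "__main__":
    rng = random.Random(0)
    print(black_box_u1(iter(range(20)), n=20, r=5, rng=rng))
    print(black_box_u2(iter(range(20)), r=5, rng=rng))
```

In this sketch U1 can emit its output as tuples stream by, whereas U2 can only release the sample after the whole input has been seen.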
9
Weighted Sequential Sampling
Black-Box WR1
Black-Box WR2
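As a hedged illustration of weighted sequential WR sampling, the sketch below assumes WR1 is given the total weight W up front and WR2 is not. Each stream element is a hypothetical `(tuple, weight)` pair with non-negative weights; the names and the Bernoulli-sum `binomial` helper are assumptions for illustration.

```python
import random

def binomial(x, p, rng):
    """Draw from Binomial(x, p) by summing x Bernoulli trials (simple, not fast)."""
    return sum(rng.random() < p for _ in range(x))

def black_box_wr1(stream, total_weight, r, rng):
    """Weighted WR sample of size r when the total weight W is known.

    A tuple of weight w gets Binomial(x, w / (W - D)) copies, where D is the
    weight already streamed past and x is the number of samples still owed.
    With floating-point weights the last ratio may fall just short of 1, so
    integer weights are the safe case for getting exactly r outputs.
    """
    out = []
    x = r
    consumed = 0
    for t, w in stream:
        if x == 0:
            break
        p = min(1.0, w / (total_weight - consumed))
        copies = binomial(x, p, rng)
        out.extend([t] * copies)
        x -= copies
        consumed += w
    return out

def black_box_wr2(stream, r, rng):
    """Weighted WR sample of size r when the total weight is unknown.

    Each slot independently takes the current tuple with probability
    w / (running total weight, including the current tuple).
    """
    reservoir = [None] * r
    running = 0
    for t, w in stream:
        if w <= 0:
            continue  # zero-weight tuples can never be sampled
        running += w
        for j in range(r):
            if rng.random() < w / running:
                reservoir[j] = t
    return reservoir

if __name__ == "__main__":
    rng = random.Random(0)
    data = [("a", 1), ("b", 0), ("c", 3)]
    print(black_box_wr1(iter(data), total_weight=4, r=8, rng=rng))
    print(black_box_wr2(iter(data), r=8, rng=rng))
```

Setting every weight to 1 reduces both functions to the unweighted case.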
10
Sampling Strategies (old)
Strategy Naïve-Sample
Strategy Olken-Sample
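The two older strategies can be sketched as follows, assuming Naïve-Sample materializes the full join before sampling and Olken-Sample uses rejection sampling driven by an index on the second relation. Relations are modeled as lists of hypothetical `(join_key, payload)` tuples, and the in-memory dictionary stands in for a database index; these modeling choices are assumptions for illustration.

```python
import random
from collections import defaultdict

def build_index(rel):
    """Group tuples by join key; stands in for an index on the join attribute."""
    index = defaultdict(list)
    for t in rel:
        index[t[0]].append(t)
    return index

def naive_sample(r1, r2, r, rng):
    """Compute the whole join, then draw r tuples from it with replacement."""
    index = build_index(r2)
    joined = [(t1, t2) for t1 in r1 for t2 in index.get(t1[0], ())]
    if not joined:
        raise ValueError("join is empty")
    return [rng.choice(joined) for _ in range(r)]

def olken_sample(r1, r2, r, rng):
    """Rejection sampling: pick t1 uniformly, accept with probability m2(key)/M.

    m2(key) is the number of R2 tuples with that key and M is the largest
    such count, so every join tuple ends up equally likely.
    """
    index = build_index(r2)
    if not any(t1[0] in index for t1 in r1):
        raise ValueError("join is empty")
    max_freq = max(len(ts) for ts in index.values())
    out = []
    while len(out) < r:
        t1 = rng.choice(r1)
        matches = index.get(t1[0], ())
        if matches and rng.random() < len(matches) / max_freq:
            out.append((t1, rng.choice(matches)))
    return out

if __name__ == "__main__":
    rng = random.Random(0)
    r1 = [(1, "x"), (2, "y"), (3, "z")]
    r2 = [(1, "p"), (2, "q"), (2, "r"), (2, "s")]
    print(naive_sample(r1, r2, 4, rng))
    print(olken_sample(r1, r2, 4, rng))
```

The naive version pays for the full join regardless of r, while the rejection loop avoids the join but may discard many candidates when key frequencies are skewed.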
11
New strategies for join Sampling
Strategy Stream-Sample
Strategy Group-Sample
Strategy Frequency-Partition-Sample
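Of the three new strategies, Stream-Sample is the simplest to illustrate. The sketch below assumes it works in two steps: draw a weighted WR sample of the first relation, weighting each tuple by the number of matching tuples in the second relation, then pair each sampled tuple with one uniformly chosen match. Relations are lists of hypothetical `(join_key, payload)` tuples, the dictionary stands in for an index on the second relation, and `random.choices` replaces a one-pass weighted sampler for brevity.

```python
import random
from collections import defaultdict

def stream_sample(r1, r2, r, rng):
    """Sketch of a two-step join sample of size r (with replacement)."""
    # Index on R2's join attribute; its bucket sizes give the frequencies m2.
    index = defaultdict(list)
    for t2 in r2:
        index[t2[0]].append(t2)

    # Step 1: weighted WR sample of R1, weight = number of matches in R2.
    # random.choices raises ValueError if all weights are zero (empty join).
    weights = [len(index.get(t1[0], ())) for t1 in r1]
    s1 = rng.choices(r1, weights=weights, k=r)

    # Step 2: for each sampled t1, pick one matching R2 tuple uniformly.
    return [(t1, rng.choice(index[t1[0]])) for t1 in s1]

if __name__ == "__main__":
    rng = random.Random(0)
    r1 = [(1, "x"), (2, "y"), (3, "z")]
    r2 = [(1, "p"), (2, "q"), (2, "r"), (2, "s")]
    print(stream_sample(r1, r2, 5, rng))
```

A join tuple (t1, t2) is produced with probability proportional to m2 times 1/m2, so every join tuple is equally likely and no candidate is ever rejected.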
13
Experimental Evaluation 1
14
Experimental Evaluation 2
15
Experimental Evaluation 3
16
Conclusions
Difficulty of join sampling
Classification of the problem into 3 cases
Strategies for join sampling
New schemes for sequential random sampling, for both uniform and weighted sampling
More efficient strategies can be developed for the case of a single join
More work is needed to understand the problem of sampling the result of join trees
17
Thank You