Download presentation

Presentation is loading. Please wait.

Published byErnest McDonald Modified about 1 year ago

1
5/10/20151 PSU’s CS Evaluating Relational Operators Table Statistics Operators and example schema Selection Projection Equijoins General Joins Set Operators Buffering Intergalactic standard reference G. Graefe, "Query Evaluation Techniques for Large Databases," ACM Computing Surveys, 25(2) (1993), pp

2
5/10/20152 PSU’s CS 587 Learning Objectives In a typical major DBMS, what statistics are automatically collected and when? Given collected statistics, estimate a predicate’s output size/selectivity. For each relational operator, describe the major algorithms, their optimizations, their pros and cons, and their costs Selection Projection Equijoins General Joins Set Operators

3
5/10/20153 PSU’s CS 587 Motivation/Review Assume Sailors has an index on age. Does the optimal plan for this query use the index? SELECT * FROM Sailors S WHERE S.age < 31 Moral: In order to choose the optimal plan we need to know the selectivity of the predicate/size of the output.

4
5/10/20154 PSU’s CS 587 Collecting Statistics What are statistics Table sizes, index sizes, ranges of values, etc. Where are statistics kept? In the system catalog. Why collect statistics? Statistics are needed to determine selectivity of predicates and sizes of outputs of operators. Data in the previous bullet is needed to calculate costs of plans. Data in the previous bullet is needed by the optimizer to find the optimal plan. How often are statistics collected? Typically done when 10% of the data has been updated. Can be overidden manually UPDATE STATISTICS, ANALYZE Typically done by sampling, if table is large Which tables/columns are monitored? Typically all tables/columns are monitored

5
5/10/20155 PSU’s CS 587 Sailors Example What statistics would you collect to find the number of rows satisfying Age < 31? Rank =4 ? In Intro DB we assumed data was uniformly distributed. 30/30.2/30.4/30.6/30.8/30.9/31/32/33/ /36/36.6/37.8/39/39.5/40/41/43/44 /46/48/50/51/52/54/56/58/59/60

6
5/10/20156 PSU’s CS 587 Histograms Histograms are more accurate than the uniformity assumption. Equiwidth Histogram: Estimate number of rows/selectivity for the predicate “age< 31”: Equidepth Histogram: Estimate number of rows/selectivity for the predicate “age< 31”: Actual value: 6. Moral: Equidepth histograms are more accurate. #vals range #vals range

7
5/10/20157 PSU’s CS 587 Real DBMSs Store Histograms and More The next slide is the result of: SELECT attname, n_distinct, most_common_vals, most_common_freqs, histogram_bounds FROM pg_stats WHERE tablename = 'sailors'; n_distinct: positive means number of distinct values, negative means distinct values over number of rows, so -1 means unique most_common_vals: NULL if no values are more common than others histogram: equidepth, for values except most common

8
5/10/20158 PSU’s CS 587 Example Statistics: Postgres EXPLAIN SELECT * from sailors where age < 31; QUERY PLAN Seq Scan on sailors (cost= rows=6 width=21)Filter: (age < 31::double precision)

9
5/10/20159 PSU’s CS 587 Compound Conditions How many rows: WHERE age < 31 AND rank = 4 Assume statistical independence Selectivity of AND is product of selectivities WHERE age < 31 OR rank = 4 Typical formula is s1+s2-s1*s2

10
5/10/ PSU’s CS 587 Join Size Estimation Suppose there are ר rows in R, ש rows in S, R ⋈ S is a join on r i = s j r i is a foreign key that references s j How many rows are there in R ⋈ S? Most real joins are foreign key joins

11
5/10/ PSU’s CS 587 Pages, Rows in a Table? select relpages, reltuples from pg_class where relname = 'sailors';

12
5/10/ PSU’s CS 587 Assumptions General assumptions: All attributes are uniformly distributed. 2 Meg buffer space, or 256 8K pages So can sort up to 512Mof data, 64K pages, in two passes All indexes are alternative 2, height 2. Data entries are 10% the size of data records. Each I/O takes 10ms Reserves, Sailors, Boats: Sailors ( sid : integer, sname : string, rank : integer, age : real) Clustered index on sid Nonclustered index on rank Reserves ( sid : integer, bid : integer, day : dates, rname : string) Clustered index on (sid,bid,day) Nonclustered index on bid Boats ( bid : integer, color :string) Clustered index on bid Bytes /Row Rows /Page Pages Sailors Reserves Boats Rank: 1 to 10 Bid: 1 to 100

13
5/10/ PSU’s CS Evaluating Relational Operators Operators and example schema Selection One term Nonclustered index optimization Multiple terms Projection Equijoins General Joins Set Operators Buffering

14
5/10/ PSU’s CS One Table, One Term Predicate is of the form R.attr op value, where op is <. Assume 10% of rnames begin with A, B. Suppose there is a clustered B+ tree index and an unclustered B+ tree index, both on rname. What are the possibile plans (algorithms)? Which plan will the optimizer choose? We’ll compute the cost of each plan/algorithm. SELECT * FROM Reserves R WHERE R.rname < ‘C%’ 14. Op.Eval.

15
5/10/ PSU’s CS 587 What is the optimal plan? File Scan Plan Cost I/Os Seconds Nonclustered Index Scan Plan Cost Access first qualifying data entry Access first qualifying data item Access all qualifying data items Total Cost Clustered Index Scan Plan Cost Access first qualifying data entry Access first qualifying data item Access all qualifying data items Access all qualifying data entries Total Cost What is the optimal plan? 14. Op.Eval.

16
5/10/ PSU’s CS 587 Index entries Data entries (Index File) Data Records CLUSTERED (Data file) Data entries Data Records UNCLUSTERED

17
5/10/ PSU’s CS 587 Costs for a one-term selection on R, with selectivity ρ File scan Number of pages in R = M Nonclustered index scan Worst case is at least the number of qualifying rows in R = (Mp R ) ρ. Clustered index scan Hight of index + number of qualifying data entries + number of qualifying data pages = 2 +.1M ρ + M ρ = 2+1.1M ρ *The.1 comes from our assumption that data entries are 10% the length of data items, and should be adjusted accordingly.

18
5/10/ PSU’s CS 587 How do the access methods compare? Remember Clustered Indexes are rare p R is typically large Conclusions Clustered Index Scan is optimal if ρ < 1/1.1 If there is no clustered index, File Scan is a big win unless ρ is very small Let’s look at ways to make a Nonclustered Index Scan even better File ScanM Nonclustered Index ScanAt least p R * ρ *M Clustered Index Scan2+1.1* ρ *M

19
5/10/ PSU’s CS 587 Nonclustered Index Optimization Consider this algorithm for using a nonclustered index (the “shopping list” optimization): 1. Retrieve the RIDs in all the qualifying data entries. 2+ ρ *0.1*M 2. Sort those RIDs in order of page number. ??? 3. Fetch data records using those RIDs in sorted order. This ensures that each data page is retrieved just once (though the # of such pages likely to be higher than with clustering). At most M If ρ is small, cost is much less. Cost? At most 2+(1+.1 ρ )M + ??? 14. Op.Eval.

20
5/10/ PSU’s CS 587 A Second Nonclustered Index Optimization: The ??? Term. What if RIDs are too large to sort in memory? 1. Retrieve the RIDs in all the qualifying data entries. 2’. Make a bitmap, one bit for each page in the table. 3’. For each page in a qualifying RID, turn on the corresponding bit in the bitmap 4’. Fetch each page whose bit is on. 5’. Scan every record in every fetched page to find which ones qualify. Why is step 5’ necessary? Cost: At most 2+(1+.1 ρ )M If ρ is small, cost is much less. 14. Op.Eval.

21
5/10/ PSU’s CS 587 How do the access methods compare? Conclusions Clustered Index Scan is optimal if ρ < 1/1.1 If there is no clustered index, File Scan is a slight win unless ρ is very small. File ScanM Nonclustered Index ScanAt most 2+(1+.1 ρ )M if ρ is small it will be much less Clustered Index Scan2+1.1* ρ *M

22
5/10/ PSU’s CS 587 Access Paths, Matching An access path is a way of retrieving tuples. A full or partial scan An indexed access So an optimal plan, in this simple case (one- term WHERE clause), is an optimal access path. A term matches an access path if that access path can be used to retrieve just the tuples satisfying the term. Example: bid>50 matches what access path?

23
5/10/ PSU’s CS 587 General Selections First approach: 1.Choose a term (Which one?) 2.Retrieve tuples using the optimal access path for that term. 3.Apply any remaining terms, in-memory Second approach Applicable if we have 2 or more access paths matching terms and using Alternatives (2) or (3) 1.Get sets of rids of data records using each matching index. 2.Then intersect these sets of rids 3.Retrieve the records and apply any remaining terms. Which approach is cheaper? 14. Op.Eval. SELECT attribute list FROM relation list WHERE term1 AND... AND termk k > 1

24
5/10/ PSU’s CS Evaluating Relational Operators Operators and example schema Selection Projection What is the problem? Sort and hash approaches Equijoins General Joins Set Operators Buffering

25
5/10/ PSU’s CS Projection Hard part of projection is duplicate elimination Simple sort-based projection Assume size ratio is 0.50, duplicate ratio is Eliminate all attributes except sid, bid. 2.Sort the result by (sid,bid). 3.Scan that result, eliminating duplicates Cost? Optimizations? Combine attribute elimination with sort Cost? Combine duplicate elimination with sort. Best case cost? SELECT DISTINCT R.sid, R.bid FROM Reserves R 14. Op.Eval.

26
5/10/ PSU’s CS 587 Projection Based on Hashing Simple Hash-based projection Eliminate unwanted attributes Hash the result into B-1 output buffers and partitions For each partition: Read in, eliminate duplicates in memory, write the answer Optimizations? Combine elimination of unwanted attributes with hashing. Eliminate duplicates as soon as possible 14. Op.Eval.

27
5/10/ PSU’s CS 587 Discussion of Projection Sort-based approach is the standard; better handling of skew and result is sorted. If an index on the relation contains all wanted attributes in its search key, can do index-only scan. Apply projection techniques to data entries (much smaller!) If an ordered (i.e., tree) index contains all wanted attributes as prefix of search key, can do even better: Retrieve data entries in order (index-only scan), discard unwanted fields, compare adjacent tuples to check for duplicates. { Prefixes Single index (bid, sid, day)

28
5/10/ PSU’s CS Evaluating Relational Operators Operators and example schema Selection Projection Equijoins Nested Loop Join Index Nested Loop Join Sort-Merge Join Hash Join Hybrid Hash Join General Joins Set Operators Buffering

29
5/10/ PSU’s CS Equijoin With One Join Column SELECT S.name, R.bid FROM Reserves R, Sailors S WHERE R.sid=S.sid AND bid = 100 Example: Reserves Sailors ⋈ sid=sid bid=100 sname

30
5/10/ PSU’s CS 587 Nested Loops Join R is called the outer, S the inner table. (1) foreach tuple r in R do (2) for each tuple s in S if r i == s j then output 14. Op.Eval.

31
5/10/ PSU’s CS 587 The Join Algorithms Every in-memory join algorithm is in some sense a special case of Nested Loops Simple Nested Loops: Scan outer, scan inner Index Nested Loops: Scan outer, index scan inner Sort Merge: Inputs must be sorted on joining attributes. Scan outer in order, scan matching interval of inner. Hash Join: special case of Index Nested Loops, where index is constructed on-the-fly and inputs are partitioned. Implications Every join algorithm scans its outer relation. Every join algorithm costs at least M. 14. Op.Eval.

32
5/10/ PSU’s CS 587 Cost of Nested Loops Join (Scan of R)+(# rows in R)*(Scan of S) = M + (M*p R * ρ )*N Where ρ is the selectivity of the selection on R For simplicity we are assuming that M is accessed with a file scan. In general M could be accessed with an index scan and the first cost “M” would be replaced by the cost of the index scan. For our example, the cost is prohibitive: 1000+(1,000)*500 = 501,000 I/Os = 5,010 seconds ~= 1.4 hours Nested loops is used only for small tables.

33
5/10/ PSU’s CS 587 Index Nested Loops Join Cost: M + (M*p R* ρ ) * (Cost of finding matching S tuples) Again, the first cost M might be replaced by an index scan cost. Cost of finding matching S tuples= cost of probing S index + cost of finding S tuples. For each R tuple, cost of probing S index is about 1.2 for hash index, 2- 4 for B+ tree. Cost of then finding S tuples (assuming Alt. (2) or (3) for data entries) depends on clustering. Clustered index: 1 I/O (typical), unclustered: up to 1 I/O per matching S tuple. Valid only when S has an index on s j foreach tuple r in R do foreach tuple s in S where r i == s j use index here add to result 14. Op.Eval.

34
5/10/ PSU’s CS 587 Examples of Index Nested Loops In our example Sailors has a clustered index on sid so we can apply INL. Cost is M + (M*p R* ρ ) * 3 = (1000*100*.01) *3 = 4000 I/Os = 40 seconds. 14. Op.Eval.

35
5/10/ PSU’s CS Sort-Merge Join Costs are in Blue. Assume R, S can be sorted in 2 passes, i.e. M, N < B 2 No-brainer Implementation: Sort R 4M Sort S 4N Merge-join R and S M+N Total Cost 5(M+N) Better Implementation: Sort* R into runs 2M Sort* S into runs 2N Merge runs of R and S and merge-join R and S M+N Total Cost 3(M+N) *Use snowplow/replacement sort optimization 14. Op.Eval.

36
5/10/ PSU’s CS 587 Example of Sort-Merge Join 14. Op.Eval.

37
5/10/ PSU’s CS 587 Sort-Merge Join Optimization... Runs from S Runs from R. Fill Buffers Merge Join Output

38
5/10/ PSU’s CS 587 Hybrid SMJ costs If R and S are sorted, cost is M+N If R is sorted, cost is M+N +2 ρ S N – 2( B - ρ S N/B ) Argument similar to preceding page If neither R nor S are sorted, cost is M+N + 2 ρ R M + 2 ρ S N -2( B - ρ R M/B - ρ S N/B ) The initial M and N costs might be replaced by index scan costs if appropriate. ⋈ RR SS RS

39
5/10/ PSU’s CS 587 Example: Hybrid SMJ Assume this plan uses a clustered sid index scan of Reserves at a cost of 1102 and a file scan of Sailors at a cost of 500, B = 256. Plan cost is then M+N +2 ρ S N – 2( B - ρ S N/B ) = *.5*500-2(256-.5*500/256) = 1590 ⋈ bid=100 rank>5 ReservesSailors

40
5/10/ PSU’s CS Hash-Join [677] Confusing because it combines two distinct ideas: 1.Partition R and S using a hash function You should understand how and why this works. Read the text if you don’t. 2.Join corresponding partitions using a version of INL with hash indexes built on-the-fly. You should understand why the hash function used here must differ from the hash function used for partitioning. The partition step could be followed by any join method in step 2. Hashing is used because The tables are large (they needed partitioning) so NLJ is not a contender. The partitions were just created so they have no indexes on them, so classic INL is not possible. The partitions were just created so are not sorted, so SMJ is not a big winner. There may be cases where a sort order is needed later, but this is not considered.

41
5/10/ PSU’s CS 587 Partitions of R & S Input buffer for Si Hash table for partition Ri (k < B-1 pages) B main memory buffers Disk Output buffer Disk Join Result hash fn h2 B main memory buffers Disk Original Relation OUTPUT 2 INPUT 1 hash function h B-1 Partitions 1 2 B-1...

42
5/10/ PSU’s CS 587 Hash-Join Example: Partition R 16,23,17, 86,84,21, 61 S 19,17,84, 25, 22 Input 0 mod k 1 mod k k Partitions of R k Partitions of S

43
5/10/ PSU’s CS 587 Hash-Join Example: INL Join Output Partitions of R Partitions of S 16,86,84 23,17,21,61 84,22 19,17,25,43 Scan partition of S Hashed partition of R Output 16,86,84

44
5/10/ PSU’s CS 587 Cost of Hash Join We only need one of M,N to be B 2 ! Let it be M Read R, write B partitions: 2M Each one is, on the average, at most B pages Some fudging going on here Read S, write B partitions: 2N We don’t care about the size of these partitions For each of the B partitions of R and S: M+N Build a hash table for the R-partition Read each partition of R, and matching partition of S: M+N Total: 3(M+N) Can we do better if M << B 2 ? Hybridize!

45
5/10/ PSU’s CS 587 Hybrid Hash Join, M 2B, Partition Input Buffer Partition 0 Output Buffer Partition 1 Of R Partition 0 of S Partition 0 of R S R R 1 ⋈ S 1

46
5/10/ PSU’s CS 587 Hybrid Hash Join, M 2B, INL R0 ⋈ S0R0 ⋈ S0 Output Buffer Load R Partition 0 Partition 0 of S Scan S Partition 0 Partition 0 of R

47
5/10/ PSU’s CS 587 Hybrid Hash Join, Partition Input Buffer Partitions 0..k-1 Output Buffers Partition k Of R Partitions 0..k-1 of S Partitions 0..k-1 of R S R R k ⋈ S k … … …

48
5/10/ PSU’s CS 587 Hybrid Hash Join, INL, Partition i Ri ⋈ SiRi ⋈ Si Output Buffer Load R Partition i Partition i of S Scan S Partition 0 Partition i of R

49
5/10/ PSU’s CS 587 Fudging in our treatment of Hash-Join If we build an in-memory hash table to speed up the matching of tuples, a little more memory is needed. If the hash function does not partition uniformly, one or more R partitions may not fit in memory. In this case we can apply the hash-join technique recursively to do the join of this R-partition with its corresponding S-partition. 14. Op.Eval.

50
5/10/ PSU’s CS 587 Sort-Merge vs. Hash Join When can we achieve the low cost of 3(M+N)? Hash: Smaller of M, N B 2 Sort-merge: Both M, N B 2 So Hash Join is preferable if table sizes differ greatly. But Sort-Merge Join is less sensitive to data skew. The result of Sort Merge Join is ordered.

51
5/10/ PSU’s CS 587 Hash Join comments Can view first phase of Hash Join of R, S as divide and conquer: partitioning the join into subjoins that can be done with one input smaller than memory. What to do if we don’t know the size of R? Assume the case M B, then split off more and more of the hash table onto disk as it grows. Used in Postgres.

52
5/10/ PSU’s CS 587 Equijoin Cost Summary Nested Loops best for small tables Others have high overhead costs Nested LoopsM + MN Index Nested LoopsM + M p R (1.2-4)(1+) Sort-Merge, Hash3(M+N)

53
5/10/ PSU’s CS General Join Conditions Equalities over several attributes (e.g., R.sid=S.sid AND R.rname=S.sname ): For Index NL, build index on (if S is inner); or use existing indexes on sid or sname. For Sort-Merge and Hash Join, sort/partition on combination of the two join columns. Not much different than one-equality equijoins. 14. Op.Eval.

54
5/10/ PSU’s CS 587 Inequality Conditions Typical example: R.rname < S.sname Relatively rare. Does the example make sense? For Index NL, need (clustered!) B+ tree index to get any efficiency. Range probes on inner; # matches likely to be much higher than for equality joins. Hash Join, Sort Merge Join not applicable. With no index, Block NL is the best join method.

55
5/10/ PSU’s CS Set Operations INTERSECTION: INTERSECTION is a special case of join. How is it a special case? So all the usual join algorithms apply CROSS PRODUCT Also a special case of join How? What join algorithm is best to compute CROSS PRODUCT? Will Sort-Merge or Hash help? 14. Op.Eval.

56
5/10/ PSU’s CS 587 UNION and EXCEPT They are similar; we’ll do UNION. The hard part is removing duplicates. Sorting based approach to union: Sort both relations (on a key). Merge sorted relations, discarding duplicates. Alternative : Merge runs from Pass 0 for both relations. Hash based approach to union: Partition R and S using hash function h. For each S-partition, build in-memory hash table, scan corresponding R-partition and add tuples to table while discarding duplicates.

57
5/10/ PSU’s CS Aggregates w/o grouping Example SELECT AVE(S.age) FROM SAILORS S In general, requires scanning the relation. What is the cost for this query? Given index whose search key includes all attributes in the SELECT or WHERE clauses, can do index-only scan. If there is an index on age, what is the cost of this query? 14. Op.Eval.

58
5/10/ PSU’s CS 587 Aggregates with grouping Example SELECT S.rating, AVE(S.age) FROM SAILORS S GROUP BY S.rating What is an algorithm based on sorting? What is the cost of this query, assuming 2-pass sort? Optimizations? Cost? What is an algorithm based on hashing? How much memory is needed? Cost? Conclusion: Hash is best If there is an appropriate index, use index-only alg. 14. Op.Eval.

59
5/10/ PSU’s CS Impact of Buffering If several operations are executing concurrently, estimating the number of available buffer pages is guesswork. Repeated access patterns interact with buffer replacement policy. e.g., Inner relation is scanned repeatedly in Simple Nested Loop Join. With enough buffer pages to hold inner, replacement policy does not matter. Otherwise, MRU is best, LRU is worst ( sequential flooding ). What replacement policy is best for Block Nested Loops? Index Nested Loops? Sort-Merge Join sort phase? Sort-Merge Join merge phase? 14. Op.Eval.

Similar presentations

© 2016 SlidePlayer.com Inc.

All rights reserved.

Ads by Google