# 5/10/20151 PSU’s CS 587 14. Evaluating Relational Operators  Table Statistics  Operators and example schema  Selection  Projection  Equijoins  General.

## Presentation on theme: "5/10/20151 PSU’s CS 587 14. Evaluating Relational Operators  Table Statistics  Operators and example schema  Selection  Projection  Equijoins  General."— Presentation transcript:

5/10/20151 PSU’s CS 587 14. Evaluating Relational Operators  Table Statistics  Operators and example schema  Selection  Projection  Equijoins  General Joins  Set Operators  Buffering  Intergalactic standard reference  G. Graefe, "Query Evaluation Techniques for Large Databases," ACM Computing Surveys, 25(2) (1993), pp. 73-170

5/10/20152 PSU’s CS 587 Learning Objectives  In a typical major DBMS, what statistics are automatically collected and when?  Given collected statistics, estimate a predicate’s output size/selectivity.  For each relational operator, describe the major algorithms, their optimizations, their pros and cons, and their costs  Selection  Projection  Equijoins  General Joins  Set Operators

5/10/20153 PSU’s CS 587 Motivation/Review  Assume Sailors has an index on age.  Does the optimal plan for this query use the index? SELECT * FROM Sailors S WHERE S.age < 31  Moral: In order to choose the optimal plan we need to know the selectivity of the predicate/size of the output.

5/10/20154 PSU’s CS 587 Collecting Statistics  What are statistics  Table sizes, index sizes, ranges of values, etc.  Where are statistics kept? In the system catalog.  Why collect statistics?  Statistics are needed to determine selectivity of predicates and sizes of outputs of operators.  Data in the previous bullet is needed to calculate costs of plans.  Data in the previous bullet is needed by the optimizer to find the optimal plan.  How often are statistics collected?  Typically done when 10% of the data has been updated. Can be overidden manually UPDATE STATISTICS, ANALYZE  Typically done by sampling, if table is large  Which tables/columns are monitored?  Typically all tables/columns are monitored

5/10/20155 PSU’s CS 587 Sailors Example  What statistics would you collect to find the number of rows satisfying  Age < 31?  Rank =4 ?  In Intro DB we assumed data was uniformly distributed. 30/30.2/30.4/30.6/30.8/30.9/31/32/33/34.5 35/36/36.6/37.8/39/39.5/40/41/43/44 /46/48/50/51/52/54/56/58/59/60

5/10/20156 PSU’s CS 587 Histograms  Histograms are more accurate than the uniformity assumption.  Equiwidth Histogram:  Estimate number of rows/selectivity for the predicate “age< 31”:  Equidepth Histogram:  Estimate number of rows/selectivity for the predicate “age< 31”:  Actual value: 6. Moral: Equidepth histograms are more accurate. #vals range 30333639424548515457 9333212223 #vals range 3030.4 30.9 33363941465156 3333333333

5/10/20157 PSU’s CS 587 Real DBMSs Store Histograms and More  The next slide is the result of: SELECT attname, n_distinct, most_common_vals, most_common_freqs, histogram_bounds FROM pg_stats WHERE tablename = 'sailors';  n_distinct: positive means number of distinct values, negative means distinct values over number of rows, so -1 means unique  most_common_vals: NULL if no values are more common than others  histogram: equidepth, for values except most common

5/10/20158 PSU’s CS 587 Example Statistics: Postgres EXPLAIN SELECT * from sailors where age < 31; QUERY PLAN Seq Scan on sailors (cost=0.00..1.38 rows=6 width=21)Filter: (age < 31::double precision)

5/10/20159 PSU’s CS 587 Compound Conditions  How many rows:  WHERE age < 31 AND rank = 4 Assume statistical independence Selectivity of AND is product of selectivities  WHERE age < 31 OR rank = 4 Typical formula is s1+s2-s1*s2

5/10/201510 PSU’s CS 587 Join Size Estimation  Suppose  there are ר rows in R, ש rows in S,  R ⋈ S is a join on r i = s j  r i is a foreign key that references s j  How many rows are there in R ⋈ S?  Most real joins are foreign key joins

5/10/201511 PSU’s CS 587 Pages, Rows in a Table? select relpages, reltuples from pg_class where relname = 'sailors';

5/10/201512 PSU’s CS 587 Assumptions  General assumptions:  All attributes are uniformly distributed.  2 Meg buffer space, or 256 8K pages So can sort up to 512Mof data, 64K pages, in two passes  All indexes are alternative 2, height 2.  Data entries are 10% the size of data records.  Each I/O takes 10ms  Reserves, Sailors, Boats:  Sailors ( sid : integer, sname : string, rank : integer, age : real) Clustered index on sid Nonclustered index on rank  Reserves ( sid : integer, bid : integer, day : dates, rname : string) Clustered index on (sid,bid,day) Nonclustered index on bid  Boats ( bid : integer, color :string) Clustered index on bid Bytes /Row Rows /Page Pages Sailors5080500 Reserves401001000 Boats1625020 Rank: 1 to 10 Bid: 1 to 100

5/10/201513 PSU’s CS 587 14. Evaluating Relational Operators  Operators and example schema  Selection  One term  Nonclustered index optimization  Multiple terms  Projection  Equijoins  General Joins  Set Operators  Buffering

5/10/201514 PSU’s CS 587 14.1 One Table, One Term Predicate is of the form R.attr op value, where op is <. Assume 10% of rnames begin with A, B. Suppose there is a clustered B+ tree index and an unclustered B+ tree index, both on rname. What are the possibile plans (algorithms)? Which plan will the optimizer choose? We’ll compute the cost of each plan/algorithm. SELECT * FROM Reserves R WHERE R.rname < ‘C%’ 14. Op.Eval.

5/10/201515 PSU’s CS 587 What is the optimal plan?  File Scan Plan Cost  I/Os  Seconds  Nonclustered Index Scan Plan Cost  Access first qualifying data entry  Access first qualifying data item  Access all qualifying data items  Total Cost  Clustered Index Scan Plan Cost  Access first qualifying data entry  Access first qualifying data item  Access all qualifying data items  Access all qualifying data entries  Total Cost  What is the optimal plan? 14. Op.Eval.

5/10/201516 PSU’s CS 587 Index entries Data entries (Index File) Data Records CLUSTERED (Data file) Data entries Data Records UNCLUSTERED

5/10/201517 PSU’s CS 587 Costs for a one-term selection on R, with selectivity ρ  File scan  Number of pages in R = M  Nonclustered index scan  Worst case is at least the number of qualifying rows in R = (Mp R ) ρ.  Clustered index scan  Hight of index + number of qualifying data entries + number of qualifying data pages = 2 +.1M ρ + M ρ = 2+1.1M ρ *The.1 comes from our assumption that data entries are 10% the length of data items, and should be adjusted accordingly.

5/10/201518 PSU’s CS 587 How do the access methods compare?  Remember  Clustered Indexes are rare  p R is typically large  Conclusions  Clustered Index Scan is optimal if ρ < 1/1.1  If there is no clustered index, File Scan is a big win unless ρ is very small  Let’s look at ways to make a Nonclustered Index Scan even better File ScanM Nonclustered Index ScanAt least p R * ρ *M Clustered Index Scan2+1.1* ρ *M

5/10/201519 PSU’s CS 587 Nonclustered Index Optimization  Consider this algorithm for using a nonclustered index (the “shopping list” optimization): 1. Retrieve the RIDs in all the qualifying data entries. 2+ ρ *0.1*M 2. Sort those RIDs in order of page number. ??? 3. Fetch data records using those RIDs in sorted order. This ensures that each data page is retrieved just once (though the # of such pages likely to be higher than with clustering). At most M If ρ is small, cost is much less.  Cost? At most 2+(1+.1 ρ )M + ??? 14. Op.Eval.

5/10/201520 PSU’s CS 587 A Second Nonclustered Index Optimization: The ??? Term.  What if RIDs are too large to sort in memory? 1. Retrieve the RIDs in all the qualifying data entries. 2’. Make a bitmap, one bit for each page in the table. 3’. For each page in a qualifying RID, turn on the corresponding bit in the bitmap 4’. Fetch each page whose bit is on. 5’. Scan every record in every fetched page to find which ones qualify.  Why is step 5’ necessary?  Cost: At most 2+(1+.1 ρ )M  If ρ is small, cost is much less. 14. Op.Eval.

5/10/201521 PSU’s CS 587 How do the access methods compare?  Conclusions  Clustered Index Scan is optimal if ρ < 1/1.1  If there is no clustered index, File Scan is a slight win unless ρ is very small. File ScanM Nonclustered Index ScanAt most 2+(1+.1 ρ )M if ρ is small it will be much less Clustered Index Scan2+1.1* ρ *M

5/10/201522 PSU’s CS 587 Access Paths, Matching  An access path is a way of retrieving tuples.  A full or partial scan  An indexed access  So an optimal plan, in this simple case (one- term WHERE clause), is an optimal access path.  A term matches an access path if that access path can be used to retrieve just the tuples satisfying the term.  Example: bid>50 matches what access path?

5/10/201523 PSU’s CS 587 General Selections  First approach: 1.Choose a term (Which one?) 2.Retrieve tuples using the optimal access path for that term. 3.Apply any remaining terms, in-memory  Second approach  Applicable if we have 2 or more access paths matching terms and using Alternatives (2) or (3) 1.Get sets of rids of data records using each matching index. 2.Then intersect these sets of rids 3.Retrieve the records and apply any remaining terms.  Which approach is cheaper? 14. Op.Eval. SELECT attribute list FROM relation list WHERE term1 AND... AND termk k > 1

5/10/201524 PSU’s CS 587 14. Evaluating Relational Operators  Operators and example schema  Selection  Projection  What is the problem?  Sort and hash approaches  Equijoins  General Joins  Set Operators  Buffering

5/10/201525 PSU’s CS 587 14.3 Projection  Hard part of projection is duplicate elimination  Simple sort-based projection  Assume size ratio is 0.50, duplicate ratio is 0.25 1.Eliminate all attributes except sid, bid. 2.Sort the result by (sid,bid). 3.Scan that result, eliminating duplicates Cost?  Optimizations?  Combine attribute elimination with sort Cost?  Combine duplicate elimination with sort. Best case cost? SELECT DISTINCT R.sid, R.bid FROM Reserves R 14. Op.Eval.

5/10/201526 PSU’s CS 587 Projection Based on Hashing  Simple Hash-based projection  Eliminate unwanted attributes  Hash the result into B-1 output buffers and partitions  For each partition: Read in, eliminate duplicates in memory, write the answer  Optimizations?  Combine elimination of unwanted attributes with hashing.  Eliminate duplicates as soon as possible 14. Op.Eval.

5/10/201527 PSU’s CS 587 Discussion of Projection  Sort-based approach is the standard; better handling of skew and result is sorted.  If an index on the relation contains all wanted attributes in its search key, can do index-only scan.  Apply projection techniques to data entries (much smaller!)  If an ordered (i.e., tree) index contains all wanted attributes as prefix of search key, can do even better:  Retrieve data entries in order (index-only scan), discard unwanted fields, compare adjacent tuples to check for duplicates. { Prefixes Single index (bid, sid, day)

5/10/201528 PSU’s CS 587 14. Evaluating Relational Operators  Operators and example schema  Selection  Projection  Equijoins  Nested Loop Join  Index Nested Loop Join  Sort-Merge Join  Hash Join  Hybrid Hash Join  General Joins  Set Operators  Buffering

5/10/201529 PSU’s CS 587 14.4 Equijoin With One Join Column SELECT S.name, R.bid FROM Reserves R, Sailors S WHERE R.sid=S.sid AND bid = 100  Example: Reserves Sailors ⋈ sid=sid  bid=100  sname

5/10/201530 PSU’s CS 587 Nested Loops Join  R is called the outer, S the inner table. (1) foreach tuple r in R do (2) for each tuple s in S if r i == s j then output 14. Op.Eval.

5/10/201531 PSU’s CS 587 The Join Algorithms  Every in-memory join algorithm is in some sense a special case of Nested Loops  Simple Nested Loops: Scan outer, scan inner  Index Nested Loops: Scan outer, index scan inner  Sort Merge: Inputs must be sorted on joining attributes. Scan outer in order, scan matching interval of inner.  Hash Join: special case of Index Nested Loops, where index is constructed on-the-fly and inputs are partitioned.  Implications  Every join algorithm scans its outer relation.  Every join algorithm costs at least M. 14. Op.Eval.

5/10/201532 PSU’s CS 587 Cost of Nested Loops Join  (Scan of R)+(# rows in R)*(Scan of S) = M + (M*p R * ρ )*N  Where ρ is the selectivity of the selection on R  For simplicity we are assuming that M is accessed with a file scan. In general M could be accessed with an index scan and the first cost “M” would be replaced by the cost of the index scan.  For our example, the cost is prohibitive:  1000+(1,000)*500 = 501,000 I/Os = 5,010 seconds ~= 1.4 hours  Nested loops is used only for small tables.

5/10/201533 PSU’s CS 587 Index Nested Loops Join  Cost: M + (M*p R* ρ ) * (Cost of finding matching S tuples)  Again, the first cost M might be replaced by an index scan cost.  Cost of finding matching S tuples= cost of probing S index + cost of finding S tuples.  For each R tuple, cost of probing S index is about 1.2 for hash index, 2- 4 for B+ tree.  Cost of then finding S tuples (assuming Alt. (2) or (3) for data entries) depends on clustering. Clustered index: 1 I/O (typical), unclustered: up to 1 I/O per matching S tuple. Valid only when S has an index on s j foreach tuple r in R do foreach tuple s in S where r i == s j  use index here add to result 14. Op.Eval.

5/10/201534 PSU’s CS 587 Examples of Index Nested Loops  In our example Sailors has a clustered index on sid so we can apply INL.  Cost is M + (M*p R* ρ ) * 3 = 1000 + (1000*100*.01) *3 = 4000 I/Os = 40 seconds. 14. Op.Eval.

5/10/201535 PSU’s CS 587 14.4.2 Sort-Merge Join  Costs are in Blue.  Assume R, S can be sorted in 2 passes, i.e. M, N < B 2  No-brainer Implementation:  Sort R 4M  Sort S 4N  Merge-join R and S M+N  Total Cost 5(M+N)  Better Implementation:  Sort* R into runs 2M  Sort* S into runs 2N  Merge runs of R and S and merge-join R and S M+N  Total Cost 3(M+N) *Use snowplow/replacement sort optimization 14. Op.Eval.

5/10/201536 PSU’s CS 587 Example of Sort-Merge Join 14. Op.Eval.

5/10/201537 PSU’s CS 587 Sort-Merge Join Optimization... Runs from S Runs from R. Fill Buffers Merge Join Output

5/10/201538 PSU’s CS 587 Hybrid SMJ costs  If R and S are sorted, cost is M+N  If R is sorted, cost is M+N +2 ρ S N – 2( B - ρ S N/B )  Argument similar to preceding page  If neither R nor S are sorted, cost is M+N + 2 ρ R M + 2 ρ S N -2( B - ρ R M/B - ρ S N/B )  The initial M and N costs might be replaced by index scan costs if appropriate. ⋈ RR SS RS

5/10/201539 PSU’s CS 587 Example: Hybrid SMJ  Assume this plan uses a clustered sid index scan of Reserves at a cost of 1102 and a file scan of Sailors at a cost of 500, B = 256.  Plan cost is then  M+N +2 ρ S N – 2( B - ρ S N/B ) =  1102 + 500 +2*.5*500-2(256-.5*500/256) = 1590 ⋈  bid=100  rank>5 ReservesSailors

5/10/201540 PSU’s CS 587 14.4.3 Hash-Join [677]  Confusing because it combines two distinct ideas: 1.Partition R and S using a hash function You should understand how and why this works. Read the text if you don’t. 2.Join corresponding partitions using a version of INL with hash indexes built on-the-fly. You should understand why the hash function used here must differ from the hash function used for partitioning.  The partition step could be followed by any join method in step 2. Hashing is used because  The tables are large (they needed partitioning) so NLJ is not a contender.  The partitions were just created so they have no indexes on them, so classic INL is not possible.  The partitions were just created so are not sorted, so SMJ is not a big winner. There may be cases where a sort order is needed later, but this is not considered.

5/10/201541 PSU’s CS 587 Partitions of R & S Input buffer for Si Hash table for partition Ri (k < B-1 pages) B main memory buffers Disk Output buffer Disk Join Result hash fn h2 B main memory buffers Disk Original Relation OUTPUT 2 INPUT 1 hash function h B-1 Partitions 1 2 B-1...

5/10/201542 PSU’s CS 587 Hash-Join Example: Partition R 16,23,17, 86,84,21, 61 S 19,17,84, 25, 22 Input 0 mod k 1 mod k k Partitions of R k Partitions of S

5/10/201543 PSU’s CS 587 Hash-Join Example: INL Join Output Partitions of R Partitions of S 16,86,84 23,17,21,61 84,22 19,17,25,43 Scan partition of S Hashed partition of R Output 16,86,84

5/10/201544 PSU’s CS 587 Cost of Hash Join  We only need one of M,N to be  B 2 !  Let it be M  Read R, write B partitions: 2M Each one is, on the average, at most B pages Some fudging going on here  Read S, write B partitions: 2N We don’t care about the size of these partitions  For each of the B partitions of R and S: M+N Build a hash table for the R-partition Read each partition of R, and matching partition of S: M+N  Total: 3(M+N)  Can we do better if M << B 2 ? Hybridize!

5/10/201545 PSU’s CS 587 Hybrid Hash Join, M  2B, Partition Input Buffer Partition 0 Output Buffer Partition 1 Of R Partition 0 of S Partition 0 of R S R R 1 ⋈ S 1

5/10/201546 PSU’s CS 587 Hybrid Hash Join, M  2B, INL R0 ⋈ S0R0 ⋈ S0 Output Buffer Load R Partition 0 Partition 0 of S Scan S Partition 0 Partition 0 of R

5/10/201547 PSU’s CS 587 Hybrid Hash Join, Partition Input Buffer Partitions 0..k-1 Output Buffers Partition k Of R Partitions 0..k-1 of S Partitions 0..k-1 of R S R R k ⋈ S k … … …

5/10/201548 PSU’s CS 587 Hybrid Hash Join, INL, Partition i Ri ⋈ SiRi ⋈ Si Output Buffer Load R Partition i Partition i of S Scan S Partition 0 Partition i of R

5/10/201549 PSU’s CS 587 Fudging in our treatment of Hash-Join  If we build an in-memory hash table to speed up the matching of tuples, a little more memory is needed.  If the hash function does not partition uniformly, one or more R partitions may not fit in memory.  In this case we can apply the hash-join technique recursively to do the join of this R-partition with its corresponding S-partition. 14. Op.Eval.

5/10/201550 PSU’s CS 587 Sort-Merge vs. Hash Join  When can we achieve the low cost of 3(M+N)?  Hash: Smaller of M, N  B 2  Sort-merge: Both M, N  B 2  So Hash Join is preferable if table sizes differ greatly.  But  Sort-Merge Join is less sensitive to data skew.  The result of Sort Merge Join is ordered.

5/10/201551 PSU’s CS 587 Hash Join comments  Can view first phase of Hash Join of R, S as divide and conquer: partitioning the join into subjoins that can be done with one input smaller than memory.  What to do if we don’t know the size of R? Assume the case M  B, then split off more and more of the hash table onto disk as it grows. Used in Postgres.

5/10/201552 PSU’s CS 587 Equijoin Cost Summary  Nested Loops best for small tables  Others have high overhead costs Nested LoopsM + MN Index Nested LoopsM + M p R (1.2-4)(1+) Sort-Merge, Hash3(M+N)

5/10/201553 PSU’s CS 587 14.4.4 General Join Conditions  Equalities over several attributes (e.g., R.sid=S.sid AND R.rname=S.sname ):  For Index NL, build index on (if S is inner); or use existing indexes on sid or sname.  For Sort-Merge and Hash Join, sort/partition on combination of the two join columns.  Not much different than one-equality equijoins. 14. Op.Eval.

5/10/201554 PSU’s CS 587 Inequality Conditions  Typical example: R.rname < S.sname  Relatively rare. Does the example make sense?  For Index NL, need (clustered!) B+ tree index to get any efficiency. Range probes on inner; # matches likely to be much higher than for equality joins.  Hash Join, Sort Merge Join not applicable.  With no index, Block NL is the best join method.

5/10/201555 PSU’s CS 587 14.5 Set Operations  INTERSECTION:  INTERSECTION is a special case of join. How is it a special case?  So all the usual join algorithms apply  CROSS PRODUCT  Also a special case of join How?  What join algorithm is best to compute CROSS PRODUCT? Will Sort-Merge or Hash help? 14. Op.Eval.

5/10/201556 PSU’s CS 587 UNION and EXCEPT  They are similar; we’ll do UNION.  The hard part is removing duplicates.  Sorting based approach to union:  Sort both relations (on a key).  Merge sorted relations, discarding duplicates.  Alternative : Merge runs from Pass 0 for both relations.  Hash based approach to union:  Partition R and S using hash function h.  For each S-partition, build in-memory hash table, scan corresponding R-partition and add tuples to table while discarding duplicates.

5/10/201557 PSU’s CS 587 14.6 Aggregates w/o grouping  Example  SELECT AVE(S.age) FROM SAILORS S  In general, requires scanning the relation.  What is the cost for this query?  Given index whose search key includes all attributes in the SELECT or WHERE clauses, can do index-only scan.  If there is an index on age, what is the cost of this query? 14. Op.Eval.

5/10/201558 PSU’s CS 587 Aggregates with grouping  Example  SELECT S.rating, AVE(S.age) FROM SAILORS S GROUP BY S.rating  What is an algorithm based on sorting?  What is the cost of this query, assuming 2-pass sort?  Optimizations? Cost?  What is an algorithm based on hashing?  How much memory is needed?  Cost?  Conclusion: Hash is best  If there is an appropriate index, use index-only alg. 14. Op.Eval.

5/10/201559 PSU’s CS 587 14.7 Impact of Buffering  If several operations are executing concurrently, estimating the number of available buffer pages is guesswork.  Repeated access patterns interact with buffer replacement policy.  e.g., Inner relation is scanned repeatedly in Simple Nested Loop Join. With enough buffer pages to hold inner, replacement policy does not matter. Otherwise, MRU is best, LRU is worst ( sequential flooding ).  What replacement policy is best for Block Nested Loops? Index Nested Loops? Sort-Merge Join sort phase? Sort-Merge Join merge phase? 14. Op.Eval.

Download ppt "5/10/20151 PSU’s CS 587 14. Evaluating Relational Operators  Table Statistics  Operators and example schema  Selection  Projection  Equijoins  General."

Similar presentations