Mostafa Elhemali, Leo Giakoumakis. Problem definition, QRel system overview, Case Study, Conclusion.


1 Mostafa Elhemali, Leo Giakoumakis

2 Problem definition; QRel system overview; Case Study; Conclusion

3 Query processing in SQL Server: Parse → Optimize → ? → Profit
  SQL query: SELECT T.a, R.z FROM T JOIN R ON T.b = R.y WHERE T.c < 5
  Logical tree (after Parse): π T.a,R.z; ⋈ T.b=R.y; σ T.c<5; T; R
  Physical plan (after Optimize): π T.a,R.z; Merge ⋈ T.b=R.y; σ T.c<5; Scan Index T.Ib; Sort on y; Scan Table R

4 Transformation rules
  Logical tree: π T.a,R.z; ⋈ T.b=R.y; σ T.c<5; T; R
  Physical plan: π T.a,R.z; Merge ⋈ T.b=R.y; σ T.c<5; Scan Index T.Ib; Sort on y; Scan Table R
  Rules shown in the diagram: a rewrite involving ⋈ and σ T.c<5; ⋈ → Merge ⋈; ⋈ → Hash ⋈

5 What we want to test
  System input: SELECT T.a, R.z FROM T JOIN R ON T.b = R.y WHERE T.c < 5
  System output: π T.a,R.z; Merge ⋈ T.b=R.y; σ T.c<5; Scan Index T.Ib; Sort on y; Scan Table R
  What we want to test: the transformation in between (diagram: ⋈ with σ T.c<5)

6 Rule to test: Push Aggregates Below Join (diagram: G over ⋈ becomes ⋈ over G)

7 Manual approach: construct equivalent SQL
  - Manually construct SQL equivalent to the relational tree (diagram: ⋈ and G)
  - Think of which tables to use
  - Think of the exact join predicate, grouping columns, aggregates, etc.
  Example: SELECT T.a, SUM(T.c) FROM T JOIN R ON T.a = R.y GROUP BY T.a

8 Manual approach: variation on the basic case
  Project (π) between Aggregation and Join:
  SELECT M.Tb, SUM(M.p)
  FROM (SELECT T.b Tb, (T.a + T.c) p FROM T JOIN R ON T.b = R.y) M
  GROUP BY M.Tb

9 - Manual construction of SQL test cases is hard
  - Manual variation of these cases is harder
  - SQL test cases are over-specified
  - Harder to maintain
  Example: SELECT M.Tb, SUM(M.p) FROM (SELECT T.b Tb, (T.a + T.c) p FROM T JOIN R ON T.b = R.y) M GROUP BY M.Tb
  - Is the goal to test Aggregation over a sum?
  - Is the choice of T & R significant?

10 Allow testers to write test cases as abstract relational trees: ⋈ with σ T.c<5
  Yet present SQL Server with concrete SQL queries: SELECT T.a, R.z FROM T JOIN R ON T.b = R.y WHERE T.c < 5

11 QRel components: RelOps, RelGen, SqlGen
  RelOps classes: OpFilter, OpJoin, …, Expression, Predicate
  Relational tree: σ T.x=2; ⋈ T.a=T2.b; T; T2
  Generated SQL: SELECT … FROM T JOIN T2 ON T.a = T2.b WHERE T.x = 2

12 Allow testers to write their relational tree test cases
  The tree ⋈ with σ T.c<5 is represented as: OpFilter, PredCompare(LT), ExpColumn(T.c), ExpScalar(5), OpInnerJoin

13 - .NET classes for all relational and scalar operators expressible in SQL
    - Relational: Join, Selection, Sort, …
    - Scalar: Predicates, Expressions
  - Does not represent any operator not expressible in SQL, e.g. semi-join
  - Metadata extraction
  - Basic property derivation
    - Output columns for relational operators
    - Data type for scalar expressions
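
The property derivation described on this slide can be sketched in a few lines. This is a minimal Python stand-in, not the actual .NET RelOps API: the class names follow the slides (OpGet, OpFilter, OpInnerJoin, PredCompare, ExpColumn, ExpScalar), but every constructor signature and the `output_columns` method are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical Python stand-ins for the .NET RelOps classes.
# Names mirror the slides; signatures are assumptions.

@dataclass
class ExpColumn:
    name: str        # e.g. "T.c"
    data_type: str   # e.g. "int"

@dataclass
class ExpScalar:
    value: object

    @property
    def data_type(self) -> str:
        # Derived property: the data type of a scalar expression.
        return type(self.value).__name__

@dataclass
class PredCompare:
    op: str          # "LT", "EQ", ...
    left: object
    right: object

@dataclass
class OpGet:
    table: str
    columns: List[str]

    def output_columns(self) -> List[str]:
        return [f"{self.table}.{c}" for c in self.columns]

@dataclass
class OpFilter:
    child: object
    predicate: object

    def output_columns(self) -> List[str]:
        # A filter removes rows, not columns.
        return self.child.output_columns()

@dataclass
class OpInnerJoin:
    left: object
    right: object
    predicate: object

    def output_columns(self) -> List[str]:
        # A join exposes the columns of both inputs.
        return self.left.output_columns() + self.right.output_columns()
```

With these definitions, a filter over a join of T(a, b, c) and R(y, z) derives its output columns from the leaves upward, which is the kind of information a generator needs before it can pick valid predicates.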

14 Allow testers to write only the essential parts of the relational tree, and automatically fill out the rest
  Tester writes: ⋈ with σ T.c<5
  Filled-out tree: π T.a,R.z; ⋈ T.b=R.y; σ T.c<5; T; R

15 Top-down generation of the relational tree skeleton, followed by a bottom-up filling-out phase

16 Possible operators: π, σ, ⋈, G
  Available relations: T (a, b, c), R (x, y), …
  Predicate generators: CompareColumnAndScalar, CompareTwoColumns, InSubquery
  Generated tree (diagram): π T.a,R.z; ⋈ T.b=R.y; σ T.c<5; σ; T; R
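
The two phases (top-down skeleton, bottom-up filling) can be sketched as follows. This is an illustrative Python sketch, not RelGen itself: the dictionary-based tree, the `CATALOG` contents, and the functions `skeleton`, `fill`, and `well_formed` are all assumptions. The predicate shapes loosely follow the CompareColumnAndScalar and CompareTwoColumns generators named on the slide.

```python
import random

# Assumed catalog for illustration only.
CATALOG = {"T": ["a", "b", "c"], "R": ["x", "y", "z"]}

def skeleton(rng, depth):
    """Top-down phase: choose operators only; tables and predicates stay unfilled."""
    op = "get" if depth == 0 else rng.choice(["filter", "join", "get"])
    if op == "filter":
        return {"op": "filter", "child": skeleton(rng, depth - 1)}
    if op == "join":
        return {"op": "join",
                "left": skeleton(rng, depth - 1),
                "right": skeleton(rng, depth - 1)}
    return {"op": "get"}

def fill(rng, node, counter):
    """Bottom-up phase: bind leaves to tables, then build each predicate
    from the columns its children expose. Returns the node's output columns."""
    if node["op"] == "get":
        table = rng.choice(sorted(CATALOG))
        counter[0] += 1
        alias = f"{table}{counter[0]}"          # unique alias per leaf
        node["table"], node["alias"] = table, alias
        node["cols"] = [f"{alias}.{c}" for c in CATALOG[table]]
    elif node["op"] == "filter":
        cols = fill(rng, node["child"], counter)
        # Column-versus-scalar comparison.
        node["pred"] = f"{rng.choice(cols)} < {rng.randint(1, 9)}"
        node["cols"] = cols
    else:
        left = fill(rng, node["left"], counter)
        right = fill(rng, node["right"], counter)
        # Column-versus-column comparison across the two inputs.
        node["pred"] = f"{rng.choice(left)} = {rng.choice(right)}"
        node["cols"] = left + right
    return node["cols"]

def well_formed(node):
    """Check that every predicate only references columns available below it."""
    if node["op"] == "get":
        return node["table"] in CATALOG
    if node["op"] == "filter":
        return well_formed(node["child"]) and node["pred"].split(" ")[0] in node["child"]["cols"]
    lhs, _, rhs = node["pred"].split(" ")
    return (well_formed(node["left"]) and well_formed(node["right"])
            and lhs in node["left"]["cols"] and rhs in node["right"]["cols"])
```

Filling bottom-up is what makes the predicates valid by construction: a join predicate is only drawn after both inputs know which columns they produce.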

17 Tree generation is highly customizable for targeted testing
  - Can start from a partially filled-out tree
  - Customizable probability distributions for any random parts, e.g. choice of relational operators, choice of predicates
  - Use of constraints to influence the tree generation
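
A small sketch of the first two customization points, assuming a flat list of operator slots rather than a real tree. The weight table, `pick_operator`, and `complete` are hypothetical names; the slides do not show how RelGen exposes these settings.

```python
import random

# Hypothetical default weights; a targeted test overrides only what it cares about.
DEFAULT_WEIGHTS = {"project": 1.0, "filter": 1.0, "join": 1.0, "aggregate": 1.0}

def pick_operator(rng, overrides=None):
    """Draw one operator name according to the (possibly overridden) weights."""
    weights = {**DEFAULT_WEIGHTS, **(overrides or {})}
    ops = sorted(weights)
    return rng.choices(ops, weights=[weights[o] for o in ops], k=1)[0]

def complete(rng, partial, overrides=None):
    """Keep every operator the tester fixed; draw only the slots left as None."""
    return [op if op is not None else pick_operator(rng, overrides)
            for op in partial]
```

Passing a seeded `random.Random` instance keeps a failing random test case reproducible, which matters once the generated query has to be debugged.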

18 - Trees should be free of logical errors
    - E.g. no Aggregation over XML columns
  - Trees (and subtrees) should not be trivially optimized away
    - Avoid contradictory predicates
  - Operators should yield (many) rows if executed
    - Reasoning: this makes them more expensive and lures the optimizer into deeper optimizations. Also, we use QRel to test all of QP at times
    - Use real data in predicates, e.g. Country = 'England'
    - Use PK-FK columns in joins
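
One way to avoid contradictory predicates is to track the interval each column is still allowed to take as comparisons are added. The sketch below is an assumption about how such a check could look, not a description of QRel's constraint mechanism. It treats values as real numbers, so it would not flag `x > 4 AND x < 5` on an integer column.

```python
import math

def is_contradictory(preds):
    """preds: list of (column, op, constant) combined with AND,
    where op is one of "<", "<=", ">", ">=", "=".
    Returns True if no value can satisfy all comparisons on some column."""
    bounds = {}  # column -> (low, low_is_strict, high, high_is_strict)
    for col, op, c in preds:
        lo, lo_strict, hi, hi_strict = bounds.get(
            col, (-math.inf, False, math.inf, False))
        if op in (">", ">=", "="):
            strict = op == ">"
            if c > lo or (c == lo and strict):
                lo, lo_strict = c, strict       # tighten the lower bound
        if op in ("<", "<=", "="):
            strict = op == "<"
            if c < hi or (c == hi and strict):
                hi, hi_strict = c, strict       # tighten the upper bound
        bounds[col] = (lo, lo_strict, hi, hi_strict)
        # Empty interval: bounds crossed, or they meet at an excluded point.
        if lo > hi or (lo == hi and (lo_strict or hi_strict)):
            return True
    return False
```

A generator could call this before accepting a new filter predicate and redraw the constant when the check fails, so the subtree is not trivially optimized away.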

19 Presents the server with proper SQL queries from the relational tree test cases
  Tree: π T.a,R.z; ⋈ T.b=R.y; σ T.c<5; T; R
  SQL: SELECT T.a, R.z FROM T JOIN R ON T.b = R.y WHERE T.c < 5

20 - Subquery and derived table formation
  - SQL clause formation
  - Table and column aliasing (inverse of binding)

21 Tree: π SUM(T.a),R.z; ⋈ T.b=R.y; σ T.b<5; G T.b,SUM(T.a); T; R
  SELECT SUM(T1_1.C2) AS C1, T1_2.z AS C2
  FROM (
    SELECT T2_1.b AS C1, SUM(T2_1.a) AS C2
    FROM T AS T2_1
    GROUP BY T2_1.b
    HAVING T2_1.b < 5
  ) AS T1_1 JOIN R AS T1_2 ON T1_1.C1 = T1_2.y
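
Derived-table formation and aliasing can be illustrated with a deliberately naive translator. This is a sketch under stated assumptions: the tuple-based tree encoding and the `to_sql` function are hypothetical, and unlike the SqlGen output above, it wraps every non-leaf input in its own derived table instead of folding operators into WHERE, GROUP BY, or HAVING clauses.

```python
import itertools

def to_sql(node, fresh=None):
    """Translate a tiny relational tree into SQL.
    Node shapes (assumed for this sketch):
      ("get", table)
      ("filter", child, column, op, value)
      ("join", left, right, left_column, right_column)"""
    if fresh is None:
        fresh = itertools.count(1)   # shared counter for unique aliases

    def source(child):
        # Every input gets a fresh alias; non-leaf inputs become derived tables.
        alias = f"T{next(fresh)}"
        if child[0] == "get":
            return f"{child[1]} AS {alias}", alias
        return f"({to_sql(child, fresh)}) AS {alias}", alias

    kind = node[0]
    if kind == "get":
        return f"SELECT * FROM {node[1]}"
    if kind == "filter":
        _, child, col, op, val = node
        src, a = source(child)
        return f"SELECT {a}.* FROM {src} WHERE {a}.{col} {op} {val}"
    if kind == "join":
        _, left, right, lcol, rcol = node
        lsrc, la = source(left)
        rsrc, ra = source(right)
        return (f"SELECT {la}.*, {ra}.* FROM {lsrc} JOIN {rsrc} "
                f"ON {la}.{lcol} = {ra}.{rcol}")
    raise ValueError(f"unknown operator: {kind}")
```

For example, `to_sql(("join", ("filter", ("get", "T"), "c", "<", 5), ("get", "R"), "b", "y"))` produces a join whose left input is a derived table containing the filter. Generating fresh aliases is the inverse of binding: names in the tree are resolved objects, and the generator has to invent SQL names that resolve back to them.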

22 Rule to test
  Legend: ApplySJ = Apply-Semi-Join, UA = Union All, U = Union
  Diagram operators: ApplySJ, UA, U, R, σ1, σ2
  Diagram labels: Target rule; Preliminary rule; Cannot be represented
  Tree with subqueries: σ over R with predicate EXISTS(σ1) OR EXISTS(σ2)

23 RelOps code:
  PredExists a = new PredExists(new OpFilter(new OpGet(), null));
  PredExists b = new PredExists(new OpFilter(new OpGet(), null));
  PredBinaryOp p = new PredBinaryOp(a, b, PredBinaryOp.LogicOp.Or);
  OpFilter mainFilter = new OpFilter(new OpGet(), p);
  … And that's the basic test case

24 Goes into RelGen (using TPC-H database):
  Input tree: σ with predicate EXISTS(σ) OR EXISTS(σ)
  Available relations: ORDERS (…, O_SHIPPRIORITY, O_COMMENT, …), SUPPLIER (S_NATIONKEY, S_ADDRESS, …)
  Predicate generators: CorrelationPredicate
  Generated subqueries: SUPPLIER with S_NATIONKEY >= O_SHIPPRIORITY; SUPPLIER with S_ADDRESS < O_COMMENT

25 Goes into SqlGen:
  Tree: σ over ORDERS with predicate EXISTS(σ S_NATIONKEY >= O_SHIPPRIORITY over SUPPLIER) OR EXISTS(σ S_ADDRESS < O_COMMENT over SUPPLIER)
  SELECT * FROM [ORDERS] AS T1_1
  WHERE (EXISTS (SELECT 1 AS C1 FROM [SUPPLIER] AS T2_1 WHERE T2_1.S_NATIONKEY >= T1_1.O_SHIPPRIORITY))
     OR (EXISTS (SELECT 1 AS C1 FROM [SUPPLIER] AS T2_1 WHERE T2_1.S_ADDRESS < T1_1.O_COMMENT))

26 - More random exploration
    - Embed the basic tree pattern into completely random trees
  - Systematic exploration of various dimensions
    - Number of subquery disjunctives
    - Add scalar disjunctives alongside the relational ones
    - Add more operators in the relational disjunctives
    - …
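
Sweeping one dimension, the number of subquery disjunctives, might look like the sketch below. It builds SQL strings directly rather than going through RelOps and SqlGen, and both function names are hypothetical. The alias scheme (T1_1, T2_1) copies the generated query shown earlier in the deck.

```python
import itertools

def exists_disjunction(outer, inner, correlations):
    """Build one query with an EXISTS subquery per correlation.
    correlations: sequence of (inner_column, op, outer_column)."""
    parts = [
        f"(EXISTS (SELECT 1 AS C1 FROM [{inner}] AS T2_1 "
        f"WHERE T2_1.{icol} {op} T1_1.{ocol}))"
        for icol, op, ocol in correlations
    ]
    return f"SELECT * FROM [{outer}] AS T1_1 WHERE " + " OR ".join(parts)

def sweep(outer, inner, pool, max_disjuncts):
    """Yield one query for every combination of 1..max_disjuncts correlations."""
    for n in range(1, max_disjuncts + 1):
        for combo in itertools.combinations(pool, n):
            yield exists_disjunction(outer, inner, combo)
```

Enumerating combinations grows quickly with the pool size, which is one reason to bound the sweep with `max_disjuncts` or to sample from it.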

27 How do we verify the correct behavior for semi-random queries? Be creative!

28 - Rule modeling
    - Do the same transformation in the input query (usually very easy in RelOps)
    - Present the two queries to the optimizer
    - Should get the same plan
    - Only works when the rule output is expressible in SQL
  - Rules on & off
    - Turn the rule under test off
    - Should get a different plan, but same results
    - Only works for non-essential exploration rules
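
The "same results" half of these checks comes down to comparing two result sets as multisets. The sketch below uses Python's built-in `sqlite3` purely as a stand-in engine, because the slides do not show how plans are compared or how rules are switched off in SQL Server. The tables, the data, and the hand-rewritten query (filter moved below the join) are all assumptions for illustration.

```python
import sqlite3
from collections import Counter

def same_results(conn, sql_a, sql_b):
    """Compare two queries as multisets of rows.
    Row order is ignored because neither query has an ORDER BY."""
    rows_a = Counter(conn.execute(sql_a).fetchall())
    rows_b = Counter(conn.execute(sql_b).fetchall())
    return rows_a == rows_b

# Small in-memory database standing in for the server under test.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T(a INTEGER, b INTEGER, c INTEGER);
    CREATE TABLE R(y INTEGER, z INTEGER);
    INSERT INTO T VALUES (1, 10, 3), (2, 10, 7), (3, 20, 4);
    INSERT INTO R VALUES (10, 100), (10, 101), (20, 200);
""")

# Original query and a manually transformed equivalent (filter below the join).
original = "SELECT T.a, R.z FROM T JOIN R ON T.b = R.y WHERE T.c < 5"
rewritten = ("SELECT M.a, R.z FROM (SELECT * FROM T WHERE c < 5) AS M "
             "JOIN R ON M.b = R.y")
```

Using a multiset rather than a set matters here: a faulty rule that drops or duplicates rows would go unnoticed if duplicates were collapsed before the comparison.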

29 - Testing the Query Optimizer transformation rules using abstract relational trees
  - Utilizing QRel to go from abstract relational trees to concrete SQL queries
  - Future directions
    - Libraries of abstract relational trees
    - More advanced customizations of tree generation
    - Combinatorial techniques for systematic exploration of various trees

