
Rada Chirkova (North Carolina State University) and Chen Li (University of California, Irvine) Materializing Views With Minimal Size To Answer Queries.


1 Rada Chirkova (North Carolina State University) and Chen Li (University of California, Irvine) Materializing Views With Minimal Size To Answer Queries

2 Chirkova and Li, Materializing Views with Minimal Size to Answer Queries, 6/09/2003
Materializing Minimal-Size Views
Context: relational databases
The problem: minimize the amount of data required to answer queries, by:
- automatically designing new relations (views), and
- precomputing and storing (materializing) the new relations
Central issue: inventing new views to materialize
Applications include:
- mediators in data-integration systems
- "database as a service" in enterprise computing

3 Example: Modified TPC-H Query
Q(name, o_date, priority, comment, o_key, quantity, shipmode) :-
    customer(c_key, name, 'building'),
    order(o_key, c_key, o_date, priority, comment),
    lineitem(lineno, o_key, quantity, shipmode).
V1(name, o_date, priority, comment, o_key) :-
    customer(c_key, name, 'building'),
    order(o_key, c_key, o_date, priority, comment),
    lineitem(lineno, o_key, quantity, shipmode).
V2(o_key, quantity, shipmode) :-
    customer(c_key, name, 'building'),
    order(o_key, c_key, o_date, priority, comment),
    lineitem(lineno, o_key, quantity, shipmode).

4 Partial Answer to the Query Q
Name  O_Date    Priority  Comment  O_Key   Quantity  Shipmode
Tom   3/14/95   0         close…   134721  26        REG AIR
Tom   3/14/95   0         close…   134721  75        REG AIR
Tom   3/14/95   0         close…   134721  43        AIR
Jack  12/21/94  0         final…   571683  43        MAIL
Jack  12/21/94  0         final…   571683  33        AIR

5 Minimal-Size Views for the Query Q
Q(name, o_date, priority, comment, o_key, quantity, shipmode) :-
    customer(c_key, name, 'building'),
    order(o_key, c_key, o_date, priority, comment),
    lineitem(lineno, o_key, quantity, shipmode).
V1(name, o_date, priority, comment, o_key) :-
    customer(c_key, name, 'building'),
    order(o_key, c_key, o_date, priority, comment),
    lineitem(lineno, o_key, quantity, shipmode).
V2(o_key, quantity, shipmode) :-
    customer(c_key, name, 'building'),
    order(o_key, c_key, o_date, priority, comment),
    lineitem(lineno, o_key, quantity, shipmode).
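To see why splitting Q into V1 and V2 can reduce the stored data, the following sketch works through the five rows of the partial answer shown earlier. It is only an illustration: the size measure (number of stored attribute values) is an assumption, since the slides do not say how size is measured, and the helper names are hypothetical. Because V1 and V2 have the same body as Q and heads drawn from Q's head, under set semantics each view equals the projection of Q's answer onto the view's head, which is what `project` computes.

```python
# Partial answer to Q, copied from the "Partial Answer to the Query Q" slide.
Q_COLUMNS = ("name", "o_date", "priority", "comment", "o_key", "quantity", "shipmode")
Q_ROWS = [
    ("Tom", "3/14/95", 0, "close…", 134721, 26, "REG AIR"),
    ("Tom", "3/14/95", 0, "close…", 134721, 75, "REG AIR"),
    ("Tom", "3/14/95", 0, "close…", 134721, 43, "AIR"),
    ("Jack", "12/21/94", 0, "final…", 571683, 43, "MAIL"),
    ("Jack", "12/21/94", 0, "final…", 571683, 33, "AIR"),
]
V1_COLUMNS = ("name", "o_date", "priority", "comment", "o_key")
V2_COLUMNS = ("o_key", "quantity", "shipmode")


def project(rows, columns, keep):
    """Set-semantics projection: keep the named columns, drop duplicate rows."""
    idx = [columns.index(c) for c in keep]
    return {tuple(row[i] for i in idx) for row in rows}


def size_in_values(rows, arity):
    """Assumed size measure: number of stored attribute values."""
    return len(rows) * arity


def join_on_o_key(v1_rows, v2_rows):
    """Rebuild Q's answer by joining V1 (o_key is last) with V2 (o_key is first)."""
    return {r1 + r2[1:] for r1 in v1_rows for r2 in v2_rows if r1[4] == r2[0]}


v1 = project(Q_ROWS, Q_COLUMNS, V1_COLUMNS)
v2 = project(Q_ROWS, Q_COLUMNS, V2_COLUMNS)
rebuilt = join_on_o_key(v1, v2)

q_size = size_in_values(set(Q_ROWS), len(Q_COLUMNS))
views_size = size_in_values(v1, len(V1_COLUMNS)) + size_in_values(v2, len(V2_COLUMNS))
print(q_size, views_size, rebuilt == set(Q_ROWS))
```

On these five rows the two views together hold fewer attribute values than Q's answer, because the customer and order columns are stored once per order instead of once per line item, and the join on o_key recovers the original rows.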

6 Questions
- How do we know that views V1 and V2 are minimal-size views for the query Q? On what databases?
- How to find a set of minimal-size views, given a set of queries and a database:
  - Is the problem decidable? For what inputs?
  - What is the complexity of the problem?
  - Are there good efficient algorithms for finding minimal-size views?

7 Preliminaries
- Two queries are equivalent if they return the same answers on any database.
- An equivalent rewriting of a query Q in terms of views V is a query that:
  - is defined using the relations in V only, and
  - is equivalent to Q.
- A conjunctive query (view) can be defined using only equality selections, projections, and joins.
- A disjunctive query (view) can be defined as a union of a finite number of conjunctive queries (views).
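For conjunctive queries under set semantics, equivalence can be tested with the classical containment-mapping criterion: Q1 is contained in Q2 exactly when the variables of Q2 can be mapped to terms of Q1 so that the head of Q2 maps to the head of Q1 and every subgoal of Q2 maps to a subgoal of Q1. The sketch below is a brute-force version of that test, not code from the talk. The tuple encoding of queries, the convention that constants are strings starting with a quote, and the query `Q1_REDUNDANT` are all assumptions made for illustration.

```python
def is_var(term):
    """Assumed convention: constants are written with a leading quote, e.g. "'building'"."""
    return isinstance(term, str) and not term.startswith("'")


def unify_all(terms, targets, mapping):
    """Extend mapping so each term maps to its target; return None on a conflict."""
    mapping = dict(mapping)
    for term, target in zip(terms, targets):
        if not is_var(term):
            if term != target:
                return None
        elif mapping.setdefault(term, target) != target:
            return None
    return mapping


def find_containment_mapping(src, dst):
    """Search for a containment mapping from query src to query dst."""
    (src_head, src_body), (dst_head, dst_body) = src, dst
    if len(src_head) != len(dst_head):
        return None
    start = unify_all(src_head, dst_head, {})
    if start is None:
        return None

    def search(i, mapping):
        if i == len(src_body):
            return mapping
        pred, args = src_body[i]
        for dst_pred, dst_args in dst_body:
            if dst_pred == pred and len(dst_args) == len(args):
                extended = unify_all(args, dst_args, mapping)
                if extended is not None:
                    found = search(i + 1, extended)
                    if found is not None:
                        return found
        return None

    return search(0, start)


def contained_in(q1, q2):
    """Q1 is contained in Q2 iff a containment mapping from Q2 to Q1 exists."""
    return find_containment_mapping(q2, q1) is not None


def equivalent(q1, q2):
    return contained_in(q1, q2) and contained_in(q2, q1)


# A query is (head, body); each subgoal is (predicate, arguments).
Q1 = (("X", "Y"), [("p", ("X", "Z")), ("p", ("Z", "T")), ("s", ("Z", "Y"))])
Q2 = (("X", "Y"), [("p", ("X", "Z")), ("r", ("Z", "T")), ("s", ("Z", "Y"))])
# Hypothetical variant of Q1 with one redundant subgoal p(Z, U).
Q1_REDUNDANT = (("X", "Y"), [("p", ("X", "Z")), ("p", ("Z", "T")),
                             ("p", ("Z", "U")), ("s", ("Z", "Y"))])

print(equivalent(Q1, Q1_REDUNDANT), equivalent(Q1, Q2))
```

The search is exponential in the number of subgoals in the worst case, which is acceptable for queries of the size shown on these slides.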

8 Problem Specification
Input:
- Database instance D with schema R
- Workload Q of queries on D
Output (optimal solution): a set V of views, such that:
- each query in Q has an equivalent rewriting in terms of V, and
- the total size of the views, $\sum_{V_i \in V} \mathrm{size}(V_i)$, is minimal on D

9 Assumptions
- Single database instance
- Set semantics
- Finite query workloads
- Conjunctive queries
- Disjunctive views and rewritings

10 Main Results
- Decidability and upper bounds on the complexity of the problem
- Relationship between:
  - a restriction on the language of the queries, and
  - the language of optimal views
- Dynamic-programming algorithm for finding an optimal solution for conjunctive queries (restricted case)

11 Conjunctive Views and Rewritings
Theorem. Given a query workload Q and a database D:
- It is possible to construct a finite search space of views that includes all views in all optimal solutions for Q on D.
- The number of views in the search space is at most doubly exponential in the size of the input query workload Q.
Corollary. The problem of finding a minimal-size conjunctive viewset is decidable for finite workloads of conjunctive queries, assuming all rewritings are conjunctive.

12 Self-Joins in Queries
    Q1(X,Y) :- p(X,Z), p(Z,T), s(Z,Y).   // self-join
    Q2(X,Y) :- p(X,Z), r(Z,T), s(Z,Y).   // no self-joins
Result 1. For some databases and queries, there is a set of disjunctive views that is better than any conjunctive solution.
- Example for a single query with self-joins
Result 2. The problem of finding an optimal solution in the space of disjunctive views is decidable, assuming conjunctive rewritings.
Result 3. It is not necessary to consider disjunctive rewritings.
Result 4. The size of the search space of views is at most triply exponential in the size of the input query workload.
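The distinction between Q1 and Q2 is purely syntactic: a query has a self-join when some relation name occurs in more than one subgoal of its body. A minimal sketch of that check follows; the list-of-pairs encoding of a query body and the function name are assumptions made for illustration.

```python
from collections import Counter


def self_joined_relations(body):
    """Return the relation names that occur in more than one subgoal."""
    counts = Counter(pred for pred, _args in body)
    return sorted(pred for pred, n in counts.items() if n > 1)


# Bodies of Q1 and Q2 from the slide; each subgoal is (predicate, arguments).
Q1_BODY = [("p", ("X", "Z")), ("p", ("Z", "T")), ("s", ("Z", "Y"))]
Q2_BODY = [("p", ("X", "Z")), ("r", ("Z", "T")), ("s", ("Z", "Y"))]

print(self_joined_relations(Q1_BODY))  # relation p is used twice in Q1
print(self_joined_relations(Q2_BODY))  # no relation repeats in Q2
```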

13-17 Queries Without Self-Joins: The Problem Is in NP
[Figure, built up over slides 13 to 17: disjunctive views, conjunctive views, subexpression views, full-reducer views]

18 1. Conjunctive Views Are Enough
Theorem. Given a database D and a set of queries Q without self-joins. Suppose a set V of disjunctive views is a solution for (D,Q). Then there exists another solution V' for (D,Q), such that:
- all views in V' are conjunctive, and
- $\mathrm{size}(V') \le \mathrm{size}(V)$.
Corollary. For any database and any set of queries without self-joins, some optimal disjunctive solution is a set of conjunctive views.

19 What We Have Shown
[Figure: disjunctive views, conjunctive views]

20 Idea of the Proof
- Given: Q(…) :- S_1(…), S_2(…), …, S_n(…); rewriting P of Q that uses V: $V = V_1 \cup V_2 \cup \dots \cup V_t$
- Then there exists: $V' = V'_1 \cup V'_2 \cup \dots \cup V'_t$ such that:
  - for some mapping, each V'_i is an image of V_i, and
  - each V'_i alone can replace any V_j in the rewriting of Q

21 Details of the Proof (1)
- $P \equiv Q$, where $P = P_1 \cup P_2 \cup \dots \cup P_s$
- There exists a conjunctive query P_i such that $P_i \equiv Q$:
    P_i(…) :- V_i1(…), …, V_ij(…), …, V_im(…), G(…).
- Fix any V_ij in P_i; consider, in P,
    P_r(…) :- V_ij(…), …, V_ij(…), …, V_ij(…), G(…).
- Because P_r is contained in Q, there exists a mapping from Q to the expansion of P_r
- We can always change the mapping, to redirect all subgoals of Q that map into subgoals of V_ij in P_r


25 Details of the Proof (2)
- We can always change the mapping, to redirect all subgoals of Q that map into subgoals of more than one V_ij in P_r
- Then, we can replace P_r with P'_r:
    P_r(…) :- V_ij(…), …, V_ij(…), …, V_ij(…), G(…).
    P'_r(…) :- V_ij(…), G(…).
- And $P'_r \subseteq Q$

26 Details of the Proof (3)
Changing the mapping, to redirect all subgoals of Q that map into subgoals of V_ij in P_r:
    Q(…) :- …, S_k(…,W,…), …
    P_r^exp(…) :- …, S_k(…,Y',…), …, S_k(…,Y,…), …
    P_r(…) :- V_ij(…), V_ij(…), …, V_ij(…), G(…)


33 Details of the Proof (4)
- Thus, we can replace P_r with P'_r:
    P_r(…) :- V_ij(…), …, V_ij(…), …, V_ij(…), G(…).
    P'_r(…) :- V_ij(…), G(…).
- And $P'_r \subseteq Q$

34 2. Subexpression Views Are Enough
Theorem. Given a database D and a set of queries Q without self-joins. Suppose a set V of disjunctive views is a solution for (D,Q). Then there exists another solution V' for (D,Q), such that:
- all views in V' are conjunctive subexpression-type views, and
- $\mathrm{size}(V') \le \mathrm{size}(V)$.
Corollary. For any database and set of queries without self-joins, some optimal disjunctive solution is a set of conjunctive subexpression-type views.
The size of the search space of views is at most singly exponential in the size of the input query workload.

35 3. Full-Reducer Views Are Enough
A view V is a full-reducer view for a query Q if V and Q have the same body.
Theorem. Given a database D and a single query Q without self-joins. Suppose a set V of disjunctive views is a solution for (D,Q). Then there exists another solution V' for (D,Q), such that:
- all views in V' are conjunctive full-reducer views for Q, and
- $\mathrm{size}(V') \le \mathrm{size}(V)$.
Corollary. For any database and any query without self-joins, some optimal disjunctive solution is a set of conjunctive full-reducer views.

36 Using Full-Reducer Views To Rewrite Sets of Queries
For query workloads with more than one query, we can merge optimal full-reducer views for individual queries in the workload, and the number of subgoals in the merged views never exceeds the number of subgoals in full-reducer views.

37 What We Have Shown
[Figure: disjunctive views, conjunctive views, subexpression views, full-reducer views]

38 The Problem Is in NP
Theorem. Given a database instance, for any finite workload of conjunctive queries without self-joins, the problem of finding a minimal-size disjunctive viewset is in NP.

39 Generating Minimal-Size Views
Input: a conjunctive query without self-joins and a database
Output: a minimal-size disjunctive viewset for the query on the database
Method: produce a minimal-size set of conjunctive full-reducer views, by doing exhaustive search in the space of the views using a dynamic-programming algorithm (cf. query optimization in System R)
- The algorithm returns an optimal solution
- Can be modified to work for non-singleton query workloads
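The slide does not spell out the dynamic-programming algorithm, so the sketch below swaps in a plain brute-force enumeration to make the search space concrete. Everything in it is an assumption made for illustration: the two-subgoal query, the database instance, the limit of two views per solution, the size measure (distinct tuples times columns), and the rule that a rewriting joins the views on their shared variable names. The equivalence test is a containment-mapping argument specialized to full-reducer views of a query without self-joins, written for this sketch rather than taken from the slides: each subgoal of the query is assigned to one view's copy of the body, and a variable must be in the head of every view it is assigned to whenever it is a head variable of the query or its subgoals are split across views.

```python
from itertools import combinations, product

# Hypothetical query without self-joins:
#   Q(N, O, Qty) :- cust_order(N, O), lineitem(O, Qty).
Q_HEAD = ("N", "O", "Qty")
Q_BODY = [("cust_order", ("N", "O")), ("lineitem", ("O", "Qty"))]

# Hypothetical database instance.
DATABASE = {
    "cust_order": {("Tom", 1), ("Jack", 2)},
    "lineitem": {(1, 26), (1, 75), (1, 43), (2, 43), (2, 33)},
}


def evaluate_body(body, database):
    """Naive join: every variable binding satisfying all subgoals (variables only)."""
    bindings = [{}]
    for pred, args in body:
        extended = []
        for binding in bindings:
            for row in database[pred]:
                candidate = dict(binding)
                if all(candidate.setdefault(v, x) == x for v, x in zip(args, row)):
                    extended.append(candidate)
        bindings = extended
    return bindings


def view_size(head, bindings):
    """Assumed size measure: distinct projected tuples times number of columns."""
    return len({tuple(b[v] for v in head) for b in bindings}) * len(head)


def is_equivalent_rewriting(heads, q_head, body):
    """True if joining full-reducer views with these heads is equivalent to the query."""
    variables = {v for _pred, args in body for v in args}
    for assignment in product(range(len(heads)), repeat=len(body)):
        ok = True
        for var in variables:
            copies = {assignment[i] for i, (_p, args) in enumerate(body) if var in args}
            needed = len(copies) > 1 or var in q_head
            if needed and any(var not in heads[c] for c in copies):
                ok = False
                break
        if ok:
            return True
    return False


def best_full_reducer_views(q_head, body, database, max_views=2):
    """Brute force over sets of at most max_views view heads; returns (size, heads)."""
    bindings = evaluate_body(body, database)
    variables = sorted({v for _pred, args in body for v in args})
    candidates = [h for n in range(1, len(variables) + 1)
                  for h in combinations(variables, n)]
    best = None
    for k in range(1, max_views + 1):
        for heads in combinations(candidates, k):
            if is_equivalent_rewriting(heads, q_head, body):
                total = sum(view_size(h, bindings) for h in heads)
                if best is None or total < best[0]:
                    best = (total, heads)
    return best


BEST = best_full_reducer_views(Q_HEAD, Q_BODY, DATABASE)
print(BEST)
```

The enumeration is exponential in the number of variables, so it is usable only on toy inputs. Limiting the number of head attributes or covered subgoals would cut the candidate list at the cost of possibly missing the optimum.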

40 Heuristics for Generating Views
1. Consider only those views that "cover" up to a fixed number of subgoals of the query
2. Consider only those views that have up to a fixed number of head attributes
3. Apply the algorithm separately to several subsets of subgoals of the query, then combine the solutions

41 Main Results
- Decidability and upper bounds on the complexity of the problem
- Relationship between:
  - a restriction on the language of the queries, and
  - the language of optimal views
- Dynamic-programming algorithm for finding an optimal solution for conjunctive queries (restricted case)

42 Some Directions of Future Work
- Rewriting queries in more expressive languages:
  - built-in predicates
  - disjunctive queries
  - …
- Using more expressive languages of views and rewritings
- Maximally-contained rewritings of queries in terms of views

43 Reference
Jia Li, Rada Chirkova, and Chen Li. Minimizing Data-Communication Costs by Decomposing Query Results in Client-Server Environments. UCI ICS Technical Report, 2003.
http://www-db.ics.uci.edu/pages/raccoon/

