Materializing Views With Minimal Size To Answer Queries


1 Materializing Views With Minimal Size To Answer Queries
Rada Chirkova (North Carolina State University) and Chen Li (University of California, Irvine)

2 Materializing Minimal-Size Views
Chirkova, Halevy, and Suciu VLDB-2001 5/9/2019
Context: relational databases
The problem: minimize the amount of data required to answer queries, by:
- automatically designing new relations (views), and
- precomputing and storing (materializing) the new relations
Central issue: inventing new views to materialize
Applications include:
- Mediators in data-integration systems
- "Database as a service" in enterprise computing
Speaker notes:
- 02/06/02: need to split into two slides? (want to talk about related work - view selection)
- The talk needs to be 45 minutes!
- People use db: put data, ask queries (same queries again) - can add or change data => make these queries more efficient
- Stress that *multiple* queries
- Will not be discussing: update costs or indexing costs
- The big question: why need to materialize views at all => the ancestor example
- Why view selection is not a completely satisfactory solution
Chirkova and Li Materializing Views with Minimal Size to Answer Queries /09/2003
A formal perspective on the view selection problem

3 Example: Modified TPC-H Query
Q(name,o_date,priority,comment,o_key,quantity,shipmode) :-
    customer(c_key,name,'building'),
    order(o_key,c_key,o_date,priority,comment),
    lineitem(lineno,o_key,quantity,shipmode).
V1(name,o_date,priority,comment,o_key) :-
V2(o_key,quantity,shipmode) :-

4 Partial Answer to the Query Q
Columns: Name, O_Date, Priority, Comment, O_Key, Quantity, Shipmode
Tom 3/14/ close… REG AIR
Tom 3/14/ close… REG AIR
Tom 3/14/ close… AIR
Jack 12/21/ final… MAIL
Jack 12/21/ final… AIR

5 Minimal-Size Views for the Query Q
Q(name,o_date,priority,comment,o_key,quantity,shipmode) :-
    customer(c_key,name,'building'),
    order(o_key,c_key,o_date,priority,comment),
    lineitem(lineno,o_key,quantity,shipmode).
V1(name,o_date,priority,comment,o_key) :-
V2(o_key,quantity,shipmode) :-
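To make the size argument concrete, here is a minimal Python sketch on a toy database. The transcript does not preserve the bodies of V1 and V2, so the sketch assumes they are simply the projections of the answer to Q onto the head attributes shown on the slide. It also assumes that size is measured as the number of stored attribute values. All data values are made up for illustration.

```python
# Toy instance; every value below is hypothetical.
customer = {(1, "Tom", "building"), (2, "Jack", "building"), (3, "Ann", "auto")}
order = {  # (o_key, c_key, o_date, priority, comment)
    (10, 1, "d1", "high", "c1"),
    (11, 2, "d2", "low", "c2"),
}
lineitem = {  # (lineno, o_key, quantity, shipmode)
    (1, 10, 5, "REG AIR"),
    (2, 10, 7, "AIR"),
    (3, 10, 2, "MAIL"),
    (1, 11, 4, "MAIL"),
    (2, 11, 9, "AIR"),
}


def answer_q():
    """Evaluate Q under set semantics with nested loops."""
    return {
        (name, o_date, priority, comment, o_key, quantity, shipmode)
        for (c_key, name, segment) in customer
        if segment == "building"
        for (o_key, oc_key, o_date, priority, comment) in order
        if oc_key == c_key
        for (_lineno, lo_key, quantity, shipmode) in lineitem
        if lo_key == o_key
    }


def cells(relation):
    """Assumed size measure: number of stored attribute values."""
    return sum(len(row) for row in relation)


q = answer_q()
v1 = {row[:5] for row in q}              # (name, o_date, priority, comment, o_key)
v2 = {(row[4],) + row[5:] for row in q}  # (o_key, quantity, shipmode)

# Rewriting of Q in terms of the views: join V1 and V2 on o_key.
rewritten = {r1 + r2[1:] for r1 in v1 for r2 in v2 if r1[4] == r2[0]}

q_size = cells(q)
views_size = cells(v1) + cells(v2)
```

On this instance the join of the two views reproduces the answer to Q, while the views store fewer values than the answer itself, because the per-order attributes are no longer repeated for every line item.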

6 Questions
How do we know that views V1 and V2 are minimal-size views for the query Q? On what databases?
How to find a set of minimal-size views, given a set of queries and a database:
- Is the problem decidable? For what inputs?
- What is the complexity of the problem?
- Are there good efficient algorithms for finding minimal-size views?

7 Preliminaries
Two queries are equivalent if they return the same answers on any database.
An equivalent rewriting of a query Q in terms of views V is a query that:
- is defined using the relations in V only, and
- is equivalent to Q
A conjunctive query (view) can be defined using only equality selections, projections, and joins.
A disjunctive query (view) can be defined as a union of a finite number of conjunctive queries (views).
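Equivalence over all databases cannot be tested by running the queries on every database. For conjunctive queries under set semantics, the standard tool is the containment-mapping test (Chandra and Merlin): Q1 is contained in Q2 exactly when some mapping of Q2's variables sends Q2's head to Q1's head and every subgoal of Q2 to a subgoal of Q1. The sketch below is illustrative only. The query encoding (a head tuple plus a list of `(predicate, args)` subgoals, with variables written as uppercase-initial strings) and all function names are assumptions of this example.

```python
def is_var(term):
    """Assumed convention: variables are strings starting with an uppercase letter."""
    return isinstance(term, str) and term[:1].isupper()


def extend(mapping, src, dst):
    """Return mapping extended so that tuple src maps onto tuple dst, or None."""
    if len(src) != len(dst):
        return None
    new = dict(mapping)
    for s, d in zip(src, dst):
        if is_var(s):
            if new.setdefault(s, d) != d:
                return None  # variable already mapped to a different term
        elif s != d:
            return None      # constants must map to themselves
    return new


def contained_in(q1, q2):
    """True if q1 is contained in q2, i.e. a containment mapping from q2 to q1 exists."""
    (head1, body1), (head2, body2) = q1, q2
    start = extend({}, head2, head1)
    if start is None:
        return False

    def search(i, mapping):
        # Backtracking search over the subgoals of q2.
        if i == len(body2):
            return True
        pred, args = body2[i]
        for pred1, args1 in body1:
            if pred1 == pred:
                nxt = extend(mapping, args, args1)
                if nxt is not None and search(i + 1, nxt):
                    return True
        return False

    return search(0, start)


def equivalent(q1, q2):
    """Two conjunctive queries are equivalent if each contains the other."""
    return contained_in(q1, q2) and contained_in(q2, q1)


# Hypothetical example queries over a binary relation e.
two_hop = (("X",), [("e", ("X", "Y")), ("e", ("Y", "Z"))])
one_hop = (("X",), [("e", ("X", "Y"))])
redundant = (("X",), [("e", ("X", "Y")), ("e", ("X", "Z"))])
```

Here `two_hop` is contained in `one_hop` but not the other way around, while `redundant` is equivalent to `one_hop` because its second subgoal can be mapped onto the first.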

8 Problem Specification
Input:
- Database instance D with schema R
- Workload Q of queries on D
Output (optimal solution): a set V of views, such that:
- each query in Q has an equivalent rewriting in terms of V, and
- the total size of the views, Σ_{Vi ∈ V} size(Vi), is minimal on D
Speaker notes:
- Stress that *multiple* queries
- Will not be discussing: update costs or indexing costs
- Why rewriting uses only the views? Because can dematerialize all original relations (and some new views can be original relations)
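The specification can be read as a constrained minimization. The formula below is only a restatement of the slide. The calligraphic letters, the notation V_i(D) for the view evaluated on D, and the name P_j for the rewriting of Q_j are notation chosen for this sketch.

```latex
\min_{\mathcal{V}} \; \sum_{V_i \in \mathcal{V}} \mathrm{size}\bigl(V_i(D)\bigr)
\quad \text{subject to} \quad
\forall\, Q_j \in \mathcal{Q} \;\; \exists\, P_j \text{ defined over } \mathcal{V} \text{ only} : \; P_j \equiv Q_j
```

The objective is evaluated on the single given instance D, whereas the constraint (equivalence of each rewriting) must hold on every database.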

9 Assumptions
- Single database instance
- Set semantics
- Finite query workloads
- Conjunctive queries
- Disjunctive views and rewritings
Speaker notes:
- 02/06/02: the "no indexes" assumption is essential
- Put weighted sum on slides?
- Set semantics?

10 Main Results
- Decidability and upper bounds on the complexity of the problem
- Relationship between a restriction on the language of the queries and the language of optimal views
- Dynamic-programming algorithm for finding an optimal solution for conjunctive queries (restricted case)

11 Conjunctive Views and Rewritings
Theorem. Given a query workload Q and a database D, it is possible to construct a finite search space of views that includes all views in all optimal solutions for Q on D. The number of views in the search space is at most doubly-exponential in the size of the input query workload Q.
Corollary. The problem of finding a minimal-size conjunctive viewset is decidable for finite workloads of conjunctive queries, assuming all rewritings are conjunctive.

12 Self-Joins in Queries
Q1(X,Y) :- p(X,Z), p(Z,T), s(Z,Y). // self-join
Q2(X,Y) :- p(X,Z), r(Z,T), s(Z,Y). // no self-joins
Result 1. For some databases and queries, there is a set of disjunctive views that is better than any conjunctive solution. Example: a single query with self-joins.
Result 2. The problem of finding an optimal solution in the space of disjunctive views is decidable, assuming conjunctive rewritings.
Result 3. It is not necessary to consider disjunctive rewritings.
Result 4. The size of the search space of views is at most triply-exponential in the size of the input query workload.
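The distinction between Q1 and Q2 is purely syntactic: a query has a self-join when the same relation name occurs in more than one subgoal. A minimal sketch of that check follows. The body encoding as a list of `(predicate, args)` pairs and the function name are assumptions of this example.

```python
from collections import Counter


def has_self_join(body):
    """True if some relation name occurs in more than one subgoal of the body."""
    counts = Counter(pred for pred, _args in body)
    return any(n > 1 for n in counts.values())


# Bodies of Q1 and Q2 from the slide.
q1_body = [("p", ("X", "Z")), ("p", ("Z", "T")), ("s", ("Z", "Y"))]
q2_body = [("p", ("X", "Z")), ("r", ("Z", "T")), ("s", ("Z", "Y"))]
```

For these two bodies, `has_self_join(q1_body)` is true because `p` appears twice, and `has_self_join(q2_body)` is false.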

13 Queries Without Self-Joins: The Problem Is in NP
[Diagram, built up step by step over slides 13-17: disjunctive views, conjunctive views, subexpression views, full-reducer views]

18 1. Conjunctive Views Are Enough
Theorem. Given a database D and a set of queries Q without self-joins, suppose a set V of disjunctive views is a solution for (D,Q). Then there exists another solution V' for (D,Q), such that:
- all views in V' are conjunctive, and
- size(V') ≤ size(V).
Corollary. For any database and any set of queries without self-joins, some optimal disjunctive solution is a set of conjunctive views.

19 What We Have Shown
[Diagram: disjunctive views, conjunctive views]

20 Idea of the Proof
Given:
- Q(…) :- S1(…), S2(…), …, Sn(…);
- rewriting P of Q that uses V: V = V1 ∪ V2 ∪ … ∪ Vt
Then there exists V' = V'1 ∪ V'2 ∪ … ∪ V't such that, for some mapping m:
- each V'i is an image of Vi, and
- each V'i alone can replace any Vj in the rewriting of Q

21 Details of the Proof (1)
P ≡ Q, P = P1 ∪ P2 ∪ ... ∪ Ps
There exists a conjunctive query Pi: Pi ≡ Q
Pi(…) :- Vi1(…), …, Vij(…), …, Vim(…), G(…).
Fix any Vij in Pi; consider, in P,
Pr(…) :- Vij(…), …, Vij(…), …, Vij(…), G(…).
Because Pr is contained in Q, there exists a mapping b from Q to the expansion of Pr.
We can always change b, to redirect all subgoals of Q that map into subgoals of Vij in Pr.

25 Details of the Proof (2)
We can always change b, to redirect all subgoals of Q that map into subgoals of more than one Vij in Pr.
Then, we can replace Pr with P'r:
Pr(…) :- Vij(…), …, Vij(…), …, Vij(…), G(…).
P'r(…) :- Vij(…), G(…).
And P'r ≡ Q

26 Details of the Proof (3)
Changing b, to redirect all subgoals of Q that map into subgoals of Vij in Pr:
Q(…) :- …, Sk(…,W,…), …
Prexp(…) :- …, Sk(…,Y',…), …, Sk(…,Y,…), …
Pr(…) :- Vij(…), Vij(…), …, Vij(…), G(…)
[Diagram, built up step by step over slides 26-32: the mappings b and b' drawn between the rules above]

33 Details of the Proof (4)
Thus, we can replace Pr with P'r:
Pr(…) :- Vij(…), …, Vij(…), …, Vij(…), G(…).
P'r(…) :- Vij(…), G(…).
And P'r ≡ Q

34 2. Subexpression Views Are Enough
Theorem. Given a database D and a set of queries Q without self-joins, suppose a set V of disjunctive views is a solution for (D,Q). Then there exists another solution V' for (D,Q), such that:
- all views in V' are conjunctive subexpression-type, and
- size(V') ≤ size(V).
Corollary. For any database and set of queries without self-joins, some optimal disjunctive solution is a set of conjunctive subexpression-type views.
The size of the search space of views is at most singly-exponential in the size of the input query workload.

35 3. Full-Reducer Views Are Enough
A view V is a full-reducer view for a query Q if V and Q have the same body.
Theorem. Given a database D and a single query Q without self-joins, suppose a set V of disjunctive views is a solution for (D,Q). Then there exists another solution V' for (D,Q), such that:
- all views in V' are conjunctive full-reducer views for Q, and
- size(V') ≤ size(V).
Corollary. For any database and any query without self-joins, some optimal disjunctive solution is a set of conjunctive full-reducer views.
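Since a full-reducer view shares its body with Q, the only remaining choice is the head. The sketch below enumerates candidate heads for the query Q2 from the self-joins slide. It assumes that any nonempty set of body variables may serve as a head, which the slide's definition permits but does not spell out. The encoding and function names are likewise assumptions of this example.

```python
from itertools import combinations


def body_variables(body):
    """Variables of a body in order of first appearance (uppercase-initial names, by assumption)."""
    seen = []
    for _pred, args in body:
        for term in args:
            if isinstance(term, str) and term[:1].isupper() and term not in seen:
                seen.append(term)
    return seen


def full_reducer_heads(body):
    """Yield every nonempty head (tuple of variables) over the body's variables."""
    variables = body_variables(body)
    for k in range(1, len(variables) + 1):
        yield from combinations(variables, k)


# Body of Q2(X,Y) :- p(X,Z), r(Z,T), s(Z,Y).
q2_body = [("p", ("X", "Z")), ("r", ("Z", "T")), ("s", ("Z", "Y"))]
heads = list(full_reducer_heads(q2_body))
```

With four body variables there are 2^4 - 1 = 15 candidate heads, which illustrates why the number of candidate views grows exponentially with the size of the query.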

36 Using Full-Reducer Views To Rewrite Sets of Queries
For query workloads with more than one query, we can merge optimal full-reducer views for individual queries in the workload, and the number of subgoals in the merged views never exceeds the number of subgoals in full-reducer views.

37 What We Have Shown
[Diagram: disjunctive views, conjunctive views, subexpression views, full-reducer views]

38 The Problem Is in NP
Theorem. Given a database instance, for any finite workload of conjunctive queries without self-joins, the problem of finding a minimal-size disjunctive viewset is in NP.

39 Generating Minimal-Size Views
Input: a conjunctive query without self-joins and a database
Output: a minimal-size disjunctive viewset for the query on the database
Method: produce a minimal-size set of conjunctive full-reducer views, by doing exhaustive search in the space of the views using a dynamic-programming algorithm (cf. query optimization in System R)
- The algorithm returns an optimal solution
- Can be modified to work for non-singleton query workloads

40 Heuristics for Generating Views
- Consider only those views that "cover" up to a fixed number of subgoals of the query
- Consider only those views that have up to a fixed number of head attributes
- Apply the algorithm separately to several subsets of subgoals of the query, then combine the solutions

41 Main Results
- Decidability and upper bounds on the complexity of the problem
- Relationship between a restriction on the language of the queries and the language of optimal views
- Dynamic-programming algorithm for finding an optimal solution for conjunctive queries (restricted case)

42 Some Directions of Future Work
- Rewriting queries in more expressive languages: built-in predicates, disjunctive queries
- Using more expressive languages of views and rewritings
- Maximally-contained rewritings of queries in terms of views

43 Reference
Jia Li, Rada Chirkova, and Chen Li. Minimizing Data-Communication Costs by Decomposing Query Results in Client-Server Environments. UCI ICS Technical Report, 2003.

