# Conjunctions of Queries


1 Conjunctions of Queries

2 Conjunctive Queries
A conjunctive query is a single Datalog rule with only non-negated atoms in the body. (Note: no negated atoms and no comparisons.)
A conjunctive query has only EDB predicates in its body.
We say that a query Q1 is contained in a query Q2 if, for all databases D, the result of computing Q1 on D is contained in the result of computing Q2 on D.

3 Example
Consider the queries
Q1: p(X, Y) :- e(X, Z), e(Z, Y)
Q2: p(X, Y) :- e(X, Z), e(Z, Y), e(X, W)
It is easy to see that Q2 is contained in Q1, since any assignment that satisfies the body of Q2 also satisfies the body of Q1.
Can you prove that Q1 is contained in Q2? Which of the queries is faster to compute?

4 Homomorphisms
A symbol mapping is a mapping of variables to other variables or to constants, and of constants to constants.
Consider the queries Q1, defined as H1 :- B1, and Q2, defined as H2 :- B2. A symbol mapping h of the variables and constants in Q1 to those in Q2 is a homomorphism if:
– h(H1) = H2
– h(B1) is contained in B2

5 Example
Consider the queries
Q1: p(X, Y) :- e(X, Z), e(Z, Y)
Q2: p(X, Y) :- e(X, Z), e(Z, Y), e(X, W)
A homomorphism from Q1 to Q2:
– h(X) = X, h(Y) = Y, h(Z) = Z
Can you find a homomorphism from Q2 to Q1?
– h(X) = ?, h(Y) = ?, h(Z) = ?, h(W) = ?
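The definition of a homomorphism can be tried out mechanically. The following is a minimal brute-force sketch, not an efficient algorithm. It assumes a representation that is not in the slides: a query is a pair (head terms, list of body atoms), an atom is a pair (predicate name, argument tuple), all terms are strings, variables are capitalized, and constants (none appear here) are mapped to themselves. The names `is_var` and `find_homomorphism` are hypothetical.

```python
from itertools import product

def is_var(term):
    # Assumed convention: variables are capitalized strings, constants are not.
    return isinstance(term, str) and term[:1].isupper()

def find_homomorphism(src, dst):
    """Return a homomorphism from query src to query dst as a dict, or None."""
    (src_head, src_body), (dst_head, dst_body) = src, dst
    src_vars = sorted({t for _, args in src_body for t in args if is_var(t)}
                      | {t for t in src_head if is_var(t)})
    dst_terms = sorted({t for _, args in dst_body for t in args} | set(dst_head))
    dst_atoms = set(dst_body)
    # Try every assignment of src variables to terms of dst.
    for image in product(dst_terms, repeat=len(src_vars)):
        h = dict(zip(src_vars, image))

        def apply(term):
            return h.get(term, term)  # constants are left unchanged

        # Condition 1: h(H1) = H2.
        if tuple(apply(t) for t in src_head) != tuple(dst_head):
            continue
        # Condition 2: h(B1) is contained in B2.
        if all((p, tuple(apply(t) for t in args)) in dst_atoms
               for p, args in src_body):
            return h
    return None

# The two queries from the example slide.
Q1 = (("X", "Y"), [("e", ("X", "Z")), ("e", ("Z", "Y"))])
Q2 = (("X", "Y"), [("e", ("X", "Z")), ("e", ("Z", "Y")), ("e", ("X", "W"))])

print(find_homomorphism(Q1, Q2))
print(find_homomorphism(Q2, Q1))
```

The search is exponential in the number of variables of the source query, which is acceptable for slide-sized examples only.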

6 Containment
Theorem: Q1 is contained in Q2 if and only if there is a homomorphism from Q2 to Q1.
Proof: (If) Suppose that there is a homomorphism h from Q2 to Q1.
– Let D be a database.
– Let f be an assignment that satisfies the body of Q1.

7 If (continued)
– Then, Q1 returns f(H1).
– We show that f ∘ h is a satisfying assignment of the body of Q2 that returns f(H1):
f ∘ h(B2) = f(h(B2)), which is contained in f(B1), which is contained in D ⇒ f ∘ h satisfies B2
f ∘ h(H2) = f(h(H2)) = f(H1)
– Therefore, Q2 also returns f(H1), as required.

8 Only If
(Only If) Suppose that Q1 is contained in Q2. We show that there is a homomorphism from Q2 to Q1.
– Let f be a symbol mapping of the constants and variables in Q1 to distinct constants.
– Let D be the database defined by f(B1).
– Then if we compute Q1 on D, f(H1) will be returned.
– Since Q1 is contained in Q2, when we compute Q2 on D, f(H1) will be returned.

9 Only If (continued)
– Let g be the assignment of Q2 that returns f(H1).
– We show that f⁻¹ ∘ g is a homomorphism from Q2 to Q1.
– Note that f⁻¹ is well defined since f is injective.
– f⁻¹ ∘ g(B2) = f⁻¹(g(B2)) is contained in f⁻¹(D), which is equal to B1
– f⁻¹ ∘ g(H2) = f⁻¹(g(H2)) = f⁻¹(f(H1)) = H1
– Therefore, there is a homomorphism from Q2 to Q1.
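The "only if" direction suggests a second way to test containment: freeze the body of Q1 into a database and evaluate Q2 on it. The sketch below illustrates that idea under assumptions of its own: queries are pairs (head terms, list of atoms), variables are capitalized strings, and freezing maps a variable X to a hypothetical constant "c_X" that is assumed not to occur in the query already. The helper names `evaluate`, `freeze` and `is_contained`, and the extra query Q3, are illustrative and not from the slides.

```python
def is_var(term):
    # Assumed convention: variables are capitalized strings.
    return isinstance(term, str) and term[:1].isupper()

def evaluate(query, db):
    """Naive evaluation of a conjunctive query on a set of facts."""
    head, body = query
    results = set()

    def extend(i, env):
        if i == len(body):
            results.add(tuple(env.get(t, t) for t in head))
            return
        pred, args = body[i]
        for fact_pred, fact_args in db:
            if fact_pred != pred or len(fact_args) != len(args):
                continue
            new_env = dict(env)
            ok = True
            for a, v in zip(args, fact_args):
                if is_var(a):
                    # Bind a new variable, or check an existing binding.
                    if new_env.setdefault(a, v) != v:
                        ok = False
                        break
                elif a != v:
                    ok = False
                    break
            if ok:
                extend(i + 1, new_env)

    extend(0, {})
    return results

def freeze(query):
    """Apply f: map each variable to a distinct constant; return f(H), f(B)."""
    head, body = query
    f = lambda t: "c_" + t if is_var(t) else t
    return (tuple(f(t) for t in head),
            {(p, tuple(f(t) for t in args)) for p, args in body})

def is_contained(q1, q2):
    """Q1 is contained in Q2 iff Q2 returns f(H1) on the database f(B1)."""
    frozen_head, db = freeze(q1)
    return frozen_head in evaluate(q2, db)

Q1 = (("X", "Y"), [("e", ("X", "Z")), ("e", ("Z", "Y"))])
Q2 = (("X", "Y"), [("e", ("X", "Z")), ("e", ("Z", "Y")), ("e", ("X", "W"))])
Q3 = (("X", "Y"), [("e", ("X", "Y"))])  # hypothetical extra query

print(is_contained(Q2, Q1), is_contained(Q1, Q2), is_contained(Q1, Q3))
```

Any satisfying assignment that `evaluate` finds for Q2 on the frozen database plays the role of g in the proof.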

10 An Optimization
We can optimize the query
p(X, Y) :- e(X, Z), e(Z, Y), e(X, W)
by removing the last atom.
In general, given a query Q:
For each atom a in the body of Q
Let Q’ be Q without a
If Q’ is equivalent to Q, then remove a from Q
Note that it is sufficient to check whether Q’ is contained in Q, since Q is always contained in Q’.
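The removal loop above can be sketched as follows. This is an illustrative brute-force version, assuming the containment theorem (Q’ is contained in Q iff there is a homomorphism from Q to Q’) and an assumed representation in which a query is a pair (head terms, list of atoms) with capitalized string variables. The names `has_homomorphism` and `minimize` are hypothetical.

```python
from itertools import product

def is_var(term):
    # Assumed convention: variables are capitalized strings.
    return isinstance(term, str) and term[:1].isupper()

def has_homomorphism(src, dst):
    """True if some symbol mapping h has h(H_src) = H_dst and h(B_src) in B_dst."""
    (src_head, src_body), (dst_head, dst_body) = src, dst
    src_vars = sorted({t for _, args in src_body for t in args if is_var(t)}
                      | {t for t in src_head if is_var(t)})
    dst_terms = sorted({t for _, args in dst_body for t in args} | set(dst_head))
    dst_atoms = set(dst_body)
    for image in product(dst_terms, repeat=len(src_vars)):
        h = dict(zip(src_vars, image))
        if tuple(h.get(t, t) for t in src_head) != tuple(dst_head):
            continue
        if all((p, tuple(h.get(t, t) for t in args)) in dst_atoms
               for p, args in src_body):
            return True
    return False

def minimize(query):
    """Drop body atoms one at a time while the query stays equivalent."""
    head, body = query[0], list(query[1])
    i = 0
    while i < len(body):
        candidate = body[:i] + body[i + 1:]
        # Q' contained in Q  <=>  homomorphism from Q to Q'.
        if has_homomorphism((head, body), (head, candidate)):
            body = candidate  # atom i was redundant; recheck the same index
        else:
            i += 1
    return head, body

Q = (("X", "Y"), [("e", ("X", "Z")), ("e", ("Z", "Y")), ("e", ("X", "W"))])
print(minimize(Q))
```

On the query from this slide the sketch removes only e(X, W), matching the claim that the last atom is redundant.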

11 Containment with FDs
Consider the queries
Q1: p(X, Y) :- e(X, X), e(X, Y)
Q2: p(X, X) :- e(X, X)
Then Q2 is contained in Q1. However, Q1 is not contained in Q2.
What if we know that the first column of e functionally determines the second column?

12 The Chase
Given a query and a set of FDs, we apply a chase of the FDs to the query by finding any violation of an FD and “fixing it” by equating the symbols on the right-hand side of the FD.
For example, we chase with e: \$1 → \$2
p(X, Y) :- e(X, X), e(X, Y) ⇒ p(X, X) :- e(X, X), e(X, X)
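A chase step like the one above can be sketched in a few lines. The encoding is an assumption made for illustration: an FD is a triple (predicate, tuple of 0-based left-hand-side positions, right-hand-side position), so e: \$1 → \$2 becomes ("e", (0,), 1). Variables are capitalized strings, and duplicate body atoms are merged after each step, so the result lists e(X, X) once rather than twice. The names `find_violation` and `chase` are hypothetical.

```python
from itertools import product

def is_var(term):
    # Assumed convention: variables are capitalized strings.
    return isinstance(term, str) and term[:1].isupper()

def find_violation(body, fds):
    """Return two differing right-hand-side terms that an FD forces equal."""
    for (p1, a1), (p2, a2) in product(body, repeat=2):
        for pred, lhs, rhs in fds:
            if (p1 == pred == p2
                    and all(a1[i] == a2[i] for i in lhs)
                    and a1[rhs] != a2[rhs]):
                return a1[rhs], a2[rhs]
    return None

def chase(query, fds):
    """Equate terms until no FD is violated; None if constants clash."""
    head, body = tuple(query[0]), list(query[1])
    while True:
        violation = find_violation(body, fds)
        if violation is None:
            return head, body
        keep, drop = violation
        if not is_var(drop):
            keep, drop = drop, keep  # prefer to keep a constant
        if not is_var(drop):
            return None  # two distinct constants: the body is unsatisfiable
        head = tuple(keep if t == drop else t for t in head)
        # Substitute everywhere and merge duplicate atoms, preserving order.
        body = list(dict.fromkeys(
            (p, tuple(keep if t == drop else t for t in args))
            for p, args in body))

Q1 = (("X", "Y"), [("e", ("X", "X")), ("e", ("X", "Y"))])
print(chase(Q1, [("e", (0,), 1)]))
```

Each step removes one symbol from the query, so the loop terminates after at most as many steps as there are distinct terms.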

13 Checking Containment
In order to check containment, we first apply a chase to both queries and then check for a homomorphism.
Can you prove that this process is correct?

14 Checking for a Lossless Join
Suppose R = (C, D, E), R1 = (C, D), R2 = (D, E) and F = {C → D, D → E}.
We want to check whether, for all instances r of R, r = π_R1(r) ⋈ π_R2(r).
A row (A1, A2, A3) is in π_R1(r) ⋈ π_R2(r) if there is some row (A1, A2) in π_R1(r) and some row (A2, A3) in π_R2(r).
Equivalently, a row (A1, A2, A3) is in π_R1(r) ⋈ π_R2(r) if there is some row (A1, A2, B1) in r and some row (B2, A2, A3) in r.

15 Checking for a Lossless Join
We can express the rules stated above by
Q1: p(A1, A2, A3) :- r(A1, A2, B1), r(B2, A2, A3)
Q1 expresses the set of rows in π_R1(r) ⋈ π_R2(r).
We can express the set of rows in r by
Q2: p(A1, A2, A3) :- r(A1, A2, A3)
So, r = π_R1(r) ⋈ π_R2(r) if and only if Q1 is equivalent to Q2.
Note that Q2 is always contained in Q1. To check containment of Q1 in Q2, apply the chase and then look for a homomorphism from Q2 to Q1. There will be a homomorphism iff, after the chase, one of the atoms in the body of Q1 contains only A variables, that is, it is r(A1, A2, A3).

16 Checking for a Lossless Join
Suppose R = (C, D, E), R1 = (C, E), R2 = (D, E) and F = {C → D, D → E}.
Create the queries:
Q1: p(A1, A2, A3) :- r(A1, B1, A3), r(B2, A2, A3)
Q2: p(A1, A2, A3) :- r(A1, A2, A3)
Note that Q2 is contained in Q1.
Q1 is not contained in Q2, even after applying a chase, since after the chase there is no atom containing only A variables.

17 General Algorithm
Given a relation R with m attributes and a decomposition R1,...,Rn:
– For each Ri, create an atom r with Aj in place j if the j-th attribute of R is in Ri. Put fresh variables, not used anywhere else, in the rest of the places.
– Create a rule with head p(A1,...,Am) and all the atoms from before in the body.
– Create a rule p(A1,...,Am) :- r(A1,...,Am).
– There is a lossless join iff the queries are equivalent (under the given FDs, that is, after applying the chase).
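The general algorithm, combined with the chase and the "atom with only A variables" test from the earlier slides, can be sketched as below. The function name `lossless_join` and the input encoding (attribute names as strings, each FD as a pair of a left-hand-side attribute set and a single right-hand-side attribute) are assumptions made for this illustration. When two symbols are equated, the sketch keeps the A variable so that the final test is a plain comparison.

```python
from itertools import product

def lossless_join(attrs, decomposition, fds):
    """Chase-based lossless-join test for a decomposition of R(attrs)."""
    pos = {a: j for j, a in enumerate(attrs)}
    target = tuple(f"A{j + 1}" for j in range(len(attrs)))

    # One body atom per Ri: Aj where attribute j is in Ri, else a fresh Bk.
    rows, fresh = [], 0
    for ri in decomposition:
        row = []
        for j, a in enumerate(attrs):
            if a in ri:
                row.append(target[j])
            else:
                fresh += 1
                row.append(f"B{fresh}")
        rows.append(row)

    def violation():
        # Two atoms that agree on an FD's left side but differ on its right.
        for r1, r2 in product(rows, repeat=2):
            for lhs, rhs in fds:
                k = pos[rhs]
                if all(r1[pos[a]] == r2[pos[a]] for a in lhs) and r1[k] != r2[k]:
                    return r1[k], r2[k]
        return None

    pair = violation()
    while pair is not None:
        x, y = pair
        keep, drop = (x, y) if x.startswith("A") else (y, x)
        rows = [[keep if t == drop else t for t in row] for row in rows]
        pair = violation()

    # Equivalent iff some atom is exactly r(A1, ..., Am).
    return any(tuple(row) == target for row in rows)

F = [({"C"}, "D"), ({"D"}, "E")]
print(lossless_join(["C", "D", "E"], [{"C", "D"}, {"D", "E"}], F))
print(lossless_join(["C", "D", "E"], [{"C", "E"}, {"D", "E"}], F))
```

The two calls correspond to the decompositions on slides 14 and 16. FDs with several attributes on the right-hand side would have to be split into single-attribute FDs first under this encoding.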
