2005lav-iii: The InfoMaster system & the inverse rules algorithm. Topics: the InfoMaster system; the inverse rules algorithm; a side trip on equivalence & containment.

Presentation transcript:

2005lav-iii, slide 1
The InfoMaster system & the inverse rules algorithm
- The InfoMaster system
- The inverse rules algorithm
- A side trip: equivalence & containment of Datalog programs

2005lav-iii, slide 2
The InfoMaster system
- A LAV system implemented (at about the same time) by a PhD student at Stanford (AI).
- Same basic idea: sources are defined as views over the global schema.
- It uses a different algorithm, the inverse rules algorithm, which was later (1997) proved to solve all the problems mentioned above, even for recursive Datalog queries.
- (Used for integrating many internal data sources at Stanford.)

2005lav-iii, slide 3
The inverse rules algorithm
The idea:
- Invert a view definition v(..) :- body to obtain rules that define the db relations in terms of the view.
- Then combine these rules with the given query.
Example:
- A db: a graph represented by the edge relation e(X, Y).
- A view (2-step paths only):
    v(X, Y) :- e(X, Z), e(Z, Y)
- A recursive query Q (transitive closure):
    q(X, Y) :- e(X, Y)
    q(X, Y) :- e(X, Z), q(Z, Y)

2005lav-iii, slide 4
Step 1: Invert the view definition.
- The view:
    v(X, Y) :- e(X, Z), e(Z, Y)
- With the existential quantifier made explicit:
    v(X, Y) :- exists Z. e(X, Z), e(Z, Y)
- Skolemize (Z becomes f(X,Y)):
    v(X, Y) :- e(X, f(X,Y)), e(f(X,Y), Y)
- Invert:
    e(X, f(X,Y)) :- v(X, Y)
    e(f(X,Y), Y) :- v(X, Y)
Step 2: Add the rules of Q.
    q(X, Y) :- e(X, Y)
    q(X, Y) :- e(X, Z), q(Z, Y)
The inverse rules together with the rules of Q form the query plan.

2005lav-iii, slide 5
Assume the db is the graph G:
[Figure: the graph G on the nodes a, b, c, d, e]
1. What are the facts in the view?
    v(a, c), v(b, d), v(c, e)
2. What are the db "facts" derived from the view?
    e(a, f(a,c)), e(f(a,c), c),
    e(b, f(b,d)), e(f(b,d), d),
    e(c, f(c,e)), e(f(c,e), e)
[Figure: the derived graph on the nodes a, b, c, d, e, f(a,c), f(b,d), f(c,e)]

2005lav-iii, slide 6
3. What is the result of the combined set of rules on this db? The rules for q now compute a transitive closure:
- Distance 1, by q(X, Y) :- e(X, Y):
    q(a, f(a,c)), q(f(a,c), c), q(b, f(b,d)), q(f(b,d), d), q(c, f(c,e)), q(f(c,e), e)
- Distance 2, by q(X, Y) :- e(X, Z), q(Z, Y):
    q(a, c), q(b, d), q(c, e), q(f(a,c), f(c,e))
- Distance 3 and distance 4 (let's do it):
    q(a, f(c,e)), q(f(a,c), e)
    q(a, e)
4. The facts w/o function symbols are the answer!
    q(a, c), q(b, d), q(c, e), q(a, e)
[Figure: the derived graph on the nodes a, b, c, d, e, f(a,c), f(b,d), f(c,e)]
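
The worked example can be replayed with a small bottom-up evaluation. This is only a sketch, not the InfoMaster implementation: it hard-codes the two inverse rules and the two rules of Q from the slides, and it assumes an encoding that is not in the slides, namely that the Skolem term f(x,y) is represented as the Python tuple ("f", x, y) and constants as strings.

```python
# View extension from the slides: v(a,c), v(b,d), v(c,e).
view_facts = {("a", "c"), ("b", "d"), ("c", "e")}

def apply_inverse_rules(v):
    """Apply e(X, f(X,Y)) :- v(X,Y) and e(f(X,Y), Y) :- v(X,Y)."""
    edges = set()
    for x, y in v:
        z = ("f", x, y)  # Skolem term f(x,y); tuple encoding is an assumption
        edges.add((x, z))
        edges.add((z, y))
    return edges

def transitive_closure(edges):
    """Naive fixpoint for q(X,Y) :- e(X,Y) and q(X,Y) :- e(X,Z), q(Z,Y)."""
    q = set(edges)
    while True:
        new = {(x, y) for (x, z) in edges for (z2, y) in q if z == z2} - q
        if not new:
            return q
        q |= new

def is_constant(term):
    """Constants are strings; Skolem terms are tuples."""
    return isinstance(term, str)

edges = apply_inverse_rules(view_facts)
q = transitive_closure(edges)

# Keep only the facts without function symbols.
answers = {(x, y) for (x, y) in q if is_constant(x) and is_constant(y)}
print(sorted(answers))
# [('a', 'c'), ('a', 'e'), ('b', 'd'), ('c', 'e')]
```

The fixpoint derives 13 q facts in total (6 at distance 1, 4 at distance 2, 2 at distance 3, 1 at distance 4), of which the four shown are free of function symbols.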

2005lav-iii, slide 7
Note: The program above looks like an expensive way of answering a query. Why?
1. We compute a full representative of the db.
2. Although a join was already evaluated when computing the view, it is now re-evaluated.

2005lav-iii, slide 8
The algorithm (for a set of views defined by CQ's and a Datalog query P):
1. For each view rule with head variables X, replace each existential variable y in the body by f(X), using a different function symbol for each variable in each rule.
2. Invert the rules to obtain a set of rules that define the body atoms (db predicates) in terms of the views.
3. Add the program P.
4. Compute the result of the combined program on the view extensions.
5. Project on the atoms w/o function symbols.
Note: rules of P that use db predicates not mentioned in the views are dropped first; these cannot derive answers from the views.
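
The Skolemize-and-invert part of the algorithm (the first two items) can be sketched symbolically as follows. Everything about the representation is an assumption made for illustration: atoms are (predicate, arguments) pairs, variables are strings starting with an uppercase letter, and the function name f_<view>_<variable> is a hypothetical naming scheme chosen so that each existential variable of each view rule gets its own function symbol.

```python
def is_var(term):
    """Datalog convention: variables start with an uppercase letter."""
    return isinstance(term, str) and term[:1].isupper()

def invert_view(view_name, head_vars, body):
    """Return the inverse rules of one conjunctive view as (head, body) pairs.

    body is a list of (predicate, args) atoms over the db predicates.
    """
    head_vars = tuple(head_vars)
    skolem = {}
    for _pred, args in body:
        for arg in args:
            if is_var(arg) and arg not in head_vars and arg not in skolem:
                # Existential variable: one fresh function symbol per variable
                # per view rule, applied to the head variables.
                skolem[arg] = (f"f_{view_name}_{arg}", head_vars)
    view_atom = (view_name, head_vars)
    # One inverse rule per body atom: body_atom :- view_atom.
    return [((pred, tuple(skolem.get(a, a) for a in args)), view_atom)
            for pred, args in body]

def show_term(term):
    if isinstance(term, tuple):
        name, args = term
        return f"{name}({', '.join(args)})"
    return term

def show_rule(rule):
    (pred, args), (view, view_args) = rule
    left = f"{pred}({', '.join(show_term(a) for a in args)})"
    return f"{left} :- {view}({', '.join(view_args)})"

# The 2-step view v(X, Y) :- e(X, Z), e(Z, Y).
rules = invert_view("v", ["X", "Y"], [("e", ("X", "Z")), ("e", ("Z", "Y"))])
for rule in rules:
    print(show_rule(rule))
# e(X, f_v_Z(X, Y)) :- v(X, Y)
# e(f_v_Z(X, Y), Y) :- v(X, Y)
```

Adding the program P and evaluating the combined rules are not covered by this sketch; any bottom-up evaluator that handles the Skolem terms as opaque values would do.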

2005lav-iii, slide 9
Notes:
- The program is not Datalog: it contains function symbols, so it is a logic program.
- But in its evaluation function symbols will not be nested, so the evaluation on finite sources will terminate.
- It is possible to eliminate the function symbols to obtain a Datalog program (proof deferred).
- If the query is UCQ/nr-datalog, so is the query plan: the part (*) added to Q is a collection of UCQ's.
(*) The inverted program is just one non-recursive layer: the db facts are computed by UCQ queries.
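
The no-nesting claim can be checked on the running example. The sketch below is an illustration under assumed encodings (Skolem terms as tuples ("f", x, y), constants as strings); it re-derives the q facts from the view facts v(a,c), v(b,d), v(c,e) and measures the nesting depth of every term that appears. The intuition it illustrates: inverse rules only fire on view facts, which contain constants only, and the rules of the query merely copy terms around.

```python
def depth(term):
    """Nesting depth: 0 for a constant, 1 + max argument depth for f(...)."""
    if isinstance(term, tuple):
        return 1 + max(depth(t) for t in term[1:])
    return 0

view_facts = {("a", "c"), ("b", "d"), ("c", "e")}

# Inverse rules: e(X, f(X,Y)) :- v(X,Y) and e(f(X,Y), Y) :- v(X,Y).
e = set()
for x, y in view_facts:
    sk = ("f", x, y)  # assumed tuple encoding of the Skolem term f(x,y)
    e |= {(x, sk), (sk, y)}

# Query rules: q(X,Y) :- e(X,Y) and q(X,Y) :- e(X,Z), q(Z,Y).
q = set(e)
changed = True
while changed:
    derived = {(x, y) for (x, z) in e for (z2, y) in q if z == z2}
    changed = not derived <= q
    q |= derived

terms = {t for fact in q for t in fact}
max_depth = max(depth(t) for t in terms)
print(max_depth, len(terms))
```

Because the set of terms is fixed after the inverse rules have fired (here 5 constants and 3 Skolem terms), the set of derivable facts is finite and the fixpoint loop stops.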

2005lav-iii, slide 10
Thm: For every set of CQ view definitions V and Datalog query P on V, the program produced by the inverse rules algorithm is a maximally contained query plan for P.
That is, for a db D with view extensions v1,…,vn: if P' is any contained Datalog plan, then the answers of P' on v1,…,vn are contained in the answers of this program (proof deferred).
Thm: This program can be constructed in time polynomial in the size of V and P (compare to the NP-completeness of finding a rewriting in the previous approach).

2005lav-iii, slide 11
Thm: Given CQ view definitions V and a Datalog query P on V, it is undecidable whether there is an equivalent Datalog query plan (proof omitted).
But if there is one, then the plan produced by the inverse rules algorithm is one, constructible in time polynomial in the size of V and P!
(Given the program generated by the algorithm, we do not know, and cannot decide in general, whether it is an equivalent rewriting!)
What about the case where the query is a CQ or a UCQ?