

Datalog
- Inspired by the impedance mismatch in relational databases.
- Main expressive advantage: recursive queries.
- More convenient for analysis: papers look better.
- Without recursion but with negation, it is equivalent in power to relational algebra.
- Has affected real practice (e.g., recursion in SQL3, magic sets transformations).

Datalog Concepts
- Atoms
- Datalog rules, datalog programs
- EDB predicates, IDB predicates
- Conjunctive queries
- Recursion
- Built-in predicates
- Negated atoms, stratified programs
- Semantics: least fixpoint

Predicates and Atoms
- Relations are represented by predicates.
- Tuples are represented by atoms: Purchase("joe", "bob", "Nike Town", "Nike Air", 2/2/98)
- Arithmetic (built-in) atoms compare arithmetic expressions over variables and constants, e.g. sp > $50 or X < Y.
- Negated atoms: NOT Product("Linux OS", $100, "Microsoft")

Datalog Rules and Queries
A datalog rule has the following form:
    head :- atom1, atom2, ..., atomN
Examples:
    PerformingComp(name) :- Company(name, sp, c), sp > $50
    AmericanProduct(prod) :- Product(prod, pr, cat, mak), Company(mak, sp, "USA")
- All the variables in the head must appear in the body.
- A single rule can express exactly the select-from-where queries.

Datalog Terminology
- A datalog program is a set of datalog rules.
- A program with a single rule is a conjunctive query.
- We distinguish EDB predicates and IDB predicates:
  - EDB (extensional database) predicates are stored in the database and appear only in rule bodies.
  - IDB (intensional database) predicates are defined by rules and appear in both bodies and heads.

The Meaning of Datalog Rules
    AmericanProduct(prod) :- Product(prod, pr, cat, mak), Company(mak, sp, "USA")
- Consider every assignment of the variables in the body to constants in the database.
- If the assignment turns each atom in the body into a tuple that is in the database, then the corresponding tuple for the head is in the relation of the head.
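The reading of a single rule given above can be sketched as nested loops over the body relations. This is only an illustrative sketch: the Product and Company tuples below are invented sample data, and the function name american_product is hypothetical.

```python
# Hypothetical EDB contents, invented for illustration only.
product = {
    ("Nike Air", 100, "shoes", "Nike"),
    ("Walkman", 80, "audio", "Sony"),
}
company = {
    ("Nike", 60, "USA"),
    ("Sony", 45, "Japan"),
}


def american_product(product, company):
    """AmericanProduct(prod) :- Product(prod,pr,cat,mak), Company(mak,sp,"USA")."""
    result = set()
    # Each pair of tuples is one candidate assignment of the body
    # variables prod, pr, cat, mak, sp to constants in the database.
    for (prod, pr, cat, mak) in product:
        for (mak2, sp, country) in company:
            # The shared variable mak must agree in both atoms, and the
            # third Company argument must equal the constant "USA".
            if mak == mak2 and country == "USA":
                result.add((prod,))
    return result


print(american_product(product, company))
```

Only the head variable prod is kept in the result, which is the projection step of a select-from-where query.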

More Examples
SQL view:
    CREATE VIEW Seattle-view AS
    SELECT buyer, seller, product, store
    FROM Person, Purchase
    WHERE Person.city = "Seattle" AND Person.per-name = Purchase.buyer
The same view as a datalog rule:
    SeattleView(buyer, seller, product, store) :- Person(buyer, "Seattle", phone), Purchase(buyer, seller, product, store).

More Examples (negation, union)
Negation:
    SeattleView(buyer, seller, product, store) :- Person(buyer, "Seattle", phone), Purchase(buyer, seller, product, store), not Purchase(buyer, seller, product, "The Bon")
Union (two rules with the same head):
    Q5(buyer) :- Purchase(buyer, "Joe", prod, store)
    Q5(buyer) :- Purchase(buyer, seller, prod, store), Product(prod, price, cat, maker), Company(maker, sp, country), sp > 50.

Defining Views
    SeattleView(buyer, seller, product, store) :- Person(buyer, "Seattle", phone), Purchase(buyer, seller, product, store), not Purchase(buyer, seller, product, "The Bon")
    Q6(buyer) :- SeattleView(buyer, "Joe", prod, store)
    Q6(buyer) :- SeattleView(buyer, seller, prod, store), Product(prod, price, cat, maker), Company(maker, sp, country), sp > 50.

Meaning of Datalog Programs
Start with the facts in the EDB and iteratively derive facts for the IDBs. Repeat the following until you cannot derive any new facts:
- Consider every assignment of the variables in the body to constants in the database.
- If each of the atoms in the body is made true by the assignment, then add the tuple for the head to the relation of the head.

Transitive Closure
Suppose we are representing a graph by a relation Edge(X,Y):
    Edge(a,b), Edge(a,c), Edge(b,d), Edge(c,d), Edge(d,e)
(Figure: the graph on nodes a, b, c, d, e.)
I want to express the query: find all nodes reachable from a.

Recursion in Datalog
    Path(X, Y) :- Edge(X, Y)
    Path(X, Y) :- Path(X, Z), Path(Z, Y).
Semantics: evaluate the rules until a fixpoint is reached:
- Iteration #0: Edge: {(a,b), (a,c), (b,d), (c,d), (d,e)}, Path: {}
- Iteration #1: Path: {(a,b), (a,c), (b,d), (c,d), (d,e)}
- Iteration #2: Path gets the new tuples (a,d), (b,e), (c,e)
- Iteration #3: Path gets the new tuple (a,e)
- Iteration #4: Nothing changes, so we stop.
Note: the number of iterations depends on the data. It cannot be anticipated by looking only at the query!
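The iteration trace above can be reproduced with a short naive-evaluation loop. This is a minimal sketch, not an optimized evaluator: it hard-codes the two Path rules, and the function name naive_path is hypothetical.

```python
def naive_path(edge):
    """Naive evaluation of:
         Path(X,Y) :- Edge(X,Y)
         Path(X,Y) :- Path(X,Z), Path(Z,Y)
    Returns the Path relation and the number of rounds that added tuples."""
    path = set()
    rounds = 0
    while True:
        # Apply both rules to the current Path relation.
        derived = set(edge)
        derived |= {(x, y) for (x, z) in path for (z2, y) in path if z == z2}
        if derived <= path:
            # Nothing new was derived: the fixpoint has been reached.
            return path, rounds
        path |= derived
        rounds += 1


edge = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")}
path, rounds = naive_path(edge)
reachable_from_a = sorted(y for (x, y) in path if x == "a")
print(rounds, reachable_from_a)
```

On the Edge relation from the slides, three rounds add tuples and a fourth detects that nothing changes. A longer chain of edges would need more rounds, which is the data dependence the slide points out.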

Model-Theoretic Semantics
- An interpretation is an assignment of extensions (sets of tuples) to the EDB and IDB predicates.
- An interpretation of a DB is a model for it whenever it satisfies the rules, i.e., if you apply the rules, you get nothing new.
- The least fixpoint model of a datalog program is also the intersection of all of its models (for a given EDB extension).
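The "apply the rules and get nothing new" test can be written down directly. The sketch below assumes the two Path rules and the Edge facts from the transitive-closure slides; the helper names apply_rules and is_model are hypothetical.

```python
edge = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")}


def apply_rules(edge, path):
    """One application of both Path rules to the interpretation (edge, path)."""
    return set(edge) | {(x, y) for (x, z) in path for (z2, y) in path if z == z2}


def is_model(edge, path):
    # A model is closed under the rules: applying them yields nothing new.
    return apply_rules(edge, path) <= path


# The least fixpoint, computed by iterating from the empty Path relation.
least = set()
while not is_model(edge, least):
    least |= apply_rules(edge, least)

# A much larger interpretation that is also a model: every pair of nodes.
nodes = {n for e in edge for n in e}
everything = {(x, y) for x in nodes for y in nodes}

print(is_model(edge, least), is_model(edge, everything), least <= everything)
```

Both interpretations are models, but the least fixpoint is contained in the larger one, which is consistent with it being the intersection of all models.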

Built-in Predicates
Rules may include atoms with built-in predicates:
    ExpensiveProduct(X) :- Product(X,Y,P) & P > $100
But we need to restrict the use of built-in atoms in rules:
    P(X) :- R(X) & X<Y
What does this mean? Y is not bound by any relation, so it could range over infinitely many values. We could use active domain semantics, but that's problematic. Hence, we require that every variable that appears in a built-in atom also appears in a relational atom.
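The restriction on built-in atoms is a purely syntactic check. The sketch below uses an assumed encoding that is not from the slides: a rule body is a list of (predicate, arguments) pairs, variables are strings that start with an uppercase letter, and constants are numbers or lowercase strings.

```python
# Assumed set of built-in comparison predicates.
BUILTINS = {"<", ">", "<=", ">=", "=", "!="}


def is_variable(term):
    # Assumed convention: variables are strings starting with an uppercase letter.
    return isinstance(term, str) and term[:1].isupper()


def builtin_vars_are_bound(body):
    """True if every variable in a built-in atom also occurs in a relational atom."""
    relational_vars = set()
    builtin_vars = set()
    for pred, args in body:
        target = builtin_vars if pred in BUILTINS else relational_vars
        target.update(a for a in args if is_variable(a))
    return builtin_vars <= relational_vars


# ExpensiveProduct(X) :- Product(X,Y,P) & P > 100
safe_rule = [("Product", ("X", "Y", "P")), (">", ("P", 100))]
# P(X) :- R(X) & X < Y
unsafe_rule = [("R", ("X",)), ("<", ("X", "Y"))]

print(builtin_vars_are_bound(safe_rule), builtin_vars_are_bound(unsafe_rule))
```

The second rule is rejected because Y occurs only in the comparison X < Y.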

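Negated Subgoals
Rules may include negated subgoals, but in restricted forms:
    P(X,Y) :- Between(X,Y,Z) & NOT Direct(X,Z)
Bad (Y appears in the head but only in a negated subgoal of the body):
    P(X,Y) :- R(X) & NOT S(Y)
Bad but ok (Y appears only in the negated subgoal, not in the head):
    P(X) :- R(X) & NOT S(X,Y)
We'll rewrite it as:
    S'(X) :- S(X,Y)
    P(X) :- R(X) & NOT S'(X)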
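The rewrite through S' amounts to a projection followed by a set difference. A minimal sketch, with invented contents for R and S and a hypothetical function name:

```python
# Hypothetical relation contents, invented for illustration only.
r = {(1,), (2,), (3,)}
s = {(1, "a"), (1, "b"), (3, "c")}


def p_via_rewrite(r, s):
    # S'(X) :- S(X,Y)            (project away Y)
    s_prime = {(x,) for (x, _y) in s}
    # P(X) :- R(X) & NOT S'(X)   (keep R tuples with no match in S')
    return {t for t in r if t not in s_prime}


print(p_via_rewrite(r, s))
```

Under this reading, P(X) holds when X is in R and there is no Y at all with S(X,Y).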

Stratified Rules
- A predicate P depends negatively (positively) on a predicate Q if Q appears negated (unnegated) in the body of a rule defining P.
- If there is a cycle in the dependency graph that involves a negative edge, the datalog program is not stratified.
Example:
    p(X) :- r(X) & NOT q(X)
    q(X) :- r(X) & NOT p(X)
Suppose r has the tuple {1}. Then p = {1}, q = {} and p = {}, q = {1} both satisfy the rules, and nothing favors one over the other.
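The stratification test is a reachability check on the dependency graph. The sketch below assumes a hypothetical encoding in which each dependency is a (head predicate, body predicate, negated) triple; the function name is_stratified is also an assumption.

```python
def is_stratified(deps):
    """deps: set of (head, body_pred, negated) triples, one per dependency."""
    graph = {}
    for head, body, _neg in deps:
        graph.setdefault(head, set()).add(body)

    def reaches(src, dst):
        # Depth-first search along dependency edges from src to dst.
        seen, stack = set(), [src]
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(graph.get(node, ()))
        return False

    # A negative edge head -> body lies on a cycle exactly when
    # head is reachable again from body.
    return not any(neg and reaches(body, head) for head, body, neg in deps)


# p(X) :- r(X) & NOT q(X)      q(X) :- r(X) & NOT p(X)
cyclic = {("p", "r", False), ("p", "q", True), ("q", "r", False), ("q", "p", True)}
# p(X) :- r(X)                 q(X) :- s(X) & NOT p(X)
layered = {("p", "r", False), ("q", "s", False), ("q", "p", True)}

print(is_stratified(cyclic), is_stratified(layered))
```

The first program is rejected because p and q depend negatively on each other; the second has a negative edge but no cycle through it.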

Subtleties with Stratified Rules
Example:
    p(X) :- r(X)
    q(X) :- s(X) & NOT p(X).
Suppose r = {1} and s = {1,2}.
- One model: p = {1} and q = {2}.
- Another model: p = {1,2} and q = {}.
Perfect model semantics: apply the rules stratum after stratum, which here yields the first model.
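Evaluating stratum after stratum for this example can be sketched in a few lines. The helper is_model is a hypothetical name, used only to confirm that both candidate solutions satisfy the rules while the stratified evaluation produces the first one.

```python
r = {1}
s = {1, 2}

# Stratum 0: p(X) :- r(X). p is fully computed before q is considered.
p = set(r)
# Stratum 1: q(X) :- s(X) & NOT p(X). NOT p is now evaluated on the final p.
q = {x for x in s if x not in p}


def is_model(p, q, r, s):
    # Rule 1 requires r to be contained in p.
    # Rule 2 requires every x in s that is not in p to be in q.
    return r <= p and all(x in p or x in q for x in s)


print(p, q)
print(is_model({1}, {2}, r, s), is_model({1, 2}, set(), r, s))
```

Both interpretations pass the model check, so satisfying the rules alone does not pick an answer; the stratum order does.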

Deductive Databases
- General idea: some relations are stored (extensional), others are defined by datalog queries (intensional).
- Many research projects (MCC, Stanford, Wisconsin) [Great Ph.D. theses!]
- SQL3 realized that recursion is useful, and added linear recursion.
- Hard problem: optimizing datalog performance.
- Ideas from deductive databases made it into the mainstream.