Winter 2002, Arthur Keller – CS 180, 14–1: Schedule. Today: Feb. 26 (T): Datalog. Read Sections 10.1-10.2. Assignment 6 due. Feb. 28 (TH): Datalog and SQL.

Presentation transcript:

Winter 2002, Arthur Keller – CS 180
Schedule
Today: Feb. 26 (T)
  - Datalog.
  - Read Sections 10.1-10.2. Assignment 6 due.
Feb. 28 (TH)
  - Datalog and SQL Recursion, ODL.
  - Read Sections. Project Part 6 due.
Mar. 5 (T)
  - More ODL, OQL.
  - Read Sections 9.1. Assignment 7 due.
Mar. 7 (TH)
  - More OQL.
  - Read Sections

Logical Query Languages
Motivation:
1. Logical rules extend more naturally to recursive queries than does relational algebra.
  - Used in SQL recursion.
2. Logical rules form the basis for many information-integration systems and applications.

Datalog Example
Likes(drinker, beer)
Sells(bar, beer, price)
Frequents(drinker, bar)

    Happy(d) <- Frequents(d,bar) AND Likes(d,beer) AND Sells(bar,beer,p)

Above is a rule. Left side = head. Right side = body = AND of subgoals.
Head and subgoals are atoms.
  - Atom = predicate and arguments.
  - Predicate = relation name or arithmetic predicate, e.g. <.
  - Arguments are variables or constants.
Subgoals (not head) may optionally be negated by NOT.

Meaning of Rules
Head is true of its arguments if there exist values for local variables (those in body, not in head) that make all of the subgoals true.
If there are no negations or arithmetic comparisons, just natural join the subgoals and project onto the head variables.

Example
Above rule equivalent to
    Happy(d) = π_drinker(Frequents ⋈ Likes ⋈ Sells)
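
The join-and-project reading of the Happy rule can be sketched in Python. This is only an illustration: the tuples in `frequents`, `likes`, and `sells` are hypothetical sample data, not taken from the slides.

```python
# Hypothetical sample data (assumption; the slides give no tuples for these relations).
frequents = {("ann", "Joe's Bar"), ("bob", "Sue's Bar")}           # (drinker, bar)
likes = {("ann", "Bud"), ("bob", "Miller")}                        # (drinker, beer)
sells = {("Joe's Bar", "Bud", 2.50), ("Sue's Bar", "Bud", 3.00)}   # (bar, beer, price)

# Happy(d) <- Frequents(d,bar) AND Likes(d,beer) AND Sells(bar,beer,p)
# Repeated variables act as natural-join conditions; only d is projected out.
happy = {
    d
    for (d, bar) in frequents
    for (d2, beer) in likes
    if d2 == d
    for (bar2, beer2, _p) in sells
    if bar2 == bar and beer2 == beer
}
print(happy)  # {'ann'}
```

Here bar, beer, and p are the local variables: they only need to exist, so they disappear from the result.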

Evaluation of Rules
Two, dual, approaches:
1. Variable-based: Consider all possible assignments of values to variables. If all subgoals are true, add the head to the result relation.
2. Tuple-based: Consider all assignments of tuples to subgoals that make each subgoal true. If the variables are assigned consistent values, add the head to the result.

Example: Variable-Based Assignment
    S(x,y) <- R(x,z) AND R(z,y) AND NOT R(x,y)

    R =  A  B
         1  2
         2  3

Only assignments that make the first subgoal true:
1. x → 1, z → 2.
2. x → 2, z → 3.
In case (1), y → 3 makes the second subgoal true. Since (1,3) is not in R, the third subgoal is also true.
  - Thus, add (x,y) = (1,3) to relation S.
In case (2), no value of y makes the second subgoal true. Thus,

    S =  A  B
         1  3
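
A minimal Python sketch of variable-based evaluation for this rule, using the relation R from the example. Restricting the candidate values to those that occur in R is an assumption that is harmless for safe rules.

```python
from itertools import product

R = {(1, 2), (2, 3)}

# Candidate values for x, y, z: every value that appears somewhere in R.
domain = {value for row in R for value in row}

# S(x,y) <- R(x,z) AND R(z,y) AND NOT R(x,y)
S = set()
for x, y, z in product(domain, repeat=3):
    if (x, z) in R and (z, y) in R and (x, y) not in R:
        S.add((x, y))

print(S)  # {(1, 3)}
```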

Example: Tuple-Based Assignment
Trick: start with the positive (not negated), relational (not arithmetic) subgoals only.
    S(x,y) <- R(x,z) AND R(z,y) AND NOT R(x,y)

    R =  A  B
         1  2
         2  3

Four assignments of tuples to subgoals:

    R(x,z)   R(z,y)
    (1,2)    (1,2)
    (1,2)    (2,3)
    (2,3)    (1,2)
    (2,3)    (2,3)

Only the second gives a consistent value to z. That assignment also makes NOT R(x,y) true. Thus, (1,3) is the only tuple for the head.
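
The tuple-based approach can be sketched the same way: pair up tuples of R for the two positive subgoals, keep the pairs that agree on z, and test the negated subgoal last. The variable names `z1` and `z2` are illustrative.

```python
from itertools import product

R = {(1, 2), (2, 3)}

S = set()
consistent = []
# One tuple of R for R(x,z) and one for R(z,y): four combinations in total.
for (x, z1), (z2, y) in product(R, repeat=2):
    if z1 != z2:
        continue  # the two tuples disagree on z
    consistent.append(((x, z1), (z2, y)))
    if (x, y) not in R:  # NOT R(x,y)
        S.add((x, y))

print(consistent)  # [((1, 2), (2, 3))]
print(S)           # {(1, 3)}
```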

Safety
A rule can make no sense if variables appear in funny ways.
Examples
    S(x) <- R(y)
    S(x) <- NOT R(x)
    S(x) <- R(y) AND x < y
In each of these cases, the result is infinite, even if the relation R is finite.
To make sense as a database operation, we need to require three things of a variable x (= definition of safety). If x appears in either
1. The head,
2. A negated subgoal, or
3. An arithmetic comparison,
then x must also appear in a nonnegated, “ordinary” (relational) subgoal of the body.
We insist that rules be safe, henceforth.
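
The safety condition is mechanical enough to check in code. Below is a rough sketch; the function `is_safe` and its rule encoding (tuples of variable names per subgoal, constants left out) are assumptions made for illustration.

```python
def is_safe(head_vars, positive, negated=(), comparisons=()):
    """Return True if every variable in the head, in a negated subgoal, or in a
    comparison also appears in some positive relational subgoal.

    Each subgoal is given as a tuple of its variable names (constants omitted).
    """
    bound = {v for atom in positive for v in atom}
    needed = set(head_vars)
    for atom in list(negated) + list(comparisons):
        needed.update(atom)
    return needed <= bound

# The three unsafe examples from the slide:
unsafe_1 = is_safe(["x"], positive=[("y",)])                             # S(x) <- R(y)
unsafe_2 = is_safe(["x"], positive=[], negated=[("x",)])                 # S(x) <- NOT R(x)
unsafe_3 = is_safe(["x"], positive=[("y",)], comparisons=[("x", "y")])   # S(x) <- R(y) AND x < y

# A safe rule: S(x,y) <- R(x,z) AND R(z,y) AND NOT R(x,y)
safe = is_safe(["x", "y"], positive=[("x", "z"), ("z", "y")], negated=[("x", "y")])

print(unsafe_1, unsafe_2, unsafe_3, safe)  # False False False True
```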

Datalog Programs
A collection of rules is a Datalog program.
Predicates/relations divide into two classes:
  - EDB = extensional database = relation stored in DB.
  - IDB = intensional database = relation defined by one or more rules.
A predicate must be IDB or EDB, not both.
  - Thus, an IDB predicate can appear in the body or head of a rule; EDB only in the body.

Example
Convert the following SQL (find the manufacturers of the beers Joe sells) to a Datalog program.

Beers(name, manf)
Sells(bar, beer, price)

    SELECT manf
    FROM Beers
    WHERE name IN (
        SELECT beer
        FROM Sells
        WHERE bar = 'Joe''s Bar'
    );

    JoeSells(b) <- Sells('Joe''s Bar', b, p)
    Answer(m) <- JoeSells(b) AND Beers(b,m)

Note: Beers, Sells = EDB; JoeSells, Answer = IDB.
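
To see the two rules at work, here is a small Python sketch that evaluates them one after the other. The EDB tuples (beer and manufacturer names, prices) are hypothetical and serve only to exercise the rules.

```python
# Hypothetical EDB contents (assumption; not from the slides).
beers = {("Bud", "ManfA"), ("Miller", "ManfB")}                     # (name, manf)
sells = {("Joe's Bar", "Bud", 2.50), ("Sue's Bar", "Miller", 3.00)}  # (bar, beer, price)

# JoeSells(b) <- Sells('Joe''s Bar', b, p)
joe_sells = {b for (bar, b, _p) in sells if bar == "Joe's Bar"}

# Answer(m) <- JoeSells(b) AND Beers(b,m)
answer = {m for b in joe_sells for (name, m) in beers if name == b}

print(answer)  # {'ManfA'}
```

The IDB relation JoeSells plays the role of the SQL subquery, and Answer plays the role of the outer SELECT.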

Expressive Power of Datalog
Nonrecursive Datalog = (classical) relational algebra.
  - See discussion in text.
Datalog simulates SQL select-from-where without aggregation and grouping.
Recursive Datalog expresses queries that cannot be expressed in SQL.
But none of these languages have full expressive power (Turing completeness).

Recursion
IDB predicate P depends on predicate Q if there is a rule with P in the head and Q in a subgoal.
Draw a graph: nodes = IDB predicates, arc P → Q means P depends on Q.
Cycles if and only if recursive.

Recursive Example
    Sib(x,y) <- Par(x,p) AND Par(y,p) AND x <> y
    Cousin(x,y) <- Sib(x,y)
    Cousin(x,y) <- Par(x,xp) AND Par(y,yp) AND Cousin(xp,yp)
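
The dependency-graph test for recursion can be sketched as follows. The encoding of each rule as a (head, body predicates) pair and the helper `reaches` are assumptions made for this illustration.

```python
# Each rule as (head predicate, predicates used in the body); arithmetic subgoals omitted.
rules = [
    ("Sib", ["Par", "Par"]),
    ("Cousin", ["Sib"]),
    ("Cousin", ["Par", "Par", "Cousin"]),
]

idb = {head for head, _body in rules}

# Arc P -> Q when IDB predicate Q appears in the body of a rule with head P.
graph = {p: set() for p in idb}
for head, body in rules:
    graph[head].update(q for q in body if q in idb)

def reaches(start, target, graph):
    """True if target can be reached from start along one or more arcs."""
    stack, seen = list(graph[start]), set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return False

# A predicate is recursive when it lies on a cycle of the dependency graph.
recursive = {p for p in idb if reaches(p, p, graph)}
print(recursive)  # {'Cousin'}
```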

Iterative Fixed-Point Evaluates Recursive Rules
1. Start: IDB = ∅.
2. Apply rules to IDB, EDB.
3. Change to IDB? If yes, go back to step 2; if no, done.
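
A minimal Python sketch of this loop for the Sib and Cousin rules. The small `par` relation is hypothetical (it is not the Par relation of the slide's figure), and `apply_rules` is an illustrative helper name.

```python
# Hypothetical EDB: Par(child, parent). Not the relation drawn in the slides.
par = {("b", "a"), ("c", "a"), ("f", "b"), ("g", "c")}

def apply_rules(sib, cousin):
    """Apply every rule once, reading only the IDB relations from the previous round."""
    # Sib(x,y) <- Par(x,p) AND Par(y,p) AND x <> y
    new_sib = {(x, y) for (x, p) in par for (y, q) in par if p == q and x != y}
    # Cousin(x,y) <- Sib(x,y)
    new_cousin = set(sib)
    # Cousin(x,y) <- Par(x,xp) AND Par(y,yp) AND Cousin(xp,yp)
    new_cousin |= {(x, y) for (x, xp) in par for (y, yp) in par if (xp, yp) in cousin}
    return new_sib, new_cousin

sib, cousin = set(), set()   # start: IDB is empty
rounds = 0
while True:
    new_sib, new_cousin = apply_rules(sib, cousin)
    if (new_sib, new_cousin) == (sib, cousin):
        break                # no change to the IDB: done
    sib, cousin = new_sib, new_cousin
    rounds += 1

print(rounds)  # 3
```

With this data, Sib is filled in round 1, the sibling pairs reach Cousin in round 2, and the pairs (f,g) and (g,f) are added in round 3.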

Example
EDB Par =
[Figure: Par drawn as a graph over the nodes a, b, c, d, e, f, g, h, i, j, k.]
Note, because of symmetry, Sib and Cousin facts appear in pairs, so we shall mention only (x,y) when both (x,y) and (y,x) are meant.

                Sib                           Cousin
    Initial     ∅                             ∅
    Round 1     (b,c), (c,e), (g,h), (j,k)    ∅
    add:
    Round 2                                   (b,c), (c,e), (g,h), (j,k)
    add:
    Round 3                                   (f,g), (f,h), (g,i), (h,i), (i,k)
    add:
    Round 4                                   (k,k), (i,j)
    add:

Stratified Negation
Negation wrapped inside a recursion makes no sense.
Even when negation and recursion are separated, there can be ambiguity about what the rules mean, and some one meaning must be selected.
Stratified negation is an additional restraint on recursive rules (like safety) that solves both problems:
1. It rules out negation wrapped in recursion.
2. When negation is separate from recursion, it yields the intuitively correct meaning of rules (the stratified model).

Problem with Recursive Negation
Consider:
    P(x) <- Q(x) AND NOT P(x)
Q = EDB = {1,2}.
Compute IDB P iteratively?
  - Initially, P = ∅.
  - Round 1: P = {1,2}.
  - Round 2: P = ∅, etc., etc.
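
The oscillation is easy to reproduce. The sketch below applies the rule a fixed number of times (four rounds, an arbitrary choice) and records P after each round.

```python
Q = {1, 2}   # EDB
P = set()    # IDB, initially empty

history = [set(P)]
for _ in range(4):
    # P(x) <- Q(x) AND NOT P(x), evaluated against the previous round's P
    P = {x for x in Q if x not in P}
    history.append(set(P))

print(history)  # [set(), {1, 2}, set(), {1, 2}, set()]
```

P never settles, so the iterative fixed-point procedure has no point at which it can stop.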

Strata
Intuitively: stratum of an IDB predicate = maximum number of negations you can pass through on the way to an EDB predicate.
Must not be ∞ in “stratified” rules.
Define stratum graph:
  - Nodes = IDB predicates.
  - Arc P → Q if Q appears in the body of a rule with head P.
  - Label that arc “–” if Q is in a negated subgoal.

Example
    P(x) <- Q(x) AND NOT P(x)
[Figure: stratum graph with the single node P and an arc from P to itself labeled “–”.]

Example
Which target nodes cannot be reached from any source node?
    Reach(x) <- Source(x)
    Reach(x) <- Reach(y) AND Arc(y,x)
    NoReach(x) <- Target(x) AND NOT Reach(x)
[Figure: stratum graph with an arc NoReach → Reach labeled “–” and an unlabeled arc from Reach to itself.]

Computing Strata
Stratum of an IDB predicate A = maximum number of “–” arcs on any path from A in the stratum graph.
Examples
  - For first example, stratum of P is ∞.
  - For second example, stratum of Reach is 0; stratum of NoReach is 1.

Stratified Negation
A Datalog program is stratified if every IDB predicate has a finite stratum.

Stratified Model
If a Datalog program is stratified, we can compute the relations for the IDB predicates lowest-stratum-first.
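
One way to compute strata is to raise each predicate's stratum until nothing changes, and to treat a value larger than the number of IDB predicates as ∞. That is only possible when a cycle contains a “–” arc. The function `strata` and its arc encoding are assumptions for this sketch.

```python
import math

def strata(idb, arcs):
    """idb: list of IDB predicate names.
    arcs: (p, q, negated) triples, meaning q appears in the body of a rule
    with head p, negated if the subgoal is under NOT.
    Returns a dict mapping each predicate to its stratum (math.inf if unbounded).
    """
    stratum = {p: 0 for p in idb}
    changed = True
    while changed:
        changed = False
        for p, q, negated in arcs:
            if q not in stratum:
                continue  # EDB predicates are not nodes of the stratum graph
            need = stratum[q] + (1 if negated else 0)
            if need > stratum[p]:
                # More "-" arcs than predicates means a cycle through a "-" arc.
                stratum[p] = math.inf if need > len(idb) else need
                changed = True
    return stratum

# First example: P(x) <- Q(x) AND NOT P(x)
first = strata(["P"], [("P", "P", True)])

# Second example: the Reach / NoReach rules
second = strata(["Reach", "NoReach"],
                [("Reach", "Reach", False), ("NoReach", "Reach", True)])

print(first)   # {'P': inf}
print(second)  # {'Reach': 0, 'NoReach': 1}
```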

Example
    Reach(x) <- Source(x)
    Reach(x) <- Reach(y) AND Arc(y,x)
    NoReach(x) <- Target(x) AND NOT Reach(x)
EDB:
  - Source = {1}.
  - Arc = {(1,2), (3,4), (4,3)}.
  - Target = {2,3}.
[Figure: the arc graph on nodes 1 to 4, with the source and target nodes marked.]
First compute Reach = {1,2} (stratum 0).
Next compute NoReach = {3}.
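
Lowest-stratum-first evaluation of this example can be sketched in Python with the EDB from the slide: first run Reach to its fixed point, then evaluate NoReach once against the finished Reach.

```python
source = {1}
arc = {(1, 2), (3, 4), (4, 3)}
target = {2, 3}

# Stratum 0: Reach, computed by fixed-point iteration.
#   Reach(x) <- Source(x)
#   Reach(x) <- Reach(y) AND Arc(y,x)
reach = set()
while True:
    new = source | {x for (y, x) in arc if y in reach}
    if new <= reach:
        break          # nothing new was derived
    reach |= new

# Stratum 1: NoReach, evaluated once Reach is complete.
#   NoReach(x) <- Target(x) AND NOT Reach(x)
no_reach = {x for x in target if x not in reach}

print(reach)     # {1, 2}
print(no_reach)  # {3}
```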

Is the Stratified Solution “Obvious”?
Not really.
There is another model that makes the rules true no matter what values we substitute for the variables.
  - Reach = {1,2,3,4}.
  - NoReach = ∅.
Remember: the only way to make a Datalog rule false is to find values for the variables that make the body true and the head false.
  - For this model, the heads of the rules for Reach are true for all values, and in the rule for NoReach the subgoal NOT Reach(x) assures that the body cannot be true.
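
Both candidate models can be checked directly against the three rules. In the sketch below, `is_model` is an illustrative helper that looks for a variable assignment making some body true and its head false.

```python
source = {1}
arc = {(1, 2), (3, 4), (4, 3)}
target = {2, 3}

def is_model(reach, no_reach):
    """True if no rule has a true body together with a false head."""
    # Reach(x) <- Source(x)
    rule1 = all(x in reach for x in source)
    # Reach(x) <- Reach(y) AND Arc(y,x)
    rule2 = all(x in reach for (y, x) in arc if y in reach)
    # NoReach(x) <- Target(x) AND NOT Reach(x)
    rule3 = all(x in no_reach for x in target if x not in reach)
    return rule1 and rule2 and rule3

stratified_ok = is_model({1, 2}, {3})            # the stratified model
alternative_ok = is_model({1, 2, 3, 4}, set())   # the other model from the slide
too_small_ok = is_model({1}, set())              # violates the second rule

print(stratified_ok, alternative_ok, too_small_ok)  # True True False
```

Both models satisfy the rules, so satisfying the rules alone does not single out the stratified answer.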