

Extended Relational Algebra
Lu Chaojun, SJTU

Bag Semantics
A relation (in SQL, at least) is really a bag (or multiset).
– It may contain the same tuple more than once.
– There is no specified order (unlike a list).
Select, project, and join work for bags as well as sets.
– They work on a tuple-by-tuple basis and do not eliminate duplicates.

Why Bags?
Efficient implementation
– e.g. projection and union need no duplicate-elimination step
– Q: How to eliminate duplicates?
Some queries need bags
– e.g. aggregation: finding the average grade must count repeated grades

Bag Union
R ∪ S: Sum the number of times an element appears in the two bags, i.e. if t appears n times in R and m times in S, then t appears n + m times in R ∪ S.
Example: {1,2,1} ∪ {1,2,3} = {1,1,1,2,2,3}.

Bag Intersection
R ∩ S: Take the minimum of the number of occurrences in each bag, i.e. t appears min(n, m) times in R ∩ S.
Example: {1,2,1} ∩ {1,2,3,3} = {1,2}.

Bag Difference
R − S: Proper-subtract the number of occurrences in the two bags, i.e. t appears max(0, n − m) times in R − S.
Example: {1,2,1} − {1,2,3,3} = {1}.
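
The three bag operations (union, intersection, difference) can be checked with a minimal Python sketch. It assumes a bag is represented as a collections.Counter mapping each element to its multiplicity; the helper names bag and as_sorted_list are illustrative only and do not come from the slides.

```python
from collections import Counter

def bag(*items):
    """Represent a bag as a Counter: element -> number of occurrences."""
    return Counter(items)

def as_sorted_list(b):
    """Expand a bag into a sorted list so that duplicates are visible."""
    return sorted(b.elements())

R = bag(1, 2, 1)
S1 = bag(1, 2, 3)
S2 = bag(1, 2, 3, 3)

union = R + S1          # multiplicities add: n + m
intersection = R & S2   # multiplicities take the minimum: min(n, m)
difference = R - S2     # proper subtraction: max(0, n - m)

print(as_sorted_list(union))         # [1, 1, 1, 2, 2, 3]
print(as_sorted_list(intersection))  # [1, 2]
print(as_sorted_list(difference))    # [1]
```

Counter happens to implement exactly these multiplicity rules for +, & and -, which is why it is a convenient stand-in for a bag here.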

Other Operators on Bags
Projection, selection, product, join
– No duplicate elimination
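
A short sketch of selection and projection on a bag, assuming rows are Python dicts held in a list so that repeated rows survive. The relation SC, its sample rows, and the helper names select and project are made up for illustration.

```python
def select(rows, predicate):
    """Bag selection: keep every row that satisfies the predicate, copies included."""
    return [row for row in rows if predicate(row)]

def project(rows, attrs):
    """Bag projection: keep the listed attributes and do not merge equal results."""
    return [tuple(row[a] for a in attrs) for row in rows]

# Hypothetical sample data.
SC = [
    {"sno": "s1", "cno": "c1", "grade": 90},
    {"sno": "s2", "cno": "c1", "grade": 90},
    {"sno": "s3", "cno": "c2", "grade": 75},
]

grades = project(SC, ["grade"])
high = project(select(SC, lambda row: row["grade"] >= 80), ["cno"])

print(grades)  # [(90,), (90,), (75,)]
print(high)    # [('c1',), ('c1',)]
```

Under set semantics both results would shrink to a single copy of each tuple; under bag semantics the duplicates stay.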

Extensions to Relational Model
Not part of the formal relational model, but they appear in real query languages like SQL.
– Modification: insert, delete, update
– Aggregation: count, sum, average
– Views
– Null values

Extended RA
– Duplicate-elimination operator
– Sorting operator
– Extended projection
– Grouping-and-aggregation operator
– Outerjoin operator

Duplicate Elimination
δ(R) = relation with one copy of each tuple that appears one or more times in R.
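
Duplicate elimination can be sketched in one line of Python, assuming tuples are hashable and the bag is a list. The function name delta and the sample rows are illustrative.

```python
def delta(rows):
    """Keep one copy of each tuple, in order of first appearance."""
    return list(dict.fromkeys(rows))

# Hypothetical bag with a tuple repeated three times.
R = [(1, 2), (3, 4), (1, 2), (1, 2)]
result = delta(R)
print(result)  # [(1, 2), (3, 4)]
```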

Aggregation Operators
These are not relational operators; rather, they summarize the values of a column as a single value.
Five standard operators: Sum, Average, Count, Min, and Max.

Grouping Operator
γ_L(R), where L is a list of elements that are either
– Individual (grouping) attributes, or
– Of the form θ(A), where θ is an aggregation operator and A the attribute to which it is applied.
Example: γ_{sno, AVG(grade)}(SC)

Grouping Operator (cont.)
γ_L(R) is computed by:
1. Group R according to all the grouping attributes on list L.
2. Within each group, compute θ(A) for each element θ(A) on list L.
3. The result is the relation that consists of one tuple for each group. The components of that tuple are the values associated with each element of L for that group.
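
The three steps above can be sketched as follows, assuming rows are dicts. The function gamma, the helper avg, and the sample SC rows are illustrative and not part of the slides; the example mirrors grouping SC by sno and averaging grade.

```python
from collections import defaultdict

def avg(values):
    return sum(values) / len(values)

def gamma(rows, group_attrs, aggregates):
    """aggregates maps an output column name to (aggregation function, attribute)."""
    # Step 1: group rows by the values of the grouping attributes.
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[a] for a in group_attrs)
        groups[key].append(row)
    # Steps 2 and 3: one output tuple per group, with each aggregate computed.
    result = []
    for key, members in groups.items():
        out = dict(zip(group_attrs, key))
        for name, (func, attr) in aggregates.items():
            out[name] = func([m[attr] for m in members])
        result.append(out)
    return result

# Hypothetical sample data.
SC = [
    {"sno": "s1", "cno": "c1", "grade": 90},
    {"sno": "s1", "cno": "c2", "grade": 80},
    {"sno": "s2", "cno": "c1", "grade": 70},
]

result = gamma(SC, ["sno"], {"AVG(grade)": (avg, "grade")})
print(result)
# [{'sno': 's1', 'AVG(grade)': 85.0}, {'sno': 's2', 'AVG(grade)': 70.0}]
```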

Extended Projection
Allow the columns in the projection to be functions of one or more columns in the argument relation.
Example: π_{name, 2011−age}(Student)
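
A sketch of extended projection, assuming each output column is given as a Python function of the input row. The function extended_project and the Student rows are hypothetical; the computed column follows the 2011 − age example.

```python
def extended_project(rows, columns):
    """columns maps an output column name to a function of the input row."""
    return [{name: f(row) for name, f in columns.items()} for row in rows]

# Hypothetical sample data.
Student = [
    {"name": "Li", "age": 20},
    {"name": "Wang", "age": 22},
]

result = extended_project(
    Student,
    {
        "name": lambda row: row["name"],
        "2011-age": lambda row: 2011 - row["age"],  # computed column
    },
)
print(result)
# [{'name': 'Li', '2011-age': 1991}, {'name': 'Wang', '2011-age': 1989}]
```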

Sorting
τ_L(R) = list of tuples of R, ordered according to the attributes on list L.
Note that the result type is outside the normal types (set or bag) for relational algebra.
– Consequence: τ cannot be followed by other relational operators.
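
A minimal sketch of the sorting operator, assuming rows are dicts and the attribute list gives the sort keys in priority order. The name tau and the sample rows are illustrative. The result is an ordered Python list, which matches the point that the output is neither a set nor a bag.

```python
def tau(rows, attrs):
    """Return the rows as a list ordered by the listed attributes."""
    return sorted(rows, key=lambda row: tuple(row[a] for a in attrs))

# Hypothetical sample data.
Student = [
    {"name": "Wang", "age": 22},
    {"name": "Li", "age": 20},
    {"name": "Chen", "age": 22},
]

ordered = tau(Student, ["age", "name"])
print([row["name"] for row in ordered])  # ['Li', 'Chen', 'Wang']
```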

Outerjoin
The normal join can lose information, because a tuple that doesn't join with any tuple from the other relation becomes dangling.
The null value can be used to pad dangling tuples so they appear in the join.
Outerjoin operator: ⋈° (the join symbol marked with a small circle)
Variations: theta-outerjoin, left- and right-outerjoin (pad only dangling tuples from the left (resp., right) relation).
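
A sketch of the natural outerjoin with padding, assuming rows are dicts, None plays the role of the null value, and the join is on the attributes the two schemas share. The function outerjoin, its keep_left/keep_right flags, and the sample relations are hypothetical.

```python
def outerjoin(R, S, r_attrs, s_attrs, keep_left=True, keep_right=True):
    """Natural join of R and S; dangling tuples are padded with None."""
    common = [a for a in r_attrs if a in s_attrs]
    all_attrs = r_attrs + [a for a in s_attrs if a not in common]
    result = []
    matched_s = set()
    for r in R:
        found = False
        for j, s in enumerate(S):
            if all(r[a] == s[a] for a in common):
                result.append({**r, **s})
                matched_s.add(j)
                found = True
        if not found and keep_left:
            # Dangling tuple from the left: pad the missing attributes.
            result.append({a: r.get(a) for a in all_attrs})
    if keep_right:
        for j, s in enumerate(S):
            if j not in matched_s:
                # Dangling tuple from the right: pad the missing attributes.
                result.append({a: s.get(a) for a in all_attrs})
    return result

# Hypothetical sample data: R(A, B) and S(B, C).
R = [{"A": 1, "B": 2}, {"A": 4, "B": 5}]
S = [{"B": 2, "C": 3}, {"B": 6, "C": 7}]

full = outerjoin(R, S, ["A", "B"], ["B", "C"])
left = outerjoin(R, S, ["A", "B"], ["B", "C"], keep_right=False)
print(full)
print(left)
```

Setting keep_right=False gives the left outerjoin and keep_left=False the right outerjoin; with both flags off the function reduces to the ordinary natural join.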

A Logic for Relations: Datalog

Introduction
A query language for the relational model may be based on
– Algebra: relational algebra
– Logic: relational calculus, e.g. Datalog (more natural for recursive queries)

Predicates and Atoms
RDB vs. Datalog:
RDB                    Datalog
relation R( )          predicate R( )
attributes (tuples)    arguments x
schema R(X)            (relational) atom R(x)
tuple t ∈ R            R(t) is TRUE
– R(x) is a boolean-valued function if x contains variables; otherwise it is a proposition.

Arithmetic Atoms
Comparison between two arithmetic expressions: exp1 θ exp2, where θ is a comparison operator
– Can be read as a predicate θ(exp1, exp2)
– Its relation is infinite and unchanging

Datalog Rules
Example:
Happy(sno) ← S(sno,n,a,d) AND SC(sno,cno,g) AND g >= 95 AND C(cno,cn) AND cn = 'Database'
Rules: Head ← Body
– Head: relational atom
– Body: AND of subgoals
   Subgoal: atom or NOT atom
   Atom: P(arg), where P is a relation name or an arithmetic predicate; arg may be a variable or a constant
– ← is read "if"; it is also written :-
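
The Happy rule can be evaluated by nested iteration over its relational subgoals, with shared variables becoming equality tests. This is a sketch only: the contents of S, SC, and C are made-up sample data, and reading the columns as student, enrollment, and course attributes is an assumption.

```python
# Hypothetical sample data for S(sno, n, a, d), SC(sno, cno, g), C(cno, cn).
S = [("s1", "Ann", 20, "CS"), ("s2", "Bob", 21, "EE")]
SC = [("s1", "c1", 97), ("s2", "c1", 80), ("s2", "c2", 99)]
C = [("c1", "Database"), ("c2", "Compilers")]

# Happy(sno) <- S(sno,n,a,d) AND SC(sno,cno,g) AND g >= 95
#               AND C(cno,cn) AND cn = 'Database'
happy = {
    sno
    for (sno, n, a, d) in S
    for (sno2, cno, g) in SC
    if sno2 == sno and g >= 95        # shared variable sno, arithmetic subgoal
    for (cno2, cn) in C
    if cno2 == cno and cn == "Database"  # shared variable cno, arithmetic subgoal
}
print(happy)  # {'s1'}
```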

Datalog Rules (cont.)
Query: a collection of one or more rules
Result: a relation appearing in rule heads
– Designate the intended answer when there is more than one relation in the rule heads

Meaning of Datalog Rules
Meaning I:
– Assign all possible values to the variables in the rule.
– If an assignment makes all the subgoals TRUE, then the values assigned to the head's variables form a tuple of the result relation.
Meaning II:
– Consider consistent assignments of tuples to the nonnegated relational subgoals (see safety).
– Then check the negated relational subgoals and the arithmetic subgoals to see whether the assignment of values to variables makes them all TRUE. If so, a tuple is added to the result relation.

Example: Meaning I
S(x,y) ← R(x,z) AND R(z,y) AND NOT R(x,y)
R: A B
   1 2
   2 3
Consider all possible assignments:
1. x=1, z=2 makes R(x,z) TRUE; y=3 makes R(z,y) TRUE; NOT R(x,y) is TRUE; thus add (1,3) to S.
2. x=2, z=3 makes R(x,z) TRUE; no y makes R(z,y) TRUE.
S: C D
   1 3
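
Meaning I for this rule can be sketched as a brute-force loop over value assignments. One assumption is made explicit: candidate values are limited to those that occur in R, which loses nothing here because every variable appears in a nonnegated relational subgoal.

```python
from itertools import product

R = {(1, 2), (2, 3)}

# Candidate values: everything that occurs somewhere in R (an assumption
# that is harmless for a safe rule).
domain = sorted({value for t in R for value in t})

S = set()
for x, y, z in product(domain, repeat=3):
    # S(x,y) <- R(x,z) AND R(z,y) AND NOT R(x,y)
    if (x, z) in R and (z, y) in R and (x, y) not in R:
        S.add((x, y))

print(S)  # {(1, 3)}
```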

Example: Meaning II
S(x,y) ← R(x,z) AND R(z,y) AND NOT R(x,y)
R: A B
t1 1 2
t2 2 3
Consider consistent assignments of tuples:
1. t1 for R(x,z), t1 for R(z,y)
2. t1 for R(x,z), t2 for R(z,y)
3. t2 for R(x,z), t1 for R(z,y)
4. t2 for R(x,z), t2 for R(z,y)
Only case 2 is a consistent assignment.
S: C D
   1 3
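
Meaning II for the same rule can be sketched by assigning tuples of R to the two nonnegated subgoals and keeping only the pairs that agree on z. The relation R = {(1,2), (2,3)} is taken from the Meaning I example; the variable names t_xz and t_zy are illustrative.

```python
R = [(1, 2), (2, 3)]  # t1 = (1, 2), t2 = (2, 3)

consistent = []
S = set()
for t_xz in R:            # tuple assigned to the subgoal R(x, z)
    for t_zy in R:        # tuple assigned to the subgoal R(z, y)
        x, z = t_xz
        z_other, y = t_zy
        if z != z_other:
            continue      # the two tuples disagree on z: not consistent
        consistent.append((t_xz, t_zy))
        if (x, y) not in R:   # negated subgoal NOT R(x, y)
            S.add((x, y))

print(consistent)  # [((1, 2), (2, 3))]
print(S)           # {(1, 3)}
```

Only four tuple pairs are examined, compared with 27 value assignments in the brute-force reading, which is the practical appeal of Meaning II.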

Safety
Every variable in the rule must appear in some nonnegated relational subgoal.
– This makes the result a finite relation.
Example: safety violations
1. S(x) ← R(y)              x is not in any subgoal
2. S(x) ← NOT R(x)          x is not in a nonnegated subgoal
3. S(x) ← R(y) AND x < y    x is not in a relational subgoal
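
The safety condition can be checked mechanically. The sketch below assumes a rule is encoded as a list of head variables plus a list of (kind, variables) subgoals, where kind is "rel" for a nonnegated relational subgoal, "neg" for a negated one, and "arith" for an arithmetic one. This encoding and the name is_safe are hypothetical.

```python
def is_safe(head_vars, subgoals):
    """A rule is safe if every variable occurs in a nonnegated relational subgoal."""
    limited = set()
    for kind, variables in subgoals:
        if kind == "rel":
            limited |= set(variables)
    all_vars = set(head_vars)
    for _, variables in subgoals:
        all_vars |= set(variables)
    return all_vars <= limited

# The three violations from the slide.
print(is_safe(["x"], [("rel", ["y"])]))                          # False
print(is_safe(["x"], [("neg", ["x"])]))                          # False
print(is_safe(["x"], [("rel", ["y"]), ("arith", ["x", "y"])]))   # False

# A safe variant (illustrative): S(x) <- R(x, y) AND x < y
print(is_safe(["x"], [("rel", ["x", "y"]), ("arith", ["x", "y"])]))  # True
```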

Datalog Program: Query
A collection of rules.
Predicates/relations are divided into two classes:
– Extensional (EDB) relations/predicates: stored in the database
– Intensional (IDB) relations/predicates: defined by rules
EDB predicates can't appear in a head, only in bodies; IDB predicates can appear in heads, bodies, or both.

Datalog Rules Applied to Bags
When there are no negated relational subgoals:
– Meaning I for evaluating Datalog rules applies to bags as well as sets.
– But for bags, Meaning II is simpler to evaluate.
When there are negated relational subgoals:
– There is no clearly defined meaning under the bag model.

From RA to Datalog
R ∩ S:    I(x) ← R(x) AND S(x)
R ∪ S:    I(x) ← R(x)
          I(x) ← S(x)
R − S:    I(x) ← R(x) AND NOT S(x)
π_A(R):   I(a) ← R(a,b)

From RA to Datalog (cont.)
σ_F(R):            I(x) ← R(x) AND F
σ_{C1 AND C2}(R):  I(x) ← R(x) AND C1 AND C2
σ_{C1 OR C2}(R):   I(x) ← R(x) AND C1
                   I(x) ← R(x) AND C2
R × S:             I(x,y) ← R(x) AND S(y)
R ⋈ S:             I(x,y,z) ← R(x,y) AND S(y,z)
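
The translation of a selection with OR into two rules with the same head can be checked on a small example. The sketch assumes set semantics, a unary relation R, and two made-up conditions C1 and C2; each rule contributes tuples to I, and the result is the union of the contributions.

```python
# Hypothetical unary relation and conditions.
R = {1, 2, 3, 12, 15}

def C1(x):
    return x > 10

def C2(x):
    return x % 2 == 0

I = set()
I |= {x for x in R if C1(x)}   # I(x) <- R(x) AND C1
I |= {x for x in R if C2(x)}   # I(x) <- R(x) AND C2

direct = {x for x in R if C1(x) or C2(x)}   # selection with C1 OR C2

print(sorted(I))    # [2, 12, 15]
print(I == direct)  # True
```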

Multiple Operations in Datalog
Create IDB predicates for intermediate relations.
Example:
A(x,y,z) ← R(x,y,z) AND x > 10
B(x,y,z) ← R(x,y,z) AND y = 'ok'
C(x,y,z) ← A(x,y,z) AND B(x,y,z)
D(x,z) ← C(x,y,z)
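
The four rules can be evaluated one after another, each IDB predicate becoming an intermediate set. This is a sketch under set semantics; the contents of R are made-up sample data.

```python
# Hypothetical EDB relation R(x, y, z).
R = {(5, "ok", "a"), (11, "ok", "b"), (12, "no", "c"), (20, "ok", "d")}

A = {(x, y, z) for (x, y, z) in R if x > 10}      # A(x,y,z) <- R(x,y,z) AND x > 10
B = {(x, y, z) for (x, y, z) in R if y == "ok"}   # B(x,y,z) <- R(x,y,z) AND y = 'ok'
C = A & B                                         # C(x,y,z) <- A(x,y,z) AND B(x,y,z)
D = {(x, z) for (x, y, z) in C}                   # D(x,z)   <- C(x,y,z)

print(sorted(D))  # [(11, 'b'), (20, 'd')]
```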

Expressive Power of Datalog
– Non-recursive Datalog = RA
– Datalog simulates SQL SELECT-FROM-WHERE without aggregation and grouping
– Recursive Datalog is more powerful than RA and than SQL without recursive queries
– None of them is Turing-complete

End