Relational Operations on Bags Extended Operators of Relational Algebra.

Relational Algebra on Bags A bag is like a set, but an element may appear more than once. –Multiset is another name for “bag.” Example: –{1,2,1,3} is a bag. –{1,2,3} is also a bag that happens to be a set. Bags also resemble lists, but order in a bag is unimportant. –Example: {1,2,1} = {1,1,2} as bags, but [1,2,1] != [1,1,2] as lists.
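The distinction between bags, sets, and lists can be sketched in Python. This is only an illustration: it assumes `collections.Counter` as a stand-in for a bag (it stores each element with its multiplicity); the variable names are made up for the example.

```python
from collections import Counter

# A Counter maps each element to its multiplicity, which is what a bag records.
bag_a = Counter([1, 2, 1])
bag_b = Counter([1, 1, 2])

same_as_bags = bag_a == bag_b                   # order is ignored, counts matter
same_as_lists = [1, 2, 1] == [1, 1, 2]          # lists compare position by position
same_as_sets = set([1, 2, 1, 3]) == {1, 2, 3}   # a set drops the duplicate 1

print(same_as_bags, same_as_lists, same_as_sets)  # True False True
```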

Why bags? SQL is actually a bag language. SQL will eliminate duplicates, but usually only if you ask it to do so explicitly. Some operations, like projection or union, are much more efficient on bags than sets. –Why?

Operations on Bags Selection applies to each tuple, so its effect on bags is like its effect on sets. Projection also applies to each tuple, but as a bag operator, we do not eliminate duplicates. Products and joins are done on each pair of tuples, so duplicates in bags have no effect on how we operate.

Example: Bag Selection R(A,B), S(B,C). σ_{A+B<5}(R) has attributes A, B and keeps every copy of each tuple of R that satisfies A+B<5, such as (1,2).

Example: Bag Projection R(A,B), S(B,C). π_A(R) has the single attribute A. Bag projection always yields the same number of tuples as the original relation.

Example: Bag Product R(A,B), S(B,C). R × S has attributes A, R.B, S.B, C. Each copy of the tuple (1,2) of R is paired with each tuple of S, so duplicates have no effect on the way we compute the product.
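A minimal sketch of bag selection, projection, and product, treating a relation as a Python list of tuples so that duplicates are kept. The sample contents of R and S are assumptions chosen for illustration (the slide tables are not reproduced here), and the helper names `select`, `project`, and `bag_product` are hypothetical.

```python
from itertools import product

# Assumed sample data, for illustration only.
R = [(1, 2), (5, 6), (1, 2)]   # schema R(A, B); (1, 2) appears twice
S = [(3, 4), (7, 8)]           # schema S(B, C)

def select(rel, pred):
    """Bag selection: test each tuple independently and keep every copy."""
    return [t for t in rel if pred(t)]

def project(rel, *cols):
    """Bag projection: keep the chosen column positions, no deduplication."""
    return [tuple(t[c] for c in cols) for t in rel]

def bag_product(r, s):
    """Bag product: pair every tuple of r with every tuple of s."""
    return [a + b for a, b in product(r, s)]

selected = select(R, lambda t: t[0] + t[1] < 5)   # both copies of (1, 2)
projected = project(R, 0)                         # three one-column tuples
paired = bag_product(R, S)                        # 3 * 2 = 6 tuples
```

Under these assumptions the projection has as many tuples as R, and each copy of (1, 2) contributes its own pairs to the product.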

Bag Union Union, intersection, and difference need new definitions for bags. An element appears in the union of two bags as many times as the sum of the number of times it appears in each bag. Example: {1,2,1} ∪ {1,1,2,3,1} = {1,1,1,1,1,2,2,3}

Bag Intersection An element appears in the intersection of two bags as many times as the minimum of the number of times it appears in either. Example: {1,2,1} ∩ {1,2,3} = {1,2}.

Bag Difference An element appears in difference A – B of bags as many times as it appears in A, minus the number of times it appears in B. –But never less than 0 times. Example: {1,2,1} – {1,2,3} = {1}.
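The three bag definitions can be checked against the examples from the slides with a short sketch. It assumes `collections.Counter` as the bag representation; its `+`, `&`, and `-` operators happen to match the sum, minimum, and truncated-subtraction rules.

```python
from collections import Counter

A = Counter([1, 2, 1])

union = A + Counter([1, 1, 2, 3, 1])      # multiplicities add
intersection = A & Counter([1, 2, 3])     # minimum multiplicity
difference = A - Counter([1, 2, 3])       # subtract, dropping counts <= 0

print(sorted(union.elements()))           # [1, 1, 1, 1, 1, 2, 2, 3]
print(sorted(intersection.elements()))    # [1, 2]
print(sorted(difference.elements()))      # [1]
```

Note that `Counter`'s `|` operator takes the maximum multiplicity, so it is not the bag union defined here.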

Beware: Bag Laws != Set Laws Not all algebraic laws that hold for sets also hold for bags. Example: set union is idempotent, meaning that S ∪ S = S. However, for bags, if x appears n times in S, then it appears 2n times in S ∪ S. Thus S ∪ S != S in general.
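A small check of the idempotence example, again assuming `collections.Counter` as the bag representation. The bag contents are made up for the illustration.

```python
from collections import Counter

S = Counter({"x": 3, "y": 1})    # x appears n = 3 times

set_union = set(S) | set(S)      # set union of the distinct elements
bag_union = S + S                # bag union: multiplicities add

print(set_union == set(S))       # True: set union is idempotent
print(bag_union == S)            # False: bag union is not
print(bag_union["x"])            # 6, that is 2n
```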

The Extended Algebra 1. δ: eliminate duplicates from bags. 2. τ: sort tuples. 3. Extended projection: arithmetic, duplication of columns. 4. γ: grouping and aggregation. 5. OUTERJOIN: avoids losing “dangling tuples” = tuples that do not join with anything.

Example: Duplicate Elimination R1 := δ(R2). R1 consists of one copy of each tuple that appears in R2 one or more times. For R(A,B), δ(R) has the same attributes A, B.
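Duplicate elimination can be sketched as follows, with a relation held as a list of tuples. The function name `delta` and the sample relation are assumptions for illustration.

```python
def delta(rel):
    """Keep one copy of each tuple, in first-seen order."""
    # dict keys are unique and preserve insertion order.
    return list(dict.fromkeys(rel))

R2 = [(1, 2), (3, 4), (1, 2)]   # assumed sample bag
R1 = delta(R2)
print(R1)                       # [(1, 2), (3, 4)]
```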

Sorting R1 := τ_L(R2). –L is a list of some of the attributes of R2. R1 is the list of tuples of R2 sorted first on the value of the first attribute on L, then on the second attribute of L, and so on. τ is the only operator whose result is neither a set nor a bag.
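One way to sketch the sorting operator, assuming attributes are identified by column position and the relation is a list of tuples. The name `tau` and the sample data are hypothetical.

```python
from operator import itemgetter

def tau(rel, *cols):
    """Sort tuples on the listed column positions, leftmost position first.

    At least one column position must be given.
    """
    return sorted(rel, key=itemgetter(*cols))

R2 = [(3, 4), (1, 2), (5, 2), (1, 1)]   # assumed sample, schema (A, B)

by_b_then_a = tau(R2, 1, 0)   # sort on B, ties broken on A
by_a = tau(R2, 0)             # sort on A only; ties keep input order
```

The result is a list, which matches the remark that sorting yields neither a set nor a bag.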

Example: Extended Projection For R(A,B), π_{A+B→C, A→A1, A→A2}(R) has attributes C, A1, A2. Using the same π_L operator, we allow the list L to contain arbitrary expressions involving attributes, for example: 1. Arithmetic on attributes, e.g., A+B. 2. Duplicate occurrences of the same attribute.
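A sketch of extended projection, assuming tuples are represented as dictionaries keyed by attribute name. The helper `extended_project` and the sample rows are illustrative assumptions.

```python
R = [{"A": 1, "B": 2}, {"A": 3, "B": 4}]   # assumed sample relation R(A, B)

def extended_project(rel, **exprs):
    """Evaluate each named expression on every input tuple."""
    return [{name: f(t) for name, f in exprs.items()} for t in rel]

# Mirrors the projection list A+B -> C, A -> A1, A -> A2.
result = extended_project(
    R,
    C=lambda t: t["A"] + t["B"],   # arithmetic on attributes
    A1=lambda t: t["A"],           # first copy of A
    A2=lambda t: t["A"],           # second copy of A
)
print(result)
```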

Aggregation Operators They apply to entire columns of a table and produce a single result. The most important examples: –SUM –AVG –COUNT –MIN –MAX

Example: Aggregation For R(A,B): SUM(A) = 7, COUNT(A) = 3, MAX(B) = 4, MIN(B) = 2, AVG(B) = 3.
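The five aggregates can be computed with Python built-ins. The contents of R below are an assumption: they were chosen only because they are consistent with the results stated in the example.

```python
from statistics import mean

# Assumed contents of R(A, B), picked to agree with the stated results.
R = [(1, 3), (3, 4), (3, 2)]

col_a = [a for a, _ in R]
col_b = [b for _, b in R]

aggregates = {
    "SUM(A)": sum(col_a),
    "COUNT(A)": len(col_a),   # counts duplicates too, as in a bag
    "MAX(B)": max(col_b),
    "MIN(B)": min(col_b),
    "AVG(B)": mean(col_b),
}
print(aggregates)
```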

Grouping Operator R1 := γ_L(R2). L is a list of elements that are either: 1. Individual (grouping) attributes. 2. AGG(A), where AGG is one of the aggregation operators and A is an attribute.

γ_L(R) Group R according to all the grouping attributes on list L. –That is, form one group for each distinct list of values for those attributes in R. Within each group, compute AGG(A) for each aggregation on list L. The result has the grouping attributes and the aggregations as attributes, with one tuple for each list of values for the grouping attributes, holding that group’s aggregations.

Example: Grouping/Aggregation For R(A,B,C), γ_{A,B,AVG(C)}(R) = ?? First, group R on (A,B), keeping attributes A, B, C. Then, average C within groups; the result has attributes A, B, AVG(C).
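The two steps (group, then average) can be sketched as below. The sample relation and the function name `gamma_avg` are assumptions for illustration, not the data from the slide.

```python
from collections import defaultdict
from statistics import mean

R = [(1, 2, 3), (4, 5, 6), (1, 2, 5)]   # assumed sample, schema R(A, B, C)

def gamma_avg(rel):
    """Group on (A, B), then compute AVG(C) within each group."""
    groups = defaultdict(list)
    for a, b, c in rel:
        groups[(a, b)].append(c)          # step 1: form the groups
    # Step 2: one output tuple per group, carrying its aggregate.
    return [(a, b, mean(cs)) for (a, b), cs in groups.items()]

result = gamma_avg(R)
print(result)   # one tuple per distinct (A, B) pair
```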

Example: Grouping/Aggregation StarsIn(title, year, starName) For each star who has appeared in at least three movies, give the earliest year in which he or she appeared. –First we group, using starName as a grouping attribute. –Then, we compute the MIN(year) for each group. –Also, we need to compute the COUNT(title) aggregate for each group, for filtering out those stars with fewer than three movies. σ_{ctTitle>=3}[γ_{starName, MIN(year)→minYear, COUNT(title)→ctTitle}(StarsIn)]
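A sketch of this query in Python: group on starName, compute MIN(year) and COUNT(title) per group, then select the groups with a count of at least three. The rows of `stars_in` and the function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical StarsIn(title, year, starName) rows.
stars_in = [
    ("M1", 1990, "Ann"), ("M2", 1985, "Ann"), ("M3", 1999, "Ann"),
    ("M4", 2001, "Bob"), ("M5", 2003, "Bob"),
]

def earliest_year_of_frequent_stars(rel, min_movies=3):
    """Return (starName, minYear) for stars with at least min_movies titles."""
    groups = defaultdict(list)
    for title, year, star in rel:
        groups[star].append((title, year))
    # Grouping step: (starName, MIN(year) as minYear, COUNT(title) as ctTitle).
    grouped = [
        (star, min(y for _, y in rows), len(rows))
        for star, rows in groups.items()
    ]
    # Selection step: keep groups with ctTitle >= min_movies.
    return [(star, min_year) for star, min_year, ct in grouped if ct >= min_movies]

answer = earliest_year_of_frequent_stars(stars_in)
print(answer)   # [('Ann', 1985)]
```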

Outerjoin Motivation Suppose we join R ⋈ S. A tuple of R that has no tuple of S with which it joins is said to be dangling. –Similarly for a tuple of S. –We lose dangling tuples. Outerjoin preserves dangling tuples by padding them with a special NULL symbol in the result.

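Example: Outerjoin R(A,B), S(B,C). (1,2) joins with (2,3), but the other two tuples are dangling. R ⟗ S has attributes A, B, C: it contains the joined tuple (1,2,3) and each dangling tuple padded with NULL, such as (NULL,6,7).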
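A minimal sketch of a full outerjoin on the shared attribute B, using `None` in place of NULL. The contents of R and S are assumptions consistent with the tuples mentioned in the example, and the function name `full_outerjoin` is hypothetical.

```python
# Assumed sample data: (1, 2) joins with (2, 3); the other tuples dangle.
R = [(1, 2), (4, 5)]   # schema R(A, B)
S = [(2, 3), (6, 7)]   # schema S(B, C)

def full_outerjoin(r, s):
    """Natural join on B, plus dangling tuples padded with None (NULL)."""
    r_keys = {b for _, b in r}
    s_keys = {b for b, _ in s}
    joined = [(a, b, c) for a, b in r for b2, c in s if b == b2]
    dangling_r = [(a, b, None) for a, b in r if b not in s_keys]
    dangling_s = [(None, b, c) for b, c in s if b not in r_keys]
    return joined + dangling_r + dangling_s

result = full_outerjoin(R, S)
print(result)   # [(1, 2, 3), (4, 5, None), (None, 6, 7)]
```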

Problems R(A,B) = {(0,1), (2,3), (0,1), (2,4), (3,4)} S(B,C) = {(0,1), (2,4), (2,5), (3,4), (0,2), (3,4)} Compute: γ_{A,SUM(B)}(R) and R ⟗ S.

Problems Product(maker, model, type) PC(model, speed, ram, hd, rd, price) Laptop(model, speed, ram, hd, screen, price) Printer(model, color, type, price) Find the manufacturers who sell exactly three different models of PC. Find those manufacturers of at least two different computers (PC or Laptops) with speed of at least 700.