CS 440 Database Management Systems


Review of Relational Model and SQL

Relational Database Management
Conceptual Design: Entity Relationship (ER) Model
Schema: Relational Model
Physical Layer: Files and Indexes

Focus of this review: Relational Model & SQL

Relational Model
The relational model defines:
  a way of organizing data: relations
  operations to query and/or manipulate the data
It is much easier to use than procedural languages: say what you want instead of how to do it.
Everything is a relation, both the data and the results of queries.

Relation: example
Relation name: Book
Attribute names: Title, Price, Category, Year
Tuples:
  Title           Price     Category   Year
  MySQL           $102.1    computer   2001
  Cell biology    $201.69   biology    1954
  French cinema   $53.99    art        2002
  NBA History     $63.65    sport      2010

Relation
Attributes hold atomic values of atomic types: string, integer, real, date, …
Each relation must have keys: attributes without duplicate values.
A relation does not contain duplicate tuples.
Reordering tuples does not change the relation.
Reordering attributes does not change the relation.

Database Schema vs. Database Instance
Schema of a relation:
  Name of the relation and names of its attributes, e.g., Person(Name, Address, SSN)
  Types of the attributes
  Constraints on the values of the attributes
Schema of the database:
  Set of relation schemata, e.g., Person(Name, Address, SSN), Employment(Company, SSN)

Database Schema vs. Database Instance
Schema: Book(Title, Price, Category, Year)
Instance:
  Title           Price     Category   Year
  MySQL           $102.1    computer   2001
  Cell biology    $201.69   biology    1954
  French cinema   $53.99    art        2002
  NBA History     $63.65    sport      2010

Relational algebra: operations on relations
Basic operations:
  Selection (σ): selects a subset of rows from a relation.
  Projection (π): deletes unwanted columns from a relation.
  Cross-product (×): allows us to combine two relations.
  Set-difference (−): tuples in relation 1 but not in relation 2.
  Union (∪): tuples in relation 1 or in relation 2.
Additional operations:
  Intersection, join, …: not essential, but (very!) useful.
Since each operation returns a relation, operations can be composed. (The algebra is “closed”.)
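
The basic operations can be sketched on small in-memory relations. The following is only an illustration, assuming relations are modeled as Python sets of tuples; the sample rows and variable names are hypothetical and not part of the slides.

```python
# Hypothetical sample relations with schema (name, manf), modeled as sets of tuples.
beers = {
    ("Bud", "Anheuser-Busch"),
    ("Michelob", "Anheuser-Busch"),
    ("Miller", "MillerCoors"),
}
more_beers = {("Miller", "MillerCoors"), ("Corona", "Grupo Modelo")}

# Selection: keep the rows that satisfy a condition.
selected = {row for row in beers if row[1] == "Anheuser-Busch"}

# Projection: keep only the manf column; the set drops duplicate results.
projected = {(manf,) for (_name, manf) in beers}

# Cross-product: pair every row of one relation with every row of the other.
product = {r + s for r in beers for s in more_beers}

# Set-difference and union require union-compatible relations.
difference = beers - more_beers
union = beers | more_beers

# Composition: a projection applied to the result of a selection.
composed = {(name,) for (name, _manf) in selected}
```

Because every operation returns a set of tuples, the result of one operation can be fed directly into the next, which is what closure of the algebra means.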

Example Schema
  Beers(name, manf)
  Bars(name, addr, license)
  Drinkers(name, addr, phone)
  Likes(drinker, beer)
  Sells(bar, beer, price)
  Frequents(drinker, bar)

Projection
Deletes attributes that are not in the projection list.
The schema of the result contains exactly the fields in the projection list, with the same names that they had in the (only) input relation.

Selection
Selects rows that satisfy the selection condition.
The schema of the result is identical to the schema of the (only) input relation.
The result relation can be the input for another relational algebra operation! (Operator composition.)

Union, Intersection, Set-Difference
All of these operations take two input relations, which must be union-compatible:
  Same number of fields.
  “Corresponding” fields have the same type.
What is the schema of the result?

Cross-Product
Each row of S1 is paired with each row of R1.
The result schema has one field per field of S1 and R1.

Joins
The result schema is the same as that of the cross-product.
A join has fewer tuples than the cross-product, so it might be computed more efficiently.
If the condition is an equality, it is called an equi-join.
Natural join: equi-join on all common fields.

SQL
A declarative language for querying data stored in relational databases.
It implements relational algebra with slight modifications.
Many standards: SQL92, SQL99, …
We focus on the core functionalities.

The Basic Form
    SELECT returned attribute(s)
    FROM relation(s) (one or more)
    WHERE conditions on the tuples of the relation(s)
Apply the WHERE clause’s conditions to the tuples of the relations in the FROM clause.
Return the values of the attributes in the SELECT clause.

Single Relation Query
What beers are made by Anheuser-Busch?
    SELECT name
    FROM Beers
    WHERE manf = 'Anheuser-Busch';

Using *
What beers are made by Anheuser-Busch? Using * returns all attributes of the matching tuples.
    SELECT *
    FROM Beers
    WHERE manf = 'Anheuser-Busch';

WHERE clause
May have complex conditions:
  Logical operators: OR, AND, NOT
  Comparison operators: <, >, =, <>, …
  Type-specific operators: LIKE, …

Null Values
Some tuples may not contain a value for some of their attributes:
  The operator did not enter the data.
  The operator did not know the value.
  …
Ex: We do not know Fred’s salary. Putting 0.0 would suggest that Fred is on unpaid leave, which he is not!
Databases use the null value for these cases.

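A value not like any other value!
Suppose the tuple for Joe Bar in the Sells relation has a null price. The query
    SELECT *
    FROM Sells
    WHERE price < 0.0 OR price >= 0.0;
does not return the Joe Bar tuple: a comparison with a null value evaluates to unknown, so neither condition holds.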

A value not like any other value!
To find the tuples in the Sells relation with a null price, use IS NULL:
    SELECT *
    FROM Sells
    WHERE price IS NULL;
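
The behaviour of null in comparisons can be checked with a small script. This is a minimal sketch, assuming Python's built-in sqlite3 module as the engine and two hypothetical Sells tuples; other RDBMSs should behave the same way for these two queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL)")
# Hypothetical data: Joe Bar's price for Bud is unknown (NULL).
conn.executemany(
    "INSERT INTO Sells VALUES (?, ?, ?)",
    [("Joe Bar", "Bud", None), ("Red Lion", "Bud", 3.0)],
)

# Both comparisons evaluate to unknown for the NULL price, so Joe Bar is skipped.
compared = conn.execute(
    "SELECT bar FROM Sells WHERE price < 0.0 OR price >= 0.0"
).fetchall()

# IS NULL is the only reliable way to find the tuple.
is_null = conn.execute("SELECT bar FROM Sells WHERE price IS NULL").fetchall()

print(compared)  # [('Red Lion',)]
print(is_null)   # [('Joe Bar',)]
```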

Multi Relation Query: Join
Joins find relationships between different types of entities, which often have more business value!
Ex: Using relations Likes(drinker, beer) and Frequents(drinker, bar), find the beers liked by at least one person who frequents Joe Bar.
    SELECT Likes.beer
    FROM Likes, Frequents
    WHERE Frequents.bar = 'Joe Bar'
      AND Frequents.drinker = Likes.drinker;
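
The join query can be tried on a tiny instance. The sketch below assumes Python's sqlite3 module and hypothetical Likes and Frequents tuples; only the SELECT statement comes from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Likes (drinker TEXT, beer TEXT)")
conn.execute("CREATE TABLE Frequents (drinker TEXT, bar TEXT)")
# Hypothetical data for illustration only.
conn.executemany(
    "INSERT INTO Likes VALUES (?, ?)",
    [("John", "Bud"), ("Mary", "Miller"), ("Ann", "Corona")],
)
conn.executemany(
    "INSERT INTO Frequents VALUES (?, ?)",
    [("John", "Joe Bar"), ("Mary", "Red Lion"), ("Ann", "Joe Bar")],
)

rows = conn.execute(
    """
    SELECT Likes.beer
    FROM Likes, Frequents
    WHERE Frequents.bar = 'Joe Bar'
      AND Frequents.drinker = Likes.drinker
    """
).fetchall()

# John and Ann frequent Joe Bar, so their liked beers are returned.
beers_at_joes = sorted(beer for (beer,) in rows)
print(beers_at_joes)  # ['Bud', 'Corona']
```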

Join Queries
Generally, they require processing a large number of tuples, which is time consuming.
Relational Database Management Systems (RDBMS) have ways to process them efficiently.
We talk more about this later in the course.

Subqueries
SQL queries that appear in the WHERE or FROM parts of another query.
Example: Using Sells(bar, beer, price), find the bars that serve Miller for the same price Joe Bar charges for Bud.
  1. Figure out Joe’s price for Bud: JoePrice.
  2. Find the bars that offer Miller at price = JoePrice.

Subqueries
    SELECT bar
    FROM Sells
    WHERE beer = 'Miller'
      AND price = (SELECT price
                   FROM Sells
                   WHERE bar = 'Joe Bar' AND beer = 'Bud');
The parenthesized query is the subquery.
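
A scalar subquery like this one can be exercised on a few rows. This is a sketch assuming Python's sqlite3 module and hypothetical prices; note that the subquery names its own FROM Sells.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL)")
# Hypothetical data for illustration only.
conn.executemany(
    "INSERT INTO Sells VALUES (?, ?, ?)",
    [
        ("Joe Bar", "Bud", 3.0),
        ("Joe Bar", "Miller", 4.0),
        ("Red Lion", "Miller", 3.0),
        ("Big Horse", "Miller", 3.5),
    ],
)

same_price_bars = conn.execute(
    """
    SELECT bar
    FROM Sells
    WHERE beer = 'Miller'
      AND price = (SELECT price
                   FROM Sells
                   WHERE bar = 'Joe Bar' AND beer = 'Bud')
    """
).fetchall()

# Joe Bar charges 3.0 for Bud; only Red Lion sells Miller at 3.0.
print(same_price_bars)  # [('Red Lion',)]
```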

Subqueries: ALL, ANY
We like to compare a value to a set of values.
Example: Using Sells(bar, beer, price), find the bars that serve Miller for a cheaper price than the price that every bar charges for Bud.
  1. Figure out the set of all prices for Bud: BudPrice.
  2. Find the bars that offer Miller at a cheaper price than all values in BudPrice.

Subqueries: ALL, ANY
    SELECT bar
    FROM Sells
    WHERE beer = 'Miller'
      AND price < ALL (SELECT price
                       FROM Sells
                       WHERE beer = 'Bud');
The parenthesized query is the subquery.
What if we use ANY instead of ALL? The query returns the bars that serve Miller for a cheaper price than the price that at least one bar charges for Bud.
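
The difference between ALL and ANY can be illustrated with a small experiment. SQLite, used here through Python's sqlite3 module only because it needs no setup, does not accept the ALL and ANY quantifiers, so this sketch swaps in the MIN and MAX equivalents: price < ALL (S) agrees with price < MIN(S), and price < ANY (S) with price < MAX(S), as long as S is non-empty and contains no nulls. The data is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL)")
# Hypothetical data for illustration only.
conn.executemany(
    "INSERT INTO Sells VALUES (?, ?, ?)",
    [
        ("Joe Bar", "Bud", 3.0),
        ("Red Lion", "Bud", 4.0),
        ("Joe Bar", "Miller", 3.5),
        ("Red Lion", "Miller", 2.5),
        ("Big Horse", "Miller", 4.5),
    ],
)

# Stand-in for "price < ALL (...)": cheaper than the cheapest Bud.
cheaper_than_all = conn.execute(
    """
    SELECT bar FROM Sells
    WHERE beer = 'Miller'
      AND price < (SELECT MIN(price) FROM Sells WHERE beer = 'Bud')
    """
).fetchall()

# Stand-in for "price < ANY (...)": cheaper than the most expensive Bud.
cheaper_than_any = conn.execute(
    """
    SELECT bar FROM Sells
    WHERE beer = 'Miller'
      AND price < (SELECT MAX(price) FROM Sells WHERE beer = 'Bud')
    """
).fetchall()

all_bars = sorted(bar for (bar,) in cheaper_than_all)
any_bars = sorted(bar for (bar,) in cheaper_than_any)
print(all_bars)  # ['Red Lion']
print(any_bars)  # ['Joe Bar', 'Red Lion']
```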

Subqueries: IN
We like to check if the result of a subquery contains a particular value.
Example: Using Beers(name, manf) and Likes(drinker, beer), find the manf of each beer John likes.
    SELECT manf
    FROM Beers
    WHERE name IN (SELECT beer
                   FROM Likes
                   WHERE drinker = 'John');
The subquery returns a set of beers.

Subqueries: EXISTS
We like to check if a subquery has any result.
Example: Using Beers(name, manf), find the beers that are the only beer made by their manufacturers.
    SELECT name
    FROM Beers b1
    WHERE NOT EXISTS (SELECT *
                      FROM Beers
                      WHERE manf = b1.manf AND name <> b1.name);
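
The correlated subquery is evaluated once per outer tuple b1. The sketch below assumes Python's sqlite3 module and hypothetical Beers tuples; it adds the alias b2 to the inner relation purely for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Beers (name TEXT, manf TEXT)")
# Hypothetical data for illustration only.
conn.executemany(
    "INSERT INTO Beers VALUES (?, ?)",
    [
        ("Bud", "Anheuser-Busch"),
        ("Michelob", "Anheuser-Busch"),
        ("Corona", "Grupo Modelo"),
    ],
)

only_beers = conn.execute(
    """
    SELECT name
    FROM Beers b1
    WHERE NOT EXISTS (SELECT *
                      FROM Beers b2
                      WHERE b2.manf = b1.manf AND b2.name <> b1.name)
    """
).fetchall()

# Anheuser-Busch makes two beers, so only Corona qualifies.
print(only_beers)  # [('Corona',)]
```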

Bag versus Set
Duplicates are allowed in bags: {a, a, b, b, b} vs. {a, b}
Generally, the results of SQL queries are bags.
    SELECT name
    FROM Beers;

Removing Duplicates
Use DISTINCT:
    SELECT DISTINCT name
    FROM Beers;
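
The effect of DISTINCT is easiest to see on an attribute that actually repeats. This sketch assumes Python's sqlite3 module and, as a swap from the slide's Beers example, projects the beer attribute of hypothetical Sells tuples, where the same beer appears in several bars.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL)")
# Hypothetical data: Bud is sold by two bars.
conn.executemany(
    "INSERT INTO Sells VALUES (?, ?, ?)",
    [("Joe Bar", "Bud", 3.0), ("Red Lion", "Bud", 3.5), ("Joe Bar", "Miller", 4.0)],
)

# Without DISTINCT the result is a bag: Bud appears twice.
bag = [beer for (beer,) in conn.execute("SELECT beer FROM Sells")]

# With DISTINCT the duplicates are removed.
distinct = sorted(beer for (beer,) in conn.execute("SELECT DISTINCT beer FROM Sells"))

print(len(bag))   # 3
print(distinct)   # ['Bud', 'Miller']
```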

Set Operations
R UNION S: returns the union of the tuples of relation R and the tuples of relation S.
R INTERSECT S: returns the tuples common to relation R and relation S.
R EXCEPT S: returns the tuples found in relation R but not in relation S.

Set Operations: Example
Using relations Likes(drinker, beer), Sells(bar, beer, price), and Frequents(drinker, bar), find the drinkers and beers such that:
  the drinker likes the beer, and
  the drinker frequents at least one bar that sells the beer.
The “and” shows that we should compute an intersection.

Set Operations: Example
    (SELECT * FROM Likes)
    INTERSECT
    (SELECT drinker, beer
     FROM Sells, Frequents
     WHERE Frequents.bar = Sells.bar);
The first query returns the pairs where the drinker likes the beer.
The second query returns the pairs where the drinker frequents a bar that sells the beer.
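
The intersection can be checked on a small instance. This is a sketch assuming Python's sqlite3 module and hypothetical tuples. SQLite rejects parentheses around the operands of INTERSECT, so the sketch writes the two queries without them; other systems accept the parenthesized form shown on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Likes (drinker TEXT, beer TEXT)")
conn.execute("CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL)")
conn.execute("CREATE TABLE Frequents (drinker TEXT, bar TEXT)")
# Hypothetical data for illustration only.
conn.executemany(
    "INSERT INTO Likes VALUES (?, ?)",
    [("John", "Bud"), ("John", "Miller"), ("Mary", "Corona")],
)
conn.executemany(
    "INSERT INTO Sells VALUES (?, ?, ?)",
    [("Joe Bar", "Bud", 3.0), ("Joe Bar", "Corona", 4.0), ("Red Lion", "Bud", 3.5)],
)
conn.executemany(
    "INSERT INTO Frequents VALUES (?, ?)",
    [("John", "Joe Bar"), ("Mary", "Red Lion")],
)

pairs = conn.execute(
    """
    SELECT drinker, beer FROM Likes
    INTERSECT
    SELECT drinker, beer
    FROM Sells, Frequents
    WHERE Frequents.bar = Sells.bar
    """
).fetchall()

# John likes Bud and frequents Joe Bar, which sells Bud.
# Mary likes Corona, but Red Lion does not sell it.
print(pairs)  # [('John', 'Bud')]
```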

Set Operations
The results of set operations in SQL do not contain duplicate tuples.
Adding ALL keeps the duplicates:
  … INTERSECT … → … INTERSECT ALL …
  … UNION … → … UNION ALL …
  … EXCEPT … → … EXCEPT ALL …
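
The effect of ALL can be seen with UNION. This sketch assumes Python's sqlite3 module and two hypothetical single-attribute relations R and S; it covers only UNION ALL, since SQLite does not implement INTERSECT ALL or EXCEPT ALL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (x TEXT)")
conn.execute("CREATE TABLE S (x TEXT)")
# Hypothetical data: the value 'b' appears in both relations.
conn.executemany("INSERT INTO R VALUES (?)", [("a",), ("b",)])
conn.executemany("INSERT INTO S VALUES (?)", [("b",), ("c",)])

# UNION removes the duplicate 'b'.
union_set = sorted(
    x for (x,) in conn.execute("SELECT x FROM R UNION SELECT x FROM S")
)

# UNION ALL keeps both copies of 'b'.
union_bag = sorted(
    x for (x,) in conn.execute("SELECT x FROM R UNION ALL SELECT x FROM S")
)

print(union_set)  # ['a', 'b', 'c']
print(union_bag)  # ['a', 'b', 'b', 'c']
```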

Aggregation functions
Compute some value based on the values of an attribute.
Example functions: Count, Sum, Avg, Min, Max
Each RDBMS may define additional functions.
Example: Using Bars(name, addr, license), find the number of bars.
    Select Count(name)
    From Bars;

Aggregation functions
With Distinct, aggregation functions ignore duplicates.
Example: Using Likes(drinker, beer), find the number of drinkers who like Bud Lite.
    Select Count(Distinct drinker)
    From Likes
    Where beer = 'Bud Lite';

Aggregation functions
Generally, aggregation functions do not consider NULL values.
    Select Count(price) From Sells Where bar = 'Joe Bar';
  The number of priced beers sold by Joe Bar.
    Select Count(beer) From Sells Where bar = 'Joe Bar';
  The number of beers sold by Joe Bar.
    Select Count(*) From Sells Where bar = 'Joe Bar';
  The number of beers sold by Joe Bar.
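
The three counts differ only when a price is null. The sketch below assumes Python's sqlite3 module and hypothetical Sells tuples in which Joe Bar sells one beer without a recorded price.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL)")
# Hypothetical data: Joe Bar's Miller price is unknown (NULL).
conn.executemany(
    "INSERT INTO Sells VALUES (?, ?, ?)",
    [("Joe Bar", "Bud", 3.0), ("Joe Bar", "Miller", None), ("Red Lion", "Bud", 3.5)],
)


def count(expr):
    """Run Count(expr) over Joe Bar's tuples and return the single number."""
    sql = f"SELECT Count({expr}) FROM Sells WHERE bar = 'Joe Bar'"
    return conn.execute(sql).fetchone()[0]


priced = count("price")  # NULL prices are skipped
beers = count("beer")    # every tuple has a beer
rows = count("*")        # counts tuples regardless of NULLs

print(priced, beers, rows)  # 1 2 2
```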

Aggregation functions over groups
We want to aggregate values for groups of tuples.
Example: Using Sells(bar, beer, price), find the minimum price of each beer.
  1. Group the tuples in Sells based on beer.
  2. Compute Min over the prices in each group of tuples.

Group by
Example: Using Sells(bar, beer, price), find the minimum price of each beer.
    Select beer, Min(price) As minprice
    From Sells
    Group By beer;
The alias “As minprice” is optional.
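
The grouping query can be run on a few rows to see one result tuple per beer. This is a sketch assuming Python's sqlite3 module and hypothetical prices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL)")
# Hypothetical data for illustration only.
conn.executemany(
    "INSERT INTO Sells VALUES (?, ?, ?)",
    [("Joe Bar", "Bud", 3.0), ("Red Lion", "Bud", 2.5), ("Joe Bar", "Miller", 4.0)],
)

rows = conn.execute(
    "SELECT beer, Min(price) AS minprice FROM Sells GROUP BY beer"
).fetchall()

# Group By does not promise an order, so collect the groups into a dict.
min_price = dict(rows)
print(min_price == {"Bud": 2.5, "Miller": 4.0})  # True
```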

Group by
You may use multiple attributes for grouping.
The attributes in the Select clause are either aggregated values or attributes in the Group By clause. The following query is an error, because bar is neither:
    Select beer, Min(price), bar
    From Sells
    Group By beer;
Exceptions exist in some RDBMSs, e.g., MySQL 5.7.
Generally, Group By does not sort the groups. There are exceptions, e.g., older versions of MySQL, but do not trust them!

Grouping attributes from different relations
Example: Using Likes(drinker, beer) and Sells(bar, beer, price), for each drinker find the minimum price of every beer he/she likes. The beer attribute is qualified because it appears in both relations.
    Select drinker, Likes.beer, Min(price) As minprice
    From Likes, Sells
    Where Likes.beer = Sells.beer
    Group By drinker, Likes.beer;

Filtering groups
We may filter out some groups using their attributes’ values. The Where clause is applied to the tuples before they are grouped.
    Select beer, Min(price) As minprice
    From Sells
    Where bar = 'Red Lion' Or bar = 'Big Horse'
    Group By beer;

Filtering groups based on aggregated values
Example: Using Sells(bar, beer, price), find the minimum price of each beer whose maximum price is less than 11.
The following query is an error, because aggregated values cannot be used in the Where clause:
    Select beer, Min(price) As minprice
    From Sells
    Where Max(price) < 11
    Group By beer;

Having clause
We use Having clauses to filter out groups based on their aggregated values.
    Select beer, Min(price) As minprice
    From Sells
    Group By beer
    Having Max(price) < 11;
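
The Having query can be checked on a small instance. This is a sketch assuming Python's sqlite3 module and hypothetical prices chosen so that exactly one beer stays below the threshold.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL)")
# Hypothetical data: Miller's maximum price (12.0) exceeds the threshold.
conn.executemany(
    "INSERT INTO Sells VALUES (?, ?, ?)",
    [
        ("Joe Bar", "Bud", 3.0),
        ("Red Lion", "Bud", 5.0),
        ("Joe Bar", "Miller", 4.0),
        ("Red Lion", "Miller", 12.0),
    ],
)

cheap_beers = conn.execute(
    """
    SELECT beer, Min(price) AS minprice
    FROM Sells
    GROUP BY beer
    HAVING Max(price) < 11
    """
).fetchall()

# The Miller group is dropped as a whole, even though one Miller price is below 11.
print(cheap_beers)  # [('Bud', 3.0)]
```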

Having clause
We may use aggregated values over attributes other than the ones in the Group By clause.
Example: Using Sells(bar, beer, price), find the minimum price of each beer sold in more than three bars.
    Select beer, Min(price) As minprice
    From Sells
    Group By beer
    Having Count(bar) > 3;

Having clause may act as a Where clause
Example: Using Sells(bar, beer, price), find the minimum price of Bud or beers whose maximum price is less than 11.
    Select beer, Min(price)
    From Sells
    Group By beer
    Having (Max(price) < 11) Or (beer = 'Bud');
It works only for the attributes in the Group By clause. The following condition is an error, because bar is not a grouping attribute:
    Having (Max(price) < 11) Or (bar = 'Red Lion')

Sorting the output
Example: Using Sells(bar, beer, price), find the minimum price of each beer whose maximum price is at least 15 and sort the results according to the beers’ names.
    Select beer, Min(price) As minprice
    From Sells
    Group By beer
    Having Max(price) >= 15
    Order By beer;

Sorting the output
One may use Desc to change the sort order. Previous example in descending order of the beers’ names:
    Select beer, Min(price) As minprice
    From Sells
    Group By beer
    Having Max(price) >= 15
    Order By beer Desc;
You may use Order By without Group By and Having.
Example: Using Sells(bar, beer, price), provide a list of beer prices sorted by the bars’ and beers’ names.
    Select bar, beer, price
    From Sells
    Order By bar, beer;
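
The sorting examples can be tried together on one small instance. This is a sketch assuming Python's sqlite3 module and hypothetical prices; the three queries follow the ones above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL)")
# Hypothetical data: Corona never reaches a price of 15.
conn.executemany(
    "INSERT INTO Sells VALUES (?, ?, ?)",
    [
        ("Joe Bar", "Bud", 3.0),
        ("Red Lion", "Bud", 16.0),
        ("Joe Bar", "Miller", 4.0),
        ("Red Lion", "Miller", 20.0),
        ("Joe Bar", "Corona", 5.0),
        ("Red Lion", "Corona", 6.0),
    ],
)

grouped = """
    SELECT beer, Min(price) AS minprice
    FROM Sells
    GROUP BY beer
    HAVING Max(price) >= 15
    ORDER BY beer
"""
ascending = conn.execute(grouped).fetchall()
descending = conn.execute(grouped + " DESC").fetchall()

# Order By also works without Group By and Having.
price_list = conn.execute(
    "SELECT bar, beer, price FROM Sells ORDER BY bar, beer"
).fetchall()

print(ascending)      # [('Bud', 3.0), ('Miller', 4.0)]
print(descending)     # [('Miller', 4.0), ('Bud', 3.0)]
print(price_list[0])  # ('Joe Bar', 'Bud', 3.0)
```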