Relational Database Systems 1

1 Relational Database Systems 1
Instructor: Prof. James Cheng
Acknowledgement: These slides are extracted and adapted from slides provided by Prof. Sourav S. Bhowmick of Nanyang Technological University.

2 “Relational databases are the foundation of western civilization.”
Bruce Lindsay IBM Fellow IBM Almaden Research Center

3 Topics to be covered
ER model
Relational Algebra
SQL
Storage and Index Structures
Query Processing and Query Optimization

5 Entity/Relationship Model (ER Model)
The First Step
Analyze: Analysis of the information that should be stored in the database.
Relationships: Relationships between the components of that information.
Entity/Relationship Model (ER Model): A popular approach to this first step.

6 Purpose of E/R Diagram
Design database informally: The E/R model allows us to sketch the design of a database informally, describing the schema of the database.
Graphical: Designs are pictures called entity-relationship diagrams.
Conversion to implementation: There are mechanical ways to convert E/R diagrams to real implementations.

7 The Process
Ideas → E/R Design → Relational Schema → RDBMS

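8 Example
Meaning:
Bars sell some beers.
Drinkers like some beers.
Drinkers visit some bars.
E/R diagram: entity sets Bars (name, addr), Beers (name, manf), and Drinkers (ID, name, addr), connected by the relationships Sells (Bars, Beers), Likes (Drinkers, Beers), and Visits (Drinkers, Bars).
Relations:
Beers(name, manf)
Bars(name, addr)
Drinkers(ID, name, addr)
Sells(bar, beer)
Visits(drinker, bar)
Likes(drinker, beer)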

9 Topics to be covered
ER model
Relational Algebra
SQL
Storage and Index Structures
Query Processing and Query Optimization

10 Place in the big picture
Declarative query language (SQL, relational calculus) → Algebra (Relational Algebra) → Implementation

11 Core Relational Algebra
Union, Intersection, and Difference: Usual set operations, but they require both operands to have the same relation schema.
Selection: Picking certain rows.
Projection: Picking certain columns.
Products & Joins: Compositions of relations.
Renaming: Renaming of relations and attributes.

12 Union
Union operator: Builds a relation consisting of all tuples appearing in either or both of two specified relations. It combines all rows from two tables, excluding duplicate rows.
Rule: The tables must have the same attribute characteristics.

13 Example
R1:
Name     Addr  favBeer
Pauline  hku   Heineken
Joe      cuhk  Bud
R2:
Name     Addr  favBeer
Pauline  hku   Heineken
Eve      …     …
Harry    cuhk  Tiger
R1 ∪ R2:
Name     Addr  favBeer
Pauline  hku   Heineken
Eve      …     …
Joe      cuhk  Bud
Harry    cuhk  Tiger

14 Intersection
Intersection operator: Builds a relation consisting of all tuples appearing in both of two specified relations.
Results: Yields only the rows that appear in both tables.

15 Example
R1:
Name     Addr  favBeer
Pauline  hku   Heineken
Joe      cuhk  Bud
R2:
Name     Addr  favBeer
Pauline  hku   Heineken
Eve      …     …
Harry    cuhk  Tiger
R1 ∩ R2:
Name     Addr  favBeer
Pauline  hku   Heineken

16 Difference
Difference operator: Builds a relation consisting of all tuples appearing in the first relation but not in the second.
Results: It subtracts one table from the other.

17 Example
R1:
Name     Addr  favBeer
Pauline  hku   Heineken
Joe      cuhk  Bud
R2:
Name     Addr  favBeer
Pauline  hku   Heineken
Eve      …     …
Harry    cuhk  Tiger
R1 - R2:
Name     Addr  favBeer
Joe      cuhk  Bud
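The three set operations above can be sketched with Python's built-in sets, treating each relation as a set of (Name, Addr, favBeer) tuples. This is only an illustration: the Eve row is left out because its Addr and favBeer values are not recoverable from the slide.

```python
# Relations as sets of (name, addr, fav_beer) tuples with the same schema.
# The Eve row of R2 is omitted here because its values are not given.
R1 = {("Pauline", "hku", "Heineken"), ("Joe", "cuhk", "Bud")}
R2 = {("Pauline", "hku", "Heineken"), ("Harry", "cuhk", "Tiger")}

union = R1 | R2          # tuples in either relation, duplicates collapsed
intersection = R1 & R2   # tuples present in both relations
difference = R1 - R2     # tuples in R1 that are not in R2

print(sorted(union))
print(intersection)
print(difference)
```

Because sets cannot hold duplicates, the shared Pauline tuple appears once in the union, which matches the set semantics of core relational algebra.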

18 Selection
Selection operator: Extracts specified tuples (rows) from a specified relation (table). Returns all tuples that satisfy a condition.
Representation: R1 := σ_C(R2)
C is a condition (as in "if" statements) that refers to attributes of R2.
R1 is all those tuples of R2 that satisfy C.

19 Example
Sells(Bar, Beer, Price) lists the beers sold at Joe's, Ku De Ta, Clinic, and Harry's.
JoeMenu := σ_{Bar="Joe's"}(Sells)
JoeMenu:
Bar    Beer      Price
Joe's  Heineken  8.00
Joe's  Bud       7.60
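A minimal sketch of selection, assuming a relation is represented as a list of dicts and the condition C as a Python predicate. The `select` helper and the sample rows other than the two Joe's rows are illustrative assumptions, not part of the slides.

```python
def select(relation, condition):
    """sigma_C(R): keep exactly the tuples for which condition(t) is true."""
    return [t for t in relation if condition(t)]

# Sells(bar, beer, price); the Harry's row is a hypothetical filler row.
sells = [
    {"bar": "Joe's", "beer": "Heineken", "price": 8.00},
    {"bar": "Joe's", "beer": "Bud", "price": 7.60},
    {"bar": "Harry's", "beer": "Tiger", "price": 9.50},
]

# JoeMenu := sigma_{Bar = "Joe's"}(Sells)
joe_menu = select(sells, lambda t: t["bar"] == "Joe's")
print(joe_menu)
```

Selection never changes the schema: every tuple in `joe_menu` has the same attributes as the tuples in `sells`.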

20 Projection
Projection operator: Extracts specified attributes (columns) from a specified relation.
Representation: R1 := Π_L(R2)
L is a list of attributes from the schema of R2.
R1 is constructed by looking at each tuple of R2, extracting the attributes on list L, in the order specified, and creating from those components a tuple for R1.
Eliminate duplicate tuples, if any.

21 Example
Sells(Bar, Beer, Price) lists the beers sold at Joe's, Ku De Ta, Clinic, and Harry's.
Prices := Π_{Beer,Price}(Sells)
Prices:
Beer      Price
Heineken  8.00
Miller    9.00
Bud       7.60
Tiger     9.50
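Projection with duplicate elimination might look like the following sketch. The `project` helper and the sample rows are assumptions made for illustration; two bars sell Bud at the same price so that the duplicate elimination step is visible.

```python
def project(relation, attrs):
    """Pi_L(R) under set semantics: keep attrs in the given order, drop duplicates."""
    seen, result = set(), []
    for t in relation:
        row = tuple(t[a] for a in attrs)
        if row not in seen:      # eliminate duplicate tuples
            seen.add(row)
            result.append(row)
    return result

# Hypothetical Sells(bar, beer, price) rows.
sells = [
    {"bar": "Joe's", "beer": "Heineken", "price": 8.00},
    {"bar": "Joe's", "beer": "Bud", "price": 7.60},
    {"bar": "Clinic", "beer": "Bud", "price": 7.60},
]

# Prices := Pi_{Beer, Price}(Sells)
prices = project(sells, ["beer", "price"])
print(prices)
```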

22 Cartesian Product
Cartesian Product: Builds a relation from two specified relations consisting of all possible concatenated pairs of tuples, one from each of the two relations.
Representation: R3 := R1 × R2
Pair each tuple t1 of R1 with each tuple t2 of R2.
The concatenation "t1 t2" is a tuple of R3.
The schema of R3 is the attributes of R1 followed by those of R2, in order.
Beware! If an attribute A has the same name in R1 and R2, use R1.A and R2.A.

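23 Example
R1(A, B) and R2(B, C) share the attribute name B, so R1 × R2 has the schema (A, R1.B, R2.B, C) and contains one tuple for every pairing of an R1 tuple with an R2 tuple.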
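The product and its attribute-qualification rule can be sketched as follows. The `cartesian` helper and the tuple values are hypothetical; only the schemas R1(A, B) and R2(B, C) come from the slide.

```python
from itertools import product

def cartesian(r1, name1, r2, name2):
    """R1 x R2 over lists of dicts; attributes found in both are qualified as R.A."""
    shared = set(r1[0]) & set(r2[0]) if r1 and r2 else set()

    def qualify(t, name):
        return {(f"{name}.{a}" if a in shared else a): v for a, v in t.items()}

    return [{**qualify(t1, name1), **qualify(t2, name2)}
            for t1, t2 in product(r1, r2)]

# Hypothetical tuple values for R1(A, B) and R2(B, C).
R1 = [{"A": 1, "B": 2}, {"A": 3, "B": 4}]
R2 = [{"B": 5, "C": 6}, {"B": 7, "C": 8}]

R3 = cartesian(R1, "R1", R2, "R2")
for t in R3:
    print(t)
```

With two tuples on each side the product has 2 × 2 = 4 tuples, each carrying the attributes A, R1.B, R2.B, C in that order.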

24 Theta-Join
Join: Builds a relation from two specified relations consisting of all possible concatenated pairs, one from each of the two relations, such that in each pair the two tuples satisfy some condition.
Representation: R3 := R1 ⋈_C R2
Take the product R1 × R2.
Then apply σ_C to the result.
R1 ⋈_C R2 = σ_C(R1 × R2)

25 Equi-Join
The Condition C: C can be any boolean-valued condition. Historic versions of this operator allowed only A theta B, where theta was =, <, etc.; hence the name "theta-join."
Equi-Join: If C is a conjunction of equalities, the join is called an equi-join.

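26 Example
Sells(Bar, Beer, Price) lists beers sold at Joe's and Pump Room, including (Joe's, Heineken, 8.00).
Bars:
Name       Addr
Joe's      Scotts Rd
Pump Room  Clark Quay
Harry's    Esplanade
BarInfo := Sells ⋈_{Sells.Bar = Bars.Name} Bars
BarInfo has the schema (Bar, Beer, Price, Name, Addr) and contains one tuple for each Sells tuple whose Bar equals some Bars.Name, for example (Joe's, Heineken, 8.00, Joe's, Scotts Rd).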
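The identity R1 ⋈_C R2 = σ_C(R1 × R2) translates almost directly into code. In this sketch the `theta_join` helper is an assumption, the Bars rows are taken from the slide, and the Sells rows are limited to the two Joe's rows.

```python
from itertools import product

def theta_join(r1, r2, condition):
    """R1 join_C R2 = sigma_C(R1 x R2); assumes attribute names are already distinct."""
    return [{**t1, **t2} for t1, t2 in product(r1, r2) if condition({**t1, **t2})]

sells = [
    {"bar": "Joe's", "beer": "Heineken", "price": 8.00},
    {"bar": "Joe's", "beer": "Bud", "price": 7.60},
]
bars = [
    {"name": "Joe's", "addr": "Scotts Rd"},
    {"name": "Pump Room", "addr": "Clark Quay"},
    {"name": "Harry's", "addr": "Esplanade"},
]

# BarInfo := Sells join_{Sells.bar = Bars.name} Bars (an equi-join)
bar_info = theta_join(sells, bars, lambda t: t["bar"] == t["name"])
for t in bar_info:
    print(t)
```

Both `bar` and `name` survive in the result, which is the visible difference between a theta-join and a natural join.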

27 Natural Join
Natural Join: Connects two relations by:
Equating attributes of the same name, and
Projecting out one copy of each pair of equated attributes.
Representation: R3 := R1 ⋈ R2

28 Example
Sells(Bar, Beer, Price) lists beers sold at Joe's and Pump Room, including (Joe's, Heineken, 8.00).
Bars:
Bar        Addr
Joe's      Scotts Rd
Pump Room  Clark Quay
Harry's    Esplanade
BarInfo := Sells ⋈ Bars
BarInfo has the schema (Bar, Beer, Price, Addr): the shared attribute Bar is equated and kept only once, for example (Joe's, Heineken, 8.00, Scotts Rd).
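A natural join can be sketched by comparing every attribute the two tuples have in common and merging the matching pairs. The `natural_join` helper is illustrative, and the Sells rows are again limited to the two Joe's rows.

```python
def natural_join(r1, r2):
    """Equate attributes of the same name and keep one copy of each."""
    result = []
    for t1 in r1:
        for t2 in r2:
            common = set(t1) & set(t2)
            if all(t1[a] == t2[a] for a in common):
                result.append({**t1, **t2})   # shared attributes appear once
    return result

sells = [
    {"bar": "Joe's", "beer": "Heineken", "price": 8.00},
    {"bar": "Joe's", "beer": "Bud", "price": 7.60},
]
bars = [
    {"bar": "Joe's", "addr": "Scotts Rd"},
    {"bar": "Pump Room", "addr": "Clark Quay"},
    {"bar": "Harry's", "addr": "Esplanade"},
]

bar_info = natural_join(sells, bars)
for t in bar_info:
    print(t)
```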

29 Expression Trees
Structure:
Leaves are operands: variables standing for relations.
Interior nodes are operators, applied to their child or children.
Example: Using the relations Bars(name, addr) and Sells(bar, beer, price), find the names of all the bars that are either on Nathan Road or sell Bud for less than $7.

30 Sequence of Assignments
Example: Using the relations Bars(name, addr) and Sells(bar, beer, price), find the names of all the bars that are either on Nathan Road or sell Bud for less than $7.
Sequence of Assignments:
R1 := σ_{price < 7 AND beer = "Bud"}(Sells)
R2 := σ_{addr = "Nathan Rd"}(Bars)
R3 := Π_{bar}(R1)
R4 := Π_{name}(R2)
R5 := ρ_{R(name)}(R3)
R6 := R5 ∪ R4

31 Expression Tree
Sequence of Assignments:
R1 := σ_{price < 7 AND beer = "Bud"}(Sells)
R2 := σ_{addr = "Nathan Rd"}(Bars)
R3 := Π_{bar}(R1)
R4 := Π_{name}(R2)
R5 := ρ_{R(name)}(R3)
R6 := R5 ∪ R4
Expression Tree (root first, children indented):
UNION
  RENAME_{R(name)}
    PROJECT_{bar}
      SELECT_{price < 7 AND beer = "Bud"}
        Sells
  PROJECT_{name}
    SELECT_{addr = "Nathan Rd"}
      Bars
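The sequence of assignments can be traced step by step in Python. All tuple values below are hypothetical; only the schemas Sells(bar, beer, price) and Bars(name, addr) and the six assignments come from the slides.

```python
# Hypothetical data: tuples are (bar, beer, price) and (name, addr).
sells = [("Joe's", "Bud", 6.50), ("Sue's", "Bud", 7.50), ("Sue's", "Miller", 6.00)]
bars = [("Joe's", "Maple St"), ("Sue's", "River Rd"), ("Mo's", "Nathan Rd")]

R1 = [t for t in sells if t[2] < 7 and t[1] == "Bud"]   # selection on Sells
R2 = [t for t in bars if t[1] == "Nathan Rd"]           # selection on Bars
R3 = {(bar,) for bar, _, _ in R1}                       # projection onto bar
R4 = {(name,) for name, _ in R2}                        # projection onto name
R5 = R3            # renaming bar to name changes the schema, not the tuples
R6 = R5 | R4       # union now has matching schemas

print(sorted(R6))
```

The renaming step does no work on the data, but it is what makes the union legal: both operands must have the same schema.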

32 Relational Algebra on Bags
SQL: SQL, the most important query language for relational databases, is actually a bag language. SQL will eliminate duplicates, but usually only if you ask it to do so explicitly.
Efficiency: Some operations, like projection, are much more efficient on bags than on sets.

33 Example - Projection
Sells(Bar, Beer, Price) lists Heineken, Miller, Bud, and Tiger as sold at Joe's, Ku De Ta, Pump Room, and Harry's.
Beers := Π_{Beer}(Sells)
Under bag semantics, Beers contains one Beer value for every Sells tuple, so a beer sold at several bars appears several times.

34 Example - Union
R1:
Name     Addr  favBeer
Pauline  hku   Heineken
Joe      cuhk  Bud
R2:
Name     Addr  favBeer
Pauline  hku   Heineken
Eve      …     …
Harry    cuhk  Tiger
R1 ∪ R2 (bag union): every tuple of R1 and every tuple of R2 appears in the result, so (Pauline, hku, Heineken) appears twice.
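Bags can be modelled with `collections.Counter`, which maps each tuple to its multiplicity. The sketch below contrasts bag union with set union; the Eve row is omitted because its values are not recoverable from the slide.

```python
from collections import Counter

# Each Counter maps a tuple to the number of times it occurs in the bag.
R1 = Counter([("Pauline", "hku", "Heineken"), ("Joe", "cuhk", "Bud")])
R2 = Counter([("Pauline", "hku", "Heineken"), ("Harry", "cuhk", "Tiger")])

bag_union = R1 + R2              # multiplicities add up
set_union = set(R1) | set(R2)    # duplicates collapse

print(bag_union[("Pauline", "hku", "Heineken")])
print(len(set_union))
```

Bag union needs no duplicate check at all, which is one reason bag operations can be cheaper than their set counterparts.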

35 Topics to be covered
ER model
Relational Algebra
SQL
Storage and Index Structures
Query Processing and Query Optimization

36 Operations on Relations
What do we want to do with the relations?
Retrieve
Insert
Delete
Update
SQL: Structured Query Language (SQL) is the standard query language for relational databases. It first became an official standard in 1986, as defined by the American National Standards Institute (ANSI). All major database vendors conform to the SQL standard, with minor variations in syntax (different dialects).

37 SQL
Declarative Language: SQL is a declarative (non-procedural) language. A SQL query specifies what to retrieve but not how to retrieve it.
Not a complete programming language: It does not have control or iteration commands.

38 Aspects of SQL
Data Manipulation Language (DML): the focus of this course
Perform queries
Perform updates
Data Definition Language (DDL):
Creates databases, tables, indices
Create views
Specify authorization
Specify integrity constraints
Embedded SQL: Wrap a Turing-complete programming language around DML to do more sophisticated queries/updates.

39 Principle Form of SQL
Basic Structure of SQL:
SELECT desired attributes (A1, A2, …, An)
FROM one or more tables (R1, R2, …, Rm)
WHERE condition about tuples of the tables (P)
Mapping to Relational Algebra: Π_{A1, A2, …, An}(σ_P(R1 × R2 × … × Rm))

40 Our Running Example
Relational Database:
Beers(name, manf)
Bars(name, addr, license)
Drinkers(name, addr, phone)
Likes(drinker, beer)
Sells(bar, beer, price)
Frequents(drinker, bar)

41 Example
Query: Using Beers(name, manf), what beers are made by Anheuser-Busch?
SELECT name
FROM Beers
WHERE manf = 'Anheuser-Busch';

42 Results
Beers:
Name         Manf
Heineken     Dutch
Bud          Anheuser-Busch
Michelob     Anheuser-Busch
Beck's Beer  Bremen
Bud Lite     Anheuser-Busch
Result:
Name
Bud
Michelob
Bud Lite
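The query can be run against an in-memory SQLite database using Python's standard `sqlite3` module, and compared with the projection-over-selection form it maps to. Using SQLite and untyped TEXT columns is an assumption made for this sketch; the rows follow the Beers table of this example.

```python
import sqlite3

rows = [
    ("Heineken", "Dutch"),
    ("Bud", "Anheuser-Busch"),
    ("Michelob", "Anheuser-Busch"),
    ("Beck's Beer", "Bremen"),
    ("Bud Lite", "Anheuser-Busch"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Beers (name TEXT, manf TEXT)")
conn.executemany("INSERT INTO Beers VALUES (?, ?)", rows)

sql_result = conn.execute(
    "SELECT name FROM Beers WHERE manf = 'Anheuser-Busch'"
).fetchall()

# The same query as a projection applied to a selection over Beers.
algebra_result = [(name,) for name, manf in rows if manf == "Anheuser-Busch"]

print(sorted(sql_result))
```

Note the straight single quotes around the string literal: SQL does not accept backticks or typographic quotes as string delimiters.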

43 * In SELECT Clause
SELECT *
FROM Beers
WHERE manf = 'Anheuser-Busch';
Beers:
Name         Manf
Heineken     Dutch
Bud          Anheuser-Busch
Michelob     Anheuser-Busch
Beck's Beer  Bremen
Bud Lite     Anheuser-Busch
Result:
Name      Manf
Bud       Anheuser-Busch
Michelob  Anheuser-Busch
Bud Lite  Anheuser-Busch

44 Multi-Relation Queries
Motivation: Queries often combine data from more than one relation. We can address several relations in one query by listing them all in the FROM clause. Distinguish attributes of the same name by "<relation>.<attribute>".
Query: Using Likes(drinker, beer) and Frequents(drinker, bar), find the beers liked by at least one person who frequents Bar X.
SELECT beer
FROM Likes AS L, Frequents AS F
WHERE bar = 'Bar X' AND F.drinker = L.drinker;

45 Example
Likes(Drinker, Beer) includes the tuples (Melissa, Heineken) and (Sean, Bud), among others.
Frequents(Drinker, Bar) includes the drinkers Sally and Melissa and the bars MOS and Bar X, among others.
Result:
Beer
Heineken

46 Explicit Tuple Variables
Motivation: A query may use two copies of the same relation. Distinguish the copies by following the relation name with the name of a tuple variable in the FROM clause. Renaming relations this way is always an option, even when it is not required.
Query: From Beers(name, manf), find all pairs of beers by the same manufacturer. Do not produce pairs like (Bud, Bud). Produce pairs in alphabetic order, e.g. (Bud, Miller).
SELECT b1.name, b2.name
FROM Beers b1, Beers b2
WHERE b1.manf = b2.manf AND b1.name < b2.name;

47 Example
SELECT b1.name, b2.name
FROM Beers b1, Beers b2
WHERE b1.manf = b2.manf AND b1.name < b2.name;
Beers b1 and Beers b2 are two copies of the same table:
Name         Manf
Heineken     Dutch
Bud          Anheuser-Busch
Beck's Beer  Bremen
Bud Lite     Anheuser-Busch
Each pair of tuples (b1, b2) is tested against the WHERE condition, which evaluates to True or False.
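The self-join can be tried out with `sqlite3`. This sketch assumes the four-row Beers table of this example, with Bud Lite attributed to Anheuser-Busch as in the earlier results slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Beers (name TEXT, manf TEXT)")
conn.executemany("INSERT INTO Beers VALUES (?, ?)", [
    ("Heineken", "Dutch"),
    ("Bud", "Anheuser-Busch"),
    ("Beck's Beer", "Bremen"),
    ("Bud Lite", "Anheuser-Busch"),
])

# b1 and b2 are tuple variables ranging over two copies of Beers.
pairs = conn.execute("""
    SELECT b1.name, b2.name
    FROM Beers b1, Beers b2
    WHERE b1.manf = b2.manf AND b1.name < b2.name
""").fetchall()

print(pairs)
```

The condition `b1.name < b2.name` does double duty: it rules out pairs such as (Bud, Bud) and keeps only one of the two orderings of each remaining pair.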

48 Subqueries
A subquery (a nested SQL query) may appear in the SELECT clause, the FROM clause, or the WHERE clause.

49 Example
Query: From Sells(bar, beer, price), find the bars that serve Heineken for the same price Bar X charges for Bud.
Subqueries:
Find the price Bar X charges for Bud.
Find the bars that serve Heineken at that price.

50 Scalar Subquery
SELECT bar
FROM Sells
WHERE beer = 'Heineken' AND
      price = (SELECT price
               FROM Sells
               WHERE bar = 'Bar X' AND beer = 'Bud');

51 Example
Subquery:
SELECT price FROM Sells WHERE bar = 'Bar X' AND beer = 'Bud';
Sells:
Bar     Beer      Price
Clinic  Heineken  8.00
…       Bud       6.60
Bar X   Bud       7.90
MOS     Heineken  7.90
Subquery result:
Price
7.90
Outer query with the subquery result substituted:
SELECT bar FROM Sells WHERE beer = 'Heineken' AND price = 7.90;
Result:
Bar
MOS
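The scalar subquery can be checked end to end with `sqlite3`. In this sketch the bar for the (Bud, 6.60) row is assumed to be Clinic, since the slide does not show it; the other three rows follow the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL)")
conn.executemany("INSERT INTO Sells VALUES (?, ?, ?)", [
    ("Clinic", "Heineken", 8.00),
    ("Clinic", "Bud", 6.60),      # bar is an assumption for this row
    ("Bar X", "Bud", 7.90),
    ("MOS", "Heineken", 7.90),
])

# Inner query alone: the price Bar X charges for Bud.
inner = conn.execute(
    "SELECT price FROM Sells WHERE bar = 'Bar X' AND beer = 'Bud'"
).fetchall()

# Full query: the inner query must yield exactly one value to act as a scalar.
result = conn.execute("""
    SELECT bar FROM Sells
    WHERE beer = 'Heineken'
      AND price = (SELECT price FROM Sells
                   WHERE bar = 'Bar X' AND beer = 'Bud')
""").fetchall()

print(inner, result)
```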

52 Operators in Table Subqueries
IN: <tuple> IN <relation> is true if and only if the tuple is a member of the relation.
EXISTS: EXISTS(<relation>) is true if and only if the <relation> is not empty, i.e. the nested query returns one or more tuples.
ANY: x = ANY(<relation>) is a boolean condition meaning that x equals at least one tuple in the relation.
ALL: x <> ALL(<relation>) is true if and only if for every tuple t in the relation, x is not equal to t.
Note: Any of the comparison operators (<, <=, =, etc.) can be used with ANY and ALL. The keyword NOT can precede any of the operators (e.g. s NOT IN R).
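IN, EXISTS, and NOT IN can be compared on a small example with `sqlite3`. The table contents below are hypothetical. SQLite does not accept the ANY and ALL subquery operators, so this sketch covers only the operators it supports.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Likes (drinker TEXT, beer TEXT)")
conn.execute("CREATE TABLE Frequents (drinker TEXT, bar TEXT)")
# Hypothetical rows for illustration.
conn.executemany("INSERT INTO Likes VALUES (?, ?)",
                 [("Melissa", "Heineken"), ("Sean", "Bud")])
conn.executemany("INSERT INTO Frequents VALUES (?, ?)",
                 [("Sally", "MOS"), ("Melissa", "Bar X")])

# Membership test: drinker appears in the subquery result.
with_in = conn.execute("""
    SELECT beer FROM Likes
    WHERE drinker IN (SELECT drinker FROM Frequents WHERE bar = 'Bar X')
""").fetchall()

# Non-emptiness test: the correlated subquery returns at least one tuple.
with_exists = conn.execute("""
    SELECT beer FROM Likes L
    WHERE EXISTS (SELECT * FROM Frequents F
                  WHERE F.drinker = L.drinker AND F.bar = 'Bar X')
""").fetchall()

# NOT in front of the operator negates the test.
with_not_in = conn.execute("""
    SELECT beer FROM Likes
    WHERE drinker NOT IN (SELECT drinker FROM Frequents WHERE bar = 'Bar X')
""").fetchall()

print(with_in, with_exists, with_not_in)
```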

53 Union, Intersection, Difference
Usefulness: UNION, INTERSECT, and EXCEPT are generally used to combine the results of two separate SQL queries.
Syntax:
( subquery ) UNION ( subquery )
( subquery ) INTERSECT ( subquery )
( subquery ) EXCEPT ( subquery )

54 Bag Semantics for SQL
Difference between Relational Algebra & SQL: Relations in SQL are bags instead of sets.
Default for SELECT-FROM-WHERE is bag.
Default for UNION, INTERSECT, and EXCEPT is set.
How to change the default?
Force set semantics with DISTINCT after SELECT.
Force bag semantics with ALL after UNION, etc.
Why?
When doing projection, it is easier to avoid eliminating duplicates.
When doing intersection or difference, it is most efficient to sort the relations first, and the duplicates can be eliminated at that point.

55 Example: DISTINCT
Query: From Sells(bar, beer, price), find all the different prices charged for beers.
SELECT DISTINCT price
FROM Sells;
Note: Without DISTINCT, each price would be listed as many times as there were bar/beer pairs at that price.
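The defaults described above (bag for SELECT, set for UNION) can be observed with `sqlite3`. The three Sells rows are hypothetical, chosen so that one price occurs twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL)")
# Hypothetical rows: two bar/beer pairs share the price 7.60.
conn.executemany("INSERT INTO Sells VALUES (?, ?, ?)", [
    ("Joe's", "Bud", 7.60),
    ("Clinic", "Bud", 7.60),
    ("Harry's", "Tiger", 9.50),
])

def prices(sql):
    """Run a one-column query and return its values as a sorted list."""
    return sorted(p for (p,) in conn.execute(sql))

bag = prices("SELECT price FROM Sells")                 # bag: duplicates kept
distinct = prices("SELECT DISTINCT price FROM Sells")   # set: forced by DISTINCT
union_set = prices(
    "SELECT price FROM Sells UNION SELECT price FROM Sells")      # set by default
union_bag = prices(
    "SELECT price FROM Sells UNION ALL SELECT price FROM Sells")  # bag: forced by ALL

print(bag, distinct, union_set, len(union_bag))
```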

56 ORDER BY Clause
Ordering Tuples: The query result is not ordered on any attribute by default. We can order the data using the ORDER BY clause. ASC sorts the data in ascending order, and DESC sorts it in descending order. The default is ASC.
Order of Sorted Attributes: The first attribute specified is sorted on first, then the second attribute is used to break any ties, etc.
What about NULL? How NULL sorts depends on the system: some (e.g. MySQL, SQLite, SQL Server) treat NULL as less than all non-null values, while others (e.g. PostgreSQL, Oracle) treat it as greater.

57 Example
Query: Using Beers(name, manf, price), list the beers (and their prices) that are made by Anheuser-Busch. List the more expensive beers first, and sort beers with the same price in ascending order according to their names.
SELECT name, price
FROM Beers
WHERE manf = 'Anheuser-Busch'
ORDER BY price DESC, name ASC;
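The two-level ordering can be verified with `sqlite3`. The prices below are hypothetical, picked so that two Anheuser-Busch beers tie on price and the second sort key has to break the tie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Beers (name TEXT, manf TEXT, price REAL)")
# Hypothetical prices; rows are inserted in no particular order.
conn.executemany("INSERT INTO Beers VALUES (?, ?, ?)", [
    ("Bud Lite", "Anheuser-Busch", 7.60),
    ("Heineken", "Dutch", 8.00),
    ("Michelob", "Anheuser-Busch", 8.00),
    ("Bud", "Anheuser-Busch", 7.60),
])

ordered = conn.execute("""
    SELECT name, price FROM Beers
    WHERE manf = 'Anheuser-Busch'
    ORDER BY price DESC, name ASC
""").fetchall()

print(ordered)
```

Michelob comes first because it is the most expensive; Bud precedes Bud Lite because the tie on price is broken by name in ascending order.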

58 Join Expressions
Joins in SQL: SQL provides a number of expression forms that act like varieties of join in relational algebra, but using bag semantics, not set semantics. These expressions can be stand-alone queries or used in place of relations in a FROM clause.
Natural Join: R NATURAL JOIN S;
Example: Likes NATURAL JOIN Sells;
Product: R CROSS JOIN S;

59 Theta Join
Syntax: R JOIN S ON <condition>
A theta-join using <condition> for selection.
Example: Using Drinkers(name, addr) and Frequents(drinker, bar):
Drinkers JOIN Frequents ON name = drinker;
Inner Join: General form for an equi-join:
R INNER JOIN S USING (<attribute list>)
This is an equi-join on <attribute list>.
Example: Likes INNER JOIN Frequents USING (drinker);

60 Outer Join
Syntax: R OUTER JOIN S is the core of an outerjoin expression.
Different Variants:
Optional NATURAL in front of OUTER.
Optional ON <condition> after JOIN.
Optional LEFT, RIGHT, or FULL before OUTER.
LEFT = pad dangling tuples of R only.
RIGHT = pad dangling tuples of S only.
FULL = pad both; this choice is the default in this formulation, although standard SQL and most systems require LEFT, RIGHT, or FULL to be written explicitly.
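The effect of padding dangling tuples can be seen by comparing an inner join with a LEFT OUTER JOIN in `sqlite3`. The rows are hypothetical: Sean is a drinker who frequents no bar, so he is the dangling tuple.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Drinkers (name TEXT, addr TEXT)")
conn.execute("CREATE TABLE Frequents (drinker TEXT, bar TEXT)")
# Hypothetical rows: Sean has no matching Frequents tuple.
conn.executemany("INSERT INTO Drinkers VALUES (?, ?)",
                 [("Sally", "hku"), ("Sean", "cuhk")])
conn.executemany("INSERT INTO Frequents VALUES (?, ?)", [("Sally", "MOS")])

inner = conn.execute("""
    SELECT name, bar
    FROM Drinkers JOIN Frequents ON name = drinker
    ORDER BY name
""").fetchall()

# LEFT pads dangling tuples of the left operand (Drinkers) with NULL.
left = conn.execute("""
    SELECT name, bar
    FROM Drinkers LEFT OUTER JOIN Frequents ON name = drinker
    ORDER BY name
""").fetchall()

print(inner)
print(left)
```

SQL NULL is returned as Python `None`, so Sean shows up in the outer join with `None` in the bar column.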

61 Aggregate Functions
Five functions:
COUNT - returns the number of values in a column
SUM - returns the sum of the values in a column
AVG - returns the average of the values in a column
MIN - returns the smallest value in a column
MAX - returns the largest value in a column
Rules:
COUNT, MAX, and MIN apply to all types of fields.
SUM and AVG apply only to numeric fields.
Except for COUNT(*), all functions ignore nulls.
COUNT(*) returns the number of rows in the table.
Use DISTINCT to eliminate duplicates.

62 Example
Query: From Sells(bar, beer, price), find the average price of Bud.
SELECT AVG(price)
FROM Sells
WHERE beer = 'Bud';

63 Example – Duplicate Elimination
Query: From Sells(bar, beer, price), find the number of different prices charged for Bud.
SELECT COUNT(DISTINCT price)
FROM Sells
WHERE beer = 'Bud';

64 Grouping
Motivation:
In many cases, we want to apply the aggregate functions to subgroups of tuples in a relation.
Each subgroup consists of the set of tuples that have the same value for the grouping attribute(s).
The function is applied to each subgroup independently.
GROUP BY: We may follow a SELECT-FROM-WHERE expression by GROUP BY and a list of attributes. The relation that results from the SELECT-FROM-WHERE is grouped according to the values of all those attributes, and any aggregation is applied only within each group.

65 Example
Query: From Sells(bar, beer, price), find the average price of each beer.
SELECT beer, AVG(price)
FROM Sells
GROUP BY beer;

66 Results
Sells:
Bar      Beer      Price
Joe's    Heineken  8.00
Sky Bar  Miller    9.00
Bar X    Tiger     7.60
Harry's  Tiger     9.50
Sells, grouped by Beer:
Bar      Beer      Price
Sky Bar  Miller    9.00
Joe's    Heineken  8.00
Bar X    Tiger     7.60
Harry's  Tiger     9.50
Result:
Beer      Avg(Price)
Miller    9.00
Heineken  8.00
Tiger     8.55
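The grouped averages can be reproduced with `sqlite3` using the four Sells rows of this example. Results are collected into a dict because SQL does not guarantee the order of groups without ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL)")
conn.executemany("INSERT INTO Sells VALUES (?, ?, ?)", [
    ("Joe's", "Heineken", 8.00),
    ("Sky Bar", "Miller", 9.00),
    ("Bar X", "Tiger", 7.60),
    ("Harry's", "Tiger", 9.50),
])

# One output row per distinct beer; AVG is computed within each group.
averages = dict(conn.execute(
    "SELECT beer, AVG(price) FROM Sells GROUP BY beer"
))

for beer, avg in sorted(averages.items()):
    print(beer, round(avg, 2))
```

Tiger is the only group with more than one tuple, so its average (7.60 + 9.50) / 2 is the only value that differs from an individual price.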

67 HAVING Clauses
Syntax and Semantics: HAVING <condition> may follow a GROUP BY clause. If so, the condition applies to each group, and groups not satisfying the condition are eliminated.

68 Example: Having
Query: From Sells(bar, beer, price) and Beers(name, manf), find the average price of those beers that are either served in at least three bars or are manufactured by Pete's.
The result keeps beer groups with at least 3 non-NULL bars and also beer groups where the manufacturer is Pete's.
SELECT beer, AVG(price)
FROM Sells
GROUP BY beer
HAVING COUNT(bar) >= 3 OR
       beer IN (SELECT name
                FROM Beers
                WHERE manf = 'Pete''s');
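The HAVING query can be exercised with `sqlite3`. All rows, including the beer name Wicked Ale, are hypothetical; an ORDER BY is added here only to make the printed output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL)")
conn.execute("CREATE TABLE Beers (name TEXT, manf TEXT)")
# Hypothetical data: Bud is sold in three bars, the other beers in one each.
conn.executemany("INSERT INTO Sells VALUES (?, ?, ?)", [
    ("Joe's", "Bud", 7.00),
    ("Clinic", "Bud", 8.00),
    ("Bar X", "Bud", 9.00),
    ("Joe's", "Tiger", 7.50),
    ("Clinic", "Wicked Ale", 10.00),
])
conn.executemany("INSERT INTO Beers VALUES (?, ?)", [
    ("Bud", "Anheuser-Busch"),
    ("Tiger", "APB"),
    ("Wicked Ale", "Pete's"),
])

# A quote inside a SQL string literal is written as two single quotes.
result = conn.execute("""
    SELECT beer, AVG(price)
    FROM Sells
    GROUP BY beer
    HAVING COUNT(bar) >= 3
        OR beer IN (SELECT name FROM Beers WHERE manf = 'Pete''s')
    ORDER BY beer
""").fetchall()

print(result)
```

Tiger is dropped because its group has a single bar and its manufacturer is not Pete's; Bud qualifies through the count and Wicked Ale through the subquery.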

69 SQL Summary
SQL:
SELECT <attribute list>
FROM <table list>
[WHERE <condition>]
[GROUP BY <grouping attributes>]
[HAVING <group condition>]
[ORDER BY <attribute list>]
Clauses in square brackets ([ ]) are optional.
Evaluation: A query is evaluated by first applying the WHERE clause, then GROUP BY and HAVING, and finally the SELECT clause.

70 More SQL
Database Modification
Creation & Deletion of Tables
Reference: Hector Garcia-Molina, Jeffrey Ullman, Jennifer Widom. Database Systems: The Complete Book, Second Edition (Prentice Hall).

