Lecture 07: Relational Algebra

Outline
  Relational Algebra (Section 6.1)

Relational Algebra
  Formalism for creating new relations from existing ones
  Its place in the big picture:
    Declarative query language (SQL, relational calculus) → Algebra (relational algebra) → Implementation

Relational Algebra
  Five operators:
    Union: ∪
    Difference: −
    Selection: σ
    Projection: Π
    Cartesian Product: ×
  Derived or auxiliary operators:
    Intersection, complement
    Joins (natural, equi-join, theta join, semi-join)
    Renaming: ρ

1. Union and 2. Difference
  R1 ∪ R2
    Example: ActiveEmployees ∪ RetiredEmployees
  R1 − R2
    Example: AllEmployees − RetiredEmployees
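As a minimal sketch of these two operators, relations can be modeled as Python sets of tuples. The rows below (including the third employee) are made up for illustration and are not from the slides.

```python
# Relations as sets of (name, ssn) tuples; all rows are hypothetical.
active_employees = {("John", "999999999"), ("Tony", "777777777")}
retired_employees = {("Mary", "555555555")}

# R1 ∪ R2: every tuple that appears in either relation.
all_employees = active_employees | retired_employees

# R1 − R2: tuples of the first relation that are absent from the second.
still_active = all_employees - retired_employees

print(sorted(all_employees))
print(sorted(still_active))
```

Both operands must share the same schema for the result to be a meaningful relation; the sketch relies on that without checking it.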

What about Intersection?
  It is a derived operator
  R1 ∩ R2 = R1 − (R1 − R2)
  Also expressed as a join (will see later)
  Example: UnionizedEmployees ∩ RetiredEmployees
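The identity can be checked on a small instance. The sketch below uses Python sets with made-up one-column rows; it confirms the identity for this data only and is not a proof.

```python
# Hypothetical single-attribute relations (employee names).
unionized = {("John",), ("Tony",), ("Mary",)}
retired = {("Mary",), ("Lou",)}

# Intersection built only from difference: R1 − (R1 − R2).
derived = unionized - (unionized - retired)

# It agrees with the built-in set intersection.
assert derived == unionized & retired
print(derived)  # {('Mary',)}
```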

3. Selection
  Returns all tuples which satisfy a condition
  Notation: σ c (R)
  Examples:
    σ Salary > 40000 (Employee)
    σ name = “Smith” (Employee)
  The condition c can use the comparisons =, <, ≤, >, ≥, <>
  [In SQL: SELECT * FROM Employee WHERE Salary > 40000]

Find all employees with salary more than $40,000.
  σ Salary > 40000 (Employee)
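One way to picture selection is as a filter over a set of tuples. In this sketch the `Employee` tuple type, the `select` helper, and the salary figures are all assumptions introduced for illustration.

```python
from typing import NamedTuple


class Employee(NamedTuple):
    name: str
    ssn: str
    salary: int


def select(condition, relation):
    """σ condition (relation): keep the tuples for which condition holds."""
    return {t for t in relation if condition(t)}


# Hypothetical instance.
employees = {
    Employee("John", "999999999", 30000),
    Employee("Tony", "777777777", 50000),
}

high_paid = select(lambda e: e.salary > 40000, employees)
print(high_paid)
```

The output schema is the same as the input schema; only the set of rows shrinks.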

4. Projection
  Eliminates columns, then removes duplicates
  Notation: Π A1,…,An (R)
  Example: project to social-security number and names:
    Π SSN, Name (Employee)
  Output schema: Answer(SSN, Name)
  [In SQL: SELECT DISTINCT SSN, Name FROM Employee]

Π SSN, Name (Employee)
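A small sketch of projection under set semantics follows. The `project` helper, the extra Salary column, and the rows are hypothetical; projecting onto Name and Salary is chosen because it makes the duplicate removal visible.

```python
def project(attrs, schema, relation):
    """Π attrs (relation): keep only the named columns; the set drops duplicates."""
    positions = [schema.index(a) for a in attrs]
    return {tuple(row[i] for i in positions) for row in relation}


schema = ("SSN", "Name", "Salary")
employee = {
    ("999999999", "John", 30000),
    ("777777777", "Tony", 50000),
    ("111111111", "John", 30000),  # differs from the first row only in SSN
}

name_salary = project(("Name", "Salary"), schema, employee)
print(sorted(name_salary))
```

Three input rows yield two output rows here, because two of them agree on every projected column.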

5. Cartesian Product
  Combine each tuple in R1 with each tuple in R2
  Notation: R1 × R2
  Example: Employee × Dependents
  Very rare in practice; mainly used to express joins
  [In SQL: SELECT * FROM R1, R2]
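A sketch of the Cartesian product using `itertools.product`, with relations again modeled as sets of tuples. The column layout, Employee(Name, SSN) and Dependents(SSN, Dname), is an assumption of this example.

```python
from itertools import product

employee = {("John", "999999999"), ("Tony", "777777777")}
dependents = {("999999999", "Emily"), ("777777777", "Joe")}

# R1 × R2: concatenate every tuple of R1 with every tuple of R2.
cross = {e + d for e, d in product(employee, dependents)}

print(len(cross))  # 2 * 2 = 4 combined tuples
```

The result size is the product of the input sizes, which is why the operator is rarely used on its own.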

Renaming
  Changes the schema, not the instance
  Schema: R(A1, …, An)
  Notation: ρ B1,…,Bn (R)
  Example: ρ LastName, SocSocNo (Employee)
  Output schema: Answer(LastName, SocSocNo)
  [In SQL: SELECT Name AS LastName, SSN AS SocSocNo FROM Employee]

Renaming Example
  Employee
    Name    SSN
    John    999999999
    Tony    777777777
  ρ LastName, SocSocNo (Employee)
    LastName    SocSocNo
    John        999999999
    Tony        777777777
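Renaming can be sketched by keeping the schema separate from the rows. The `(schema, rows)` pair and the `rename` helper are representation choices made for this example only.

```python
def rename(new_attrs, relation):
    """ρ new_attrs (relation): replace the attribute names, leave the rows untouched."""
    schema, rows = relation
    if len(new_attrs) != len(schema):
        raise ValueError("renaming must supply one name per attribute")
    return (tuple(new_attrs), rows)


employee = (("Name", "SSN"), {("John", "999999999"), ("Tony", "777777777")})
renamed = rename(("LastName", "SocSocNo"), employee)

print(renamed[0])                 # ('LastName', 'SocSocNo')
print(renamed[1] is employee[1])  # True: the instance is the very same set
```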

Natural Join
  Notation: R1 ⋈ R2
  Meaning: R1 ⋈ R2 = Π A (σ C (R1 × R2))
  Where:
    The selection σ C checks equality of all common attributes
    The projection eliminates the duplicate common attributes
  [In SQL, for schemas R1(A, B), R2(B, C): SELECT DISTINCT R1.A, R1.B, R2.C FROM R1, R2 WHERE R1.B = R2.B]

Natural Join Example
  Employee
    Name    SSN
    John    999999999
    Tony    777777777
  Dependents
    SSN          Dname
    999999999    Emily
    777777777    Joe
  Employee ⋈ Dependents = Π Name, SSN, Dname (σ SSN=SSN2 (Employee × ρ SSN2, Dname (Dependents)))
    Name    SSN          Dname
    John    999999999    Emily
    Tony    777777777    Joe
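The example can be reproduced with a short sketch that models each row as a dictionary keyed by attribute name. The `natural_join` helper is an illustration written for this note, not how a database engine implements joins.

```python
def natural_join(r, s):
    """Pair rows that agree on every shared attribute and merge them into one row."""
    result = []
    for t in r:
        for u in s:
            common = t.keys() & u.keys()
            if all(t[a] == u[a] for a in common):
                merged = {**t, **u}  # shared attributes appear only once
                if merged not in result:
                    result.append(merged)
    return result


employee = [
    {"Name": "John", "SSN": "999999999"},
    {"Name": "Tony", "SSN": "777777777"},
]
dependents = [
    {"SSN": "999999999", "Dname": "Emily"},
    {"SSN": "777777777", "Dname": "Joe"},
]

for row in natural_join(employee, dependents):
    print(row)
```

When the two inputs share no attribute name, the equality test is vacuously true and the helper returns every pairing, that is, the Cartesian product.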

Natural Join
  [Example tables: R(A, B), S(B, C), and the result R ⋈ S(A, B, C)]

Natural Join
  Given the schemas R(A, B, C, D), S(A, C, E), what is the schema of R ⋈ S?
  Given R(A, B, C), S(D, E), what is R ⋈ S?
  Given R(A, B), S(A, B), what is R ⋈ S?

Theta Join
  A join that involves a predicate
  R1 ⋈ θ R2 = σ θ (R1 × R2)
  Here θ can be any condition

Equi-join
  A theta join where θ is an equality
  R1 ⋈ A=B R2 = σ A=B (R1 × R2)
  Example: Employee ⋈ SSN=SSN Dependents
  Most useful join in practice (how does it differ from the natural join?)
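A theta join can be sketched as a filtered Cartesian product, with the equi-join as the special case where the predicate is an equality. The `theta_join` helper and the positional column layout, Employee(Name, SSN) and Dependents(SSN, Dname), are assumptions of this example.

```python
from itertools import product


def theta_join(r, s, theta):
    """R1 ⋈θ R2 = σθ (R1 × R2): keep the concatenated pairs that satisfy theta."""
    return {t + u for t, u in product(r, s) if theta(t, u)}


employee = {("John", "999999999"), ("Tony", "777777777")}
dependents = {("999999999", "Emily"), ("777777777", "Joe")}

# Equi-join on Employee.SSN = Dependents.SSN.
equi = theta_join(employee, dependents, lambda e, d: e[1] == d[0])

for row in sorted(equi):
    print(row)
```

Each result row carries the SSN twice, once from each input. A natural join would project one of the two copies away.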

Semijoin
  R ⋉ S = Π A1,…,An (R ⋈ S)
  Where A1, …, An are the attributes in R
  Example: Employee ⋉ Dependents
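A semijoin keeps the rows of R that have at least one join partner in S, without adding any columns from S. In the sketch below the `semijoin` helper takes the join columns as positions, and the third employee is a made-up row with no dependents.

```python
def semijoin(r, s, r_key, s_key):
    """R ⋉ S: rows of r whose join column matches some row of s."""
    keys = {u[s_key] for u in s}
    return {t for t in r if t[r_key] in keys}


employee = {("John", "999999999"), ("Tony", "777777777"), ("Ann", "555555555")}
dependents = {("999999999", "Emily"), ("777777777", "Joe")}

with_dependents = semijoin(employee, dependents, r_key=1, s_key=0)
print(sorted(with_dependents))
```

The output has exactly the schema of Employee, matching the projection onto the attributes of R in the definition.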

Semijoins in Distributed Databases
  Semijoins are used in distributed databases
  Employee(SSN, Name, . . .) and Dependents(SSN, Dname, Age, . . .) are stored at different sites, connected by a network
  Query: Employee ⋈ ssn=ssn (σ age>71 (Dependents))
  T = Π SSN (σ age>71 (Dependents))
  R = Employee ⋉ T
  Answer = R ⋈ Dependents
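The steps can be traced on a toy instance. All rows and ages below are made up, and the final step joins R with the already filtered dependents so that the result matches the original query; that choice is an assumption of this sketch.

```python
# Hypothetical instances: Employee(SSN, Name) and Dependents(SSN, Dname, Age).
employee = {("999999999", "John"), ("777777777", "Tony"), ("555555555", "Ann")}
dependents = {
    ("999999999", "Emily", 80),
    ("999999999", "Sam", 12),
    ("777777777", "Joe", 40),
}

# At the Dependents site: apply the selection, then ship only the SSN column.
old_dependents = {d for d in dependents if d[2] > 71}
t = {d[0] for d in old_dependents}

# At the Employee site: R = Employee ⋉ T keeps only employees that can join.
r = {e for e in employee if e[0] in t}

# Final join of the reduced Employee relation with the selected dependents.
answer = {e + d[1:] for e in r for d in old_dependents if e[0] == d[0]}

# Reference result: the query evaluated directly on the full relations.
direct = {e + d[1:] for e in employee for d in old_dependents if e[0] == d[0]}

print(answer == direct, len(t), len(r))
```

Only the small relation T and the reduced relation R cross the network, rather than the full Employee or Dependents tables.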

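Complex RA Expressions
  [Expression tree over the relations Person, Purchase, Person, Product: Π name at the root; joins on buyer-ssn=ssn, pid=pid, and seller-ssn=ssn; projections Π ssn and Π pid; selections σ name=fred and σ name=gizmo]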

Application: Query Rewriting for Optimization
  Original plan: Π sname (σ bid=100 (σ rating>5 (Reserves ⋈ sid=sid Sailors)))
  Rewritten plan: Π sname ((σ bid=100 (Reserves)) ⋈ sid=sid (σ rating>5 (Sailors)))
    σ bid=100 (Reserves): scan; write to temp T1
    σ rating>5 (Sailors): scan; write to temp T2
  If the selection predicates overlap significantly, this requires unnecessary computation and memory consumption.
  The earlier we process selections, the fewer tuples we need to manipulate higher up in the tree (predicate pushdown).
  Disadvantages?
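The effect of pushing selections below the join can be seen on a toy instance. The rows here are made up, with assumed layouts Reserves(sid, bid) and Sailors(sid, sname, rating); the sketch only counts intermediate tuples and says nothing about real execution cost.

```python
# Hypothetical instances.
reserves = [(1, 100), (2, 100), (3, 101), (1, 102)]
sailors = [(1, "Ann", 7), (2, "Bob", 3), (3, "Cy", 9)]

# Plan 1: join first, then apply both selections, then project sname.
joined = [(r, s) for r in reserves for s in sailors if r[0] == s[0]]
plan1 = {s[1] for r, s in joined if r[1] == 100 and s[2] > 5}

# Plan 2: apply each selection first (temps T1 and T2), then join the temps.
t1 = [r for r in reserves if r[1] == 100]
t2 = [s for s in sailors if s[2] > 5]
pushed = [(r, s) for r in t1 for s in t2 if r[0] == s[0]]
plan2 = {s[1] for r, s in pushed}

print(plan1 == plan2, len(joined), len(pushed))  # True 4 1
```

Both plans return the same names, but the join in the second plan produces one intermediate pair instead of four.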

Algebraic Laws (Examples)
  Commutative and associative laws
    R ∩ S = S ∩ R, R ∩ (S ∩ T) = (R ∩ S) ∩ T
    R ⋈ S = S ⋈ R, R ⋈ (S ⋈ T) = (R ⋈ S) ⋈ T
  Laws involving selection
    σ C AND C’ (R) = σ C (σ C’ (R)) = σ C (R) ∩ σ C’ (R)
    σ C (R ⋈ S) = σ C (R) ⋈ S, when C involves only attributes of R
  Laws involving projections
    Π M (Π N (R)) = Π M (R), when M ⊆ N
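The selection law for a conjunction can be spot-checked on a small relation. The rows and the two conditions in this sketch are arbitrary, and a check on one instance illustrates the law without proving it.

```python
def select(condition, relation):
    """σ condition (relation) over a set of tuples."""
    return {t for t in relation if condition(t)}


def c1(t):
    return t[0] > 1


def c2(t):
    return t[1] == "a"


r = {(1, "a"), (2, "b"), (3, "a"), (4, "c")}

conjunction = select(lambda t: c1(t) and c2(t), r)  # σ C AND C' (R)
cascade = select(c1, select(c2, r))                 # σ C (σ C' (R))
intersection = select(c1, r) & select(c2, r)        # σ C (R) ∩ σ C' (R)

print(conjunction == cascade == intersection)  # True
```

An optimizer can use such equivalences to pick whichever form is cheaper to evaluate.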

Operations on Bags
  A bag = a set with repeated elements
  All operations need to be defined carefully on bags:
    {a,b,b,c} ∪ {a,b,b,b,e,f,f} = {a,a,b,b,b,b,b,c,e,f,f}
    {a,b,b,b,c,c} − {b,c,c,c,d} = {a,b,b}
    σ C (R): preserve the number of occurrences
    Π A (R): no duplicate elimination
    Cartesian product, join: no duplicate elimination
  Important! Relational engines work on bags, not sets!
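Bag operations can be explored with `collections.Counter`, which stores a multiplicity per element. Mapping bag union to Counter addition and bag difference to Counter subtraction is the modeling choice of this sketch; under subtraction, counts never go below zero and elements absent from the left bag never appear in the result.

```python
from collections import Counter

# Bag union adds multiplicities.
bag_union = Counter("abbc") + Counter("abbbeff")
print(sorted(bag_union.elements()))

# Bag difference subtracts multiplicities and drops non-positive counts.
bag_difference = Counter("abbbcc") - Counter("bcccd")
print(sorted(bag_difference.elements()))

# Projection on a bag keeps duplicates: a list, not a set.
rows = [("John", 30000), ("John", 30000), ("Tony", 50000)]
names = [name for name, _salary in rows]
print(names)
```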

Finally: RA has Limitations!
  Cannot compute “transitive closure”
    Name1    Name2    Relationship
    Fred     Mary     Father
             Joe      Cousin
             Bill     Spouse
    Nancy    Lou      Sister
  Find all direct and indirect relatives of Fred
  Cannot express in RA! Need to write a C program
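In a general-purpose language the closure is a short loop that repeats a join step until nothing new appears. The sketch below uses Python rather than C, and its pairs are hypothetical since they are not taken from the table on the slide.

```python
def transitive_closure(pairs):
    """Repeat one join step until no new pair is produced (a fixpoint)."""
    closure = set(pairs)
    while True:
        new_pairs = {
            (a, d) for a, b in closure for c, d in closure if b == c
        } - closure
        if not new_pairs:
            return closure
        closure |= new_pairs


# Hypothetical "is related to" pairs.
related = {("Fred", "Mary"), ("Mary", "Joe"), ("Joe", "Bill"), ("Nancy", "Lou")}

closure = transitive_closure(related)
relatives_of_fred = {b for a, b in closure if a == "Fred"}
print(sorted(relatives_of_fred))  # ['Bill', 'Joe', 'Mary']
```

The number of loop iterations depends on the data, which is why no single fixed relational algebra expression can replace the loop.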

Formulating queries in RA
  Consider a database for student enrollment in courses, and books used in the courses:
    STUDENT (SSN, Name, Major, Bdate)
    COURSE (Course#, Cname, Dept)
    ENROLL (SSN, Course#, Quarter, Grade)
    BOOK_ADOPTION (Course#, Quarter, Book_ISBN)
    TEXT (Book_ISBN, Book_Title, Publisher, Author)

Formulating queries in RA
  Specify the following queries in relational algebra:
    List the number of courses (Course#) taken by all students named ‘John Smith’ in Winter 1999 (i.e., Quarter = W99)
    List any department which has all its adopted books published by ‘BC Publishing’

Formulating Queries in RA
  Π Course# (σ Quarter=W99 ((σ Name=‘John Smith’ (STUDENT)) ⋈ ENROLL))
  OtherDept = Π Dept ((σ Publisher <> ‘BC Publishing’ (BOOK_ADOPTION ⋈ TEXT)) ⋈ COURSE)
  AllDept = Π Dept (BOOK_ADOPTION ⋈ COURSE)
  Answer = AllDept − OtherDept
  And how will you express it in SQL? WHY?
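The second query follows a common pattern: compute every candidate, compute the candidates that violate the condition, and subtract. The sketch below replays that pattern with made-up data, using dictionaries as stand-ins for the joins and keeping only the columns the query needs.

```python
# Hypothetical data, reduced to the relevant columns.
course_dept = {"C1": "CS", "C2": "CS", "C3": "Math"}   # COURSE: Course# -> Dept
book_adoption = {("C1", "b1"), ("C2", "b2"), ("C3", "b3")}  # (Course#, Book_ISBN)
book_publisher = {                                     # TEXT: Book_ISBN -> Publisher
    "b1": "BC Publishing",
    "b2": "Other Press",
    "b3": "BC Publishing",
}

# AllDept: departments that adopted at least one book.
all_dept = {course_dept[course] for course, _isbn in book_adoption}

# OtherDept: departments that adopted some book from another publisher.
other_dept = {
    course_dept[course]
    for course, isbn in book_adoption
    if book_publisher[isbn] != "BC Publishing"
}

answer = all_dept - other_dept
print(answer)  # {'Math'}
```

The difference is what turns "has some book from another publisher" into "has all its books from this publisher".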