
From the Calculus to the Structured Query Language Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems September 19, 2007.


1 From the Calculus to the Structured Query Language
Zachary G. Ives, University of Pennsylvania
CIS 550 – Database & Information Systems
September 19, 2007
Some slide content courtesy of Susan Davidson & Raghu Ramakrishnan

2 Administrivia
• Recall: no class next Monday 9/24; special TA office hours instead
• SQL discussion continues 9/26
• Preparation for Homework 2 (handed out next week)
• To test your SQL queries, we have Oracle set up on eniac.seas.upenn.edu
  - Go to: www.seas.upenn.edu/~zives/cis550/oracle-faq.html
  - Click on the “create Oracle account” link
  - Enter your login info so you’ll get an Oracle account

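3 Recall Last Time
• Which students have taken more than one course from the same professor?
$\{\langle name \rangle \mid \exists\, sid, cid, fid, cid2.\ (\langle sid, name \rangle \in \text{STUDENTS} \wedge \langle sid, \_, cid \rangle \in \text{Takes} \wedge \langle fid, cid \rangle \in \text{Teaches} \wedge \langle sid, \_, cid2 \rangle \in \text{Takes} \wedge \langle fid, cid2 \rangle \in \text{Teaches} \wedge cid \neq cid2)\}$
OR
$\{\langle name \rangle \mid \exists\, sid, cid, fid.\ (\langle sid, name \rangle \in \text{STUDENTS} \wedge \langle sid, \_, cid \rangle \in \text{Takes} \wedge \langle fid, cid \rangle \in \text{Teaches} \wedge \exists\, cid2\ (\langle sid, \_, cid2 \rangle \in \text{Takes} \wedge \langle fid, cid2 \rangle \in \text{Teaches} \wedge cid \neq cid2))\}$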

4 Algebra vs. Calculus
• We’ve claimed that the calculus (when safe) and the algebra are equivalent
• Thus (core) SQL ⇒ calculus ⇔ algebra makes sense
• Let’s look more closely at this…
    SELECT *
    FROM STUDENT, Takes, COURSE
    WHERE STUDENT.sid = Takes.sid AND Takes.cid = COURSE.cid
[Diagram: the query, its Calculus form, and the relations STUDENT, Takes, COURSE]

5 Translating from RA to DRC
• Core of relational algebra: $\pi$, $\sigma$, $\cup$, $\times$, $-$
• We need to work our way through the structure of an RA expression, translating each possible form.
  - Let TR[e] be the translation of RA expression e into DRC.
• Relation names: For the RA expression R, the DRC expression is $\{\langle x_1, \dots, x_n \rangle \mid \langle x_1, \dots, x_n \rangle \in R\}$

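6 Selection: TR[$\sigma_\theta R$]
• Suppose we have $\sigma_\theta(e')$, where $e'$ is another RA expression that translates as: TR[$e'$] = $\{\langle x_1, \dots, x_n \rangle \mid p\}$
• Then the translation of $\sigma_\theta(e')$ is $\{\langle x_1, \dots, x_n \rangle \mid p \wedge \theta'\}$, where $\theta'$ is obtained from $\theta$ by replacing each attribute with the corresponding variable
• Example: TR[$\sigma_{\#1=\#2 \,\wedge\, \#4>2.5} R$] (if R has arity 4) is $\{\langle x_1, x_2, x_3, x_4 \rangle \mid \langle x_1, x_2, x_3, x_4 \rangle \in R \wedge x_1 = x_2 \wedge x_4 > 2.5\}$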

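7 Projection: TR[$\pi_{i_1, \dots, i_m}(e)$]
• If TR[$e$] = $\{\langle x_1, \dots, x_n \rangle \mid p\}$ then TR[$\pi_{i_1, i_2, \dots, i_m}(e)$] = $\{\langle x_{i_1}, x_{i_2}, \dots, x_{i_m} \rangle \mid \exists\, x_{j_1}, x_{j_2}, \dots, x_{j_k}.\ p\}$, where $x_{j_1}, x_{j_2}, \dots, x_{j_k}$ are the variables in $x_1, x_2, \dots, x_n$ that are not in $x_{i_1}, x_{i_2}, \dots, x_{i_m}$
• Example: With R as before, TR[$\pi_{\#1,\#3}(R)$] = $\{\langle x_1, x_3 \rangle \mid \exists\, x_2, x_4.\ \langle x_1, x_2, x_3, x_4 \rangle \in R\}$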

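8 Union: TR[$R_1 \cup R_2$]
• $R_1$ and $R_2$ must have the same arity
• For $e_1 \cup e_2$, where $e_1$, $e_2$ are algebra expressions: TR[$e_1$] = $\{\langle x_1, \dots, x_n \rangle \mid p\}$ and TR[$e_2$] = $\{\langle y_1, \dots, y_n \rangle \mid q\}$
• Relabel the variables in the second: TR[$e_2$] = $\{\langle x_1, \dots, x_n \rangle \mid q'\}$
  - This may involve relabeling bound variables in $q$ to avoid clashes
• TR[$e_1 \cup e_2$] = $\{\langle x_1, \dots, x_n \rangle \mid p \vee q'\}$
• Example: TR[$R_1 \cup R_2$] = $\{\langle x_1, \dots, x_n \rangle \mid \langle x_1, \dots, x_n \rangle \in R_1 \vee \langle x_1, \dots, x_n \rangle \in R_2\}$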

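9 Other Binary Operators
• Difference: The same conditions hold as for union
  - If TR[$e_1$] = $\{\langle x_1, \dots, x_n \rangle \mid p\}$ and TR[$e_2$] = $\{\langle x_1, \dots, x_n \rangle \mid q\}$
  - Then TR[$e_1 - e_2$] = $\{\langle x_1, \dots, x_n \rangle \mid p \wedge \neg q\}$
• Product:
  - If TR[$e_1$] = $\{\langle x_1, \dots, x_n \rangle \mid p\}$ and TR[$e_2$] = $\{\langle y_1, \dots, y_m \rangle \mid q\}$
  - Then TR[$e_1 \times e_2$] = $\{\langle x_1, \dots, x_n, y_1, \dots, y_m \rangle \mid p \wedge q\}$
• Example: TR[$R \times S$] = $\{\langle x_1, \dots, x_n, y_1, \dots, y_m \rangle \mid \langle x_1, \dots, x_n \rangle \in R \wedge \langle y_1, \dots, y_m \rangle \in S\}$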
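
The TR[·] rules for relation names, selection, projection, union, difference, and product can be tried out with a small recursive translator. This is only a sketch: the AST classes (`Rel`, `Select`, `Project`, `Binary`) and the `#i` column syntax inside condition strings are hypothetical, and instead of relabeling variables after the fact it passes the target variable names down the tree, which yields the same formulas up to variable renaming.

```python
import re
from dataclasses import dataclass
from itertools import count

# Hypothetical mini-AST for the core relational algebra operators.
@dataclass
class Rel:
    name: str
    arity: int

@dataclass
class Select:
    cond: str        # columns are written #1, #2, ...
    child: object

@dataclass
class Project:
    cols: list       # 1-based column positions, assumed distinct
    child: object

@dataclass
class Binary:
    op: str          # "union", "diff", or "product"
    left: object
    right: object

def arity(e):
    if isinstance(e, Rel):
        return e.arity
    if isinstance(e, Select):
        return arity(e.child)
    if isinstance(e, Project):
        return len(e.cols)
    if e.op == "product":
        return arity(e.left) + arity(e.right)
    return arity(e.left)  # union/difference: both sides share one arity

def formula(e, xs, fresh):
    """Return a DRC formula whose free variables are exactly the names in xs."""
    if isinstance(e, Rel):
        return f"<{','.join(xs)}> ∈ {e.name}"
    if isinstance(e, Select):
        # Replace each attribute #i with the corresponding variable.
        cond = re.sub(r"#(\d+)", lambda m: xs[int(m.group(1)) - 1], e.cond)
        return f"({formula(e.child, xs, fresh)} ∧ {cond})"
    if isinstance(e, Project):
        inner = [None] * arity(e.child)
        for x, col in zip(xs, e.cols):
            inner[col - 1] = x
        hidden = []
        for i, v in enumerate(inner):
            if v is None:  # projected-away column: quantify a fresh variable
                inner[i] = f"y{next(fresh)}"
                hidden.append(inner[i])
        body = formula(e.child, inner, fresh)
        return f"∃ {','.join(hidden)}. ({body})" if hidden else body
    n = arity(e.left)
    if e.op == "product":
        return f"({formula(e.left, xs[:n], fresh)} ∧ {formula(e.right, xs[n:], fresh)})"
    p, q = formula(e.left, xs, fresh), formula(e.right, xs, fresh)
    return f"({p} ∨ {q})" if e.op == "union" else f"({p} ∧ ¬({q}))"

def TR(e):
    xs = [f"x{i}" for i in range(1, arity(e) + 1)]
    return f"{{<{','.join(xs)}> | {formula(e, xs, count(1))}}}"

R = Rel("R", 4)
print(TR(Select("#1=#2 ∧ #4>2.5", R)))
print(TR(Project([1, 3], R)))
print(TR(Binary("product", Rel("R", 2), Rel("S", 1))))
```

The projection output names its variables differently from the slide (`x1, x2` with quantified `y1, y2` rather than `x1, x3` with quantified `x2, x4`), but the two formulas are equivalent.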

10 What about the Tuple Relational Calculus?
• We’ve been looking at the Domain Relational Calculus
• The Tuple Relational Calculus is nearly the same, but variables are at the level of a tuple, not an attribute
• $\{Q \mid \exists S \in \text{COURSES}, \exists T \in \text{Takes}\ (S.cid = T.cid \wedge Q.cid = S.cid \wedge Q.\text{exp-grade} = T.\text{exp-grade})\}$
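
One way to read the TRC query above is as a set comprehension in which each tuple variable ranges over a relation. A minimal sketch, assuming hypothetical Python stand-ins for COURSES and Takes filled with the rows of the deck's example data instance (`exp_grade` stands in for `exp-grade`):

```python
from collections import namedtuple

# Hypothetical stand-ins for the two relations.
Course = namedtuple("Course", "cid subj sem")
Takes = namedtuple("Takes", "sid exp_grade cid")

COURSES = {Course("550-0105", "DB", "F05"),
           Course("700-1005", "AI", "S05"),
           Course("501-0105", "Arch", "F05")}
TAKES = {Takes(1, "A", "550-0105"),
         Takes(1, "A", "700-1005"),
         Takes(3, "C", "501-0105")}

# S and T play the role of the existentially quantified tuple variables;
# each result pair is the output tuple Q = (Q.cid, Q.exp-grade).
Q = {(S.cid, T.exp_grade) for S in COURSES for T in TAKES if S.cid == T.cid}
print(sorted(Q))
```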

11 Tuple Relational Calculus (in More Detail)
Queries of form: $\{T \mid p\}$, where $p$ is the predicate
Predicate: boolean expression over $T_x$ attribs
• Expressions:
  - $T_x \in R$
  - $T_X.a \ \mathit{op}\ T_Y.b$
  - $T_X.a \ \mathit{op}\ \mathit{const}$
  - $\mathit{const} \ \mathit{op}\ T_X.a$
  - $T.a = T_x.a$
  where $\mathit{op}$ is $=, \neq, <, >, \leq, \geq$; $T_x, \dots$ are tuple variables, and $T_x.a, \dots$ are attributes
• Complex expressions: $e_1 \wedge e_2$, $e_1 \vee e_2$, $\neg e$, and $e_1 \Rightarrow e_2$
• Universal and existential quantifiers

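12 Domain Relational Calculus to Tuple Relational Calculus
• $\{\langle subj, \text{exp-grade} \rangle \mid \exists\, cid, sem, sid\ (\langle cid, subj, sem \rangle \in \text{COURSE} \wedge \langle sid, \text{exp-grade}, cid \rangle \in \text{Takes})\}$
• $\{\langle cid \rangle \mid \exists\, s1, s2\ (\langle cid, s1, s2 \rangle \in \text{COURSE} \wedge \exists\, cid2, s3, s4\ (\langle cid2, s3, s4 \rangle \in \text{COURSE} \wedge (cid > cid2)))\}$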

13 Mini-Quiz on the Relational Calculus
How do you write:
• TRC: Which faculty teach every course?

14 Limitations of the Relational Algebra / Calculus
Can’t do:
• Aggregate operations (sum, count)
• Recursive queries (arbitrary # of joins)
• Complex (non-tabular) structures
• Most of these are expressible in SQL, OQL, XQuery, using other special operators
• Sometimes we even need the power of a Turing-complete programming language

15 Summary
• Can translate relational algebra into relational calculus
  - DRC and TRC are slightly different syntaxes but equivalent
• Given syntactic restrictions that guarantee safety of a DRC query, can translate back to relational algebra
• These are the principles behind the initial development of relational databases
  - SQL is close to calculus; query plan is close to algebra
  - Great example of theory leading to practice!

16 Basic SQL: A Friendly Face Over the Tuple Relational Calculus
    SELECT [DISTINCT] {T1.attrib, …, T2.attrib}    (select-list)
    FROM {relation} T1, {relation} T2, …           (from-list)
    WHERE {predicates}                             (qualification)
Let’s do some examples, which will leverage your knowledge of the relational calculus…
• Faculty ids
• Course IDs for courses with students expecting a “C”
• Courses taken by Jill

17 Our Example Data Instance

STUDENT
sid  name
1    Jill
2    Qun
3    Nitin

PROFESSOR
fid  name
1    Ives
2    Saul
8    Martin

Takes
sid  exp-grade  cid
1    A          550-0105
1    A          700-1005
3    C          501-0105

COURSE
cid       subj  sem
550-0105  DB    F05
700-1005  AI    S05
501-0105  Arch  F05

Teaches
fid  cid
1    550-0105
2    700-1005
8    501-0105
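
The three warm-up queries from the Basic SQL slide (faculty ids, course IDs with a student expecting a “C”, courses taken by Jill) can be checked against this instance. The sketch below is one possible set of answers, not necessarily the ones given in lecture; it loads the instance into an in-memory SQLite database rather than the course's Oracle server, and renames `exp-grade` to `exp_grade` (an assumption) so the column needs no quoting.

```python
import sqlite3

# In-memory copy of the slide's instance; exp_grade stands in for exp-grade.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT   (sid INTEGER, name TEXT);
CREATE TABLE PROFESSOR (fid INTEGER, name TEXT);
CREATE TABLE Takes     (sid INTEGER, exp_grade TEXT, cid TEXT);
CREATE TABLE COURSE    (cid TEXT, subj TEXT, sem TEXT);
CREATE TABLE Teaches   (fid INTEGER, cid TEXT);
INSERT INTO STUDENT   VALUES (1,'Jill'), (2,'Qun'), (3,'Nitin');
INSERT INTO PROFESSOR VALUES (1,'Ives'), (2,'Saul'), (8,'Martin');
INSERT INTO Takes     VALUES (1,'A','550-0105'), (1,'A','700-1005'), (3,'C','501-0105');
INSERT INTO COURSE    VALUES ('550-0105','DB','F05'), ('700-1005','AI','S05'),
                             ('501-0105','Arch','F05');
INSERT INTO Teaches   VALUES (1,'550-0105'), (2,'700-1005'), (8,'501-0105');
""")

def rows(sql):
    return conn.execute(sql).fetchall()

# Faculty ids
faculty_ids = rows("SELECT P.fid FROM PROFESSOR P ORDER BY P.fid")

# Course IDs for courses with students expecting a "C"
c_courses = rows("SELECT DISTINCT T.cid FROM Takes T WHERE T.exp_grade = 'C'")

# Courses taken by Jill: one tuple variable per relation, joined in WHERE
jill_courses = rows("""
    SELECT C.cid, C.subj
    FROM STUDENT S, Takes T, COURSE C
    WHERE S.name = 'Jill' AND S.sid = T.sid AND T.cid = C.cid
    ORDER BY C.cid
""")

print(faculty_ids, c_courses, jill_courses)
```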

18 Some Nice Features
• SELECT *
  - All STUDENTs
• AS
  - As a “range variable” (tuple variable): optional
  - As an attribute rename operator
• Example:
  - Which students (names) have taken more than one course from the same professor?
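
A sketch of these features on the example instance, using an in-memory SQLite database as a stand-in for Oracle. On the slide's data no student has two courses from one professor, so the last step adds a hypothetical `Teaches` row (Ives also teaching 700-1005) purely to make the self-join return something.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (sid INTEGER, name TEXT);
CREATE TABLE Takes   (sid INTEGER, exp_grade TEXT, cid TEXT);
CREATE TABLE Teaches (fid INTEGER, cid TEXT);
INSERT INTO STUDENT VALUES (1,'Jill'), (2,'Qun'), (3,'Nitin');
INSERT INTO Takes   VALUES (1,'A','550-0105'), (1,'A','700-1005'), (3,'C','501-0105');
INSERT INTO Teaches VALUES (1,'550-0105'), (2,'700-1005'), (8,'501-0105');
""")

# SELECT *: every attribute of every STUDENT
all_students = conn.execute("SELECT * FROM STUDENT ORDER BY sid").fetchall()

# AS used twice: S is a range variable, student_name renames the attribute
cur = conn.execute("SELECT S.name AS student_name FROM STUDENT AS S")
renamed_column = cur.description[0][0]

# Two range variables over Takes and two over Teaches express the self-join
repeat_query = """
    SELECT DISTINCT S.name
    FROM STUDENT S, Takes T1, Teaches E1, Takes T2, Teaches E2
    WHERE S.sid = T1.sid AND T1.cid = E1.cid
      AND S.sid = T2.sid AND T2.cid = E2.cid
      AND E1.fid = E2.fid AND T1.cid <> T2.cid
"""
on_slide_data = conn.execute(repeat_query).fetchall()

# Hypothetical extra row, not on the slide: Ives (fid 1) also teaches 700-1005
conn.execute("INSERT INTO Teaches VALUES (1, '700-1005')")
with_extra_row = conn.execute(repeat_query).fetchall()

print(all_students, renamed_column, on_slide_data, with_extra_row)
```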

19 Expressions in SQL
• Can do computation over scalars (int, real or string) in the select-list or the qualification
  - Show all student IDs decremented by 1
• Strings:
  - Fixed (CHAR(x)) or variable length (VARCHAR(x))
  - Use single quotes: 'A string'
  - Special comparison operator: LIKE
  - Not equal: <>
• Typecasting:
  - CAST(S.sid AS VARCHAR(255))
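
Each of these expression forms can be tried on the STUDENT table. A minimal sketch, assuming SQLite as a stand-in for Oracle; the `'J%'` pattern and the comparison against `'Jill'` are illustrative choices, not from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (sid INTEGER, name TEXT);
INSERT INTO STUDENT VALUES (1,'Jill'), (2,'Qun'), (3,'Nitin');
""")

def rows(sql):
    return conn.execute(sql).fetchall()

# Scalar computation in the select-list: student IDs decremented by 1
decremented = rows("SELECT S.sid - 1 FROM STUDENT S ORDER BY S.sid")

# LIKE with % as the wildcard: names starting with J
j_names = rows("SELECT S.name FROM STUDENT S WHERE S.name LIKE 'J%'")

# <> is the not-equal operator
not_jill = rows("SELECT S.name FROM STUDENT S WHERE S.name <> 'Jill' ORDER BY S.sid")

# Typecasting an integer id to a string
as_text = rows("SELECT CAST(S.sid AS VARCHAR(255)) FROM STUDENT S ORDER BY S.sid")

print(decremented, j_names, not_jill, as_text)
```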

20 Set Operations
• Set operations default to set semantics, not bag semantics:
    (SELECT … FROM … WHERE …) {op} (SELECT … FROM … WHERE …)
• Where op is one of:
  - UNION
  - INTERSECT, MINUS/EXCEPT (many DBs don’t support these last ones!)
• Bag semantics: ALL
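
The difference between set and bag semantics shows up as soon as the two inputs share values. A sketch on the example instance, assuming SQLite, which spells set difference as EXCEPT (Oracle uses MINUS):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (sid INTEGER, name TEXT);
CREATE TABLE Takes   (sid INTEGER, exp_grade TEXT, cid TEXT);
INSERT INTO STUDENT VALUES (1,'Jill'), (2,'Qun'), (3,'Nitin');
INSERT INTO Takes   VALUES (1,'A','550-0105'), (1,'A','700-1005'), (3,'C','501-0105');
""")

def sids(op):
    # Combine the sid columns of Takes and STUDENT with the given operator
    sql = f"SELECT sid FROM STUDENT {op} SELECT sid FROM Takes ORDER BY sid"
    return [sid for (sid,) in conn.execute(sql)]

union_set = sids("UNION")          # duplicates removed
union_bag = sids("UNION ALL")      # duplicates kept
in_both = sids("INTERSECT")        # students who take something
only_student = sids("EXCEPT")      # students who take nothing

print(union_set, union_bag, in_both, only_student)
```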

21 Exercise
• Find all students who have taken DB but not AI
  - Hint: use EXCEPT
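
One possible answer to the exercise, following the EXCEPT hint. On the example instance the result is empty, because Jill (the only DB student) has also taken AI, so the sketch then adds a hypothetical enrollment for Qun to show a non-empty answer. SQLite stands in for Oracle, where the operator would be MINUS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (sid INTEGER, name TEXT);
CREATE TABLE Takes   (sid INTEGER, exp_grade TEXT, cid TEXT);
CREATE TABLE COURSE  (cid TEXT, subj TEXT, sem TEXT);
INSERT INTO STUDENT VALUES (1,'Jill'), (2,'Qun'), (3,'Nitin');
INSERT INTO Takes   VALUES (1,'A','550-0105'), (1,'A','700-1005'), (3,'C','501-0105');
INSERT INTO COURSE  VALUES ('550-0105','DB','F05'), ('700-1005','AI','S05'),
                           ('501-0105','Arch','F05');
""")

# (students who took DB) EXCEPT (students who took AI)
db_not_ai = """
    SELECT S.sid, S.name FROM STUDENT S, Takes T, COURSE C
    WHERE S.sid = T.sid AND T.cid = C.cid AND C.subj = 'DB'
    EXCEPT
    SELECT S.sid, S.name FROM STUDENT S, Takes T, COURSE C
    WHERE S.sid = T.sid AND T.cid = C.cid AND C.subj = 'AI'
"""
on_slide_data = conn.execute(db_not_ai).fetchall()

# Hypothetical extra row, not on the slide: Qun takes the DB course
conn.execute("INSERT INTO Takes VALUES (2, 'B', '550-0105')")
with_extra_row = conn.execute(db_not_ai).fetchall()

print(on_slide_data, with_extra_row)
```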

22 Nested Queries in SQL
• Simplest: IN/NOT IN
• Example: Students who have taken subjects that have (at any point) been taught by Martin
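
A sketch of the Martin example with an uncorrelated IN subquery: the inner query collects every subject Martin has taught, and the outer query keeps students with a course in one of those subjects. This is one possible formulation, run on SQLite rather than Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT   (sid INTEGER, name TEXT);
CREATE TABLE PROFESSOR (fid INTEGER, name TEXT);
CREATE TABLE Takes     (sid INTEGER, exp_grade TEXT, cid TEXT);
CREATE TABLE COURSE    (cid TEXT, subj TEXT, sem TEXT);
CREATE TABLE Teaches   (fid INTEGER, cid TEXT);
INSERT INTO STUDENT   VALUES (1,'Jill'), (2,'Qun'), (3,'Nitin');
INSERT INTO PROFESSOR VALUES (1,'Ives'), (2,'Saul'), (8,'Martin');
INSERT INTO Takes     VALUES (1,'A','550-0105'), (1,'A','700-1005'), (3,'C','501-0105');
INSERT INTO COURSE    VALUES ('550-0105','DB','F05'), ('700-1005','AI','S05'),
                             ('501-0105','Arch','F05');
INSERT INTO Teaches   VALUES (1,'550-0105'), (2,'700-1005'), (8,'501-0105');
""")

martin_subject_students = conn.execute("""
    SELECT DISTINCT S.name
    FROM STUDENT S, Takes T, COURSE C
    WHERE S.sid = T.sid AND T.cid = C.cid
      AND C.subj IN (SELECT C2.subj
                     FROM COURSE C2, Teaches E, PROFESSOR P
                     WHERE C2.cid = E.cid AND E.fid = P.fid
                       AND P.name = 'Martin')
""").fetchall()

print(martin_subject_students)
```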

23 Correlated Subqueries
• Most common: EXISTS/NOT EXISTS
• Find all students who have taken DB but not AI
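
The same question can be phrased with correlated subqueries, where each inner query refers to the outer tuple variable S. A sketch on SQLite; the slide's instance gives an empty answer (Jill has taken both), so a hypothetical DB enrollment for Qun is added afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (sid INTEGER, name TEXT);
CREATE TABLE Takes   (sid INTEGER, exp_grade TEXT, cid TEXT);
CREATE TABLE COURSE  (cid TEXT, subj TEXT, sem TEXT);
INSERT INTO STUDENT VALUES (1,'Jill'), (2,'Qun'), (3,'Nitin');
INSERT INTO Takes   VALUES (1,'A','550-0105'), (1,'A','700-1005'), (3,'C','501-0105');
INSERT INTO COURSE  VALUES ('550-0105','DB','F05'), ('700-1005','AI','S05'),
                           ('501-0105','Arch','F05');
""")

# Both subqueries are re-evaluated per student because they mention S.sid
db_not_ai = """
    SELECT S.name
    FROM STUDENT S
    WHERE EXISTS (SELECT * FROM Takes T, COURSE C
                  WHERE T.sid = S.sid AND T.cid = C.cid AND C.subj = 'DB')
      AND NOT EXISTS (SELECT * FROM Takes T, COURSE C
                      WHERE T.sid = S.sid AND T.cid = C.cid AND C.subj = 'AI')
"""
on_slide_data = conn.execute(db_not_ai).fetchall()

# Hypothetical extra row, not on the slide: Qun takes the DB course
conn.execute("INSERT INTO Takes VALUES (2, 'B', '550-0105')")
with_extra_row = conn.execute(db_not_ai).fetchall()

print(on_slide_data, with_extra_row)
```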

24 Universal and Existential Quantification
• Generally used with subqueries:
  - {op} ANY, {op} ALL
• Find the students with the best expected grades

25 Table Expressions
• Can substitute a subquery for any relation in the FROM clause:
    SELECT S.sid
    FROM (SELECT sid FROM STUDENT WHERE sid = 5) S
    WHERE S.sid = 4
Notice that we can actually simplify this query! What is this equivalent to?
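
One way to check a proposed simplification is to run both forms side by side. The sketch below compares the nested query with a flattened version whose two conditions are simply ANDed together; the students with ids 4 and 5 are hypothetical additions so that neither condition is vacuous on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (sid INTEGER, name TEXT);
INSERT INTO STUDENT VALUES (1,'Jill'), (2,'Qun'), (3,'Nitin');
-- Hypothetical rows, not on the slide
INSERT INTO STUDENT VALUES (4,'Ann'), (5,'Bo');
""")

nested = conn.execute("""
    SELECT S.sid
    FROM (SELECT sid FROM STUDENT WHERE sid = 5) S
    WHERE S.sid = 4
""").fetchall()

# Flattened form: the inner selection merges into the outer WHERE clause
flat = conn.execute("SELECT sid FROM STUDENT WHERE sid = 5 AND sid = 4").fetchall()

print(nested, flat)
```

Since no sid can equal both 4 and 5, both forms return no rows on any instance.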

26 Aggregation
• GROUP BY
    SELECT {group-attribs}, {aggregate-operator}(attrib)
    FROM {relation} T1, {relation} T2, …
    WHERE {predicates}
    GROUP BY {group-list}
• Aggregate operators
  - AVG, COUNT, SUM, MAX, MIN
  - DISTINCT keyword for AVG, COUNT, SUM

27 Some Examples
• Number of students in each course offering
• Number of different grades expected for each course offering
• Number of (distinct) students taking AI courses
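
Possible answers to the three examples, run on the example instance in SQLite (with `exp_grade` standing in for `exp-grade`). Note that grouping over Takes alone only lists offerings with at least one student.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Takes  (sid INTEGER, exp_grade TEXT, cid TEXT);
CREATE TABLE COURSE (cid TEXT, subj TEXT, sem TEXT);
INSERT INTO Takes  VALUES (1,'A','550-0105'), (1,'A','700-1005'), (3,'C','501-0105');
INSERT INTO COURSE VALUES ('550-0105','DB','F05'), ('700-1005','AI','S05'),
                          ('501-0105','Arch','F05');
""")

def rows(sql):
    return conn.execute(sql).fetchall()

# Number of students in each course offering
students_per_offering = rows(
    "SELECT T.cid, COUNT(T.sid) FROM Takes T GROUP BY T.cid ORDER BY T.cid")

# Number of different grades expected for each course offering
grades_per_offering = rows(
    "SELECT T.cid, COUNT(DISTINCT T.exp_grade) FROM Takes T GROUP BY T.cid ORDER BY T.cid")

# Number of (distinct) students taking AI courses: no GROUP BY, one result row
ai_students = rows("""
    SELECT COUNT(DISTINCT T.sid)
    FROM Takes T, COURSE C
    WHERE T.cid = C.cid AND C.subj = 'AI'
""")

print(students_per_offering, grades_per_offering, ai_students)
```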

28 What If You Want to Only Show Some Groups?
• The HAVING clause lets you do a selection based on an aggregate (there must be 1 value per group):
    SELECT C.subj, COUNT(S.sid)
    FROM STUDENT S, Takes T, COURSE C
    WHERE S.sid = T.sid AND T.cid = C.cid
    GROUP BY subj
    HAVING COUNT(S.sid) > 5
• Exercise: For each subject taught by at least two professors, list the minimum expected grade
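
A sketch of one way to approach the exercise. It reads “minimum expected grade” as MIN over the letter grades as strings (so 'A' sorts lowest), which is an assumption about the intended meaning. No subject on the slide's instance has two professors, so a hypothetical `Teaches` row is added to produce a qualifying group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Takes   (sid INTEGER, exp_grade TEXT, cid TEXT);
CREATE TABLE COURSE  (cid TEXT, subj TEXT, sem TEXT);
CREATE TABLE Teaches (fid INTEGER, cid TEXT);
INSERT INTO Takes   VALUES (1,'A','550-0105'), (1,'A','700-1005'), (3,'C','501-0105');
INSERT INTO COURSE  VALUES ('550-0105','DB','F05'), ('700-1005','AI','S05'),
                           ('501-0105','Arch','F05');
INSERT INTO Teaches VALUES (1,'550-0105'), (2,'700-1005'), (8,'501-0105');
""")

# HAVING filters whole groups; COUNT(DISTINCT ...) avoids counting a
# professor once per joined Takes row.
query = """
    SELECT C.subj, MIN(T.exp_grade)
    FROM COURSE C, Takes T, Teaches E
    WHERE C.cid = T.cid AND C.cid = E.cid
    GROUP BY C.subj
    HAVING COUNT(DISTINCT E.fid) >= 2
"""
on_slide_data = conn.execute(query).fetchall()

# Hypothetical extra row, not on the slide: Saul (fid 2) also teaches 550-0105
conn.execute("INSERT INTO Teaches VALUES (2, '550-0105')")
with_extra_row = conn.execute(query).fetchall()

print(on_slide_data, with_extra_row)
```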

29 Aggregation and Table Expressions
• Sometimes need to compute results over the results of a previous aggregation:
    SELECT subj, AVG(size)
    FROM ( SELECT C.cid AS id, C.subj AS subj, COUNT(S.sid) AS size
           FROM STUDENT S, Takes T, COURSE C
           WHERE S.sid = T.sid AND T.cid = C.cid
           GROUP BY C.cid, C.subj)
    GROUP BY subj
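
The two-level aggregation can be run as written once the inner GROUP BY names `C.cid` explicitly (both Takes and COURSE have a `cid`). In the sketch below, a second DB offering with two students is a hypothetical addition so that the average class size differs across subjects; SQLite stands in for Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (sid INTEGER, name TEXT);
CREATE TABLE Takes   (sid INTEGER, exp_grade TEXT, cid TEXT);
CREATE TABLE COURSE  (cid TEXT, subj TEXT, sem TEXT);
INSERT INTO STUDENT VALUES (1,'Jill'), (2,'Qun'), (3,'Nitin');
INSERT INTO Takes   VALUES (1,'A','550-0105'), (1,'A','700-1005'), (3,'C','501-0105');
INSERT INTO COURSE  VALUES ('550-0105','DB','F05'), ('700-1005','AI','S05'),
                           ('501-0105','Arch','F05');
-- Hypothetical rows, not on the slide: a second DB offering with two students
INSERT INTO COURSE VALUES ('550-0106','DB','S06');
INSERT INTO Takes  VALUES (2,'B','550-0106'), (3,'B','550-0106');
""")

# Inner query: size of each offering. Outer query: average size per subject.
avg_size_by_subject = conn.execute("""
    SELECT subj, AVG(size)
    FROM (SELECT C.cid AS id, C.subj AS subj, COUNT(S.sid) AS size
          FROM STUDENT S, Takes T, COURSE C
          WHERE S.sid = T.sid AND T.cid = C.cid
          GROUP BY C.cid, C.subj)
    GROUP BY subj
    ORDER BY subj
""").fetchall()

print(avg_size_by_subject)
```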

30 Something to Ponder
• Tables are great, but…
  - Not everyone is uniform: I may have a cell phone but not a fax
  - We may simply be missing certain information
  - We may be unsure about values
• How do we handle these things?

