Presentation is loading. Please wait.

Presentation is loading. Please wait.

Chapter 5.  Data Manipulation Language (DML): subset of SQL which allows users to create queries and to insert, delete and modify rows.  Data Definition.

Similar presentations


Presentation on theme: "Chapter 5.  Data Manipulation Language (DML): subset of SQL which allows users to create queries and to insert, delete and modify rows.  Data Definition."— Presentation transcript:

1 Chapter 5

2  Data Manipulation Language (DML): subset of SQL which allows users to create queries and to insert, delete and modify rows.  Data Definition Language (DDL): subset of SQL which supports the creation, deletion and modifications for table and views.  Triggers and Advanced Integrity Constraints: actions that executed by the DBMS whenever changes to the database meet conditions specified in the trigger.  Embedded and Dynamic SQL: allow SQL code to be called from a host language such as C or COBOL.  Client-Server Execution and Remote Database Access: controls how client application program can connect to an SQL database server, or access data from a database over a network.  Transaction management: commands allow a user to explicitly control aspects of how a transaction is to be executed.  Security: Provide mechanisms to control user’s access to data objects such as tables and views.  Advanced features: such as object-oriented features, recursive queries, decision support queries, data mining, spatial data, text and XML data management

3  Select [DISTINCT] select-list  From from-list  Where qualification Select should specify columns to be retained in the results From should specify a cross-product of tables Where specify selection conditions on the tables mentioned in the from clause. Example: Sailor(sid: integer, sname: string, rating: integer, age: real) Boats(bid: integer, bname: string, color: string) Reservation(sid: integer, bid: integer, day:date)

4  We will use these instances of the Sailors and Reserves relations in our examples.  If the key for the Reserves relation contained only the attributes sid and bid, how would the semantics differ? R1 S1 S2

5  relation-list A list of relation names (possibly with a range-variable after each name).  target-list A list of attributes of relations in relation-list  qualification Comparisons (Attr op const or Attr1 op Attr2, where op is one of ) combined using AND, OR and NOT.  DISTINCT is an optional keyword indicating that the answer should not contain duplicates. Default is that duplicates are not eliminated! SELECT [DISTINCT] target-list FROM relation-list WHERE qualification

6  Find the names and ages of all sailors  Select DISTINCT S.sname, S.age  From Sailor.S  The result without using DISTINCT is: SELECT [DISTINCT] target-list FROM relation-list WHERE qualification

7  Semantics of an SQL query defined in terms of the following conceptual evaluation strategy: ◦ Compute the cross-product of relation-list. ◦ Discard resulting tuples if they fail qualifications. ◦ Delete attributes that are not in target-list. ◦ If DISTINCT is specified, eliminate duplicate rows.  This strategy is probably the least efficient way to compute a query! An optimizer will find more efficient strategies to compute the same answers.

8 Instance R3 of Reserve Instance S4 of Sailors sidbidday 2210110/10/96 5810311/12/96 sidsnameratingage 22dustin745.0 31lubber855.5 58rusty1035.0 Find the names of sailors who have reserved boat number 103. SELECT S.sname FROM Sailors S, Reserves R WHERE S.sid=R.sid AND R.bid=103 The first step is to construct S4xR3 which shows the following results:

9 The second step is to apply the qualification S.sid R.sid AND R.bid=103. This step eliminates all but the last row from the instance shown in the following figure. The third step is to eliminate unwanted columns, so only sname appears in the select clause sname rusty

10  Really needed only if the same attribute (sid) appears twice in the FROM clause. The previous query can also be written as: SELECT S.sname FROM Sailors S, Reserves R WHERE S.sid=R.sid AND bid=103 SELECT sname FROM Sailors, Reserves WHERE Sailors.sid=Reserves.sid AND bid=103 It is good style, however, to use range variables always! OR

11  The join of Sailors and Reserves ensures that for each selected sname, the sailor has made some reservation. SELECT S.sname FROM Sailors S, Reserves R WHERE S.sid=R.sid

12  This query contains a join of two tables, followed by a selection on the color of boats. SELECT R.sid FROM Boats B, Reserves R WHERE B.bid = R.bid AND B.color = ‘red’

13  This query contains a join of three tables, followed by a selection on the color of boats.  The join with Sailors allow us to find the name of the sailor who according to reserves tuple R, has reserved a red boat scribed by tuple B. SELECT S.sname FROM Sailors S, Reserves R, Boats B WHERE S.sid=R.sid AND B.bid = R.bid AND B.color = ‘red’;

14  Illustrates use of arithmetic expressions and string pattern matching: Find triples (of ages of sailors and two fields defined by expressions) for sailors whose names begin and end with B and contain at least three characters.  AS and = are two ways to name fields in result.  LIKE is used for string matching. `_’ stands for exactly one character and `%’ stands for 0 or more arbitrary characters.  Thus, ‘-AB%’ denote a pattern marching every string that contains at least three characters with the second and third characters being A and B respectively.  ‘B-%B’ names begins and ends with B and has at least three characters. SELECT S.age, age1 = S.age-5, 2*S.age AS age2 FROM Sailors S WHERE S.sname LIKE ‘B_%B’

15  The query is easily expressed using the OR connective in the WHERE. However, the following query which is identical except for the use of ‘and’ rather ‘or’ turns out to be much more difficult:  Find the names of sailors who’ve reserved a red and a green boat  Why?  If we replace the use of OR in the previous query by AND, we would retrieve the names of sailors who have reserved a boat that is both red and green.  The integrity constraint that bid is a key for Boats tells us that the same boat cannot have two colors.  So what will be the result of this query?  Empty SELECT S.sname FROM Sailors S, Boats B, Reserves R WHERE S.sid=R.sid AND R.bid=B.bid AND (B.color=‘red’ OR B.color=‘green’)

16  We can think of R1 and B1 as rows that prove that sailor S.sid has reserved a red boat. R2 and B2 similarly prove that the same sailor has reserved a green boat.  S.sname is not included in the results unless five such rows S, R1, B1, R2, and B2 are found. But, the previous query is difficult to understand, thus, a better solution foe these two queries is to use UNION and INTERSECT. A correct statement of the above query using AND will be as follow: SELECT S.sname FROM Sailors S, Reserves R1, Boats B1, Reserves R2, Boats B2, WHERE S.sid=R1.sid AND R1.bid=B1.bid AND S.sid=R2.sid AND R2.bid=B2.bid AND (B1.color=‘red’ AND B2.color=‘green’)

17  UNION : Can be used to compute the union of any two union- compatible sets of tuples (which are themselves the result of SQL queries).  Also available: EXCEPT (What do we get if we replace UNION by EXCEPT ?) SELECT S.sname FROM Sailors S, Boats B, Reserves R WHERE S.sid=R.sid AND R.bid=B.bid AND (B.color=‘red’ OR B.color=‘green’) SELECT S.sname FROM Sailors S, Boats B, Reserves R WHERE S.sid=R.sid AND R.bid=B.bid AND B.color=‘red’ UNION SELECT S2.sname FROM Sailors S, Boats B2, Reserves R2 WHERE S2.sid=R2.sid AND R2.bid=B2.bid AND B2.color=‘green’

18  INTERSECT : Can be used to compute the intersection of any two union-compatible sets of tuples.  Included in the SQL/92 standard, but some systems don’t support it.  Now try the same query but for sid not sname. SELECT S.sname FROM Sailors S, Boats B1, Reserves R1, Boats B2, Reserves R2 WHERE S.sid=R1.sid AND R1.bid=B1.bid AND S.sid=R2.sid AND R2.bid=B2.bid AND (B1.color=‘red’ AND B2.color=‘green’) SELECT S.sname FROM Sailors S, Boats B, Reserves R WHERE S.sid=R.sid AND R.bid=B.bid AND B.color=‘red’ INTERSECT SELECT S2.sname FROM Sailors S2, Boats B2, Reserves R2 WHERE S2.sid=R2.sid AND R2.bid=B2.bid AND B2.color=‘green’

19  Sailors 22, 64 and 31 have reserved red boats. Sailors 22, 74 and 31 have reserved green boats. Hence the answer contains just the 64. But do we need all relations in this query???? So do it. SELECT S.sid FROM Boats B, Reserves R WHERE R.bid=B.bid AND B.color=‘red’ EXCEPT SELECT S2.sid FROM Boats B2, Reserves R2 WHERE R2.bid=B2.bid AND B2.color=‘green’ SELECT S.sid FROM Sailors S, Boats B, Reserves R WHERE S.sid=R.sid AND R.bid=B.bid AND B.color=‘red’ EXCEPT SELECT S2.sid FROM Sailors S2, Boats B2, Reserves R2 WHERE S2.sid=R2.sid AND R2.bid=B2.bid AND B2.color=‘green’

20  The first part of the union returns the sids 58 and 71. The second part returns 22 and 31. The answer is, therefore, the set of sids 22, 31, 58 and 71 SELECT S.sid FROM Sailors S, WHERE S.rating = 10 UNION SELECT R.sid FROM Reserves R WHERE R.bid = 104;

21  Is a query that has another query embedded within it; the embedded query is called subquery.  A very powerful feature of SQL: a WHERE clause can itself contain an SQL query! (Actually, so can FROM and HAVING clauses.)  To understand semantics of nested queries, think of a nested loops evaluation: For each Sailors tuple, check the qualification by computing the subquery.

22  The nested subquery computes the (multi) set of sids for sailors who have reserved boat 103 (22,31 and 74 on instances R2 and S3) and the top query retrieve the names of sailors whose sid is in this set.  To find sailors who’ve not reserved #103, use NOT IN.  The best way to understand a nested query is to think of it in terms of conceptual strategy. In our example, the strategy consists of examining rows of Sailors and for each row, evaluating the subquery over Reserves.  Construct the cross-product of the tables in the FROM clause of the top level query as before. For each row in the cross product, recompute the subquery. SELECT S.sname FROM Sailors S WHERE S.sid IN ( SELECT R.sid FROM Reserves R WHERE R.bid=103) Find names of sailors who’ve reserved boat #103:

23  The innermost subquery find the set of bids of red boats (102 and 104). The subquery one level above finds the set of sids of sailors who have reserved one of these boats (22,31 and 64).  The top level query finds the names of sailors whose sid is in this set of sids. So we get Dustin, Lubber and Horatio.  Q: SELECT S.sname FROM Sailors S WHERE S.sid IN (SELECT R.sid FROM Reserves R WHERE R.bid IN (Select B.bid FROM Boats B WHERE B.color = ‘red’) Find names of sailors who’ve reserved a red boat:  The subquery might itself contain another nested subquery, in which case we apply the same idea one more time, leading to an evaluation strategy with several levels of nested loop.

24  EXISTS is another set comparison operator, like IN. It test whether a set is nonempty.  Thus for each Sailor row S we test whether the set of Reserves rows R such that R.bid =103 AND S.sid =R.sid is non empty. If so, Sailor S has reserved boat 103, and we retrieve the name.  The occurrence of S in the subquery is called correlation and such queries are called correlated queries. SELECT S.sname FROM Sailors S WHERE EXISTS ( SELECT * FROM Reserves R WHERE R.bid=103 AND S.sid=R.sid) Find names of sailors who’ve reserved boat #103:  In the nested query, the inner subquery has been completely independent of the outer query. The inner sub query could depend on the rows currently being examined in the outer query.

25  We’ve already seen IN, EXISTS and UNIQUE. Can also use NOT IN, NOT EXISTS and NOT UNIQUE.  Also available: op ANY, op ALL, where op is one of the arithmetic comparison operator  Find sailors whose rating is greater than that of some sailor called Horatio:  What this query will compute ?  On instance Sailors, this computes the sids 31, 32, 58, 71 and 74.  What if the query was: Find sailors whose rating is better than every sailor called Horatio?  Just replace ANY with ALL. What is the result?  Sids 58 and 71. SELECT * FROM Sailors S WHERE S.rating > ANY (SELECT S2.rating FROM Sailors S2 WHERE S2.sname=‘Horatio’)

26  Find the sailors with the highest rating  The outer WHERE condition is satisfied only when S.rating is greater than or equal to each of these rating values, that is when it is the largest rating value.  In the instance S3, the condition is satisfied only for rating 10. and the answer is sids: 58 and 71 SELECT S.sid FROM Sailors S WHERE S.rating >= ALL ( SELECT S2.rating FROM Sailors S2)

27  Find the names of sailors who have reserved both a red and a green boats.  The query can be understood as follow: Find all sailors who have reserved a red boat and further have sids that are included in the set of sids of Sailors who have reserved a green boats. This formulation of the query illustrates how queries involving INTERSECT can be rewritten using IN  Queries using EXCEPT can be similarly rewritten by using NOT IN.  Find the sids of Sailors who have reserved red boats but not green? SELECT S.sname FROM Sailors S, Reserves R, Boats B WHERE S.sid=R.sid AND R.bid=B.bid AND B.color=‘red’ AND S.sid IN (SELECT S2.sid FROM Sailors S2, Boats B2, Reserves R2 WHERE S2.sid=R2.sid AND R2.bid=B2.bid AND B2.color=‘green’

28  It can be used to perform some computation and summarization. COUNT (*) COUNT ( [ DISTINCT ] A) SUM ( [ DISTINCT ] A) AVG ( [ DISTINCT ] A) MAX (A) MIN (A) Find the average age of sailors with a rating of 10. SELECT AVG (S.age) FROM Sailors S WHERE S.rating=10 Count the number of Sailors: SELECT COUNT (*) FROM Sailors S; What is the answer? 10 Find the name and age of oldest Sailor SELECT S.sname, MAX (S.age) FROM Sailors S This query is illegal, we must use GROUP BY Find the name and age of oldest Sailor SELECT S.sname, S.age FROM Sailors S WHERE S.age=( SELECT MAX (S2.age) FROM Sailors S2) single column

29  The first query is illegal! (We’ll look into the reason a bit later, when we discuss GROUP BY.)  The third query is equivalent to the second query, and is allowed in the SQL/92 standard, but is not supported in some systems. SELECT S.sname, MAX (S.age) FROM Sailors S SELECT S.sname, S.age FROM Sailors S WHERE S.age = ( SELECT MAX (S2.age) FROM Sailors S2) SELECT S.sname, S.age FROM Sailors S WHERE ( SELECT MAX (S2.age) FROM Sailors S2) = S.age

30  So far, we’ve applied aggregate operators to all (qualifying) tuples. Sometimes, we want to apply them to each of several groups of tuples.  Consider: Find the age of the youngest sailor for each rating level. ◦ In general, we don’t know how many rating levels exist, and what the rating values for these levels are! ◦ Suppose we know that rating values go from 1 to 10; we can write 10 queries that look like this (!): SELECT MIN (S.age) FROM Sailors S WHERE S.rating = i For i = 1, 2,..., 10:

31  The target-list contains (i) attribute names (ii) terms with aggregate operations (e.g., MIN (S.age)). ◦ The attribute list (i) must be a subset of grouping-list. Intuitively, each answer tuple corresponds to a group, and these attributes must have a single value per group. (A group is a set of tuples that have the same value for all attributes in grouping-list.) SELECT [DISTINCT] target-list FROM relation-list WHERE qualification GROUP BY grouping-list HAVING group-qualification

32  Find the age of the youngest sailor for each rating level: Select S.rating, Min (S.age) From Sailor S Group By S.rating

33  The cross-product of relation-list is computed  Tuples that fail qualification are discarded, `unnecessary’ fields are deleted, and  The remaining tuples are partitioned into groups by the value of attributes in grouping-list.  The group-qualification is then applied to eliminate some groups. Expressions in group- qualification must have a single value per group! ◦ In effect, an attribute in group-qualification that is not an argument of an aggregate op also appears in grouping-list. (SQL does not exploit primary key semantics here!)  One answer tuple is generated per qualifying group.

34 SELECT S.rating, MIN (S.age) AS minage FROM Sailors S WHERE S.age >= 18 GROUP BY S.rating HAVING COUNT (*) > 1 Answer relation: Sailors instance:

35

36 HAVING COUNT (*) > 1 AND EVERY (S.age <=60)

37 SELECT S.rating, MIN (S.age) AS minage FROM Sailors S WHERE S.age >= 18 AND S.age <= 60 GROUP BY S.rating HAVING COUNT (*) > 1 Answer relation: Sailors instance:

38  Grouping over a join of three relations.  What do we get if we remove B.color=‘red’ from the WHERE clause and add a HAVING clause with this condition?  What if we drop Sailors and the condition involving S.sid? SELECT B.bid, COUNT (*) AS scount FROM Sailors S, Boats B, Reserves R WHERE S.sid=R.sid AND R.bid=B.bid AND B.color=‘red’ GROUP BY B.bid

39  SQL was an important factor in the early acceptance of the relational model; more natural than earlier, procedural query languages.  Relationally complete; in fact, significantly more expressive power than relational algebra.  Even queries that can be expressed in RA can often be expressed more naturally in SQL.  Many alternative ways to write a query; optimizer should look for most efficient evaluation plan. ◦ In practice, users need to be aware of how queries are optimized and evaluated for best results.

40 Consider the following schema: Flights(flno: integer, from: string, to: string, distance: integer, departs: time, arrives: time, price: real) Aircraft(aid: integer, aname: string, cruisingrange: integer) Certified(eid: integer, aid: integer) Employees(eid: integer, ename: string, salary: integer) Note that the Employees relation describes pilots and other kinds of employees as well; every pilot is certified for some aircraft, and only pilots are certified to fly. Write each of the following queries in SQ: 1. Find the names of aircraft such that all pilots certified to operate them have salaries more than $80,000. 2. For each pilot who is certified for more than three aircraft, find the eid and the maximum cruisingrange of the aircraft for which she or he is certified. 3. Find the names of pilots whose salary is less than the price of the cheapest route from Los Angeles to Honolulu. 4. For all aircraft with cruisingrange over 1000 miles, find the name of the aircraft and the average salary of all pilots certified for this aircraft. 6. Find the aids of all aircraft that can be used on routes from Los Angeles to Chicago

41 sid snameratingage 18jones330.0 41jonah656.0 22ahab744.0 63mobynull15.0 An Instance of Sailors Write SQL queries to compute the average rating, using AVG; the sum of the ratings, using SUM; and the number of ratings, using COUNT.


Download ppt "Chapter 5.  Data Manipulation Language (DML): subset of SQL which allows users to create queries and to insert, delete and modify rows.  Data Definition."

Similar presentations


Ads by Google