The Structured Query Language Zachary G. Ives / Nicholas Taylor University of Pennsylvania CIS 550 – Database & Information Systems September 26, 2007.


1 The Structured Query Language

Zachary G. Ives / Nicholas Taylor
University of Pennsylvania
CIS 550 – Database & Information Systems
September 26, 2007
Some slide content courtesy of Susan Davidson & Raghu Ramakrishnan

2 Administrivia

- Homework 2 handed out today
  - Due 10/8

3 Recall Basic SQL

    SELECT [DISTINCT] {T1.attrib, …, T2.attrib}    <- select-list
    FROM {relation} T1, {relation} T2, …           <- from-list
    WHERE {predicates}                             <- qualification

- SELECT *
  - All STUDENTs
- AS
  - As a “range variable” (tuple variable): optional
  - As an attribute rename operator

4 Our Example Data Instance

    STUDENT
    sid  name
    1    Jill
    2    Qun
    3    Nitin

    PROFESSOR
    fid  name
    1    Ives
    2    Saul
    8    Martin

    Takes
    sid  exp-grade  cid
    1    A          550-0105
    1    A          700-1005
    3    C          501-0105

    COURSE
    cid       subj  sem
    550-0105  DB    F05
    700-1005  AI    S05
    501-0105  Arch  F05

    Teaches
    fid  cid
    1    550-0105
    2    700-1005
    8    501-0105

5 Some Nice Features

- SELECT *
  - All STUDENTs
- AS
  - As a “range variable” (tuple variable): optional
  - As an attribute rename operator
- Example:
  - Which students (names) have taken more than one course from the same professor?

6 Expressions in SQL

- Can do computation over scalars (int, real or string) in the select-list or the qualification
  - Show all student IDs decremented by 1
- Strings:
  - Fixed (CHAR(x)) or variable length (VARCHAR(x))
  - Use single quotes: 'A string'
  - Special comparison operator: LIKE
  - Not equal: <>
- Typecasting:
  - CAST(S.sid AS VARCHAR(255))
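The expressions on this slide can be tried out with a small sketch. It assumes Python's built-in sqlite3 module as the engine (a choice made only so the example is self-contained, not something the lecture uses) and loads the STUDENT table from slide 4. The `'J%'` pattern and the `'Jack'` comparison value are illustrative.

```python
import sqlite3

# In-memory SQLite database; rows are the STUDENT table from slide 4.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (sid INTEGER, name VARCHAR(20));
INSERT INTO STUDENT VALUES (1, 'Jill'), (2, 'Qun'), (3, 'Nitin');
""")

# Computation over scalars in the select-list: student IDs decremented by 1.
decremented = conn.execute(
    "SELECT S.sid - 1 FROM STUDENT S ORDER BY S.sid").fetchall()

# LIKE (with the % wildcard) and <> (not equal) in the qualification.
matches = conn.execute(
    "SELECT S.name FROM STUDENT S "
    "WHERE S.name LIKE 'J%' AND S.name <> 'Jack'").fetchall()

# Typecasting: the integer sid is returned as a string.
as_text = conn.execute(
    "SELECT CAST(S.sid AS VARCHAR(255)) FROM STUDENT S WHERE S.sid = 3"
).fetchone()[0]

print(decremented)    # [(0,), (1,), (2,)]
print(matches)        # [('Jill',)]
print(repr(as_text))  # '3'
```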

7 Set Operations

- Set operations default to set semantics, not bag semantics:

    (SELECT … FROM … WHERE …) {op} (SELECT … FROM … WHERE …)

- Where op is one of:
  - UNION
  - INTERSECT, MINUS/EXCEPT (many DBs don’t support these last ones!)
- Bag semantics: append ALL to the operator (e.g., UNION ALL)

8 Exercise

- Find all students who have taken DB but not AI
  - Hint: use EXCEPT
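One possible answer to the exercise, sketched with Python's built-in sqlite3 module (an assumption made to keep the example runnable; SQLite supports EXCEPT). The tables come from slide 4, with the column `exp-grade` renamed to `exp_grade` because a hyphen is not valid in an unquoted identifier. On the slide's data Jill is the only DB student and she also took AI, so one clearly labeled hypothetical row is added to give the query something to return.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (sid INTEGER, name TEXT);
CREATE TABLE Takes (sid INTEGER, exp_grade TEXT, cid TEXT);
CREATE TABLE COURSE (cid TEXT, subj TEXT, sem TEXT);
INSERT INTO STUDENT VALUES (1, 'Jill'), (2, 'Qun'), (3, 'Nitin');
INSERT INTO COURSE VALUES ('550-0105', 'DB', 'F05'),
                          ('700-1005', 'AI', 'S05'),
                          ('501-0105', 'Arch', 'F05');
INSERT INTO Takes VALUES (1, 'A', '550-0105'),
                         (1, 'A', '700-1005'),
                         (3, 'C', '501-0105');
-- Hypothetical row, NOT in the slides: Qun takes DB.
INSERT INTO Takes VALUES (2, 'B', '550-0105');
""")

# Students who took DB, minus students who took AI (set semantics).
db_not_ai = conn.execute("""
    SELECT S.sid, S.name
    FROM STUDENT S, Takes T, COURSE C
    WHERE S.sid = T.sid AND T.cid = C.cid AND C.subj = 'DB'
    EXCEPT
    SELECT S.sid, S.name
    FROM STUDENT S, Takes T, COURSE C
    WHERE S.sid = T.sid AND T.cid = C.cid AND C.subj = 'AI'
""").fetchall()

print(db_not_ai)  # [(2, 'Qun')]
```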

11 Revised Example Data Instance

    STUDENT
    sid  name
    1    Jill
    2    Qun
    3    Nitin
    4    Marty

    PROFESSOR
    fid  name
    1    Ives
    2    Saul
    8    Martin

    Takes
    sid  exp-grade  cid
    1    A          550-0105
    1    A          700-1005
    3    A
    3    C          501-0105
    4    C

    COURSE
    cid       subj  sem
    550-0105  DB    F05
    700-1005  AI    S05
    501-0105  Arch  F05
    555-1006  Sys   S06

    Teaches
    fid  cid
    1    550-0105
    2    700-1005
    8    501-0105

12 Nested Queries in SQL

- Simplest: IN/NOT IN
- Example: Students who have taken subjects that have (at any point) been taught by Martin
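A sketch of how the Martin example might be written with nested IN subqueries. It assumes Python's built-in sqlite3 module as the engine and uses the fully legible data instance from slide 4 (column `exp-grade` is renamed `exp_grade`). The inner query finds the subjects Martin has taught, the middle one finds the students who took any offering of those subjects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (sid INTEGER, name TEXT);
CREATE TABLE PROFESSOR (fid INTEGER, name TEXT);
CREATE TABLE Takes (sid INTEGER, exp_grade TEXT, cid TEXT);
CREATE TABLE COURSE (cid TEXT, subj TEXT, sem TEXT);
CREATE TABLE Teaches (fid INTEGER, cid TEXT);
INSERT INTO STUDENT VALUES (1, 'Jill'), (2, 'Qun'), (3, 'Nitin');
INSERT INTO PROFESSOR VALUES (1, 'Ives'), (2, 'Saul'), (8, 'Martin');
INSERT INTO Takes VALUES (1, 'A', '550-0105'),
                         (1, 'A', '700-1005'),
                         (3, 'C', '501-0105');
INSERT INTO COURSE VALUES ('550-0105', 'DB', 'F05'),
                          ('700-1005', 'AI', 'S05'),
                          ('501-0105', 'Arch', 'F05');
INSERT INTO Teaches VALUES (1, '550-0105'), (2, '700-1005'), (8, '501-0105');
""")

taught_by_martin = conn.execute("""
    SELECT S.name
    FROM STUDENT S
    WHERE S.sid IN (
        SELECT T.sid
        FROM Takes T, COURSE C
        WHERE T.cid = C.cid
          AND C.subj IN (
              SELECT C2.subj
              FROM COURSE C2, Teaches TE, PROFESSOR P
              WHERE C2.cid = TE.cid AND TE.fid = P.fid
                AND P.name = 'Martin'))
""").fetchall()

print(taught_by_martin)  # [('Nitin',)]
```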

13 Correlated Subqueries

- Most common: EXISTS/NOT EXISTS
- Find all students who have taken DB but not AI
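The same question can be sketched with correlated subqueries: each subquery refers to `S.sid` from the outer query, so it is conceptually re-evaluated per student. This assumes Python's built-in sqlite3 module and the slide 4 data (with `exp-grade` renamed `exp_grade`), plus one hypothetical row that is not in the slides so the answer is non-empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (sid INTEGER, name TEXT);
CREATE TABLE Takes (sid INTEGER, exp_grade TEXT, cid TEXT);
CREATE TABLE COURSE (cid TEXT, subj TEXT, sem TEXT);
INSERT INTO STUDENT VALUES (1, 'Jill'), (2, 'Qun'), (3, 'Nitin');
INSERT INTO COURSE VALUES ('550-0105', 'DB', 'F05'),
                          ('700-1005', 'AI', 'S05'),
                          ('501-0105', 'Arch', 'F05');
INSERT INTO Takes VALUES (1, 'A', '550-0105'),
                         (1, 'A', '700-1005'),
                         (3, 'C', '501-0105');
-- Hypothetical row, NOT in the slides: Qun takes DB.
INSERT INTO Takes VALUES (2, 'B', '550-0105');
""")

db_not_ai = conn.execute("""
    SELECT S.name
    FROM STUDENT S
    WHERE EXISTS (SELECT * FROM Takes T, COURSE C
                  WHERE T.sid = S.sid AND T.cid = C.cid AND C.subj = 'DB')
      AND NOT EXISTS (SELECT * FROM Takes T, COURSE C
                      WHERE T.sid = S.sid AND T.cid = C.cid AND C.subj = 'AI')
""").fetchall()

print(db_not_ai)  # [('Qun',)]
```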

14 Universal and Existential Quantification

- Generally used with subqueries:
  - {op} ANY, {op} ALL
- Find the students with the best expected grades

15 Table Expressions

- Can substitute a subquery for any relation in the FROM clause:

    SELECT S.sid
    FROM (SELECT sid FROM STUDENT WHERE sid = 5) S
    WHERE S.sid = 4

- Notice that we can actually simplify this query! What is this equivalent to?

16 Aggregation

- GROUP BY

    SELECT {group-attribs}, {aggregate-operator}(attrib)
    FROM {relation} T1, {relation} T2, …
    WHERE {predicates}
    GROUP BY {group-list}

- Aggregate operators
  - AVG, COUNT, SUM, MAX, MIN
  - DISTINCT keyword for AVG, COUNT, SUM

17 Some Examples

- Number of students in each course offering
- Number of different grades expected for each course offering
- Number of (distinct) students taking AI courses
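The three examples might be written as follows. This is a sketch that assumes Python's built-in sqlite3 module and the legible rows of the slide 4 instance (`exp-grade` renamed `exp_grade`). One hypothetical Takes row, not in the slides, is added so that counting students and counting distinct grades give different answers for the DB offering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Takes (sid INTEGER, exp_grade TEXT, cid TEXT);
CREATE TABLE COURSE (cid TEXT, subj TEXT, sem TEXT);
INSERT INTO COURSE VALUES ('550-0105', 'DB', 'F05'),
                          ('700-1005', 'AI', 'S05'),
                          ('501-0105', 'Arch', 'F05');
INSERT INTO Takes VALUES (1, 'A', '550-0105'),
                         (1, 'A', '700-1005'),
                         (3, 'C', '501-0105');
-- Hypothetical row, NOT in the slides: Qun also expects an A in DB.
INSERT INTO Takes VALUES (2, 'A', '550-0105');
""")

# Number of students in each course offering.
students_per_offering = conn.execute("""
    SELECT T.cid, COUNT(T.sid)
    FROM Takes T
    GROUP BY T.cid
    ORDER BY T.cid
""").fetchall()

# Number of different grades expected for each course offering.
grades_per_offering = conn.execute("""
    SELECT T.cid, COUNT(DISTINCT T.exp_grade)
    FROM Takes T
    GROUP BY T.cid
    ORDER BY T.cid
""").fetchall()

# Number of (distinct) students taking AI courses.
ai_students = conn.execute("""
    SELECT COUNT(DISTINCT T.sid)
    FROM Takes T, COURSE C
    WHERE T.cid = C.cid AND C.subj = 'AI'
""").fetchone()[0]

print(students_per_offering)  # [('501-0105', 1), ('550-0105', 2), ('700-1005', 1)]
print(grades_per_offering)    # [('501-0105', 1), ('550-0105', 1), ('700-1005', 1)]
print(ai_students)            # 1
```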

18 Data Instance, Again

    STUDENT
    sid  name
    1    Jill
    2    Qun
    3    Nitin
    4    Marty

    PROFESSOR
    fid  name
    1    Ives
    2    Saul
    8    Martin

    Takes
    sid  exp-grade  cid
    1    A          550-0105
    1    A          700-1005
    3    A
    3    C          501-0105
    4    C

    COURSE
    cid       subj  sem
    550-0105  DB    F05
    700-1005  AI    S05
    501-0105  Arch  F05
    555-1006  Sys   S06

    Teaches
    fid  cid
    1    550-0105
    2    700-1005
    8    501-0105

19 What If You Want to Only Show Some Groups?

- The HAVING clause lets you do a selection based on an aggregate (there must be 1 value per group):

    SELECT C.subj, COUNT(S.sid)
    FROM STUDENT S, Takes T, COURSE C
    WHERE S.sid = T.sid AND T.cid = C.cid
    GROUP BY subj
    HAVING COUNT(S.sid) > 5

- Exercise: For each subject taught by at least two professors, list the minimum expected grade
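A sketch of the HAVING query on a tiny instance, assuming Python's built-in sqlite3 module. The data is the slide 4 instance (`exp-grade` renamed `exp_grade`) plus one hypothetical row, and the threshold is lowered from 5 to 1 (passed as a bound parameter) because the toy data is far too small for the slide's cutoff.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (sid INTEGER, name TEXT);
CREATE TABLE Takes (sid INTEGER, exp_grade TEXT, cid TEXT);
CREATE TABLE COURSE (cid TEXT, subj TEXT, sem TEXT);
INSERT INTO STUDENT VALUES (1, 'Jill'), (2, 'Qun'), (3, 'Nitin');
INSERT INTO COURSE VALUES ('550-0105', 'DB', 'F05'),
                          ('700-1005', 'AI', 'S05'),
                          ('501-0105', 'Arch', 'F05');
INSERT INTO Takes VALUES (1, 'A', '550-0105'),
                         (1, 'A', '700-1005'),
                         (3, 'C', '501-0105');
-- Hypothetical row, NOT in the slides: Qun takes DB.
INSERT INTO Takes VALUES (2, 'B', '550-0105');
""")

min_size = 1  # assumed threshold for this toy data (the slide uses 5)

# WHERE filters rows before grouping; HAVING filters whole groups afterwards.
popular = conn.execute("""
    SELECT C.subj, COUNT(S.sid)
    FROM STUDENT S, Takes T, COURSE C
    WHERE S.sid = T.sid AND T.cid = C.cid
    GROUP BY C.subj
    HAVING COUNT(S.sid) > ?
""", (min_size,)).fetchall()

print(popular)  # [('DB', 2)]
```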

20 Aggregation and Table Expressions (aka Derived Relations)

- Sometimes need to compute results over the results of a previous aggregation:

    SELECT subj, AVG(size)
    FROM (
      SELECT C.cid AS id, C.subj AS subj, COUNT(S.sid) AS size
      FROM STUDENT S, Takes T, COURSE C
      WHERE S.sid = T.sid AND T.cid = C.cid
      GROUP BY C.cid, C.subj) AS sizes
    GROUP BY subj

- The inner GROUP BY qualifies cid (it appears in both Takes and COURSE), and the derived relation is given a name (sizes), which many systems require
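To see the two levels of aggregation at work, here is a sketch using Python's built-in sqlite3 module (an assumption for runnability). It uses the slide 4 data (`exp-grade` renamed `exp_grade`) plus a hypothetical second DB offering and two hypothetical Takes rows, none of which appear in the slides, so that one subject has offerings of different sizes. The alias `sizes` is likewise an illustrative name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (sid INTEGER, name TEXT);
CREATE TABLE Takes (sid INTEGER, exp_grade TEXT, cid TEXT);
CREATE TABLE COURSE (cid TEXT, subj TEXT, sem TEXT);
INSERT INTO STUDENT VALUES (1, 'Jill'), (2, 'Qun'), (3, 'Nitin');
INSERT INTO COURSE VALUES ('550-0105', 'DB', 'F05'),
                          ('700-1005', 'AI', 'S05'),
                          ('501-0105', 'Arch', 'F05');
INSERT INTO Takes VALUES (1, 'A', '550-0105'),
                         (1, 'A', '700-1005'),
                         (3, 'C', '501-0105');
-- Hypothetical rows, NOT in the slides: a second DB offering and two enrollments.
INSERT INTO COURSE VALUES ('550-0106', 'DB', 'S06');
INSERT INTO Takes VALUES (2, 'B', '550-0105'), (3, 'B', '550-0106');
""")

# Inner query: class size per offering. Outer query: average size per subject.
avg_sizes = conn.execute("""
    SELECT sizes.subj, AVG(sizes.size)
    FROM (SELECT C.cid AS id, C.subj AS subj, COUNT(S.sid) AS size
          FROM STUDENT S, Takes T, COURSE C
          WHERE S.sid = T.sid AND T.cid = C.cid
          GROUP BY C.cid, C.subj) AS sizes
    GROUP BY sizes.subj
    ORDER BY sizes.subj
""").fetchall()

print(avg_sizes)  # [('AI', 1.0), ('Arch', 1.0), ('DB', 1.5)]
```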

21 Thought Exercise…

- Tables are great, but…
  - Not everyone is uniform – I may have a cell phone but not a fax
  - We may simply be missing certain information
  - We may be unsure about values
- How do we handle these things?

22 One Answer: Null Values

- We designate a special “null” value to represent “unknown” or “N/A”

    CONTACT
    Name   Home      Fax
    Sam    123-4567  NULL
    Li     234-8972  234-8766
    Maria  789-2312  789-2121

- But a question: what does the following query do?

    SELECT * FROM CONTACT WHERE Fax < '789-1111'

23 Three-State Logic

- Need ways to evaluate boolean expressions and have the result be “unknown” (or T/F)
- Need ways of composing these three-state expressions using AND, OR, NOT:

    T AND U = U    T OR U = T    NOT U = U
    F AND U = F    F OR U = U
    U AND U = U    U OR U = U

- Can also test for null-ness: attr IS NULL, attr IS NOT NULL
- Finally: need rules for arithmetic, aggregation
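The CONTACT question from slide 22 and the truth tables above can be checked with a short sketch. It assumes Python's built-in sqlite3 module. SQLite represents true and false as 1 and 0 and "unknown" as NULL, which Python shows as None. A row is returned only when the qualification is true, so Sam (NULL fax) appears in neither the query nor its negation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CONTACT (Name TEXT, Home TEXT, Fax TEXT);
INSERT INTO CONTACT VALUES ('Sam', '123-4567', NULL),
                           ('Li', '234-8972', '234-8766'),
                           ('Maria', '789-2312', '789-2121');
""")

# Fax < '789-1111' is unknown for Sam, so Sam is not returned...
below = conn.execute(
    "SELECT Name FROM CONTACT WHERE Fax < '789-1111'").fetchall()

# ...and NOT unknown is still unknown, so Sam is not returned here either.
not_below = conn.execute(
    "SELECT Name FROM CONTACT WHERE NOT (Fax < '789-1111')").fetchall()

# Null-ness has to be tested explicitly.
missing = conn.execute(
    "SELECT Name FROM CONTACT WHERE Fax IS NULL").fetchall()

# T AND U, F AND U, T OR U, F OR U, NOT U
truth = conn.execute(
    "SELECT (1 AND NULL), (0 AND NULL), (1 OR NULL), (0 OR NULL), (NOT NULL)"
).fetchone()

print(below)      # [('Li',)]
print(not_below)  # [('Maria',)]
print(missing)    # [('Sam',)]
print(truth)      # (None, 0, 1, None, None)
```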

24 Nulls and Joins

- Sometimes need special variations of joins:
  - I want to see all courses and their students
  - … But what if there’s a course with no students?
- Outer join:
  - Most common is left outer join:

    SELECT C.subj, C.cid, T.sid
    FROM COURSE C LEFT OUTER JOIN Takes T
    ON C.cid = T.cid
    WHERE …
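A sketch of the left outer join, assuming Python's built-in sqlite3 module. COURSE holds the four rows of the revised instance, and Takes holds only the rows whose cid is legible in the transcript (`exp-grade` renamed `exp_grade`). The Sys offering has no students, so it is kept with a NULL sid (None in Python) instead of disappearing as it would in an ordinary join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Takes (sid INTEGER, exp_grade TEXT, cid TEXT);
CREATE TABLE COURSE (cid TEXT, subj TEXT, sem TEXT);
INSERT INTO COURSE VALUES ('550-0105', 'DB', 'F05'),
                          ('700-1005', 'AI', 'S05'),
                          ('501-0105', 'Arch', 'F05'),
                          ('555-1006', 'Sys', 'S06');
INSERT INTO Takes VALUES (1, 'A', '550-0105'),
                         (1, 'A', '700-1005'),
                         (3, 'C', '501-0105');
""")

# Every COURSE row is preserved; unmatched ones are padded with NULLs.
outer = conn.execute("""
    SELECT C.subj, C.cid, T.sid
    FROM COURSE C LEFT OUTER JOIN Takes T
      ON C.cid = T.cid
    ORDER BY C.cid
""").fetchall()

# The plain (inner) join silently drops the course with no students.
inner = conn.execute("""
    SELECT C.subj, C.cid, T.sid
    FROM COURSE C, Takes T
    WHERE C.cid = T.cid
    ORDER BY C.cid
""").fetchall()

print(outer)
print(len(inner))  # 3
```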

25 Data Instance, Again (!)

    STUDENT
    sid  name
    1    Jill
    2    Qun
    3    Nitin
    4    Marty

    PROFESSOR
    fid  name
    1    Ives
    2    Saul
    8    Martin

    Takes
    sid  exp-grade  cid
    1    A          550-0105
    1    A          700-1005
    3    A
    3    C          501-0105
    4    C

    COURSE
    cid       subj  sem
    550-0105  DB    F05
    700-1005  AI    S05
    501-0105  Arch  F05
    555-1006  Sys   S06

    Teaches
    fid  cid
    1    550-0105
    2    700-1005
    8    501-0105

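26 Warning on Outer Join

- Oracle has its own non-standard syntax here (releases before 9i don’t support the standard LEFT OUTER JOIN form). The (+) marks the side that is padded with NULLs:

    SELECT C.subj, C.cid, T.sid
    FROM COURSE C, Takes T
    WHERE C.cid = T.cid (+)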

27 Beyond Null

- Can have much more complex ideas of incomplete or approximate information
  - Probabilistic models (tuple 80% likely to be an answer)
  - Naïve tables (can have variables instead of NULLs)
  - Conditional tables (tuple IF some condition holds)
- … And what if you want “0 or more”?
  - In relational databases, create a new table and foreign key
  - But can have semistructured data (like XML)

28 Modifying the Database: Inserting Data

- Inserting a new literal tuple is easy, if wordy:

    INSERT INTO PROFESSOR (fid, name)
    VALUES (4, 'Simpson')

- But we can also insert the results of a query!

    INSERT INTO PROFESSOR (fid, name)
    SELECT sid AS fid, name
    FROM STUDENT
    WHERE sid < 20

29 Deleting Tuples

- Deletion is a fairly simple operation:

    DELETE FROM STUDENT S
    WHERE S.sid < 25

30 Updating Tuples

- What kinds of updates might you want to do?

    UPDATE STUDENT S
    SET S.sid = 1 + S.sid, S.name = 'Janet'
    WHERE S.name = 'Jane'
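The insert, delete and update statements from the last three slides can be run end to end in a sketch like this one, which assumes Python's built-in sqlite3 module. The range variable `S` is dropped in the DELETE and UPDATE because not every engine accepts an alias there. A hypothetical student named Jane (not in the slides) is added so the UPDATE has a row to change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (sid INTEGER, name TEXT);
CREATE TABLE PROFESSOR (fid INTEGER, name TEXT);
INSERT INTO STUDENT VALUES (1, 'Jill'), (2, 'Qun'), (3, 'Nitin');
INSERT INTO PROFESSOR VALUES (1, 'Ives'), (2, 'Saul'), (8, 'Martin');
-- Hypothetical row, NOT in the slides, so the UPDATE below matches something.
INSERT INTO STUDENT VALUES (30, 'Jane');
""")

# Insert one literal tuple.
conn.execute("INSERT INTO PROFESSOR (fid, name) VALUES (4, 'Simpson')")

# Insert the result of a query: students 1-3 are copied, Jane (sid 30) is not.
conn.execute(
    "INSERT INTO PROFESSOR (fid, name) "
    "SELECT sid AS fid, name FROM STUDENT WHERE sid < 20")

# Delete every student with sid below 25; only Jane remains.
conn.execute("DELETE FROM STUDENT WHERE sid < 25")

# Update two attributes of the matching tuple at once.
conn.execute(
    "UPDATE STUDENT SET sid = 1 + sid, name = 'Janet' WHERE name = 'Jane'")

students = conn.execute("SELECT sid, name FROM STUDENT").fetchall()
professor_count = conn.execute("SELECT COUNT(*) FROM PROFESSOR").fetchone()[0]

print(students)         # [(31, 'Janet')]
print(professor_count)  # 7
```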

31 Now, How Do I Talk to the DB?

- Generally, apps are in a different (“host”) language with embedded SQL statements
  - Static (query fixed): SQLJ, embedded SQL in C
  - Dynamic (query generated by program at runtime): ODBC, JDBC, ADO, OLE DB, …
- Predefined mappings between SQL types and host language types
  - CHAR, VARCHAR -> String
  - INTEGER -> int
  - DOUBLE -> double

32 Static SQL using SQLJ

    int sid = 5;
    String name = "Jim", name6;
    // Database connection setup omitted
    #sql { INSERT INTO STUDENT VALUES(:sid, :name) };
    #sql { SELECT name INTO :name6 FROM STUDENT WHERE sid = 6 };

33 JDBC: Dynamic SQL

    import java.sql.*;

    Connection conn = DriverManager.getConnection(…);
    Statement s = conn.createStatement();
    int sid = 5;
    String name = "Jim";
    s.executeUpdate("INSERT INTO STUDENT VALUES(" + sid + ", '" + name + "')");
    // or equivalently
    s.executeUpdate("INSERT INTO STUDENT VALUES(5, 'Jim')");

34 Static vs. Dynamic SQL

- Syntax
  - Static is cleaner than Dynamic
  - Dynamic doesn’t extend language syntax, so you can use any tool you like
- Execution
  - Static must be precompiled
    - Can be faster at runtime
    - Extra step is needed to deploy application
  - Static checks SQL syntax at compilation time, Dynamic at run time
- We’ll focus on JDBC, since it’s easy to use

35 The Impedance Mismatch and Cursors

- SQL is set-oriented – it returns relations
- There’s no relation type in most languages!
- Solution: cursor that’s opened, read

    ResultSet rs = stmt.executeQuery("SELECT * FROM STUDENT");
    while (rs.next()) {
        int sid = rs.getInt("sid");
        String name = rs.getString("name");
        System.out.println(sid + ": " + name);
    }

36 JDBC: Prepared Statements (1)

- But query compilation takes a (relatively) long time!
- This example is therefore inefficient: the query text is recompiled on every iteration.

    int[] students = {1, 2, 4, 7, 9};
    for (int i = 0; i < students.length; ++i) {
        ResultSet rs = stmt.executeQuery("SELECT * " +
            "FROM STUDENT WHERE sid = " + students[i]);
        while (rs.next()) {
            …
        }
    }

37 JDBC: Prepared Statements (2)

- To speed things up, prepare statements and bind arguments to them
- This also means you don’t have to worry about escaping strings, formatting dates, etc.
  - Problems with this lead to a lot of security holes (SQL injection)
  - Or suppose a user inputs the name “O’Reilly”

    PreparedStatement stmt = conn.prepareStatement("SELECT * " +
        " FROM STUDENT WHERE sid = ? ");
    int[] students = {1, 2, 4, 7, 9};
    for (int i = 0; i < students.length; ++i) {
        stmt.setInt(1, students[i]);
        ResultSet rs = stmt.executeQuery();
        while (rs.next()) {
            …
        }
    }
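The same idea can be sketched outside JDBC. The example below is a Python analogue using the built-in sqlite3 module, not the lecture's Java code: the `?` placeholders play the role of the PreparedStatement parameters. It shows the “O’Reilly” problem with string concatenation and the bound-parameter fix, then reuses one query text for each student ID. The sid value 4 for the new student is an arbitrary choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENT (sid INTEGER, name TEXT)")
conn.executemany("INSERT INTO STUDENT VALUES (?, ?)",
                 [(1, "Jill"), (2, "Qun"), (3, "Nitin")])

# A name containing a single quote, as on the slide.
user_input = "O'Reilly"

# Pasting the value into the SQL text yields a malformed statement.
try:
    conn.execute("INSERT INTO STUDENT VALUES (4, '" + user_input + "')")
    concatenation_failed = False
except sqlite3.OperationalError:
    concatenation_failed = True

# Binding the value to a placeholder needs no escaping at all.
conn.execute("INSERT INTO STUDENT VALUES (?, ?)", (4, user_input))

# One query text, executed repeatedly with a different bound argument.
query = "SELECT sid, name FROM STUDENT WHERE sid = ?"
found = []
for sid in [1, 2, 4, 7, 9]:
    found.extend(conn.execute(query, (sid,)).fetchall())

print(concatenation_failed)  # True
print(found)                 # [(1, 'Jill'), (2, 'Qun'), (4, "O'Reilly")]
```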

38 Database-Backed Web Sites

- We all know traditional static HTML web sites:

[Diagram: the web browser sends an HTTP request (GET ...) to the web server, which loads the HTML file from the file system.]

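39 Common Gateway Interface (CGI)

- Can have the web server invoke code (with parameters) to generate HTML

[Diagram: an HTTP request reaches the web server. If the requested file is HTML, it is loaded from the file system and returned. If it is a program, the server executes it (the program may do I/O, network, or DB access) and returns its output as the HTML file.]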

40 CGI: Discussion

- Advantages:
  - Standardized: works for every web-server, browser
  - Flexible: Any language (C++, Perl, Java, …) can be used
- Disadvantages:
  - Statelessness: query-by-query approach
  - Inefficient: new process forked for every request
  - Security: CGI programmer is responsible for security
  - Updates: To update layout, one has to be a programmer

41 DB Access in Java

[Diagram: a Java applet runs in the browser's JVM and talks over TCP/UDP and IP to a Java server process, which goes through the JDBC driver manager and JDBC drivers to reach Sybase, Oracle, ...]

42 Java Applets: Discussion

- Advantages:
  - Can take advantage of client processing
  - Platform independent – assuming standard Java
- Disadvantages:
  - Requires JVM on client; self-contained
  - Inefficient: loading can take a long time...
  - Resource intensive: Client needs to be state of the art
  - Restrictive: can only connect to server where applet was loaded from (for security … can be configured)

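43 *SP Server Pages and Servlets (IIS, Tomcat, …)

[Diagram: an HTTP request reaches the web server. If the requested file is HTML, it is loaded from the file system and returned. If it is a script or servlet, a server extension (which may have a built-in VM: JVM, CLR) runs it, possibly doing I/O, network, or DB access, and its output is returned as the HTML file.]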

44 One Step Beyond: DB-Driven Web Sites (Strudel, Cocoon, …)

[Diagram: a DB-driven web server receives an HTTP request and returns an HTML file. Behind the web server sit a cache, a local database, other data sources, scripts, styles, and a dynamic HTML generation step that turns data into HTML.]

45 Wrapping Up

- We’ve seen how to query in SQL
  - Basic foundation is TRC-based
  - Subqueries and aggregation add extra power beyond *RC
  - Nulls and outer joins add flexibility of representation
  - We can update tables
- We’ve also seen that SQL doesn’t precisely match standard host language semantics
  - Embedded SQL
  - Dynamic SQL
- We’ve seen a hint of data-driven web site architectures

