1.1 CAS CS 460/660 Introduction to Database Systems Relational Model and more…

Presentation transcript:

1.1 CAS CS 460/660 Introduction to Database Systems Relational Model and more…

1.2 The Structure Spectrum
- Structured (schema-first): relational databases, formatted messages
- Semi-structured (schema-later): documents, XML, tagged text/media
- Unstructured (schema-never): plain text, media

1.3 The Relational Model
The relational model is ubiquitous:
- MySQL, PostgreSQL, Oracle, DB2, SQL Server, …
- Foundational work done at:
  - IBM Santa Teresa Labs (now IBM Almaden in San Jose): "System R"
  - UC Berkeley CS: the "Ingres" system
- Note: some legacy systems use older models, e.g., IBM's IMS
Object-oriented concepts have been merged in:
- Early work: POSTGRES research project at Berkeley
- Informix, IBM DB2, Oracle 8i
Support for XML (semi-structured data) has been merged in as well.

1.4 Relational Model The relational model for database management is a database model based on first-order predicate logic, first formulated and proposed in 1969 by Edgar F. Codd. Codd, E.F. (1970). "A Relational Model of Data for Large Shared Data Banks". Communications of the ACM 13 (6): 377–387.

1.5 Relational Database: Definitions
Relational database: a set of relations.
Relation: made up of 2 parts:
- Schema: specifies the name of the relation, plus the name and type of each column
    Students(sid: string, name: string, login: string, age: integer, gpa: real)
- Instance: the actual data at a given time
  - #rows = cardinality
  - #fields = degree / arity

1.6 Some Synonyms
Formal    | Not-so-formal 1 | Not-so-formal 2
Relation  | Table           |
Tuple     | Row             | Record
Attribute | Column          | Field
Domain    | Type            |

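1.7 Ex: Instance of Students Relation
Columns: sid, name, login, age, gpa
Rows (by name): Jones, Smith, Smith
Cardinality = 3, arity = 5, all rows distinct.
Do all values in each column of a relation instance have to be unique? No: two rows here share the name Smith. Only whole rows must be distinct.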
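A small sketch of the slide's numbers in plain Python. The sid, login, age and gpa values are assumptions made up for illustration; only the three names come from the slide.

```python
# Schema: column names and types (as on the Definitions slide).
schema = {"sid": str, "name": str, "login": str, "age": int, "gpa": float}

# Instance: the rows at a given time. All values except the names are
# hypothetical placeholders.
instance = [
    ("53666", "Jones", "jones@cs", 18, 3.4),
    ("53688", "Smith", "smith@eecs", 18, 3.2),
    ("53650", "Smith", "smith@math", 19, 3.8),
]

degree = len(schema)          # number of fields (arity)
cardinality = len(instance)   # number of rows

# A relation is a set of tuples, so whole rows must be distinct...
all_rows_distinct = len(set(instance)) == cardinality

# ...but a single column may repeat values (two students named Smith).
names = [row[1] for row in instance]
name_column_unique = len(set(names)) == len(names)

print(cardinality, degree, all_rows_distinct, name_column_unique)
```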

1.8 SQL - A Language for Relational DBs
Say: "ess-cue-ell" or "sequel", but spelled "SQL".
Data Definition Language (DDL):
- create, modify, delete relations
- specify constraints
- administer users, security, etc.
Data Manipulation Language (DML):
- specify queries to find tuples that satisfy criteria
- add, modify, remove tuples
The DBMS is responsible for efficient evaluation.

1.9 The SQL Query Language
The most widely used relational query language.
- Originally developed at IBM, then standardized by ANSI in 1986
- Current standard as of these slides: SQL-2011
  - 2008 added x-query stuff, new triggers, …
  - 2003 was the last major update: XML, window functions, sequences, auto-generated IDs. Not fully supported yet.
- SQL-1999 introduced "Object-Relational" concepts. Also not fully supported yet.
- SQL92 is a basic subset; most systems support at least this.
- PostgreSQL has some "unique" aspects (as do most systems).
- SQL is not synonymous with Microsoft's "SQL Server".

1.10 Creating Relations in SQL
Creates the Students relation:
    CREATE TABLE Students
      (sid CHAR(20),
       name CHAR(20),
       login CHAR(10),
       age INTEGER,
       gpa FLOAT)
Note: the type (domain) of each field is specified, and enforced by the DBMS whenever tuples are added or modified.

1.11 Table Creation (continued)
Another example: the Enrolled table holds information about courses students take.
    CREATE TABLE Enrolled
      (sid CHAR(20),
       cid CHAR(20),
       grade CHAR(2))

1.12 Adding and Deleting Tuples
Can insert a single tuple using:
    INSERT INTO Students (sid, name, login, age, gpa)
    VALUES ('53688', 'Smith', 'smith@ee', 18, 3.2)
Can delete all tuples satisfying some condition (e.g., name = Smith):
    DELETE FROM Students S
    WHERE S.name = 'Smith'
Powerful variants of these commands are available; more later!
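A minimal runnable sketch of insert and delete, using Python's built-in sqlite3 module as a stand-in DBMS (an assumption; the slides do not name one). The Jones row and the login values are made up for illustration, and the DELETE omits the range variable S to stay within SQLite's syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students "
    "(sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa FLOAT)"
)

# Insert single tuples; one value is supplied per listed column.
insert_sql = (
    "INSERT INTO Students (sid, name, login, age, gpa) "
    "VALUES (?, ?, ?, ?, ?)"
)
conn.execute(insert_sql, ("53688", "Smith", "smith@ee", 18, 3.2))
conn.execute(insert_sql, ("53666", "Jones", "jones@cs", 18, 3.4))  # hypothetical row

count_before = conn.execute("SELECT COUNT(*) FROM Students").fetchone()[0]

# Delete every tuple that satisfies the condition.
deleted = conn.execute("DELETE FROM Students WHERE name = 'Smith'").rowcount

remaining = conn.execute("SELECT name FROM Students").fetchall()
print(count_before, deleted, remaining)
```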

1.13 Keys
Keys are a way to associate tuples in different relations.
Keys are one form of integrity constraint (IC).
Enrolled (sid is a FOREIGN key):
sid   | cid         | grade
53666 | Carnatic101 | C
53666 | Reggae203   | B
53650 | Topology112 | A
53666 | History105  | B
Students (sid is the PRIMARY key):
sid | name | login | age | gpa

1.14 Primary Keys
A set of fields is a superkey if:
- No two distinct tuples can have the same values in all key fields
A set of fields is a key for a relation if:
- It is a superkey
- No subset of the fields is a superkey
What if there is more than one key for a relation?
- One of the keys is chosen (by the DBA) to be the primary key. Other keys are called candidate keys.
E.g.:
- sid is a key for Students.
- What about name?
- The set {sid, gpa} is a superkey.
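The definitions above can be sketched as two small checks. This is an illustration only: it tests one instance, whereas a real key is a constraint on every legal instance of the relation. The helper names and the row values are assumptions.

```python
from itertools import combinations

def is_superkey(rows, fields):
    """True if no two rows agree on all of the given fields."""
    seen = set()
    for row in rows:
        value = tuple(row[f] for f in fields)
        if value in seen:
            return False
        seen.add(value)
    return True

def is_key(rows, fields):
    """A key is a superkey none of whose proper subsets is a superkey."""
    if not is_superkey(rows, fields):
        return False
    for size in range(1, len(fields)):
        for subset in combinations(fields, size):
            if is_superkey(rows, subset):
                return False
    return True

# Hypothetical Students instance.
students = [
    {"sid": "53666", "name": "Jones", "gpa": 3.4},
    {"sid": "53688", "name": "Smith", "gpa": 3.2},
    {"sid": "53650", "name": "Smith", "gpa": 3.8},
]

print(is_key(students, ["sid"]))              # sid alone identifies a row
print(is_superkey(students, ["name"]))        # two Smiths, so not a superkey
print(is_superkey(students, ["sid", "gpa"]))  # superkey...
print(is_key(students, ["sid", "gpa"]))       # ...but not minimal, so not a key
```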

1.15 Primary and Candidate Keys in SQL
Possibly many candidate keys (specified using UNIQUE), one of which is chosen as the primary key.
Keys must be used carefully!
"For a given student and course, there is a single grade."
    CREATE TABLE Enrolled
      (sid CHAR(20),
       cid CHAR(20),
       grade CHAR(2),
       PRIMARY KEY (sid, cid))
vs.
"Students can take only one course, and no two students in a course receive the same grade."
    CREATE TABLE Enrolled
      (sid CHAR(20),
       cid CHAR(20),
       grade CHAR(2),
       PRIMARY KEY (sid),
       UNIQUE (cid, grade))
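To see how differently the two declarations behave, here is a sketch that feeds the same rows to both designs, again using Python's sqlite3 as an assumed stand-in DBMS. The helper try_insert and the row values are hypothetical.

```python
import sqlite3

def try_insert(conn, row):
    """Return True if the row is accepted, False if a key constraint rejects it."""
    try:
        conn.execute("INSERT INTO Enrolled VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

# Design 1: one grade per (student, course).
design1 = sqlite3.connect(":memory:")
design1.execute(
    "CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2), "
    "PRIMARY KEY (sid, cid))"
)

# Design 2: one course per student, and no shared grade within a course.
design2 = sqlite3.connect(":memory:")
design2.execute(
    "CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2), "
    "PRIMARY KEY (sid), UNIQUE (cid, grade))"
)

rows = [
    ("53666", "History105", "B"),
    ("53666", "Reggae203", "B"),   # same student, second course
    ("53650", "History105", "B"),  # other student, same course and grade
    ("53666", "History105", "A"),  # same student and course again
]

results1 = [try_insert(design1, r) for r in rows]
results2 = [try_insert(design2, r) for r in rows]
print(results1)
print(results2)
```

Design 1 rejects only the repeated (sid, cid) pair, while design 2 also rejects a second course for the same student and a repeated (cid, grade) pair, which is rarely what is intended.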

1.16 Foreign Keys, Referential Integrity
Foreign key: a "logical pointer"
- Set of fields in a tuple in one relation that "refer" to a tuple in another relation.
- Reference to the primary key of the other relation.
All foreign key constraints enforced?
- Referential integrity!
- i.e., no dangling references.

1.17 Foreign Keys in SQL
E.g., only students listed in the Students relation should be allowed to enroll for courses.
- sid is a foreign key referring to Students:
    CREATE TABLE Enrolled
      (sid CHAR(20), cid CHAR(20), grade CHAR(2),
       PRIMARY KEY (sid, cid),
       FOREIGN KEY (sid) REFERENCES Students);
Enrolled:
sid   | cid         | grade
53666 | Carnatic101 | C
53666 | Reggae203   | B
53650 | Topology112 | A
53666 | History105  | B
Students:
sid | name | login | age | gpa
Enrolled row fragment: English102, A
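A hedged sketch of referential integrity in action, using Python's sqlite3 as an assumed stand-in DBMS. SQLite only checks foreign keys once PRAGMA foreign_keys is switched on; other systems enforce them by default. The helper is_rejected, the sid 99999 and the Students row values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK checks
conn.execute(
    "CREATE TABLE Students (sid CHAR(20) PRIMARY KEY, name CHAR(20), "
    "login CHAR(10), age INTEGER, gpa FLOAT)"
)
conn.execute(
    "CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2), "
    "PRIMARY KEY (sid, cid), "
    "FOREIGN KEY (sid) REFERENCES Students)"
)

def is_rejected(sql, params=()):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True

conn.execute(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    ("53666", "Jones", "jones@cs", 18, 3.4),
)

# A listed student may enroll.
ok_rejected = is_rejected(
    "INSERT INTO Enrolled VALUES (?, ?, ?)", ("53666", "History105", "B")
)
# An unknown sid would be a dangling reference.
dangling_rejected = is_rejected(
    "INSERT INTO Enrolled VALUES (?, ?, ?)", ("99999", "English102", "A")
)
# Deleting a student who is still referenced would also leave one.
delete_rejected = is_rejected("DELETE FROM Students WHERE sid = '53666'")

print(ok_rejected, dangling_rejected, delete_rejected)
```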

1.18 Next Up We’ll talk a bit about the SQL DML Then we’ll start describing the DBMS from storage on up

1.19 The SQL DML
Single-table queries are straightforward.
To find records for all 18-year-old students with GPAs above 2.0, we can write:
    SELECT *
    FROM Students S
    WHERE S.age = 18 AND S.gpa > 2.0
To get just names and logins, replace the first line:
    SELECT S.name, S.login
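Both variants of the query, run against a small made-up instance with Python's sqlite3 (an assumed stand-in DBMS). All row values, including the student Lee, are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students "
    "(sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa FLOAT)"
)
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [
        ("53666", "Jones", "jones@cs", 18, 3.4),
        ("53688", "Smith", "smith@eecs", 18, 3.2),
        ("53650", "Smith", "smith@math", 19, 3.8),  # fails the age test
        ("53700", "Lee", "lee@cs", 18, 1.9),        # fails the gpa test
    ],
)

# All fields of the qualifying students.
all_fields = conn.execute(
    "SELECT * FROM Students S WHERE S.age = 18 AND S.gpa > 2.0"
).fetchall()

# Same qualification, but only two columns in the target list.
names_logins = conn.execute(
    "SELECT S.name, S.login FROM Students S WHERE S.age = 18 AND S.gpa > 2.0"
).fetchall()

print(all_fields)
print(names_logins)
```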

1.20 Basic SQL Queries
    SELECT [DISTINCT] target-list
    FROM relation-list
    WHERE qualification
- relation-list: a list of relation names, possibly with a range-variable after each name
- target-list: a list of attributes of tables in relation-list
- qualification: comparisons combined using AND, OR and NOT. Comparisons are Attr op const or Attr1 op Attr2, where op is one of =, ≠, <, >, ≤, ≥
- DISTINCT (optional): indicates that the answer should have no duplicates. In SQL SELECT, the default is that duplicates are not eliminated! (The result is called a "multiset".)
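A short sketch of the multiset default versus DISTINCT, using Python's sqlite3 as an assumed stand-in DBMS and a cut-down, hypothetical Students table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid CHAR(20), name CHAR(20))")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?)",
    [("53666", "Jones"), ("53688", "Smith"), ("53650", "Smith")],
)

# Default: duplicates are kept, so the answer is a multiset.
with_duplicates = conn.execute("SELECT S.name FROM Students S").fetchall()

# DISTINCT: duplicates are eliminated.
without_duplicates = conn.execute(
    "SELECT DISTINCT S.name FROM Students S"
).fetchall()

print(sorted(with_duplicates))
print(sorted(without_duplicates))
```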

1.21 Querying Multiple Relations
Can specify a join over two tables as follows:
    SELECT S.name, E.cid
    FROM Students S, Enrolled E
    WHERE S.sid = E.sid AND E.grade = 'B'
Result:
S.name | E.cid
Jones  | History105
Note: obviously no referential integrity constraints have been used here.
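The join can be tried with Python's sqlite3 (an assumed stand-in DBMS). The instance below is an assumption chosen to reproduce the slide's result: the sid 53831 has no matching student, which is only possible because no foreign key is declared. Logins, ages and GPAs are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students "
    "(sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa FLOAT)"
)
conn.execute("CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2))")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [
        ("53666", "Jones", "jones@cs", 18, 3.4),
        ("53688", "Smith", "smith@eecs", 18, 3.2),
        ("53650", "Smith", "smith@math", 19, 3.8),
    ],
)
conn.executemany(
    "INSERT INTO Enrolled VALUES (?, ?, ?)",
    [
        ("53831", "Carnatic101", "C"),  # hypothetical sid, not in Students
        ("53831", "Reggae203", "B"),
        ("53650", "Topology112", "A"),
        ("53666", "History105", "B"),
    ],
)

join_result = conn.execute(
    "SELECT S.name, E.cid "
    "FROM Students S, Enrolled E "
    "WHERE S.sid = E.sid AND E.grade = 'B'"
).fetchall()
print(join_result)
```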

1.22 Basic Query Semantics
The semantics of a SQL query are defined in terms of the following conceptual evaluation strategy:
1. Do the FROM clause: compute the cross-product of the tables (e.g., Students and Enrolled).
2. Do the WHERE clause: check conditions, discard tuples that fail.
3. Do the SELECT clause: delete unwanted fields.
4. If DISTINCT is specified, eliminate duplicate rows.
Probably the least efficient way to compute a query! A query optimizer will find more efficient strategies to get the same answer.

1.23 Step 1) Cross Product
    SELECT S.name, E.cid
    FROM Students S, Enrolled E
    WHERE S.sid = E.sid AND E.grade = 'B'

1.24 Step 2) Discard tuples that fail predicate
    SELECT S.name, E.cid
    FROM Students S, Enrolled E
    WHERE S.sid = E.sid AND E.grade = 'B'

1.25 Step 3) Discard Unwanted Columns
    SELECT S.name, E.cid
    FROM Students S, Enrolled E
    WHERE S.sid = E.sid AND E.grade = 'B'
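The conceptual evaluation steps can be written out literally in plain Python. This is a sketch of the semantics, not of how a DBMS executes the query. The instance is an assumption (sid 53831 and all logins, ages and GPAs are made up).

```python
from itertools import product

# (sid, name, login, age, gpa)
students = [
    ("53666", "Jones", "jones@cs", 18, 3.4),
    ("53688", "Smith", "smith@eecs", 18, 3.2),
    ("53650", "Smith", "smith@math", 19, 3.8),
]
# (sid, cid, grade)
enrolled = [
    ("53831", "Carnatic101", "C"),
    ("53831", "Reggae203", "B"),
    ("53650", "Topology112", "A"),
    ("53666", "History105", "B"),
]

# Step 1: FROM clause, the cross-product of the two tables.
cross = list(product(students, enrolled))

# Step 2: WHERE clause, keep pairs with S.sid = E.sid AND E.grade = 'B'.
kept = [(s, e) for s, e in cross if s[0] == e[0] and e[2] == "B"]

# Step 3: SELECT clause, keep only S.name and E.cid.
result = [(s[1], e[1]) for s, e in kept]

# Step 4: only if DISTINCT is specified, drop duplicate rows.
distinct_result = list(dict.fromkeys(result))

print(len(cross), len(kept), result)
```

The cross-product has 3 x 4 = 12 pairs even though only one survives the WHERE clause, which is why an optimizer avoids materializing it.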

1.26 Aggregate Operators
For calculation and analytics:
- COUNT (*)
- COUNT ([DISTINCT] A)
- SUM ([DISTINCT] A)
- AVG (A)
- MAX (A)
- MIN (A)
    SELECT AVG (S.gpa)
    FROM Students S
    WHERE S.age = 18;

    SELECT COUNT (*)
    FROM Students;

    SELECT COUNT (DISTINCT S.age)
    FROM Students S
    WHERE S.name = 'Bob';
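The three aggregate queries, run with Python's sqlite3 (an assumed stand-in DBMS) on a hypothetical instance. The last query filters on 'Smith' instead of 'Bob' so that it has rows to count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students "
    "(sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa FLOAT)"
)
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [
        ("53666", "Jones", "jones@cs", 18, 3.4),
        ("53688", "Smith", "smith@eecs", 18, 3.2),
        ("53650", "Smith", "smith@math", 19, 3.8),
    ],
)

# Average gpa over the 18-year-olds only.
avg_gpa_18 = conn.execute(
    "SELECT AVG(S.gpa) FROM Students S WHERE S.age = 18"
).fetchone()[0]

# Number of rows in the table.
total = conn.execute("SELECT COUNT(*) FROM Students").fetchone()[0]

# Number of different ages among students named Smith.
smith_ages = conn.execute(
    "SELECT COUNT(DISTINCT S.age) FROM Students S WHERE S.name = 'Smith'"
).fetchone()[0]

print(avg_gpa_18, total, smith_ages)
```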

1.27 GROUP BY and HAVING
Sometimes, we want to apply aggregates to each of several groups of tuples.
- This query computes the average gpa per major (assume students have a "major" attribute):
    SELECT S.major, AVG (S.gpa) AS AvgGPA
    FROM Students S
    GROUP BY S.major;
- If you want to exclude "small" majors, use HAVING:
    SELECT S.major, AVG (S.gpa) AS AvgGPA
    FROM Students S
    GROUP BY S.major
    HAVING COUNT (*) > 10;
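A sketch of both queries using Python's sqlite3 (an assumed stand-in DBMS). The cut-down table, the majors and the GPAs are hypothetical, and the size threshold is lowered from 10 to 1 so that a three-row instance shows the effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid CHAR(20), major CHAR(20), gpa FLOAT)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [
        ("1", "CS", 3.0),
        ("2", "CS", 4.0),
        ("3", "Math", 2.0),
    ],
)

# Average gpa per major.
per_major = dict(
    conn.execute(
        "SELECT S.major, AVG(S.gpa) AS AvgGPA FROM Students S GROUP BY S.major"
    ).fetchall()
)

# Same, but majors with too few students are excluded.
min_size = 1  # the slide uses 10
large_majors = dict(
    conn.execute(
        "SELECT S.major, AVG(S.gpa) AS AvgGPA FROM Students S "
        "GROUP BY S.major HAVING COUNT(*) > ?",
        (min_size,),
    ).fetchall()
)

print(per_major)
print(large_majors)
```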

1.28 (Slightly) Less Basic SQL Queries
    SELECT [DISTINCT] target-list
    FROM relation-list
    WHERE qualification
    GROUP BY grouping-list
    HAVING group-qualification

1.29 Conceptual Evaluation
The cross-product of relation-list is computed, tuples that fail qualification are discarded, "unnecessary" fields are deleted, and the remaining tuples are partitioned into groups by the value of the attributes in grouping-list.

1.30 Conceptual Evaluation (cont.)
The group-qualification is then applied to eliminate some groups.
- Expressions in group-qualification must have a single value per group!
- That is, attributes in group-qualification must be arguments of an aggregate op or must also appear in the grouping-list.
One answer tuple is generated per qualifying group.
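The grouping part of the conceptual evaluation, sketched in plain Python for a query of the form SELECT major, AVG(gpa) ... GROUP BY major HAVING COUNT(*) > min_size. The data and the threshold min_size = 1 are assumptions for illustration.

```python
from collections import defaultdict

# Tuples that survived the WHERE clause, already cut down to the
# fields the query needs: (major, gpa). Values are hypothetical.
tuples = [("CS", 3.0), ("CS", 4.0), ("Math", 2.0)]

# Partition the tuples into groups by the grouping-list value.
groups = defaultdict(list)
for major, gpa in tuples:
    groups[major].append(gpa)

# Apply the group-qualification; COUNT(*) has a single value per group.
min_size = 1
qualifying = {major: gpas for major, gpas in groups.items() if len(gpas) > min_size}

# Generate one answer tuple per qualifying group.
answer = [(major, sum(gpas) / len(gpas)) for major, gpas in qualifying.items()]
print(answer)
```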

1.31 Okay: Let's start from the bottom up…
DBMS layers, from the database app at the top down to the student records stored on disk:
- Query Optimization and Execution
- Relational Operators
- Access Methods
- Buffer Management
- Disk Space Management
These layers must consider concurrency control and recovery.