The Relational Model Ramakrishnan & Gehrke Chapter 3.


Administrivia
Homework 0 is due next Thursday.
No discussion sections next Monday (Labor Day).
Enrollment goal is ~150; 118 are currently enrolled and 47 are on the wait list.
  – Pretty good chance to get in.
  – To be sure, the appeal deadline is *tomorrow*.
  – Talk to Michael-David Sasson.

Review
What is a "Database"?
What is a "Database System"?
What makes a Database System different from RAM, disk, and the WWW?

Review
da·ta·base
Pronunciation: 'dA-t&-"bAs, 'da- also 'dä-
Function: noun
Date: circa 1962
: a usually large collection of data organized especially for rapid search and retrieval (as by a computer)
A database management system (DBMS), sometimes just called a database manager, is a program that lets one or more computer users create and access data in a database.

Review
What makes DBMSs different?
  – A.C.I.D. properties
  – Levels of abstraction
  – Data independence
  – Query languages
  – Data modelling
When do you use DBMSs (or not)?
  – When you need the above features
  – When you don't mind the performance/cost penalty

Some useful terms
Byte
Kilobyte – 2^10 bytes (i.e., 1024 bytes)
Megabyte – 2^20 bytes (i.e., 1024 kilobytes)
Gigabyte – 2^30 bytes (i.e., 1024 megabytes)
Terabyte – 2^40 bytes (i.e., 1024 gigabytes)
  – A handful of these for files in EECS
  – Biggest single online DB is Wal-Mart, > 100 TB
  – Internet Archive WayBack Machine is > 100 TB
Petabyte – 2^50 bytes (i.e., 1024 terabytes)
  – 11 of these in 1999
Exabyte – 2^60 bytes (i.e., 1024 petabytes)
  – 8 of these projected to be sold in new disks in 2003
Zettabyte – 2^70 bytes (i.e., 1024 exabytes)
Yottabyte – 2^80 bytes (i.e., 1024 zettabytes)
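The powers of two above can be checked with a few lines of arithmetic. This is only a sketch in Python; the names UNIT_NAMES and BYTES_PER_UNIT are made up for the illustration and are not from the slides.

```python
# Each unit is 1024 (2**10) times the previous one.
UNIT_NAMES = ["kilobyte", "megabyte", "gigabyte", "terabyte",
              "petabyte", "exabyte", "zettabyte", "yottabyte"]

# Hypothetical lookup table: unit name -> size in bytes.
BYTES_PER_UNIT = {name: 2 ** (10 * (i + 1)) for i, name in enumerate(UNIT_NAMES)}

for name, size in BYTES_PER_UNIT.items():
    # For an exact power of two, bit_length() - 1 is the exponent.
    print(f"1 {name} = 2^{size.bit_length() - 1} bytes = {size:,} bytes")
```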

Today: Data Models
A DBMS is used to model the real world.
A data model is the link between the user's view of the world and the bits stored in the computer.
Many models exist.
We will concentrate on the Relational Model.
The relational model uses the concept of tables, rows and columns:
  Students(sid: string, name: string, login: string, age: integer, gpa: real)

Why Study the Relational Model?
Has a nice formal theoretical foundation vs. earlier models
  – Allows declarative queries, not procedural
Most widely used model.
  – Vendors: IBM, Microsoft, Oracle, Sybase, etc.
"Legacy systems" in older models
  – e.g., IBM's IMS
Object-oriented concepts have recently merged in
  – Object-relational model: IBM DB2, Oracle 9i, IBM Informix
  – Will touch on this toward the end of the semester
  – Based on the POSTGRES research project at Berkeley
  – Postgres still represents the cutting edge on some of these features!

Storing Data in a Table
Data about individual students
One row per student
How to represent course enrollment?

Storing More Data in Tables
Students may enroll in more than one course.
Most efficient: keep enrollment in a separate table, Enrolled.

Linking Data from Multiple Tables
How to connect student data to enrollment?
Need a key
[Figure: the Enrolled and Students tables]

Relational Data Model: Formal Definitions
Relational database: a set of relations.
Relation: made up of 2 parts:
  – Instance: a table, with rows and columns. #rows = cardinality
  – Schema: specifies name of relation, plus name and type of each column. #fields = degree / arity
    E.g. Students(sid: string, name: string, login: string, age: integer, gpa: real)
Can think of a relation as a set of rows or tuples.
  – i.e., all rows are distinct
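The schema/instance distinction can be sketched with Python's built-in sqlite3 module. The choice of SQLite and the three rows are assumptions made for illustration; they are not taken from the slides.

```python
import sqlite3

# In-memory database; the rows below are illustrative sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students ("
    "sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa FLOAT)"
)
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [
        ("53666", "Jones", "jones@cs", 18, 3.4),
        ("53688", "Smith", "smith@ee", 18, 3.2),
        ("53650", "Smith", "smith@math", 19, 3.8),
    ],
)

cursor = conn.execute("SELECT * FROM Students")
# Schema side: cursor.description has one entry per column.
arity = len(cursor.description)
# Instance side: one Python tuple per row.
rows = cursor.fetchall()
cardinality = len(rows)

print("arity =", arity, "cardinality =", cardinality)
```

Note that an SQL table does not by itself guarantee distinct rows; that only holds once a key is declared, which is why the sketch treats "all rows distinct" as a property of this particular sample.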

In other words...
Data Model – a way to organize information
Schema – one particular organization,
  – i.e., a set of fields/columns, each of a given type
Relation
  – a name
  – a Schema
  – a set of tuples/rows, each following the organization specified in the Schema

Example Instance of Students Relation
Cardinality = 3, arity = 5, all rows distinct.
Do all values in each column of a relation instance have to be distinct?

SQL - A Language for Relational DBs
SQL: standard language
Data Definition Language (DDL)
  – Create, modify, delete relations
  – Specify constraints
  – Administer users, security, etc.
Data Manipulation Language (DML)
  – Specify queries to find tuples that satisfy criteria
  – Add, modify, remove tuples

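SQL Overview
  CREATE TABLE <name> (<field> <domain>, … )
  INSERT INTO <name> (<field names>) VALUES (<field values>)
  DELETE FROM <name> WHERE <condition>
  UPDATE <name> SET <field name> = <value> WHERE <condition>
  SELECT <fields> FROM <name> WHERE <condition>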

Creating Relations in SQL
Creates the Students relation:
  CREATE TABLE Students
    (sid CHAR(20),
     name CHAR(20),
     login CHAR(10),
     age INTEGER,
     gpa FLOAT)
Note: the type (domain) of each field is specified, and enforced by the DBMS
  – whenever tuples are added or modified.
Another example: the Enrolled table holds information about courses students take.
  CREATE TABLE Enrolled
    (sid CHAR(20),
     cid CHAR(20),
     grade CHAR(2))

Adding and Deleting Tuples
Can insert a single tuple using:
  INSERT INTO Students (sid, name, login, age, gpa)
  VALUES ('53688', 'Smith', 'smith@ee', 18, 3.2)
Can delete all tuples satisfying some condition (e.g., name = 'Smith'):
  DELETE FROM Students S
  WHERE S.name = 'Smith'
Powerful variants of these commands are available; more later!
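A minimal sketch of the insert and delete statements, run through Python's built-in sqlite3 module (an assumption; any SQL engine would do). The login value 'smith@ee' is sample data, and the DELETE is written without the alias S because alias support in DELETE varies between engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students ("
    "sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa FLOAT)"
)

# Insert a single tuple: one value for each listed column.
conn.execute(
    "INSERT INTO Students (sid, name, login, age, gpa) "
    "VALUES ('53688', 'Smith', 'smith@ee', 18, 3.2)"
)
count_after_insert = conn.execute("SELECT COUNT(*) FROM Students").fetchone()[0]

# Delete every tuple whose name is 'Smith'.
conn.execute("DELETE FROM Students WHERE name = 'Smith'")
count_after_delete = conn.execute("SELECT COUNT(*) FROM Students").fetchone()[0]

print(count_after_insert, count_after_delete)
```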

Updating Tuples
Can update all tuples, or just those matching some criterion:
  UPDATE Students
  SET sid = sid

  UPDATE Enrolled
  SET grade = 'E'
  WHERE grade = 'F'

Keys
Keys are a way to associate tuples in different relations.
Keys are one form of integrity constraint (IC).
[Figure: the Students and Enrolled tables]

Primary Keys - Definitions
A set of fields is a superkey if:
  – No two distinct tuples can have the same values in all key fields
A set of fields is a key for a relation if:
  – It is a superkey
  – No proper subset of the fields is a superkey
More than one key for a relation?
  – One of the keys is chosen (by the DBA) to be the primary key.
E.g.
  – sid is a key for Students.
  – What about name?
  – The set {sid, gpa} is a superkey.
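The two definitions can be turned into a small check over one table instance. This is a sketch under stated assumptions: the function names and the sample rows are hypothetical, and the check only says whether a given instance is consistent with a field set being a (super)key. It cannot prove the constraint holds for all instances.

```python
from itertools import combinations

def is_superkey_in(rows, fields):
    """True if no two distinct rows of this instance agree on all `fields`."""
    distinct_rows = {tuple(sorted(row.items())) for row in rows}
    projections = {tuple(dict(row)[f] for f in fields) for row in distinct_rows}
    return len(projections) == len(distinct_rows)

def is_key_in(rows, fields):
    """True if `fields` is a superkey here and no proper subset is one."""
    if not is_superkey_in(rows, fields):
        return False
    for size in range(len(fields)):
        for subset in combinations(fields, size):
            if is_superkey_in(rows, subset):
                return False
    return True

# Illustrative sample instance (not from the slides).
students = [
    {"sid": "53666", "name": "Jones", "gpa": 3.4},
    {"sid": "53688", "name": "Smith", "gpa": 3.2},
    {"sid": "53650", "name": "Smith", "gpa": 3.8},
]

sid_is_key = is_key_in(students, ["sid"])
name_is_superkey = is_superkey_in(students, ["name"])
sid_gpa_is_superkey = is_superkey_in(students, ["sid", "gpa"])
sid_gpa_is_key = is_key_in(students, ["sid", "gpa"])
gpa_looks_like_key = is_key_in(students, ["gpa"])
```

In this sample, gpa happens to be unique, so it passes the check even though nobody would declare it a key; a key is a statement about every legal instance, not about one table.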

Primary and Candidate Keys in SQL
Possibly many candidate keys (specified using UNIQUE), one of which is chosen as the primary key.
"For a given student and course, there is a single grade."
  CREATE TABLE Enrolled
    (sid CHAR(20),
     cid CHAR(20),
     grade CHAR(2),
     PRIMARY KEY (sid, cid))
vs. "Students can take only one course, and receive a single grade for that course; further, no two students in a course receive the same grade."
  CREATE TABLE Enrolled
    (sid CHAR(20),
     cid CHAR(20),
     grade CHAR(2),
     PRIMARY KEY (sid),
     UNIQUE (cid, grade))
Used carelessly, an IC can prevent storage of database instances that should be permitted!
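A short sketch of what PRIMARY KEY (sid, cid) permits and rejects, using Python's built-in sqlite3 module (an assumed engine) and made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Enrolled ("
    "sid CHAR(20), cid CHAR(20), grade CHAR(2), "
    "PRIMARY KEY (sid, cid))"
)
conn.execute("INSERT INTO Enrolled VALUES ('53666', 'History105', 'B')")
# Same student, different course: allowed under PRIMARY KEY (sid, cid).
conn.execute("INSERT INTO Enrolled VALUES ('53666', 'Topology112', 'A')")

try:
    # Same (sid, cid) pair with a second grade: violates the key.
    conn.execute("INSERT INTO Enrolled VALUES ('53666', 'History105', 'A')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Enrolled").fetchone()[0]
print(duplicate_rejected, row_count)
```

With PRIMARY KEY (sid) instead, the second insert would already fail, which is the overly strict case that the slide warns about.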

Foreign Keys
A Foreign Key is a field whose values are keys in another relation.
[Figure: the Students and Enrolled tables]

Foreign Keys, Referential Integrity
Foreign key: set of fields in one relation used to "refer" to tuples in another relation.
  – Must correspond to the primary key of the second relation.
  – Like a "logical pointer".
E.g. sid in Enrolled is a foreign key referring to Students:
  – Enrolled(sid: string, cid: string, grade: string)
  – If all foreign key constraints are enforced, referential integrity is achieved (i.e., no dangling references).

Foreign Keys in SQL
Only students listed in the Students relation should be allowed to enroll for courses.
  CREATE TABLE Enrolled
    (sid CHAR(20),
     cid CHAR(20),
     grade CHAR(2),
     PRIMARY KEY (sid, cid),
     FOREIGN KEY (sid) REFERENCES Students)
[Figure: the Enrolled and Students tables]
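The foreign key declaration can be tried out with Python's built-in sqlite3 module. Two assumptions in this sketch: Students is given a PRIMARY KEY on sid so that REFERENCES Students has a key to point at, and SQLite's foreign_keys pragma is switched on because SQLite does not enforce foreign keys by default. The rows are sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE Students (sid CHAR(20) PRIMARY KEY, name CHAR(20))")
conn.execute(
    "CREATE TABLE Enrolled ("
    "sid CHAR(20), cid CHAR(20), grade CHAR(2), "
    "PRIMARY KEY (sid, cid), "
    "FOREIGN KEY (sid) REFERENCES Students)"
)
conn.execute("INSERT INTO Students VALUES ('53666', 'Jones')")
# The referenced student exists, so this enrollment is accepted.
conn.execute("INSERT INTO Enrolled VALUES ('53666', 'History105', 'B')")

try:
    # No student with sid '99999': this would be a dangling reference.
    conn.execute("INSERT INTO Enrolled VALUES ('99999', 'History105', 'A')")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True

enrolled_count = conn.execute("SELECT COUNT(*) FROM Enrolled").fetchone()[0]
print(dangling_rejected, enrolled_count)
```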

Integrity Constraints (ICs)
IC: condition that must be true for any instance of the database;
  – e.g., domain constraints.
  – ICs are specified when the schema is defined.
  – ICs are checked when relations are modified.
A legal instance of a relation is one that satisfies all specified ICs.
  – DBMS should not allow illegal instances.
If the DBMS checks ICs, stored data is more faithful to real-world meaning.
  – Avoids data entry errors, too!

Where do ICs Come From?
ICs are based upon the semantics of the real world that is being described in the database relations.
We can check a database instance to see if an IC is violated, but we can NEVER infer that an IC is true by looking at an instance.
  – An IC is a statement about all possible instances!
  – From the example, we know name is not a key, but the assertion that sid is a key is given to us.
Key and foreign key ICs are the most common; more general ICs are supported too.

Enforcing Referential Integrity
Remember Students and Enrolled; sid in Enrolled is a foreign key that references Students.
What should be done if an Enrolled tuple with a non-existent student id is inserted?
  – (Reject it!)
What should be done if a Students tuple is deleted? Options:
  – Also delete all Enrolled tuples that refer to it.
  – Disallow deletion of a Students tuple that is referred to.
  – Set sid in Enrolled tuples that refer to it to a default sid.
  – (In SQL, also: set sid in Enrolled tuples that refer to it to a special value null, denoting "unknown" or "inapplicable".)
Similar if the primary key of a Students tuple is updated.
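Three of the deletion policies map onto SQL's ON DELETE clause. The sketch below tries CASCADE, NO ACTION and SET NULL through Python's built-in sqlite3 module. The helper name enrolled_after_delete, the sample rows, and the omission of Enrolled's primary key (so that sid may become null) are all assumptions made for this illustration.

```python
import sqlite3

def enrolled_after_delete(on_delete):
    """Delete the only student and report what happens to Enrolled."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    conn.execute("CREATE TABLE Students (sid CHAR(20) PRIMARY KEY, name CHAR(20))")
    conn.execute(
        "CREATE TABLE Enrolled ("
        "sid CHAR(20), cid CHAR(20), grade CHAR(2), "
        "FOREIGN KEY (sid) REFERENCES Students ON DELETE " + on_delete + ")"
    )
    conn.execute("INSERT INTO Students VALUES ('53666', 'Jones')")
    conn.execute("INSERT INTO Enrolled VALUES ('53666', 'History105', 'B')")
    try:
        conn.execute("DELETE FROM Students WHERE sid = '53666'")
    except sqlite3.IntegrityError:
        return "delete rejected"
    return conn.execute("SELECT sid, cid FROM Enrolled").fetchall()

cascade_result = enrolled_after_delete("CASCADE")      # referring rows go too
no_action_result = enrolled_after_delete("NO ACTION")  # deletion is disallowed
set_null_result = enrolled_after_delete("SET NULL")    # sid becomes null
```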

Relational Query Languages
A major strength of the relational model: supports simple, powerful querying of data.
Declarative, not procedural
  – Specify what you want, not how to get it
Queries can be written intuitively, and the DBMS is responsible for efficient evaluation.
  – The key: precise semantics for relational queries.
  – Allows the optimizer to extensively re-order operations, and still ensure that the answer does not change.

The SQL Query Language
The most widely used relational query language.
  – Current std is SQL99; SQL92 is a basic subset
To find all 18 year old students, we can write:
  SELECT *
  FROM Students S
  WHERE S.age = 18
To find just names and logins, replace the first line:
  SELECT S.name, S.login
  FROM Students S
  WHERE S.age = 18

Querying Multiple Relations
What does the following query compute?
  SELECT S.name, E.cid
  FROM Students S, Enrolled E
  WHERE S.sid = E.sid AND E.grade = 'A'
[Figure: example instances of Students and Enrolled, and the query result]
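To see what the query computes, it can be run against small sample tables. The sketch uses Python's built-in sqlite3 module, and the rows are illustrative stand-ins for the instances that appeared on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students ("
    "sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa FLOAT)"
)
conn.execute("CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2))")

# Illustrative sample data.
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [
        ("53666", "Jones", "jones@cs", 18, 3.4),
        ("53688", "Smith", "smith@ee", 18, 3.2),
        ("53650", "Smith", "smith@math", 19, 3.8),
    ],
)
conn.executemany(
    "INSERT INTO Enrolled VALUES (?, ?, ?)",
    [
        ("53831", "Carnatic101", "C"),
        ("53831", "Reggae203", "B"),
        ("53650", "Topology112", "A"),
        ("53666", "History105", "B"),
    ],
)

# Names of students paired with the courses in which they earned an 'A'.
result = conn.execute(
    "SELECT S.name, E.cid "
    "FROM Students S, Enrolled E "
    "WHERE S.sid = E.sid AND E.grade = 'A'"
).fetchall()
print(result)
```

With this sample data the only matching pair is the student with sid 53650 and the course Topology112.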

Semantics of a Query
  SELECT S.name, E.cid
  FROM Students S, Enrolled E
  WHERE S.sid = E.sid AND E.grade = 'A'
A conceptual evaluation method for the query:
  1. FROM clause: compute the cross-product of Students and Enrolled
  2. WHERE clause: check conditions, discard tuples that fail
  3. SELECT clause: delete unwanted fields
Remember, this is conceptual.
  – Actual evaluation will be much more efficient,
  – (but must produce the same answers)
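The three conceptual steps can be written out directly in plain Python. This is a sketch of the conceptual method only, not of how a real DBMS evaluates the query, and the sample rows are made up for the illustration.

```python
from itertools import product

# Illustrative sample instances.
students = [
    {"sid": "53666", "name": "Jones"},
    {"sid": "53688", "name": "Smith"},
    {"sid": "53650", "name": "Smith"},
]
enrolled = [
    {"sid": "53831", "cid": "Carnatic101", "grade": "C"},
    {"sid": "53831", "cid": "Reggae203", "grade": "B"},
    {"sid": "53650", "cid": "Topology112", "grade": "A"},
    {"sid": "53666", "cid": "History105", "grade": "B"},
]

# 1. FROM: cross-product of Students and Enrolled (every pairing of rows).
cross = list(product(students, enrolled))

# 2. WHERE: keep only the pairs that satisfy both conditions.
kept = [(s, e) for s, e in cross if s["sid"] == e["sid"] and e["grade"] == "A"]

# 3. SELECT: drop every field except S.name and E.cid.
answer = [(s["name"], e["cid"]) for s, e in kept]

print(len(cross), answer)
```

The cross-product has 3 x 4 = 12 pairs here, which hints at why a real system avoids materializing it.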

Cross-product of Students and Enrolled Instances

Relational Model: Summary
A tabular representation of data.
Simple and intuitive, currently the most widely used
  – Object-relational variant gaining ground
  – XML support being added
Integrity constraints can be specified by the DBA, based on application semantics. DBMS checks for violations.
  – Two important ICs: primary and foreign keys
  – In addition, we always have domain constraints.
Powerful and natural query languages exist.