Levels of Abstraction in DBMS data Many views, single conceptual (logical) schema and physical schema. – Views describe how users see the data (possibly.

Slides:



Advertisements
Similar presentations
Relational Database. Relational database: a set of relations Relation: made up of 2 parts: − Schema : specifies the name of relations, plus name and type.
Advertisements

Database Management Systems, R. Ramakrishnan and J. Gehrke1 The Relational Model Chapter 3.
Database Management Systems, R. Ramakrishnan and J. Gehrke1 The Entity-Relationship Model Chapter 2.
SQL Lecture 10 Inst: Haya Sammaneh. Example Instance of Students Relation  Cardinality = 3, degree = 5, all rows distinct.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke. Edited by Keith Shomper, The Relational Model Chapter 3.
CSC 411/511: DBMS Design 1 1 Dr. Nan WangCSC411_L3_Relational Model 1 The Relational Model (Chapter 3)
The Relational Model CMU SCS /615 C. Faloutsos – A. Pavlo Lecture #3 R & G, Chap. 3.
Database Management Systems 1 Raghu Ramakrishnan The Relational Model Chapter 3 Instructor: Mirsad Hadzikadic.
The Relational Model Class 2 Book Chapter 3 Relational Data Model Relational Query Language (DDL + DML) Integrity Constraints (IC) (From ER to Relational)
CMPT 354, Simon Fraser University, Fall 2008, Martin Ester 28 Database Systems I The Relational Data Model.
D ATABASE S YSTEMS I T HE R ELATIONAL D ATA M ODEL.
SPRING 2004CENG 3521 The Relational Model Chapter 3.
The Relational Model Ramakrishnan & Gehrke, Chap. 3.
Modeling Your Data Chapter 2. Overview of Database Design Conceptual design: –What are the entities and relationships in the enterprise? – What information.
The Relational Model 198:541 Rutgers University. Why Study the Relational Model?  Most widely used model. Vendors: IBM, Informix, Microsoft, Oracle,
1 Relational Model. 2 Relational Database: Definitions  Relational database: a set of relations  Relation: made up of 2 parts: – Instance : a table,
The Relational Model Lecture 3 Book Chapter 3 Relational Data Model Relational Query Language (DDL + DML) Integrity Constraints (IC) From ER to Relational.
1 Data Modeling Yanlei Diao UMass Amherst Feb 1, 2007 Slides Courtesy of R. Ramakrishnan and J. Gehrke.
1 Data Modeling Yanlei Diao UMass Amherst Feb 1, 2007 Slides Courtesy of R. Ramakrishnan and J. Gehrke.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 The Relational Model Chapter 3.
1 The Relational Model Chapter 3. 2 Objectives  Representing data using the relational model.  Expressing integrity constraints on data.  Creating,
The Relational Model These slides are based on the slides of your text book.
The Relational Model Chapter 3
Relational Data Model, R. Ramakrishnan and J. Gehrke with Dr. Eick’s additions 1 The Relational Model Chapter 3.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 The Relational Model Chapter 3.
The Relational Model. Review Why use a DBMS? OS provides RAM and disk.
1 The Relational Model Chapter 3. 2 Why Study the Relational Model?  Most widely used model.  Vendors: IBM, Informix, Microsoft, Oracle, Sybase, etc.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 The Relational Model Chapter 3 Modified by Donghui Zhang.
1 The Relational Model Chapter 3. 2 Why Study the Relational Model?  Most widely used model  Vendors: IBM, Informix, Microsoft, Oracle, Sybase  Recent.
 Relational database: a set of relations.  Relation: made up of 2 parts: › Instance : a table, with rows and columns. #rows = cardinality, #fields =
1 The Relational Model Chapter 3. 2 Why Study the Relational Model?  Most widely used model.  Vendors: IBM, Informix, Microsoft, Oracle, Sybase, etc.
1 The Relational Model Chapter 3. 2 Why Study the Relational Model?  Most widely used model.  Vendors: IBM, Informix, Microsoft, Oracle, Sybase, etc.
1 The Relational Model. 2 Why Study the Relational Model? v Most widely used model. – Vendors: IBM, Informix, Microsoft, Oracle, Sybase, etc. v “Legacy.
FALL 2004CENG 351 File Structures and Data Management1 Relational Model Chapter 3.
1.1 CAS CS 460/660 Relational Model. 1.2 Review E/R Model: Entities, relationships, attributes Cardinalities: 1:1, 1:n, m:1, m:n Keys: superkeys, candidate.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 The Relational Model Chapter 3.
Database Management Systems, R. Ramakrishnan and J. Gehrke1 The Relational Model Chapter 3.
The Relational Model Content based on Chapter 3 Database Management Systems, (Third Edition), by Raghu Ramakrishnan and Johannes Gehrke. McGraw Hill, 2003.
CMPT 258 Database Systems The Relationship Model PartII (Chapter 3)
Lecture 3 Book Chapter 3 (part 2 ) From ER to Relational.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Database Management Systems Chapter 3 The Relational Model.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 The Relational Model Chapter 3.
The Relational Model Jianlin Feng School of Software SUN YAT-SEN UNIVERSITY.
ICS 421 Spring 2010 Relational Model & Normal Forms Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa 1/19/20101Lipyeow.
CS34311 The Relational Model. cs34312 Why Relational Model? Currently the most widely used Vendors: Oracle, Microsoft, IBM Older models still used IBM’s.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 The Relational Model Chapter 3.
1 The Relational Model Chapter 3. 2 Why Study the Relational Model?  Most widely used model.  Vendors: IBM, Informix, Microsoft, Oracle, Sybase, etc.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 The Relational Model Chapter 3.
Chapter 3 The Relational Model. Why Study the Relational Model? Most widely used model. Vendors: IBM, Informix, Microsoft, Oracle, Sybase, etc. “Legacy.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 The Relational Model Chapter 3.
1 CS122A: Introduction to Data Management Lecture #5 (E-R  Relational, Cont.) Instructor: Chen Li.
1 CS122A: Introduction to Data Management Lecture #4 (E-R  Relational Translation) Instructor: Chen Li.
Database Management Systems 1 Raghu Ramakrishnan The Relational Model Chapter 3 Instructor: Xin Zhang.
1 The Relational Model Chapter 3. 2 Why Study the Relational Model?  Most widely used model.  Multi-billion dollar industry, $15+ bill in  Vendors:
CENG 351 File Structures and Data Management1 Relational Model Chapter 3.
COP Introduction to Database Structures
These slides are based on the slides of your text book.
The Relational Model Lecture 3 INFS614, Fall
The Relational Model Chapter 3
The Relational Model Content based on Chapter 3
The Relational Model Chapter 3
The Relational Model Relational Data Model
The Relational Model Chapter 3
The Relational Model The slides for this text are organized into chapters. This lecture covers Chapter 3. Chapter 1: Introduction to Database Systems Chapter.
More on ER Model and the Relationship Model
The Relational Model Chapter 3
The Relational Model Content based on Chapter 3
The Relational Model Content based on Chapter 3
Presentation transcript:

Levels of Abstraction in DBMS data Many views, single conceptual (logical) schema and physical schema. – Views describe how users see the data (possibly different data models for different views). – Conceptual schema defines logical structure of entire data enterprise – Physical schema describes the underlying files and indexes used. – Called ANSI schema model * Schemas are defined using DDL; data is modified/queried using DML. Physical Schema Conceptual Schema View 1View 2View 3

Structure of a DBMS A typical DBMS has a layered architecture. The figure does not show the concurrency control and recovery components. This is one of several possible architectures; each system has its own variations. Query Optimization and Execution Relational Operators Files and Access Methods Buffer Management Disk Space Management DB These layers must consider concurrency control and recovery

Overview of Database Design Conceptual design: (ER Model is used at this stage.) – What are the entities and relationships in the enterprise? – What information about these entities and relationships should we store in the database? – What integrity constraints or business rules hold? – A database `schema’ in the ER Model can be represented pictorially (ER diagrams). – Then we can map an ER diagram into a relational schema.

ER Model Basics Entity: Real-world object distinguishable from other objects. An entity is described (in DB) using a set of attributes. Entity Set: A collection of similar entities. E.g., all employees. – All entities in an entity set have the same set of attributes. (except when we consider ISA hierarchies, anyway!) – Each entity set has a key.(the chosen identifier attribute(s) ) – Each attribute has a domain.(allowable value universe) Employees ssn name lot

ER Model Basics (Contd.) Relationship: Association among two or more entities. E.g., Jones works in Pharmacy department. lot dname budget did since name Works_In DepartmentsEmployees ssn Reports_To lot name Employees subor- dinate super- visor ssn Degree=2 relationship between entities, Employees and Departments Degree=2 relationship between entities,Employees and Employees. Must specify the “role” of each entity to distinguish them.

Key Constraints (many-to-many) Consider Works_In: An employee can work in many depts; a dept can have many employees. (1-many) In contrast, it may be required that each dept have at most one manager. Many-to-Many 1-to-11-to ManyMany-to-1 dname budgetdid since lot name ssn Manages Employees Departments lot dname budget did since name Works_In DepartmentsEmployees ssn

Participation Constraints Does every department have to have a manager? – If so, this is a participation constraint: the participation of Departments in Manages is said to be total (vs. partial). Every did value in Departments table must appear in a row of the Manages table (with a non-null ssn value!) lot name dname budgetdid since name dname budgetdid since Manages since Departments Employees ssn Works_In

ISA (`is a’) Hierarchies Contract_Emps name ssn Employees lot hourly_wages ISA Hourly_Emps contractid hours_worked * Attributes are inherited. * If we declare A ISA B, every A entity is also considered to be a B entity (e.g., every Hourly_Emps and every Contract_Emps ISA Employees) Overlap constraints: Can Joe be an Hourly_Emps as well as a Contract_Emps entity? (Allowed/disallowed) Covering constraints: Does every Employees entity also have to be an Hourly_Emps or a Contract_Emps entity? (Yes/no) Reasons for using ISA : – To add descriptive attributes specific to a subclass (e.g., Hourly_Emps have hourly_wages but Contract_Emps don’t)

Conceptual Design Using the ER Model Design choices: – Should a concept be modeled as an entity or an attribute? – Should a concept be modeled as an entity or a relationship?

Why Study the Relational Model? Most widely used model. – Vendors: IBM, Informix, Microsoft, Oracle, Sybase, etc. A competitor: object-oriented model – ObjectStore, Versant, Ontos – A synthesis emerging: object-relational model Informix Universal Server, UniSQL, O2, Oracle, DB2 Really just a more flexible relational model

Relational Database: Working Definitions Relational database: a set of relations Relation: made up of 2 parts: – Instance or occurrence : a table, with rows and columns. #Rows = cardinality, #fields = degree / arity. – Schema or type : specifies name of relation, plus name and type of each column/attribute. E.G. Students(sid: string, name: string, login: string, age: integer, gpa: real). Strictly speaking, a relation is a set of tuples but it is commonplace to think if it a table (sequence of rows made up of a sequence of attribute values)

Relational Query Languages A major strength of the relational model: supports simple, powerful querying of data. Queries can be written intuitively (what, not how), and the DBMS is responsible for efficient evaluation. – Allows the optimizer to extensively re-order operations, and still ensure that the answer does not change.

The SQL Query Language Developed by IBM (system R) in the 1970s Need for a standard since it is used by many vendors Standards: – SQL-86 – SQL-89 (minor revision) – SQL-92 (major revision, current standard) – SQL-99 (major extensions) – Procedural constructs (if-then-else, loops, procs) – OO constructs (inheritance, polymorphism,…)

A look at the SQL Query Language One of the simplest languages on earth (compared to C++ or JAVA or…) Find all 18 year old students, we can write: SELECT * FROM Students S WHERE S.age=18 To find just names and logins, replace the first line: SELECT S.name, S.login

Querying Multiple Relations What does the following query produce? SELECT S.name, E.cid FROM Students S, Enrolled E WHERE S.sid=E.sid AND E.grade=“A” we get:

Creating Relations in SQL Creates the Students relation. Observe that the type (domain) of each field is specified, and enforced by the DBMS whenever tuples are added or modified. As another example, the Enrolled table holds information about courses that students take. CREATE TABLE Students (sid: CHAR(20), name: CHAR(20), login: CHAR(10), age: INTEGER, gpa: REAL ) CREATE TABLE Enrolled (sid: CHAR(20), cid: CHAR(20), grade: CHAR (2))

Destroying and Altering Relations Destroys the relation Students. The schema information and the tuples are deleted. DROP TABLE Students v The schema of Students is altered by adding a new field; every tuple in the current instance is extended with a null value in the new field. ALTER TABLE Students ADD COLUMN Year: integer

Adding and Deleting Tuples Can insert a single tuple using: INSERT INTO Students (sid, name, login, age, gpa) VALUES (53688, ‘Smith’, 18, 3.2) v Can delete all tuples satisfying some condition (e.g., name = Smith): DELETE FROM Students S WHERE S.name = ‘Smith’ * Powerful variants of these commands are available!

Integrity Constraints (ICs) IC: condition that must be true for any instance of the database; e.g., domain constraints. – ICs are specified when schema is defined. – ICs are checked when relations are modified. A legal instance of a relation is one that satisfies all specified ICs. – DBMS should not allow illegal instances. If the DBMS checks ICs, stored data is more faithful to real-world meaning. – Avoids data entry errors, too!

Primary Key Constraints A set of fields is a key (strictly: candidate key) for a relation if : 1. (Uniqueness) No two distinct tuples can have same values in the key field (all key fields if composite), and 2. (Minimality) This is not true for any subset of a composite key. – If Part 2 is false, it’s called a superkey (superset of a key) – There’s always >1 key for a relation, one of the keys is chosen (by DBA) to be the primary key. E.g., sid is a key for Students. The set {sid, gpa} is a superkey.

Entity integrity No column of the primary key can contain a null value.

Foreign Keys, Referential Integrity Foreign key : Set of fields in one relation that is used to `refer’ to a tuple in another relation. (Must correspond to primary key of the second relation.) Like a `logical pointer’. E.g. sid is a foreign key referring to Students in Enrolled(sid: string, cid: string, grade: string) – If all foreign key constraints are enforced, referential integrity is achieved, i.e., no dangling references. – That is, an Enrolled record cannot have an sid that is not present in Students

Foreign Keys in SQL Only students listed in the Students relation should be allowed to enroll for courses. CREATE TABLE Enrolled (sid CHAR (20), cid CHAR(20), grade CHAR (2), PRIMARY KEY (sid,cid), FOREIGN KEY (sid) REFERENCES Students ) Enrolled Students

Enforcing Referential Integrity Consider Students and Enrolled; sid in Enrolled is a foreign key that references Students. What should be done if an Enrolled tuple with a non-existent student id is inserted? (Reject it!) What should be done if a Students tuple is deleted? – Also delete all Enrolled tuples that refer to it. – Disallow deletion of a Students tuple that is referred to. – Set sid in Enrolled tuples that refer to it to a default sid. – (In SQL, also: Set sid in Enrolled tuples that refer to it to a special value null, denoting `unknown’ or `inapplicable’.) Similar if primary key of Students tuple is updated.

Referential Integrity in SQL/92 SQL/92 supports all 4 options on deletes and updates. – Default is NO ACTION (delete/update is rejected) – CASCADE (also delete all tuples that refer to deleted tuple) – SET NULL / SET DEFAULT (sets foreign key value of referencing tuple) CREATE TABLE Enrolled (sid CHAR (20), cid CHAR(20), grade CHAR (2), PRIMARY KEY (sid,cid), FOREIGN KEY (sid) REFERENCES Students ON DELETE CASCADE ON UPDATE SET DEFAULT )

Where do ICs Come From? ICs are based upon the semantics of the real-world enterprise that is being described in the database relations (I.e., users decide, not DB experts!). We can check a database instance to see if an IC is violated, but we can NEVER infer that an IC is true by looking at the instances. – An IC is a statement about all possible instances! It is not a statement that can be inferred from the set of existing instances. Key and foreign key ICs are the most common; more general ICs supported too.

Views A view is just a relation, but we store a definition, rather than a set of tuples. CREATE VIEW YoungActiveStudents (name, grade) AS SELECT S.name, E.grade FROM Students S, Enrolled E WHERE S.sid = E.sid and S.age<21 v Views can be dropped using the DROP VIEW command. u How to handle DROP TABLE if there’s a view on the table?  DROP TABLE command has options to let user specify this. Views can be used to present necessary information (or a summary), while hiding details in underlying relation(s). – Given YoungStudents, but not Students or Enrolled, we can find students s who have are enrolled, but not the cid’s of the courses they are enrolled in.

Who decides what the primary key is? (and other design choices?) The Database design expert? NO! - not in isolation anyway. Someone from the enterprise who understands the data and the procedures should be consulted. The following story illustrates this point. CAST: Mr. Goodwrench = MG (parts manager); Database Expert = DE DE: I've looked at your data, and I have decided Part Number (P#) will be designated the primary key for the relation, PARTS(P#, COLOR, WT, TIME-OF-ARRIVAL). MG: You're the expert, but I think we should use the weight (WT)! DE: Well, according to textbooks P# should be the primary key, because it is the main lookup attribute!... later

MG: Why is the system so slow? DE: Do store parts in the stock room ordered by P#? MG: No. We store by weight. When a shipment comes in, I take each part into the back room and throw it as far as I can. The lighter ones go further than the heavy ones so they get ordered by weight! DE: But weight doesn't have Uniqueness property! Parts with the same weight end up together in a pile! MG: No they don't. I tire quickly, so the first one goes furthest, etc. DE: Then use composite primary key, (weight, TIME-OF-ARRIVAL). MG: OK. The point: This conversation should have taken place during the 1 st meeting.