Presentation is loading. Please wait.

Presentation is loading. Please wait.

H.Lu/HKUST L1: Course Overview & Review. L01: RDBMS REVIEW -- 2 H.Lu/HKUST The Teaching Staff  Instructor: Lu Hongjun  Office: 3543 (Lift 25-26), HKUST.

Similar presentations


Presentation on theme: "H.Lu/HKUST L1: Course Overview & Review. L01: RDBMS REVIEW -- 2 H.Lu/HKUST The Teaching Staff  Instructor: Lu Hongjun  Office: 3543 (Lift 25-26), HKUST."— Presentation transcript:

1 H.Lu/HKUST L1: Course Overview & Review

2 L01: RDBMS REVIEW -- 2 H.Lu/HKUST The Teaching Staff  Instructor: Lu Hongjun  Office: 3543 (Lift 25-26), HKUST  E-Mail: luhj@cs.ust.hkluhj@cs.ust.hk  URL: http://www.cs.ust.hk/~luhj  Research Interests: Data/Knowledge base management with emphasis on query processing and optimization Data warehousing and data mining Applied performance evaluation Database application development Parallel and distributed database systems  TA:  NameJiang Haifeng Liu Guimei  Office: 4212 (DB Lab) HKUST  E-Mail: jianghf@cs.ust.hk cslgm@cs.ust.hkjianghf@cs.ust.hk  URL: http://ihome.ust.hk/~jianghf http://ihome.ust.hk/~cslgmhttp://ihome.ust.hk/~jhttp://ihome.ust.hk/~c

3 L01: RDBMS REVIEW -- 3 H.Lu/HKUST References  R.Ramakrishnan & J. Gehrke. Database Management Systems, 3rd Ed. McGraw Hill, 2000  D. Shasha & P. Bonnet. Database Tuning: Principles, Experiments, and Troubleshooting Techniques, Revised edition, Morgan Kaufmann, 2002  Related papers

4 L01: RDBMS REVIEW -- 4 H.Lu/HKUST Course Contents  Part I: Issues in database administration  Database design  Principles of database performance tuning  Database security  Part II: Emerging DB-related technology  OLAP and data warehouse  XML data management  Data stream processing  Course Web Page: http://course.cs.ust.hk/comp334/http://course.cs.ust.hk/comp334/

5 L01: RDBMS REVIEW -- 5 H.Lu/HKUST Grading  Written assignment (20%)  Exams (25%)  Course project (50 %)  Class participation (5%)

6 L01: RDBMS REVIEW -- 6 H.Lu/HKUST Course Project Requirements  Carried in teams of two or four  Database related projects  You propose your own project, and get approve from the instructor  Topic: database related  The amount of work : it accounts for 50% of your final grade  Required documents (double-spaced)  Project proposal (1-2 pages) due date: 23-24/02  Status report (4-6 pages) due date: 28-29/03  Final report (8-10 pages) due date: 10-11/05

7 L01: RDBMS REVIEW -- 7 H.Lu/HKUST Summary  It is a graduate level course  Not a DBA course  Not an introductory database course  Not a programming course, but you need to know how to write programs  Hopefully, you will leave with  A good grade  A good understanding of studied topics

8 L01: RDBMS REVIEW -- 8 H.Lu/HKUST Review -- RDBMS  Relational database systems  The basic concepts in database systems  Relational data model  Relational languages  Database design  Previous course: conceptual and logic design  This course: physical database design  Database management systems  The basic components of DBMS  Storage management  Transaction management  Query processing & optimization

9 L01: RDBMS REVIEW -- 9 H.Lu/HKUST What Is Database & DBMS?  Database: a very large, integrated, persistent collection of data.  Models real-world enterprise. Entities (e.g., students, courses) Relationships (e.g., James is taking CSIT530)  A Database Management System (DBMS) is a software package designed to store and manage databases.

10 L01: RDBMS REVIEW -- 10 H.Lu/HKUST Data Models  A data model is a collection of concepts for describing  data and related operations,  semantics of data,  relationship among data, and  constraints on data  Two types of data models  Conceptual models: emphasize semantics of data Entity-Relationship model, Object-Oriented model  Logical models: ways how the data is organized in the logical level Hierarchical model, Network model, Relational model

11 L01: RDBMS REVIEW -- 11 H.Lu/HKUST Instances and Schemas  A schema is a description of a particular collection of data, using a given data model - the logical structure of the database (e.g., set of customers and accounts and the relationship between them)  Schema Instance - the actual content of the database at a particular point in time  Similar to types and variables in programming languages

12 L01: RDBMS REVIEW -- 12 H.Lu/HKUST Levels of Abstraction  ANSI-SPARC three-level architecture  Many views, single conceptual (logical) schema and physical schema.  Views describe how users see the data.  Conceptual schema defines logical structure  Physical schema describes the files and indexes used. View Conceptual Schema Physical Schema View

13 L01: RDBMS REVIEW -- 13 H.Lu/HKUST Data Independence  Applications insulated from how data is structured and stored.  Ability to modify a schema definition in one level without affecting a schema definition in the next higher level.  The interfaces between the various levels and components should be well defined so that changes in some parts do not seriously influence others.  Logical data independence: Protection from changes in logical structure of data.  Physical data independence: Protection from changes in physical structure of data.

14 L01: RDBMS REVIEW -- 14 H.Lu/HKUST Database Environment Procedures And standards Data DBMS Hardware Application Programs System administrator Database Administrator Analysts & Programmers Database Designer designs manages designs write use Specifies & enforces End Users

15 L01: RDBMS REVIEW -- 15 H.Lu/HKUST DBMS Related Languages  Data Definition Language (DDL)  Specification notation for defining the database schema  Data storage and definition language - special type of DDL in which the storage structure and access methods used by the database system are specified  Data Manipulation Language (DML)  Language for accessing and manipulation the data organized by the appropriate data model  Two classes of languages Procedural - user specifies what data is required and how to get those data. Nonprocedural - user specifies what data is required without specifying how to get those data

16 L01: RDBMS REVIEW -- 16 H.Lu/HKUST DBMS Related Languages Programming Language for DBMS Applications Host Language Data Sublanguage DDL DML Query Language Procedural Non-Procedural

17 L01: RDBMS REVIEW -- 17 H.Lu/HKUST Evolution of Database Technology  1960s: Hierarchical (IMS) & network (CODASYL) DBMS.  1970s: Relational data model, relational DBMS implementation.  1980: RDBMS rules the earth  1985-: Advanced data models (extended-relational, OO, deductive, etc.) Application-oriented DBMS (spatial, scientific, engineering, etc.).  1990s: ORDB, OLAP, Data mining, data warehousing, multimedia databases, and network databases.

18 L01: RDBMS REVIEW -- 18 H.Lu/HKUST What is an RDBMS  A piece of software that manages data based on the relational model  Relational data, SQL queries  Commercial products  Oracle, IBM DB2, IBM Informix, Sybase, Microsoft SQL Server  Each has ~10 million lines of C/C++ code  Smaller packages – MySQL, PostgresSQL

19 L01: RDBMS REVIEW -- 19 H.Lu/HKUST Relational Data Model  Main concept: relation  A table with rows and columns  Every relation has a schema  Description of the columns, or fields  Relational data – rows in a table  No order among the rows in a table  The most widely used data model!

20 L01: RDBMS REVIEW -- 20 H.Lu/HKUST University Database  Conceptual schema:  Students (sid: string, name: string, login: string, age: integer, gpa:real) Cardinality = 3, degree = 5, all rows distinct  Courses (cid: string, cname:string, credits:integer)  Enrolled (sid:string, cid:string, grade:string)

21 L01: RDBMS REVIEW -- 21 H.Lu/HKUST Relational Languages  Formal languages  Relational algebra  Relational calculus  Commercial language: SQL  DDL (Data Definition Language) Create Table, Create Index, Create View …  DML (Data Manipulation Language) Queries –Select Updates –Insert, Delete, Update

22 L01: RDBMS REVIEW -- 22 H.Lu/HKUST Creating Tables CREATE TABLE Students (sid: CHAR(20), name: CHAR(20), login: CHAR(10), age: INTEGER, gpa: REAL) CREATE TABLE Enrolled (sid: CHAR(20), cid: CHAR(20), grade: CHAR(2))

23 L01: RDBMS REVIEW -- 23 H.Lu/HKUST Primary Key Constraints  A set of fields is a key for a relation if :  1. Any two distinct tuples differ in some fields of the set, and  2. This is not true for any subset of the set.  A superkey: Condition 1 true and 2 false.  E.g., sid is a key for Students. {sid, gpa} is a superkey.  One primary key can be set per relation.

24 L01: RDBMS REVIEW -- 24 H.Lu/HKUST Primary and Candidate Keys CREATE TABLE Students (sid: CHAR(20), name: CHAR(20), login: CHAR(10), age: INTEGER, gpa: REAL, PRIMARY KEY (sid), UNIQUE (login)) CREATE TABLE Enrolled (sid CHAR(20) cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid,cid))

25 L01: RDBMS REVIEW -- 25 H.Lu/HKUST Foreign Key Constraints  Foreign key : a set of fields in a relation  Refers to the primary key of another relation  Referential integrity  No dangling references CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid,cid), FOREIGN KEY (sid) REFERENCES Students ) Enrolled Students

26 L01: RDBMS REVIEW -- 26 H.Lu/HKUST Integrity Constraints (ICs)  IC: condition that must be true for any db instance  Domain constraints  Primary constraints  Foreign key constraints  ICs are specified when a schema is defined.  ICs are checked when relations are modified.  A legal instance of a relation  Satisfies all specified ICs

27 L01: RDBMS REVIEW -- 27 H.Lu/HKUST Adding and Deleting Tuples INSERT INTO Students (sid, name, login, age, gpa) VALUES (53688, ‘Smith’, ‘smith@ee’, 18, 3.2) DELETE FROM Students S WHERE S.name = ‘Smith’

28 L01: RDBMS REVIEW -- 28 H.Lu/HKUST Queries SELECT * FROM Students S WHERE S.sid = 53688

29 L01: RDBMS REVIEW -- 29 H.Lu/HKUST Querying Multiple Tables SELECT S.name, E.cid FROM Students S, Enrolled E WHERE S.sid=E.sid AND E.grade=“A” Enrolled Students

30 L01: RDBMS REVIEW -- 30 H.Lu/HKUST Functional Components of DBMS Statistics Metadata Indexes User data Storage Manager Buffer Management Index/file/record Management Execution Engine Query Processing & Optimization Buffer DDL Compiler Transaction Management Recovery Log Concurrency Control Lock Table Query Plan DDL Command User/ApplicationDatabase Administrator Security Control Storage Management DML Stmt. Query Processing Transaction Manager

31 L01: RDBMS REVIEW -- 31 H.Lu/HKUST Query Optimization  A major strength of RDBMS  SQL queries are declarative  Optimizer figures out how to answer them  Re-order operations  Pick among alternatives of one operation  Ensure that the answer is correct!

32 L01: RDBMS REVIEW -- 32 H.Lu/HKUST Transaction  A key concept in databases  An atomic sequence of actions (read/write)  Brings DB from a consistent state to another  ACID  Atomicity  Consistency  Isolation  Durability

33 L01: RDBMS REVIEW -- 33 H.Lu/HKUST Concurrency Control & Recovery  Concurrency Control  Essential for good DBMS performance  Run several user programs concurrently  Interleave actions of different users  Ensure the correctness Users may think it is a single-user system.  Recovery  Essential for durability of transactions

34 L01: RDBMS REVIEW -- 34 H.Lu/HKUST RDBMS Features  Effective and efficient access  Easier application development  Data independence  Data integrity and security  Concurrent access  Recovery from crashes  Uniform data administration

35 L01: RDBMS REVIEW -- 35 H.Lu/HKUST Summary  DBMS used to maintain, query large datasets.  Benefits include recovery from system crashes, concurrent access, quick application development, data integrity and security.  Levels of abstraction give data independence.  A DBMS typically has a layered architecture.  DBAs hold responsible jobs and are well-paid!  DBMS R&D is one of the broadest, most exciting areas in CS.


Download ppt "H.Lu/HKUST L1: Course Overview & Review. L01: RDBMS REVIEW -- 2 H.Lu/HKUST The Teaching Staff  Instructor: Lu Hongjun  Office: 3543 (Lift 25-26), HKUST."

Similar presentations


Ads by Google