Using and Designing Database Systems


1 Using and Designing Database Systems
Instructor: Gordon Turpin

2 ITEC 3220A Using and Designing Database Systems
Instructor: Gordon Turpin Course Website: Office: CSEB3020

3 Course Objective Examine databases, trends in database management systems and their application in a wide range of organizational areas Provide an overview of database processing, both historical and discussion of recent trends in database management Provide the student with exposure to a range of tools, including a relational DBMS as well as an object-oriented DBMS

4 Textbooks A bundle consisting of
Database Systems Design, Implementation, & Management, Seventh Edition - Peter Rob & Carlos Coronel A Guide to SQL, Seventh Edition - Philip J. Pratt

5 Marking Scheme
Final exam (closed book) - 50%
Midterm (closed book) - 35%
Assignments (3 assignments) - 15%
Lecture notes will be made available through the course website.

6 Tentative Schedule
Week 1 Database concepts and the relational database model (Chapters 1, 2 & 3)
Week 2 Entity relationship model (Chapter 4)
Week 3 Normalization (Chapter 5)
Week 4 SQL (Chapter 7 & A Guide to SQL)
Week 5 SQL + lab (Chapter 7 & A Guide to SQL)
Week 6 Advanced SQL + lab (Chapter 8 & A Guide to SQL)

7 Schedule (Cont’d)
Week 7 Midterm
Week 8 Database design & case study (Chapter 9)
Week 9 Transaction management and concurrency control (Chapter 10)
Week 10 Transaction management and concurrency control (Cont’d) and data warehousing (Chapters 10 & 13)
Week 11 Object-oriented database (Appendix G)
Week 12 TBA and review for final exam

8 Database Systems and Data Models
Introduction Database Systems and Data Models

9 Basic Definition
Data: raw facts; constitute building blocks of information
Information: produced by processing data; reveals meaning of data
Good, timely, relevant information is key to decision making
Good decision making is key to organizational survival
Database: shared, integrated computer structure housing end user data and metadata

10 An Example Converting data to information

11 An Example (Cont’d) Metadata

12 What is a Database Management System (DBMS)
A collection of programs that manages the database structure and controls access to the data stored in the database Possible to share data among multiple applications or users Example: bank and its ATM machines Makes data management more efficient and effective End users have better access to more and better-managed data

13 DBMS Manages Interaction

14 File and File System Terminology
Data: Raw facts
Field: Group of characters with specific meaning
Record: Logically connected fields that describe a person, place, or thing
File: Collection of related records

15 Example

16 Disadvantages of File Processing
Data Dependence Change in file’s data characteristics requires modification of data access programs Lengthy development time Excessive program maintenance Structural Dependence Change in file structure requires modification of related programs

17 Example

18 Disadvantages of File Processing (Cont’d)
Data Redundancy Different and conflicting versions of same data Results of uncontrolled data redundancy Data anomalies Modification Insertion Deletion Data inconsistency Lack of data integrity

19 Solution: Database Approach
Database consists of logically related data stored in a single repository Advantages of database approach Structural and data independence Minimal data redundancy Reduces inconsistency, data anomalies Improves data sharing and data quality Stores data structures, relationships, and access paths

20 Database vs. File Systems

21 Database System Environment
Hardware: all the system's physical devices Software Operating system software DBMS software Application programs and utility software People Procedures Data

22 Database Models Collection of logical constructs used to represent data structure and relationships within the database Conceptual models: logical nature of data representation Implementation models: emphasis on how the data are represented in the database

23 Database Models: Historic Overview
Flat files
Hierarchical – 1970s
Network – 1970s
Relational – 1980s - present
Object-oriented – 1990s - present
Object-relational – 1990s - present
Data warehousing – 1980s - present
Web-enabled – 1990s - present

24 Hierarchical Database Model
Logically represented by an upside down tree Each parent can have many children Each child has only one parent

25 Hierarchical Database Model (Cont’d)
Advantages Conceptual simplicity Database security and integrity Data independence Efficiency Disadvantages Complex implementation Difficult to manage and lack of standards Lacks structural independence Application programming and use complexity Implementation limitations

26 Network Database Model
Each record can have multiple parents Composed of sets Each set has owner record and member record Member may have several owners

27 Network Database Model (Cont’d)
Advantages Conceptual simplicity Handles more relationship types Data access flexibility Promotes database integrity Data independence Conformance to standards Disadvantages System complexity Lack of structural independence

28 Relational Database Model
Perceived by user as a collection of tables for data storage Tables are a series of row/column intersections Tables related by sharing common entity characteristic(s)

29 Relational Database Model (Cont’d)

30 Relational Database Model (Cont’d)
Schema for the table
Text description: AGENT(AGENT_CODE, AGENT_LNAME, AGENT_FNAME, AGENT_INITIAL, AGENT_AREACODE, AGENT_PHONE)
Graphical representation: AGENT [AGENT_CODE | AGENT_LNAME | AGENT_FNAME | AGENT_INITIAL | AGENT_AREACODE | AGENT_PHONE]

31 Relational Database Model (Cont’d)
Advantages Structural independence Improved conceptual simplicity Easier database design, implementation, management, and use Ad hoc query capability with SQL Powerful database management system Disadvantages Substantial hardware and system software overhead Poor design and implementation is made easy May promote “islands of information” problems

32 Object-Oriented Database Model
Objects or abstractions of real-world entities are stored Attributes describe properties Collection of similar objects is a class Methods represent real world actions of classes Classes are organized in a class hierarchy Inheritance is ability of object to inherit attributes and methods of classes above it

33 OO Data Model
Advantages: Adds semantic content; Visual presentation includes semantic content; Database integrity; Both structural and data independence
Disadvantages: Lack of OODM; Complex navigational data access; Steep learning curve; High system overhead slows transactions

34 Costs and Risks of the Database Approach
Up-front costs: Installation Management Cost and Complexity Conversion Costs Ongoing Costs Requires New, Specialized Personnel Need for Explicit Backup and Recovery Organizational Conflict

35 Review Basic concepts: data, information, database, DBMS, file, conceptual model, implementation model, etc Why database and its importance, cost and risk Different database models definition advantage disadvantage

36 The Relational Database Model
Chapter 3 The Relational Database Model

37 In this chapter, you will learn:
Basic components of the relational database model Entities and their attributes Relationships among entities Relational algebra Relationship in relational database Data redundancy

38 Basic Definition
Entities and Attributes
Entity: a person, place, event, or thing about which data is collected
Attributes: characteristics of the entity
Tables: hold related entities or an entity set; also called relations; comprised of rows and columns

39 Table Characteristics
Two-dimensional structure with rows and columns Rows (tuples) represent single entity Columns represent attributes Row/column intersection represents single value Tables must have an attribute to uniquely identify each row Column values all have same data format Each column has range of values called attribute domain Order of the rows and columns is immaterial to the DBMS

40 Example Tables

41 Terminology for Relational Database
Table-Oriented    Set-Oriented    Record-Oriented
Table             Relation        Record type
Row               Tuple           Record
Column            Attribute       Field

42 Key
Consists of one or more attributes that determine other attributes
Primary key (PK) is an attribute (or a combination of attributes) that uniquely identifies any given entity (row)
Key’s role is based on determination: if you know the value of attribute A, you can look up (determine) the value of attribute B
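Determination can be checked mechanically on sample data. The sketch below is only an illustration: the function name `determines` and the student rows are hypothetical, and a check on sample rows can refute a dependency but cannot prove it holds for all future data.

```python
def determines(rows, lhs, rhs):
    """Return True if, in these rows, the lhs attributes determine the rhs attributes."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The same lhs value must always map to the same rhs value
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical sample rows
students = [
    {"STU_NUM": 1, "STU_LNAME": "Smith", "STU_GPA": 3.2},
    {"STU_NUM": 2, "STU_LNAME": "Smith", "STU_GPA": 2.9},
    {"STU_NUM": 3, "STU_LNAME": "Jones", "STU_GPA": 3.2},
]

num_is_key = determines(students, ["STU_NUM"], ["STU_LNAME", "STU_GPA"])
lname_is_key = determines(students, ["STU_LNAME"], ["STU_GPA"])
print(num_is_key, lname_is_key)
```

Here STU_NUM determines the other attributes, while STU_LNAME does not because two students share a last name but have different GPAs.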

43 Keys (Cont’d)
Composite key: Composed of more than one attribute
Key attribute: Any attribute that is part of a key
Superkey: Any key that uniquely identifies each entity
Candidate key: A superkey without redundancies
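The difference between a superkey and a candidate key can be sketched in code. This is a minimal illustration that assumes uniqueness within a small set of hypothetical AGENT rows stands in for true uniqueness; the function names are not from the course material.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """A set of attributes is a superkey if no two rows agree on all of them."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)

def candidate_keys(rows, attributes):
    """Return the minimal superkeys (superkeys without redundant attributes)."""
    found = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            contains_smaller_key = any(set(key) <= set(combo) for key in found)
            if is_superkey(rows, combo) and not contains_smaller_key:
                found.append(combo)
    return found

# Hypothetical sample rows
agents = [
    {"AGENT_CODE": 501, "AGENT_LNAME": "Alby", "AGENT_PHONE": "228-1249"},
    {"AGENT_CODE": 502, "AGENT_LNAME": "Hahn", "AGENT_PHONE": "882-1244"},
    {"AGENT_CODE": 503, "AGENT_LNAME": "Alby", "AGENT_PHONE": "123-5589"},
]

keys = candidate_keys(agents, ["AGENT_CODE", "AGENT_LNAME", "AGENT_PHONE"])
print(keys)
```

(AGENT_CODE, AGENT_LNAME) is a superkey in this data, but it is not a candidate key because AGENT_CODE alone is already unique.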

44 Keys (Cont’d)
Foreign key (FK): An attribute whose values match primary key values in the related table
Referential integrity: FK contains a value that refers to an existing valid tuple (row) in another relation
Secondary key: Key used strictly for data retrieval purposes
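Referential integrity can be tried out with Python's built-in sqlite3 module. The sketch below assumes hypothetical AGENT and CUSTOMER tables and values; note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.execute(
    "CREATE TABLE AGENT (AGENT_CODE INTEGER PRIMARY KEY, AGENT_LNAME TEXT NOT NULL)"
)
conn.execute(
    """CREATE TABLE CUSTOMER (
        CUS_CODE   INTEGER PRIMARY KEY,
        CUS_LNAME  TEXT NOT NULL,
        AGENT_CODE INTEGER REFERENCES AGENT (AGENT_CODE)
    )"""
)
conn.execute("INSERT INTO AGENT VALUES (501, 'Alby')")
conn.execute("INSERT INTO CUSTOMER VALUES (10010, 'Ramas', 501)")   # FK matches a PK value
conn.execute("INSERT INTO CUSTOMER VALUES (10011, 'Dunne', NULL)")  # a null FK is allowed

try:
    # Agent 999 does not exist, so this row would break referential integrity
    conn.execute("INSERT INTO CUSTOMER VALUES (10012, 'Smith', 999)")
    violation_caught = False
except sqlite3.IntegrityError:
    violation_caught = True

customer_count = conn.execute("SELECT COUNT(*) FROM CUSTOMER").fetchone()[0]
print(violation_caught, customer_count)
```

The third insert is rejected, so only the two valid customers remain in the table.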

45 Simple Relational Database

46 Controlled Redundancy
Makes the relational database work Tables within the database share common attributes that enable us to link tables together Multiple occurrences of values in a table are not redundant when they are required to make the relationship work Redundancy is unnecessary duplication of data

47 Integrity Rules

48 Integrity Rules (cont’d)

49 Exercises Table name: TRUCK Table name: BASE Table name: TYPE

50 Exercises (Cont’d)
For each table, identify the primary key and the foreign keys.
Do the tables exhibit entity integrity? Explain.
Do the tables exhibit referential integrity? Explain.
Identify the TRUCK table’s candidate key(s).
For each table, identify a superkey and a secondary key.

52 The Relational Database Model (Cont’d)
Chapter 3 The Relational Database Model (Cont’d)

53 Relational Database Operators
Relational algebra
Defines theoretical way of manipulating table contents using relational operators: SELECT, PROJECT, JOIN, INTERSECT, UNION, DIFFERENCE, PRODUCT, DIVIDE
Use of relational algebra operators on existing tables (relations) produces new relations

54 Relational Algebra Operators (continued)
Union: Combines all rows from two tables, excluding duplicate rows Tables must have the same attribute characteristics Intersect: Yields only the rows that appear in both tables

55 Union

56 Intersect

57 Relational Algebra Operators (continued)
Difference Yields all rows in one table not found in the other table—that is, it subtracts one table from the other
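Union, intersect, and difference map directly onto Python's set operators when each row is a tuple and both tables have the same attributes. The rows below are hypothetical sample data.

```python
# Two hypothetical tables with the same attributes (F_NAME, L_NAME)
table_1 = {("George", "Jones"), ("Jane", "Smith"), ("Peter", "Robinson")}
table_2 = {("Jane", "Smith"), ("Alanso", "Dunne")}

union = table_1 | table_2        # all rows from both tables, duplicates excluded
intersect = table_1 & table_2    # only rows that appear in both tables
difference = table_1 - table_2   # rows in table_1 that are not found in table_2

print(len(union), intersect, difference)
```

Difference is not symmetric: `table_2 - table_1` would yield only the Dunne row.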

58 Venn Diagrams for Traditional Set Operators
Union Intersection Differences

59 Product Yields all possible pairs of rows from two tables

60 Relational Algebra Operators (continued)
Select Yields values for all rows found in a table Can be used to list either all row values or it can yield only those row values that match a specified criterion Yields a horizontal subset of a table Project Yields all values for selected attributes Yields a vertical subset of a table
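A minimal sketch of SELECT and PROJECT over rows stored as dictionaries. The helper names and the product rows are hypothetical.

```python
def select(rows, predicate):
    """SELECT: keep the rows that satisfy the condition (horizontal subset)."""
    return [row for row in rows if predicate(row)]

def project(rows, attributes):
    """PROJECT: keep only the chosen attributes (vertical subset), without duplicate rows."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attributes}
        if projected not in result:
            result.append(projected)
    return result

# Hypothetical sample rows
products = [
    {"P_CODE": "A1", "P_DESCRIPT": "Hammer", "PRICE": 9.95},
    {"P_CODE": "B2", "P_DESCRIPT": "Drill", "PRICE": 38.95},
    {"P_CODE": "C3", "P_DESCRIPT": "Saw", "PRICE": 14.40},
]

cheap = select(products, lambda row: row["PRICE"] < 15)
codes = project(products, ["P_CODE"])
print(cheap)
print(codes)
```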

61 Select

62 Project

63 Relational Algebra Operators (continued)
Join Allows us to combine information from two or more tables Real power behind the relational database, allowing the use of independent tables linked by common attributes

64 Natural Join Process Links tables by selecting rows with common values in common attribute(s) Three-stage process Product creates one table Select yields appropriate rows Project yields single copy of each attribute to eliminate duplicate columns

65 Natural Join (continued)
Final outcome yields table that Does not include unmatched pairs Provides only copies of matches If no match is made between the table rows, the new table does not include the unmatched row
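The three-stage natural join (product, select, project) can be sketched as follows. The function assumes both tables are non-empty lists of dictionaries; the customer and agent rows are hypothetical.

```python
from itertools import product

def natural_join(left, right):
    """Natural join of two non-empty tables on their common attribute(s)."""
    common = [a for a in left[0] if a in right[0]]
    joined = []
    for l_row, r_row in product(left, right):              # stage 1: PRODUCT
        if all(l_row[a] == r_row[a] for a in common):      # stage 2: SELECT matching rows
            joined.append({**l_row, **r_row})              # stage 3: PROJECT one copy of each attribute
    return joined

# Hypothetical sample rows
customers = [
    {"CUS_CODE": 1, "CUS_LNAME": "Walker", "AGENT_CODE": 231},
    {"CUS_CODE": 2, "CUS_LNAME": "Adares", "AGENT_CODE": 125},
    {"CUS_CODE": 3, "CUS_LNAME": "Vanloo", "AGENT_CODE": 421},
]
agents = [
    {"AGENT_CODE": 125, "AGENT_PHONE": "555-0101"},
    {"AGENT_CODE": 231, "AGENT_PHONE": "555-0102"},
    {"AGENT_CODE": 333, "AGENT_PHONE": "555-0103"},
]

joined = natural_join(customers, agents)
print(joined)
```

Customer 3 (agent 421) and agent 333 have no match, so neither appears in the result.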

66 Other Joins
EquiJOIN: Links tables based on an equality condition that compares specified columns of the tables; join criteria must be explicitly defined
Theta JOIN: Compares specified columns of each table using a comparison operator other than equality
Outer JOIN: Matched pairs are retained; unmatched values in the other table are left null; comes in right and left variants
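A left outer join can be sketched in a few lines: every row of the left table is kept, and the right-hand attributes are filled with None (standing in for null) when no match exists. The function name and data are hypothetical, and the right table is assumed to be non-empty.

```python
def left_outer_join(left, right, attr):
    """Keep every left row; pad with None where the right table has no match on attr."""
    right_attrs = [a for a in right[0] if a != attr]
    result = []
    for l_row in left:
        matches = [r_row for r_row in right if r_row[attr] == l_row[attr]]
        if matches:
            result.extend({**l_row, **r_row} for r_row in matches)
        else:
            result.append({**l_row, **{a: None for a in right_attrs}})
    return result

# Hypothetical sample rows
customers = [
    {"CUS_CODE": 1, "AGENT_CODE": 231},
    {"CUS_CODE": 2, "AGENT_CODE": 421},
]
agents = [
    {"AGENT_CODE": 231, "AGENT_PHONE": "555-0102"},
    {"AGENT_CODE": 333, "AGENT_PHONE": "555-0103"},
]

outer = left_outer_join(customers, agents, "AGENT_CODE")
print(outer)
```

A right outer join is the same operation with the two tables swapped.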

67 Divide Requires use of single-column table and two-column table
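DIVIDE is the least intuitive operator, so a small sketch may help. It assumes the two-column table is a set of (value, code) pairs and the single-column table is a set of codes; all names and data are hypothetical.

```python
def divide(binary, unary):
    """Return the first-column values that are paired with every value in unary."""
    candidates = {a for a, _ in binary}
    return {a for a in candidates if all((a, b) in binary for b in unary)}

# Hypothetical two-column table of (LOC, CODE) pairs and single-column table of CODE values
loc_code = {(5, "A"), (5, "B"), (4, "A"), (6, "C"), (3, "B")}
codes = {"A", "B"}

result = divide(loc_code, codes)
print(result)
```

Only location 5 is paired with both code A and code B, so it is the only value in the result.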

68 Summary of Meanings of the Relational Algebra Operators
Select: Extracts rows that satisfy a specified condition Project: Extracts specified columns Product: Builds a table from two tables consisting of all possible combinations of rows, one from each of the two tables Union: Builds a table from all rows appearing in either of two tables Intersect: Builds a table consisting of all rows appearing in both of two specified tables

69 Summary of Meanings of the Relational Algebra Operators (Cont’d)
Join: Extracts rows from a product of two tables such that two input rows contributing to any output row satisfy some specified condition Outer Join: Extracts the matching rows of two tables and the unmatched rows from both tables Divide: Builds a table consisting of all values of one column of a binary table that match all values in a unary table

70 Entity Relationship (E-R) Modeling
Chapter 4 Entity Relationship (E-R) Modeling

71 In this chapter, you will learn:
How relationships between entities are defined and refined, and how such relationships are incorporated into the database design process Key terms: cardinality, connectivity, optional, mandatory, strong relationship, weak relationship, supertype, subtype, etc. How to develop an E-R diagram

72 The Entity Relationship (ER) Model
ER model forms the basis of an ER diagram ERD represents the conceptual database as viewed by end user ERDs depict the ER model’s three main components: Entities Attributes Relationships

73 Entities Refers to the entity set and not to a single entity occurrence Corresponds to a table and not to a row in the relational environment In both the Chen and Crow’s Foot models, an entity is represented by a rectangle containing the entity’s name Entity name, a noun, is usually written in capital letters

74 Attributes Characteristics of entities
Domain is set of possible values Primary keys underlined

75 Examples
Text description: EMPLOYEE (EMPLOYEE_ID, EMPLOYEE_NAME, ADDRESS, DATE-EMPLOYED)
[Figures: the EMPLOYEE entity drawn with its attributes EMPLOYEE_ID, EMPLOYEE_NAME, ADDRESS, and DATE-EMPLOYED]

76 Attributes (Cont’d)
Simple: Cannot be subdivided (age, sex, marital status)
Composite: Can be subdivided into additional attributes (address into street, city, zip)
Single-valued: Can have only a single value (a person has one social security number)
Multi-valued: Can have many values (a person may have several college degrees); in the Chen E-R model, multivalued attributes are shown by a double line connecting the attribute to the entity
Derived: Can be derived with an algorithm (age can be derived from date of birth), as opposed to a stored attribute
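As a small illustration of a derived attribute, age can be computed from a stored date of birth instead of being stored itself. The function name is hypothetical.

```python
from datetime import date

def age_on(date_of_birth, today):
    """Derive age in whole years from a stored date of birth."""
    years = today.year - date_of_birth.year
    # Subtract one year if this year's birthday has not happened yet
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years

day_before_birthday = age_on(date(1990, 6, 15), date(2024, 6, 14))
on_birthday = age_on(date(1990, 6, 15), date(2024, 6, 15))
print(day_before_birthday, on_birthday)
```

Storing only the date of birth avoids having to update a stored age every year.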

77 Attributes (Cont’d) An attribute broken into component parts Address
Post_Code Street_Address City State

78 Attributes (Cont’d) Entity with a multivalued attribute (Skill) and derived attribute (Years_Employed) Years_Employed Employee_ID Date_Employed Skills Address Employee_Name EMPLOYEE

79 How to Deal with Multivalued Attributes
Within the original entity, create several new attributes, one for each of the original multivalued attribute’s components.
Create a new entity composed of the original multivalued attribute’s components.

80 An Example Mod_code Car_Year Car_Vin Car_Color CAR

81 Relationships Association between entities
Connected entities are called participants Operate in both directions Connectivity describes relationship classification 1:1, 1:M, M:N Cardinality Expresses number of entity occurrences associated with one occurrence of related entity

82 ERD Symbols
Rectangles represent entities
Diamonds represent the relationship(s) between the entities
“1” side of relationship: number 1 in the Chen model; bar crossing the line in the Crow’s Foot model
“Many” side of relationship: letters “M” and “N” in the Chen model; three-pronged “crow’s foot” in the Crow’s Foot model

83 Connectivity and Cardinality in an ERD

84 Relationship Strength
Existence dependence: Entity’s existence depends on existence of related entities
Existence-independent entities can exist apart from related entities
Example: EMPLOYEE claims DEPENDENT
Weak (non-identifying): One entity is existence-independent of another; PK of related entity doesn’t contain PK component of parent entity
Strong (identifying): One entity is existence-dependent on another; PK of related entity contains PK component of parent entity

85 Weak Entity Existence-dependent on another entity
Has primary key that is partially or totally derived from parent entity

86 Relationship Participation
Optional Entity occurrence does not require a corresponding occurrence in related entity Shown by drawing a small circle on side of optional entity on ERD Mandatory Entity occurrence requires corresponding occurrence in related entity If no optionality symbol is shown on ERD, it is mandatory

87 Relationship Degree Indicates number of associated entities Unary
Single entity Exists between occurrences of same entity set Binary Two entities associated Most common To simplify the conceptual design, most higher-order relationships are decomposed into appropriate equivalent relationships when possible Ternary Three entities associated

88 Three Types of Relationships

89 Recursive Relationship
Definition: A relationship can exist between occurrences of the same entity set. PERSON is married to 1 EMPLOYEE manages 1 M

90 Composite Entities Also known as bridge entities
Composed of the primary keys of each of the entities to be connected May also contain additional attributes that play no role in the connective process

91 A Composite Entity in an ERD

92 Example M:N Relationship

93 Converting M:N Relationship to Two 1:M Relationships (Cont’d)

94 An Example
[Figure: ERD with entities EMPLOYEE, DEPENDENT, STORE, ORDER, and PRODUCT, relationships "claims" and "employs", and placeholder connectivity and cardinality labels such as W, X, Z and (a,b) through (o,p)]

95 Comparison of E-R Modeling Symbols

96 Developing an E-R Diagram
Iterative Process Step1: General narrative of organizational operations developed Step2: Basic E-R Model graphically depicted and reviewed Step3: Modifications made to incorporate newly discovered E-R components Repeat process until designers and users agree E-R Diagram complete

97 Example Create an ERD using the following business rules:
A company operates four departments Each department employs employees Each of the employees may or may not have one or more dependents Each employee may or may not have an employment history

98 Exercise Design an E-R diagram for a real estate firm that lists property for sale. The firm has a number of sales offices in several states. Each sales office is assigned one or more employees. Attributes of employees include ID and name. An employee must be assigned to only one sales office. For each sales office, there is always one employee assigned to manage that office. An employee may manage only the sales office to which he is assigned. The firm lists property for sale. Attributes of property include ID and location. Components of location include address, city, state, and Zip_code. Each unit of property must be listed with one of the sales offices. A sales office may have any number of properties listed, or may have no properties listed. Each unit of property has one or more owners. An owner may own one or more units of property. An attribute of the relationship between property and owner is Percent_Owned.

100 Supertypes and Subtypes
Generalization hierarchy: depicts relationships between higher-level supertype and lower-level subtype entities Supertype: contains the shared attributes Subtype: contains the unique attributes Inheritance: Subtype entities inherit values of all attributes of the supertype An instance of a subtype is also an instance of the supertype

101 Supertypes and Subtypes (Cont’d)
Attributes shared by all entities Supertype/ subtype relationships General entity type SUPERTYPE And so forth SUBTYPE1 SUBTYPE2 Specialized version of supertype Attributes unique to subtype1 Attributes unique to subtype2

102 Supertypes and Subtypes (Cont’d)
Disjoint relationships Unique subtypes Non-overlapping Indicated with a ‘G’ Overlapping subtypes An instance of the supertype could be more than one of the subtypes Indicated with a ‘Gs’

103 Generalization Hierarchy with Overlapping Subtypes

104 Logical Database Design and Normalization of Database Tables
Chapter 5 Logical Database Design and Normalization of Database Tables

105 In this chapter, you will learn:
How to transform ERD into relations What normalization is and what role it plays in database design About the normal forms 1NF, 2NF, 3NF, BCNF, and 4NF How normal forms can be transformed from lower normal forms to higher normal forms Normalization and E-R modeling are used concurrently to produce a good database design Some situations require denormalization to generate information efficiently

106 Transforming ERD into Relations
Step one: Map regular entities Each regular entity type in an ER diagram is transformed into a relation The name given to the relation is generally the same as the entity type Each simple attribute of the entity type becomes an attribute of the relation and the identifier of entity becomes the primary key of the corresponding relation

107 Example STUDENT Student_ID Student_Name Other_Attributes

108 Transforming ERD into Relations (Cont’d)
Step two: Map weak entities Create a new relation and include all of the simple attributes as the attributes of this relation. Then include the primary key of the identifying relation as a foreign key attribute in this new relation. The primary key of the new relation is the combination of this primary key of the identifying relation and the partial identifier of the weak entity type.

109 Example
ERD: EMPLOYEE (Employee_ID, Employee_Name) Has DEPENDENT (Dependent_Name, Gender, Date_of_birth)
Relations:
EMPLOYEE (Employee_ID, Employee_Name)
DEPENDENT (Dependent_Name, Employee_ID, Date_of_birth, Gender)

110 Transforming ERD into Relations (Cont’d)
Step three: Map binary relationship Map Binary one-to-many relations First create a relation for each of the two entity types participating in the relationship, using the procedure described in step one. Next, include the primary key attribute of the entity on the one-side of the relationship as a foreign key in the relation that is on the many-side of the relationship.

111 Example
ERD: Customer (Customer_ID, Customer_Name) Submits Order (Order_ID, Order_Date); connectivity 1:M, cardinalities (1,1) and (0,N)
Relations:
Customer (Customer_ID, Customer_Name)
Order (Order_ID, Order_Date, Customer_ID)
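The one-to-many mapping for Customer and Order can be tried with sqlite3. This is a sketch under assumptions: the table is named ORDERS because ORDER is a reserved word in SQL, dates are stored as text, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE CUSTOMER (
    CUSTOMER_ID   INTEGER PRIMARY KEY,
    CUSTOMER_NAME TEXT NOT NULL
);
-- The PK of the one side (CUSTOMER) becomes a FK on the many side (ORDERS)
CREATE TABLE ORDERS (
    ORDER_ID    INTEGER PRIMARY KEY,
    ORDER_DATE  TEXT NOT NULL,
    CUSTOMER_ID INTEGER NOT NULL REFERENCES CUSTOMER (CUSTOMER_ID)
);
INSERT INTO CUSTOMER VALUES (1, 'Ada');
INSERT INTO CUSTOMER VALUES (2, 'Grace');
INSERT INTO ORDERS VALUES (100, '2024-01-05', 1);
INSERT INTO ORDERS VALUES (101, '2024-01-09', 1);
""")

orders_per_customer = conn.execute("""
    SELECT C.CUSTOMER_NAME, COUNT(O.ORDER_ID)
    FROM CUSTOMER C LEFT JOIN ORDERS O ON O.CUSTOMER_ID = C.CUSTOMER_ID
    GROUP BY C.CUSTOMER_ID
    ORDER BY C.CUSTOMER_ID
""").fetchall()
print(orders_per_customer)
```

The NOT NULL on ORDERS.CUSTOMER_ID reflects the (1,1) cardinality: every order belongs to exactly one customer, while a customer may have no orders.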

112 Transforming ERD into Relations (Cont’d)
Step three: Map binary relationship (Cont’d)
Map binary one-to-one relationships
First, two relations are created, one for each of the participating entity types. Second, the primary key of one of the relations is included as a foreign key in the other relation.

113 Example
ERD: Nurse (Nurse_ID, Nurse_Name) In_charge Care Centre (Centre_Name, Location); connectivity 1:1, cardinalities (0,1) and (1,1)
Relations:
Nurse (Nurse_ID, Nurse_Name)
Care Centre (Centre_Name, Location, Nurse_in_charge)

114 Transforming ERD into Relations (Cont’d)
Step Four: Map composite Entities First step Create three relations: one for each of the two participating entities, and the third for the composite entity. We refer to the relation formed from the composite entity as the composite relation Second step Identifier not assigned: The default primary key for the composite relation consists of the two primary key attributes from the other two relations. Identifier assigned: The primary key for the composite relation is the identifier. The primary keys for the two participating entity types are included as foreign keys in the composite relation.

115 Example
ERD: Order (Order_ID, Order_Date) 1:M Order Line (Quantity) M:1 Product (Product_ID, Description, Price); cardinalities shown include (1,N), (1,1), and (0,N)

116 Example
Relations:
Order (Order_ID, Order_Date)
Order Line (Product_ID, Order_ID, Quantity)
Product (Product_ID, Description, Standard_Price)

117 Example
ERD: Customer (Customer_ID, Name) and Vendor (Vendor_ID, Address) linked through Shipment (Shipment_No, Date, Amount); connectivity labels as drawn: 1, M, M
Relations:
Customer (Customer_ID, Customer_Name)
Shipment (Shipment_No, Vendor_ID, Customer_ID, Date, Amount)
Vendor (Vendor_ID, Address)

118 Transforming ERD into Relations (Cont’d)
Step Five: Map unary relationship Map unary one-to-many relationship The entity type in the unary relationship is mapped to a relation using the procedure described in Step one. Then a foreign key attribute is added within the same relation that references the primary key values. A recursive foreign key is a foreign key in a relation that references the primary key values of that same relation.

119 Example
ERD: Employee (Employee_ID, Name, Birthdate) Manages Employee; connectivity 1:M, cardinalities (1,1) and (0,N)
Relation:
Employee (Employee_ID, Name, Birthdate, Manager_ID)
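A recursive foreign key can be sketched with sqlite3 as follows. The sample rows are hypothetical, and the top-level employee is given a null Manager_ID here as a simplifying assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE EMPLOYEE (
    EMPLOYEE_ID INTEGER PRIMARY KEY,
    NAME        TEXT NOT NULL,
    BIRTHDATE   TEXT,
    -- Recursive foreign key: references the primary key of the same relation
    MANAGER_ID  INTEGER REFERENCES EMPLOYEE (EMPLOYEE_ID)
);
INSERT INTO EMPLOYEE VALUES (1, 'Lee', '1970-03-02', NULL);
INSERT INTO EMPLOYEE VALUES (2, 'Kim', '1985-11-20', 1);
INSERT INTO EMPLOYEE VALUES (3, 'Ortiz', '1990-07-14', 1);
""")

# Join the table to itself to pair each employee with his or her manager
reports = conn.execute("""
    SELECT E.NAME, M.NAME
    FROM EMPLOYEE E JOIN EMPLOYEE M ON E.MANAGER_ID = M.EMPLOYEE_ID
    ORDER BY E.EMPLOYEE_ID
""").fetchall()
print(reports)
```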

120 Transforming ERD into Relations (Cont’d)
Step six: Map ternary relationship
Convert a ternary relationship to a composite entity
To map a composite entity that links three regular entities, we create a new composite relation. The default primary key of this relation consists of the three primary key attributes for the participating entities. Any attributes of the composite entity become attributes of the new relation.

121 Example
ERD: Patient (Patient_ID, Patient_Name), Physician (Physician_ID, Physician_Name), and Treatment (Treatment_Code, Description) linked through the composite entity Patient Treatment (Date, Time, Results); each link is 1:M with cardinalities (1,1) and (0,N)

122 Example
Relations:
Patient (Patient_ID, Patient_Name)
Physician (Physician_ID, Physician_Name)
Treatment (Treatment_Code, Description)
Patient Treatment (Patient_ID, Physician_ID, Treatment_Code, Date, Time, Result)

123 Transforming ERD into Relations (Cont’d)
Step seven: Map supertype/subtype relationships
Create a separate relation for the supertype and for each of its subtypes
Assign to the relation created for the supertype the attributes that are common to all members of the supertype, including the primary key
Assign to the relation for each subtype the primary key of the supertype, and only those attributes that are unique to that subtype
Assign one attribute of the supertype to function as the subtype discriminator

124 Example
ERD: supertype Employee (Employee_Number, Employee_Name, Address, Date_Hired, Employee_Type) with subtypes Hourly (Hourly_Rate) and Salaried (Annual_Salary, Stock_Option), joined by a Gs symbol

125 Example
Relations:
Employee (Employee_Number, Employee_Name, Address, Employee_Type, Date_Hired)
Hourly_Employee (H_Employee_Number, Hourly_Rate)
Salaried_Employee (S_Employee_Number, Annual_Salary, Stock_Option)
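The supertype/subtype mapping can be sketched with sqlite3. The column types, the discriminator codes 'H' and 'S', and the sample rows are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE EMPLOYEE (
    EMPLOYEE_NUMBER INTEGER PRIMARY KEY,
    EMPLOYEE_NAME   TEXT NOT NULL,
    ADDRESS         TEXT,
    EMPLOYEE_TYPE   TEXT NOT NULL,   -- subtype discriminator (hypothetical codes)
    DATE_HIRED      TEXT
);
-- Each subtype relation reuses the supertype's primary key as its own PK and FK
CREATE TABLE HOURLY_EMPLOYEE (
    H_EMPLOYEE_NUMBER INTEGER PRIMARY KEY REFERENCES EMPLOYEE (EMPLOYEE_NUMBER),
    HOURLY_RATE       REAL NOT NULL
);
CREATE TABLE SALARIED_EMPLOYEE (
    S_EMPLOYEE_NUMBER INTEGER PRIMARY KEY REFERENCES EMPLOYEE (EMPLOYEE_NUMBER),
    ANNUAL_SALARY     REAL NOT NULL,
    STOCK_OPTION      TEXT
);
INSERT INTO EMPLOYEE VALUES (1, 'Lee', '1 Main St', 'H', '2020-01-15');
INSERT INTO EMPLOYEE VALUES (2, 'Kim', '9 Oak Ave', 'S', '2018-06-01');
INSERT INTO HOURLY_EMPLOYEE VALUES (1, 22.50);
INSERT INTO SALARIED_EMPLOYEE VALUES (2, 68000, 'Y');
""")

# Shared attributes come from the supertype, unique ones from the subtype
hourly = conn.execute("""
    SELECT E.EMPLOYEE_NAME, H.HOURLY_RATE
    FROM EMPLOYEE E
    JOIN HOURLY_EMPLOYEE H ON H.H_EMPLOYEE_NUMBER = E.EMPLOYEE_NUMBER
""").fetchall()
print(hourly)
```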

126 Database Tables and Normalization
Table is the basic building block in database design Normalization is the process for assigning attributes to entities Reduces data redundancies Helps eliminate data anomalies Produces controlled redundancies to link tables Normalization stages 1NF - First normal form 2NF - Second normal form 3NF - Third normal form 4NF - Fourth normal form

127 Need for Normalization

128 Anomalies In the Table
PROJ_NUM intended to be primary key
Table displays data anomalies
Update: Modifying JOB_CLASS
Insertion: New employee must be assigned project
Deletion: If employee deleted, other vital data lost

129 Conversion to 1NF Step 1: Eliminate the Repeating Groups
Present data in a tabular format, where each cell has a single value and there are no repeating groups Eliminate repeating groups by eliminating nulls, making sure that each repeating group attribute contains an appropriate data value Step 2: Identify the Primary Key Primary key must uniquely identify attribute value

130 Dependency Diagram (1NF)

131 1NF Summarized All key attributes defined No repeating groups in table
All attributes dependent on primary key All relational tables satisfy 1NF requirements

132 Conversion to 2NF Start with 1NF form:
Write each key component on separate line Write original key on last line Each component is new table Write dependent attributes after each key PROJECT (PROJ_NUM, PROJ_NAME) EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR) ASSIGN (PROJ_NUM, EMP_NUM, HOURS)

133 2NF Conversion Results

134 2NF Summarized In 1NF Includes no partial dependencies
No attribute dependent on a portion of primary key Still possible to exhibit transitive dependency Attributes may be functionally dependent on nonkey attributes

135 Conversion to 3NF Create separate table(s) to eliminate transitive functional dependencies For every transitive dependency, write its determinant as a PK for a new table Identify the Dependent Attributes PROJECT (PROJ_NUM, PROJ_NAME) ASSIGN (PROJ_NUM, EMP_NUM, HOURS) EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS) JOB (JOB_CLASS, CHG_HOUR)
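The decomposition into PROJECT, ASSIGN, EMPLOYEE, and JOB can be checked on sample data: projecting the flat table onto each schema and then rejoining on the shared keys should reproduce the original rows. The row values below are hypothetical.

```python
def project(rows, attrs):
    """PROJECT rows onto attrs, keeping one copy of each distinct row."""
    return list(dict.fromkeys(tuple(row[a] for a in attrs) for row in rows))

# Hypothetical 1NF rows; the primary key is (PROJ_NUM, EMP_NUM)
COLUMNS = ["PROJ_NUM", "PROJ_NAME", "EMP_NUM", "EMP_NAME", "JOB_CLASS", "CHG_HOUR", "HOURS"]
flat = [dict(zip(COLUMNS, values)) for values in [
    (15, "Evergreen", 101, "News", "Database Designer", 105.00, 19.4),
    (15, "Evergreen", 103, "Arbough", "Electrical Engineer", 84.50, 23.8),
    (18, "Amber Wave", 103, "Arbough", "Electrical Engineer", 84.50, 12.6),
    (18, "Amber Wave", 105, "Johnson", "Database Designer", 105.00, 35.7),
]]

project_t = project(flat, ["PROJ_NUM", "PROJ_NAME"])
assign_t = project(flat, ["PROJ_NUM", "EMP_NUM", "HOURS"])
employee_t = project(flat, ["EMP_NUM", "EMP_NAME", "JOB_CLASS"])
job_t = project(flat, ["JOB_CLASS", "CHG_HOUR"])

# Rejoin the four tables on their shared keys to confirm that nothing was lost
proj_name = dict(project_t)
emp = {num: (name, job) for num, name, job in employee_t}
rate = dict(job_t)
rebuilt = [
    (p, proj_name[p], e, emp[e][0], emp[e][1], rate[emp[e][1]], hours)
    for p, e, hours in assign_t
]
print(len(project_t), len(assign_t), len(employee_t), len(job_t))
```

Each charge rate is now stored once per job class in JOB, so changing a rate touches a single row.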

136 3NF Summarized In 2NF Contains no transitive dependencies

137 Boyce-Codd Normal Form (BCNF)
Every determinant in the table is a candidate key Determinant is an attribute whose value determines other values within a row 3NF table with one candidate key is already in BCNF
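A BCNF check can be sketched with attribute closure. The code below uses the usual formal test, which asks whether every determinant is a superkey; that matches the candidate-key wording above when each determinant has no redundant attributes. The attributes A to D and the dependencies are a hypothetical example.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the functional dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(all_attrs, fds):
    """List the dependencies whose determinant does not determine the whole table."""
    return [(lhs, rhs) for lhs, rhs in fds if closure(lhs, fds) != set(all_attrs)]

# Hypothetical table R(A, B, C, D) with A,B -> C,D and C -> B
# (each character of a string is one attribute name)
fds = [("AB", "CD"), ("C", "B")]
violations = bcnf_violations("ABCD", fds)
print(violations)

# One possible decomposition: R1(A, C, D) and R2(C, B)
r1_violations = bcnf_violations("ACD", [("AC", "D")])
r2_violations = bcnf_violations("CB", [("C", "B")])
print(r1_violations, r2_violations)
```

C determines B but is not a key of the original table, which is why it is reported as a violation.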

138 3NF Table Not in BCNF

139 Decomposition of Table Structure to Meet BCNF

140 Decomposition into BCNF

141 An Example GRADE( Student_ID, Student_Name, Address, Major, Course_ID, Course_Title, Instructor_Name, Instructor_Office, Grade)

142 Normalization and Database Design
Normalization should be part of the design process E-R Diagram provides macro view Normalization provides micro view of entities Focuses on characteristics of specific entities May yield additional entities Difficult to separate normalization from E-R diagramming Business rules must be determined

143 Higher-Level Normal Forms
Fourth Normal Form (4NF) Table is in 3NF Has no multiple sets of multivalued dependencies

144 Conversion to 4NF
Multivalued Dependencies (two layouts of the same data)
Stud-ID   Course   Service
1126      1212F    Red Cross
          1620F    United Way
          1320F
Stud-ID   Course   Service
1126      1212F
          1620F
          1320F
                   Red Cross
                   United Way
Set of Tables in 4NF
Stud-ID   Course
1126      1212F
          1620F
          1320F
Stud-ID   Service
1126      Red Cross
          United Way
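A short sketch of why the two 4NF tables are preferable: courses and services are independent facts about the student, so a single table that pairs them consistently needs one row per combination. The tuples below restate the slide's data, with the student number repeated on every row.

```python
# The two 4NF tables: (Stud-ID, Course) and (Stud-ID, Service)
courses = [(1126, "1212F"), (1126, "1620F"), (1126, "1320F")]
services = [(1126, "Red Cross"), (1126, "United Way")]

# Pairing every course with every service for the same student
combined = [
    (stud_id, course, service)
    for stud_id, course in courses
    for other_id, service in services
    if stud_id == other_id
]

print(len(courses) + len(services), len(combined))
```

The separate tables hold five rows, while the combined table needs six, and the gap grows with every added course or service.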

145 Denormalization Normalization is one of many database design goals
Normalized table requires Additional processing Loss of system speed Normalization purity is difficult to sustain due to conflict in: Design efficiency Information requirements Processing

146 Exercise Part Supplier Data Part_No Description Vendor_Name Address
Unit_Cost 1234 Logic Chips Fast Chips Smart Chips Cupertino Phoenix 10.00 8.00 5678 Memory Chips Quality Chips Austin 3.00 2.00 5.00

147 Exercise (Cont’d)
Convert the table to a relation in first normal form (named Part Supplier)
List the functional dependencies in Part Supplier and identify a candidate key
For the relation Part Supplier, identify the following: an insert anomaly, a delete anomaly, and a modification anomaly
Draw a relation schema and show the functional dependencies
Develop a set of 3NF relations from Part Supplier

148 Chapter 5 Review How to transform ERD into relations
Definitions: 1NF, 2NF, 3NF, BCNF, and 4NF How normal forms can be transformed from lower normal forms to higher normal forms

150 Introduction to Structured Query Language (SQL)
Chapter 7 Introduction to Structured Query Language (SQL)

151 Introduction to SQL SQL functions fit into two broad categories:
Data definition language SQL includes commands to create Database objects such as tables, indexes, and views Commands to define access rights to those database objects Data manipulation language Includes commands to insert, update, delete, and retrieve data within the database tables

152 SQL Data Definition Commands

153 Data Manipulation Commands

154 Creating the Database
Two tasks must be completed: create the database structure, then create the tables that will hold the end-user data. In the first task, the RDBMS creates the physical files that will hold the database; this step tends to differ substantially from one RDBMS to another

155 Data Types CHAR (n): Character string n characters long
DATE: Dates in the form DD-MON-YYYY or MM/DD/YYYY DECIMAL (p, q): Decimal number p digits long, with q of these being decimal places to the right of the decimal point INTEGER: Range from -2147483648 to 2147483647 SMALLINT: Similar to INTEGER but does not occupy as much space. It ranges from -32768 to 32767 NULL

156 Creating Table Structures
Use one line per column (attribute) definition Use spaces to line up the attribute characteristics and constraints Table and attribute names are capitalized NOT NULL specification UNIQUE specification Primary key attributes contain both a NOT NULL and a UNIQUE specification RDBMS will automatically enforce referential integrity for foreign keys Command sequence ends with a semicolon

157 Table Creation Steps Steps in table creation:
Identify data types for attributes Identify columns that can and cannot be null Identify columns that must be unique Identify primary key-foreign key mates Determine default values Identify constraints on columns (domain specifications) Create the table CREATE TABLE <table name> (<attribute1 name and attribute1 characteristics, attribute2 name and attribute2 characteristics, attribute3 name and attribute3 characteristics, primary key designation, foreign key designation and foreign key requirement>);

158 An Example
CREATE TABLE STUDENT(
STU_NUM INTEGER NOT NULL UNIQUE,
STU_NAME VARCHAR(15) NOT NULL,
GPA DECIMAL(3,2) NOT NULL,
PRIMARY KEY(STU_NUM));
Table created.
Callouts: Define attributes and their data types; Unique specifications; Not null specifications; Comma indicates end of description of an attribute; Identify primary key; Semicolon indicates end of command; Message indicates table was created
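A minimal sketch of how the NOT NULL, UNIQUE, and PRIMARY KEY specifications behave. It runs the STUDENT definition against an in-memory SQLite database from Python instead of the Oracle setup used in the course; that substitution is an assumption, and Oracle's messages and type handling differ. The helper name try_insert and the sample rows are hypothetical.

```python
import sqlite3


def create_student_table(conn):
    # Same column definitions as the STUDENT example on the slide.
    conn.execute("""
        CREATE TABLE STUDENT (
            STU_NUM  INTEGER      NOT NULL UNIQUE,
            STU_NAME VARCHAR(15)  NOT NULL,
            GPA      DECIMAL(3,2) NOT NULL,
            PRIMARY KEY (STU_NUM)
        )
    """)


def try_insert(conn, row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO STUDENT VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
create_student_table(conn)
results = [
    try_insert(conn, (1, "Smith", 3.5)),  # valid row
    try_insert(conn, (1, "Jones", 3.0)),  # repeats the primary key value
    try_insert(conn, (2, None, 3.0)),     # NULL in a NOT NULL column
]
print(results)
```

Only the first row should be stored; the other two violate the key and NOT NULL specifications.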

159 SQL Integrity Constraints
Adherence to entity integrity and referential integrity rules is crucial Entity integrity enforced automatically if primary key specified in CREATE TABLE command sequence Referential integrity can be enforced in specification of FOREIGN KEY Other specifications to ensure conditions met: ON DELETE RESTRICT ON UPDATE CASCADE
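The effect of ON DELETE RESTRICT and ON UPDATE CASCADE can be sketched as follows. This is an illustration under assumptions: it uses SQLite from Python (where foreign key checks must be switched on per connection), and the VENDOR table, its columns, and the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
    CREATE TABLE VENDOR (
        V_CODE INTEGER PRIMARY KEY,
        V_NAME VARCHAR(35) NOT NULL
    );
    CREATE TABLE PRODUCT (
        P_CODE VARCHAR(10) PRIMARY KEY,
        V_CODE INTEGER,
        FOREIGN KEY (V_CODE) REFERENCES VENDOR (V_CODE)
            ON DELETE RESTRICT
            ON UPDATE CASCADE
    );
    INSERT INTO VENDOR  VALUES (21225, 'Example Vendor');
    INSERT INTO PRODUCT VALUES ('P-1', 21225);
""")

# RESTRICT: a vendor that still has products cannot be deleted.
try:
    conn.execute("DELETE FROM VENDOR WHERE V_CODE = 21225")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

# CASCADE: changing the vendor's key also changes it in PRODUCT.
conn.execute("UPDATE VENDOR SET V_CODE = 30000 WHERE V_CODE = 21225")
product_vendor = conn.execute(
    "SELECT V_CODE FROM PRODUCT WHERE P_CODE = 'P-1'"
).fetchone()[0]
print(delete_blocked, product_vendor)
```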

160 Advanced Data Definition Commands
All changes in the table structure are made by using the ALTER command Followed by a keyword that produces specific change Three options are available ADD MODIFY DROP

161 Changing a Column’s Data Type
ALTER can be used to change data type Some RDBMSs (such as Oracle) do not permit changes to data types unless the column to be changed is empty

162 Changing a Column’s Data Characteristics
Use ALTER to change data characteristics If the column to be changed already contains data, changes in the column’s characteristics are permitted if those changes do not alter the data type

163 Adding or Dropping a Column
Use ALTER to add a column Do not include the NOT NULL clause for new column Use ALTER to drop a column Some RDBMSs impose restrictions on the deletion of an attribute

164 Data Manipulation Commands
Adding table rows Saving table changes Listing table rows Updating table rows Restoring table contents Deleting table rows Inserting table rows with a select subquery

165 Common SQL Data Manipulation Commands

166 Data Entry Enter data into a table INSERT INTO <table name>
VALUES (attribute 1 value, attribute 2 value, … etc.);

167 Listing Table Rows SELECT Syntax
Used to list contents of table Syntax SELECT columnlist FROM tablename Columnlist represents one or more attributes, separated by commas Asterisk can be used as wildcard character to list all attributes

168 Updating Table Rows UPDATE Modify data in a table Syntax
UPDATE tablename SET columnname = expression [, columnname = expression] [WHERE conditionlist]; If more than one attribute is to be updated in the row, separate corrections with commas

169 Saving Table Changes Changes made to table contents are not physically saved on disk until Database is closed Program is closed COMMIT command is used Syntax COMMIT Will permanently save any changes made to any table in the database

170 Restoring Table Contents
ROLLBACK Used to restore the database to its previous condition Only applicable if COMMIT command has not been used to permanently store the changes in the database Syntax ROLLBACK; COMMIT and ROLLBACK only work with data manipulation commands that are used to add, modify, or delete table rows
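A small sketch of COMMIT and ROLLBACK, assuming SQLite driven from Python, where conn.commit() and conn.rollback() issue the corresponding SQL commands. The table contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_PRICE REAL)")

conn.execute("INSERT INTO PRODUCT VALUES ('A1', 10.00)")
conn.commit()    # COMMIT: the inserted row is now permanent

conn.execute("UPDATE PRODUCT SET P_PRICE = 99.99 WHERE P_CODE = 'A1'")
conn.rollback()  # ROLLBACK: the uncommitted update is undone

price_after_rollback = conn.execute(
    "SELECT P_PRICE FROM PRODUCT WHERE P_CODE = 'A1'"
).fetchone()[0]
print(price_after_rollback)
```

The committed insert survives, while the update made after the last COMMIT is discarded.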

171 Deleting Table Rows DELETE Deletes a table row Syntax
DELETE FROM tablename [WHERE conditionlist ]; WHERE condition is optional If WHERE condition is not specified, all rows from the specified table will be deleted

172 Inserting Table Rows with a Select Subquery
Inserts multiple rows from another table (source) Uses SELECT subquery Query that is embedded (or nested) inside another query Executed first Syntax INSERT INTO tablename SELECT columnlist FROM tablename
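As an illustration of INSERT with a SELECT subquery, the sketch below copies rows from one table into another. It assumes SQLite from Python; the CHEAP_PRODUCT table and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PRODUCT       (P_CODE TEXT PRIMARY KEY, P_PRICE REAL);
    CREATE TABLE CHEAP_PRODUCT (P_CODE TEXT PRIMARY KEY, P_PRICE REAL);
    INSERT INTO PRODUCT VALUES ('A1', 5.00), ('B2', 12.50), ('C3', 8.75);

    -- The SELECT runs first; every row it returns is inserted.
    INSERT INTO CHEAP_PRODUCT
        SELECT P_CODE, P_PRICE FROM PRODUCT WHERE P_PRICE <= 10.00;
""")

copied = [row[0] for row in
          conn.execute("SELECT P_CODE FROM CHEAP_PRODUCT ORDER BY P_CODE")]
print(copied)
```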

173 Selecting Rows with Conditional Restrictions
Select partial table contents by placing restrictions on rows to be included in output Add conditional restrictions to the SELECT statement, using WHERE clause Syntax SELECT columnlist FROM tablelist [ WHERE conditionlist ] ;

174 Comparison Operators

175 Arithmetic Operators: The Rule of Precedence
Perform operations within parentheses Perform power operations Perform multiplications and divisions Perform additions and subtractions

176 Special Operators BETWEEN
Used to check whether attribute value is within a range IS NULL Used to check whether attribute value is null LIKE Used to check whether attribute value matches a given string pattern IN Used to check whether attribute value matches any value within a value list EXISTS Used to check if a subquery returns any rows
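The five special operators can be tried side by side. This is a sketch only, assuming SQLite from Python; the VENDOR table, the helper codes, and all sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE VENDOR  (V_CODE INTEGER PRIMARY KEY, V_NAME TEXT);
    CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_DESCRIPT TEXT,
                          P_PRICE REAL, V_CODE INTEGER);
    INSERT INTO VENDOR VALUES (1, 'Smith Supply'), (2, 'Jones Tools'),
                              (3, 'Idle Vendor');
    INSERT INTO PRODUCT VALUES
        ('P1', 'Claw hammer',    9.95, 1),
        ('P2', 'Hand saw',      14.40, 2),
        ('P3', 'Sledge hammer', 17.50, 1),
        ('P4', 'Steel tape',    39.95, NULL);
""")


def codes(sql):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql)]


between = codes("SELECT P_CODE FROM PRODUCT "
                "WHERE P_PRICE BETWEEN 10.00 AND 20.00 ORDER BY P_CODE")
is_null = codes("SELECT P_CODE FROM PRODUCT WHERE V_CODE IS NULL")
like = codes("SELECT P_CODE FROM PRODUCT "
             "WHERE P_DESCRIPT LIKE '%hammer' ORDER BY P_CODE")
in_list = codes("SELECT P_CODE FROM PRODUCT "
                "WHERE V_CODE IN (2, 3) ORDER BY P_CODE")
exists = codes("SELECT V_CODE FROM VENDOR WHERE EXISTS "
               "(SELECT * FROM PRODUCT WHERE PRODUCT.V_CODE = VENDOR.V_CODE) "
               "ORDER BY V_CODE")
print(between, is_null, like, in_list, exists)
```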

177 More Complex Queries and SQL Functions
Listing unique values DISTINCT clause produces list of different values Aggregate functions Mathematical summaries SELECT DISTINCT V_CODE FROM PRODUCT;

178 Example Aggregate Function Operations
COUNT MAX and MIN
SELECT COUNT(DISTINCT V_CODE) FROM PRODUCT;
SELECT COUNT(DISTINCT V_CODE) FROM PRODUCT WHERE P_PRICE <= 10.00;
SELECT MIN(P_PRICE) FROM PRODUCT;
SELECT P_CODE, P_DESCRIPT, P_PRICE FROM PRODUCT WHERE P_PRICE = (SELECT MAX(P_PRICE) FROM PRODUCT);

179 Example Aggregate Function Operations (Cont’d)
SUM AVG SELECT SUM(P_ONHAND * P_PRICE) FROM PRODUCT; SELECT P_DESCRIPT, P_ONHAND, P_PRICE, V_CODE FROM PRODUCT WHERE P_PRICE > (SELECT AVG(P_PRICE) FROM PRODUCT) ORDER BY P_PRICE DESC;

180 More Complex Queries and SQL Functions (Cont’d)
Ordering a listing Results ascending by default Descending order uses DESC Cascading order sequence ORDER BY <attributes> ORDER BY <attributes> DESC ORDER BY <attribute 1, attribute 2, ...>

181 More Complex Queries and SQL Functions (cont’d)
Grouping data Creates frequency distributions Only valid when used with SQL aggregate functions HAVING clause operates like WHERE for grouping output SELECT P_SALECODE, MIN(P_PRICE) FROM PRODUCT_2 GROUP BY P_SALECODE; SELECT V_CODE,COUNT(DISTINCT(P_CODE)),AVG(P_PRICE) FROM PRODUCT_2 GROUP BY V_CODE HAVING AVG(P_PRICE) < 10;
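A hedged sketch of aggregates, a MAX subquery, and GROUP BY with HAVING, assuming SQLite from Python. The PRODUCT rows are made up; only the column names follow the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_PRICE REAL,
                          V_CODE INTEGER);
    INSERT INTO PRODUCT VALUES
        ('P1',  5.00, 1), ('P2',  7.00, 1),
        ('P3', 20.00, 2), ('P4', 30.00, 2),
        ('P5',  9.00, 3);
""")

vendor_count = conn.execute(
    "SELECT COUNT(DISTINCT V_CODE) FROM PRODUCT"
).fetchone()[0]

# An aggregate cannot appear directly in WHERE, so MAX goes in a subquery.
most_expensive = conn.execute(
    "SELECT P_CODE FROM PRODUCT "
    "WHERE P_PRICE = (SELECT MAX(P_PRICE) FROM PRODUCT)"
).fetchone()[0]

# HAVING filters whole groups after GROUP BY has formed them.
low_price_vendors = conn.execute(
    "SELECT V_CODE, COUNT(DISTINCT P_CODE), AVG(P_PRICE) FROM PRODUCT "
    "GROUP BY V_CODE HAVING AVG(P_PRICE) < 10 ORDER BY V_CODE"
).fetchall()
print(vendor_count, most_expensive, low_price_vendors)
```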

182 SQL Exercise Write SQL code that will create the relations shown. Assume the following attribute data types: Student_ID: integer Student_Name: 25 characters Faculty_ID: integer Faculty_Name: 25 characters Course_ID: 25 characters Course_Name: 15 characters Date_Qualified: date Section_ID: integer Semester: 7 characters

183 SQL Exercise (Cont’d) STUDENT (Primary key: Student_ID)
IS_QUALIFIED (Primary key: Faculty_ID, Course_ID) Faculty_ ID Course_ID Date_ Qualified 2143 ISM3112 9/1988 3467 ISM4212 9/1995 ISM4930 9/1996 4756 ISM3113 9/1991 Student_ ID Name 38214 Letersky 54907 Altvater 66324 Aiken 70542 Marra

184 SQL Exercise (Cont’d) FACULTY (Primary key: Faculty_ID)
SECTION (Primary key: Section_ID) Section_ID Course_ID 2712 ISM3113 2713 2714 ISM4212 2715 ISM4930 Faculty_ID Faculty_Name 2143 Birkin 3467 Berndt 4756 Collins

185 SQL Exercise (Cont’d) COURSE (Primary key: Course_ID)
IS_REGISTERED (Primary key: Student_ID, Section_ID) Course_ID Course_ Name ISM3113 Syst Analysis ISM3112 Syst Design ISM4212 Database ISM4930 Networking Student_ID Section_ID Semester 38214 2714 I 54907 2715 66324 2713

186 SQL Exercise (Cont’d) Write SQL queries to answer the following questions: Display the course ID and course name for all courses with an ISM prefix. Is any instructor qualified to teach ISM 3113 and not qualified to teach ISM 4930? How many students are enrolled in section 2714 during semester I – 2001? Which students were not enrolled in any courses during semester I – 2001?

187 Summary SQL commands can be divided into two overall categories:
Data definition language commands Data manipulation language commands Basic data definition commands allow you to create tables, indexes, and views Many SQL constraints can be used with columns Aggregate functions Special functions that perform arithmetic computations over a set of rows

188 Summary (Cont’d) ORDER BY clause
Used to sort output of a SELECT statement Can sort by one or more columns and use either an ascending or descending order Join output of multiple tables with SELECT statement Natural join uses join condition to match only rows with equal values in specified columns

189 Lab Instruction Log into the workstations using your ACADLAB account.
Before you go to the lab sessions, please use your Passport York to create Acadlab account (NOVELL account) and AML account if you don't have them yet. These two accounts will allow you to get access to the ORACLE database. How to get access to Oracle in ITEC labs Log into the workstations using your ACADLAB account. Choose Oracle from programs under the start menu and then choose sqlplus When prompted for the username/password enter (where your_username is your AML username) at the username prompt and your AML password at the password prompt.

190 Lab Instruction How to get access to Oracle at home
Login to unix.aml.yorku.ca At the prompt type the following commands: source /javainit Start Oracle SQL*PLUS environment by typing the following command: sqlplus When prompted for the username/password enter (where your_username is your AML username) at the username prompt and your AML password at the password prompt.

191 Lab Content Practice Question 2 of Assignment Two.
Create table structure for each table Fill the tables with the data Run the SQL queries Print your queries and the answers

192 Lab Tips To list all tables you have in your Oracle account use the following SQL command: select table_name from user_tables; To describe a given Oracle table use the following Oracle environment command (note that this is not an SQL command): desc tablename (where tablename is the name of the table that you have in your account)

194 Advanced Structured Query Language (SQL)
Chapter 8 Advanced Structured Query Language (SQL)

195 SQL Queries Single table query Multiple table query
Nesting query (subquery) Using IN Using EXISTS Join Table

196 Examples
SELECT Order_Num FROM ORDERS WHERE Order_Num IN (SELECT Order_Num FROM ORDER_LINE WHERE Part_Num = 1234);

197 Examples (Cont’d)
SELECT Order_Num FROM ORDERS WHERE EXISTS (SELECT * FROM ORDER_LINE WHERE ORDERS.Order_Num = ORDER_LINE.Order_Num AND Part_Num = 1234);

198 Examples (Cont’d)
SELECT S.Last, S.First, C.Last, C.First FROM SALES_REP S, CUSTOMER C WHERE S.Srep_Num = C.Srep_Num;
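The IN, EXISTS, and join forms can express the same question. The sketch below, assuming SQLite from Python and invented ORDERS and ORDER_LINE rows, asks which orders include part 1234 in all three ways and compares the answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ORDERS     (Order_Num INTEGER PRIMARY KEY);
    CREATE TABLE ORDER_LINE (Order_Num INTEGER, Part_Num INTEGER);
    INSERT INTO ORDERS VALUES (1), (2), (3);
    INSERT INTO ORDER_LINE VALUES (1, 1234), (2, 5678), (3, 1234), (3, 5678);
""")


def order_nums(sql):
    return [row[0] for row in conn.execute(sql)]


with_in = order_nums(
    "SELECT Order_Num FROM ORDERS WHERE Order_Num IN "
    "(SELECT Order_Num FROM ORDER_LINE WHERE Part_Num = 1234) "
    "ORDER BY Order_Num")

with_exists = order_nums(
    "SELECT Order_Num FROM ORDERS WHERE EXISTS "
    "(SELECT * FROM ORDER_LINE "
    " WHERE ORDERS.Order_Num = ORDER_LINE.Order_Num AND Part_Num = 1234) "
    "ORDER BY Order_Num")

with_join = order_nums(
    "SELECT DISTINCT ORDERS.Order_Num FROM ORDERS, ORDER_LINE "
    "WHERE ORDERS.Order_Num = ORDER_LINE.Order_Num AND Part_Num = 1234 "
    "ORDER BY ORDERS.Order_Num")
print(with_in, with_exists, with_join)
```

The join needs DISTINCT in general, because an order with several matching lines would otherwise appear more than once.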

199 SQL Exercise Write SQL code that will create the relations shown. Assume the following attribute data types: Student_ID: integer Student_Name: 25 characters Faculty_ID: integer Faculty_Name: 25 characters Course_ID: 25 characters Course_Name: 15 characters Date_Qualified: date Section_ID: integer Semester: 7 characters

200 SQL Exercise (Cont’d) STUDENT (Primary key: Student_ID)
IS_QUALIFIED (Primary key: Faculty_ID, Course_ID) Faculty_ ID Course_ID Date_ Qualified 2143 ISM3112 9/1988 3467 ISM4212 9/1995 ISM4930 9/1996 4756 ISM3113 9/1991 Student_ ID Name 38214 Letersky 54907 Altvater 66324 Aiken 70542 Marra

201 SQL Exercise (Cont’d) FACULTY (Primary key: Faculty_ID)
SECTION (Primary key: Section_ID) Section_ID Course_ID 2712 ISM3113 2713 2714 ISM4212 2715 ISM4930 Faculty_ID Faculty_Name 2143 Birkin 3467 Berndt 4756 Collins

202 SQL Exercise (Cont’d) COURSE (Primary key: Course_ID)
IS_REGISTERED (Primary key: Student_ID, Section_ID) Course_ID Course_ Name ISM3113 Syst Analysis ISM3112 Syst Design ISM4212 Database ISM4930 Networking Student_ID Section_ID Semester 38214 2714 I 54907 2715 66324 2713

203 SQL Exercise (Cont’d) Write SQL queries to answer the following questions: Is any instructor qualified to teach ISM 3113 and not qualified to teach ISM 4930? How many students are enrolled in section 2714 during semester I – 2001? Display all the courses (Course_Name) for which Professor Berndt has been qualified. Which students were not enrolled in any courses during semester I – 2001?

204 Exercise Write SQL codes to create the following tables CUSTOMER
ORDER_LINE Customer_ID Customer_Name City State 1 Value Furniture Plano TX 2 Home furnishings Albany NY 3 Eastern Furniture Carteret NJ 4 Furniture Gallery Order_ID Product_ID Quan 1001 1 2 1002 3 5 1003

205 Exercise (Cont’d) ORDER PRODUCT Order_ID Order_Date Customer_ID 1001
21-Oct-2000 1 1002 4 1003 22-Oct-2000 2 PRODUCT Product_ID Description Product_finish Standard_price 1 End Table Cherry 175 2 Coffee Table Natural Ash 200 3 Computer Desk 375

206 Exercise (Cont’d) Use SQL to design the following queries:
How many different items were ordered on order number 1001? List product ID and standard price for all desks and all tables that cost more than $200. What furniture is not made of cherry?

207 Exercise (Cont’d) Use SQL to design the following queries:
List all the customers who live in FL, TX, or CA. Find only states with more than one customer. What are the order numbers that have included furniture finished in natural ash? What are the names of all customers who have placed orders?

208 Exercise (Cont’d) Use SQL to design the following queries:
For each customer who has placed an order, what is the customer’s name and order number? Which customers have not placed any orders for computer desk? List the product name and price with the highest standard price.

210 Transaction Management and Concurrent Control
Chapter 10 Transaction Management and Concurrent Control

211 What is a Transaction? Any action that reads from and/or writes to a database may consist of Simple SELECT statement to generate a list of table contents A series of related UPDATE statements to change the values of attributes in various tables A series of INSERT statements to add rows to one or more tables A combination of SELECT, UPDATE, and INSERT statements

212 What is a Transaction? (continued)
A logical unit of work that must be either entirely completed or aborted Successful transaction changes the database from one consistent state to another One in which all data integrity constraints are satisfied Most real-world database transactions are formed by two or more database requests The equivalent of a single SQL statement in an application program or transaction

213 Example Transaction
Examine current account balance SELECT ACC_NUM, ACC_BALANCE FROM CHECKACC WHERE ACC_NUM = ‘ ’; Consistent state after transaction No changes made to database

214 Example Transaction Register credit sale of 100 units of product X to customer Y for $500 UPDATE PRODUCT SET PROD_QOH = PROD_QOH - 100 WHERE PROD_CODE = 'X'; UPDATE ACCT_RECEIVABLE SET ACCT_BALANCE = ACCT_BALANCE + 500 WHERE ACCT_NUM = 'Y'; Consistent state only if both transactions are fully completed DBMS doesn’t guarantee transaction represents real-world event
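One way to treat the credit sale as a single logical unit of work is sketched below. It assumes SQLite from Python, that the sale lowers PROD_QOH by the units sold and raises ACCT_BALANCE by the amount, and made-up starting values; the function name record_credit_sale is hypothetical.

```python
import sqlite3


def record_credit_sale(conn, prod_code, units, acct_num, amount):
    """Apply both updates together; undo both if either one cannot be done."""
    try:
        cur = conn.execute(
            "UPDATE PRODUCT SET PROD_QOH = PROD_QOH - ? WHERE PROD_CODE = ?",
            (units, prod_code))
        if cur.rowcount != 1:
            raise LookupError("unknown product")
        cur = conn.execute(
            "UPDATE ACCT_RECEIVABLE SET ACCT_BALANCE = ACCT_BALANCE + ? "
            "WHERE ACCT_NUM = ?",
            (amount, acct_num))
        if cur.rowcount != 1:
            raise LookupError("unknown account")
        conn.commit()      # both changes become permanent together
        return True
    except LookupError:
        conn.rollback()    # the partial change is undone
        return False


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PRODUCT (PROD_CODE TEXT PRIMARY KEY, PROD_QOH INTEGER);
    CREATE TABLE ACCT_RECEIVABLE (ACCT_NUM TEXT PRIMARY KEY,
                                  ACCT_BALANCE REAL);
    INSERT INTO PRODUCT VALUES ('X', 1000);
    INSERT INTO ACCT_RECEIVABLE VALUES ('Y', 0);
""")

ok = record_credit_sale(conn, "X", 100, "Y", 500)      # both updates succeed
failed = record_credit_sale(conn, "X", 100, "Z", 500)  # account Z is missing

qoh = conn.execute("SELECT PROD_QOH FROM PRODUCT").fetchone()[0]
balance = conn.execute("SELECT ACCT_BALANCE FROM ACCT_RECEIVABLE").fetchone()[0]
print(ok, failed, qoh, balance)
```

In the second call the inventory update is applied first and then rolled back, so the quantity on hand reflects only the completed sale.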

215 Incomplete Transactions
Reasons: An anomaly arises during execution (automatically restart) System crashes An unexpected situation during transaction execution May bring database to inconsistent state

216 Transaction Properties
Atomicity All transaction operations must be completed Incomplete transactions aborted Durability Permanence of consistent database state Serializability Conducts transactions in serial order Important in multi-user and distributed databases Isolation Transaction data cannot be reused until its execution is complete

217 Transaction Management with SQL
Transaction support COMMIT ROLLBACK User initiated transaction sequence must continue until: COMMIT statement is reached ROLLBACK statement is reached End of a program reached Program reaches abnormal termination

218 Transaction Log Tracks all transactions that update database
May be used by ROLLBACK command May be used to recover from system failure Log stores Record for beginning of transaction Each SQL statement Operation Names of objects Before and after values for updated fields Pointers to previous and next entries Commit Statement

219 Transaction Log Example

220 Example Suppose that you are a manufacturer of product ABC, which is composed of parts A, B, C. Each time a new product ABC is created, it must be added to the product inventory, using the PROD_QOH in PRODUCT table. And each time the product is created the parts inventory, using PART_QOH in PART table must be reduced by one each of parts, A, B, and C. PART PRODUCT PART_CODE PART_QOH A 567 B 98 C 549 PROD_CODE PROD_QOH ABC 1205

221 Example (Cont’d) Given the information, answer:
How many database requests can you identify for an inventory update for both PRODUCT and PART? Using SQL, write each database request you have identified above. Write the complete transactions. Write the transaction log, using the template in slide 11.

222 Concurrency Control Coordinates simultaneous transaction execution in multiprocessing database Ensure serializability of transactions in multiuser database environment Potential problems in multiuser environments Lost updates Uncommitted data Inconsistent retrievals

223 Normal Execution of Two Transactions

224 Lost Updates

225 More Example

226 Correct Execution of Two Transactions

227 An Uncommitted Data Problem

228 Retrieval During Update

229 Transaction Results: Data Entry Correction

230 Inconsistent Retrievals

231 Example A department store runs a multiuser DBMS on a local area network file server which does not enforce concurrency control. One customer has a balance due of $250 when the following three transactions related to this customer were processed at the same time: Payment of $250 Purchase on credit of $100 Merchandise return of $50. Each transaction reads the customer record when the balance was $250. the updated record was returned to the database in the order shown above. What balance will be for the customer after the last transaction was completed?
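One way to trace this scenario is a small simulation in plain Python. It is a sketch under assumptions: a payment and a return lower the balance due, a credit purchase raises it, and the function names are hypothetical. It compares the uncontrolled interleaving with a serial execution.

```python
def run_without_concurrency_control(start_balance, changes):
    """Every transaction reads the same starting balance, then writes in order."""
    stored = start_balance
    reads = [stored for _ in changes]      # all reads happen before any write
    for read_value, change in zip(reads, changes):
        stored = read_value + change       # each write overwrites the last one
    return stored


def run_serially(start_balance, changes):
    """Each transaction reads the balance left by the previous one."""
    stored = start_balance
    for change in changes:
        stored = stored + change
    return stored


# Payment of $250, purchase on credit of $100, merchandise return of $50.
changes = [-250, +100, -50]
lost_update_balance = run_without_concurrency_control(250, changes)
serial_balance = run_serially(250, changes)
print(lost_update_balance, serial_balance)
```

Only the last write survives in the uncontrolled run, which is the lost update problem.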

232 The Scheduler Establishes order of concurrent transaction execution
Interleaves execution of database operations to ensure serializability Bases actions on concurrency control algorithms Locking Time stamping Ensures efficient use of computer’s CPU

233 Read/Write Conflict Scenarios:

234 Concurrency Control with Locking Methods
Lock guarantees current transaction exclusive use of data item Acquires lock prior to access Lock released when transaction is completed DBMS automatically initiates and enforces locking procedures Managed by lock manager Lock granularity indicates level of lock use

235 Locking Mechanisms Locking level:
Database – used during database updates Table – used for bulk updates Block or page – very commonly used Row – only requested row; fairly commonly used Field – requires significant overhead; impractical

236 Locking Granularity Granularity refers to the level of the database item locked. A trade-off between overhead and waiting. Holding locks at a fine level decreases waiting among users but increases the system overhead. Holding locks at a coarser level reduces the number of locks but increases the amount of waiting.

237 A Database-Level Locking Sequence

238 An Example of a Table-Level Lock

239 Example of a Page-Level Lock

240 An Example of a Row-Level Lock

241 Binary Locks
Two states: Locked (1) and Unlocked (0) Locked objects unavailable to other objects Unlocked objects open to any transaction Transaction unlocks object when complete

242 An Example of a Binary Lock

243 Shared/Exclusive Locks
Shared Exists when concurrent transactions granted READ access Produces no conflict for read-only transactions Issued when transaction wants to read and exclusive lock not held on item Exclusive Exists when access reserved for locking transaction Used when potential for conflict exists Issued when transaction wants to update unlocked data

244 Shared/Exclusive Locks (Cont’d)
[Table: shared/exclusive lock compatibility between transactions T1 and T2, with Yes/No entries]

245 Two-Phase Locking to Ensure Serializability
Defines how transactions acquire and relinquish locks Guarantees serializability, but it does not prevent deadlocks Growing phase, in which a transaction acquires all the required locks without unlocking any data Shrinking phase, in which a transaction releases all locks and cannot obtain any new lock

246 Two-Phase Locking to Ensure Serializability (continued)
Governed by the following rules: Two transactions cannot have conflicting locks No unlock operation can precede a lock operation in the same transaction No data are affected until all locks are obtained—that is, until the transaction is in its locked point
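The growing and shrinking phases can be sketched with a small class in plain Python. This is an illustration, not a lock manager: it tracks a single transaction, ignores conflicts with other transactions, and the class and method names are hypothetical.

```python
class TwoPhaseTransaction:
    """Tracks one transaction's locks and enforces the two-phase rule."""

    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False  # becomes True at the first unlock

    def lock(self, item):
        # Growing phase only: no new lock once any lock has been released.
        if self.shrinking:
            raise RuntimeError(
                f"{self.name}: cannot lock {item} after unlocking")
        self.held.add(item)

    def unlock(self, item):
        if item not in self.held:
            raise RuntimeError(f"{self.name}: {item} is not locked")
        self.shrinking = True   # the shrinking phase starts here
        self.held.remove(item)


t1 = TwoPhaseTransaction("T1")
t1.lock("A")
t1.lock("B")      # locked point: all required locks are held
t1.unlock("A")    # shrinking phase begins

try:
    t1.lock("C")  # violates the protocol
    violation_detected = False
except RuntimeError:
    violation_detected = True
print(violation_detected, sorted(t1.held))
```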

247 Two-Phase Locking Protocol

248 Deadlocks Condition that occurs when two transactions wait for each other to unlock data Possible only if one of the transactions wants to obtain an exclusive lock on a data item No deadlock condition can exist among shared locks Control through Prevention Detection Avoidance

249 How a Deadlock Condition Is Created

250 Example on Concurrency Control
Given schedule S1 as follows, and the locks won’t be released until commit. Is there any deadlock in S1 using Shared/Exclusive lock. T1 T2 T3 R(A) W(B) W(A) Commit A, B Commit B

251 More Example T1 T2 T3 R(C) R(B) W(B) R(A) W(A) W(C) Commit A
Commit A, B & C Commit B

252 More Example Let transactions T1, T2, and T3 be defined to perform the following operations: T1: Add one to A T2: Double A T3: Display A and then set A to one Suppose the structure for T1, T2, T3 is indicated below. If the transactions execute without any locking, please give an example of wrong schedules.

253 More Examples (Cont’d)
T11: Read (A), A ← A+1 T12: Update (A) T21: Read (A), A ← A*2 T22: Update (A) T31: Read (A), A ← 1 T32: Update (A) Suppose the following schedule T11- T31- T12- T32- T21- T22 obeyed the two-phase locking algorithm. Explain what could be produced by the schedule.

254 Concurrency Control with Time Stamping Methods
Assigns a global unique time stamp to each transaction Produces an explicit order in which transactions are submitted to the DBMS Uniqueness Ensures that no equal time stamp values can exist Monotonicity Ensures that time stamp values always increase

255 Wait/Die and Wound/Wait Schemes
Wait/die Older transaction waits and the younger is rolled back and rescheduled Wound/wait Older transaction rolls back the younger transaction and reschedules it
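The two schemes can be written as decision rules for a transaction that requests a lock held by another. This is a sketch in plain Python, assuming that a smaller time stamp means an older transaction and that time stamps are unique; the function names and returned labels are hypothetical.

```python
def wait_die(requester_ts, holder_ts):
    """Wait/die: an older requester waits; a younger requester dies."""
    if requester_ts < holder_ts:
        return "wait"   # older requester waits for the lock
    return "die"        # younger requester is rolled back and rescheduled


def wound_wait(requester_ts, holder_ts):
    """Wound/wait: an older requester wounds the holder; a younger one waits."""
    if requester_ts < holder_ts:
        return "wound"  # younger holder is rolled back and rescheduled
    return "wait"       # younger requester waits for the older holder


decisions = {
    "wait/die, older requests": wait_die(1, 2),
    "wait/die, younger requests": wait_die(2, 1),
    "wound/wait, older requests": wound_wait(1, 2),
    "wound/wait, younger requests": wound_wait(2, 1),
}
print(decisions)
```

In both schemes it is always the younger transaction that is rolled back.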

256 Wait/Die and Wound/Wait Concurrency Control Schemes

257 Example Concurrency control is implemented based on time stamping method. Consider the following schedule: T1 T2 R(A) W(A) W(B) W(C) R(C)

258 Concurrency Control with Optimistic Methods
Optimistic approach Based on the assumption that the majority of database operations do not conflict Does not require locking or time stamping techniques Transaction is executed without restrictions until it is committed Phases are read, validation, and write

259 Better Performance than Locking

260 Example T1 T2 R(A) W(A) R(B) commit

261 Database Recovery Management
Restores database from a given state, usually inconsistent, to a previously consistent state Based on the atomic transaction property All portions of the transaction must be treated as a single logical unit of work, in which all operations must be applied and completed to produce a consistent database If transaction operation cannot be completed, transaction must be aborted, and any changes to the database must be rolled back (undone)

262 Transaction Recovery
Deferred write Transaction operations do not immediately update the physical database Only the transaction log is updated Database is physically updated only after the transaction reaches its commit point using the transaction log information Write-through Database is immediately updated by transaction operations during the transaction’s execution, even before the transaction reaches its commit point

263 Example Describe the restart work if transaction T1 is committed after the checkpoint but prior to the failure. Assume that the recovery manager uses: the deferred update approach; the write-through approach. [Timeline figure: Backup, Checkpoint, T1, Failure]

264 Review Transaction property Transaction log
Potential problems in multiuser environments Different locking methods and how they work Database recovery management

266 Object-Oriented Database
Appendix G Object-Oriented Database

267 Object Orientation
Object Orientation Set of design and development principles Based on autonomous computer structures known as objects OO Contribution areas Programming Languages Graphical User Interfaces Databases Design Operating Systems

268 Evolution of OO Concepts
Concepts stem from object-oriented programming languages Ada, ALGOL, LISP, SIMULA OOPLs goals Easy-to-use development environment Powerful modeling tools for development Decrease in development time Make reusable code OO Attributes Data set not passive Data and procedures bound together Objects can act on self

269 OO Concepts: Objects Abstract representation of a real-world entity
Unique identity Embedded properties Ability to interact with other objects and self OID Unique to that object Assigned by system at moment of object’s creation Cannot be changed under any circumstances Can be deleted only if the object is deleted Can never be reused

270 Attributes (Instance Variables)
Known as instance variables in OO environment Domain: Logically groups and describes the set of all possible values that an attribute can have

271 Object State Set of values that object’s attributes have at a given time Can vary, although its OID remains the same To change the object’s state, change the values of the object’s attributes To change the object’s attribute values, send a message to the object Message will invoke a method

272 Messages and Methods Method:
Code that performs a specific operation on object’s data Protects data from direct and unauthorized access by other objects Used to change the object’s attribute values or to return the value of selected object attributes Represent real-world actions

273 Classes Collection of similar objects with shared structure (attributes) and behavior (methods) Class instance or object instance Each object in a class

274 Protocol An object’s public aspect
How it is known by other objects as well as end users Other objects communicate with the student object using any of these methods

275 Object Characteristics

276 Class Hierarchy Superclass Subclass Class lattice

277 Inheritance Ability of object to inherit the data structure and behavior of classes above it Single inheritance Class has one immediate superclass

278 Inheritance (Cont’d.)
Multiple inheritance Class has more than one immediate superclass

279 Method Overriding Method redefined at subclass level

280 Polymorphism Allows different objects to respond to same message in different ways
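A short Python sketch of the concepts above: attributes, methods, single inheritance, method overriding, and polymorphism. The Person and Student classes and their describe method are hypothetical examples, not taken from the course material.

```python
class Person:
    def __init__(self, name):
        self._name = name       # attribute (instance variable)

    def describe(self):         # method invoked by the "describe" message
        return f"Person {self._name}"


class Student(Person):          # single inheritance: one immediate superclass
    def describe(self):         # method overriding at the subclass level
        return f"Student {self._name}"


# Polymorphism: the same message produces different responses.
people = [Person("Ann"), Student("Bob")]
descriptions = [p.describe() for p in people]
print(descriptions)
```

Student inherits the name attribute from Person but answers the describe message in its own way.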

281 Abstract Data Types (ADT)
Describes a set of similar objects Differs from conventional data types Operations are user-defined Uses encapsulation Definitions needed for creation Name Data representation Abstract data type operations and constraints

282 Object Classification
Simple Only single-valued attributes No attributes refer to other object Composite At least one multivalued attribute Compound At least one attribute that references other object Hybrid Repeating group of attributes At least one refers to other object Associative object

283 OO vs. E-R Model Components

284 Class-Subclass Relationship

285 Interobject Relationships
Attribute-Class Link Object’s attribute references another object Relationship Representation Related classes enclosed in boxes Double line on right side indicates mandatory Connectivity indicated by labeling each box 1:M M:N M:N with an Intersection Class

286 1:1 and 1:M Relationships

287 Employee-Dependent Relationship

288 Representing the M:N Relationship

289 Representing the M:N Relationship with Associated Attributes

290 Representing the M:N Relationship with Intersection Class

291 Late and Early Binding
Late binding Data type of attribute not known until runtime Allows different instances of same class to contain different data types for same attribute Early binding Allows database to check data type at compilation or definition time

292 OODM vs. E-R Data Models Object, Entity, and Tuple
OODM object has behavior, inheritance, and encapsulation OO modeling more natural Class, Entity Set, and Table Class allows description of data and behavior Class allows abstract data types Encapsulation and Inheritance Object inherits properties of superclasses Encapsulation hides data representation and method

293 OODM vs. E-R Data Models (Cont’d)
Object ID Not supported in relational models Relationships OODM Interclass references Class hierarchy inheritance Relational models Value-based approach Access Relational models SQL OODM Navigational Set-oriented access

294 Example Assume the following business rules:
A course contains many sections, but each section has only one course A section is taught by one professor, but each professor may teach one or more different sections of one or more courses A section may contain many students, and each student is enrolled in many sections, but each section belongs to a different course. (Students may take many courses, but they cannot take many sections of the same course.) Each section is taught in one room, but each room may be used to teach several different sections of one or more courses A professor advises many students, but a student has only one advisor

295 Example (Cont’d) Identify and describe the main classes of objects
Modify your description in part 1 to include the use of abstract types such as Name, DOB, and Address Create the conceptual OO representations

296 More Example Using intersection class to represent the following relationship

297 OO Design Example Design OO conceptual representations for an engineering company, using the following requirements: A customer has a unique customer identifier. Other important attributes of each customer include name and address. They can request any number of work orders from the company. The company maintains a list of materials. The data about materials include a unique material identifier, a name and cost. A work order has a unique work order number, a creation date, a completion date, a work address and a set of (one or more) tasks. In addition, each work order has one optional supervising employee. Each employee has a unique number assigned by the company. Other important attributes of each employee include name and skill. Each work order also has a collection of materials. The same material can be used by any number of work orders. Material requirement includes material quantity. Each task has a unique task identifier, a task name, an hourly rate and estimated hours. Tasks are standardized across work orders so that the same task may be performed on many work orders. We have to keep record of actual hours of each task on a work order.

298 OO Influences on Relational Model
Extensibility of new user-defined (abstract) data types
Complex objects
Inheritance
Procedure calls (rules or triggers)
System-generated identifiers (OID surrogates)

300 Chapter 13 The Data Warehouse

301 Transaction Processing Versus Decision Support
Transaction processing allows organizations to conduct daily business in an efficient manner
Relies on the operational database
Decision support helps management provide medium-term and long-term direction for an organization

302 Decision Support System (DSS) Components

303 Operational vs. Decision Support Data
Operational data: relational, normalized database; optimized to support transactions; real-time updates
DSS data: snapshot of operational data; summarized; large amounts of data
From the data analyst's viewpoint, DSS data differ from operational data in timespan, granularity, and dimensionality
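The difference in granularity can be illustrated with a short sketch that rolls detailed operational rows up into summarized snapshots. The `transactions` data and the `summarize` helper are hypothetical and exist only for this illustration.

```python
from collections import defaultdict

# Hypothetical operational rows: (sale_date, product, amount).
transactions = [
    ("2024-01-03", "widget", 20.0),
    ("2024-01-17", "widget", 30.0),
    ("2024-02-05", "widget", 10.0),
    ("2024-02-09", "gadget", 45.0),
]


def summarize(rows, granularity):
    """Total the amounts per (period, product); granularity is 'month' or 'year'."""
    # An ISO date starts with YYYY (4 characters) or YYYY-MM (7 characters).
    width = {"year": 4, "month": 7}[granularity]
    totals = defaultdict(float)
    for sale_date, product, amount in rows:
        totals[(sale_date[:width], product)] += amount
    return dict(totals)


monthly = summarize(transactions, "month")
yearly = summarize(transactions, "year")
```

The operational rows record each sale as it happens, while `monthly` and `yearly` are the kind of summarized, coarser-grained data a decision support system would query.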

304 The DSS Database Requirements
Database schema: supports complex (non-normalized) data; extracts multidimensional time slices
Data extraction and filtering
End-user analytical interface
Database size: very large databases (VLDBs); contains redundant and duplicated data

305 Data Warehouse
Integrated: centralized; holds data retrieved from the entire organization
Subject-oriented: optimized to give answers to diverse questions; used by all functional areas
Time-variant: represents the flow of data through time; can include projected data
Non-volatile: data are never removed; always growing

306 Data Marts
Single-subject subset of a data warehouse
Provides decision support to a small group
Can be used as a test vehicle for exploring the potential benefits of data warehouses
Addresses local or departmental problems

307 Data Warehouse Versus Data Mart

308 Star Schema
Data-modeling technique that maps multidimensional decision support data into a relational database
Yields a model for multidimensional data analysis while preserving the relational structure of the operational DB
Four components: facts, dimensions, attributes, attribute hierarchies

309 Simple Star Schema

310 Slice and Dice View of Sales

311 Star Schema Representation
Facts and dimensions are represented by physical tables in the data warehouse DB
The fact table is related to each dimension table in a many-to-one (M:1) relationship
Fact and dimension tables are related by foreign keys
They are subject to the primary/foreign key constraints
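A minimal sketch of this representation, using Python's built-in `sqlite3` module, might look as follows. The table and column names (`sales_fact`, `time_dim`, `location_dim`, `product_dim`) and the sample rows are assumptions made for the illustration and are not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked

conn.executescript("""
CREATE TABLE time_dim     (time_id INTEGER PRIMARY KEY, month INTEGER, year INTEGER);
CREATE TABLE location_dim (location_id INTEGER PRIMARY KEY, city TEXT, region TEXT);
CREATE TABLE product_dim  (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);

-- Each fact row points to exactly one row in every dimension (M:1).
CREATE TABLE sales_fact (
    time_id     INTEGER NOT NULL REFERENCES time_dim(time_id),
    location_id INTEGER NOT NULL REFERENCES location_dim(location_id),
    product_id  INTEGER NOT NULL REFERENCES product_dim(product_id),
    units       INTEGER NOT NULL,
    amount      REAL    NOT NULL,
    PRIMARY KEY (time_id, location_id, product_id)
);
""")

conn.executemany("INSERT INTO time_dim VALUES (?, ?, ?)",
                 [(1, 1, 2024), (2, 2, 2024)])
conn.executemany("INSERT INTO location_dim VALUES (?, ?, ?)",
                 [(1, "Toronto", "East"), (2, "Vancouver", "West")])
conn.executemany("INSERT INTO product_dim VALUES (?, ?, ?)",
                 [(1, "Widget", "Hardware")])
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?)",
                 [(1, 1, 1, 10, 100.0), (1, 2, 1, 5, 50.0), (2, 1, 1, 7, 70.0)])

# Aggregate the facts along one dimension attribute.
sales_by_region = conn.execute("""
    SELECT l.region, SUM(f.amount)
    FROM sales_fact AS f
    JOIN location_dim AS l ON l.location_id = f.location_id
    GROUP BY l.region
    ORDER BY l.region
""").fetchall()

# A fact row that references a missing dimension row is rejected.
try:
    conn.execute("INSERT INTO sales_fact VALUES (99, 1, 1, 1, 1.0)")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True
```

The composite primary key of the fact table is built from the dimension keys, so each combination of time, location, and product identifies one fact row.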

312 Star Schema for Sales

313 Example
A Canadian financial organization is interested in building a data warehouse to analyze customers’ credit payments over time, the locations where the payments were made, customers, and types of credit cards. A customer may use the credit card to make a payment in different locations across the country and abroad. If a payment is made abroad, it can be made in the domestic currency of that location and then converted into Canadian dollars based on the currency rate. Time is described by Time_ID, day, month, quarter, and year. Location is described by Location_ID, the name of the organization billing the customer, the city and country where the organization is located, and the domestic currency. A credit card is described by credit card number, type of the credit account, and the customer’s credit rate. The customer’s credit rate depends on the type of the credit account. A customer is described by ID, name, address, and phone.

314 Performance-Improving Techniques for Star Schema
Normalization of dimensional tables
Multiple fact tables representing different aggregation levels
Denormalization of the fact tables
Table partitioning and replication

315 Normalization Example
Normalize the star schema that you developed for the Canadian financial organization in the earlier example into 3NF.
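As a partial illustration of what normalizing a dimension table involves, the sketch below splits one transitive dependency out of a Location dimension. It assumes, beyond what the example states, that the domestic currency is determined by the country. The sample rows and the `normalize_location` helper are hypothetical, and this is not a complete solution to the exercise.

```python
# Denormalized Location dimension: currency repeats for every row of a country.
location_dim = [
    {"location_id": 1, "org_name": "Maple Books", "city": "Toronto",
     "country": "Canada", "currency": "CAD"},
    {"location_id": 2, "org_name": "Harbour Cafe", "city": "Vancouver",
     "country": "Canada", "currency": "CAD"},
    {"location_id": 3, "org_name": "Liberty Diner", "city": "Boston",
     "country": "USA", "currency": "USD"},
]


def normalize_location(rows):
    """Move the assumed dependency country -> currency into its own table."""
    country = {}   # country name -> currency
    location = []  # remaining location attributes, keyed by location_id
    for row in rows:
        known = country.setdefault(row["country"], row["currency"])
        if known != row["currency"]:
            raise ValueError(f"{row['country']} maps to more than one currency")
        location.append({k: row[k]
                         for k in ("location_id", "org_name", "city", "country")})
    return location, country


location, country = normalize_location(location_dim)

# Joining the two tables back together reproduces the original dimension.
rejoined = [{**row, "currency": country[row["country"]]} for row in location]
```

After the split, each currency is stored once per country instead of once per location, at the cost of an extra join when the dimension is queried.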

316 More Examples
A supermarket chain is interested in building a data warehouse to analyze the sales of different products in different supermarkets at different times using different payment methods. Each supermarket is described by location_ID, city, country, and domestic currency. Time can be measured in time_ID, day, month, quarter, and year. Each product is described by product_ID, product_name, and vendor. Payment method is described by payment_ID and payment_type. Design a star schema for this problem and then normalize the star schema that you developed into 3NF.

317 Data Warehouse Implementation Road Map

318 Chapter 12 Distributed Database Management Systems

319 The Evolution of Distributed Database Management Systems
Distributed database management system (DDBMS): governs the storage and processing of logically related data over interconnected computer systems in which both data and processing functions are distributed among several sites

320 Distributed Database Environment

321 Database Systems: Levels of Data and Process Distribution

322 Single-Site Processing, Single-Site Data (SPSD)
All processing is done on a single CPU or host computer (mainframe, midrange, or PC)
All data are stored on the host computer’s local disk
Processing cannot be done on the end user’s side of the system
Typical of most mainframe and midrange computer DBMSs
The DBMS is located on the host computer, which is accessed by dumb terminals connected to it
Also typical of the first generation of single-user microcomputer databases

323 Single-Site Processing, Single-Site Data (Centralized)

324 Multiple-Site Processing, Single-Site Data (MPSD)
Multiple processes run on different computers sharing a single data repository
The MPSD scenario requires a network file server running conventional applications that are accessed through a LAN
Many multi-user accounting applications, running under a personal computer network, fit such a description

325 Multiple-Site Processing, Single-Site Data

326 Multiple-Site Processing, Multiple-Site Data (MPMD)
Fully distributed database management system with support for multiple data processors and transaction processors at multiple sites
Classified as either homogeneous or heterogeneous
Homogeneous DDBMSs: integrate only one type of centralized DBMS over a network

327 Multiple-Site Processing, Multiple-Site Data (MPMD) (Cont’d)
Heterogeneous DDBMSs: integrate different types of centralized DBMSs over a network
Fully heterogeneous DDBMSs: support different DBMSs that may even support different data models (relational, hierarchical, or network) running under different computer systems, such as mainframes and microcomputers

328 Distributed Database Design
Data fragmentation: how to partition the database into fragments
Data replication: which fragments to replicate
Data allocation: where to locate those fragments and replicas

329 Data Fragmentation
Breaks a single object into two or more segments or fragments
Each fragment can be stored at any site over a computer network
Information about data fragmentation is stored in the distributed data catalog (DDC), from which it is accessed by the transaction processor (TP) to process user requests

330 Data Fragmentation Strategies
Horizontal fragmentation: division of a relation into subsets (fragments) of tuples (rows)
Vertical fragmentation: division of a relation into attribute (column) subsets
Mixed fragmentation: combination of horizontal and vertical strategies
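The strategies can be sketched on a small in-memory relation. The `customers` rows and the two helper functions are hypothetical, and the vertical fragments repeat the key column (an assumption of this sketch) so that the original rows can be rebuilt with a join.

```python
customers = [
    {"cust_id": 1, "name": "Ann", "province": "ON", "balance": 120.0},
    {"cust_id": 2, "name": "Bob", "province": "BC", "balance": 0.0},
    {"cust_id": 3, "name": "Cy", "province": "ON", "balance": 75.5},
]


def horizontal_fragment(rows, predicate):
    """Keep whole rows that satisfy the predicate (a subset of tuples)."""
    return [row for row in rows if predicate(row)]


def vertical_fragment(rows, key, columns):
    """Keep the key plus a subset of columns from every row."""
    return [{c: row[c] for c in (key, *columns)} for row in rows]


# Horizontal: one fragment per province, e.g. stored at that province's site.
on_rows = horizontal_fragment(customers, lambda r: r["province"] == "ON")
bc_rows = horizontal_fragment(customers, lambda r: r["province"] == "BC")

# Vertical: contact data and billing data kept in separate fragments.
contact = vertical_fragment(customers, "cust_id", ["name", "province"])
billing = vertical_fragment(customers, "cust_id", ["balance"])

# Mixed: a vertical fragment of one horizontal fragment.
on_billing = vertical_fragment(on_rows, "cust_id", ["balance"])

# Reconstruction: union for horizontal fragments, key join for vertical ones.
from_horizontal = sorted(on_rows + bc_rows, key=lambda r: r["cust_id"])
billing_by_id = {r["cust_id"]: r for r in billing}
from_vertical = [{**c, **billing_by_id[c["cust_id"]]} for c in contact]
```

Both reconstructions return the original relation, which is the property a fragmentation scheme needs so that no data are lost when the database is split across sites.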

331 Data Replication
Storage of data copies at multiple sites served by a computer network
Fragment copies can be stored at several sites to serve specific information requirements
Can enhance data availability and response time
Can help to reduce communication and total query costs

332 Replication Scenarios
Fully replicated database: stores multiple copies of each database fragment at multiple sites; can be impractical due to the amount of overhead
Partially replicated database: stores multiple copies of some database fragments at multiple sites; most DDBMSs are able to handle the partially replicated database well
Unreplicated database: stores each database fragment at a single site; no duplicate database fragments
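The three scenarios can be told apart by counting how many sites hold each fragment. The sketch below follows the definitions as worded above; the `replication_scenario` function, the fragment names, and the site names are hypothetical.

```python
def replication_scenario(allocation):
    """Classify a mapping of fragment name -> collection of sites holding a copy."""
    copies = [len(set(sites)) for sites in allocation.values()]
    if not copies or min(copies) == 0:
        raise ValueError("every fragment must be stored at one or more sites")
    if max(copies) == 1:
        return "unreplicated"          # each fragment lives at a single site
    if min(copies) > 1:
        return "fully replicated"      # every fragment has multiple copies
    return "partially replicated"      # only some fragments have copies


full = replication_scenario({"F1": ["A", "B", "C"], "F2": ["A", "B", "C"]})
partial = replication_scenario({"F1": ["A", "B"], "F2": ["C"]})
none = replication_scenario({"F1": ["A"], "F2": ["B"]})
```

Counting copies per fragment is enough for classification; deciding which sites should hold them is the separate data allocation problem.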

333 Data Allocation
Deciding where to locate data
Allocation strategies:
Centralized data allocation: the entire database is stored at one site
Partitioned data allocation: the database is divided into several disjoint parts (fragments) and stored at several sites
Replicated data allocation: copies of one or more database fragments are stored at several sites
Data distribution over a computer network is achieved through data partitioning, data replication, or a combination of both

