SI654 Database Application Design

Instructor: Dragomir R. Radev
Winter 2005

Book information
Database Processing by David Kroenke (9th Edition, Prentice Hall, ISBN )
Managing and Using MySQL by Reese, Yarger, and King (O'Reilly, ISBN )
Optional reading: Database Management Systems by Ramakrishnan and Gehrke (McGraw-Hill, ISBN )
Optional reading: Data Mining by Han and Kamber (Morgan Kaufmann, ISBN )
Copyright © 2004

Syllabus - I
DK Ch. 1. Introduction to Database Processing
DK Ch. 2. Entity-relationship data modeling: tools and techniques
DK Ch. 3. Entity-relationship data modeling: process and examples
DK Ch. 4. The Relational Model and Normalization
DK Ch. 5. Database Design
READING The ERWin System
DK Ch. 6. Introduction to SQL
DK Ch. 7. Using SQL in applications
RYK Ch. 1 MySQL
DK Ch. 8. Database redesign

Syllabus - II
RYK Ch. 3 SQL according to MySQL
DK Ch. 9. Managing Multi-User Databases
RYK Ch. 7 Database Design
DK Ch. 10/11. Managing Databases with Oracle/SQL Server
DK Ch. 12 ODBC, OLE DB, ADO, and ASP
DK Ch. 13 XML and ADO.NET
DK Ch. 14 JDBC, Java Server pages, and MySQL
DK Ch. 15 Sharing enterprise data
READING XML and query languages for XML
READING Data Mining
DK App. A. Data Structures for Database Processing

Assignments
Assignment 1: Entity-Relationship Model, Relational Model, SQL
Assignment 2: Database design using ERWin and Oracle
Assignment 3: Database design using MySQL
Assignment 4: XML, Data Mining, and other advanced topics

Final project
Proposal
Database design
Progress report
Project
Final presentation

Grading
Four assignments: 40% (10% each)
Project + presentation: 30%
Final exam: 25%
Class participation: 5%

Policies
Class participation counts as 5% of the grade
Timely submission of assignments is important
Syllabus can be amended during the semester
Honors Code

Notes on programming
All students will do some programming as part of the assignments.
For the final project, teams will be formed so that they include students with diverse backgrounds.

Chapter 1 Introduction to Database Processing

Why Use A Database?
The purpose of a database is to help people and organizations keep track of things
Problems of using lists to store data
    Data inconsistencies
    Data privacy: the departments want to share some, but not all, of their data
Databases store data in single-theme tables
Tables are related through primary and foreign keys

Components of A Database System

Application Programs
Functions:
    Create and process forms
    Create and transmit queries
    Create and process reports
    Execute application logic
    Control application

DBMS
DBMS: Database Management System
Functions:
    Create database, tables, and supporting structures
    Read and update database data
    Maintain database structures
    Enforce rules
    Control concurrency
    Provide security
    Perform backup and recovery
Examples: Oracle, DB2, Microsoft Access, SQL Server

Database
A database is a self-describing collection of related records or tables
Components:
    User data
    Metadata: data about the structure of a database
    Indexes and related structures
    Stored procedures: program modules stored within the database
    Triggers: procedures that are executed when a particular data activity occurs
    Application metadata: data describing application elements such as forms and reports
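The self-describing property can be seen by asking a DBMS for its own metadata. The sketch below is an illustration only: it assumes SQLite through Python's sqlite3 module (not one of the DBMS products named in these slides), and the STUDENT table and StudentNameIdx index are hypothetical names.

```python
import sqlite3

# In-memory SQLite database; SQLite is an assumption made for this sketch.
conn = sqlite3.connect(":memory:")

# User data structures: a hypothetical table and an index on it.
conn.execute("CREATE TABLE STUDENT (SID INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("CREATE INDEX StudentNameIdx ON STUDENT (Name)")

# SQLite stores its metadata in a table of its own, sqlite_master,
# which can be queried like any other table.
metadata = conn.execute("SELECT type, name FROM sqlite_master").fetchall()
print(metadata)
```

Other products expose the same idea through their own catalog or data dictionary tables.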

Types of Database
Personal database
    1 user; < 10 MB
Workgroup database
    < 25 users; < 100 MB
Organizational database
    Hundreds to thousands of users
    > 1 trillion bytes, possibly several databases

Example: Organizational Database

Building a Database System
3 phases
    Requirements phase: a data model is developed
        A data model is a logical representation of the database structure
    Design phase: the data model is transformed into tables and relationships
    Implementation phase:
        Tables, relationships, and constraints are created
        Stored procedures and triggers are written
        The database is filled and systems are tested
The database and its applications will be modified (through these same three phases) to meet new requirements

Application Development
Application development proceeds in parallel with database development

History of Database Processing

Early Database Models
Before the mid-1960s, only sequential file processing using magnetic tape was possible
In the mid-1960s, disk storage enabled hierarchical and network databases
    IBM’s DL/I (Data Language One)
    CODASYL’s DBTG (Data Base Task Group) model → the basis of current DBMSs

The Relational Model
E.F. Codd introduced the relational model in 1970
DB2 from IBM was the first DBMS product based on the relational model
Other DBMS products based on the relational model were developed in the late 1980s
Today, DB2, Oracle, and SQL Server are the most prominent commercial DBMS products based on the relational model

Personal Computer DBMS
The advent of microcomputers increased the popularity of personal databases
Graphical user interfaces (GUIs) made them easy to use
Examples of early DBMS products: dBase, R:base, and Paradox

Object Oriented DBMS (OODBMS)
Object-oriented programming started in the mid-1980s
The goal of an OODBMS is to store object-oriented programming objects in a database without having to transform them into relational format
Object-relational DBMS products, such as Oracle 8i and 9i, allow both relational and object views of data on the same database
So far, OODBMS products have not been a commercial success, due to the high cost of relational to object-oriented transformation

Recent History
Success story of Microsoft Access
    Microsoft Office suite and Windows integration
    Easy-to-use and powerful personal DBMS
Internet database
XML and database integration

Chapter 2 Entity-Relationship Data Modeling: Tools and Techniques

Three Schema Model (cont.)
External schema or user view
    Representation of how users view the database
Conceptual schema
    A logical view of the database containing a description of all the data and relationships
    Independent of any particular means of storing the data
    One conceptual schema usually contains many different external schemas
Internal schema
    A representation of a conceptual schema as physically stored on a particular product
    A conceptual schema can be represented by many different internal schemas

E-R Model
The Entity-Relationship model is a set of concepts and graphical symbols that can be used to create conceptual schemas
Original E-R model by Peter Chen (1976)
Four versions
    Extended E-R model: the most widely used model
    Information Engineering (IE) by James Martin (1990)
    IDEF1X national standard by the National Institute of Standards and Technology
    Unified Modeling Language (UML) supporting object-oriented methodology

Entities
Something that can be identified and that the users want to track
An entity class is a collection of entities described by the entity format in that class
An entity instance is the representation of a particular entity
There are usually many instances of an entity in an entity class

Attributes
Description of the entity’s characteristics
All instances of a given entity class have the same attributes
Composite attribute: an attribute consisting of a group of attributes
Multi-value attribute: an attribute with more than one possible value

Identifiers
Identifiers are attributes that name, or identify, entity instances
The identifier of an entity instance consists of one or more of the entity’s attributes
An identifier may be either unique or non-unique
    Unique identifier: the value identifies one and only one entity instance
    Non-unique identifier: the value identifies a set of instances
Composite identifiers: identifiers that consist of two or more attributes

Relationships
Entities can be associated with one another in relationships
    Relationship classes: associations among entity classes
    Relationship instances: associations among entity instances
Relationships can have attributes
A relationship class can involve many entity classes
The degree of a relationship is the number of entity classes in the relationship

Example: Degree of the relationship
Relationships of degree 2 are very common and are often referred to as binary relationships

Recursive Relationship
Recursive relationships are relationships among entities of a single class

Cardinality
Maximum cardinality indicates the maximum number of entities that can be involved in a relationship
Minimum cardinality indicates whether an entity is required in a relationship or may be absent

Weak Entities
Weak entities are those that must logically depend on another entity
Weak entities cannot exist in the database unless another type of entity (strong entity) also exists in the database
ID-dependent entity: the identifier of one entity includes the identifier of another entity

Example: Weak Entities

Subtype Entities
A subtype entity is an entity that represents a special case of another entity, called the supertype
Sometimes called an IS-A relationship
Entities with an IS-A relationship should have the same identifier

Example: Subtype Entities

IDEF1X Standard
IDEF1X (Integrated Definition 1, Extended) was announced as a national standard in 1993
It defines entities, relationships, and attributes with more specific meanings
It changed some of the E-R graphical symbols
It includes the definition of domains, a component not present in the extended E-R model
Four relationship types
    Non-Identifying Connection Relationships
    Identifying Connection Relationships
    Non-Specific Relationships
    Categorization Relationships
Products supporting IDEF1X: ERWin, Visio, Design/2000

Non-Identifying Connection Relationships
Represented with a dashed line from a parent to a child entity
Default cardinality is 1:N with a mandatory parent and an optional child
    1 indicates exactly one child is required
    Z indicates zero or one children

Non-Identifying Connection Relationships

Identifying Connection Relationships
Same as ID-dependent relationships in the extended E-R model
The parent’s identifier is always part of the child’s identifier
Relationships are indicated with solid lines; child entities are shown with rounded corners (ID-dependent entities only)

Identifying Connection Relationships

Non-Specific Relationships
Simply a many-to-many relationship
Relationships are shown with a filled-in circle on each end of the solid relationship line
Minimum cardinalities cannot be set for a non-specific relationship

Non-Specific Relationships

Categorization Relationships
A relationship between a generic entity and another entity called a category entity
Called specialization of generalization/subtype relationships (IS-A relationships) in the extended E-R model
Within category clusters, category entities are mutually exclusive
Two types of category clusters:
    Complete: every possible type of category for the cluster is shown (denoted by two horizontal lines with a gap in-between)
    Incomplete: at least one category is missing (denoted by placing the category cluster circle on top of a single line, no gap between horizontal lines)

Example: Categorization Relationships

Example: IDEF1X Model With Relationship Names
Normally, a name consists of a verb or verb phrase expressed from the standpoint of the parent in the relationship, followed by a slash, and followed by the verb phrase expressed from the standpoint of the child.

Domains
A domain is a named set of values that an attribute can have
It can be a specific list of values or a pre-defined data characteristic, e.g. a character string of length less than 75
Domains reduce ambiguity in data modeling and are practically useful
Two types of domains
    Base domain: has a data type and possibly a value list or range definition
    Type domain: a subset of a base domain or a subset of another type domain

Example: Domain Hierarchy

UML-style E-R Diagrams
The Unified Modeling Language (UML) is a set of structures and techniques for modeling and designing object-oriented programs (OOP) and applications
The concepts of UML entities, relationships, and attributes are very similar to those of the extended E-R model
Several OOP constructs are added:
    <Persistent> indicates that the entity class exists in the database
    UML allows entity class attributes
    UML supports visibility of attributes and methods
    UML entities specify constraints and methods in the third segment of the entity classes
Currently, the object-oriented notation is of limited practical value

Example: UML
Each entity is represented by an entity class, which is shown as a rectangle with three segments. The top segment shows the name of the entity and other data that we will discuss. The second segment lists the names of the attributes in the entity, and the third documents constraints and lists methods (program procedures) that belong to the entity.
Relationships are shown with a line between two entities. Cardinalities are represented in the format x..y, where x is the minimum required and y is the maximum allowed.

UML: Weak Entities
A filled-in diamond is placed on the line to the parent of the weak entity (the entity on which the weak entity depends).
Figure 2-28(a) shows a weak entity that is not an ID-dependent entity. It is denoted by the expression <non-identifying> on the PATIENT-PRESCRIPTION relationship.
Figure 2-28(b) shows a weak entity that is ID-dependent. It is denoted with the label <identifying>.

UML: Subtypes

Chapter 3 Entity-Relationship Data Modeling: Process and Examples

A Data Modeling Process
Steps in the data modeling process
    Plan project
    Determine requirements
    Specify entities
    Specify relationships
    Determine identifiers
    Specify attributes
    Specify domains
    Validate model

Planning the Project
Obtaining project authorization and budget
Building the project team
Planning the team’s activities
Establishing tools, techniques, and standards for consistent results
Defining the project’s scope

Determining System Requirements
Sources for data modeling requirements
    User interviews and user activity observations
    Existing forms and reports
    New forms and reports
    Existing manual files
    Existing computer files/databases
    Formally defined interfaces (XML)
    Domain expertise
The result of the requirements determination will be a repository of notes, diagrams, forms, reports, files, etc., that can be used to develop the data model

Specifying Entities
An entity is something that the users want to track; something the users want to keep data about
Entities
    can be physical things or logical concepts
    are identifiable; you can tell one from another
    are things described by nouns, not characteristics described by adjectives

Specifying Relationships
Includes:
    Identity of the parent and child entities
    Relationship type
    Minimum and maximum cardinalities
    Name of the relationship
Two techniques:
    Examine whether a relationship exists between every combination of two entities
    Locate relationships from requirement documents
A combination of the two approaches may be used

Determining Identifiers
An identifier is an attribute or group of attributes that uniquely identifies an entity instance
If there is difficulty specifying an identifier for an entity, maybe:
    it should be part of a different entity
    it is a subtype or category of a common entity
    it needs one or more identifying relationships

Specifying Attributes and Domains
Find attributes on forms, reports, existing files, etc., and add them to entities
Determine whether a domain has already been defined for the attribute
    If so, the attribute is based upon that domain
    If not, a new domain is defined
Review the domains and make adjustments as necessary
Domain property inheritance: when the domain properties change, all the attribute properties change as well
Domains may be used to enforce data standards, promoting compatible data types and systems
Once all attributes have been specified, the model should be reviewed for missing entities

Validating Model
A data model is a model of humans’ models, not a model of reality
A data model is wrong if it does not accurately reflect the ways the users think about their world
Data models are validated through a series of reviews
    Normally, a team review is followed by user reviews
    The E-R model as well as prototypes of forms and reports may be used to communicate features of the data model to users

Creating Data Models From Forms and Reports
Example: Single entities

Example: Identifying Connection Relationships

Example: Repeating Groups

Example: Nested Groups

Example: Non-Identifying Connection Relationships

Example: Assignment Relationship

Example: Category Relationship

Example: University System

University System With Domain Names

Chapter 4 The Relational Model and Normalization

Relations
Relational DBMS products store data in the form of relations, a special type of table
A relation is a two-dimensional table that has the following characteristics
    Rows contain data about an entity
    Columns contain data about attributes of the entity
    Cells of the table hold a single value
    All entries in a column are of the same kind
    Each column has a unique name
    The order of the columns is unimportant
    The order of the rows is unimportant
    No two rows may be identical
Although not all tables are relations, the terms table and relation are normally used interchangeably
Table/row/column = file/record/field = relation/tuple/attribute

Example: Tables Not Relations

Types of Keys
A key is one or more columns of a relation that identifies a row
A unique key identifies a single row; a non-unique key identifies several rows
A composite key is a key that contains two or more attributes
A relation has one unique primary key and may also have additional unique keys called candidate keys
The primary key is used to
    Represent the table in relationships
    Organize table storage
    Generate indexes
Sometimes, relations are denoted by showing the name of the relation followed by the columns of the relation in parentheses. The primary key of the relation is underlined.

Functional Dependencies
A functional dependency occurs when the value of one (set of) attribute(s) determines the value of a second (set of) attribute(s)
The attribute on the left side of the functional dependency is called the determinant
    SID → DormName, Fee
    (CustomerNumber, ItemNumber, Quantity) → Price
While a primary key is always a determinant, a determinant is not necessarily a primary key
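A functional dependency can be tested against sample rows. The following is a minimal sketch in Python, not taken from the textbook: the holds function and the sample dormitory data are hypothetical. Sample data can only refute a dependency; whether it truly holds is a statement about the users' business rules.

```python
def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent is not violated by rows.

    rows is a list of dicts; determinant and dependent are lists of
    attribute names.
    """
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in determinant)
        right = tuple(row[a] for a in dependent)
        # The same determinant value must always map to the same
        # dependent value.
        if seen.setdefault(left, right) != right:
            return False
    return True


# Hypothetical sample rows for the SID -> DormName, Fee example.
students = [
    {"SID": 100, "DormName": "Randolph", "Fee": 3200},
    {"SID": 150, "DormName": "Ingersoll", "Fee": 3100},
    {"SID": 200, "DormName": "Randolph", "Fee": 3200},
]

sid_determines_dorm = holds(students, ["SID"], ["DormName", "Fee"])
dorm_determines_sid = holds(students, ["DormName"], ["SID"])
print(sid_determines_dorm, dorm_determines_sid)
```

In this sample, SID determines DormName and Fee, but DormName does not determine SID because two students live in the same dorm.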

Normalization
Normalization eliminates modification anomalies
    Deletion anomaly: deletion of a row loses information about two or more entities
    Insertion anomaly: insertion of a fact in one entity cannot be done until a fact about another entity is added
Anomalies can be removed by splitting the relation into two or more relations, each with a different, single theme
However, breaking up a relation may create referential integrity constraints
Normalization works through classes of relations called normal forms
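A deletion anomaly and its removal by splitting can be reproduced in a few statements. This sketch assumes SQLite via Python's sqlite3 module; the ACTIVITY, STU_ACT, and ACT_COST tables and their rows are hypothetical illustrations, not part of the course material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One relation with two themes: who does what, and what each activity costs.
conn.execute(
    "CREATE TABLE ACTIVITY (SID INTEGER PRIMARY KEY, Activity TEXT, Fee INTEGER)"
)
conn.executemany(
    "INSERT INTO ACTIVITY VALUES (?, ?, ?)",
    [(100, "Skiing", 200), (150, "Swimming", 50), (200, "Swimming", 50)],
)

# Deleting the only skier also deletes the fact that skiing costs 200.
conn.execute("DELETE FROM ACTIVITY WHERE SID = 100")
fee_after_delete = conn.execute(
    "SELECT Fee FROM ACTIVITY WHERE Activity = 'Skiing'"
).fetchall()

# Split into two single-theme relations.
conn.execute("CREATE TABLE ACT_COST (Activity TEXT PRIMARY KEY, Fee INTEGER)")
conn.execute("CREATE TABLE STU_ACT (SID INTEGER PRIMARY KEY, Activity TEXT)")
conn.execute("INSERT INTO ACT_COST VALUES ('Skiing', 200)")
conn.execute("INSERT INTO STU_ACT VALUES (100, 'Skiing')")

# Now the student can be removed without losing the fee.
conn.execute("DELETE FROM STU_ACT WHERE SID = 100")
fee_in_split = conn.execute(
    "SELECT Fee FROM ACT_COST WHERE Activity = 'Skiing'"
).fetchone()[0]
print(fee_after_delete, fee_in_split)
```

The split also removes the insertion anomaly: a new activity and its fee can be added to ACT_COST before any student enrolls.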

Relationship of Normal Forms

Normal Forms
Any table of data is in 1NF if it meets the definition of a relation
A relation is in 2NF if all its non-key attributes are dependent on all of the key (no partial dependencies)
    If a relation has a single attribute key, it is automatically in 2NF
A relation is in 3NF if it is in 2NF and has no transitive dependencies
A relation is in BCNF if every determinant is a candidate key
A relation is in fourth normal form if it is in BCNF and has no multi-value dependencies
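The BCNF condition can be checked mechanically once the functional dependencies are written down. The sketch below is one possible way to do it, using the standard attribute-closure computation; the function names and the ACTIVITY example are hypothetical. It tests whether each determinant determines every attribute of the relation, which is the usual operational form of "every determinant is a candidate key".

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under fds.

    fds is a list of (left, right) pairs of attribute-name tuples.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            if set(left) <= result and not set(right) <= result:
                result |= set(right)
                changed = True
    return result


def is_bcnf(all_attrs, fds):
    """True if the determinant of every non-trivial FD determines all attributes."""
    return all(
        closure(left, fds) == set(all_attrs)
        for left, right in fds
        if not set(right) <= set(left)
    )


# Hypothetical relation ACTIVITY (SID, Activity, Fee) with a transitive
# dependency: SID -> Activity and Activity -> Fee.
activity_fds = [(("SID",), ("Activity",)), (("Activity",), ("Fee",))]
activity_ok = is_bcnf({"SID", "Activity", "Fee"}, activity_fds)

# After splitting off (Activity, Fee), each relation has a single determinant.
stu_act_ok = is_bcnf({"SID", "Activity"}, [(("SID",), ("Activity",))])
act_cost_ok = is_bcnf({"Activity", "Fee"}, [(("Activity",), ("Fee",))])
print(activity_ok, stu_act_ok, act_cost_ok)
```

Here Activity is a determinant that does not determine SID, so the unsplit relation fails the check, while both relations produced by the split pass.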

DK/NF
First published in 1981 by Fagin
DK/NF has no modification anomalies, so no higher normal form is needed
A relation is in DK/NF if every constraint on the relation is a logical consequence of the definition of keys and domains

The Synthesis of Relations
Given a set of attributes with certain functional dependencies, what relations should we form?
Example: A and B are two attributes
    If A → B and B → A
        A and B have a one-to-one attribute relationship
    If A → B, but not B → A
        A and B have a many-to-one attribute relationship
    If neither A → B nor B → A
        A and B have a many-to-many attribute relationship

Types of Attribute Relationship

One-to-One Attribute Relationships
Attributes that have a one-to-one relationship must occur together in at least one relation
Call the relation R and the attributes A and B:
    Either A or B must be the key of R
    An attribute can be added to R if it is functionally determined by A or B
    An attribute that is not functionally determined by A or B cannot be added to R
    A and B must occur together in R, but should not occur together in other relations
    Either A or B should be consistently used to represent the pair in relations other than R

Many-to-One Attribute Relationships
Attributes that have a many-to-one relationship can exist in a relation together
Assume C determines D in relation S
    C must be the key of S
    An attribute can be added to S if it is determined by C
    An attribute that is not determined by C cannot be added to S

Many-to-Many Attribute Relationships
Attributes that have a many-to-many relationship can exist in a relation together
Assume attributes E and F reside together in relation T
    The key of T must be (E, F)
    An attribute can be added to T if it is determined by the combination (E, F)
    An attribute may not be added to T if it is not determined by the combination (E, F)
    If adding a new attribute, G, expands the key to (E, F, G), then the theme of the relation has been changed
        Either G does not belong in T or the name of T must be changed to reflect the new theme

De-normalized Designs
When a normalized design is unnatural, awkward, or results in unacceptable performance, a de-normalized design is preferred
Example
    Normalized relations
        CUSTOMER (CustNumber, CustName, Zip)
        CODES (Zip, City, State)
    De-normalized relation
        CUSTOMER (CustNumber, CustName, City, State, Zip)
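The trade-off in the example is that the normalized design needs a join to answer what the de-normalized design stores in one row. A minimal sketch, assuming SQLite via Python's sqlite3 module and hypothetical sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CODES (Zip TEXT PRIMARY KEY, City TEXT, State TEXT);
CREATE TABLE CUSTOMER (
    CustNumber INTEGER PRIMARY KEY,
    CustName   TEXT,
    Zip        TEXT REFERENCES CODES (Zip)
);
-- Hypothetical sample data.
INSERT INTO CODES VALUES ('48109', 'Ann Arbor', 'MI');
INSERT INTO CUSTOMER VALUES (1, 'Alice', '48109');
""")

# The join rebuilds the row that the de-normalized CUSTOMER would store.
denormalized_row = conn.execute("""
    SELECT CustNumber, CustName, City, State, CUSTOMER.Zip
    FROM CUSTOMER JOIN CODES ON CUSTOMER.Zip = CODES.Zip
""").fetchone()
print(denormalized_row)
```

Storing City and State directly in CUSTOMER avoids this join on every address lookup, at the cost of repeating the city and state for every customer in the same zip code.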

Chapter 5 Database Design

Elements of Database Design

The Database Design Process
Create tables and columns from entities and attributes
Select primary keys
Represent relationships
Specify constraints
Re-examine normalization criteria

Transforming an Entity to a Table
To transform an entity-relationship model into a relational database design, each entity is represented as a table. All attributes of the entity become columns of that table. By default, the identifier of the entity will become the primary key of the new table.

Selecting the Primary Key
An ideal primary key is short, numeric, and seldom changing
If there is more than one candidate key (alternate identifier), they should be evaluated and the best one chosen as the table’s primary key
If the entity has no identifier, an attribute needs to be selected as the identifier
In some situations, a surrogate key should be defined

Surrogate Keys
A surrogate key is a unique, DBMS-supplied identifier used as the primary key of a relation
The values of a surrogate key have no meaning to the users and are normally hidden on forms and reports
The DBMS does not allow the value of a surrogate key to be changed
Disadvantages:
    Foreign keys that are based on surrogate keys have no meaning to the users
    When data shared among different databases contain the same ID, merging those tables might yield unexpected results
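As an illustration of a DBMS-supplied key, the sketch below assumes SQLite via Python's sqlite3 module, where an INTEGER PRIMARY KEY AUTOINCREMENT column plays the role of the surrogate key. The CUSTOMER table is hypothetical. Note that SQLite itself does not block later updates to such a column, so the "cannot be changed" rule would need to be enforced separately in that product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE CUSTOMER (
        CustomerID INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        Name       TEXT NOT NULL
    )
""")

# Two customers with the same name still get distinct, meaningless IDs,
# because the application never supplies CustomerID itself.
first_id = conn.execute(
    "INSERT INTO CUSTOMER (Name) VALUES ('Jones')"
).lastrowid
second_id = conn.execute(
    "INSERT INTO CUSTOMER (Name) VALUES ('Jones')"
).lastrowid
print(first_id, second_id)
```

The merging disadvantage on the slide follows directly: a second database built the same way would hand out the same ID values to unrelated customers.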

Example: Surrogate Keys

Representing Relationships
Relationships are expressed by placing the primary key of one table into a second table
The new column in the second table is referred to as a foreign key
Three principles of relationship representation
    Preservation of referential integrity constraints
    Specification of referential integrity actions
    Representation of minimum cardinality

Rules for Referential Integrity Constraints

Specifying Referential Integrity Actions
If the default referential integrity constraint is too strong, overrides of the default referential integrity enforcement can be defined during database design
The policy will be programmed into triggers during implementation
Two referential integrity overrides
    Cascading updates automatically change the value of the foreign key in all related child rows to the new value
    Cascading deletions automatically delete all related child rows
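The two overrides can be observed directly. This sketch assumes SQLite via Python's sqlite3 module and uses declarative ON UPDATE CASCADE and ON DELETE CASCADE clauses rather than hand-written triggers, which is a substitution made for brevity; the sample rows are hypothetical. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
CREATE TABLE EMPLOYEE (EmployeeNumber INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE ASSIGNMENT (
    ProjectID   INTEGER,
    EmployeeNum INTEGER,
    PRIMARY KEY (ProjectID, EmployeeNum),
    FOREIGN KEY (EmployeeNum) REFERENCES EMPLOYEE (EmployeeNumber)
        ON UPDATE CASCADE
        ON DELETE CASCADE
);
INSERT INTO EMPLOYEE VALUES (1, 'Smith');
INSERT INTO ASSIGNMENT VALUES (10, 1);
INSERT INTO ASSIGNMENT VALUES (11, 1);
""")

# Cascading update: child foreign keys follow the parent's new key value.
conn.execute("UPDATE EMPLOYEE SET EmployeeNumber = 2 WHERE EmployeeNumber = 1")
after_update = [
    row[0]
    for row in conn.execute(
        "SELECT EmployeeNum FROM ASSIGNMENT ORDER BY ProjectID"
    )
]

# Cascading delete: removing the parent removes its child rows.
conn.execute("DELETE FROM EMPLOYEE WHERE EmployeeNumber = 2")
after_delete = conn.execute("SELECT COUNT(*) FROM ASSIGNMENT").fetchone()[0]
print(after_update, after_delete)
```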

Enforcing Minimum Cardinality
If the minimum cardinality on the child is one, at least one child row must be connected to the parent
A required parent can be specified by making the foreign key value not null
A required child can be represented by creating update and delete referential integrity actions on the child and insert referential integrity actions on the parent
Such referential integrity actions must be declared during database design, and trigger code must be written during implementation
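The required-parent case is the simpler of the two and can be declared without triggers. A minimal sketch, assuming SQLite via Python's sqlite3 module; the DEPARTMENT and EMPLOYEE tables and the try_insert helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.executescript("""
CREATE TABLE DEPARTMENT (DeptID INTEGER PRIMARY KEY);
CREATE TABLE EMPLOYEE (
    EmpID  INTEGER PRIMARY KEY,
    -- NOT NULL makes the parent required; REFERENCES makes it a real parent.
    DeptID INTEGER NOT NULL REFERENCES DEPARTMENT (DeptID)
);
INSERT INTO DEPARTMENT VALUES (1);
""")


def try_insert(emp_id, dept_id):
    """Return True if the EMPLOYEE row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO EMPLOYEE VALUES (?, ?)", (emp_id, dept_id))
        return True
    except sqlite3.IntegrityError:
        return False


with_parent = try_insert(1, 1)        # valid parent
without_parent = try_insert(2, None)  # violates NOT NULL
unknown_parent = try_insert(3, 99)    # violates the foreign key
print(with_parent, without_parent, unknown_parent)
```

A required child has no equally direct declaration, which is why the slide assigns it to referential integrity actions and trigger code.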

Representing ID-Dependent Relationships
To represent ID-dependent relationships, the primary key of the parent relation is added to the child relation
The new foreign key attribute becomes part of the child’s composite primary key
Referential integrity actions should be carefully determined
    For cascading updates, data values are updated to keep child rows consistent with parent rows
    If the entity represents multi-value attributes, cascading deletions are appropriate
    Check user requirements when designing more complex situations

Example: ID-Dependent Relationship

Example: Cascading Deletion

Representing Relationship Using Surrogate Keys
If the parent in an ID-dependent relationship has a surrogate key as its primary key, but the child has a data key, use the parent’s surrogate key as a primary key
A mixture of a surrogate key with a data key does not create the best design, as the composite key will have no meaning to the users
Therefore, whenever any parent of an ID-dependent relationship has a surrogate key, the child should have a surrogate key as well
By using surrogate keys in the child table, the relationship type changes to a 1:N non-identifying relationship

Representing 1:1 and 1:N Relationships
IDEF1X refers to 1:1 and 1:N as non-identifying connection relationships
General rule: the key of a parent table is always placed into the child
    For a 1:1 relationship, either entity could be considered the parent or the child
    For a 1:N relationship, the parent entity is always the entity on the one side

Example: 1:1 Relationship

Example: 1:N Relationship

Representing N:M Relationships
IDEF1X refers to N:M relationships as non-specific relationships
N:M relationships need to be converted into two ID-dependent relationships by defining an intersection table
Two referential integrity constraints will be created
    The minimum cardinality from the child to the parent is always one
    The minimum cardinality from the parent to the intersection table depends on the system requirements
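The intersection-table conversion can be sketched as follows, assuming SQLite via Python's sqlite3 module. STUDENT, CLASS, and ENROLLMENT and their rows are hypothetical; the composite primary key of ENROLLMENT is made of the two parents' keys, which is what makes it ID-dependent on both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE STUDENT (SID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE CLASS (ClassID INTEGER PRIMARY KEY, Title TEXT);

-- Intersection table: one row per (student, class) pair.
CREATE TABLE ENROLLMENT (
    SID     INTEGER NOT NULL REFERENCES STUDENT (SID),
    ClassID INTEGER NOT NULL REFERENCES CLASS (ClassID),
    PRIMARY KEY (SID, ClassID)
);

INSERT INTO STUDENT VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO CLASS VALUES (10, 'Databases'), (20, 'Data Mining');
INSERT INTO ENROLLMENT VALUES (1, 10), (1, 20), (2, 10);
""")

# One student, many classes.
classes_for_ann = [
    row[0]
    for row in conn.execute("""
        SELECT Title FROM CLASS
        JOIN ENROLLMENT ON CLASS.ClassID = ENROLLMENT.ClassID
        WHERE ENROLLMENT.SID = 1
        ORDER BY CLASS.ClassID
    """)
]

# One class, many students.
students_in_databases = [
    row[0]
    for row in conn.execute("""
        SELECT Name FROM STUDENT
        JOIN ENROLLMENT ON STUDENT.SID = ENROLLMENT.SID
        WHERE ENROLLMENT.ClassID = 10
        ORDER BY STUDENT.SID
    """)
]
print(classes_for_ann, students_in_databases)
```

The NOT NULL foreign keys reflect the rule that the minimum cardinality from the intersection row to each parent is one.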

Example: N:M Relationship

N:M Relationships Suggesting Missing Entities
According to IDEF1X, an N:M relationship suggests a possible missing entity
If there is a missing entity, that entity will be ID-dependent on both of its parents
If there is no missing entity, create the connecting entity with no non-key attributes
This approach is similar to the representation of an N:M relationship in the extended E-R model using an intersection table

Example: Missing Entity

Representing Subtype Relationships
Called subtypes in the extended E-R model and categories in the IDEF1X model
The primary key of the supertype (or generic) entity is placed into the subtype (or category) entity
Category entities in IDEF1X are mutually exclusive within a category cluster
For complete categories, the generic entity must have exactly one category entity in that cluster
These constraints are enforced by properly specifying referential integrity actions

Example: Subtype Relationship

Representing Weak Entities
Weak entities logically depend on the existence of another entity in the database
Representing these entities is the same as modeling 1:1 or 1:N relationships
Referential integrity actions need to be specified to ensure that
    When the parent is deleted, the weak entity is deleted as well
    New weak entities have a parent with which to connect

Example: Weak, Non ID-Dependent Relationships

Example: Nested ID-Dependent Relationships

Example: University System

Representing Recursive Relationships
A recursive relationship is a relationship among entities of the same class
For 1:1 and 1:N recursive relationships, add a foreign key to the relation that represents the entity
For N:M recursive relationships, add a new intersection table that represents the N:M relationship
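For the 1:N recursive case, the foreign key points back at the same table. A minimal sketch, assuming SQLite via Python's sqlite3 module; the EMPLOYEE table, its ManagerID column, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE EMPLOYEE (
    EmpID     INTEGER PRIMARY KEY,
    Name      TEXT,
    -- Foreign key into the same table; NULL for an employee with no manager.
    ManagerID INTEGER REFERENCES EMPLOYEE (EmpID)
);
INSERT INTO EMPLOYEE VALUES (1, 'Lee', NULL);
INSERT INTO EMPLOYEE VALUES (2, 'Kim', 1);
INSERT INTO EMPLOYEE VALUES (3, 'Ray', 1);
""")

# A self-join pairs each employee with the row of his or her manager.
reports = conn.execute("""
    SELECT e.Name, m.Name
    FROM EMPLOYEE AS e JOIN EMPLOYEE AS m ON e.ManagerID = m.EmpID
    ORDER BY e.EmpID
""").fetchall()
print(reports)
```

An N:M recursive relationship would instead use an intersection table whose two foreign keys both reference EMPLOYEE.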

Example: 1:1 Recursive Relationships

Example: 1:N Recursive Relationships

Example: M:N Recursive Relationships

Representing Ternary and Higher-Order Relationships
Ternary and higher-order relationships can be treated as combinations of binary relationships
There are three types of binary constraints: MUST, MUST NOT, and MUST COVER
    MUST NOT constraint: the binary relationship indicates combinations that are not allowed to occur in the ternary relationship
    MUST COVER constraint: the binary relationship indicates all combinations that must appear in the ternary relationship
Because none of these constraints can be represented in the relational design, they must be documented as business rules and enforced in application programs or triggers

Null values
A null value is an attribute value that has not been supplied
Null values are ambiguous as they can mean
    The value is unknown
    The value is inappropriate
    The value is known to be blank
Inappropriate nulls can be avoided by
    Defining subtype or category entities
    Forcing attribute values through the use of not null
    Supplying initial values
Ignore nulls if the ambiguity is not a problem to the users

Chapter 6 Introduction to Structured Query Language (SQL)

Introduction Structured Query Language (SQL) is a data sublanguage that has constructs for defining and processing a database It can be Used stand-alone within a DBMS command Embedded in triggers and stored procedures Used in scripting or programming languages Copyright © 2004

SQL-92
SQL was developed by IBM in the late 1970s. SQL-92 was endorsed as a national standard by ANSI in 1992. SQL3 incorporates some object-oriented concepts but has not gained acceptance in industry. Data Definition Language (DDL) is used to define database structures. Data Manipulation Language (DML) is used to query and update data. Each SQL statement is terminated with a semicolon.

CREATE TABLE
The CREATE TABLE statement is used for creating relations. Each column is described with three parts: column name, data type, and optional constraints.
Example:
CREATE TABLE PROJECT (
    ProjectID   Integer       Primary Key,
    Name        Char(25)      Unique Not Null,
    Department  VarChar(100)  Null,
    MaxHours    Numeric(6,1)  Default 100);

Data Types
Standard data types: Char for fixed-length character data; VarChar for variable-length character data (it requires more processing than Char); Integer for whole numbers; Numeric for decimal numbers. There are many more data types in the SQL-92 standard.

Constraints
Constraints can be defined within the CREATE TABLE statement, or they can be added to the table after it is created using the ALTER TABLE statement. Five types of constraints: PRIMARY KEY (may not have null values), UNIQUE (may have null values), NULL/NOT NULL, FOREIGN KEY, and CHECK.

ALTER Statement ALTER statement changes table structure, properties, or constraints after it has been created Example ALTER TABLE ASSIGNMENT ADD CONSTRAINT EmployeeFK FOREIGN KEY (EmployeeNum) REFERENCES EMPLOYEE (EmployeeNumber) ON UPDATE CASCADE ON DELETE NO ACTION; Copyright © 2004

DROP Statements
The DROP TABLE statement removes a table and its data from the database. A table cannot be dropped while foreign keys in other tables reference it; use ALTER TABLE DROP CONSTRAINT to remove those integrity constraints from the other tables first.
Examples:
DROP TABLE CUSTOMER;
ALTER TABLE ASSIGNMENT DROP CONSTRAINT ProjectFK;

WHERE Clause Conditions
Values for Char and VarChar columns must be enclosed in single quotes; values for Integer and Numeric columns are not quoted. AND may be used for compound conditions. IN and NOT IN indicate 'match any' and 'match none' of a set of values, respectively. The wildcards _ and % can be used with LIKE to specify a single unknown character or any number of unknown characters, respectively. IS NULL can be used to test for null values.

Example: SELECT Statement
SELECT Name, Department, MaxHours FROM PROJECT; Insert Figure 6-2 (PROJECT Table only) Copyright © 2004

Example: SELECT DISTINCT
SELECT DISTINCT Department FROM PROJECT; Insert Figure 6-2 (PROJECT Table only) Copyright © 2004

Example: SELECT Statement
SELECT * FROM PROJECT WHERE Department = 'Finance' AND MaxHours > 100;
[Figure 6-2: PROJECT table]

Example: IN/NOT IN
SELECT Name, Phone, Department FROM EMPLOYEE WHERE Department IN ('Accounting', 'Finance', 'Marketing');
SELECT Name, Phone, Department FROM EMPLOYEE WHERE Department NOT IN ('Accounting', 'Finance', 'Marketing');
[Figure 6-2: EMPLOYEE table]

Example: BETWEEN
SELECT Name, Department FROM EMPLOYEE WHERE EmployeeNumber BETWEEN 200 AND 500;
Or, equivalently:
SELECT Name, Department FROM EMPLOYEE WHERE EmployeeNumber >= 200 AND EmployeeNumber <= 500;
[Figure 6-2: EMPLOYEE table]

Example: LIKE
SELECT * FROM EMPLOYEE WHERE Phone LIKE '285-____';
[Figure 6-2: EMPLOYEE table]

Example: IS NULL SELECT Name, Department FROM EMPLOYEE
WHERE Phone IS NULL; Insert Figure 6-2 (EMPLOYEE Table only) Copyright © 2004

Sorting the Results ORDER BY phrase can be used to sort rows from SELECT statement SELECT Name, Department FROM EMPLOYEE ORDER BY Department; Two or more columns may be used for sorting purposes ORDER BY Department DESC, Name ASC; Copyright © 2004

Built-in Functions
Five built-in functions for the SELECT statement: COUNT counts the number of rows in the result; SUM totals the values in a numeric column; AVG calculates an average value; MAX retrieves a maximum value; MIN retrieves a minimum value. The result is a single number (a relation with a single row and a single column). Column names cannot be mixed with built-in functions unless GROUP BY is used. Built-in functions cannot be used in WHERE clauses.

Example: Built-in Functions
SELECT COUNT (DISTINCT Department) FROM PROJECT; SELECT MIN(MaxHours), MAX(MaxHours), SUM(MaxHours) FROM PROJECT WHERE ProjectID < 1500; Copyright © 2004

Built-in Functions and Grouping
GROUP BY allows a column and a built-in function to be used together. GROUP BY collects the rows that have the same value of the named column into groups and applies the built-in function to each group. The WHERE condition is applied before the GROUP BY phrase; the HAVING condition is applied to the resulting groups.
Example:
SELECT Department, COUNT(*) FROM EMPLOYEE WHERE EmployeeNumber < 600 GROUP BY Department HAVING COUNT(*) > 1;
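The order in which WHERE, GROUP BY, and HAVING take effect can be checked on a few rows. The sketch below runs the slide's query in SQLite through Python's sqlite3 module; the six EMPLOYEE rows are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (
        EmployeeNumber INTEGER PRIMARY KEY, Name TEXT, Department TEXT);
    INSERT INTO EMPLOYEE VALUES
        (100, 'A', 'Accounting'), (200, 'B', 'Accounting'),
        (300, 'C', 'Finance'),    (400, 'D', 'Marketing'),
        (500, 'E', 'Marketing'),  (700, 'F', 'Finance');
""")

rows = conn.execute("""
    SELECT Department, COUNT(*)
    FROM EMPLOYEE
    WHERE EmployeeNumber < 600   -- 1. filter individual rows
    GROUP BY Department          -- 2. form one group per department
    HAVING COUNT(*) > 1          -- 3. filter whole groups
    ORDER BY Department
""").fetchall()
```

Finance has two employees in the table, but employee 700 is removed by the WHERE clause before grouping, so the Finance group has only one row and is dropped by HAVING.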

Querying Multiple Tables
Multiple tables can be queried by using either subqueries or joins If all of the result data comes from a single table, subqueries can be used If results come from two or more tables, joins must be used Joins cannot substitute for correlated subqueries nor for queries that involve EXISTS and NOT EXISTS Copyright © 2004

Subqueries Subqueries can be extended to include many levels Example
SELECT DISTINCT Name FROM EMPLOYEE WHERE EmployeeNumber IN (SELECT EmployeeNum FROM ASSIGNMENT WHERE HoursWorked > 40 AND ProjectID IN (SELECT ProjectID FROM PROJECT WHERE Department = ‘Accounting’)); Copyright © 2004

Joins The basic idea of a join is to form a new relation by connecting the contents of two or more other relations This joined table can be processed like any other table Example SELECT PROJECT.Name, HoursWorked, EMPLOYEE.Name FROM PROJECT, ASSIGNMENT, EMPLOYEE WHERE PROJECT.ProjectID = ASSIGNMENT.ProjectID AND EMPLOYEE.EmployeeNumber = ASSIGNMENT.EmployeeNum; Copyright © 2004

Alternate Join Syntax SQL-92’s alternative join syntax substitutes the words JOIN and ON for WHERE Using aliases for table names improves the readability of a join Example: alias E is assigned to the EMPLOYEE table SELECT P.Name, HoursWorked, E.Name FROM PROJECT P JOIN ASSIGNMENT A ON P.ProjectID = A.ProjectID JOIN EMPLOYEE E ON A.EmployeeNum = E.EmployeeNumber; Copyright © 2004

Outer Joins Outer joins can be used to ensure that all rows from a table appear in the result Left (right) outer join: every row on the table on the left (right) hand side is included in the results even though the row may not have a match Outer joins can be nested Copyright © 2004

Example: Outer Join
Left outer join:
SELECT Name, HoursWorked FROM PROJECT LEFT JOIN ASSIGNMENT ON PROJECT.ProjectID = ASSIGNMENT.ProjectID;
Nested outer join:
SELECT PROJECT.Name, HoursWorked, EMPLOYEE.Name FROM ((PROJECT LEFT JOIN ASSIGNMENT ON PROJECT.ProjectID = ASSIGNMENT.ProjectID) LEFT JOIN EMPLOYEE ON EMPLOYEE.EmployeeNumber = ASSIGNMENT.EmployeeNum);
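The difference between an inner join and a left outer join is easiest to see when one project has no assignments. The following sketch assumes SQLite via Python's sqlite3 module; the two projects and the single assignment are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PROJECT (ProjectID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE ASSIGNMENT (
        ProjectID INTEGER, EmployeeNum INTEGER, HoursWorked REAL);
    INSERT INTO PROJECT VALUES
        (1000, 'Q3 Portfolio Analysis'), (1200, 'Q3 Tax Prep');
    INSERT INTO ASSIGNMENT VALUES (1000, 100, 17.5);
""")

# Inner join: project 1200 has no assignment, so it disappears.
inner_rows = conn.execute("""
    SELECT Name, HoursWorked
    FROM PROJECT JOIN ASSIGNMENT
         ON PROJECT.ProjectID = ASSIGNMENT.ProjectID
    ORDER BY PROJECT.ProjectID
""").fetchall()

# Left outer join: every PROJECT row appears; missing values are null.
outer_rows = conn.execute("""
    SELECT Name, HoursWorked
    FROM PROJECT LEFT JOIN ASSIGNMENT
         ON PROJECT.ProjectID = ASSIGNMENT.ProjectID
    ORDER BY PROJECT.ProjectID
""").fetchall()
```

Python's sqlite3 module returns SQL nulls as None, which is how the unmatched project shows up in outer_rows.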

INSERT INTO Statement
The order of the column names must match the order of the values. Values for all NOT NULL columns must be provided. No value needs to be provided for a surrogate primary key. A SELECT statement can supply the values for bulk inserts from a second table.
Examples:
INSERT INTO PROJECT VALUES (1600, 'Q4 Tax Prep', 'Accounting', 100);
INSERT INTO PROJECT (Name, ProjectID) VALUES ('Q1+ Tax Prep', 1700);

UPDATE Statement
The UPDATE statement is used to modify values of existing data.
Example:
UPDATE EMPLOYEE SET Phone = ' ' WHERE Name = 'James Nestor';
UPDATE can also modify more than one column value at a time:
UPDATE EMPLOYEE SET Phone = ' ', Department = 'Production' WHERE EmployeeNumber = 200;

DELETE FROM Statement
The DELETE statement eliminates rows from a table.
Example:
DELETE FROM PROJECT WHERE Department = 'Accounting';
If a foreign key is declared with ON DELETE CASCADE, deleting a row also deletes the related rows that reference it.


Chapter 7 Relational Algebra and SQL applications

Review Relational Model Terminology
A relation is a two-dimensional table. Attributes are single valued. Each attribute belongs to a domain. A domain is a physical and logical description of permitted values. No two rows are identical. Order is unimportant. A row is called a tuple.

Relational Algebra Relational algebra defines a set of operators that may work on relations. Recall that relations are simply data sets. As such, relational algebra deals with set theory. The operators in relational algebra are very similar to traditional algebra except that they apply to sets. Copyright © 2004

Relational Algebra Operators
Relational algebra provides several operators: Union Difference Intersection Product Projection Selection Join Copyright © 2004

Union Operator JUNIOR and HONOR-STUDENT relations and their union:
Example of JUNIOR relation Example HONOR-STUDENT relation Union of JUNIOR and HONOR-STUDENT relations Copyright © 2004

Difference Operator JUNIOR relation HONOR-STUDENT relation
JUNIOR minus HONOR-STUDENT relation Copyright © 2004

Intersection Operator
An intersection operation will produce a third relation that contains the tuples that are common to the relations involved. This is similar to the logical operator ‘AND’ Copyright © 2004

Intersection Operator
JUNIOR relation HONOR-STUDENT relation Intersection of JUNIOR and HONOR-STUDENT relations Copyright © 2004

Product Operator A product operator is a concatenation of every tuple in one relation with every tuple in a second relation The resulting relation will have n x m tuples, where… n = the number of tuples in the first relation and m = the number of tuples in the second relation This is similar to multiplication Copyright © 2004

Projection Operator A projection operation produces a second relation that is a subset of the first. The subset is in terms of columns, not tuples The resulting relation will contain a limited number of columns. However, every tuple will be listed. Copyright © 2004

Selection Operator The selection operator is similar to the projection operator. It produces a second relation that is a subset of the first. However, the selection operator produces a subset of tuples, not columns. The resulting relation contains all columns, but only contains a portion of the tuples. Copyright © 2004

Join Operator The join operator is a combination of the product, selection, and projection operators. There are several variations of the join operator… Equijoin Natural join Outer join Left outer join Right outer join Copyright © 2004
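To make the operators concrete, here is a small sketch in plain Python that models a relation as a list of dictionaries. The function names (product, selection, projection, equijoin) and the two tiny relations are illustrative choices, not part of the course material.

```python
STUDENT = [
    {"SID": 123, "Name": "Jones", "Major": "History"},
    {"SID": 158, "Name": "Parks", "Major": "Math"},
]
ENROLLMENT = [
    {"StudentNumber": 123, "ClassName": "H350"},
    {"StudentNumber": 105, "ClassName": "BA490"},
]


def product(r, s):
    """Concatenate every tuple of r with every tuple of s (n x m tuples)."""
    return [{**a, **b} for a in r for b in s]


def selection(r, predicate):
    """Keep all columns, but only the tuples that satisfy the predicate."""
    return [t for t in r if predicate(t)]


def projection(r, columns):
    """Keep only the named columns; a relation holds no duplicate rows."""
    seen, result = set(), []
    for t in r:
        key = tuple(t[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result


def equijoin(r, s, left, right):
    """A join is a product followed by a selection on matching columns."""
    return selection(product(r, s), lambda t: t[left] == t[right])


joined = equijoin(STUDENT, ENROLLMENT, "SID", "StudentNumber")
names = projection(joined, ["Name", "ClassName"])
```

The equijoin keeps both SID and StudentNumber; a natural join would additionally project away one of the two matching columns.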

Data for Join Examples
STUDENT
SID  Name      Major       GradeLevel
123  Jones     History     JR
158  Parks     Math        GR
271  Smith
105  Anderson  Management  SN

ENROLLMENT
StudentNumber  ClassName  PositionNumber
123            H350       1
105            BA490      3
               B490       7

Expressing Queries in Relational Algebra
1. What are the names of all students? STUDENT [Name] 2. What are the student numbers of all students enrolled in a class? ENROLLMENT [StudentNumber] Copyright © 2004

3. What are the student numbers of all students not enrolled in a class? STUDENT [SID] – ENROLLMENT [StudentNumber] 4. What are the numbers of students enrolled in the class ‘BD445’? ENROLLMENT WHERE ClassName = ‘BD445’[StudentNumber] Copyright © 2004

7. What are the grade levels and meeting rooms of all students, including students not enrolled in a class? STUDENT LEFT OUTER JOIN (SID = StudentNumber) ENROLLMENT JOIN (ClassName = Name) CLASS [GradeLevel, Room] Copyright © 2004

Summary of Relational Algebra Operators

Using SQL in Applications

View Ridge Gallery View Ridge Gallery is a small art gallery that has been in business for 30 years It sells contemporary European and North American fine art View Ridge has one owner, three salespeople, and two workers View Ridge owns all of the art that it sells; it holds no items on a consignment basis Copyright © 2004

Application Requirements
View Ridge application requirements Track customers and their artist interests Record gallery's purchases Record customers' art purchases List the artists and works that have appeared in the gallery Report how fast an artist's works have sold and at what margin Show current inventory in a Web page Copyright © 2004

Surrogate Key Database Design

CHECK CONSTRAINT CHECK CONSTRAINT defines limits for column values
Two common uses Specifying a range of allowed values Specifying an enumerated list CHECK constraints may be used To compare the value of one column to another To specify the format of column values With subqueries Copyright © 2004

SQL Views SQL view is a virtual table that is constructed from other tables or views It has no data of its own, but obtains data from tables or other views SELECT statements are used to define views A view definition may not include an ORDER BY clause SQL views are a subset of the external views They can be used only for external views that involve one multi-valued path through the schema Copyright © 2004

SQL Views Views may be used to Hide columns or rows
Show the results of computed columns Hide complicated SQL statements Provide a level of indirection between application programs and tables Assign different sets of processing permissions to tables Assign different sets of triggers Copyright © 2004

Example: CREATE VIEW CREATE VIEW CustomerNameView AS
SELECT Name AS CustomerName FROM CUSTOMER; SELECT * FROM CustomerNameView ORDER BY CustomerName; Copyright © 2004
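The claim that a view has no data of its own can be checked directly. The sketch below runs the CustomerNameView example in SQLite through Python's sqlite3 module; the CUSTOMER columns other than Name, and all sample rows, are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMER (
        CustomerID INTEGER PRIMARY KEY, Name TEXT, Phone TEXT);
    INSERT INTO CUSTOMER VALUES
        (1, 'Warning', '555-0100'), (2, 'Adams', '555-0101');
    -- The view hides every column except Name and renames it.
    CREATE VIEW CustomerNameView AS
        SELECT Name AS CustomerName FROM CUSTOMER;
""")

# ORDER BY belongs in the query against the view, not in its definition.
query = "SELECT * FROM CustomerNameView ORDER BY CustomerName"
before = conn.execute(query).fetchall()

# A change to the base table is visible through the view immediately.
conn.execute("INSERT INTO CUSTOMER VALUES (3, 'Baker', '555-0102')")
after = conn.execute(query).fetchall()
```

Because the view is re-evaluated on every query, no separate refresh step is needed after the insert.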

Updating Views Views may or may not be updatable
Rules for updating views are both complicated and DBMS-specific Copyright © 2004

Embedding SQL In Program Code
SQL can be embedded in triggers, stored procedures, and program code. Problem: assigning SQL table columns to program variables. Solution: object-oriented programming, PL/SQL. Problem: paradigm mismatch between SQL and the application programming language; SQL statements return sets of rows, whereas applications work on one row at a time. Solution: process the SQL results as pseudo-files.

Triggers A trigger is a stored program that is executed by the DBMS whenever a specified event occurs on a specified table or view Three trigger types: BEFORE, INSTEAD OF, and AFTER Each type can be declared for Insert, Update, and Delete Resulting in a total of nine trigger types Oracle supports all nine trigger types SQL Server supports six trigger types (only for INSTEAD OF and AFTER triggers) Copyright © 2004

Firing Triggers When a trigger is fired, the DBMS supplies
Old and new values for the update New values for inserts Old values for deletions The way the values are supplied depends on the DBMS product Trigger applications: Checking validity (Figure 7-14) Providing default values (Figure 7-15) Updating views (Figure 7-16) Enforcing referential integrity actions (Figure 7-17, 7-18) Copyright © 2004
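As a rough illustration of two of these uses, the sketch below defines a validity-checking trigger and an audit trigger. It assumes SQLite's trigger dialect (run through Python's sqlite3 module), which differs from the Oracle and SQL Server syntax covered in the textbook figures; the table, trigger, and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (EmployeeNumber INTEGER PRIMARY KEY, Phone TEXT);
    CREATE TABLE PHONE_LOG (
        EmployeeNumber INTEGER, OldPhone TEXT, NewPhone TEXT);

    -- Validity check: for an insert only the NEW values exist.
    CREATE TRIGGER EmployeePhoneCheck BEFORE INSERT ON EMPLOYEE
    WHEN NEW.Phone NOT LIKE '___-____'
    BEGIN
        SELECT RAISE(ABORT, 'Phone must look like 285-1234');
    END;

    -- Audit: for an update both the OLD and the NEW values are supplied.
    CREATE TRIGGER EmployeePhoneAudit AFTER UPDATE OF Phone ON EMPLOYEE
    BEGIN
        INSERT INTO PHONE_LOG
        VALUES (OLD.EmployeeNumber, OLD.Phone, NEW.Phone);
    END;
""")

conn.execute("INSERT INTO EMPLOYEE VALUES (100, '285-0001')")
conn.execute("UPDATE EMPLOYEE SET Phone = '285-9999' WHERE EmployeeNumber = 100")
log = conn.execute("SELECT * FROM PHONE_LOG").fetchall()

try:
    conn.execute("INSERT INTO EMPLOYEE VALUES (200, 'unknown')")
    rejected = False
except sqlite3.DatabaseError:
    rejected = True  # the BEFORE trigger aborted the insert
```

SQLite exposes the row values as OLD and NEW; as the slide notes, other products supply them in their own way.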

Stored Procedures A stored procedure is a program that is stored within the database and is compiled when used In Oracle, it can be written in PL/SQL or Java In SQL Server, it can be written in TRANSACT-SQL Stored procedures can receive input parameters and they can return results Stored procedures can be called from Programs written in standard languages, e.g., Java, C# Scripting languages, e.g., JavaScript, VBScript SQL command prompt, e.g., SQL Plus, Query Analyzer Copyright © 2004

Stored Procedure Advantages
Greater security, as stored procedures are always stored on the database server. Decreased network traffic. SQL can be optimized by the DBMS compiler. Code sharing, resulting in less work, standardized processing, and specialization among developers.

Using SQL In Application Code
SQL can be embedded in application programs Several SQL statements need to be executed to populate an external view The application program causes the statements to be executed and then displays the results of the query in the form’s grid controls Copyright © 2004

Using SQL In Application Code (cont.)
The application program also processes and coordinates user actions on a form, including Populating a drop-down list box Making the appropriate changes to foreign keys to create record relationships The particulars by which SQL code is inserted into applications depend on the language and data-manipulation methodology used Copyright © 2004

MySQL Chapters 1 and 3 Introduction to MySQL

Overview TcX - Michael Widenius (MySQL) Hughes - David Hughes (mSQL)
Features: Mostly ANSI SQL2 compliant Transactions Stored procedures Auto_increment fields Copyright © 2004

More features Cross-database joins Outer joins
API: C/C++, Eiffel, Java, PHP, Perl, Python, TCL Runs on Windows, UNIX, and Mac High performance Copyright © 2004

SQL syntax
CREATE TABLE people (name CHAR(10));
INSERT INTO people VALUES ('Joe');
SELECT name FROM people WHERE name LIKE 'J%';

SQL commands SHOW DATABASES SHOW TABLES
Data types: INT, REAL, CHAR(l), VARCHAR(l), TEXT(l), DATE, TIME ALTER TABLE mytable MODIFY mycolumn TEXT(100) ENUM(‘cat’,’dog’,’rabbit’,’pig’) Copyright © 2004

SQL commands
CREATE DATABASE dbname;
CREATE TABLE tname (id INT NOT NULL PRIMARY KEY AUTO_INCREMENT);
CREATE INDEX part_of_name ON customer (name(10));
INSERT INTO tname (c1, …, cn) VALUES (v1, …, vn);

JOINs and ALIASing
SELECT book.title, author.name FROM author, book WHERE book.author = author.id;
SELECT very_long_column_name AS col FROM tname WHERE very_long_column_name = '5';
A column alias names the output column; it cannot be referenced in the WHERE clause.

Loading text files Comma-separated files (*.csv)
LOAD DATA LOCAL INFILE "whatever.csv" INTO TABLE tname Copyright © 2004

Aggregate queries
SELECT position FROM people GROUP BY position
SELECT position, AVG(salary) FROM people GROUP BY position HAVING AVG(salary) >

Full text search
CREATE TABLE WebCache (
    url VARCHAR(255) NOT NULL PRIMARY KEY,
    ptext TEXT NOT NULL,
    FULLTEXT (ptext));
INSERT INTO WebCache (url, ptext) VALUES ('index.html', 'Welcome to the University of Michigan');
SELECT url FROM WebCache WHERE MATCH (ptext) AGAINST ('Michigan');

Advanced features Transactions Table locking Functions Unions
Outer joins Copyright © 2004

Installing MySQL on Windows

use test; CREATE TABLE STATION (ID INTEGER PRIMARY KEY, CITY CHAR(20), STATE CHAR(2), LAT_N REAL, LONG_W REAL); DESCRIBE STATION; INSERT INTO STATION VALUES (13, 'Phoenix', 'AZ', 33, 112); INSERT INTO STATION VALUES (44, 'Denver', 'CO', 40, 105); INSERT INTO STATION VALUES (66, 'Caribou', 'ME', 47, 68); SELECT * FROM STATION; SELECT * FROM STATION WHERE LAT_N > 39.7; Copyright © 2004

SELECT ID, CITY, STATE FROM STATION;
SELECT ID, CITY, STATE FROM STATION WHERE LAT_N > 39.7;
CREATE TABLE STATS (
    ID INTEGER REFERENCES STATION(ID),
    MONTH INTEGER CHECK (MONTH BETWEEN 1 AND 12),
    TEMP_F REAL CHECK (TEMP_F BETWEEN -80 AND 150),
    RAIN_I REAL CHECK (RAIN_I BETWEEN 0 AND 100),
    PRIMARY KEY (ID, MONTH));
INSERT INTO STATS VALUES (13, 1, 57.4, 0.31);
INSERT INTO STATS VALUES (13, 7, 91.7, 5.15);
INSERT INTO STATS VALUES (44, 1, 27.3, 0.18);
INSERT INTO STATS VALUES (44, 7, 74.8, 2.11);
INSERT INTO STATS VALUES (66, 1, 6.7, 2.10);
INSERT INTO STATS VALUES (66, 7, 65.8, 4.52);
SELECT * FROM STATS;

SELECT * FROM STATION, STATS WHERE STATION.ID = STATS.ID;
SELECT MONTH, ID, RAIN_I, TEMP_F FROM STATS ORDER BY MONTH, RAIN_I DESC;
SELECT LAT_N, CITY, TEMP_F FROM STATS, STATION WHERE MONTH = 7 AND STATS.ID = STATION.ID ORDER BY TEMP_F;
SELECT MAX(TEMP_F), MIN(TEMP_F), AVG(RAIN_I), ID FROM STATS GROUP BY ID;
SELECT * FROM STATION WHERE 50 < (SELECT AVG(TEMP_F) FROM STATS WHERE STATION.ID = STATS.ID);

CREATE VIEW METRIC_STATS (ID, MONTH, TEMP_C, RAIN_C) AS
SELECT ID, MONTH, (TEMP_F - 32) * 5 /9, RAIN_I * FROM STATS; SELECT * FROM METRIC_STATS; SELECT * FROM METRIC_STATS WHERE TEMP_C < 0 AND MONTH = 1 ORDER BY RAIN_C; UPDATE STATS SET RAIN_I = RAIN_I ; SELECT * FROM STATS; UPDATE STATS SET TEMP_F = 74.9 WHERE ID = 44 AND MONTH = 7; Copyright © 2004

DELETE FROM STATS WHERE MONTH = 7 OR ID IN (SELECT ID FROM STATION WHERE LONG_W < 90); DELETE FROM STATION WHERE LONG_W < 90; COMMIT WORK; SELECT * FROM STATION; SELECT * FROM STATS; SELECT * FROM METRIC_STATS; Copyright © 2004

CREATE TABLE animals ( id MEDIUMINT NOT NULL AUTO_INCREMENT, name CHAR(30) NOT NULL, PRIMARY KEY (id) ); INSERT INTO animals (name) VALUES ("dog"),("cat"),("penguin"), ("lax"),("whale"),("ostrich"); SELECT * FROM animals; CREATE TABLE shop ( article INT(4) UNSIGNED ZEROFILL DEFAULT '0000' NOT NULL, dealer CHAR(20) DEFAULT '' NOT NULL, price DOUBLE(16,2) DEFAULT '0.00' NOT NULL, PRIMARY KEY(article, dealer)); INSERT INTO shop VALUES (1,'A',3.45),(1,'B',3.99),(2,'A',10.99),(3,'B',1.45),(3,'C',1.69), (3,'D',1.25),(4,'D',19.95); SELECT * FROM shop; Copyright © 2004

CREATE TABLE articles (
id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY, title VARCHAR(200), body TEXT, FULLTEXT (title,body) ); INSERT INTO articles VALUES (NULL,'MySQL Tutorial', 'DBMS stands for DataBase ...'), (NULL,'How To Use MySQL Efficiently', 'After you went through a ...'), (NULL,'Optimizing MySQL','In this tutorial we will show ...'), (NULL,'1001 MySQL Tricks','1. Never run mysqld as root '), (NULL,'MySQL vs. YourSQL', 'In the following database comparison ...'), (NULL,'MySQL Security', 'When configured properly, MySQL ...'); SELECT * FROM articles WHERE MATCH (title,body) AGAINST ('database'); Copyright © 2004

# What's the highest item number?
SELECT MAX(article) AS article FROM shop;
# Find number, dealer, and price of the most expensive article.
SELECT MAX(price) FROM shop;
SELECT article, dealer, price FROM shop WHERE price=19.95;
SELECT article, dealer, price FROM shop ORDER BY price DESC LIMIT 1;
# What's the highest price per article?
SELECT article, MAX(price) AS price FROM shop GROUP BY article;

CREATE TEMPORARY TABLE tmp (
article INT(4) UNSIGNED ZEROFILL DEFAULT '0000' NOT NULL, price DOUBLE(16,2) DEFAULT '0.00' NOT NULL); LOCK TABLES shop READ; INSERT INTO tmp SELECT article, MAX(price) FROM shop GROUP BY article; SELECT shop.article, dealer, shop.price FROM shop, tmp WHERE shop.article=tmp.article AND shop.price=tmp.price; UNLOCK TABLES; DROP TABLE tmp; SELECT article, SUBSTRING( MAX( CONCAT(LPAD(price,6,'0'),dealer) ), 7) AS dealer, 0.00+LEFT( MAX( CONCAT(LPAD(price,6,'0'),dealer) ), 6) AS price FROM shop GROUP BY article; Copyright © 2004

# find the articles with the highest and lowest price
SELECT FROM shop; SELECT * FROM shop WHERE OR
# foreign keys
CREATE TABLE person (
    id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
    name CHAR(60) NOT NULL,
    PRIMARY KEY (id) );
CREATE TABLE shirt (
    id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
    style ENUM('t-shirt', 'polo', 'dress') NOT NULL,
    color ENUM('red', 'blue', 'orange', 'white', 'black') NOT NULL,
    owner SMALLINT UNSIGNED NOT NULL REFERENCES person(id),
    PRIMARY KEY (id) );

INSERT INTO person VALUES (NULL, 'Antonio Paz');
INSERT INTO shirt VALUES (NULL, 'polo', 'blue', LAST_INSERT_ID()), (NULL, 'dress', 'white', LAST_INSERT_ID()), (NULL, 't-shirt', 'blue', LAST_INSERT_ID());
INSERT INTO person VALUES (NULL, 'Lilliana Angelovska');
INSERT INTO shirt VALUES (NULL, 'dress', 'orange', LAST_INSERT_ID()), (NULL, 'polo', 'red', LAST_INSERT_ID()), (NULL, 'dress', 'blue', LAST_INSERT_ID()), (NULL, 't-shirt', 'white', LAST_INSERT_ID());
SELECT * FROM person;
SELECT * FROM shirt;

SELECT s.* FROM person p, shirt s
WHERE p.name LIKE 'Lilliana%' AND s.owner = p.id AND s.color <> 'white'; # unions select id, style from shirt where color = 'blue' union select id, style from shirt where color = 'orange' # visits per day CREATE TABLE t1 (year YEAR(4), month INT(2) UNSIGNED ZEROFILL, day INT(2) UNSIGNED ZEROFILL); INSERT INTO t1 VALUES(2000,1,1),(2000,1,20),(2000,1,30),(2000,2,2), (2000,2,23),(2000,2,23); SELECT year,month,BIT_COUNT(BIT_OR(1<<day)) AS days FROM t1 GROUP BY year,month; Copyright © 2004

Chapter 8 Database Redesign

Need For Database Redesign
Database redesign is necessary To fix mistakes made during the initial database design To adapt the database to changes in system requirements New information systems cause changes in systems requirements because information systems and organizations create each other When a new system is installed, users can behave in new ways As the users behave in the new ways, they will want changes to the system to accommodate their new behaviors Copyright © 2004

Correlated Subqueries
A correlated subquery looks similar to a regular subquery A regular subquery can be processed from the bottom up For a correlated subquery, the processing is nested, i.e., a row from an upper query statement is used in comparison with rows in a lower-level query Copyright © 2004

Example: Correlated Subqueries
Regular subquery (processed from the bottom up):
SELECT A.Name FROM ARTIST A WHERE A.ArtistID IN (SELECT W.ArtistID FROM WORK W WHERE W.Title = 'Mystic Fabric');
Correlated subquery (the lower query refers to W1, a row of the upper query):
SELECT W1.Title, W1.Copy FROM WORK W1 WHERE W1.Title IN (SELECT W2.Title FROM WORK W2 WHERE W1.Title = W2.Title AND W1.WorkID <> W2.WorkID);
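The correlated query on WORK finds titles that occur in more than one row. A quick run against a few invented rows illustrates the nested processing; the sketch assumes SQLite through Python's sqlite3 module, and the WORK rows are sample data only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE WORK (WorkID INTEGER PRIMARY KEY, Title TEXT, Copy TEXT);
    INSERT INTO WORK VALUES
        (1, 'Mystic Fabric', '99/135'),
        (2, 'Mystic Fabric', '100/135'),
        (3, 'Northwest by Night', '37/50');
""")

# For each W1 row, the inner query is evaluated with that row's values:
# it looks for a different row (other WorkID) that has the same Title.
rows = conn.execute("""
    SELECT W1.Title, W1.Copy
    FROM WORK W1
    WHERE W1.Title IN (SELECT W2.Title
                       FROM WORK W2
                       WHERE W1.Title = W2.Title
                         AND W1.WorkID <> W2.WorkID)
    ORDER BY W1.WorkID
""").fetchall()
```

Only the two copies of the duplicated title are returned; the title that appears once has no partner row and is left out.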

EXISTS and NOT EXISTS
EXISTS and NOT EXISTS are specialized forms of correlated subqueries. An EXISTS condition is true if any row in the subquery meets the specified conditions. A NOT EXISTS condition is true only if no row in the subquery meets the specified conditions. A double use of NOT EXISTS can find rows that have some specified relationship to every row of a table.

Example: EXISTS SELECT E1.Department, E1.BudgetCode FROM EMPLOYEE E1
WHERE EXISTS (SELECT * FROM EMPLOYEE E2 WHERE E1.Department = E2.Department AND E1.BudgetCode <> E2.BudgetCode); Copyright © 2004

Example: Double NOT EXISTS
SELECT A.Name
FROM ARTIST AS A
WHERE NOT EXISTS
    (SELECT C.CustomerID
     FROM CUSTOMER C
     WHERE NOT EXISTS
        (SELECT CI.CustomerID
         FROM CUSTOMER_artist_int CI
         WHERE C.CustomerID = CI.CustomerID
           AND A.ArtistID = CI.ArtistID));
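Read aloud, the double NOT EXISTS asks for artists for whom there is no customer who lacks an interest in them, that is, artists every customer is interested in. The sketch below tries this on made-up rows, assuming SQLite through Python's sqlite3 module; the column lists of the three tables are guesses kept to the minimum the query needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ARTIST (ArtistID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE CUSTOMER (CustomerID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE CUSTOMER_artist_int (
        ArtistID INTEGER, CustomerID INTEGER,
        PRIMARY KEY (ArtistID, CustomerID));
    INSERT INTO ARTIST VALUES (1, 'Miro'), (2, 'Tobey');
    INSERT INTO CUSTOMER VALUES (10, 'Warning'), (20, 'Adams');
    -- Both customers like artist 1; only customer 10 likes artist 2.
    INSERT INTO CUSTOMER_artist_int VALUES (1, 10), (1, 20), (2, 10);
""")

rows = conn.execute("""
    SELECT A.Name
    FROM ARTIST AS A
    WHERE NOT EXISTS
        (SELECT C.CustomerID
         FROM CUSTOMER C
         WHERE NOT EXISTS
            (SELECT CI.CustomerID
             FROM CUSTOMER_artist_int CI
             WHERE C.CustomerID = CI.CustomerID
               AND A.ArtistID = CI.ArtistID))
""").fetchall()
```

Artist 2 is excluded because customer 20 has no matching intersection row, which makes the middle subquery non-empty for that artist.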

Database Redesign Three principles for database redesign:
Measure twice and cut once: understand the current structure and contents of the database before making any structure changes Test the new changes on a test database before making real changes Create a complete backup of the operational database before making any structure changes Technique: reverse engineering (RE) Copyright © 2004

Reverse Engineering Reverse engineering is the process of reading and producing a data model from a database schema A reverse engineered (RE) data model Provides a basis to begin the database redesign project Is neither truly a conceptual nor an internal schema as it has characteristics of both Should be carefully reviewed because it almost always has missing information Copyright © 2004

Example: Dependency Graph

Database Backup and Test Databases
Before making any changes to an operational database A complete backup of the operational database should be made Any proposed changes should be thoroughly tested Three different copies of the database schema used in the redesign process A small test database for initial testing A large test database for secondary testing The operational database Copyright © 2004

Database Redesign Changes
Changing tables and columns Changing table names Adding and dropping table columns Changing data type or constraints Adding and dropping constraints Changing relationships Changing cardinalities Adding and deleting relationships Adding and removing relationship for de-normalization Copyright © 2004

Changing Table Names There is no SQL-92 command to change table name
The table needs to be re-created under the new name, tested, and the old table is dropped Changing a table name has a surprising number of potential consequences Therefore, using views defined as table aliases is more appropriate Only views that define the aliases would need to be changed when the source table name is changed Copyright © 2004

Adding Columns
To add a null column to a table:
ALTER TABLE WORK ADD COLUMN DateCreated Date NULL;
Other column constraints, e.g., DEFAULT or UNIQUE, may be included with the column definition. A newly added DEFAULT constraint is applied only to new rows; existing rows will have null values. Three steps to add a NOT NULL column: (1) add the column as NULL, (2) add data to every row, (3) alter the column constraint to NOT NULL.

Dropping Columns
To drop a non-key column:
ALTER TABLE WORK DROP COLUMN DateCreated;
To drop a foreign key column, the foreign key constraint must first be dropped. To drop the primary key, all foreign key constraints that reference the primary key must first be dropped, followed by dropping the primary key constraint.

Changing Data Type or Constraints
Use the ALTER TABLE ALTER COLUMN to change data types and constraints For some changes, data will be lost or the DBMS may refuse the change To change a constraint from NULL to NOT NULL, all rows must have a value first Copyright © 2004

Changing Data Type or Constraints
Converting more specific data type, e.g., date, money, and numeric, to char or varchar will usually succeed Changing a data type from char or varchar to a more specific type can be a problem Example ALTER TABLE ARTIST ALTER COLUMN Birthdate Numeric (4,0) NULL; Copyright © 2004

Adding and Dropping Constraints
Use the ALTER TABLE ADD (DROP) CONSTRAINT to add (remove) constraints Example ALTER TABLE ARTIST ADD CONSTRAINT NumericBirthYearCheck CHECK (Birthdate > 1900 and Birthdate < 2100); Copyright © 2004

Changing Minimum Cardinalities
On the parent side: To change from zero to one, change the foreign key constraint from NULL to NOT NULL Can only be done if all the rows in the table have a value To change from one to zero, change the foreign key constraint from NOT NULL to NULL On the child side: Add (to change from zero to one) or drop (to change from one to zero) triggers that enforce the constraint Copyright © 2004

Changing Maximum Cardinalities
Changing from 1:1 to 1:N If the foreign key is in the correct table, remove the unique constraint on the foreign key column If the foreign key is in the wrong table, move the foreign key to the correct table and do not place a unique constraint on that table Changing from 1:N to N:M Build a new intersection table and move the key and foreign key values to the intersection table Copyright © 2004
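The 1:N to N:M change can be sketched as follows. The example assumes SQLite through Python's sqlite3 module and a hypothetical WORK_ARTIST_INT intersection table; the ARTIST and WORK rows are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Original 1:N design: each WORK row carries one ArtistID foreign key.
    CREATE TABLE ARTIST (ArtistID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE WORK (
        WorkID INTEGER PRIMARY KEY, Title TEXT,
        ArtistID INTEGER REFERENCES ARTIST (ArtistID));
    INSERT INTO ARTIST VALUES (1, 'Miro'), (2, 'Tobey');
    INSERT INTO WORK VALUES (100, 'Mystic Fabric', 2), (101, 'Mi Vida', 1);

    -- Step 1: build the new intersection table.
    CREATE TABLE WORK_ARTIST_INT (
        WorkID   INTEGER NOT NULL REFERENCES WORK (WorkID),
        ArtistID INTEGER NOT NULL REFERENCES ARTIST (ArtistID),
        PRIMARY KEY (WorkID, ArtistID));

    -- Step 2: move the existing key and foreign key values into it.
    INSERT INTO WORK_ARTIST_INT (WorkID, ArtistID)
        SELECT WorkID, ArtistID FROM WORK WHERE ArtistID IS NOT NULL;
""")

# The relationship is now N:M: a second artist can share work 100.
conn.execute("INSERT INTO WORK_ARTIST_INT VALUES (100, 1)")
pairs = conn.execute("""
    SELECT WorkID, ArtistID FROM WORK_ARTIST_INT
    ORDER BY WorkID, ArtistID
""").fetchall()
```

On a production DBMS, the old ArtistID column in WORK would then be dropped once applications have been switched to the intersection table.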

Reducing Cardinalities
Reducing cardinalities may result in data loss. Reducing N:M to 1:N: create a foreign key in the child table (the table on the many side) and move one value from the intersection table into that foreign key. Reducing 1:N to 1:1: remove any duplicates in the foreign key and then set a uniqueness constraint on that key.

Adding and Deleting Relationships
Adding new tables and relationships Add the tables and relationships using CREATE TABLE statements with FOREIGN KEY constraints If an existing table has a child relationship to the new table, add a FOREIGN KEY constraint using the existing table Deleting relationships and tables Drop the foreign key constraints and then drop the tables Copyright © 2004

Adding Tables and Relationships for Normalization
Steps: Use correlated subqueries to determine whether the normalization assumption is justified If not, fix the data before proceeding Create a new table and move the normalized data into the new table Define the appropriate foreign key Copyright © 2004

Removing Relationships for Denormalization
Steps: Define the new columns in the table to be denormalized Fill the table with existing data Drop the child table and relationship Copyright © 2004

Forward Engineering Forward engineering is the process of applying data model changes to an existing database Results of forward engineering should be tested before using it on an operational database Some tools will show the SQL that will execute during the forward engineering process If so, that SQL should be carefully reviewed Copyright © 2004


Chapter 9 Managing Multi-User Databases

Database Administration
All databases, large and small, need database administration.
Data administration is a function that concerns all of an organization’s data assets.
Database administration (DBA) refers to a person or office responsible for a single database and its applications.

DBA Tasks
Managing database structure
Controlling concurrent processing
Managing processing rights and responsibilities
Developing database security
Providing for database recovery
Managing the DBMS
Maintaining the data repository

Managing Database Structure
DBA’s tasks:
  Participate in database and application development
    Assist in requirements stage and data model creation
    Play an active role in database design and creation
  Facilitate changes to database structure
    Seek community-wide solutions
    Assess impact on all users
    Provide configuration control forum
    Be prepared for problems after changes are made
    Maintain documentation

Concurrency Control
Concurrency control ensures that one user’s work does not inappropriately influence another user’s work.
No single concurrency control technique is ideal for all circumstances.
Trade-offs need to be made between level of protection and throughput.

Atomic Transactions
A transaction, or logical unit of work (LUW), is a series of actions taken against the database that occurs as an atomic unit.
Either all actions in a transaction occur or none of them do.
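The all-or-nothing behavior can be illustrated with Python's sqlite3 module. This is a sketch under assumptions: the ACCOUNT table, its CHECK constraint, and the transfer helper are hypothetical and exist only to force a failure in the middle of a transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ACCOUNT (Name TEXT PRIMARY KEY, "
    "Balance INTEGER NOT NULL CHECK (Balance >= 0))"
)
conn.executemany("INSERT INTO ACCOUNT VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one logical unit of work."""
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back every statement in it if an exception escapes.
        with conn:
            conn.execute(
                "UPDATE ACCOUNT SET Balance = Balance + ? WHERE Name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE ACCOUNT SET Balance = Balance - ? WHERE Name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "A", "B", 30)       # both updates are applied
failed = transfer(conn, "A", "B", 500)  # the debit violates the CHECK, so the credit is undone too
balances = dict(conn.execute("SELECT Name, Balance FROM ACCOUNT"))
print(ok, failed, balances)
```

In the second call the credit to B has already executed when the debit fails, yet B's balance is unchanged afterward because the whole transaction is rolled back.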

Example: Atomic Transaction

Concurrent Transaction
Concurrent transactions are two or more transactions that appear to users to be processed against a database at the same time.
In reality, a single CPU can execute only one instruction at a time.
Transactions are interleaved: the operating system quickly switches CPU services among tasks so that some portion of each of them is carried out in a given interval.
Concurrency problems: lost updates and inconsistent reads.
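The lost update problem can be reproduced with a small deterministic simulation in plain Python. The schedule format and the run_schedule helper are hypothetical; each user reads the stored quantity into a private copy, changes the copy, and writes it back.

```python
def run_schedule(steps, initial=10):
    """Replay (user, operation, amount) steps against one stored quantity."""
    db = {"qty": initial}
    private = {}  # each user's private copy of the value it read
    for user, op, amount in steps:
        if op == "read":
            private[user] = db["qty"]
        elif op == "take":
            private[user] -= amount
        elif op == "write":
            db["qty"] = private[user]
    return db["qty"]


# User A takes 5 items and user B takes 3 items from a stock of 10.
serial = [
    ("A", "read", 0), ("A", "take", 5), ("A", "write", 0),
    ("B", "read", 0), ("B", "take", 3), ("B", "write", 0),
]
interleaved = [
    ("A", "read", 0), ("B", "read", 0),
    ("A", "take", 5), ("B", "take", 3),
    ("A", "write", 0), ("B", "write", 0),
]

serial_result = run_schedule(serial)            # 2: both updates are kept
interleaved_result = run_schedule(interleaved)  # 7: A's update is overwritten
print(serial_result, interleaved_result)
```

In the interleaved schedule both users read 10 before either writes, so B's write replaces A's result and A's update is lost.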

Example: Concurrent Transactions

Example: Lost Update Problem

Lock Terminology
Implicit locks are placed by the DBMS; explicit locks are issued by the application program.
Lock granularity refers to the size of a locked resource: row, page, table, or database level.
  Large granularity is easy to manage but frequently causes conflicts.
Types of lock:
  An exclusive lock prohibits other users from reading or updating the locked resource.
  A shared lock allows other users to read the locked resource, but they cannot update it.

Example: Explicit Locks

Serializable Transactions
Transactions are serializable when running them concurrently generates results that are consistent with the results that would have occurred if they had run separately.
Two-phased locking is one of the techniques used to achieve serializability.

Two-phased Locking
Two-phased locking:
  Transactions are allowed to obtain locks as necessary (growing phase).
  Once the first lock is released (shrinking phase), no other lock can be obtained.
A special case of two-phased locking:
  Locks are obtained throughout the transaction.
  No lock is released until the COMMIT or ROLLBACK command is issued.
  This strategy is more restrictive but easier to implement than general two-phased locking.

Deadlock
Deadlock, or the deadly embrace, occurs when two transactions are each waiting on a resource that the other transaction holds.
Preventing deadlock:
  Allow users to issue all lock requests at one time.
  Require all application programs to lock resources in the same order.
Breaking deadlock:
  Almost every DBMS has algorithms for detecting deadlock.
  When deadlock occurs, the DBMS aborts one of the transactions and rolls back its partially completed work.

Optimistic/Pessimistic Locking
Optimistic locking assumes that no transaction conflict will occur.
  The DBMS processes a transaction, then checks whether a conflict occurred.
  If not, the transaction is finished.
  If so, the transaction is repeated until there is no conflict.
Pessimistic locking assumes that conflict will occur.
  Locks are issued before the transaction is processed, and then the locks are released.
Optimistic locking is preferred for the Internet and for many intranet applications.
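One common way to implement the optimistic check is a version column, sketched below with Python's sqlite3 module. The version column is an assumption of this example rather than something the slides prescribe, and the PRODUCT table, the take helper, and the simulated other_user are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT (ProductID INTEGER PRIMARY KEY, "
    "Quantity INTEGER, Version INTEGER)"
)
conn.execute("INSERT INTO PRODUCT VALUES (1, 10, 0)")
conn.commit()


def other_user():
    """Simulate a conflicting update made by someone else."""
    conn.execute(
        "UPDATE PRODUCT SET Quantity = Quantity - 3, Version = Version + 1 "
        "WHERE ProductID = 1"
    )
    conn.commit()


def take(conn, product_id, amount, interfere=None, max_tries=5):
    """Read without locking, then update only if the row is unchanged."""
    for attempt in range(1, max_tries + 1):
        qty, version = conn.execute(
            "SELECT Quantity, Version FROM PRODUCT WHERE ProductID = ?",
            (product_id,),
        ).fetchone()
        if interfere is not None and attempt == 1:
            interfere()  # another user changes the row after our read
        cur = conn.execute(
            "UPDATE PRODUCT SET Quantity = ?, Version = Version + 1 "
            "WHERE ProductID = ? AND Version = ?",
            (qty - amount, product_id, version),
        )
        conn.commit()
        if cur.rowcount == 1:  # no conflict: the update was applied
            return attempt
        # rowcount == 0 means the version changed; repeat the transaction
    raise RuntimeError("too many conflicts")


attempts = take(conn, 1, 5, interfere=other_user)
final = conn.execute(
    "SELECT Quantity, Version FROM PRODUCT WHERE ProductID = 1"
).fetchone()
print(attempts, final)
```

The first attempt fails its check because the version moved from 0 to 1 in between, so the work is repeated against the fresh values and succeeds on the second attempt.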

Example: Optimistic Locking

Example: Pessimistic Locking

Declaring Lock Characteristics
Most application programs do not explicitly declare locks because doing so is complicated.
Instead, they mark transaction boundaries and declare the locking behavior they want the DBMS to use.
Transaction boundary markers: BEGIN, COMMIT, and ROLLBACK TRANSACTION.
Advantage: if the locking behavior needs to be changed, only the lock declaration needs to be changed, not the application program.

Example: Marking Transaction Boundaries

ACID Transactions
An ACID transaction is one that is Atomic, Consistent, Isolated, and Durable.
Atomic means that either all or none of the database actions occur.
Durable means that committed database changes are permanent.

ACID Transactions (cont.)
Consistency means either statement level or transaction level consistency.
  Statement level consistency: each statement independently processes rows consistently.
  Transaction level consistency: all rows impacted by any of the SQL statements in the transaction are protected from changes during the entire transaction.
With transaction level consistency, a transaction may not see its own changes.

ACID Transactions (cont.)
Isolation means application programmers are able to declare the type of isolation level and to have the DBMS manage locks so as to achieve that level of isolation.
SQL-92 defines four transaction isolation levels:
  Read uncommitted
  Read committed
  Repeatable read
  Serializable

Cursor Type
A cursor is a pointer into a set of records; it can be defined using SELECT statements.
Four cursor types:
  Forward only: the application can only move forward through the recordset.
  The other three types are scrollable, so the application can scroll forward and backward through the recordset:
    Static: processes a snapshot of the relation that was taken when the cursor was opened.
    Keyset: combines some features of static cursors with some features of dynamic cursors.
    Dynamic: a fully featured cursor.
Choosing appropriate isolation levels and cursor types is critical to database design.

Database Security
Database security ensures that only authorized users can perform authorized activities at authorized times.
Developing database security:
  Determine users’ processing rights and responsibilities.
  Enforce security requirements using security features from both the DBMS and application programs.

DBMS Security
DBMS products provide security facilities.
They limit certain actions on certain objects to certain users or groups.
Almost all DBMS products use some form of user name and password security.

DBMS Security Guidelines
Run the DBMS behind a firewall, but plan as though the firewall has been breached.
Apply the latest operating system and DBMS service packs and fixes.
Use the least functionality possible:
  Support the fewest network protocols possible.
  Delete unnecessary or unused system stored procedures.
  Disable default logins and guest users, if possible.
  Unless required, never allow all users to log on to the DBMS interactively.
Protect the computer that runs the DBMS:
  No user should be allowed to work at the computer that runs the DBMS.
  The DBMS computer should be physically secured behind locked doors.
  Access to the room containing the DBMS computer should be recorded in a log.

DBMS Security Guidelines (cont.)
Manage accounts and passwords:
  Use a low privilege user account for the DBMS service.
  Protect database accounts with strong passwords.
  Monitor failed login attempts.
  Frequently check group and role memberships.
  Audit accounts with null passwords.
  Assign accounts the lowest privileges possible.
  Limit DBA account privileges.
Planning:
  Develop a security plan for preventing and detecting security problems.
  Create procedures for security emergencies and practice them.

Application Security
If DBMS security features are inadequate, additional security code can be written in the application program.
Application security in Internet applications is often provided on the Web server computer.
However, you should use the DBMS security features first:
  The closer the security enforcement is to the data, the less chance there is for infiltration.
  DBMS security features are faster and cheaper, and they probably give higher quality results than developing your own.

SQL Injection Attack
A SQL injection attack occurs when data from the user is used to modify a SQL statement.
User input that can modify a SQL statement must be carefully checked to ensure that only valid input has been received and that no additional SQL syntax has been entered.
Example: users are asked to enter their names into a Web form textbox.
  User input: Benjamin Franklin' OR TRUE
  Resulting statement: SELECT * FROM EMPLOYEE WHERE EMPLOYEE.Name = 'Benjamin Franklin' OR TRUE;
  Result: every row of the EMPLOYEE table will be returned.
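The attack, and one standard defense, can be demonstrated with Python's sqlite3 module. This is a sketch: the EMPLOYEE rows are made up, the injected text uses the variant ' OR '1'='1 so that the concatenated statement stays syntactically valid, and parameter binding (the ? placeholder) is a technique chosen for this example, not one named in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (Name TEXT, Salary INTEGER)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?)",
    [("Benjamin Franklin", 100), ("Mary Jones", 200), ("Tom Smith", 300)],
)

user_input = "Benjamin Franklin' OR '1'='1"

# Unsafe: the input is pasted into the statement, so its quote characters
# and its OR clause become part of the SQL syntax.
unsafe_sql = (
    "SELECT * FROM EMPLOYEE WHERE EMPLOYEE.Name = '" + user_input + "'"
)
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safer: the input is bound as a parameter and is only ever treated as a
# value to compare against, never as SQL syntax.
safe_rows = conn.execute(
    "SELECT * FROM EMPLOYEE WHERE EMPLOYEE.Name = ?", (user_input,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The concatenated statement returns every row of EMPLOYEE, while the parameterized one returns none because no employee has that literal name.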

Database Recovery
In the event of system failure, the database must be restored to a usable state as soon as possible.
Two recovery techniques:
  Recovery via reprocessing
  Recovery via rollback/rollforward

Recovery via Reprocessing
Recovery via reprocessing: the database goes back to a known point (database save) and reprocesses the workload from there.
This strategy is usually unfeasible because:
  The recovered system may never catch up if the computer is heavily scheduled.
  Concurrent transactions are asynchronous events, so reprocessing them may produce different results.

Rollback/Rollforward
Recovery via rollback/rollforward:
  Periodically save the database and keep a log of database changes made since the save.
  The database log contains records of the data changes in chronological order.
  When there is a failure, either rollback or rollforward is applied.
    Rollback: undo the erroneous changes made to the database and reprocess valid transactions.
    Rollforward: restore the database from the saved data and reapply valid transactions since the last save.
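A toy model of the change log, written in plain Python, shows how the two techniques use it. Everything here is an assumption made for illustration: the database is a dictionary, each log record stores a before-image and an after-image, and the helper names are hypothetical. A real DBMS log is far more involved.

```python
def apply_change(db, log, txn, key, new_value):
    """Change one value and append a log record with before/after images."""
    log.append(
        {"txn": txn, "key": key, "before": db.get(key), "after": new_value}
    )
    db[key] = new_value


def rollback(db, log, bad_txns):
    """Undo the changes of bad transactions, newest first, via before-images."""
    for rec in reversed(log):
        if rec["txn"] in bad_txns:
            if rec["before"] is None:
                db.pop(rec["key"], None)
            else:
                db[rec["key"]] = rec["before"]


def rollforward(saved_db, log, valid_txns):
    """Start from the saved copy and reapply after-images in log order."""
    db = dict(saved_db)
    for rec in log:
        if rec["txn"] in valid_txns:
            db[rec["key"]] = rec["after"]
    return db


saved = {"A": 100, "B": 50}  # the periodic database save
db = dict(saved)
log = []
apply_change(db, log, "T1", "A", 70)  # valid transaction
apply_change(db, log, "T2", "B", 80)  # valid transaction
apply_change(db, log, "T3", "A", 0)   # erroneous transaction

rollback(db, log, {"T3"})             # repair the live database in place
recovered = rollforward(saved, log, {"T1", "T2"})  # rebuild from the save
print(db, recovered)
```

Both paths arrive at the same state here. The sketch ignores the case where a later valid transaction changed the same value as the erroneous one, which is why rollback is followed by reprocessing valid transactions.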

Checkpoint
A checkpoint is a point of synchronization between the database and the transaction log.
  The DBMS refuses new requests, finishes processing outstanding requests, and writes its buffers to disk.
  The DBMS waits until the writing is successfully completed; at that point the log and the database are synchronized.
Checkpoints speed up the database recovery process.
  The database can be recovered using after-images since the last checkpoint.
  A checkpoint can be done several times per hour.
Most DBMS products automatically checkpoint themselves.

Managing the DBMS
DBA’s responsibilities:
  Generate database application performance reports
  Investigate user performance complaints
  Assess need for changes in database structure or application design
  Modify database structure
  Evaluate and implement new DBMS features
  Tune the DBMS

Maintaining the Data Repository
The DBA is responsible for maintaining the data repository.
Data repositories are collections of metadata about users, databases, and their applications.
The repository may be:
  Virtual, composed of metadata from many different sources: DBMS, code libraries, Web page generation and editing tools, etc.
  An integrated product from a CASE tool vendor or from other companies.
The best repositories are active and are part of the system development process.


Chapter 10 Managing Databases with Oracle 9i

Introduction
Oracle is the world’s most popular DBMS.
It is a powerful and robust DBMS that runs on many different operating systems.
Oracle DBMS engine: Personal Oracle and Enterprise Oracle.
Examples of Oracle products:
  SQL*Plus: a utility for processing SQL and creating components like stored procedures and triggers.
    PL/SQL is a programming language that adds programming constructs to the SQL language.
  Oracle Developer (Forms & Reports Builder)
  Oracle Designer

Creating an Oracle Database
Installing Oracle:
  Install Oracle 9i Client to use an already created database.
  Install Oracle 9i Personal Edition to create your own databases.
Three ways to create an Oracle database:
  Via the Oracle Database Configuration Assistant
  Via the Oracle-supplied database creation procedures
  Via the SQL CREATE DATABASE command

SQL*Plus
Oracle SQL*Plus or the Oracle Enterprise Manager Console may be used to manage an Oracle database.
SQL*Plus is a text-based utility available with all editions of Oracle.
  Except inside the quotation marks of strings, Oracle commands are case-insensitive.
  The semicolon (;) terminates a SQL statement.
  The right-leaning slash (/) executes the SQL statement stored in the SQL*Plus buffer.
SQL*Plus can be used to:
  Enter SQL statements.
  Submit SQL files created by text editors, e.g., Notepad, to Oracle.

SQL*Plus Buffer
SQL*Plus keeps the current statement in a multi-line buffer without executing it.
LIST is used to see the contents of the buffer.
LIST [line_number] is used to change the current line.
CHANGE/astring/bstring/ is used to change the contents of the current line.
  astring = the string you want to change
  bstring = what you want to change it to
Example: change/Table_Name/*/
  ‘Table_Name’ is replaced with ‘*’.

Creating Tables
Some SQL-92 CREATE TABLE statements need to be modified for Oracle:
  Oracle does not support the ON UPDATE CASCADE constraint.
  The Int data type is interpreted by Oracle as Number(38).
  The Varchar data type is interpreted as VarChar2.
  Money or currency is defined in Oracle using the Numeric data type.
Oracle sequences must be used for surrogate keys.
The DESCRIBE (or DESC) command is used to view a table’s structure.

Oracle Sequences
A sequence is an object that generates a sequential series of unique numbers.
It is the best way to work with surrogate keys in Oracle.
Two sequence methods:
  NextVal provides the next value in a sequence.
  CurrVal provides the current value in a sequence.
Using sequences does not guarantee valid surrogate key values, because it is still possible to have missing, duplicate, or wrong sequence values in the table.

Example: Sequences
Creating a sequence:
  CREATE SEQUENCE CustID INCREMENT BY 1 START WITH 1000;
Entering data using the sequence:
  INSERT INTO CUSTOMER (CustomerID, Name, AreaCode, PhoneNumber) VALUES (CustID.NextVal, 'Mary Jones', '350', '555-1234');
Retrieving the row just created:
  SELECT * FROM CUSTOMER WHERE CustomerID = CustID.CurrVal;

DROP and ALTER Statements
DROP statements may be used to remove structures from the database:
  DROP TABLE MYTABLE;
    Any data in the MYTABLE table will be lost.
  DROP SEQUENCE MySequence;
The ALTER statement may be used to drop or add a column:
  ALTER TABLE MYTABLE DROP COLUMN MyColumn;
  ALTER TABLE MYTABLE ADD C1 NUMBER(4);

TO_DATE Function
Oracle requires dates in a particular format; the TO_DATE function may be used to identify the format of a date value.
  TO_DATE('11/12/2002', 'MM/DD/YYYY')
    11/12/2002 is the date value.
    MM/DD/YYYY is the pattern to be used when interpreting the date.
The TO_DATE function can be used with the INSERT and UPDATE statements to enter data:
  INSERT INTO T1 VALUES (100, TO_DATE('01/05/02', 'DD/MM/YY'));

Creating Indexes
Indexes are created to:
  Enforce uniqueness on columns
  Facilitate sorting
  Enable fast retrieval by column values
Good candidates for indexes are columns that are frequently used with equality conditions in a WHERE clause or in a join.
Example:
  CREATE INDEX CustNameIdx ON CUSTOMER(Name);
  CREATE UNIQUE INDEX WorkUniqueIndex ON WORK(Title, Copy, ArtistID);

Restrictions On Column Modifications
A column may be dropped at any time, and all of its data will be lost.
A column may be added at any time as long as it is a NULL column.
To add a NOT NULL column:
  Add a NULL column.
  Fill the new column in every row with data.
  Change its structure to NOT NULL:
    ALTER TABLE T1 MODIFY C1 NOT NULL;

Creating Views
The SQL-92 CREATE VIEW command can be used to create views in SQL*Plus.
Oracle allows the ORDER BY clause in view definitions.
Only Oracle 9i and later support the JOIN…ON syntax.
Example:
  CREATE VIEW CustomerInterests AS
  SELECT C.Name as Customer, A.Name as Artist
  FROM CUSTOMER C JOIN CUSTOMER_ARTIST_INT I ON C.CustomerID = I.CustomerID
       JOIN ARTIST A ON I.ArtistID = A.ArtistID;

Enterprise Manager Console
The Oracle Enterprise Manager Console provides graphical facilities for managing an Oracle database.
The utility can be used to manage:
  Database structures such as tables and views
  User accounts, passwords, roles, and privileges
The Manager Console includes a SQL scratchpad for executing SQL statements.

Application Logic
Oracle database applications can be processed using:
  A programming language that invokes Oracle DBMS commands
  Stored procedures
  The Start command, which invokes database commands stored in .sql files
  Triggers

Stored Procedures
A stored procedure is a PL/SQL or Java program stored within the database.
Stored procedures are programs that can:
  Have parameters
  Invoke other procedures and functions
  Return values
  Raise exceptions
A stored procedure must be compiled and stored in the database.
The Execute (or Exec) command is used to invoke a stored procedure:
  Exec Customer_Insert ('Michael Bench', '203', ' ', 'US');

Example: Stored Procedure
Figure 10-20
IN signifies an input parameter.
OUT signifies an output parameter.
IN OUT signifies a parameter used for both input and output.
Variables are declared after the keyword AS.

Triggers
Oracle triggers are PL/SQL or Java procedures that are invoked when specified database activity occurs.
Triggers can be used to:
  Enforce a business rule
  Set complex default values
  Update a view
  Perform a referential integrity action
  Handle exceptions

Triggers (cont.)
Trigger types:
  A command trigger is fired once per SQL command.
  A row trigger is fired once for every row involved in the processing of a SQL command.
    Three types of row triggers: BEFORE, AFTER, and INSTEAD OF.
    BEFORE and AFTER triggers are placed on tables, while INSTEAD OF triggers are placed on views.
    Each trigger can be fired on insert, update, or delete commands.

Data Dictionary
Oracle maintains a data dictionary of metadata.
The metadata of the dictionary itself are stored in the table DICT:
  SELECT Table_Name, Comments FROM DICT WHERE Table_Name LIKE ('%TABLES%');
USER_TABLES contains information about user or system tables:
  DESC USER_TABLES;

Concurrency Control
Oracle processes database changes by maintaining a System Change Number (SCN).
  The SCN is a database-wide value that is incremented by Oracle when database changes are made.
With the SCN, SQL statements always read a consistent set of values: those that were committed at or before the time the statement was started.
Oracle reads only committed changes; it never reads dirty data.

Oracle Transaction Isolation
Oracle supports the following transaction isolation levels:
  Read Committed: Oracle’s default transaction isolation level; it never reads uncommitted data changes.
  Serializable: dirty reads are not possible, repeated reads yield the same results, and phantoms are not possible.
  Read Only: all statements read consistent data; no inserts, updates, or deletions are possible.
Explicit locks can also be issued, but they are not recommended.

Oracle Security
Oracle security components:
  An ACCOUNT is a user account.
  A PROFILE is a set of system resource maximums that are assigned to an account.
  A PRIVILEGE is the right to perform a task.
  A ROLE consists of groups of PRIVILEGEs and other ROLEs.

Account System Privileges
Each ACCOUNT can be allocated many SYSTEM PRIVILEGEs and many ROLEs.
An ACCOUNT has all the PRIVILEGEs:
  That have been assigned to it directly
  Of all of its ROLEs
  Of all of the ROLEs that it inherits through ROLE connections
A ROLE can have many SYSTEM PRIVILEGEs, and it may also have a relationship to other ROLEs.
ROLEs simplify the administration of the database:
  A set of privileges can be assigned to or removed from a ROLE just once.

Oracle Recovery Facilities
Three file types for Oracle recovery:
  Datafiles contain user and system data.
  ReDo log files contain logs of database changes.
    OnLine ReDo files are maintained on disk and contain the rollback segments from recent database changes.
    Offline or Archive ReDo files are backups of the OnLine ReDo files.
  Control files describe the name, contents, and locations of various files used by Oracle.

Oracle Recovery Facilities (cont.)
Oracle can operate in either ARCHIVELOG or NOARCHIVELOG mode.
  If running in ARCHIVELOG mode, Oracle logs all changes to the database.
  When the OnLine ReDo files fill up, they are copied to the Archive ReDo files.
The Oracle Recovery Manager (RMAN) is a utility program used to create backups and to perform recovery.

Types of Failure
Oracle recovery techniques depend on the type of failure:
  An application failure is due to application logic errors.
  An instance failure occurs when Oracle itself fails due to an operating system or computer hardware failure.
    Oracle can recover from application and instance failures without using the archived log file.
  A media failure occurs when Oracle is unable to write to a physical file because of a disk failure or corrupted files.
    The database is restored from a backup.

Oracle Backup Facilities
Two kinds of backups:
  A consistent backup: database activity must be stopped and all uncommitted changes must have been removed from the datafiles.
    It cannot be done if the database supports 24/7 operations.
  An inconsistent backup: the backup is made while Oracle is processing the database.
    An inconsistent backup can be made consistent by processing an archive log file.


KDD and Data Mining

Entropy (cont’d)
$E(\text{age}) = \frac{5}{14} I(s_{11}, s_{21}) + \frac{4}{14} I(s_{12}, s_{22}) + \frac{5}{14} I(s_{13}, s_{23}) = 0.694$
$\text{Gain}(\text{age}) = I(s_1, s_2) - E(\text{age}) = 0.246$
$\text{Gain}(\text{income}) = 0.029$, $\text{Gain}(\text{student}) = 0.151$, $\text{Gain}(\text{credit}) = 0.048$

arff files
@relation weather
@attribute outlook {sunny, overcast, rainy}
@attribute temperature real
@attribute humidity real
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}
@data
sunny,85,85,FALSE,no
sunny,80,90,TRUE,no
overcast,83,86,FALSE,yes
rainy,70,96,FALSE,yes
rainy,68,80,FALSE,yes
rainy,65,70,TRUE,no
overcast,64,65,TRUE,yes
sunny,72,95,FALSE,no
sunny,69,70,FALSE,yes
rainy,75,80,FALSE,yes
sunny,75,70,TRUE,yes
overcast,72,90,TRUE,yes
overcast,81,75,FALSE,yes
rainy,71,91,TRUE,no
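The entropy and gain calculation from the Entropy slide can be applied to the weather records listed here. The sketch below is plain Python; the function names info and gain are hypothetical, and only the nominal attributes outlook and windy are scored, since the numeric columns would first need to be discretized.

```python
from collections import Counter
from math import log2

DATA = """sunny,85,85,FALSE,no
sunny,80,90,TRUE,no
overcast,83,86,FALSE,yes
rainy,70,96,FALSE,yes
rainy,68,80,FALSE,yes
rainy,65,70,TRUE,no
overcast,64,65,TRUE,yes
sunny,72,95,FALSE,no
sunny,69,70,FALSE,yes
rainy,75,80,FALSE,yes
sunny,75,70,TRUE,yes
overcast,72,90,TRUE,yes
overcast,81,75,FALSE,yes
rainy,71,91,TRUE,no"""

# Columns: outlook, temperature, humidity, windy, play
rows = [line.split(",") for line in DATA.splitlines()]


def info(labels):
    """Expected information (entropy, in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain(rows, col, target=-1):
    """Information gain from splitting the rows on the attribute in column col."""
    groups = {}
    for row in rows:
        groups.setdefault(row[col], []).append(row[target])
    expected = sum(len(g) / len(rows) * info(g) for g in groups.values())
    return info([row[target] for row in rows]) - expected


base_info = info([row[-1] for row in rows])  # about 0.940 (9 yes, 5 no)
gain_outlook = gain(rows, 0)                 # about 0.247
gain_windy = gain(rows, 3)                   # about 0.048
print(round(base_info, 3), round(gain_outlook, 3), round(gain_windy, 3))
```

The outlook attribute splits the 14 records into groups of 5, 4, and 5, the same weights that appear in the E(age) formula, so it has the higher gain and would be chosen as the first split.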

Text and XML databases

XML-QL
Two slides from Johannes Gehrke, Cornell University
<IMG SRC="xysq.gif" ALT="(x+y)^2">
<apply> <power/> <apply> <plus/> <ci>x</ci> <ci>y</ci> </apply> <cn>2</cn> </apply>
WHERE <BOOK> <NAME><LAST>$1</LAST></NAME> </BOOK> in “
CONSTRUCT <RESULT> $1 </RESULT>

XML-QL (continued)
WHERE <BOOK> $b </BOOK> IN “
      <AUTHOR> $n </AUTHOR>
      <PUBLISHED> $p </PUBLISHED> in $e
CONSTRUCT <RESULT>
            <PUBLISHED> $p </PUBLISHED>
            WHERE <LAST> $l </LAST> IN $n
            CONSTRUCT <LAST> $l </LAST>
          </RESULT>

XML-QL (continued)
<!ELEMENT book (author+, title, publisher)>
<!ATTLIST book year CDATA>
<!ELEMENT article (author+, title, year?, (shortversion|longversion))>
<!ATTLIST article type CDATA>
<!ELEMENT publisher (name, address)>
<!ELEMENT author (firstname?, lastname)>

XML-QL (continued)
<bib>
  <book year="1995">
    <title> An Introduction to Database Systems </title>
    <author> <lastname> Date </lastname> </author>
    <publisher> <name> Addison-Wesley </name> </publisher>
  </book>
  <book year="1998">
    <title> Foundation for Object/Relational Databases: The Third Manifesto </title>
    <author> <lastname> Darwen </lastname> </author>
  </book>
</bib>

XML-QL (continued)
<result>
  <author> <lastname> Date </lastname> </author>
  <title> An Introduction to Database Systems </title>
</result>
<result>
  <title> Foundation for Object/Relational Databases: The Third Manifesto </title>
  <author> <lastname> Darwen </lastname> </author>
</result>

XML-QL (continued)
WHERE <book> $p </> IN "
      <title> $t </>,
      <publisher><name>Addison-Wesley</></> IN $p
CONSTRUCT <result>
            <title> $t </>
            WHERE <author> $a </> IN $p
            CONSTRUCT <author> $a </>
          </>

XML-QL (continued)
<result>
  <title> An Introduction to Database Systems </title>
  <author> <lastname> Date </lastname> </author>
</result>
<result>
  <title> Foundation for Object/Relational Databases: The Third Manifesto </title>
  <author> <lastname> Darwen </lastname> </author>
</result>

XML-QL (continued)
WHERE <article>
        <author>
          <firstname> $f </> // firstname $f
          <lastname> $l </>  // lastname $l
        </>
      </> CONTENT_AS $a IN "
      <book year=$y>
        <firstname> $f </> // join on same firstname $f
        <lastname> $l </>  // join on same lastname $l
      </> IN "
      y > 1995
CONSTRUCT <article> $a </>

XML-QL (continued)
<person ID="o123">
  <firstname>John</firstname>
  <lastname>Smith</lastname>
</person>
<person ID="o234"> . . .
<article author="o123 o234">
  <title> ... </title>
  <year> 1995 </year>
</article>

XML-QL (continued)
WHERE <article><author><lastname> $n</></></> IN "abc.xml"

WHERE <article author=$i> <title> </> ELEMENT_AS $t </>,
      <person ID=$i> <lastname> </> ELEMENT_AS $l </>
CONSTRUCT <result> $t $l</>

Transforming data
<!ELEMENT book (author+, title, publisher)>
<!ATTLIST book year CDATA>
<!ELEMENT article (author+, title, year?, (shortversion|longversion))>
<!ATTLIST article type CDATA>
<!ELEMENT publisher (name, address)>
<!ELEMENT author (firstname?, lastname)>
<!ELEMENT person (lastname, firstname, address?, phone?, publicationtitle*)>

Transforming data (cont’d)
WHERE <$>
        <author> <firstname> $fn </> <lastname> $ln </> </>
        <title> $t </>
      </> IN "
CONSTRUCT <person ID=PersonID($fn, $ln)>
            <firstname> $fn </>
            <publicationtitle> $t </>
          </>

Query blocks
WHERE <$e> <title> $t </> <year> 1995 </> </> CONTENT_AS $p IN "
CONSTRUCT <result ID=ResultID($p)> <title> $t </> </>
{ WHERE $e = "journal-paper", <month> $m </> IN $p
  CONSTRUCT <result ID=ResultID($p)> <month> $m </> </> }
{ WHERE $e = "book", <publisher>$q </> IN $p
  CONSTRUCT <result ID=ResultID($p)> <publisher>$q </> </> }

DTD
<!ELEMENT bib (book* )>
<!ELEMENT book (title, (author+ | editor+ ), publisher, price )>
<!ATTLIST book year CDATA #REQUIRED >
<!ELEMENT author (last, first )>
<!ELEMENT editor (last, first, affiliation )>
<!ELEMENT title (#PCDATA )>
<!ELEMENT last (#PCDATA )>
<!ELEMENT first (#PCDATA )>
<!ELEMENT affiliation (#PCDATA )>
<!ELEMENT publisher (#PCDATA )>
<!ELEMENT price (#PCDATA )>

Sample database
<bib>
  <book year="1994">
    <title>TCP/IP Illustrated</title>
    <author> <last>Stevens</last> <first>W.</first> </author>
    <publisher>Addison-Wesley</publisher>
    <price> 65.95</price>
  </book>
  <book year="1992">
    <title>Advanced Programming in the Unix environment</title>
    <price>65.95</price>
  </book>
  <book year="2000">
    <title>Data on the Web</title>
    <author> <last>Abiteboul</last> <first>Serge</first> </author>
    <author> <last>Buneman</last> <first>Peter</first> </author>
    <author> <last>Suciu</last> <first>Dan</first> </author>
    <publisher>Morgan Kaufmann Publishers</publisher>
    <price>39.95</price>
  </book>
</bib>
