Maturity DB Process Design Stage Review Logical Design Physical Design

Maturity DB Process Design Stage Review Logical Design Physical Design DDL Script Review Coding Unit Test Integration Test Evaluation Stress Test Production

Design decides the system quality: Design Stage, Coding Stage, Testing Stage, Production

Design Stage: Logical Design, Physical Design, Maintenance Plan

I Logical Design

Data Model: What is a logical data model? What is the purpose of data modeling? How to design a logical data model?

What is a Data Model? A model is an abstract representation of some real thing. Data modeling is the act of exploring data-oriented structures. A logical data model is a graphical representation of the information requirements of a business area; it is not a database.

Data Model Concepts: conceptual data models, logical data models (LDMs), physical data models (PDMs).

What is the difference between a logical data model and a physical database design?
THE LOGICAL MODEL | THE PHYSICAL DATABASE DESIGN
Includes all entities, relationships, and attributes (and their information types) whether supported by a technology or not. | Includes tables, columns, keys, datatypes, validation rules, DB triggers, stored procedures, domains, and access constraints (security).
Uses business names. | Names may be limited by the DBMS.
Captures and records information necessary for the business. | Includes technology-specific data elements such as flags, switches, and timestamps.
Includes unique identifiers. | Includes primary keys, foreign keys, and indices for fast data access.
Is normalized to at least 3rd normal form. | May be de-normalized to meet performance requirements.
Does not include any redundant data. | May include redundant data elements.
Does not include any derived data. | May include results of complex or difficult to recreate calculations.
Business experts drive the model. | Designers drive the model.

A simple logical data model.

A simple physical data model

Logical Data Model Format Logical Data Model is in format known as “Entity Relationship Diagram” (ERD) Most popular data modeling tools are Erwin, ER Studio and Power Designer.

Advantages to Using a Model Easier to understand model at a glance No need to trace through narrative descriptions of relationships Communicates one clear definition Understood by business and technical staff

Benefits of a Logical Data Model: Using a logical data model speeds maintenance and eases the transition to new technologies. It captures business requirements (ensuring understanding). The ability to share data across the enterprise results in accurate data, consistent data, and reduced costs. Changes in your business are easier to implement. Business requirements can be satisfied in the database design.

Who uses the logical data model? The Business Area Experts own the logical data model. They describe their data requirements to the data modeler and review the models created. They use the models for impact analysis of changes to business requirements. The Data Modeler conducts facilitated sessions with business area experts to gather the data requirements and build the logical data model. The data modeler also works with the process analyst to link data with processes. The data modeler is responsible for getting approval of the logical data model from the business area experts and then works with the DBA to transition the logical model to the physical model. The DBA (Designer) builds the physical data model from the logical data model. To create a good quality database design, the DBA reviews the logical model to select technology appropriate keys, create indexes, detail data types, and build referential integrity to protect the data values. The database administrator may de-normalize the database for efficiency. DBAs also are responsible for creating db schemas, maintaining referential integrity, and monitoring database performance.

Actions in Data Modeling. Identify: determine which things are represented in the model. Name: each thing represented in the model needs a unique and meaningful name. Describe: a name is important, but not sufficient. The description should be no more than three sentences, each with subject, object, and verb. It must answer: What is it? What is it not? Sometimes: What are some examples? Associate: much of the meaning is in the associations among the things represented in the model.

How to Model Data Identify entity types Identify attributes Assign keys Inversion Entries Identify relationships Normalize to Reduce Data Redundancy

What is an Entity? Entity: a person, place, thing, concept or event that the business wants to store information about A movie is an entertainment, documentary, or educational event which has been recorded in a moving picture format. MOVIE

Entity and Instance: Each entity is made up of a group of objects, which are called instances. Each instance can be distinguished from the other instances.

ENTITY Examples. Category: People, Place, Things, Event, Concept. Entity: EMPLOYEE, STUDENT, OFFICE, AUTOMOBILE, CHEMICAL, FUNDS TRANSFER, TENNIS TOURNAMENT, COUNTRY, DEPARTMENT, ORDER. Instance: Mr. Koch, Ms. Chou, HongKong, R.O.C, BMW 525i, Ammonia, 42233, U.S. OPEN, L789, I12345.

What is an Attribute? Attribute: a fact or characteristic of an entity with only one meaning (atomic). Each entity type will have one or more data attributes. Entity name: EMPLOYEE. Attributes: Employee Id, Employee Last Name, Employee First Name, Employee Address, Employee Phone Number.

Two kinds of Attributes: key attributes and non-key attributes. CONSULTANT. Key attributes: Consultant Id. Non-key attributes: Consultant Last Name, Consultant First Name, Consultant Specialization, Consultant Hourly Rate.

Candidate Keys: a single attribute or a group of attributes that can be used to identify each instance. TEACHER: Teacher Last Name, Teacher First Name, Teacher Address, Teacher Country, Teacher Certificate Id, Teacher Mother Maiden Name, Teacher Phone Number, Teacher Date of Birth.

Primary Key: the candidate key with the highest priority, used to identify the instance. Employee: Employee ID (PK), First Name, Last Name, Address, Department, Phone Number, Birthday.

Alternate Key: all the candidate keys except the PK. Employee Id, Employee Last Name (AK1), Employee First Name (AK1), Employee Address, Employee City, Employee State, Employee Zip Code, Employee Phone Number (AK2), Employee Date of Birth (AK1,AK2).
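A minimal sketch of how the alternate keys on this slide could be enforced physically, assuming each AK group becomes a UNIQUE constraint. It uses Python's built-in sqlite3 module as a stand-in for the target DBMS; the column names, sample values, and the helper `duplicate_ak1_rejected` are illustrative assumptions, not part of the original deck.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    employee_id    INTEGER PRIMARY KEY,                 -- PK
    last_name      TEXT NOT NULL,
    first_name     TEXT NOT NULL,
    phone_number   TEXT NOT NULL,
    date_of_birth  TEXT NOT NULL,
    UNIQUE (last_name, first_name, date_of_birth),      -- AK1
    UNIQUE (phone_number, date_of_birth)                -- AK2
);
INSERT INTO employee VALUES (1, 'Koch', 'Max', '555-0100', '1980-01-01');
""")

def duplicate_ak1_rejected():
    """Return True if a row repeating AK1 of row 1 is refused."""
    try:
        # Different PK and phone, but same (last, first, date of birth) as row 1.
        conn.execute(
            "INSERT INTO employee VALUES (2, 'Koch', 'Max', '555-0199', '1980-01-01')")
    except sqlite3.IntegrityError:
        return True
    return False

ak1_enforced = duplicate_ak1_rejected()
```

Each alternate key stays a valid way to identify an instance because the database refuses a second row with the same AK values.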

Inversion Entries: attributes used to look up the wanted instances. The result may not be unique. EMPLOYEE: Employee Id, Employee Last Name (AK1,IE2), Employee First Name (AK1), Employee Address, Employee City (IE1), Employee State (IE1), Employee Zip Code, Employee Phone Number, Employee Date of Birth (AK1).

What is a Relationship? Relationship: an association between occurrences of one or more entities which provides some relevant and valuable information MOVIE VIDEO TAPE is recorded on records

What is a Verb Phrase Parent-to-child verb phrase describes how the parent is related to the child. In the example to the left, the verb phrase states that “STORE rents A MOVIE.” Child-to-parent verb phrase describes how a child entity is related to a parent entity. In the example to the left, the verb phrase states that “MOVIE is rented from A STORE”

Cardinality of Relationship One-to-one One-to-many Many-to-one Many-to-many All types can be optional for one or both entities

Identifying Relationship: An identifying relationship is a relationship between two entities in which an instance of a child entity is identified through its association with a parent entity, which means the child entity is dependent on the parent entity for its identity and cannot exist without it. MOVIE MASTER: Movie Master Id, Movie Name, Movie Star, Movie Type, Movie Rating. MOVIE COPY: Movie Master Id (FK), Movie Copy Number, Movie Copy Create Date, Movie Copy Due Date, Movie Copy Condition. Verb phrases: is rented as / is created from.

Mandatory non-identifying relationship A non-identifying relationship in which an instance of the child entity must be related to an instance of the parent entity. places/ is received from CUSTOMER Customer Id Customer Name Customer Address Customer Phone ORDER Order Number Customer Id (FK) Order Date Order Status Order Shipdate

Non-mandatory non-identifying relationship A non-identifying relationship in which an instance of the child entity can exist without being related to an instance of the parent entity. EMPLOYEE Employee Id Department Number (FK) Employee Name Employee Address employs/ belongs to Department Number Department Name Department Location DEPARTMENT
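The three relationship kinds above differ mainly in where the foreign key sits and whether it may be null. The sketch below is one possible rendering, assuming Python's sqlite3 module as a stand-in DBMS. The table `customer_order` is renamed from ORDER because ORDER is a reserved word, and the helper `order_without_customer_rejected` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE movie_master (
    movie_master_id INTEGER PRIMARY KEY,
    movie_name      TEXT NOT NULL
);
-- Identifying: the parent's key is part of the child's primary key.
CREATE TABLE movie_copy (
    movie_master_id   INTEGER NOT NULL REFERENCES movie_master(movie_master_id),
    movie_copy_number INTEGER NOT NULL,
    PRIMARY KEY (movie_master_id, movie_copy_number)
);
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL
);
-- Mandatory non-identifying: FK is outside the PK and NOT NULL.
CREATE TABLE customer_order (
    order_number INTEGER PRIMARY KEY,
    customer_id  INTEGER NOT NULL REFERENCES customer(customer_id)
);
CREATE TABLE department (
    department_number INTEGER PRIMARY KEY,
    department_name   TEXT NOT NULL
);
-- Non-mandatory non-identifying: FK is outside the PK and nullable.
CREATE TABLE employee (
    employee_id       INTEGER PRIMARY KEY,
    department_number INTEGER REFERENCES department(department_number),
    employee_name     TEXT NOT NULL
);
""")

# An employee may exist without a department (nullable FK).
conn.execute("INSERT INTO employee VALUES (1, NULL, 'Koch')")

def order_without_customer_rejected():
    """Return True if an order with no customer is refused."""
    try:
        conn.execute("INSERT INTO customer_order VALUES (1, NULL)")
    except sqlite3.IntegrityError:
        return True
    return False

mandatory_enforced = order_without_customer_rejected()
```

In this sketch the NOT NULL foreign key makes the relationship mandatory, and the nullable one makes it optional.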

Many-to-Many Relationship: A many-to-many relationship is one where a relationship and its inverse are both to-many. PART is ordered from SUPPLIER; SUPPLIER sends us PART.
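In a relational database a many-to-many relationship is commonly implemented as two one-to-many relationships through an associative table, as the worked example later in this deck does with Election. The sketch below assumes Python's sqlite3 module as a stand-in DBMS; the table `part_supplier` and all sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE part     (part_id     INTEGER PRIMARY KEY, part_name     TEXT NOT NULL);
CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, supplier_name TEXT NOT NULL);
-- Hypothetical associative table: one row per (part, supplier) pairing.
CREATE TABLE part_supplier (
    part_id     INTEGER NOT NULL REFERENCES part(part_id),
    supplier_id INTEGER NOT NULL REFERENCES supplier(supplier_id),
    PRIMARY KEY (part_id, supplier_id)
);
INSERT INTO part VALUES (1, 'bolt'), (2, 'nut');
INSERT INTO supplier VALUES (10, 'Acme'), (20, 'Globex');
INSERT INTO part_supplier VALUES (1, 10), (1, 20), (2, 10);
""")

# One part can come from many suppliers ...
suppliers_of_bolt = [row[0] for row in conn.execute(
    "SELECT s.supplier_name FROM supplier s "
    "JOIN part_supplier ps ON ps.supplier_id = s.supplier_id "
    "WHERE ps.part_id = 1 ORDER BY s.supplier_name")]

# ... and one supplier can send many parts.
parts_from_acme = [row[0] for row in conn.execute(
    "SELECT p.part_name FROM part p "
    "JOIN part_supplier ps ON ps.part_id = p.part_id "
    "WHERE ps.supplier_id = 10 ORDER BY p.part_name")]
```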

[Flowchart: Build Relationship. Start -> Cardinality of R (1:M or M:M) -> M:M: inheritable or non-inheritable -> 1:M: Identifying? (Y/N) -> Y: draw and name an Identifying Relationship from Parent to Child; N: draw and name a Non-identifying Relationship from Parent to Child (FK - NO NULL or FK - NULLS ALLOWED).]

Normalize to Reduce Data Redundancy: Data normalization is a process in which data attributes within a data model are organized to increase the cohesion of entity types.
First normal form (1NF): An entity type is in 1NF when it contains no repeating groups of data.
Second normal form (2NF): An entity type is in 2NF when it is in 1NF and when all of its non-key attributes are fully dependent on its primary key.
Third normal form (3NF): An entity type is in 3NF when it is in 2NF and when all of its attributes are directly dependent on the primary key.

Normalization Step by step process to verify and refine logical data model Condition of model at completion of each step is a “normal form” DOT standard is third normal form First normal form: Eliminate repeating groups Second normal form: Ensure that all attributes depend on the entity identifier Third normal form: Ensure that all attributes depend only on the entity identifier

1st Normal Form Eliminate repeating groups To remove the repeating group of fields, collapse them into a single field with multiple records in a new table, related back to the primary data.

2nd Normal Form: Uniquely identify each instance. Each table must contain attributes for a single subject, and each table must contain an attribute (or set of attributes) that uniquely identifies a single record within that table.

3rd Normal Form Eliminate columns not dependent on the key Each attribute must depend on the primary key, so the violating fields are moved into separate, related tables.
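As a rough illustration of the 3NF step, the sketch below moves an attribute that depends on a non-key attribute into its own related structure. It is plain Python over hypothetical sample rows; the field names and the function `to_third_normal_form` are assumptions made for this example.

```python
# Hypothetical denormalized rows: college_address depends on college,
# not directly on student_id, so it violates 3NF.
flat_rows = [
    {"student_id": 1, "name": "Ann", "college": "Science", "college_address": "North Campus"},
    {"student_id": 2, "name": "Bob", "college": "Science", "college_address": "North Campus"},
    {"student_id": 3, "name": "Cy",  "college": "Arts",    "college_address": "South Campus"},
]

def to_third_normal_form(rows):
    """Split out attributes that depend on a non-key attribute (college)."""
    colleges = {}   # college name -> college row
    students = []
    for row in rows:
        # Reuse the existing college row, or register a new one with the next id.
        college = colleges.setdefault(
            row["college"],
            {"college_id": len(colleges) + 1,
             "college_name": row["college"],
             "college_address": row["college_address"]},
        )
        students.append({"student_id": row["student_id"],
                         "name": row["name"],
                         "college_id": college["college_id"]})
    return students, list(colleges.values())

students, colleges = to_third_normal_form(flat_rows)
```

After the split, each college address is stored once, and the student rows keep only a reference to the college.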

II Physical Design

Physical Design: Mapping Logical Model to Physical Model, Naming standard, Identify table type, Column Data Type, Group tables, Assign Keys, Choose Index, Denormalize to improve performance, Storage

Mapping Logical Model to Physical Model Entity -> Table Attribute -> Column Primary Key -> Primary Key Relationship -> Foreign Key Inversion Entry -> Index
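The mapping rules above can be sketched as DDL. This example assumes Python's sqlite3 module as a stand-in DBMS and reuses the EMPLOYEE and DEPARTMENT entities from the earlier slides; the index name `ie1_employee_city_state` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Entity DEPARTMENT -> table; attributes -> columns; primary key -> PRIMARY KEY.
CREATE TABLE department (
    department_number INTEGER PRIMARY KEY,
    department_name   TEXT NOT NULL
);
-- Relationship "DEPARTMENT employs EMPLOYEE" -> foreign key column.
CREATE TABLE employee (
    employee_id       INTEGER PRIMARY KEY,
    employee_name     TEXT NOT NULL,
    employee_city     TEXT,
    employee_state    TEXT,
    department_number INTEGER REFERENCES department(department_number)
);
-- Inversion entry IE1 (city, state) -> non-unique index.
CREATE INDEX ie1_employee_city_state ON employee (employee_city, employee_state);
""")

# Column 1 of each PRAGMA index_list row is the index name.
index_names = [row[1] for row in conn.execute("PRAGMA index_list('employee')")]
```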

Naming Standard: Name the DB objects according to a defined naming standard. Example: a table should have the prefix t_. Define abbreviations. Example: Cargo -> CGO

Table Types Table Purpose Data Wave Data Size

Table Purpose Transaction Table Log Table / Analysis table Statistics Table Supporting Table

Data Wave Stable Table Increasing Table Volatile Table

Data Size Large Table Small Table

Group Table Group table by business module Group table by relationship

Column Data Type. Choose data type: Char, Varchar2, Number, Integer, Float. Length. LOB: store in row, or store in another tablespace.

Assign Primary Key. Natural Key: assign a natural key, which is one or more existing data attributes that are unique to the business concept. Surrogate Key: introduce a new column, called a surrogate key, which is a key that has no business meaning.

Natural Key. Advantages: no need to introduce a new column; meaningful and understandable; key value is transferable. Disadvantages: may be changed when business requirements change; may contain many columns in future generations; the key value may be updated, which will also impact child tables.

Surrogate Key. Advantages: not related to the business, so it is easy to maintain; stable; contains just one single column, which simplifies the foreign key. Disadvantages: will lead to recursive relationships; the relationship and its type are hard to understand; may add redundant code.

How to choose surrogate key? Key assigned by the RDBMS, e.g. SEQUENCE Max()+1 Universally Unique Identifiers (UUID) Global Unique Identifiers (GUID) High-Low strategy
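Two of the strategies listed above can be sketched in a few lines of Python. The High-Low generator below is a simplified illustration, assuming an in-memory counter in place of the database sequence that would normally supply the "hi" value. The class `HiLoGenerator` and its parameters are hypothetical names.

```python
import itertools
import uuid

class HiLoGenerator:
    """Hands out keys from blocks of `block_size` ids; only the 'hi' value
    would need a round trip to a shared store (simulated here by a counter)."""

    def __init__(self, block_size=100):
        self.block_size = block_size
        self._hi_source = itertools.count(0)  # stand-in for a database sequence
        self._hi = next(self._hi_source)
        self._lo = 0

    def next_key(self):
        if self._lo >= self.block_size:       # block exhausted: fetch a new 'hi'
            self._hi = next(self._hi_source)
            self._lo = 0
        key = self._hi * self.block_size + self._lo
        self._lo += 1
        return key

gen = HiLoGenerator(block_size=3)
hilo_keys = [gen.next_key() for _ in range(7)]   # spans three blocks
uuid_key = str(uuid.uuid4())                     # random 128-bit identifier
```

The High-Low approach trades a little key-space waste for fewer trips to the shared sequence, while a UUID needs no coordination at all but produces wider keys.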

Choose Key Strategies Unique Minimal Columns Not null Stable Fit to the application

Assign Foreign Key: Ensure data integrity. Delete/Update Cascade. In which cases is there no need to assign a foreign key?
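A small sketch of what a foreign key with delete cascade does, assuming Python's sqlite3 module as a stand-in DBMS. The tables, sample rows, and the helper `orphan_insert_rejected` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT NOT NULL);
CREATE TABLE customer_order (
    order_number INTEGER PRIMARY KEY,
    customer_id  INTEGER NOT NULL
        REFERENCES customer(customer_id) ON DELETE CASCADE
);
INSERT INTO customer VALUES (1, 'Koch'), (2, 'Chou');
INSERT INTO customer_order VALUES (100, 1), (101, 1), (102, 2);
""")

def orphan_insert_rejected():
    """Return True if an order pointing at a missing customer is refused."""
    try:
        conn.execute("INSERT INTO customer_order VALUES (103, 999)")
    except sqlite3.IntegrityError:
        return True
    return False

rejected = orphan_insert_rejected()

# Deleting the parent removes its child rows through the cascade rule.
conn.execute("DELETE FROM customer WHERE customer_id = 1")
remaining_orders = [r[0] for r in conn.execute(
    "SELECT order_number FROM customer_order ORDER BY order_number")]
```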

How to choose index: Proto-index from logical model, Eliminate overlapped index, Eliminate low-hit index, Column sequence in index, B-Tree vs. Bitmap

Proto-index from logical model Inversion Entry Primary Key Candidate Key Foreign Key

Eliminate overlapped index Index overlap index Multiple Option Columns

Eliminate low-hit index Small Table / Cached Table Indexed Column cardinality (1/distinct_value_num)*total_value_num
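The formula on this slide estimates how many rows an average lookup on the indexed column returns. The sketch below applies it, assuming values are uniformly distributed. The 5% cut-off in `is_low_hit_candidate` is purely an illustrative threshold, not a rule from the deck, and both function names are hypothetical.

```python
def expected_rows_per_value(distinct_values: int, total_rows: int) -> float:
    """(1 / distinct_value_num) * total_value_num, assuming a uniform distribution."""
    if distinct_values <= 0:
        raise ValueError("distinct_values must be positive")
    return total_rows / distinct_values

def is_low_hit_candidate(distinct_values, total_rows, max_fraction=0.05):
    """Flag an index whose average lookup would return more than `max_fraction`
    of the table. The 5% default is an illustrative threshold, not a rule."""
    return expected_rows_per_value(distinct_values, total_rows) > max_fraction * total_rows

# A two-valued column returns half the table per lookup; a unique column returns one row.
sex_column = is_low_hit_candidate(distinct_values=2, total_rows=1_000_000)
id_column = is_low_hit_candidate(distinct_values=1_000_000, total_rows=1_000_000)
```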

Column sequence in index: a frequently searched column leads the index; a low-cardinality column leads the index; a good column sequence helps eliminate duplicated indexes.

B-Tree vs. Bitmap. B-Tree Index: OLTP table, high-cardinality column. Bitmap Index: DSS/OLAP table, low-cardinality column.

Denormalize to improve performance: Adding redundant data to avoid costly table joins can dramatically improve query performance.

When to denormalize? Two tables are repeatedly joined together. Additional query item. Additional order by item.

Which columns can be redundant? Small data columns. Static and rarely updated columns.

Materialized View: A materialized view is a database object that contains the results of a query. It is a view of tables whose query result is stored physically.

Redundancy & Integrity: Trigger, Scheduled Job
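One way a trigger can keep a redundant column consistent with its source is sketched below, assuming Python's sqlite3 module as a stand-in DBMS. The redundant `election.course_name` column and the trigger name `trg_course_name_sync` are hypothetical choices for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course (course_id INTEGER PRIMARY KEY, course_name TEXT NOT NULL);
-- election.course_name is a deliberately redundant copy to avoid a join.
CREATE TABLE election (
    student_id  INTEGER NOT NULL,
    course_id   INTEGER NOT NULL REFERENCES course(course_id),
    course_name TEXT NOT NULL,
    PRIMARY KEY (student_id, course_id)
);
-- The trigger propagates changes of the source column to the redundant copy.
CREATE TRIGGER trg_course_name_sync
AFTER UPDATE OF course_name ON course
BEGIN
    UPDATE election SET course_name = NEW.course_name
    WHERE course_id = NEW.course_id;
END;
INSERT INTO course VALUES (1, 'Databases');
INSERT INTO election VALUES (7, 1, 'Databases');
UPDATE course SET course_name = 'Database Systems' WHERE course_id = 1;
""")

synced_name = conn.execute(
    "SELECT course_name FROM election WHERE student_id = 7").fetchone()[0]
```

A scheduled job would do the same reconciliation in batches instead of on every update, which suits columns that tolerate short periods of staleness.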

Storage: Tablespace, Table storage

Tablespace: Dictionary-Managed Tablespace (DMT), Locally Managed Tablespace (LMT)

ASSM ASSM (Automatic Segment Space Management) is a method used by Oracle to manage space inside data blocks. It eliminates the need to specify parameters like PCTUSED, Freelists and Freelist groups for objects created in the tablespace.

Table Storage Cached Table Index Organized Table Compressed Table Partition Table Cluster Table External Table Global Temporary Table

Cached Table For data that is accessed frequently, this clause indicates that the blocks retrieved for this table are placed at the most recently used end of the least recently used (LRU) list in the buffer cache when a full table scan is performed. This attribute is useful for small lookup tables. You cannot specify CACHE for an index-organized table. However, index-organized tables implicitly provide CACHE behavior.

Index Organized Table The data rows are held in an index defined on the primary key for the table. Best suited for primary key-based access and manipulation.

Compressed Table Enables data segment compression to reduce disk use. Only for heap-organized tables. LOB data segments are not compressed.

Partition Table: Partition the table by rules. Data is stored in different partitions. You cannot partition a table that is part of a cluster. You cannot partition a table containing any LONG or LONG RAW columns.

Cluster Table Specify one column from the table for each column in the cluster key. A clustered table uses the cluster's space allocation. Object tables and tables containing LOB columns cannot be part of a cluster.

External Table: A read-only table whose metadata is stored in the database and whose data is stored outside the database, in a flat file. You can specify only column, datatype, and inline_constraint. You cannot specify constraints on an external table. It cannot have object type columns, LOB columns, or LONG columns.

Global Temporary Table: The table is temporary and its definition is visible to all sessions. The data in a temporary table is visible only to the session that inserts the data into the table. It contains either session-specific or transaction-specific data, as decided by the ON COMMIT clause.

Maintenance Plan: Table Sizing, Housekeeping Plan, Analyze Statistics data

Table Sizing. Data type length: VARCHAR2, LOB, other types. Index: Rowid. Data growth.

Initial sizing method: Calculate the row size by summing the column lengths. Insert initial data and analyze the table to get the row size. Analyze an existing table to get the row size. Allow for space fragment redundancy (5%~30%).
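The first sizing method can be sketched as simple arithmetic: sum the column lengths, multiply by the expected row count, and pad by the fragment allowance. The function name, the sample column lengths, and the 20% default below are assumptions. A real estimate would also account for block overhead, indexes, and data growth.

```python
def estimate_table_bytes(column_lengths, row_count, fragment_percent=20):
    """Row size = sum of column lengths; total = rows * row size, padded by a
    space-fragment allowance (the slide suggests 5%~30%)."""
    if not 5 <= fragment_percent <= 30:
        raise ValueError("allowance outside the 5%~30% range from the slide")
    row_size = sum(column_lengths)
    # Integer arithmetic avoids floating-point rounding in the result.
    return row_size * row_count * (100 + fragment_percent) // 100

# Hypothetical table: an 8-byte id, two 50-byte names, a 200-byte address.
estimate = estimate_table_bytes([8, 50, 50, 200], row_count=1_000_000)
```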

Housekeeping Plan: Which tables need to be housekept? When to perform housekeeping? How to housekeep?

Which tables need to be housekept? Transaction tables / log tables; increasing tables; large tables

When to perform housekeeping? Housekeeping is a high-cost operation. It should be performed at low-load or down time. A high housekeeping frequency helps keep the HWM low. It should be performed periodically.

How to housekeep? Housekeep condition Time Status Online data ->[Compressed Data ] -> [ Archived Data ] -> Deleted data Schedule Job / Manually
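A minimal sketch of the archive-then-delete part of this flow, selecting rows by a time and status condition. It assumes Python's sqlite3 module as a stand-in DBMS. The tables `txn_log` and `txn_log_archive`, the status value, and the function `housekeep` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE txn_log         (id INTEGER PRIMARY KEY, created_on TEXT NOT NULL, status TEXT NOT NULL);
CREATE TABLE txn_log_archive (id INTEGER PRIMARY KEY, created_on TEXT NOT NULL, status TEXT NOT NULL);
INSERT INTO txn_log VALUES
    (1, '2023-01-15', 'DONE'),
    (2, '2023-06-30', 'OPEN'),
    (3, '2024-02-01', 'DONE');
""")

def housekeep(conn, cutoff_date):
    """Move closed rows older than cutoff_date to the archive, then delete them."""
    condition = "created_on < ? AND status = 'DONE'"   # time + status condition
    with conn:  # one transaction: archive and delete succeed or fail together
        conn.execute(
            f"INSERT INTO txn_log_archive SELECT * FROM txn_log WHERE {condition}",
            (cutoff_date,))
        deleted = conn.execute(
            f"DELETE FROM txn_log WHERE {condition}", (cutoff_date,)).rowcount
    return deleted

moved = housekeep(conn, "2024-01-01")
online_ids = [r[0] for r in conn.execute("SELECT id FROM txn_log ORDER BY id")]
archived_ids = [r[0] for r in conn.execute("SELECT id FROM txn_log_archive ORDER BY id")]
```

Such a function could be invoked manually or from a scheduled job during a low-load window.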

Analyze Statistics data: Which tables need to be analyzed? When to analyze?

Which tables need to be analyzed? Under the CBO, all tables need to be analyzed. Different kinds of tables have different analyze intervals.

When to analyze? After the table has been online for a while and holds enough data. When the data volume has changed dramatically. When the table structure has changed.

IV Example Student Course Management System

Student Course Management System, Entities: Student, Course.

Student Course Management System, Attributes. Student: Student ID, Name, Sex, Age, Address, College, College Address. Course: Course ID, Course Name, Teacher ID, Teacher Name.

1NF – Eliminate Repeating Groups. Student: Student ID, First Name, Last Name, Sex, Age, Address, College, College Address. Course: Course ID, Course Name, Teacher ID, Teacher First Name, Teacher Last Name.

Student Course Management System, Keys. Student: Student ID (PK), First Name (AK1), Last Name (AK1), Sex, Age, Address (AK1), College, College Address. Course: Course ID (PK), Course Name (AK1), Teacher ID (AK1), Teacher First Name, Teacher Last Name.

Student Course Management System, Inversion Entry. Student: Student ID (PK), First Name (AK1), Last Name (AK1), Sex, Age, Address (AK1), College (IE1), College Address. Course: Course ID (PK), Course Name (AK1) (IE1), Teacher ID (AK1) (IE2), Teacher First Name, Teacher Last Name.

Student Course Management System, Relationship: Student Elect Course; Course Open For Student. Student: Student ID (PK), First Name (AK1), Last Name (AK1), Sex, Age, Address (AK1), College (IE1), College Address. Course: Course ID (PK), Course Name (AK1) (IE1), Teacher ID (AK1) (IE2), Teacher First Name, Teacher Last Name.

Student Course Management System, Transform Many-to-Many to One-to-Many. Student: Student ID (PK), First Name (AK1), Last Name (AK1), Sex, Age, Address (AK1), College (IE1), College Address. Course: Course ID (PK), Course Name (AK1) (IE1), Teacher ID (AK1) (IE2), Teacher First Name, Teacher Last Name. Election: Student ID (FK1), Course ID (FK2), Score, Election Times, Credit Hour. Relationships: Student Elect Course; Course Open For Student.

Student Course Management System, 2NF -- Ensure that all attributes depend on the entity identifier. Student: Student ID (PK), First Name (AK1), Last Name (AK1), Sex, Age, Address (AK1), College (IE1), College Address. Course: Course ID (PK), Course Name (AK1) (IE1), Teacher ID (AK1) (IE2), Teacher First Name, Teacher Last Name, Credit Hour. Election: Student ID (FK1), Course ID (FK2), Score, Election Times. Relationships: Student Elect Course; Course Open For Student.

Student Course Management System, 3NF -- Ensure that all attributes depend only on the entity identifier. Student: Student ID (PK), First Name (AK1), Last Name (AK1), Sex, Age, Address (AK1), College ID (IE1) (FK1). Election: Student ID (FK1), Course ID (FK2), Score, Election Times. Course: Course ID (PK), Course Name (AK1) (IE1), Teacher ID (AK1) (IE2) (FK1), Credit Hour. Teacher: Teacher ID, Teacher First Name, Teacher Last Name, College ID (FK1). College: College ID, College Name, College Address, Rector. Relationships: Student Elect Course; Course Open For Student; Teacher Teach Course; Teacher Belong to College; Student Belong to College.
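The final 3NF model could be carried into a physical schema roughly as follows. This is a sketch only, using Python's sqlite3 module as a stand-in DBMS. The composite primary key on `election`, the NOT NULL foreign keys, the index names, and all sample rows are assumptions that the slides do not specify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE college (
    college_id INTEGER PRIMARY KEY, college_name TEXT NOT NULL,
    college_address TEXT, rector TEXT
);
CREATE TABLE teacher (
    teacher_id INTEGER PRIMARY KEY, first_name TEXT NOT NULL, last_name TEXT NOT NULL,
    college_id INTEGER NOT NULL REFERENCES college(college_id)
);
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY, first_name TEXT NOT NULL, last_name TEXT NOT NULL,
    sex TEXT, age INTEGER, address TEXT,
    college_id INTEGER NOT NULL REFERENCES college(college_id)
);
CREATE TABLE course (
    course_id INTEGER PRIMARY KEY, course_name TEXT NOT NULL, credit_hour INTEGER,
    teacher_id INTEGER NOT NULL REFERENCES teacher(teacher_id)
);
-- ELECTION resolves the many-to-many between STUDENT and COURSE.
CREATE TABLE election (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course_id  INTEGER NOT NULL REFERENCES course(course_id),
    score INTEGER, election_times INTEGER,
    PRIMARY KEY (student_id, course_id)   -- assumed; the slide marks only FKs
);
-- Inversion entries become non-unique indexes.
CREATE INDEX ie1_student_college ON student (college_id);
CREATE INDEX ie1_course_name ON course (course_name);
INSERT INTO college VALUES (1, 'Science', 'North Campus', 'Dr. Lee');
INSERT INTO teacher VALUES (1, 'Ada', 'Wong', 1);
INSERT INTO student VALUES (1, 'Ann', 'Li', 'F', 20, '1 Main St', 1);
INSERT INTO course  VALUES (1, 'Databases', 3, 1);
INSERT INTO election VALUES (1, 1, 90, 1);
""")

# Reassemble the original flat view by joining the normalized tables.
row = conn.execute("""
    SELECT s.first_name, c.course_name, t.last_name, g.college_name, e.score
    FROM election e
    JOIN student s ON s.student_id = e.student_id
    JOIN course  c ON c.course_id  = e.course_id
    JOIN teacher t ON t.teacher_id = c.teacher_id
    JOIN college g ON g.college_id = s.college_id
""").fetchone()
```

The join shows that no information was lost by normalizing: every attribute of the original flat Student and Course lists can still be retrieved.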

Q & A

Thanks! www.HelloDBA.com fuyuncat