Maturity DB Process Design Stage Review Logical Design Physical Design


1 Maturity DB Process Design Stage Review Logical Design Physical Design
DDL Script Review Coding Unit Test Integration Test Evaluation Stress Test Production

2 Design decides the system quality
Design Stage Coding Stage Testing Stage Production

3 Design Stage Logical Design Physical Design Maintain Plan

4 I Logical Design

5 Data Model What is a logical data model?
What is the purpose of data modeling? How to design logical data model?

6 What is a Data Model? A model is an abstract representation of some real thing. Data modeling is the act of exploring data-oriented structures. A logical data model is a graphical representation of the information requirements of a business area; it is not a database.

7 Data Models Concepts Conceptual data models
Logical data models (LDMs).  Physical data models (PDMs).

8 What is the difference between a logical data model and a physical database design?
THE LOGICAL MODEL vs. THE PHYSICAL DATABASE DESIGN
Logical: Includes all entities, relationships, and attributes (and their information types) whether supported by a technology or not.
Physical: Includes tables, columns, keys, datatypes, validation rules, DB triggers, stored procedures, domains, and access constraints (security).
Logical: Uses business names.
Physical: Names may be limited by the DBMS.
Logical: Captures and records information necessary for the business.
Physical: Includes technology-specific data elements such as flags, switches, and timestamps.
Logical: Includes unique identifiers.
Physical: Includes primary keys, foreign keys, and indices for fast data access.
Logical: Is normalized to at least 3rd normal form.
Physical: May be de-normalized to meet performance requirements.
Logical: Does not include any redundant data.
Physical: May include redundant data elements.
Logical: Does not include any derived data.
Physical: May include results of complex or difficult to recreate calculations.
Logical: Business experts drive the model.
Physical: Designers drive the model.

9 A simple logical data model.

10 A simple physical data model

11 Logical Data Model Format
A logical data model is drawn in a format known as an “Entity Relationship Diagram” (ERD). Popular data modeling tools include ERwin, ER/Studio, and PowerDesigner.

12 Data Model What is a logical data model?
What is the purpose of data modeling? How to design logical data model?

13 Advantages to Using a Model
Easier to understand the model at a glance
No need to trace through narrative descriptions of relationships
Communicates one clear definition
Understood by business and technical staff

14 Benefits of a Logical Data Model
Using a logical data model speeds maintenance and eases the transition to new technologies.
Captures business requirements (ensures understanding)
Ability to share data across the enterprise, resulting in: accurate data, consistent data, reduced costs
Easier to implement changes in your business
Business requirements can be satisfied in database design

15 Data Model What is a logical data model?
What is the purpose of data modeling? How to design logical data model?

16 Who uses the logical data model?
The Business Area Experts own the logical data model. They describe their data requirements to the data modeler and review the models created. They use the models for impact analysis of changes to business requirements. The Data Modeler conducts facilitated sessions with business area experts to gather the data requirements and build the logical data model. The data modeler also works with the process analyst to link data with processes. The data modeler is responsible for getting approval of the logical data model from the business area experts and then works with the DBA to transition the logical model to the physical model. The DBA (Designer) builds the physical data model from the logical data model. To create a good quality database design, the DBA reviews the logical model to select technology appropriate keys, create indexes, detail data types, and build referential integrity to protect the data values. The database administrator may de-normalize the database for efficiency. DBAs also are responsible for creating db schemas, maintaining referential integrity, and monitoring database performance.

17 Actions in Data Modeling
Identify: Determine which things are represented in the model.
Name: Each thing represented in the model needs to have a unique and meaningful name.
Describe: A name is important, but not sufficient. The description should be no more than three sentences, each with subject, object, and verb. It must answer: What is it? What is it not? Sometimes: What are some examples?
Associate: Much of the meaning is in the associations among the things represented in the model.

18 How to Model Data Identify entity types Identify attributes
Assign keys Inversion Entries Identify relationships Normalize to Reduce Data Redundancy

19 What is an Entity? Entity: a person, place, thing, concept or event that the business wants to store information about A movie is an entertainment, documentary, or educational event which has been recorded in a moving picture format. MOVIE

20 Entity and Instance Each entity is made up of a group of objects, which are called instances. Each instance can be distinguished from the other instances.

21 ENTITY Examples
Category: Entities: Instances
People: EMPLOYEE, STUDENT: Mr.Koch, Ms.Chou
Place: OFFICE, COUNTRY: HongKong, R.O.C
Things: AUTOMOBILE, CHEMICAL: BMW 525i, Ammonia
Event: FUNDS TRANSFER, TENNIS TOURNAMENT: 42233, U.S. OPEN
Concept: DEPARTMENT, ORDER: L789, I12345

22 What is an Attribute? Attribute: a fact or characteristic of an entity with only one meaning (atomic). Each entity type will have one or more data attributes.
Entity name: EMPLOYEE
Attributes: Employee Id, Employee Last Name, Employee First Name, Employee Address, Employee Phone Number

23 Two kinds of Attributes
Entity: CONSULTANT
Key Attributes: Consultant Id
Non-key Attributes: Consultant Last Name, Consultant First Name, Consultant Specialization, Consultant Hourly Rate

24 Candidate Keys A single attribute or a group of attributes that can be used to identify each instance.
Entity: TEACHER
Attributes: Teacher Last Name, Teacher First Name, Teacher Address, Teacher Country, Teacher Certificate Id, Teacher Mother Maiden Name, Teacher Phone Number, Teacher Date of Birth
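One way to sanity-check a proposed candidate key is to test it against sample data: a combination of attributes qualifies only if no two instances share the same values for it. The sketch below is illustrative; the `is_candidate_key` helper and the sample TEACHER rows are hypothetical and not taken from the slides. Sample data can only disprove uniqueness; the business rules decide whether a key is genuinely unique.

```python
def is_candidate_key(rows, attrs):
    """Return True if the attribute combination is unique across all sample rows."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False  # two instances share this value, so it cannot identify one
        seen.add(value)
    return True


# Hypothetical sample instances of the TEACHER entity.
teachers = [
    {"last_name": "Lee", "first_name": "Ann", "certificate_id": "C-001"},
    {"last_name": "Lee", "first_name": "Bob", "certificate_id": "C-002"},
    {"last_name": "Wong", "first_name": "Ann", "certificate_id": "C-003"},
]

single_ok = is_candidate_key(teachers, ["certificate_id"])
last_name_ok = is_candidate_key(teachers, ["last_name"])
full_name_ok = is_candidate_key(teachers, ["last_name", "first_name"])
print(single_ok, last_name_ok, full_name_ok)
```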

25 Primary Key The candidate key with the highest priority, used to identify each instance.
Entity: Employee
Attributes: Employee ID (PK), First Name, Last Name, Address, Department, Phone Number, Birthday

26 Alternate Key All candidate keys other than the primary key.
Employee Id
Employee Last Name (AK1)
Employee First Name (AK1)
Employee Address
Employee City
Employee State
Employee Zip Code
Employee Phone Number (AK2)
Employee Date of Birth (AK1,AK2)

27 Inversion Entries An attribute or group of attributes used to look up the wanted instances. The result may not be unique.
Entity: EMPLOYEE
Employee Id
Employee Last Name (AK1,IE2)
Employee First Name (AK1)
Employee Address
Employee City (IE1)
Employee State (IE1)
Employee Zip Code
Employee Phone Number
Employee Date of Birth (AK1)

28 What is a Relationship? Relationship: an association between occurrences of one or more entities which provides some relevant and valuable information MOVIE VIDEO TAPE is recorded on records

29 What is a Verb Phrase Parent-to-child verb phrase describes how the parent is related to the child. In the example to the left, the verb phrase states that “STORE rents A MOVIE.” Child-to-parent verb phrase describes how a child entity is related to a parent entity. In the example to the left, the verb phrase states that “MOVIE is rented from A STORE”

30 Cardinality of Relationship
One-to-one One-to-many Many-to-one Many-to-many All types can be optional for one or both entities

31 Identifying Relationship
An identifying relationship is a relationship between two entities in which an instance of a child entity is identified through its association with a parent entity, which means the child entity depends on the parent entity for its identity and cannot exist without it.
MOVIE MASTER: Movie Master Id, Movie Name, Movie Star, Movie Type, Movie Rating
MOVIE COPY: Movie Master Id (FK), Movie Copy Number, Movie Copy Create Date, Movie Copy Due Date, Movie Copy Condition
Verb phrases: is rented as / is created from

32 Mandatory non-identifying relationship
A non-identifying relationship in which an instance of the child entity must be related to an instance of the parent entity. places/ is received from CUSTOMER Customer Id Customer Name Customer Address Customer Phone ORDER Order Number Customer Id (FK) Order Date Order Status Order Shipdate

33 Non-mandatory non-identifying relationship
A non-identifying relationship in which an instance of the child entity can exist without being related to an instance of the parent entity. EMPLOYEE Employee Id Department Number (FK) Employee Name Employee Address employs/ belongs to Department Number Department Name Department Location DEPARTMENT
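The three relationship types on slides 31 to 33 differ mainly in where the foreign key sits and whether it may be null. The following is a minimal sketch using Python's built-in `sqlite3` module rather than the DBMS the slides target; the snake_case table and column names and the `try_insert` helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE movie_master (
    movie_master_id INTEGER PRIMARY KEY,
    movie_name      TEXT NOT NULL
);
-- Identifying: the parent key is part of the child's primary key.
CREATE TABLE movie_copy (
    movie_master_id   INTEGER NOT NULL REFERENCES movie_master(movie_master_id),
    movie_copy_number INTEGER NOT NULL,
    PRIMARY KEY (movie_master_id, movie_copy_number)
);
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL
);
-- Mandatory non-identifying: FK is outside the primary key and NOT NULL.
CREATE TABLE customer_order (
    order_number INTEGER PRIMARY KEY,
    customer_id  INTEGER NOT NULL REFERENCES customer(customer_id)
);
CREATE TABLE department (
    department_number INTEGER PRIMARY KEY,
    department_name   TEXT NOT NULL
);
-- Non-mandatory non-identifying: FK is outside the primary key and nullable.
CREATE TABLE employee (
    employee_id       INTEGER PRIMARY KEY,
    department_number INTEGER REFERENCES department(department_number),
    employee_name     TEXT NOT NULL
);
""")


def try_insert(sql):
    """Run one INSERT and report whether the constraints accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


employee_without_department = try_insert("INSERT INTO employee VALUES (1, NULL, 'Ann')")
order_without_customer = try_insert("INSERT INTO customer_order VALUES (1, NULL)")
copy_without_master = try_insert("INSERT INTO movie_copy VALUES (99, 1)")
print(employee_without_department, order_without_customer, copy_without_master)
```

Only the employee row is accepted: an order needs a customer, and a movie copy cannot exist without its movie master.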

34 Many-to-Many Relationship
A many-to-many relationship is one where a relationship and its inverse are both to-many (if you are used to entity-relationship modeling using a relational database. is ordered from /sends us PART SUPPLIER

35 Build Relationship
[Flowchart. Recoverable labels: Start; Cardinality of R (1:M or M:M); Identifying or Non-identifying (Y/N); Draw and name an Identifying Relationship from Parent to Child; Draw and name a Non-identifying Relationship from Parent to Child; FK - NO NULL; FK - NULLS ALLOWED; M:M inheritable or Non-inheritable.]

36 Normalize to Reduce Data Redundancy
Data normalization is a process in which data attributes within a data model are organized to increase the cohesion of entity types.
Level: Rule
First normal form (1NF): An entity type is in 1NF when it contains no repeating groups of data.
Second normal form (2NF): An entity type is in 2NF when it is in 1NF and when all of its non-key attributes are fully dependent on its primary key.
Third normal form (3NF): An entity type is in 3NF when it is in 2NF and when all of its attributes are directly dependent on the primary key.

37 Normalization
Step-by-step process to verify and refine the logical data model
The condition of the model at completion of each step is a “normal form”
DOT standard is third normal form
First normal form: Eliminate repeating groups
Second normal form: Ensure that all attributes depend on the entity identifier
Third normal form: Ensure that all attributes depend only on the entity identifier

38 1st Normal Form Eliminate repeating groups
To remove the repeating group of fields, collapse them into a single field with multiple records in a new table, related back to the primary data.

39 2nd Normal Form Uniquely identify each instance
Each table must contain attributes for a single subject, and each table must contain an attribute (or set of attributes) that uniquely identifies a single record within that table.

40 3rd Normal Form Eliminate columns not dependent on the key
Each attribute must depend on the primary key, so the violating fields are moved into separate, related tables.

41 II Physical Design

42 Physical Design Mapping Logical Model to Physical Model
Naming standard
Identify table type
Column data type
Group tables
Assign keys
Choose index
Denormalize to improve performance
Storage

43 Mapping Logical Model to Physical Model
Entity -> Table Attribute -> Column Primary Key -> Primary Key Relationship -> Foreign Key Inversion Entry -> Index
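The mapping above can be sketched as DDL. This example uses Python's built-in `sqlite3` module as a stand-in for the target DBMS; the `t_` prefix follows the naming-standard slide, while the column names, index name, and the choice of the EMPLOYEE entity with its city/state inversion entry are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t_department (
    department_number INTEGER PRIMARY KEY,
    department_name   TEXT NOT NULL
);
CREATE TABLE t_employee (                        -- Entity -> Table
    employee_id        INTEGER PRIMARY KEY,      -- Primary Key -> Primary Key
    employee_last_name TEXT NOT NULL,            -- Attribute -> Column
    employee_city      TEXT,
    employee_state     TEXT,
    department_number  INTEGER
        REFERENCES t_department(department_number)  -- Relationship -> Foreign Key
);
-- Inversion Entry (city, state) -> non-unique Index
CREATE INDEX ix_employee_city_state ON t_employee (employee_city, employee_state);
""")

# Read the physical objects back from the catalog.
index_names = [row[1] for row in conn.execute("PRAGMA index_list(t_employee)")]
fk_targets = [row[2] for row in conn.execute("PRAGMA foreign_key_list(t_employee)")]
print(index_names, fk_targets)
```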

44 Naming Standard
Name the db objects according to a defined naming standard. Example: tables should have the prefix t_
Define abbreviations. Example: Cargo -> CGO

45 Table Types Table Purpose Data Wave Data Size

46 Table Purpose Transaction Table Log Table / Analysis table
Statistics Table Supporting Table

47 Data Wave Stable Table Increasing Table Volatile Table

48 Data Size Large Table Small Table

49 Group Table Group table by business module Group table by relationship

50 Column Data Type
Choose data type: Char, Varchar2, Number, Integer, Float
Length
LOB: store in row, or store in another tablespace

51 Assign Primary Key
Natural Key: Assign a natural key, which is one or more existing data attributes that are unique to the business concept.
Surrogate Key: Introduce a new column, called a surrogate key, which is a key that has no business meaning.

52 Natural Key
Advantages: No need to introduce a new column. Meaningful and understandable. Key value is transferable.
Disadvantages: May change when business requirements change. May contain many columns in future generations. Key value may be updated, which will also impact child tables.

53 Surrogate Key
Advantages: Not related to the business, so easy to maintain. Stable. Contains just one single column, which simplifies the foreign key.
Disadvantages: Will lead to recursive relationships. Hard to understand the relationship and its type. May add redundant code.

54 How to choose surrogate key?
Key assigned by the RDBMS, e.g. SEQUENCE
Max()+1
Universally Unique Identifiers (UUID)
Global Unique Identifiers (GUID)
High-Low strategy
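Two of these strategies can be sketched without a database. In the High-Low strategy, each client reserves a block of keys by fetching one "high" value from a shared source and then hands out the "low" values locally. The `HiLoKeyGenerator` class below is a hypothetical illustration, and `itertools.count` is only a stand-in for a shared database sequence.

```python
import itertools
import uuid


class HiLoKeyGenerator:
    """Hand out surrogate keys from blocks reserved through a shared 'high' counter."""

    def __init__(self, next_high, block_size=100):
        self._next_high = next_high    # callable returning the next high value
        self._block_size = block_size
        self._high = None
        self._low = block_size         # forces a block fetch on first use

    def next_key(self):
        if self._low >= self._block_size:
            self._high = self._next_high()  # one round trip reserves a whole block
            self._low = 0
        key = self._high * self._block_size + self._low
        self._low += 1
        return key


# Stand-in for a database sequence shared by all clients (assumption).
shared_sequence = itertools.count(0)


def next_high():
    return next(shared_sequence)


client_a = HiLoKeyGenerator(next_high, block_size=3)
client_b = HiLoKeyGenerator(next_high, block_size=3)
keys_a = [client_a.next_key() for _ in range(4)]  # uses blocks 0 and 1
keys_b = [client_b.next_key() for _ in range(2)]  # uses block 2

# UUID strategy: no coordination needed, at the cost of a wider key.
uuid_key = str(uuid.uuid4())
print(keys_a, keys_b, uuid_key)
```

The two clients never collide because each block of low values belongs to exactly one high value.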

55 Choose Key Strategies Unique Minimal Columns Not null Stable
Fit to the application

56 Assign Foreign Key
Ensures data integrity
Delete/Update Cascade
In which cases is a foreign key not needed?
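As a small illustration of delete cascade, the sketch below declares a foreign key with `ON DELETE CASCADE` and shows that removing the parent row removes its children. It uses Python's built-in `sqlite3` module as a stand-in for the target DBMS; the table names and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce foreign keys
conn.executescript("""
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_number INTEGER PRIMARY KEY,
    customer_id  INTEGER NOT NULL
        REFERENCES customer(customer_id) ON DELETE CASCADE
);
INSERT INTO customer VALUES (1, 'Acme');
INSERT INTO customer_order VALUES (100, 1);
INSERT INTO customer_order VALUES (101, 1);
""")

orders_before = conn.execute("SELECT COUNT(*) FROM customer_order").fetchone()[0]
conn.execute("DELETE FROM customer WHERE customer_id = 1")  # cascades to the orders
orders_after = conn.execute("SELECT COUNT(*) FROM customer_order").fetchone()[0]
print(orders_before, orders_after)
```

Without the cascade clause, the same delete would be rejected while child rows still reference the customer.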

57 How to choose index
Proto-index from logical model
Eliminate overlapped index
Eliminate low-hit index
Column sequence in index
B-Tree vs. Bitmap

58 Proto-index from logical model
Inversion Entry Primary Key Candidate Key Foreign Key

59 Eliminate overlapped index
Index overlap index Multiple Option Columns

60 Eliminate low-hit index
Small table / cached table
Indexed column cardinality: rows per distinct value = (1/distinct_value_num)*total_value_num
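The formula on this slide gives the average number of rows that share one indexed value. A rough sketch of how it might be used to flag weak index candidates follows; the function names and the 5% threshold are assumptions for illustration, not a rule from the slides.

```python
def rows_per_value(total_rows, distinct_values):
    """Average rows sharing one value: (1 / distinct_values) * total_rows."""
    if distinct_values <= 0:
        raise ValueError("distinct_values must be positive")
    return total_rows / distinct_values


def is_low_hit_candidate(total_rows, distinct_values, max_fraction=0.05):
    """Flag an index whose typical lookup returns more than max_fraction of the table."""
    if total_rows <= 0:
        return False  # an empty table gives no evidence either way
    return rows_per_value(total_rows, distinct_values) / total_rows > max_fraction


# A two-valued status column on a million-row table: each lookup hits half the table.
status_flagged = is_low_hit_candidate(1_000_000, 2)
# A unique identifier column: each lookup hits one row.
id_flagged = is_low_hit_candidate(1_000_000, 1_000_000)
print(rows_per_value(1_000_000, 2), status_flagged, id_flagged)
```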

61 Column sequence in index
Frequently searched column leads the index
Low-cardinality column leads the index
Helps to eliminate duplicated indexes

62 B-Tree vs. Bitmap
B-Tree Index: OLTP table; High Cardinality Column
Bitmap Index: DSS/OLAP table; Low Cardinality Column

63 Denormalize to improve performance
Adding redundant data to avoid costly table joins can dramatically improve query performance.

64 When to denormalize?
The same two tables are joined together repeatedly.
Additional query item.
Additional order by item.

65 Which columns should be made redundant?
Small data columns
Static and rarely updated columns

66 Materialized View A materialized view is a database object that contains the results of a query. It is a view of tables whose query result is stored physically.

67 Redundancy & Integrity
Trigger Scheduled Job
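To illustrate the trigger option, the sketch below keeps a redundant copy of a rarely updated column in step with its source. It uses Python's built-in `sqlite3` module as a stand-in for the target DBMS; the tables, the trigger name `trg_customer_name_sync`, and the sample data are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_number  INTEGER PRIMARY KEY,
    customer_id   INTEGER NOT NULL REFERENCES customer(customer_id),
    customer_name TEXT NOT NULL   -- redundant copy, stored to avoid a join
);
-- Propagate changes of the source column to the redundant copy.
CREATE TRIGGER trg_customer_name_sync
AFTER UPDATE OF customer_name ON customer
BEGIN
    UPDATE customer_order
       SET customer_name = NEW.customer_name
     WHERE customer_id = NEW.customer_id;
END;
INSERT INTO customer VALUES (1, 'Acme');
INSERT INTO customer_order VALUES (100, 1, 'Acme');
""")

conn.execute("UPDATE customer SET customer_name = 'Acme Ltd' WHERE customer_id = 1")
synced_name = conn.execute(
    "SELECT customer_name FROM customer_order WHERE order_number = 100"
).fetchone()[0]
print(synced_name)
```

A scheduled job would apply the same update in batches instead, trading freshness for lower cost on each write.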

68 Storage Tablespace Table storage

69 Tablespace Dictionary-Managed Tablespace (DMT)
Locally Managed Tablespace (LMT)

70 ASSM ASSM (Automatic Segment Space Management) is a method used by Oracle to manage space inside data blocks. It eliminates the need to specify parameters like PCTUSED, Freelists and Freelist groups for objects created in the tablespace.

71 Table Storage Cached Table Index Organized Table Compressed Table
Partition Table Cluster Table External Table Global Temporary Table

72 Cached Table For data that is accessed frequently, this clause indicates that the blocks retrieved for this table are placed at the most recently used end of the least recently used (LRU) list in the buffer cache when a full table scan is performed. This attribute is useful for small lookup tables. You cannot specify CACHE for an index-organized table. However, index-organized tables implicitly provide CACHE behavior.

73 Index Organized Table The data rows are held in an index defined on the primary key for the table. Best suited for primary key-based access and manipulation.

74 Compressed Table Enables data segment compression to reduce disk use.
Only for heap-organized tables. LOB data segments are not compressed.

75 Partition Table Partition the table by rules.
Data is stored in different partitions. A table that is part of a cluster cannot be partitioned. A table containing any LONG or LONG RAW columns cannot be partitioned.

76 Cluster Table Specify one column from the table for each column in the cluster key. A clustered table uses the cluster's space allocation. Object tables and tables containing LOB columns cannot be part of a cluster.

77 External Table A read-only table whose metadata is stored in the database and whose data is stored outside the database, for example in a flat file. You can specify only column, datatype, and inline_constraint. You cannot specify constraints on an external table. It cannot have object type columns, LOB columns, or LONG columns.

78 Global Temporary Table
The table is temporary, and its definition is visible to all sessions. The data in a temporary table is visible only to the session that inserts the data into the table. It contains either session-specific or transaction-specific data, as decided by the ON COMMIT clause.

79 Maintain Plan Table Sizing Housekeeping Plan Analyze Statistics data

80 Table Sizing
Data type length: VARCHAR2, LOB, other types
Index: Rowid
Data growth

81 Initial sizing method
Calculate the row size by summing the column lengths.
Insert initial data and analyze the table to get the row size.
Analyze an existing table to get the row size.
Allow extra space for fragmentation (5%~30%).
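The first sizing method can be sketched as simple arithmetic: sum the column lengths, multiply by the expected row count after growth, and add a fragmentation allowance within the 5% to 30% range. The function name, the column lengths, and the growth figures below are assumptions for illustration; a real estimate would also account for block overhead and indexes.

```python
def estimate_table_bytes(column_bytes, initial_rows, rows_per_month, months,
                         fragmentation_pct=20):
    """Estimate table size in bytes from column lengths, growth, and fragmentation."""
    row_bytes = sum(column_bytes.values())              # row size by summing column lengths
    total_rows = initial_rows + rows_per_month * months  # rows after the planning period
    padded = row_bytes * total_rows * (100 + fragmentation_pct)
    return (padded + 99) // 100                          # integer math, rounded up


# Hypothetical column lengths in bytes for a student table.
student_columns = {"student_id": 8, "first_name": 30, "last_name": 30, "address": 52}

estimate = estimate_table_bytes(
    student_columns, initial_rows=10_000, rows_per_month=5_000, months=12
)
print(estimate)
```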

82 Housekeeping Plan Which tables need to be housekept?
When to perform housekeeping? How to housekeep?

83 Which tables need to be housekept?
Transaction table / Log table; Increasing table; Large table

84 When to perform housekeeping?
Housekeeping is a high-cost operation. It should be performed during low load or downtime. Frequent housekeeping helps keep the high water mark (HWM) low. It should be performed periodically.

85 How to housekeep?
Housekeep condition: Time, Status
Online data -> [Compressed data] -> [Archived data] -> Deleted data
Scheduled job / Manually
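A time-and-status housekeeping pass can be sketched as "archive, then delete" inside one transaction. The example uses Python's built-in `sqlite3` module as a stand-in for the target DBMS; the table names, the `'DONE'` status value, and the `housekeep` function are assumptions, and the optional compression step is omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t_txn_log (
    log_id     INTEGER PRIMARY KEY,
    status     TEXT NOT NULL,
    created_on TEXT NOT NULL      -- ISO dates, so string comparison orders correctly
);
CREATE TABLE t_txn_log_archive (
    log_id     INTEGER PRIMARY KEY,
    status     TEXT NOT NULL,
    created_on TEXT NOT NULL
);
INSERT INTO t_txn_log VALUES (1, 'DONE', '2023-01-15');
INSERT INTO t_txn_log VALUES (2, 'DONE', '2024-06-01');
INSERT INTO t_txn_log VALUES (3, 'OPEN', '2023-02-01');
""")


def housekeep(connection, cutoff_date):
    """Archive and then delete completed rows older than cutoff_date."""
    condition = "status = 'DONE' AND created_on < ?"
    with connection:  # commits on success, rolls back if either statement fails
        connection.execute(
            "INSERT INTO t_txn_log_archive SELECT * FROM t_txn_log WHERE " + condition,
            (cutoff_date,),
        )
        cursor = connection.execute(
            "DELETE FROM t_txn_log WHERE " + condition, (cutoff_date,)
        )
    return cursor.rowcount


moved = housekeep(conn, "2024-01-01")
online_rows = conn.execute("SELECT COUNT(*) FROM t_txn_log").fetchone()[0]
archived_rows = conn.execute("SELECT COUNT(*) FROM t_txn_log_archive").fetchone()[0]
print(moved, online_rows, archived_rows)
```

Only the old completed row moves; the recent row and the old but still open row stay online.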

86 Analyze Statistics data
Which tables need to be analyzed? When to analyze?

87 Which tables need to be analyzed?
Under the cost-based optimizer (CBO), all tables need to be analyzed. Different kinds of tables have different analysis intervals.

88 When to analyze? After the table has been online for a while and holds enough data.
When the data volume has changed dramatically. When the table structure has changed.

89 IV Example Student Course Management System

90 Student Course Management System
Entities: Student, Course

91 Student Course Management System
Attributes
Student: Student ID, Name, Sex, Age, Address, College, College Address
Course: Course ID, Course Name, Teacher ID, Teacher Name

92 1NF – Eliminate Repeating Groups
Student: Student ID, First Name, Last Name, Sex, Age, Address, College, College Address
Course: Course ID, Course Name, Teacher ID, Teacher First Name, Teacher Last Name

93 Student Course Management System
Keys
Student: Student ID (PK), First Name (AK1), Last Name (AK1), Sex, Age, Address (AK1), College, College Address
Course: Course ID (PK), Course Name (AK1), Teacher ID (AK1), Teacher First Name, Teacher Last Name

94 Student Course Management System
Inversion Entry
Student: Student ID (PK), First Name (AK1), Last Name (AK1), Sex, Age, Address (AK1), College (IE1), College Address
Course: Course ID (PK), Course Name (AK1) (IE1), Teacher ID (AK1) (IE2), Teacher First Name, Teacher Last Name

95 Student Course Management System
Relationship: Student Elect Course / Course Open For Student
Student: Student ID (PK), First Name (AK1), Last Name (AK1), Sex, Age, Address (AK1), College (IE1), College Address
Course: Course ID (PK), Course Name (AK1) (IE1), Teacher ID (AK1) (IE2), Teacher First Name, Teacher Last Name

96 Student Course Management System
Transform Many-to-Many to One-to-Many
Student: Student ID (PK), First Name (AK1), Last Name (AK1), Sex, Age, Address (AK1), College (IE1), College Address
Course: Course ID (PK), Course Name (AK1) (IE1), Teacher ID (AK1) (IE2), Teacher First Name, Teacher Last Name
Election: Student ID (FK1), Course ID (FK2), Score, Election Times, Credit Hour
Relationships: Student Elect Course; Course Open For Student

97 Student Course Management System
2NF -- Ensure that all attributes depend on the entity identifier
Student: Student ID (PK), First Name (AK1), Last Name (AK1), Sex, Age, Address (AK1), College (IE1), College Address
Course: Course ID (PK), Course Name (AK1) (IE1), Teacher ID (AK1) (IE2), Teacher First Name, Teacher Last Name, Credit Hour
Election: Student ID (FK1), Course ID (FK2), Score, Election Times
Relationships: Student Elect Course; Course Open For Student

98 Student Course Management System
3NF -- Ensure that all attributes depend only on the entity identifier
Student: Student ID (PK), First Name (AK1), Last Name (AK1), Sex, Age, Address (AK1), College ID (IE1) (FK1)
Election: Student ID (FK1), Course ID (FK2), Score, Election Times
Course: Course ID (PK), Course Name (AK1) (IE1), Teacher ID (AK1) (IE2) (FK1), Credit Hour
Teacher: Teacher ID, Teacher First Name, Teacher Last Name, College ID (FK1)
College: College ID, College Name, College Address, Rector
Relationships: Student Elect Course; Course Open For Student; Teacher Teach Course; Teacher Belong to College; Student Belong to College
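The final 3NF model can be sketched as a physical schema. This example uses Python's built-in `sqlite3` module as a stand-in for the target DBMS. Data types, the composite primary key on the election table, the index names, and the sample rows are assumptions; the slides do not specify them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE college (
    college_id      INTEGER PRIMARY KEY,
    college_name    TEXT NOT NULL,
    college_address TEXT,
    rector          TEXT
);
CREATE TABLE teacher (
    teacher_id INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name  TEXT NOT NULL,
    college_id INTEGER REFERENCES college(college_id)
);
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name  TEXT NOT NULL,
    sex        TEXT,
    age        INTEGER,
    address    TEXT,
    college_id INTEGER REFERENCES college(college_id)
);
CREATE TABLE course (
    course_id   INTEGER PRIMARY KEY,
    course_name TEXT NOT NULL,
    teacher_id  INTEGER REFERENCES teacher(teacher_id),
    credit_hour INTEGER
);
-- Election resolves the many-to-many between student and course.
CREATE TABLE election (
    student_id     INTEGER NOT NULL REFERENCES student(student_id),
    course_id      INTEGER NOT NULL REFERENCES course(course_id),
    score          REAL,
    election_times INTEGER,
    PRIMARY KEY (student_id, course_id)   -- assumed composite key
);
-- Inversion entries become non-unique indexes.
CREATE INDEX ix_student_college ON student (college_id);
CREATE INDEX ix_course_name     ON course (course_name);
CREATE INDEX ix_course_teacher  ON course (teacher_id);

INSERT INTO college  VALUES (1, 'Engineering', '1 Campus Road', 'Dr. Example');
INSERT INTO teacher  VALUES (10, 'Tina', 'Teacher', 1);
INSERT INTO student  VALUES (100, 'Sam', 'Student', 'M', 20, '2 Dorm Lane', 1);
INSERT INTO course   VALUES (1000, 'Databases', 10, 3);
INSERT INTO election VALUES (100, 1000, 88.5, 1);
""")

# Which student elected which course, taught by whom, with what score?
rows = conn.execute("""
    SELECT s.last_name, c.course_name, t.last_name, e.score
      FROM election e
      JOIN student s ON s.student_id = e.student_id
      JOIN course  c ON c.course_id  = e.course_id
      JOIN teacher t ON t.teacher_id = c.teacher_id
""").fetchall()
print(rows)
```

Each fact is stored once: the college address lives only in the college table, and the teacher name only in the teacher table.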

99 Q & A

100 Thanks! fuyuncat

