5 Data Model
What is a logical data model?
What is the purpose of data modeling?
How to design a logical data model?

6 What is a Data Model?
A model is an abstract representation of some real thing.
Data modeling is the act of exploring data-oriented structures.
A logical data model is a graphical representation of the information requirements of a business area; it is not a database.

7 Data Model Concepts
Conceptual data models
Logical data models (LDMs)
Physical data models (PDMs)

8 What is the difference between a logical data model and a physical database design?
THE LOGICAL MODEL compared with THE PHYSICAL DATABASE DESIGN
Logical: Includes all entities, relationships, and attributes (and their information types), whether supported by a technology or not.
Physical: Includes tables, columns, keys, datatypes, validation rules, DB triggers, stored procedures, domains, and access constraints (security).
Logical: Uses business names.
Physical: Names may be limited by the DBMS.
Logical: Captures and records information necessary for the business.
Physical: Includes technology-specific data elements such as flags, switches, and timestamps.
Logical: Includes unique identifiers.
Physical: Includes primary keys, foreign keys, and indices for fast data access.
Logical: Is normalized to at least 3rd normal form.
Physical: May be de-normalized to meet performance requirements.
Logical: Does not include any redundant data.
Physical: May include redundant data elements.
Logical: Does not include any derived data.
Physical: May include results of complex or difficult-to-recreate calculations.
Logical: Business experts drive the model.
Physical: Designers drive the model.

11 Logical Data Model Format
A logical data model is drawn in a format known as an “Entity Relationship Diagram” (ERD).
The most popular data modeling tools are ERwin, ER/Studio, and PowerDesigner.

13 Advantages of Using a Model
Easier to understand the model at a glance
No need to trace through narrative descriptions of relationships
Communicates one clear definition
Understood by business and technical staff

14 Benefits of a Logical Data Model
Using a logical data model speeds maintenance and eases the transition to new technologies.
Captures business requirements (ensures understanding)
Ability to share data across the enterprise, resulting in:
  Accurate data
  Consistent data
  Reduced costs
Easier to implement changes in your business
Business requirements can be satisfied in the database design

16 Who uses the logical data model?
The Business Area Experts own the logical data model. They describe their data requirements to the data modeler and review the models created. They use the models for impact analysis of changes to business requirements.
The Data Modeler conducts facilitated sessions with business area experts to gather the data requirements and build the logical data model. The data modeler also works with the process analyst to link data with processes. The data modeler is responsible for getting approval of the logical data model from the business area experts and then works with the DBA to transition the logical model to the physical model.
The DBA (Designer) builds the physical data model from the logical data model. To create a good quality database design, the DBA reviews the logical model to select technology-appropriate keys, create indexes, detail data types, and build referential integrity to protect the data values. The DBA may de-normalize the database for efficiency. DBAs are also responsible for creating database schemas, maintaining referential integrity, and monitoring database performance.

17 Actions in Data Modeling
Identify – Determine which things are represented in the model.
Name – Each thing represented in the model needs a unique and meaningful name.
Describe – A name is important, but not sufficient. A description should be no more than three sentences, each with subject, object, and verb. It must answer:
  What is it?
  What is it not?
  Sometimes: What are some examples?
Associate – Much of the meaning is in the associations among the things represented in the model.

18 How to Model Data
Identify entity types
Identify attributes
Assign keys
Inversion entries
Identify relationships
Normalize to reduce data redundancy

19 What is an Entity?
Entity: a person, place, thing, concept, or event that the business wants to store information about.
Example (MOVIE): A movie is an entertainment, documentary, or educational event which has been recorded in a moving picture format.

20 Entity and Instance
Each entity is made up of a group of objects, which are called instances.
Each instance can be distinguished from the other instances.

21 ENTITY Examples
[Figure: entities, their categories, and sample instances]
Categories: People, Place, Things, Event, Concept
Entities: EMPLOYEE, STUDENT, OFFICE, AUTOMOBILE, CHEMICAL, FUNDS TRANSFER, TENNIS TOURNAMENT, COUNTRY, DEPARTMENT, ORDER
Instances: Mr. Koch, Ms. Chou, HongKong, R.O.C, BMW 525i, Ammonia, 42233, U.S. OPEN, L789I, 12345

22 What is an Attribute?
Attribute: a fact or characteristic of an entity with only one meaning (atomic).
Each entity type will have one or more data attributes.
Entity name: EMPLOYEE
Attributes: Employee Id, Employee Last Name, Employee First Name, Employee Address, Employee Phone Number

23 Two kinds of Attributes
Key attributes
Non-key attributes
Entity: CONSULTANT
Key attributes: Consultant Id
Non-key attributes: Consultant Last Name, Consultant First Name, Consultant Specialization, Consultant Hourly Rate

24 Candidate Keys
A single attribute or a group of attributes that can be used to identify each instance.
Entity: TEACHER
Attributes: Teacher Last Name, Teacher First Name, Teacher Address, Teacher Country, Teacher Certificate Id, Teacher Mother Maiden Name, Teacher Phone Number, Teacher Date of Birth

25 Primary Key
The candidate key with the highest priority, which is used to identify the instance.
Entity: Employee
Attributes: Employee ID (PK), First Name, Last Name, Address, Department, Phone Number, Birthday

26 Alternate Key
All the candidate keys except the PK.
Attributes: Employee Id; Employee Last Name (AK1); Employee First Name (AK1); Employee Address; Employee City; Employee State; Employee Zip Code; Employee Phone Number (AK2); Employee Date of Birth (AK1, AK2)

27 Inversion Entries
An attribute or group of attributes used to find the wanted instances. The result may not be unique.
Entity: EMPLOYEE
Attributes: Employee Id; Employee Last Name (AK1, IE2); Employee First Name (AK1); Employee Address; Employee City (IE1); Employee State (IE1); Employee Zip Code; Employee Phone Number; Employee Date of Birth (AK1)

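The three kinds of key can be sketched in a physical schema as follows. This is only an illustration, assuming SQLite through Python's sqlite3 module: the primary key becomes PRIMARY KEY, the alternate key AK1 becomes a UNIQUE constraint, and the inversion entry IE1 becomes a non-unique index. The column names, the index name, and the sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    employee_id   INTEGER PRIMARY KEY,               -- primary key
    last_name     TEXT NOT NULL,
    first_name    TEXT NOT NULL,
    city          TEXT,
    state         TEXT,
    date_of_birth TEXT NOT NULL,
    UNIQUE (last_name, first_name, date_of_birth)    -- AK1: alternate key
);
-- IE1: inversion entry, a non-unique access path
CREATE INDEX ie1_employee_city_state ON employee (city, state);
""")

# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "Koch", "Hans", "Austin", "TX", "1980-01-01"),
        (2, "Chou", "Mei", "Austin", "TX", "1985-05-05"),
    ],
)


def insert_duplicate_ak(connection):
    """Try to insert a row that repeats AK1; return True if it was accepted."""
    try:
        connection.execute(
            "INSERT INTO employee VALUES (3, 'Koch', 'Hans', 'Dallas', 'TX', '1980-01-01')"
        )
    except sqlite3.IntegrityError:
        return False
    return True


# A lookup through the inversion entry may return more than one instance.
by_city = conn.execute(
    "SELECT employee_id FROM employee WHERE city = ? AND state = ? ORDER BY employee_id",
    ("Austin", "TX"),
).fetchall()
duplicate_accepted = insert_duplicate_ak(conn)
```

Here by_city holds two employees, while the duplicate alternate-key row is rejected.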
28 What is a Relationship?
Relationship: an association between occurrences of one or more entities which provides some relevant and valuable information.
Example: MOVIE is recorded on VIDEO TAPE / VIDEO TAPE records MOVIE

29 What is a Verb Phrase?
A parent-to-child verb phrase describes how the parent is related to the child. In the STORE and MOVIE example, the verb phrase states that “STORE rents A MOVIE.”
A child-to-parent verb phrase describes how a child entity is related to a parent entity. In the same example, the verb phrase states that “MOVIE is rented from A STORE.”

30 Cardinality of Relationship
One-to-one
One-to-many
Many-to-one
Many-to-many
All types can be optional for one or both entities.

31 Identifying Relationship
An identifying relationship is a relationship between two entities in which an instance of a child entity is identified through its association with a parent entity. The child entity depends on the parent entity for its identity and cannot exist without it.
MOVIE MASTER: Movie Master Id, Movie Name, Movie Star, Movie Type, Movie Rating
MOVIE COPY: Movie Master Id (FK), Movie Copy Number, Movie Copy Create Date, Movie Copy Due Date, Movie Copy Condition
Relationship: MOVIE MASTER is rented as MOVIE COPY / MOVIE COPY is created from MOVIE MASTER

32 Mandatory non-identifying relationship
A non-identifying relationship in which an instance of the child entity must be related to an instance of the parent entity.
CUSTOMER: Customer Id, Customer Name, Customer Address, Customer Phone
ORDER: Order Number, Customer Id (FK), Order Date, Order Status, Order Shipdate
Relationship: CUSTOMER places ORDER / ORDER is received from CUSTOMER

33 Non-mandatory non-identifying relationship
A non-identifying relationship in which an instance of the child entity can exist without being related to an instance of the parent entity.
DEPARTMENT: Department Number, Department Name, Department Location
EMPLOYEE: Employee Id, Department Number (FK), Employee Name, Employee Address
Relationship: DEPARTMENT employs EMPLOYEE / EMPLOYEE belongs to DEPARTMENT

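The three relationship types differ mainly in where the foreign key sits and whether it may be null. The sketch below is one possible rendering, assuming SQLite via Python's sqlite3 module; the table and column names follow the slides, except customer_order, which is renamed here because ORDER is a reserved word, and the try_insert helper, which is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE movie_master (movie_master_id INTEGER PRIMARY KEY, movie_name TEXT);
-- Identifying: the parent's key is part of the child's primary key.
CREATE TABLE movie_copy (
    movie_master_id   INTEGER NOT NULL REFERENCES movie_master (movie_master_id),
    movie_copy_number INTEGER NOT NULL,
    PRIMARY KEY (movie_master_id, movie_copy_number)
);

CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
-- Mandatory non-identifying: FK is outside the primary key and NOT NULL.
CREATE TABLE customer_order (
    order_number INTEGER PRIMARY KEY,
    customer_id  INTEGER NOT NULL REFERENCES customer (customer_id)
);

CREATE TABLE department (department_number INTEGER PRIMARY KEY, department_name TEXT);
-- Non-mandatory non-identifying: FK is outside the primary key and may be NULL.
CREATE TABLE employee (
    employee_id       INTEGER PRIMARY KEY,
    department_number INTEGER REFERENCES department (department_number)
);
""")


def try_insert(sql):
    """Return True if the insert is accepted, False on a constraint violation."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


employee_without_department = try_insert("INSERT INTO employee VALUES (1, NULL)")
order_without_customer = try_insert("INSERT INTO customer_order VALUES (1, NULL)")
copy_without_master = try_insert("INSERT INTO movie_copy VALUES (99, 1)")
```

Only the employee row is accepted: an order needs a customer, and a movie copy cannot exist without its movie master.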
34 Many-to-Many Relationship
A many-to-many relationship is one in which a relationship and its inverse are both to-many.
Example: PART is ordered from SUPPLIER / SUPPLIER sends us PART

35 Build Relationship
[Flowchart: Start, then check the cardinality of R (1:M or M:M), then decide whether the relationship is identifying (Y/N). If yes, draw and name an identifying relationship from parent to child. If no, draw and name a non-identifying relationship from parent to child, with FK - NO NULL or FK - NULLS ALLOWED.]

36 Normalize to Reduce Data Redundancy
Data normalization is a process in which data attributes within a data model are organized to increase the cohesion of entity types.
Level: Rule
First normal form (1NF): An entity type is in 1NF when it contains no repeating groups of data.
Second normal form (2NF): An entity type is in 2NF when it is in 1NF and when all of its non-key attributes are fully dependent on its primary key.
Third normal form (3NF): An entity type is in 3NF when it is in 2NF and when all of its attributes are directly dependent on the primary key.

37 Normalization
A step-by-step process to verify and refine the logical data model
The condition of the model at the completion of each step is a “normal form”
The DOT standard is third normal form
First normal form: Eliminate repeating groups
Second normal form: Ensure that all attributes depend on the entity identifier
Third normal form: Ensure that all attributes depend only on the entity identifier

38 1st Normal Form
Eliminate repeating groups
To remove a repeating group of fields, collapse them into a single field with multiple records in a new table, related back to the primary data.

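As a rough illustration of this step, the following sketch splits records that carry a repeating group into two sets of rows. The record layout (a "courses" list inside each student record) and the function name are assumptions made for the example, not part of the slides.

```python
def eliminate_repeating_groups(students):
    """Split records with a repeating 'courses' group into two row sets.

    Returns (student_rows, election_rows); each election row points back
    to its student through student_id.
    """
    student_rows, election_rows = [], []
    for student in students:
        student_rows.append(
            {"student_id": student["student_id"], "name": student["name"]}
        )
        # One record in the new table per member of the repeating group.
        for course_id in student["courses"]:
            election_rows.append(
                {"student_id": student["student_id"], "course_id": course_id}
            )
    return student_rows, election_rows


# Hypothetical unnormalized input.
unnormalized = [
    {"student_id": 1, "name": "Koch", "courses": ["C100", "C200"]},
    {"student_id": 2, "name": "Chou", "courses": ["C100"]},
]
student_rows, election_rows = eliminate_repeating_groups(unnormalized)
```

The two students produce two student rows and three election rows, one per elected course.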
39 2nd Normal Form
Uniquely identify each instance
Each table must contain attributes for a single subject, and each table must contain an attribute (or set of attributes) that uniquely identifies a single record within that table. Every non-key attribute must depend on the whole of that identifier.

40 3rd Normal Form
Eliminate columns not dependent on the key
Each attribute must depend only on the primary key. Fields that depend on some other non-key attribute violate this rule and are moved into separate, related tables.

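One informal way to spot a 3NF candidate is to check whether a non-key column appears to determine another non-key column in sample data. The helper below is a hypothetical sketch: a dependency that holds in a sample is only a hint, and the business experts must confirm that it is a real rule. The column names and rows are invented.

```python
def determines(rows, lhs, rhs):
    """Return True if every value of column lhs maps to a single value of rhs."""
    seen = {}
    for row in rows:
        # setdefault stores the first rhs value seen for this lhs value
        # and returns the stored one on later rows.
        if seen.setdefault(row[lhs], row[rhs]) != row[rhs]:
            return False
    return True


# Hypothetical sample rows from a Student table.
students = [
    {"student_id": 1, "college": "Science", "college_address": "1 North Rd"},
    {"student_id": 2, "college": "Science", "college_address": "1 North Rd"},
    {"student_id": 3, "college": "Arts", "college_address": "9 South Rd"},
]

# college -> college_address holds, so the address depends on a non-key
# attribute and is a candidate for its own College table.
address_depends_on_college = determines(students, "college", "college_address")
college_determines_student = determines(students, "college", "student_id")
```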
49 Group Table
Group tables by business module
Group tables by relationship

50 Column Data Type
Choose data type: Char, Varchar2, Number, Integer, Float
Length
LOB: store in row, or store in another tablespace

51 Assign Primary Key
Natural Key: Assign a natural key, which is one or more existing data attributes that are unique to the business concept.
Surrogate Key: Introduce a new column, called a surrogate key, which is a key that has no business meaning.

52 Natural Key
Advantages:
  No need to introduce a new column
  Meaningful and understandable
  Key value is transferable
Disadvantages:
  May be changed by a change in business requirements
  May contain many columns in future generations
  Key value may be updated, which will also impact child tables

53 Surrogate Key
Advantages:
  Not related to the business, so easy to maintain
  Stable
  Contains just one column, which simplifies the foreign key
Disadvantages:
  Will lead to recursive relationships
  Hard to understand the relationship and its type
  May add redundant code

54 How to choose a surrogate key?
Key assigned by the RDBMS, e.g. SEQUENCE
Max()+1
Universally Unique Identifiers (UUID)
Globally Unique Identifiers (GUID)
High-Low strategy

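Two of these strategies can be sketched in a few lines of Python. The HiLoGenerator class below is a hypothetical illustration of the High-Low idea: a "hi" value is fetched rarely (in practice from a database sequence; here from an in-memory counter as a stand-in) and combined with a locally incremented "lo" value. The UUID comes from the standard uuid module.

```python
import itertools
import uuid


class HiLoGenerator:
    """Hand out ids in blocks: one 'hi' fetch covers block_size ids."""

    def __init__(self, next_hi, block_size=100):
        self._next_hi = next_hi        # callable returning the next hi value
        self._block_size = block_size
        self._hi = None
        self._lo = block_size          # forces a hi fetch on first use

    def next_id(self):
        if self._lo >= self._block_size:
            self._hi = self._next_hi()  # the only step that needs the database
            self._lo = 0
        value = self._hi * self._block_size + self._lo
        self._lo += 1
        return value


# Assumption: an in-memory counter stands in for a database sequence.
hi_source = itertools.count(0)
generator = HiLoGenerator(lambda: next(hi_source), block_size=3)
ids = [generator.next_id() for _ in range(5)]

# A UUID needs no coordination with the database at all.
uuid_key = str(uuid.uuid4())
```

With a block size of 3, five ids cost two hi fetches. The trade-off is that ids in unused parts of a block are lost when the application restarts.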
55 Choose Key Strategies
Unique
Minimal columns
Not null
Stable
Fit to the application

56 Assign Foreign Key
Ensure data integrity
Delete/Update cascade
In which cases is there no need to assign a foreign key?

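A delete cascade can be sketched as follows, assuming SQLite through Python's sqlite3 module. The customer and customer_order tables and their rows are invented for the example; other databases use similar ON DELETE and ON UPDATE clauses, but the details vary by product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY);
CREATE TABLE customer_order (
    order_number INTEGER PRIMARY KEY,
    customer_id  INTEGER NOT NULL
        REFERENCES customer (customer_id)
        ON DELETE CASCADE
        ON UPDATE CASCADE
);
INSERT INTO customer VALUES (1), (2);
INSERT INTO customer_order VALUES (10, 1), (11, 1), (12, 2);
""")

# Deleting the parent row also deletes its child rows.
conn.execute("DELETE FROM customer WHERE customer_id = 1")
remaining_orders = conn.execute(
    "SELECT order_number FROM customer_order ORDER BY order_number"
).fetchall()
```

Only order 12, which belongs to customer 2, survives the delete.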
57 How to choose indexes
Proto-indexes from the logical model
Eliminate overlapped indexes
Eliminate low-hit indexes
Column sequence in an index
B-Tree vs. Bitmap

58 Proto-index from logical model
Inversion Entry
Primary Key
Candidate Key
Foreign Key

59 Eliminate overlapped index
One index overlaps another index
Multiple option columns

60 Eliminate low-hit index
Small table / cached table
Indexed column cardinality: (1 / distinct_value_num) * total_value_num

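The cardinality formula above works out to the average number of rows per distinct value. A minimal sketch, with a hypothetical function name and invented sample columns:

```python
def rows_per_distinct_value(values):
    """(1 / distinct_value_num) * total_value_num, written as one division."""
    distinct_value_num = len(set(values))
    total_value_num = len(values)
    return total_value_num / distinct_value_num


# Hypothetical columns of a 100-row table.
status_column = ["A"] * 90 + ["B"] * 10   # 2 distinct values
id_column = list(range(100))              # 100 distinct values

status_result = rows_per_distinct_value(status_column)
id_result = rows_per_distinct_value(id_column)
```

The status column averages 50 rows per value, so an index lookup on it still touches a large share of the table. The id column averages 1 row per value, which is where an index helps most.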
61 Column sequence in index
Put the most frequently searched column at the head of the index
Put the low-cardinality column at the head of the index
A good column sequence helps to eliminate duplicated indexes

62 B-Tree vs. Bitmap
B-Tree index: OLTP table, high-cardinality column
Bitmap index: DSS/OLAP table, low-cardinality column

63 Denormalize to improve performance
Adding redundant data to avoid costly table joins can dramatically improve query performance.

64 When to denormalize?
Two tables are repeatedly joined together.
Additional query item.
Additional ORDER BY item.

65 Which columns should be redundant?
Small data columns
Static and rarely updated columns

66 Materialized View
A materialized view is a database object that contains the results of a query.
It is a view of tables.
The query result is stored physically.

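The idea of a physically stored query result can be imitated in a few lines. Note the assumption: SQLite, used here through Python's sqlite3 module, has no materialized views, so this sketch emulates one with CREATE TABLE ... AS SELECT and a manual refresh. It is not Oracle's CREATE MATERIALIZED VIEW, and the table names and data are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sale (region TEXT, amount INTEGER);
INSERT INTO sale VALUES ('north', 10), ('north', 5), ('south', 7);
""")


def refresh_sales_by_region(connection):
    """Rebuild the stored query result from the base table."""
    connection.executescript("""
    DROP TABLE IF EXISTS mv_sales_by_region;
    CREATE TABLE mv_sales_by_region AS
        SELECT region, SUM(amount) AS total FROM sale GROUP BY region;
    """)


def south_total(connection):
    return connection.execute(
        "SELECT total FROM mv_sales_by_region WHERE region = 'south'"
    ).fetchone()[0]


refresh_sales_by_region(conn)
conn.execute("INSERT INTO sale VALUES ('south', 3)")
stale_total = south_total(conn)   # stored result does not see the new row yet
refresh_sales_by_region(conn)
fresh_total = south_total(conn)   # refreshed result includes it
```

The stored result is fast to read but goes stale until it is refreshed, which is the main trade-off to plan for.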
69 Tablespace
Dictionary-Managed Tablespace (DMT)
Locally Managed Tablespace (LMT)

70 ASSM
ASSM (Automatic Segment Space Management) is a method used by Oracle to manage space inside data blocks. It eliminates the need to specify parameters like PCTUSED, FREELISTS, and FREELIST GROUPS for objects created in the tablespace.

72 Cached Table
For data that is accessed frequently, the CACHE clause indicates that the blocks retrieved for this table are placed at the most recently used end of the least recently used (LRU) list in the buffer cache when a full table scan is performed. This attribute is useful for small lookup tables.
You cannot specify CACHE for an index-organized table. However, index-organized tables implicitly provide CACHE behavior.

73 Index Organized Table
The data rows are held in an index defined on the primary key for the table.
Best suited for primary key-based access and manipulation.

74 Compressed Table
Enables data segment compression to reduce disk use.
Only for heap-organized tables.
LOB data segments are not compressed.

75 Partition Table
Partition the table by rules. Data will be stored in different partitions.
You cannot partition a table that is part of a cluster.
You cannot partition a table containing any LONG or LONG RAW columns.

76 Cluster Table
Specify one column from the table for each column in the cluster key.
A clustered table uses the cluster's space allocation.
Object tables and tables containing LOB columns cannot be part of a cluster.

77 External Table
It is a read-only table whose metadata is stored in the database and whose data is stored outside the database, in a flat file.
You can specify only column, datatype, and inline_constraint.
You cannot specify constraints on an external table.
It cannot have object type columns, LOB columns, or LONG columns.

78 Global Temporary Table
The table is temporary, and its definition is visible to all sessions.
The data in a temporary table is visible only to the session that inserts the data into the table.
It contains either session-specific or transaction-specific data, as decided by the ON COMMIT clause.

79 Maintenance Plan
Table sizing
Housekeeping plan
Analyze statistics data

80 Table Sizing
Data type length: VARCHAR2, LOB, other types
Index: Rowid
Data growth

81 Initial sizing method
Calculate the row size by summing the column lengths.
Insert initial data and analyze the table to get the row size.
Analyze an existing table to get the row size.
Allow for space fragmentation redundancy (5%~30%).

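The first method, with the fragmentation allowance applied, amounts to simple arithmetic. The sketch below is a rough estimate only: the function name and column lengths are hypothetical, and it ignores block overhead, row headers, and index space.

```python
def estimate_table_bytes(column_lengths, row_count, fragment_ratio=0.2):
    """Estimate table size from summed column lengths plus a fragmentation margin.

    fragment_ratio is the space fragment redundancy, expected in 0.05..0.30.
    """
    if not 0.05 <= fragment_ratio <= 0.30:
        raise ValueError("fragment_ratio should be between 0.05 and 0.30")
    row_size = sum(column_lengths.values())
    return round(row_size * row_count * (1 + fragment_ratio))


# Hypothetical column lengths in bytes.
columns = {"id": 8, "name": 30, "created_on": 7}
estimated_bytes = estimate_table_bytes(columns, row_count=1000, fragment_ratio=0.2)
```

A 45-byte row at 1,000 rows with a 20% margin comes to 54,000 bytes. To cover data growth, rerun the estimate with the projected row count.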
82 Housekeeping Plan
Which tables need to be housekept?
When to perform housekeeping?
How to housekeep?

83 Which tables need to be housekept?
Transaction tables / log tables
Growing tables
Large tables

84 When to perform housekeeping?
Housekeeping is a high-cost operation.
It should be performed at times of low load or during downtime.
A high housekeeping frequency helps to keep the HWM (high-water mark) low.
It should be performed periodically.

85 How to housekeep?
Housekeeping condition: time, status
Online data -> [Compressed data] -> [Archived data] -> Deleted data
Scheduled job / manually

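A housekeeping step driven by a time and status condition might look like the sketch below, which moves qualifying rows from the online table to an archive table and then deletes them. It assumes SQLite via Python's sqlite3 module; the txn_log tables, the 'DONE' status, and the cutoff date are invented, and the optional compression stage is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE txn_log (id INTEGER PRIMARY KEY, status TEXT, created_on TEXT);
CREATE TABLE txn_log_archive (id INTEGER PRIMARY KEY, status TEXT, created_on TEXT);
INSERT INTO txn_log VALUES
    (1, 'DONE', '2023-01-10'),
    (2, 'OPEN', '2023-01-11'),
    (3, 'DONE', '2024-06-01');
""")


def housekeep(connection, cutoff, status="DONE"):
    """Archive, then delete, rows with the given status older than cutoff.

    Dates are ISO-8601 strings, so string comparison orders them correctly.
    Returns the number of rows removed from the online table.
    """
    condition = "status = ? AND created_on < ?"
    with connection:  # archive and delete commit (or roll back) together
        connection.execute(
            f"INSERT INTO txn_log_archive SELECT * FROM txn_log WHERE {condition}",
            (status, cutoff),
        )
        cursor = connection.execute(
            f"DELETE FROM txn_log WHERE {condition}", (status, cutoff)
        )
    return cursor.rowcount


moved = housekeep(conn, cutoff="2024-01-01")
online_ids = [r[0] for r in conn.execute("SELECT id FROM txn_log ORDER BY id")]
archived_ids = [r[0] for r in conn.execute("SELECT id FROM txn_log_archive")]
```

Running both statements in one transaction avoids losing rows if the job fails halfway. A scheduler can then call housekeep during a low-load window.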
86 Analyze Statistics data
Which tables need to be analyzed?
When to analyze?

87 Which tables need to be analyzed?
Under the cost-based optimizer (CBO), all tables need to be analyzed.
Different kinds of tables have different analyze intervals.

88 When to analyze?
The table has been online for some time and holds enough data.
The data volume has changed dramatically.
The table structure has changed.

92 1NF – Eliminate Repeating Groups
Student: Student ID, First Name, Last Name, Sex, Age, Address, College, College Address
Course: Course ID, Course Name, Teacher ID, Teacher First Name, Teacher Last Name

93 Student Course Management System: Keys
Student: Student ID (PK); First Name (AK1); Last Name (AK1); Sex; Age; Address (AK1); College; College Address
Course: Course ID (PK); Course Name (AK1); Teacher ID (AK1); Teacher First Name; Teacher Last Name

94 Student Course Management System: Inversion Entry
Student: Student ID (PK); First Name (AK1); Last Name (AK1); Sex; Age; Address (AK1); College (IE1); College Address
Course: Course ID (PK); Course Name (AK1) (IE1); Teacher ID (AK1) (IE2); Teacher First Name; Teacher Last Name

95 Student Course Management System: Relationship
Relationships: Student Elect Course / Course Open For Student
Student: Student ID (PK); First Name (AK1); Last Name (AK1); Sex; Age; Address (AK1); College (IE1); College Address
Course: Course ID (PK); Course Name (AK1) (IE1); Teacher ID (AK1) (IE2); Teacher First Name; Teacher Last Name

96 Student Course Management System: Transform Many-to-Many to One-to-Many
Student: Student ID (PK); First Name (AK1); Last Name (AK1); Sex; Age; Address (AK1); College (IE1); College Address
Course: Course ID (PK); Course Name (AK1) (IE1); Teacher ID (AK1) (IE2); Teacher First Name; Teacher Last Name
Election: Student ID (FK1); Course ID (FK2); Score; Election Times; Credit Hour
Relationships: Student Elect Course / Course Open For Student

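The transformation above can be sketched as a schema in which the Election entity carries one foreign key to each side. This assumes SQLite via Python's sqlite3 module. The composite primary key on election is an assumption made for the example (the slide marks only the two foreign keys), and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, last_name TEXT);
CREATE TABLE course (course_id INTEGER PRIMARY KEY, course_name TEXT);
-- Election resolves Student M:M Course into two 1:M relationships.
CREATE TABLE election (
    student_id INTEGER NOT NULL REFERENCES student (student_id),  -- FK1
    course_id  INTEGER NOT NULL REFERENCES course (course_id),    -- FK2
    score      INTEGER,
    PRIMARY KEY (student_id, course_id)   -- assumed, not shown on the slide
);
INSERT INTO student VALUES (1, 'Koch'), (2, 'Chou');
INSERT INTO course VALUES (100, 'Databases'), (200, 'Networks');
INSERT INTO election VALUES (1, 100, 90), (1, 200, 85), (2, 100, 70);
""")

# Student Elect Course: one student, many courses.
courses_of_student_1 = [
    row[0]
    for row in conn.execute(
        "SELECT c.course_name FROM election e "
        "JOIN course c ON c.course_id = e.course_id "
        "WHERE e.student_id = 1 ORDER BY c.course_id"
    )
]

# Course Open For Student: one course, many students.
students_in_course_100 = [
    row[0]
    for row in conn.execute(
        "SELECT s.last_name FROM election e "
        "JOIN student s ON s.student_id = e.student_id "
        "WHERE e.course_id = 100 ORDER BY s.student_id"
    )
]
```

If a student may elect the same course more than once, as the Election Times attribute suggests, the key would need an extra column such as a term or attempt number.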
97 Student Course Management System: 2NF -- Ensure that all attributes depend on the entity identifier
Student: Student ID (PK); First Name (AK1); Last Name (AK1); Sex; Age; Address (AK1); College (IE1); College Address
Course: Course ID (PK); Course Name (AK1) (IE1); Teacher ID (AK1) (IE2); Teacher First Name; Teacher Last Name; Credit Hour
Election: Student ID (FK1); Course ID (FK2); Score; Election Times
Relationships: Student Elect Course / Course Open For Student

98 Student Course Management System: 3NF -- Ensure that all attributes depend only on the entity identifier
Student: Student ID (PK); First Name (AK1); Last Name (AK1); Sex; Age; Address (AK1); College ID (IE1) (FK1)
Election: Student ID (FK1); Course ID (FK2); Score; Election Times
Course: Course ID (PK); Course Name (AK1) (IE1); Teacher ID (AK1) (IE2) (FK1); Credit Hour
Teacher: Teacher ID; Teacher First Name; Teacher Last Name; College ID (FK1)
College: College ID; College Name; College Address; Rector
Relationships: Student Elect Course / Course Open For Student; Teacher Teach Course; Teacher Belong to College; Student Belong to College