IMS 4212: Database Implementation 1 Dr. Lawrence West, Management Dept., University of Central Florida Physical Database Implementation—Topics.

Presentation transcript:

Topics
– Denormalization
– Partitioning Tables (relations)
– Parallel Processing & RAID

Denormalization
Denormalizing is the process of reshuffling attributes, and sometimes entities, to create entities that violate the rules of normalization
We are trading off (again) storage efficiency and anomaly avoidance for better retrieval efficiency
Denormalizing includes:
– Storing derived attributes explicitly
– Allowing partial or transitive dependencies (violating second, third, or Boyce-Codd normal form)
– Merging entities in 1:1 relationships

Denormalization (cont.)
Derived attributes
– Storing derived attributes is one of the most common means of improving processing efficiency
– How many tables/row examinations are avoided by storing total grade points and total credit hours with the STUDENT entity?
– What new operations must be introduced to keep the data current?
– Explicitly storing derived attributes gives rise to new operational business rules to enforce accuracy
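The derived-attribute trade-off can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not the course's own schema: the table and column names (student, enrollment, total_credit_hours, total_grade_points) and the trigger are assumptions made for the example. The trigger is the "new operation" that keeps the stored totals current. Only inserts are handled here; updates and deletes of enrollments would need similar rules.

```python
import sqlite3

# Hypothetical schema: the totals on STUDENT are derived from ENROLLMENT rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (
    student_id         INTEGER PRIMARY KEY,
    name               TEXT NOT NULL,
    total_credit_hours INTEGER NOT NULL DEFAULT 0,  -- derived, stored explicitly
    total_grade_points REAL    NOT NULL DEFAULT 0   -- derived, stored explicitly
);
CREATE TABLE enrollment (
    student_id   INTEGER NOT NULL REFERENCES student(student_id),
    course_id    TEXT    NOT NULL,
    credit_hours INTEGER NOT NULL,
    grade_points REAL    NOT NULL,  -- points per credit hour, e.g. 4.0 for an A
    PRIMARY KEY (student_id, course_id)
);
-- Business rule: every new enrollment must also update the stored totals.
CREATE TRIGGER enrollment_after_insert AFTER INSERT ON enrollment
BEGIN
    UPDATE student
    SET total_credit_hours = total_credit_hours + NEW.credit_hours,
        total_grade_points = total_grade_points
                             + NEW.credit_hours * NEW.grade_points
    WHERE student_id = NEW.student_id;
END;
""")

conn.execute("INSERT INTO student (student_id, name) VALUES (1, 'Pat')")
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?, ?, ?)",
    [(1, "IMS4212", 3, 4.0), (1, "MAT1001", 4, 3.0)],
)

# The GPA now comes from a single-row read of STUDENT; ENROLLMENT is not scanned.
hours, points = conn.execute(
    "SELECT total_credit_hours, total_grade_points FROM student WHERE student_id = 1"
).fetchone()
gpa = points / hours
```

The read touches one row in one table, at the cost of extra work on every write to the enrollment table.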

Denormalization (cont.)
1:1 Relationships
– It may be possible to collapse data from one entity in a 1:1 relationship into the other
– Usually the pervasive entity survives
– Alternatively, both entities may be retained, but the data from one may be copied into the other to avoid a table look-up

Denormalization (cont.)
1:M Relationships
– You may consider moving or duplicating attributes from the “one” side of a 1:M relationship into the “many” side
– This will result in considerable data duplication
– Considerations
  – There should be many records on the “one” side
  – Frequent access should be directly into the “many” side

Denormalization (cont.)
1:M Relationships (cont.)
A similar technique may be used for a M:M relationship by collapsing or copying attributes into the associative entity between the two entities
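Duplicating an attribute from the "one" side into the "many" side can be sketched as follows, again with sqlite3. The customer and orders tables, their columns, and the synchronizing trigger are hypothetical names chosen for illustration. The copied customer_name lets order queries skip the look-up into customer, and the trigger is one possible way to enforce the extra rule that the copies stay accurate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (              -- the "one" side
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL
);
CREATE TABLE orders (                -- the "many" side
    order_id      INTEGER PRIMARY KEY,
    customer_id   INTEGER NOT NULL REFERENCES customer(customer_id),
    customer_name TEXT NOT NULL,     -- duplicated from customer (denormalized)
    order_total   REAL NOT NULL
);
-- Rule needed because of the duplication: propagate name changes to the copies.
CREATE TRIGGER customer_name_sync AFTER UPDATE OF customer_name ON customer
BEGIN
    UPDATE orders
    SET customer_name = NEW.customer_name
    WHERE customer_id = NEW.customer_id;
END;
""")

conn.execute("INSERT INTO customer VALUES (1, 'Acme')")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(10, 1, "Acme", 250.0), (11, 1, "Acme", 75.5)],
)

# One update on the "one" side fans out to every duplicated copy.
conn.execute("UPDATE customer SET customer_name = 'Acme Corp' WHERE customer_id = 1")

# Frequent access goes directly into the "many" side, with no join.
names = [row[0] for row in conn.execute(
    "SELECT customer_name FROM orders ORDER BY order_id")]
```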

Denormalization (cont.)
The goal of denormalizing is to avoid accessing a (large) table for high-frequency critical transactions
Denormalizing usually requires additional business rules to guarantee that data remains accurate in the face of updates

Partitioning
Partitioning entities divides one table into many
– Horizontal partitioning
  – Each table has all fields from the original table
  – Each table has a subset of records
– Vertical partitioning
  – Each table has the PK of the original table
  – Each table has all records
  – Each table has a subset of fields
– May partition both vertically and horizontally
Very powerful technique with historical data

Partitioning (cont.)
Horizontal Partitioning
– How many records in the STUDENT table?
– How many of them are currently enrolled?
– How frequently do we need to access both current and former students in the same query or operation?
– It may make sense to partition tables based on a historical context
  – Active records vs. archived records
– May also partition based on geographic considerations
– Whole table can be reconstructed using a UNION query
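A horizontal partition of STUDENT into active and archived records might look like the sketch below. The table names student_active and student_archive, the last_term column, and the student_all view are assumptions for the example. The view uses UNION ALL rather than UNION because the two partitions are assumed to be disjoint, so duplicate elimination is unnecessary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Both partitions carry every field of the original table.
CREATE TABLE student_active (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    last_term  TEXT NOT NULL
);
CREATE TABLE student_archive (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    last_term  TEXT NOT NULL
);
-- The whole logical table is reconstructed on demand.
CREATE VIEW student_all AS
    SELECT student_id, name, last_term FROM student_active
    UNION ALL
    SELECT student_id, name, last_term FROM student_archive;
""")

conn.executemany(
    "INSERT INTO student_active VALUES (?, ?, ?)",
    [(1, "Pat", "2024FA"), (2, "Lee", "2024FA")],
)
conn.execute("INSERT INTO student_archive VALUES (3, 'Kim', '2009SP')")

# Day-to-day queries scan only the small active partition.
active_count = conn.execute("SELECT COUNT(*) FROM student_active").fetchone()[0]

# Occasional queries over current and former students go through the view.
all_ids = [row[0] for row in conn.execute(
    "SELECT student_id FROM student_all ORDER BY student_id")]
```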

Partitioning (cont.)
Vertical Partitioning
– Librarian, Registrar, Athletic Department, and Health Center may all need a different subset of fields from the STUDENT entity
– It may make sense to create separate tables containing the necessary attributes for each view
– Common PK creates 1:1 cardinality between all tables
– Whole logical record can be assembled using SQL when needed
– We are actually backing into a supertype/subtype relationship
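A vertical partition can be sketched the same way. The two tables below (student_registrar and student_health) and their columns are invented for illustration. Each holds a different subset of fields but shares the same primary key, so a join on that key reassembles the whole logical record.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Each partition has the PK of the original table plus a subset of its fields.
CREATE TABLE student_registrar (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    major      TEXT NOT NULL
);
CREATE TABLE student_health (
    student_id INTEGER PRIMARY KEY REFERENCES student_registrar(student_id),
    blood_type TEXT,
    immunized  INTEGER NOT NULL      -- 1 = yes, 0 = no
);
""")

conn.execute("INSERT INTO student_registrar VALUES (1, 'Pat', 'MIS')")
conn.execute("INSERT INTO student_health VALUES (1, 'O+', 1)")

# The common PK gives a 1:1 join that rebuilds the full record when needed.
whole_record = conn.execute("""
    SELECT r.student_id, r.name, r.major, h.blood_type, h.immunized
    FROM student_registrar AS r
    JOIN student_health AS h ON h.student_id = r.student_id
    WHERE r.student_id = 1
""").fetchone()
```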

RAID Storage Devices
In conventional drives data is laid down sequentially along a track on the disk
– Read/write head must move along the track to read the data
– Each read/write operation must finish before the next can begin
– A drive failure can result in loss of all data

RAID Storage Devices
RAID stands for Redundant Array of Inexpensive Disks
– Multiple disks appear as a single logical drive to the computer
– May be implemented in hardware or software (OS)
Various RAID levels provide for different levels of performance and redundancy
Most RAID levels enable an entire lost physical drive to be rebuilt from redundant data such as parity storage

RAID Storage Devices: RAID 3
Records are striped across multiple physical devices
– Part of each record is laid down across multiple physical drives
– Much faster read/write time, since the disk rotation needed to read a whole record/block is much shorter
– However, only one request can be serviced at a time
– Not commonly used in practice
A single parity disk allows reconstruction of the data on a damaged drive
(Image source: Wikipedia)

RAID Storage Devices: RAID 4
Blocks are stored independently on the drives
– Block A1 can be serviced just by Drive 0
– Simultaneous requests for Blocks B2 or D3 can also be serviced
A single parity drive enables recovery of lost data
Write operations may be slower: simultaneous write operations to Drives 0-2 must wait on the parity calculation and writing on Drive 3
(Image source: Wikipedia)
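Parity-based recovery relies on the XOR operation: the parity block is the byte-wise XOR of the data blocks in a stripe, and XOR-ing the surviving blocks with the parity yields the missing block. The sketch below shows only that arithmetic on in-memory byte strings. The helper name xor_blocks and the sample block contents are made up for the example, and no real controller behavior is modeled.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe spread over three data drives (Drives 0-2).
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]

# The parity drive (Drive 3) stores the XOR of the stripe's data blocks.
parity = xor_blocks(data_blocks)

# Suppose Drive 1 fails: XOR the surviving data blocks with the parity block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
```

Because every write to a data block changes the stripe's parity, each write also requires a write to the parity block, which is why a single dedicated parity drive can become a bottleneck.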

RAID Storage Devices: RAID 5
Similar to RAID 4 except that parity storage is distributed across multiple drives
– Rotating allocation
– Lessens the chance that writes on two drives will wait on parity updates on a single parity drive
(Image source: Wikipedia)
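Rotating allocation can be illustrated with a small placement function. The rotation used here (parity starts on the last drive and moves one drive to the left with each stripe) is just one possible layout, chosen as an assumption for the example; actual controllers may rotate differently. The function name parity_drive is hypothetical.

```python
def parity_drive(stripe, n_drives):
    """Drive index that holds the parity block for a given stripe.

    Assumed layout: parity begins on the last drive and shifts one
    drive to the left on each successive stripe, wrapping around.
    """
    return (n_drives - 1) - (stripe % n_drives)

def stripe_layout(stripe, n_drives):
    """Label each drive in a stripe as 'P' (parity) or 'D' (data)."""
    p = parity_drive(stripe, n_drives)
    return ["P" if drive == p else "D" for drive in range(n_drives)]

# Parity placement for the first five stripes on a four-drive array.
placements = [parity_drive(s, 4) for s in range(5)]
layout_rows = [stripe_layout(s, 4) for s in range(4)]
```

With this layout, writes to two different stripes usually update parity on two different drives, so they no longer queue behind a single parity drive.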

Parallel Processing
More and more computers support parallel processing (multiple CPUs on the same computer)
Some tasks can be split among multiple processors
In an SQL SELECT query the usual method requires the RDBMS to scan each record to determine whether it matches the WHERE clause or JOIN criteria
In parallel processing a part of the table is passed to each processor
Availability depends on hardware, OS, and RDBMS
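The idea of handing part of a table to each processor can be sketched in plain Python. This is only a structural illustration, not how any particular RDBMS implements parallel query: the function names and sample data are invented, and a thread pool is used for simplicity. In CPython, threads do not give true CPU parallelism for this kind of work, so the example shows the partition, scan, and merge steps rather than an actual speed-up.

```python
from concurrent.futures import ThreadPoolExecutor

def scan_partition(rows, predicate):
    """Scan one partition and keep the rows that match the WHERE-style predicate."""
    return [row for row in rows if predicate(row)]

def parallel_select(rows, predicate, workers=4):
    """Split the table into partitions, scan each one on a worker, merge results."""
    chunk = max(1, -(-len(rows) // workers))  # ceiling division
    partitions = [rows[i:i + chunk] for i in range(0, len(rows), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves partition order, so the merged result keeps table order.
        results = list(pool.map(lambda part: scan_partition(part, predicate),
                                partitions))
    return [row for part in results for row in part]

# Hypothetical STUDENT rows; gpa values cycle through 0..4 for the example.
students = [{"student_id": i, "gpa": i % 5} for i in range(1, 21)]
honor_roll = parallel_select(students, lambda s: s["gpa"] >= 4)
honor_ids = [s["student_id"] for s in honor_roll]
```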