Published by Homer Powers. Modified over 9 years ago.
1
Physical Database Design DeSiaMore Powered by DeSiaMore
2
Lecture Objectives: Overview of the physical database design process. Describing volume and usage analysis. Exploring the design of fields. Designing physical records and denormalization.
3
What is Physical Database Design? Physical database design takes the results of the logical design process and fine-tunes them against the usage, performance, and storage requirements of the applications. Logical database design is implementation-independent. Physical database design is implementation-dependent.
4
Introduction: The purpose of physical database design is to translate the logical description of data into the technical specifications for storing and retrieving data. The goal is to create a design for storing data that will provide adequate performance and ensure database integrity, security, and recoverability.
5
Inputs to Physical Design: Normalized relations; attribute definitions; estimations of data processing volume; descriptions of where and when data are entered, retrieved, deleted, and updated; response time expectations/requirements; requirements for data security, backup, recovery, retention, and integrity; characteristics of the DBMS to be used.
6
What is Physical Database Design? The following activities are part of physical database design: Volume and Usage Analysis; Integrity Analysis; Control Security Analysis; Data Distribution Analysis.
7
Volume Analysis: Volume analysis is the first step in moving from logical to physical design. It aims to establish estimates of the likely number of instances per entity. This is useful because it indicates how many instances are likely to be stored in the system on average.
8
Volume Analysis: The table below summarises sizing estimates for the student database.
9
Volume Analysis: Data volumes reflect the number of records in each table. Access frequencies reflect the number of table record accesses per unit of time. Note which attributes are used in table accesses (to aid the design of table indexes).
10
Usage Analysis: Usage analysis requires that we identify the major transactions required for a database system. The transactions considered here consist of a series of insertions, updates, retrievals, deletions, or a mixture of all four.
11
Sample Transactions: Below are simple transactions common to a college database: register new students; add new courses; assign a lecturer to a course.
12
Group Exercise: Given a particular supermarket database design, draw up a list of the transactions that can be performed against it. The logical design of the database consists of product, customer, and supplier tables.
13
Physical Design Decisions: Specify the data type for each attribute from the logical data model; specify physical records by grouping attributes from the logical data model; specify the file organization technique to use for physical storage of data records; specify indexes to optimize data retrieval; specify query optimization strategies.
14
Designing Fields: A field is the smallest unit of data in a database. Field design involves choosing the data type; coding, compression, and encryption; and controlling data integrity.
15
Choosing Data Types (Oracle examples): CHAR: fixed-length character; VARCHAR2: variable-length character (memo); LONG: variable-length character data of up to 2 GB (a legacy type, not a numeric type); NUMBER: positive/negative number; INTEGER: positive/negative whole number; DATE: actual date; BLOB: binary large object (good for graphics, sound clips, etc.).
16
Designing Fields: Choosing the field data type. Select from available types such as text, memo, number, date/time, currency, etc. Seek to: minimize storage space (e.g., Integer vs. Floating Point); represent all possible values (e.g., Floating Point vs. Integer); improve data integrity (e.g., Yes/No; more on the next slide); support all data manipulations (e.g., Date/Time).
17
Designing Fields: Controlling data integrity. Default value (e.g., value “FL” for the State field); range control (e.g., value “<=100” for the Test_Score field); null value control (e.g., prohibit leaving the Date_of_Birth field blank); referential integrity (e.g., restrict valid values for the Part_No field in the Order table to the contents of this field in the Part table).
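The four integrity controls above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration under stated assumptions, not a prescribed implementation: the table layouts, the `orders` table name (chosen because ORDER is a reserved word in SQL), and the `violates` helper are hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical tables built around the field names used on the slide.
conn.executescript("""
CREATE TABLE part (
    part_no INTEGER PRIMARY KEY
);
CREATE TABLE student (
    student_id    INTEGER PRIMARY KEY,
    state         TEXT NOT NULL DEFAULT 'FL',          -- default value
    test_score    INTEGER CHECK (test_score <= 100),   -- range control
    date_of_birth TEXT NOT NULL                        -- null value control
);
CREATE TABLE orders (
    order_no INTEGER PRIMARY KEY,
    part_no  INTEGER NOT NULL REFERENCES part(part_no) -- referential integrity
);
""")


def violates(sql):
    """Return True if the DBMS rejects the statement with an integrity error."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


conn.execute("INSERT INTO part (part_no) VALUES (1)")
conn.execute(
    "INSERT INTO student (student_id, test_score, date_of_birth) "
    "VALUES (1, 95, '2000-01-15')"
)
# The state column was omitted, so the default applies.
default_state = conn.execute(
    "SELECT state FROM student WHERE student_id = 1"
).fetchone()[0]

range_rejected = violates(
    "INSERT INTO student (student_id, test_score, date_of_birth) "
    "VALUES (2, 120, '2000-01-15')"
)
null_rejected = violates(
    "INSERT INTO student (student_id, test_score) VALUES (3, 80)"
)
fk_rejected = violates("INSERT INTO orders (order_no, part_no) VALUES (1, 999)")
```

Declaring the rules in the schema means the DBMS rejects bad values regardless of which application submits them.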
18
Designing Fields: Fixed-length fields make it easy to locate a specific record in a file and/or a specific field in that record. Each field has its maximum length specified, and unused space in any given field is padded with spaces (text) or leading zeros (numeric). Variable-length fields: when the need arises for a variable-length field (e.g., a memo field), the field can be stored separately from the rest of the record, with a pointer used to locate it when needed.
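Why fixed-length fields make records easy to locate can be shown with a short sketch using Python's struct module. The record layout (a 4-byte integer id, a 20-byte name, and a 2-byte state code) and the sample values are assumptions made for illustration only.

```python
import struct

# Hypothetical layout: 4-byte integer id, 20-byte name, 2-byte state code.
RECORD = struct.Struct("<i20s2s")


def pack_record(student_id, name, state):
    """Encode one record; text fields are padded with spaces to full width."""
    return RECORD.pack(
        student_id,
        name.ljust(20).encode("ascii"),
        state.ljust(2).encode("ascii"),
    )


def read_record(buffer, n):
    """Read record n directly: its offset is simply n * record size."""
    student_id, name, state = RECORD.unpack_from(buffer, n * RECORD.size)
    return student_id, name.decode("ascii").rstrip(), state.decode("ascii").rstrip()


data = b"".join(
    [
        pack_record(1, "Asha", "FL"),
        pack_record(2, "Juma", "NY"),
        pack_record(3, "Neema", "TX"),
    ]
)
third = read_record(data, 2)  # jumps straight to the third record
```

Because every record occupies the same number of bytes, the position of any record is a multiplication rather than a search. The cost is the padding stored in every short value.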
19
Physical Records: Physical record: “A group of fields stored in adjacent memory locations and retrieved together as a unit.” Page: “The amount of data read or written in one secondary memory (disk) input or output operation.” Blocking factor: “The number of physical records per page.”
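The blocking factor follows directly from the page and record sizes. The sketch below assumes that records are not split across pages; the 4096-byte page and 100-byte record are illustrative figures, not values from the lecture.

```python
import math


def blocking_factor(page_size, record_size):
    """Number of whole physical records that fit in one page."""
    if page_size <= 0 or record_size <= 0:
        raise ValueError("sizes must be positive")
    return page_size // record_size


def pages_needed(record_count, page_size, record_size):
    """Pages required to hold record_count records (no record spans pages)."""
    factor = blocking_factor(page_size, record_size)
    if factor == 0:
        raise ValueError("record is larger than a page")
    return math.ceil(record_count / factor)


factor = blocking_factor(4096, 100)          # 40 records per page
total_pages = pages_needed(10_000, 4096, 100)  # 250 pages
```

A larger blocking factor means fewer disk operations to read the same number of records, which is why record size matters in physical design.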
20
Database Access Model: The goal in structuring physical records is to minimize performance bottlenecks resulting from disk accesses (accessing data from disk is slow compared to main memory).
21
Optimization Decisions: Denormalization; partitioning; selection of file organization; creation of indexes.
22
Denormalisation: The main problem with a fully normalised database is that it has many tables. To perform useful queries, such tables have to be reconstituted via expensive join operations. Updates frequently have to be performed across more than one table.
23
Denormalisation: One obvious way of improving retrieval or update performance is to go back from a fully normalized database and introduce some controlled redundancy.
24
Definition: “The process of transforming normalized relations into unnormalized physical record specifications [for the purpose of improving overall database performance].” or
25
Definition: Denormalization is a technique to move from higher to lower normal forms of database modeling in order to speed up database access. You may apply denormalization in the process of deriving a physical data model from a logical one.
26
Example: Four examples of strict violations of normalization are shown in the schema below. ORDER (Order No, Customer No, Customer Name, Customer Address, Order Date). ORDER LINE (Order No, Line No, Customer No, Customer Name, Customer Address, Product Code, Unit Count, Unit Price, Total Price, Required By Date).
27
Example: From the schema above, it can be assumed that Customer Name and Customer Address have been copied from a Customer table with primary key Customer No. Customer No has been copied from the Order table to the Order Line table.
28
Example: It can be assumed that Unit Price has been copied from a Product table with primary key Product Code. Total Price can be calculated by multiplying Unit Price by Unit Count.
29
Example: Benefits. Changes such as this are intended to offer performance benefits for some transactions. For example, a query on the Order Line table that also requires the Customer No does not have to access the Order table as well.
30
Example: Benefits. However, there is a downside: each such additional column must be carefully controlled. Users should not be able to update it directly. It must be updated automatically by the application (e.g., via a DBMS trigger).
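One way to keep such redundant columns under control is sketched below with sqlite3 triggers. This is an assumption-laden illustration, not the lecture's own schema: the header table is named `order_header` because ORDER is a reserved word in SQL, only a few of the columns from the example are included, and the trigger names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_header (
    order_no    INTEGER PRIMARY KEY,
    customer_no INTEGER NOT NULL
);
CREATE TABLE order_line (
    order_no    INTEGER NOT NULL REFERENCES order_header(order_no),
    line_no     INTEGER NOT NULL,
    customer_no INTEGER,           -- redundant copy of order_header.customer_no
    unit_count  INTEGER NOT NULL,
    unit_price  REAL NOT NULL,
    total_price REAL,              -- derived: unit_count * unit_price
    PRIMARY KEY (order_no, line_no)
);

-- Fill the redundant and derived columns whenever a line is inserted.
CREATE TRIGGER order_line_fill AFTER INSERT ON order_line
BEGIN
    UPDATE order_line
    SET customer_no = (SELECT customer_no FROM order_header
                       WHERE order_no = NEW.order_no),
        total_price = NEW.unit_count * NEW.unit_price
    WHERE order_no = NEW.order_no AND line_no = NEW.line_no;
END;

-- Propagate a change of customer on the header to every copied value.
CREATE TRIGGER order_header_propagate AFTER UPDATE OF customer_no ON order_header
BEGIN
    UPDATE order_line SET customer_no = NEW.customer_no
    WHERE order_no = NEW.order_no;
END;
""")

conn.execute("INSERT INTO order_header (order_no, customer_no) VALUES (1, 42)")
conn.execute(
    "INSERT INTO order_line (order_no, line_no, unit_count, unit_price) "
    "VALUES (1, 1, 3, 2.5)"
)
row = conn.execute(
    "SELECT customer_no, total_price FROM order_line "
    "WHERE order_no = 1 AND line_no = 1"
).fetchone()

conn.execute("UPDATE order_header SET customer_no = 77 WHERE order_no = 1")
customer_after = conn.execute(
    "SELECT customer_no FROM order_line WHERE order_no = 1 AND line_no = 1"
).fetchone()[0]
```

A query on `order_line` can now read the customer number without joining to the header, at the price of extra work on every insert and header update.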
31
Partitioning: Horizontal partitioning: distributing the rows of a table into two or more separate files (e.g., a Customer table is partitioned into four separate files, one for each geographical region). Vertical partitioning: distributing the columns of a table into two or more separate files (e.g., an Employee table is partitioned into a public file (name, office, extension, etc.) and a private file (salary, health history, etc.)). Note: the primary key is repeated in each file.
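The two kinds of partitioning can be sketched in a few lines of Python, with rows held as dictionaries. The function names, column names, and sample rows are hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict


def horizontal_partition(rows, key):
    """Split rows into separate 'files', one per distinct value of key."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[key]].append(row)
    return dict(parts)


def vertical_partition(rows, primary_key, column_groups):
    """Split columns into separate 'files'; the primary key is repeated in each."""
    return {
        name: [
            {primary_key: row[primary_key], **{col: row[col] for col in cols}}
            for row in rows
        ]
        for name, cols in column_groups.items()
    }


customers = [
    {"customer_no": 1, "name": "Asha", "region": "North"},
    {"customer_no": 2, "name": "Juma", "region": "South"},
    {"customer_no": 3, "name": "Neema", "region": "North"},
]
by_region = horizontal_partition(customers, "region")

employees = [{"emp_no": 10, "name": "Baraka", "office": "B12", "salary": 5000}]
files = vertical_partition(
    employees, "emp_no", {"public": ["name", "office"], "private": ["salary"]}
)
```

Here `by_region["North"]` holds customers 1 and 3, while `files["public"]` and `files["private"]` both carry `emp_no` so the original row can be rebuilt by joining on it.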
32
Partitioning: Advantages of partitioning: records used together are grouped together; each partition can be optimized for performance; security and recovery; partitions stored on different disks mean less contention; parallel processing capability. Disadvantages of partitioning: slower retrievals when a query spans partitions; complexity for application programmers; anomalies and extra storage space requirements due to duplication of data across partitions.
33
Physical Files: Physical file: a file as stored on disk. Constructs to link two pieces of data: sequential storage; pointers. File organization: how the files are arranged on the disk. Access method: how the data can be retrieved based on the file organization. Relative: data accessed as an offset from the most recently referenced point in secondary memory. Direct: data accessed as a result of a calculation to generate the beginning address of a record.
34
File Organizations: “A technique for physically arranging the records of a file on secondary storage devices.” Goals in selecting (trade-offs exist, of course): fast data retrieval; high throughput for input and maintenance; efficient use of storage space; protection from failures or data loss; minimal need for reorganization; accommodation for growth; security from unauthorized use.
35
File Organizations: Sequential; Indexed (Indexed Sequential, Indexed Nonsequential).
36
Sequential File Organization: Records of the file are stored in sequence by the primary key field values.
37
Sequential Retrieval: Consider a file of 10,000 records, each occupying 1 page. A query that must examine every record (e.g., find all items of type 'E') requires 10,000 page accesses. Many disk accesses are wasted if few records meet the condition. However, sequential retrieval is very effective if most or all records will be accessed (e.g., payroll).
38
Indexed File Organization: The index concept is like the index in a book. Indexed-sequential file organization: the records are stored sequentially by primary key values, and there is an index built on the primary key field (and possibly indexes built on other fields as well).
39
Indexing: An index is a table or file that is used to determine the location of rows in another file that satisfy some condition.
40
Querying with an Index: Read the index into memory; search the index to find records meeting the condition; access only those records containing required data. Disk accesses are substantially reduced when the query involves few records.
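The saving can be made concrete with a small simulation of the 10,000-record file from the sequential retrieval slide. The assumptions are loud ones: one record per page, every 500th record has type 'E', the index is already in memory (so reading it is not counted), and all names are hypothetical.

```python
from collections import defaultdict

# Hypothetical file: 10,000 records, one per page; every 500th has type 'E'.
pages = [{"id": i, "type": "E" if i % 500 == 0 else "A"} for i in range(10_000)]


def sequential_scan(pages, wanted_type):
    """Read every page in order and keep the records that match."""
    accesses, matches = 0, []
    for record in pages:
        accesses += 1
        if record["type"] == wanted_type:
            matches.append(record["id"])
    return matches, accesses


def build_index(pages):
    """Map each type value to the page numbers where it occurs."""
    index = defaultdict(list)
    for page_no, record in enumerate(pages):
        index[record["type"]].append(page_no)
    return index


def indexed_lookup(pages, index, wanted_type):
    """Search the in-memory index, then read only the pages it points to."""
    accesses, matches = 0, []
    for page_no in index.get(wanted_type, []):
        accesses += 1
        matches.append(pages[page_no]["id"])
    return matches, accesses


index = build_index(pages)
scan_matches, scan_accesses = sequential_scan(pages, "E")
index_matches, index_accesses = indexed_lookup(pages, index, "E")
```

Both approaches return the same 20 records, but the scan touches all 10,000 pages while the indexed lookup touches only 20. For the common type 'A' the index would save almost nothing, which matches the point that indexes help most when few records qualify.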
41
Maintaining an Index: Adding a record requires at least two disk accesses: update the file; update the index. Trade-off: faster queries, but slower maintenance (additions, deletions, and updates of records). Thus, more static databases benefit more overall.
42
Rules of Thumb for Using Indexes: 1. Indexes are most useful on larger tables. 2. Index the primary key of each table (may be automatic, as in Access). 3. Indexes are useful on search fields (WHERE). 4. Indexes are also useful on fields used for sorting (ORDER BY) and categorizing (GROUP BY). 5. It is most useful to index on a field when there are many different values for that field.
43
Rules of Thumb for Using Indexes: 6. Find out the limits placed on indexing by your DBMS (Access allows 32 indexes per table, and no index may contain more than 10 fields). 7. Depending on the DBMS, null values may not be referenced from an index (thus, rows with a null value in the indexed field may not be found by a search using the index).
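Rule 3 (index the fields used in WHERE clauses) can be checked against a real query planner. The sketch below uses SQLite through Python's sqlite3 module; the `student` table, its columns, the index name, and the generated rows are hypothetical, and the wording of the plan text varies between SQLite versions, so only the presence of the index name is examined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    "student_id INTEGER PRIMARY KEY, last_name TEXT, programme TEXT)"
)
# Many distinct last_name values, so the field is a good index candidate.
conn.executemany(
    "INSERT INTO student (last_name, programme) VALUES (?, ?)",
    [(f"name{i}", f"prog{i % 5}") for i in range(1000)],
)


def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[3] for row in rows)  # column 3 holds the detail text


query = "SELECT student_id FROM student WHERE last_name = 'name42'"

plan_before = plan(query)  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_student_last_name ON student (last_name)")
plan_after = plan(query)   # the planner can now search via the new index

uses_index_before = "idx_student_last_name" in plan_before
uses_index_after = "idx_student_last_name" in plan_after
```

Inspecting the plan before and after creating an index is a cheap way to confirm that an index is actually used by the queries it was built for.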
44
Group Exercise: Consider a college database consisting of three tables: Student, Lecturer, and Course. Denormalize your tables so that you increase the performance of the following query: list all students who take the Database Development course taught by Bajuna.
45
Next Topic: Client/Server and Middleware