1 CG171 - Database Implementation and Development (Physical Database Design) – Lecture 7 Storage Allocation & Data Access Methods By Dr. Akhtar Ali.



1. Storage Allocation

Storage Allocation – Logical and Physical View

Storage Allocation - Physical Files Allocation
How will the database be physically stored?
• One physical file or many?
  – e.g. all data in one physical file?
  – or each table or record type in its own physical file?
  – data definitions (metadata) or indexes in separate files?
• On one disk or over several?
    » or even distributed across a network?
• What is the optimum block size for each file?
  – a large block size allows more records to be read together in one physical read
    » useful for sequential access or when related records are stored together
  – a small block size is more efficient if records are accessed in a random manner
  – block size should be chosen to accommodate the most frequently accessed physical groups of records
    » usually operating system specific, e.g. x*512 bytes for small blocks up to 4k for large blocks on Windows NT; 2k for small and up to 32k for large blocks on UNIX
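The block-size trade-off above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming fixed-length records that never span blocks; the record size and record count are hypothetical figures chosen only for illustration.

```python
import math

def records_per_block(block_size: int, record_size: int) -> int:
    """Whole records that fit in one block (records assumed not to span blocks)."""
    return block_size // record_size

def sequential_reads(num_records: int, block_size: int, record_size: int) -> int:
    """Physical reads needed to scan every record once, at one read per block."""
    return math.ceil(num_records / records_per_block(block_size, record_size))

# Hypothetical workload: 10,000 records of 200 bytes each.
for block_size in (2048, 8192, 32768):
    print(block_size,
          records_per_block(block_size, 200),
          sequential_reads(10_000, block_size, 200))
```

A larger block cuts the number of reads for a full scan, but a random lookup of one record still transfers the whole block, which is why small blocks suit random access.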

Storage Allocation – Physical Data Distribution

Storage Allocation – Physical Memory Allocation

Example – File Allocation in Oracle SQL
• CREATE DATABASE database_name DATAFILE file_name...
    » specifies a .CTL file to hold all control data
    » also specifies several system files containing all table data unless storage areas are explicitly specified
• CREATE [TEMPORARY] TABLESPACE tablespace_name DATAFILE file_name...
    » used to create separate storage for system operations or database data
    » the physical file file_name will be automatically mapped by the DBMS to tablespace_name
    » file_name can include a full path, which allows network files to be used

File Allocation in Oracle SQL - continued
• Databases with explicit clauses for datafile control:
    » control the overall growth of the database's physical storage through a set of specified parameters
• Datafile parameters
  – MAXDATAFILES - limits the number of datafiles that can be opened for one database
  – AUTOEXTEND (ON or OFF) - allows additional space to be allocated for the next data segments once the file is full
  – NEXT - the size of the next increment of space allocated when the file is extended
  – MAXSIZE - the upper limit to which a datafile can be extended
• Example
    CREATE DATABASE newtest
      DATAFILE 'diska:dbone.dat' SIZE 2M
      MAXDATAFILES 10
      DATAFILE 'disk1:df1.dbf' AUTOEXTEND ON
               'disk2:df2.dbf' AUTOEXTEND ON NEXT 10M MAXSIZE 128M
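How AUTOEXTEND, NEXT and MAXSIZE interact can be sketched as a simple growth loop. This is a simplified model, not Oracle's actual allocation logic: the function name `extend_datafile` and the 2 MB starting size are assumptions made for illustration, while NEXT 10M and MAXSIZE 128M are taken from the example above.

```python
def extend_datafile(size_mb: int, needed_mb: int, next_mb: int, maxsize_mb: int) -> int:
    """Grow a file in NEXT-sized steps until `needed_mb` fits.

    Simplified model: raises once a further step would exceed MAXSIZE.
    """
    while size_mb < needed_mb:
        if size_mb + next_mb > maxsize_mb:
            raise RuntimeError("MAXSIZE reached; the file cannot be extended further")
        size_mb += next_mb
    return size_mb

# Hypothetical 2 MB file with NEXT 10M and MAXSIZE 128M, asked to hold 25 MB.
print(extend_datafile(2, 25, 10, 128))
```

With these figures the file is extended three times before the data fits; a request that cannot be met within MAXSIZE fails instead of growing without limit.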

Storage Allocation - Database Tables Storage
How will the database tables be physically spread?
• Entirely on disk and/or in cached memory?
    » Frequent vs. infrequent data use
• In one physical storage area (block) or in several?
  – All data is static, no growth of tables projected
  – Dynamic data, table growth predicted
• What is the size of each physical and logical storage area to be used?
  – Initial storage size
  – Size and number of the automatic extensions
  – Limits for extending the storage area

Storage Allocation – Database Tables Storage - cntd

Example - Table Storage in Oracle SQL - continued
• Tables with clauses for explicit tablespace control
    » control the growth of the tablespace segments used for physical storage of database tables through a set of specified parameters
• Tablespace parameters
  – INITIAL [K|M] - the original size of the tablespace segment (its first extent)
  – NEXT [K|M] - the size of the first extent added when the segment is extended
  – MINEXTENTS - the number of extents allocated when the segment is created
  – MAXEXTENTS - the maximum number of extents
  – PCTINCREASE - the percentage by which NEXT increases for each further extension
  – OPTIMAL [K|M] | NULL - the optimal size of a rollback segment (applies to rollback segments only)
• Example
    CREATE TABLE salgrade (
      grade NUMBER CONSTRAINT pk_salgrad PRIMARY KEY,
      losal NUMBER,
      hisal NUMBER)
      TABLESPACE human_resource
      STORAGE (INITIAL 64 NEXT 64 MINEXTENTS 1 MAXEXTENTS 5)
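The effect of INITIAL, NEXT and PCTINCREASE on segment growth can be sketched as a short calculation. This is an approximation under stated assumptions: it ignores the rounding to whole blocks that a real DBMS applies, and the function name `extent_sizes` is hypothetical.

```python
def extent_sizes(initial: int, next_size: int, pctincrease: int, count: int) -> list[int]:
    """Sizes of the first `count` extents of a segment.

    The first extent is INITIAL, the second is NEXT, and each later extent
    is the previous one grown by PCTINCREASE percent (no block rounding).
    """
    sizes = [initial]
    current = next_size
    while len(sizes) < count:
        sizes.append(current)
        current = current * (100 + pctincrease) // 100
    return sizes[:count]

# STORAGE (INITIAL 64 NEXT 64 MAXEXTENTS 5) with no percentage increase.
print(extent_sizes(64, 64, 0, 5), sum(extent_sizes(64, 64, 0, 5)))
# The same settings with a hypothetical PCTINCREASE of 50.
print(extent_sizes(64, 64, 50, 4))
```

Summing the list up to MAXEXTENTS gives the largest size the table's segment can reach, which is a useful check before choosing the parameters.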

Storage Allocation - Records Placement
For each record type, it is necessary to specify how and where it will be stored
• Each record type should be stored in a way which gives the best performance for the most important functions
  – the most frequent, on-line functions are likely to be most important
  – infrequent or off-line (batch) functions are probably less important
  – but this also depends on the business perspective
• Analyse the types of access required by these functions:
  – e.g. store a new record?
  – access an individual record directly via the primary key?
  – access a range of records sequentially in primary key sequence?
  – access a record or records from a related master record?
  – access via a secondary key?
  – access records in no particular sequence?

Storage Allocation - Records Placement - cntd
• Records may be stored contiguously, but record placement will also depend on the number of records
  – e.g. if there are only a few records then they can be stored in one physical block
  – e.g. related records can be stored together, but not if the number is large
• Records may be stored serially as they arrive
  – simply add new records to the end of the file, and extend the file when full
  – a good method for storing transaction data or archiving
    » where the main overhead is storing new records
    » but the data is infrequently accessed
• Records may be stored sequentially in primary key order for fast range search and direct match
  – allows sequential access for batch processing of similar data
• Records may be stored randomly using an algorithm on the primary key
  – allows fast access for processing of single matching records

Storage Allocation - Records Placement - cntd
• Indexed Sequential - the most popular
  – the primary key index can be a very efficient 'limit' index
    » the index only needs to record the highest key value in each block
    » the index does not need updating when records are added or deleted
  – e.g. store Order records in Order number sequence to allow efficient production of pick lists, invoices etc.
Index:
    Block   Highest key
    B1      R4
    B2      R10
    B3      R14
    B4      R20
Database file - blocks B1, B2 etc., containing data records R1, R3 etc.:
    B1: R1  R3  R4
    B2: R6  R7  R10
    B3: R11 R14
    B4: R15 R16 R18 R20
Where will record R12 be stored? Where will record R5 be stored?
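The two questions on this slide can be answered mechanically: a record belongs in the first block whose highest key is not smaller than the record's key. The sketch below uses the index values from the slide; the function name `block_for` and the rule for keys beyond the last limit are assumptions.

```python
from bisect import bisect_left

# 'Limit' index from the slide: highest key value held in each block.
limit_index = [(4, "B1"), (10, "B2"), (14, "B3"), (20, "B4")]
high_keys = [key for key, _ in limit_index]

def block_for(key: int) -> str:
    """Return the block whose key range covers `key`."""
    pos = bisect_left(high_keys, key)
    if pos == len(limit_index):
        # Assumption: keys above the highest limit go to the last block.
        return limit_index[-1][1]
    return limit_index[pos][1]

print(block_for(12))  # R12
print(block_for(5))   # R5
```

Only the four index entries are searched, not the data blocks, which is what makes the limit index cheap to hold and to consult.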

Storage Allocation - Records Placement - cntd
• Records may be stored randomly using an algorithm on the primary key (hashing)
  – allows direct, fast access to individual records
  – no need to maintain or access an index
  – but sequential access will be very inefficient
    » it will require an index to be maintained, or the records sorted
  – e.g. store Customer records according to an algorithm on Cust ref
  – algorithm = divide key value by 1000 and use remainder as address
Database file (no need for an index):
    B1: R1    R1001 R3001
    B2: R2002 R2
    B3: R1003 R4003 R2003
    B4: R1004 R3004 R4
Where will record R5123 be stored? How many blocks in file?
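The hashing rule on this slide (divide the key by 1000 and use the remainder as the address) is short enough to write out. In this minimal sketch the function names are hypothetical and the key list is copied from the slide's diagram.

```python
from collections import defaultdict

def hash_address(key: int, modulus: int = 1000) -> int:
    """Block address = remainder after dividing the key by the modulus."""
    return key % modulus

def place(keys, modulus: int = 1000) -> dict:
    """Group keys by the block address that the hashing rule gives them."""
    blocks = defaultdict(list)
    for key in keys:
        blocks[hash_address(key, modulus)].append(key)
    return dict(blocks)

keys = [1, 1001, 3001, 2002, 2, 1003, 4003, 2003, 1004, 3004, 4]
layout = place(keys)
print(layout)
print(hash_address(5123))  # address for record R5123
```

Because the remainder can take any value from 0 to 999, the rule implies a file of 1000 addressable blocks, and keys that share a remainder land in the same block, which is why overflow handling is needed.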

Storage Allocation - Records Placement - cntd
• Records may be stored in physical groups of related records (clusters or partitions)
  – the master record can be stored as required - serial / sequential / random
  – the detail records are then stored in the same or adjacent block(s)
  – e.g. store Order Header and Order Item records together in the same block(s)
  – related records can be read together in one physical read from disk
  – but if detail records need to be accessed independently of the master then they will have to be indexed additionally
• Both random and sequential storage require overflow facilities and periodic reorganisation
Database file (H = Order Header, I = Order Item):
    B1: H23 I23/1 I23/2
    B2: H92 I92/1 I92/2 I92/4
    B3: H16 I16/1 I16/2
    B4: H74 I74/1

Clustered tables

Example - Clustered tables in Oracle SQL
For physical grouping of records into a single storage area
• CREATE CLUSTER [schema_name.]cluster_name (column_list) [TABLESPACE tablespace_name] …
    » clusters store records from different tables sharing the same cluster key
    » clusters can be sorted or hashed for fast information retrieval
• Example: hashed cluster containing two tables
    CREATE CLUSTER personnel (deptno NUMBER(2), phoneno INTEGER)
      HASHKEYS 20;
    CREATE TABLE dept (deptno NUMBER(2), dname VARCHAR2(9), loc VARCHAR2(9))
      CLUSTER personnel (deptno);
    CREATE TABLE emp (empno NUMBER(4), ename VARCHAR2(30), deptno NUMBER(2), phoneno INTEGER)
      CLUSTER personnel (deptno, phoneno);

Partitioned tables

Example - Partitioned tables in Oracle SQL
For logical partitioning of a physical storage area into parts
• Used for both table and index data storage
• Both physical (e.g. size) and logical (e.g. interval of values) criteria for partitioning
• Partitions are accessible by name directly in SQL
• Example: table partitioning by the date values of an attribute
    CREATE TABLE xansactions (
      trade_date DATE,
      num_shares NUMBER(10),
      price NUMBER(5,2)…)
      STORAGE (INITIAL 100K NEXT 50K) LOGGING
      PARTITION BY RANGE (trade_date)
        (PARTITION sx1992 VALUES LESS THAN (TO_DATE('01-JAN-93','DD-MON-YY')) TABLESPACE ts0,
         PARTITION sx1993 VALUES LESS THAN (TO_DATE('01-JAN-94','DD-MON-YY')) TABLESPACE ts1,
         …
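Range partitioning routes each row to the first partition whose VALUES LESS THAN bound is strictly greater than the partitioning value. The sketch below models only the two partitions shown on the slide and assumes the two-digit years mean 1993 and 1994; the function name `partition_for` is hypothetical.

```python
from bisect import bisect_right
from datetime import date

# (exclusive upper bound, partition name), in ascending bound order.
partitions = [
    (date(1993, 1, 1), "sx1992"),
    (date(1994, 1, 1), "sx1993"),
]
bounds = [bound for bound, _ in partitions]

def partition_for(trade_date: date) -> str:
    """Return the first partition whose bound is greater than `trade_date`."""
    pos = bisect_right(bounds, trade_date)
    if pos == len(partitions):
        raise ValueError("no partition accepts this trade_date")
    return partitions[pos][1]

print(partition_for(date(1992, 6, 15)))
print(partition_for(date(1993, 1, 1)))
```

Because the bound is exclusive, a trade dated exactly 1 January 1993 falls into the second partition, not the first.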

Index-organized tables

Example - Index-organized tables in Oracle SQL
For sequential ordering of the physical location of table records
• The primary key of the table is ordered for fast exact match and range search
• All attributes are stored together with the primary key directly in the index space, so any new placements or updates do not require reordering
    CREATE TABLE docindex (
      token CHAR(20),
      doc_id NUMBER,
      token_frequency NUMBER,
      token_offsets VARCHAR2(512),
      CONSTRAINT pk_idx PRIMARY KEY (token, doc_id))
      ORGANIZATION INDEX TABLESPACE ind_tbs...

Storage Allocation - Records Placement - ctnd
It may be useful to analyse the record access requirements:
Record type: CUSTOMER
                                  Type of access
    Function        On-line/   Store      Primary key  Primary key  Direct
                    Off-line              Direct       Sequential   Cust name
    New Customer    On         100/day
    Place Order     On                    1000/day
    Print Invoices  Off                                5000/week
    Enquiry         On                    200/day                   100/day
Add other access types and functions as required

Storage Allocation - Linking Related Records
For each relationship type, how will the physical access path, from one record to its related records, be implemented?
• By physical grouping (i.e. clustering)
  – i.e. by storing records together as described above
  – a relationship where the master and its detail records are stored in the same physical group is called a 'primary' relationship in SSADM
  – other relationships, where the master and detail records are physically separated, are known as 'secondary' relationships in SSADM
• By logical separation (i.e. partitioning)
  – storing records in successive partitions, e.g. splitting the year into months
  – each partition can be managed separately (storing, searching, backup, etc.)
  – each partition can also be indexed, and the indexes can also be partitioned

Storage Allocation - Linking Related Records - cntd
• Records may be stored in physical sequences (chains) by linked lists
  – the addresses of related records are stored with the data record itself
    » e.g. a Customer record might hold the address of the latest Order record for that Customer
    » each Order record could hold the address of the previous Order record for that Customer, and the address of the Customer record itself
Chain of records:
    Customer record:   address of latest Order record
    Order record 1089: address of previous Order, address of Cust record
    Order record 972:  address of previous Order, address of Cust record
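Following such a chain amounts to walking a linked list from the Customer record back through its Orders. In this minimal sketch dictionary keys stand in for physical record addresses, the customer identifier "C1" is hypothetical, and the order numbers are those on the slide.

```python
# Dictionary keys stand in for physical record addresses.
customers = {"C1": {"latest_order": 1089}}
orders = {
    1089: {"customer": "C1", "previous": 972},
    972: {"customer": "C1", "previous": None},  # oldest order ends the chain
}

def orders_for(customer_id: str) -> list[int]:
    """Walk the chain from the latest Order back to the oldest."""
    chain = []
    address = customers[customer_id]["latest_order"]
    while address is not None:
        chain.append(address)
        address = orders[address]["previous"]
    return chain

print(orders_for("C1"))
```

Each step of the loop is one record read, so reaching the oldest order costs as many reads as the customer has orders.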

Storage Allocation - Linking Related Records - cntd
• By primary key ordering (i.e. record sorting)
  – requires an index on the foreign key in the detail record
  – gives a relatively inefficient access path for more records
    » the index creates an overhead whenever a new detail record is added
    » finding a record from a secondary index may require several reads
  – but it is easy to add or change relationship types in the database schema
• By foreign key ordering (i.e. storage indexing)
  – the key values and addresses of detail records can be held in a small index stored directly with the master record, so they can be found quickly
    » e.g. for every Customer record create an index for their Order records
  – in a relational database, this could be done by creating a link table containing only the key values of the master and detail records:
Link Table:
    Master  Detail
    M1      D2
    M1      D9
    M2      D5
    M3      D1
    etc.

2. Data Access Methods

Data Access - Accessing Records
How will records need to be accessed? This will have been analysed already to determine the record placement
• Individual, direct access using the primary key value?
  – may be provided by algorithmic random or indexed sequential record placement
  – otherwise, create a hashed or sorted, unique primary key index
• Via related records?
  – master-detail and base-lookup relations
  – see 'Linking Related Records' above
• Sequential access in primary key order?
  – may be provided by indexed sequential record placement
  – otherwise, create a sorted, unique primary key index to read indirectly
• By secondary keys, in a group or individually?
  – create additional sorted indexes for each such key
  – create additional hashed indexes for any secondary keys where only individual, direct access is ever required

Data Access - Index Types
Indexing can be applied to both the data records (logical) and their storage (physical). The main types of indexes are:
• Hashed indexes
  – the key values are stored within the index using a hashing algorithm
    » allows fast direct access to data records via the hash key
    » does not allow sequential access
• Sorted indexes
  – the key values and record addresses are sorted into a key sequence
  – the index usually has a tree structure (B-tree index), but it can also be just a simple enumeration
  – data records can be found fairly quickly directly
  – the index can be used to read the data records sequentially
    » but not as efficiently as with sequential record placement
• Functional indexes
  – the key values are calculated using a pre-specified function
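The difference between the first two index types shows up as soon as a range of keys is requested. The following sketch uses a Python dict as a stand-in for a hashed index and a sorted list of (key, address) pairs as a stand-in for a sorted index; the sample rows and function names are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Data records in arrival order; the list position plays the role of an address.
rows = [(17, "Smith"), (3, "Jones"), (42, "Patel"), (8, "Khan")]

# Hashed index: key -> address. Direct access only, no key ordering.
hashed_index = {key: addr for addr, (key, _) in enumerate(rows)}

# Sorted index: (key, address) pairs held in key sequence.
sorted_index = sorted((key, addr) for addr, (key, _) in enumerate(rows))
sorted_keys = [key for key, _ in sorted_index]

def direct_lookup(key: int):
    """Fetch one record through the hashed index."""
    return rows[hashed_index[key]]

def range_scan(low: int, high: int):
    """Fetch all records with low <= key <= high through the sorted index."""
    start = bisect_left(sorted_keys, low)
    stop = bisect_right(sorted_keys, high)
    return [rows[addr] for _, addr in sorted_index[start:stop]]

print(direct_lookup(42))
print(range_scan(5, 20))
```

The range scan returns records in key order even though they are stored in arrival order, but each one needs a separate jump to its address, which is why it is slower than reading sequentially placed records.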

Data Access - Index Types - ctd
• B-tree indexes
  – B-tree indexes are organized into 'tables' (of key values and addresses)
  – i.e. a tree structure of index levels from a 'root' through 'branches' to 'leaves'
  – the leaf tables contain the key values and addresses of the data records
  – the branch tables index the leaves or lower-level branches
  – to find a record, the root is checked, then the appropriate branches down the tree are read to find the index table containing the record address and hence the data record itself
  – as leaf tables fill up, they are split and the branch tables are updated
  – indexes need periodic rebuilding to minimise table-splitting
  – do not create unnecessary indexes
Figure: tree structure running from the root through branches to leaves, which point to the records
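The root-to-leaf search described above can be sketched with a small, static tree. This is an illustration only: it does no insertion or splitting, the keys and addresses are hypothetical, and it assumes each separator key is the highest key found in the child to its left.

```python
from bisect import bisect_left

# Leaf tables: (key, record address) pairs in key order.
leaf1 = [(1, "a1"), (3, "a3"), (4, "a4")]
leaf2 = [(6, "a6"), (7, "a7"), (10, "a10")]
leaf3 = [(11, "a11"), (14, "a14")]
leaf4 = [(15, "a15"), (18, "a18"), (20, "a20")]

# Branch and root tables: keys[i] is the highest key under children[i].
branch_left = {"keys": [4], "children": [leaf1, leaf2]}
branch_right = {"keys": [14], "children": [leaf3, leaf4]}
root = {"keys": [10], "children": [branch_left, branch_right]}

def search(node, key):
    """Return (record address or None, number of index tables read)."""
    reads = 1
    while isinstance(node, dict):           # descend through root and branches
        node = node["children"][bisect_left(node["keys"], key)]
        reads += 1
    for leaf_key, address in node:          # scan the leaf table
        if leaf_key == key:
            return address, reads
    return None, reads

print(search(root, 10))
print(search(root, 12))
```

Every lookup reads one table per level, so the cost depends on the height of the tree and not on the number of records.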

Data Access - Processing Indexed Data
• Indexing the data records does not change the result of processing, but has a substantial impact on performance
  – a database without indexes is workable only when the number of records is small
  – data records may have more than one index for different operations
  – in principle, all the attributes in a data record could be indexed separately and/or jointly using composite indexes (fully indexed tables)
• Secondary indexes will degrade performance for updates
  – the index must be updated every time a record is added or deleted or the key value amended
  – this may involve several physical updates of the index for each record update
• Indexes can be processed as normal data records
  – i.e. partitioned data should have partitioned indexes as well
• When loading data into a database
  – remove all indexes from the schema
  – load the data
  – rebuild the indexes
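The three loading steps can be sketched as a small routine. To keep the example runnable it uses Python's built-in sqlite3 module with an in-memory SQLite database, not Oracle; the table, index and function names are hypothetical.

```python
import sqlite3

def bulk_load(conn: sqlite3.Connection, rows) -> None:
    """Drop the secondary index, load the rows, then rebuild the index."""
    conn.execute("DROP INDEX IF EXISTS idx_customer_name")          # 1. remove index
    conn.executemany(
        "INSERT INTO customer (cust_ref, name) VALUES (?, ?)", rows  # 2. load data
    )
    conn.execute("CREATE INDEX idx_customer_name ON customer (name)")  # 3. rebuild
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_ref INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE INDEX idx_customer_name ON customer (name)")

bulk_load(conn, [(i, f"name{i}") for i in range(1000)])

row_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
index_names = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'customer'"
)]
print(row_count, index_names)
```

Building the index once over the loaded data avoids updating it row by row, which is the overhead the slide warns about.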

Indexing options in Oracle SQL
For creating indexes, specifying different index clauses and options and allocating storage for them
• CREATE [UNIQUE | BITMAP] INDEX index_name ON table_name (column_list) [index_clauses]...
    » Index the table using a column directly selected from the indexed table
• CREATE [UNIQUE | BITMAP] INDEX index_name ON cluster_name [index_clauses] …
    » Index the table using a column selected from a cluster of tables with common columns, to which the indexed table belongs

Indexing options in Oracle SQL - continued
• an index can be stored in the same physical file as the data records or in a different one (depending on the frequency of table updates)
• an index can be independent of, or functionally dependent on, the indexed columns (index function)
• record placement is defined by the type of index
  – a hashed index gives hashed record placement
  – a sorted index gives logically sequential record placement
  – bitmap indexes use physical storage locators for record placement
• additional clauses allow records (rows) of the table, as well as their indexes, to be distributed over more than one physical file
  – either 'randomly' (i.e. arbitrarily, not hashed)
  – or partitioned 'horizontally' by key value hashing

Indexing options in Oracle SQL - continued
• Example: hashed index
    CREATE INDEX sales_idx ON sales(item)
      STORE IN (tbs1, tbs2)
• Example: bitmapped index (Oracle 8)
    CREATE BITMAP INDEX partno_ix ON lineitem (partno)
      TABLESPACE ts1
• Example: partitioned index (Oracle 8i)
    CREATE INDEX stock_ix ON stock (stock_symbol, stock_line)
      GLOBAL PARTITION BY RANGE (stock_symbol)
        (PARTITION VALUES LESS THAN ('N') TABLESPACE ts3,
         PARTITION VALUES LESS THAN (MAXVALUE) TABLESPACE ts4)
