Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College Physical Design 3.

Similar presentations


Presentation on theme: "1 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College Physical Design 3."— Presentation transcript:

1 1 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College Physical Design 3

2 2 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College Introduction Database Design Methodology requirements specification ER/EER modelling validation of ER/EER models; aggregation of different views transformation of ER/EER model into relational model normalisation physical design monitor and tune operational system

3 3 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College Outline overview of physical design base relations and enterprise constraints known from before transactions analysis file organisation and indexes

4 4 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College 1

5 5 Physical Design – Rationale all the documentation produced until physical design represents a detailed specification of what we intend to build, of what is required informally, physical design is the process that transforms these specifications into a good working system, using the functionality provided by the chosen DMBS

6 6 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College in next lecture Overview of Physical Design express logical model using the DDL language of the chosen/target DBMS design optimal storage design user views design security mechanisms

7 7 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College Express Logical Model in Target DBMSd essentially covered in the first term design and implement base relations analyse and document derived data design enterprise constraints

8 8 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College Design Base Relations are domains supported? are attribute constraints supported? NOT NULL, UNIQUE are keys supported? candidate primary alternate (UNIQUE + NOT NULL) foreign referential integrity FK rules

9 9 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College Analyse and Document Derived Data derived attributes not present in the relational model; but present in the EER model it is possible that not all derived attributes are documented thus far then, this issue can be given further consideration at this point trade-off calculate a derived attribute each time it is being used may be too time-expensive store it in the database redundancy, therefore more space required (space-expensive) and possibility of inconsistencies to maintain consistency – integrity constraints of active rules; now this may become time-expensive

10 10 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College Design Enterprise Constraints support for integrity constraints e.g., SQLs CONSTRAINT … CHECK … NOT EXISTS … support for active rules or triggers e.g., Postgres: CREATE RULE … ON UPDATE TO … WHERE … DO … example?

11 11 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College Design Optimal Storage criteria maximise transaction throughput maximise response time minimise storage space (they determine and are determined by the system resources) issues transactions analysis file organisation and indexes

12 12 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College 2

13 13 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College Analyse Transactions identify critical transactions to the functionality of the information system e.g., transactions that should never fail identify transactions that put a significant load on the DBMS run frequently processing time is high identify the periods when the database system is heavily used (down to individual transactions) identify type of users by whom or locations from where the database is going to be heavily used

14 14 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College Map Transactions [Paths] to Relations identifies mostly used relations matrix transactions relations elements of the matrix: Y/N (i.e. used, not used) number of accesses (per hour, day …) rel 1 rel 2 rel 3 rel 4 rel 5 T1T2T3T4 I R U D I R U D I R U D I R U D

15 15 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College Determine Frequency Information determine average, maximum and minimum number of times a transaction runs per hour, per day, … transaction usage map determine peak periods determine demanding transactions that have in common some of the resources they access problems due to locking

16 16 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College Bookings for Personal Tutor (A) check availability see if tutor is available at a specific time (B) see my appointments list all my appointments for given period (C) see details of my appointments list all my appointments, including the tutors nameand office Tutor name office Student name programme Booking day time topic With For (A) (B) (C)

17 17 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College Transaction Usage Maps frequency: per hour Tutor 20 With For (A) (B) (C) Student 300 Booking 1500 avg: 20 max: * 1 avg: 40 max: 150 avg: 20 max: 100

18 18 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College Analyse Data Usage more detailed analysis (relations) attributes and the type of access updated attributes – not candidates for access structures attributes used in predicates are candidates for access structures attributes involved in joins candidates for access structures attributes affecting performance of critical transactions higher priority for access structures transaction analysis form refer to Connolly, p.490

19 19 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College Analyse Transactions - Conclusion essentially, the transaction analysis identifies the critical aspects related to the usage of the database (e.g., relations used frequently, attributes involved in expensive predicates, …) on its basis, file organisations and indexes can be chosen

20 20 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College 3

21 21 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College File Organisation data has to be stored in an efficient way all 4 operations (insert, retrieve, update and delete) require efficiency efficient has different meanings in different contexts different storage structures or file organisations represent efficient ways for different contexts; e.g. heap structure is suitable for bulk-loading and bulk-retrieval (all student names, all programmes, …) hash structure is suitable for exact match queries (student name = Joe Bloggs) note that a structure that is efficient in one context may not be efficient in a different context

22 22 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College Index a structure that allows the DBMS to locate records in a table/file more quickly the decision as of which attributes to be chosen as indexes, and which type of indexes they should be (which type of file organisation) … is determined by the results of the transaction analysis

23 23 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College File Organisation vs Index file organisation – method of storing data on disk, with our without the use of indexes index – data structure used to access records more quickly primary and clustering index – part of the storage of the actual records; the records themselves are physically ordered according to the index secondary index – auxiliary data structure; the records may be (usually are) unordered according to the index; index is sometimes used to mean secondary index the issue of indexes is subsumed by the issue of file organisation

24 24 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College File Structures/Types Heap / unordered Index Sequential Access Method Hash B + tree

25 25 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College Heap / Unordered records are written in the file in the same way as they are inserted, at the end of the file insertion is efficient, in particular for bulk-loading there is no ordering retrieval is very inefficient if it involves predicates/conditions similarly update and deletion the space freed by deleted records is not automatically reused administrator has to run routines

26 26 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College Indexed Sequential Access (ISAM) / Ordered records are ordered on the basis of some attributes – the index field primary index – ordering attributes are a key one index value to a tuple clustering index – ordering attributes are not a key one index value to a group of tuples sequentially ordered secondary indexes can also be created what is the difference be between a clustering index and a secondary sequential index?

27 27 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College Use of ISAM Files recommended exact matching (based on the index field) range of values (based on the index field) drawback ISAM indexes are static (created when the file is created) not recommended updates to index field the access key sequence deteriorates

28 28 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College Hash (random or direct access) a hash function calculates the address where each record is to be stored the calculation is performed on the basis of some fields records appear to be randomly distributed across the file space the function should be chosen such that it leads to an as good as possible distribution of the records in the available space problem : most hashing functions do not guarantee a unique address, because the file space much smaller than possible values of hash field address generated by hash function BUCKET (with SLOTS) COLLISION (SYNONYMS) same bucket, different slots hash attributes – secondary index

29 29 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College Use of Hash Files recommended for retrievals based on exact matches, in particular when the access order (the order in which queries arise) is random not recommended retrieval based on pattern match retrieval based on ranges of values retrieval based on other fields than the exact hash filed when the hash field is frequently updated

30 30 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College B + Tree B Balanced tree more versatile than the hash structure details – optional issue recommended exact match (index filed) pattern matching (index filed) range of values (index filed) part index filed specification advantage B + tree is dynamic – it grows as the relation grows performance does not deteriorate with updates B + tree – secondary index

31 31 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College Choosing Indexes consider results from transactions analysis primary/clustering index attribute(s) used in joins attribute(s) most often used to access the relation secondary indexes trade-off: maintenance of an index vs. efficiency of queries work in class on maintenance operations choosing secondary indexes index primary key (if not already a primary index) index attributes often involved in joins and selection criteria do not index attributes which are frequently updated …

32 32 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College File Organisation - Conclusion pre-defined file-structures exists that provide better efficiency of certain database operations in certain contexts

33 33 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College –

34 34 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College Conclusions physical design what it consists of transaction analysis identifies hot-spots of the database file organisation and indexes make work with the hot-spots more efficient


Download ppt "1 Term 2, 2004, Lecture 5, Physical DesignMarian Ursu, Department of Computing, Goldsmiths College Physical Design 3."

Similar presentations


Ads by Google