Published by Letitia Evans. Modified over 9 years ago.
1
22-09-2007 NOEA/IT FEN Databases/PhysicalDB 1
Physical DB: on hardware, disc etc.
File Structures
Hashing
Index Structures
Search Trees and B-trees revisited
Query Processing and Optimisation
2
Storage Hierarchy
Internal storage:
–Static RAM (cache)
–Dynamic RAM (main memory)
External memory (secondary storage):
–Flash memory (memory sticks)
–Hard disc
–DVD
–CD-ROM
–Tape
–Floppy disc
3
Hard Disc Drive Principle (Fig. 5.1)
4
Sorted Files
Fixed record size (direct access)
Records are kept sorted on a key field
Binary search may be applied:
  examine middle element
  IF (NOT found)
    IF (search_element.key > middle_element.key)
      search upper half of the file
    ELSE
      search lower half of the file
Search takes about log2(n) record accesses, against n/2 on average for a linear scan (n = number of records); for example log2(1024) = 10
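The binary search outlined on this slide can be sketched as follows. This is only an illustration, assuming the records sit in an in-memory Python list sorted on the key rather than in fixed-size blocks on disc; the names `binary_search` and `records` are made up for the example.

```python
def binary_search(records, key):
    """Search a list of (key, payload) tuples sorted on key.

    Returns (index, probes); index is None if the key is absent.
    """
    low, high = 0, len(records) - 1
    probes = 0
    while low <= high:
        middle = (low + high) // 2
        probes += 1                      # one record access
        middle_key = records[middle][0]
        if middle_key == key:
            return middle, probes
        if key > middle_key:
            low = middle + 1             # search upper half
        else:
            high = middle - 1            # search lower half
    return None, probes


# 1024 records with even keys 0, 2, ..., 2046 (hypothetical data)
records = [(k, f"record-{k}") for k in range(0, 2048, 2)]
index, probes = binary_search(records, 1000)
```

With 1024 records no lookup in this sketch needs more than 11 probes, whereas a linear scan examines 512 records on average.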
5
Hashing
Compute a direct-access index from a key
IF (collision) insert in a list
Average number of probes per lookup (loosely, the "number of collisions"):
–1/(1-LF), where LF is the load factor as a decimal fraction. For instance, if 80% of the entries are in use, LF = 0.8 and the average is 1/(1-0.8) = 1/0.2 = 5
–Strictly, 1/(1-LF) is the standard estimate for an unsuccessful search when a collision is resolved by probing for another free slot (open addressing); with collision lists the average list length is simply LF
Note: the estimate does not depend on the size of the hash table, only on the load factor
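A small sketch of the two ideas on this slide: a table that puts colliding keys in a list, and the 1/(1-LF) estimate. The class `ChainedHashTable` and the function `expected_probes` are hypothetical names, and Python's built-in `hash` stands in for whatever hash function a real system would use.

```python
class ChainedHashTable:
    """Hash table where each slot holds a list of colliding entries."""

    def __init__(self, size):
        self.buckets = [[] for _ in range(size)]
        self.count = 0

    def _index(self, key):
        # Direct-access index computed from the key
        return hash(key) % len(self.buckets)

    def insert(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)   # overwrite existing entry
                return
        bucket.append((key, value))        # collision: append to the list
        self.count += 1

    def lookup(self, key):
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        return None

    def load_factor(self):
        return self.count / len(self.buckets)


def expected_probes(load_factor):
    """The slide's estimate 1/(1 - LF); only defined for 0 <= LF < 1."""
    if not 0 <= load_factor < 1:
        raise ValueError("load factor must be in [0, 1)")
    return 1.0 / (1.0 - load_factor)


table = ChainedHashTable(10)
for k in range(8):                         # 8 of 10 slots' worth of entries
    table.insert(k, f"record-{k}")
estimate = expected_probes(table.load_factor())
```

The estimate is the same for a table of 10 slots holding 8 entries as for one of 10 000 slots holding 8 000, since only the ratio enters the formula.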
6
Hashing on Disc (Fig. 5.12)
7
Index
Sorted files have problems with insertion and deletion of records:
–Records have to be moved around to keep the file sorted
No possibility of fast search on alternative keys:
–The file is sorted on SSN; how about searching on name, for instance?
Hence indexes, especially multilevel indexes
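To illustrate searching on an alternative key, the sketch below keeps the data file sorted on SSN and builds a separate, sorted index on name that points back to record positions. The sample records and the names `name_index` and `find_by_name` are invented for the example; a real index would live in disc blocks, not in a Python list.

```python
import bisect

# Data file, sorted on SSN: (ssn, name)
records = [
    ("111", "Jensen"),
    ("222", "Hansen"),
    ("333", "Nielsen"),
    ("444", "Jensen"),
]

# Secondary index: (name, position in data file), sorted on name
name_index = sorted((name, pos) for pos, (_, name) in enumerate(records))


def find_by_name(name):
    """Binary-search the index, then fetch the matching records."""
    # (name,) sorts before every (name, pos), so this finds the first match
    i = bisect.bisect_left(name_index, (name,))
    matches = []
    while i < len(name_index) and name_index[i][0] == name:
        matches.append(records[name_index[i][1]])
        i += 1
    return matches


jensens = find_by_name("Jensen")
```

The data file itself never has to be re-sorted on name; only the much smaller index is ordered that way.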
8
2 Level Index (Fig. 6.6)
9
Binary Search Tree
A Binary Tree is a tree structure which is either empty or has a root element with a left and a right sub-tree, which are themselves binary trees.
For a non-empty Binary Search Tree it also holds that:
–all elements in the left sub-tree are less than the root element
–all elements in the right sub-tree are greater than the root element
–this property holds recursively down through the tree
10
Binary Search Tree - Ex.: Insert Z1, Z2, Z3,…?
11
Binary Search Tree - Ex.:
Searching for the key value k:
–Examine the root r:
  r.key == k – got it!
  r.key < k – search the right sub-tree
  r.key > k – search the left sub-tree
Insertion of element x:
–Search down the tree to an empty position and insert x there
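The search and insertion rules on this slide translate almost directly into code. The following is a minimal sketch with invented names (`Node`, `insert`, `search`); it ignores duplicate keys, which the slide does not discuss.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Search down to an empty position and insert there; returns the root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root                          # equal keys are ignored (assumption)


def search(root, key):
    """Return the node holding key, or None if it is absent."""
    node = root
    while node is not None:
        if node.key == key:
            return node                  # got it
        # r.key < k: go right, otherwise go left
        node = node.right if node.key < key else node.left
    return None


root = None
for k in [50, 30, 70, 20, 40]:
    root = insert(root, k)
```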
12
Binary Search Tree - Efficiency
If the tree is balanced:
–Searching takes log2(n) steps (n = number of elements in the tree); e.g. n = 1024 => log2(1024) = 10 elements are accessed
But binary search trees have a tendency to become unbalanced
–(for instance if the input is sorted, or if insertions and deletions arrive in an order that is not uniformly distributed)
Rebalancing an ordinary binary search tree after the fact is expensive in running time, hence Balanced Search Trees….
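The effect of sorted input can be measured directly. The sketch below inserts the same 1023 keys into a plain (unbalanced) binary search tree, once in sorted order and once in a shuffled order, and records the resulting height. The function `tree_height` and the list-based node layout are assumptions made for brevity.

```python
import random


def tree_height(keys):
    """Insert keys into an unbalanced BST in the given order and return
    the height, counted as the number of nodes on the longest path."""
    root = None                          # a node is [key, left, right]
    height = 0
    for key in keys:
        depth = 1
        if root is None:
            root = [key, None, None]
        else:
            node = root
            while True:
                side = 1 if key < node[0] else 2
                depth += 1
                if node[side] is None:
                    node[side] = [key, None, None]
                    break
                node = node[side]
        height = max(height, depth)
    return height


keys = list(range(1023))
sorted_height = tree_height(keys)        # every key goes to the right

random.seed(1)                           # fixed seed, arbitrary choice
random.shuffle(keys)
shuffled_height = tree_height(keys)
```

Sorted input produces a chain of 1023 nodes, so a search degenerates to a linear scan. A perfectly balanced tree with 1023 nodes would have height 10; the shuffled order usually lands within a small multiple of that.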
13
Multi-way Search Trees (Fig. 6.8)
14
B-Trees (Fig. 6.10)
15
B-Trees - Principles
Node size (number of keys in a node) is adjusted according to the block size of the file system
Every node except the root is always kept at least half-full:
–If inserting an element makes a node overflow, the node is split into two and the middle element is moved up one level. If this causes overflow in the node on the next level, that node is split and its middle element is moved up, and so on recursively; in the worst case a new root is created
–If deleting an element leaves a node less than half-full, the node is merged with a sibling and the elements are redistributed between the new node and the parent. If this leaves the parent less than half-full, the process continues recursively up the tree; in the worst case the root is removed and the tree becomes one level lower
Hence a B-tree is always balanced, and searches, insertions and deletions can be performed in logarithmic time (O(log n))
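The split step is the core of B-tree insertion. The sketch below shows it for a single node only, assuming a capacity of 4 keys per node (`MAX_KEYS` is an arbitrary illustrative value, as is the function name `insert_into_node`); propagating the promoted key into the parent, and deletion, are left out.

```python
import bisect

MAX_KEYS = 4   # assumed node capacity; in practice derived from the block size


def insert_into_node(keys, key):
    """Insert key into a sorted node.

    Returns (node, None, None) if the key fits, otherwise
    (left_node, promoted_key, right_node) after a split.
    """
    keys = list(keys)                    # leave the caller's list untouched
    bisect.insort(keys, key)
    if len(keys) <= MAX_KEYS:
        return keys, None, None
    middle = len(keys) // 2
    # The middle element moves up one level; the halves become two nodes
    return keys[:middle], keys[middle], keys[middle + 1:]


no_split = insert_into_node([10, 20, 40], 30)
left, promoted, right = insert_into_node([10, 20, 30, 40], 25)
```

After the split both new nodes hold two keys, that is exactly half of the assumed capacity, so the half-full invariant still holds.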
16
B+ Trees
All data pointers are kept at the leaf level, and the leaves are chained together. Hence a total order is defined, and the records can be read in key order by following the leaf chain.
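The chained leaf level is what makes range queries cheap. The following sketch models only the leaves (the class `Leaf` and the helpers `link` and `range_scan` are invented names) and, as a simplification, starts the scan at the leftmost leaf; a real B+ tree would first descend from the root to the leaf containing the lower bound.

```python
class Leaf:
    def __init__(self, entries):
        self.entries = entries           # sorted list of (key, data_pointer)
        self.next = None                 # link to the next leaf in key order


def link(leaves):
    """Chain the leaves together and return the first one."""
    for current, following in zip(leaves, leaves[1:]):
        current.next = following
    return leaves[0]


def range_scan(first_leaf, low, high):
    """Yield every (key, data_pointer) with low <= key <= high."""
    leaf = first_leaf
    while leaf is not None:
        for key, pointer in leaf.entries:
            if key > high:
                return                   # keys only grow from here on
            if key >= low:
                yield key, pointer
        leaf = leaf.next


first = link([
    Leaf([(1, "a"), (3, "b")]),
    Leaf([(5, "c"), (7, "d")]),
    Leaf([(9, "e")]),
])
hits = list(range_scan(first, 3, 7))
```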
17
Insertion in a B+ Tree (Fig. 6.12)
18
Deletion in a B+ Tree (Fig. 6.13)
19
Query-Optimisation
Nested sub-selects are handled in separate query blocks
Each query block is transformed into an equivalent sequence of relational algebra operations, represented as a tree structure
This tree structure can be optimised using standard compiler optimisation techniques
–For instance, apply row selection before join
–Keep track of estimates of the size of intermediate (cross-table) relations
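Why selection before join pays off can be shown by counting comparisons. The sketch below evaluates the same query in both orders over made-up tables; `counting_join`, the table contents and the selection condition (employee id below 50) are all assumptions for illustration, and the join is a deliberately naive nested loop.

```python
# Hypothetical tables: (id, name, dept_id) and (dept_id, dept_name)
employees = [(i, f"name{i}", i % 10) for i in range(1000)]
departments = [(d, f"dept{d}") for d in range(10)]


def counting_join(left, right):
    """Naive join on left.dept_id == right.dept_id; counts comparisons."""
    comparisons = 0
    result = []
    for l in left:
        for r in right:
            comparisons += 1
            if l[2] == r[0]:
                result.append(l + r)
    return result, comparisons


# Plan 1: join everything, then select
joined, cost_join_first = counting_join(employees, departments)
plan1 = [row for row in joined if row[0] < 50]

# Plan 2: select first, then join the much smaller input
selected = [e for e in employees if e[0] < 50]
plan2, cost_select_first = counting_join(selected, departments)
```

Both plans return the same rows, but the first performs 10 000 comparisons in the join and the second only 500.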
20
Join Algorithms
Nested-loop or brute force
–O(|A|*|B|)
Single-loop: requires an index on at least one of the join attributes (here in table B)
–O(|A|*log(|B|))
Sort-merge: only if both tables are physically sorted on the join attribute
–minimises disc access
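The three algorithms can be compared side by side on small in-memory tables. In this sketch a row is a tuple whose first field is the join attribute, and a sorted key list searched with `bisect` stands in for the index on B; the function names and sample rows are invented.

```python
import bisect


def nested_loop_join(a, b):
    """Compare every row of A with every row of B: O(|A|*|B|)."""
    return [(x, y) for x in a for y in b if x[0] == y[0]]


def single_loop_join(a, b_sorted):
    """One pass over A with a logarithmic lookup in B: O(|A|*log(|B|))."""
    keys = [y[0] for y in b_sorted]      # stand-in for an index on B
    result = []
    for x in a:
        i = bisect.bisect_left(keys, x[0])
        while i < len(keys) and keys[i] == x[0]:
            result.append((x, b_sorted[i]))
            i += 1
    return result


def sort_merge_join(a_sorted, b_sorted):
    """Merge two inputs that are both sorted on the join attribute."""
    result = []
    i = j = 0
    while i < len(a_sorted) and j < len(b_sorted):
        key_a, key_b = a_sorted[i][0], b_sorted[j][0]
        if key_a < key_b:
            i += 1
        elif key_a > key_b:
            j += 1
        else:
            # Find the run of B rows sharing this key
            j_end = j
            while j_end < len(b_sorted) and b_sorted[j_end][0] == key_a:
                j_end += 1
            # Pair every A row with this key against the whole run
            while i < len(a_sorted) and a_sorted[i][0] == key_a:
                for y in b_sorted[j:j_end]:
                    result.append((a_sorted[i], y))
                i += 1
            j = j_end
    return result


table_a = [(1, "a1"), (2, "a2"), (2, "a2b"), (4, "a4")]
table_b = [(2, "b2"), (2, "b2b"), (3, "b3"), (4, "b4")]

by_nested_loop = nested_loop_join(table_a, table_b)
by_single_loop = single_loop_join(table_a, table_b)
by_sort_merge = sort_merge_join(table_a, table_b)
```

All three produce the same result; they differ only in how many rows they have to touch, which is what matters once the rows live on disc.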
21
Query-Optimisation
Lots of other tricks. DB2, for instance: conditions on attributes with an index are executed first.
Assume an index on lname:
SELECT * FROM Employee WHERE fname = 'Kurt' AND lname = 'Jensen'
It will be much more efficient to find 'Jensen' using the index and then 'Kurt' by linear search among the matches than the other way around
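The situation can be reproduced on a small scale with Python's built-in `sqlite3` module. Note that this uses SQLite as a stand-in, not DB2, so it says nothing about DB2's actual behaviour; the table layout, the sample rows and the index name `idx_employee_lname` are assumptions. Whether the index is really used is up to the SQLite query planner, which is why the plan is fetched for inspection rather than assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (ssn TEXT PRIMARY KEY, fname TEXT, lname TEXT)"
)
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [
        ("1", "Kurt", "Jensen"),
        ("2", "Kurt", "Hansen"),
        ("3", "Anna", "Jensen"),
        ("4", "Peter", "Nielsen"),
    ],
)
# Index on lname only, as assumed on the slide
conn.execute("CREATE INDEX idx_employee_lname ON Employee (lname)")

query = "SELECT * FROM Employee WHERE fname = ? AND lname = ?"
rows = conn.execute(query, ("Kurt", "Jensen")).fetchall()

# Ask SQLite how it intends to evaluate the query
plan = conn.execute("EXPLAIN QUERY PLAN " + query, ("Kurt", "Jensen")).fetchall()
for step in plan:
    print(step)
```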
22
Exercise
Investigate how much of the preceding material is supported by MS SQL Server and, where it is supported, how.