File Organization and Storage Structures Chapter 5.


1 File Organization and Storage Structures Chapter 5


3 Basic Concepts The database on secondary storage is organized into one or more files, where each file consists of a number of records. Each record consists of one or more fields. Typically, a record corresponds to an entity and a field to an attribute. The physical record is the unit of transfer between disk and primary storage. A physical record, sometimes called a block or page, usually contains several logical records, depending on the size of the records.
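To make the relation between logical and physical records concrete, here is a minimal sketch. It assumes fixed-length records that never span two blocks; the function names and the sizes in the usage note are illustrative and do not come from the slides.

```python
import math


def blocking_factor(block_size: int, record_size: int) -> int:
    """Whole fixed-length logical records that fit in one physical record (block)."""
    if record_size <= 0 or record_size > block_size:
        raise ValueError("record must be non-empty and fit in one block")
    return block_size // record_size


def blocks_needed(num_records: int, block_size: int, record_size: int) -> int:
    """Physical records needed to hold num_records logical records."""
    return math.ceil(num_records / blocking_factor(block_size, record_size))
```

For example, with an assumed 4096-byte block and 100-byte records, `blocking_factor(4096, 100)` gives 40 records per block, so 1000 records need 25 blocks.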

4 List structures: elementary list, singular list, circular list, symmetric list, symmetric circular list

5 Sequential insertion
Before: X(1) X(2) X(3) X(4) | free zone
After inserting Y: X’(1)=X(1) X’(2)=Y X’(3)=X(2) X’(4)=X(3) X’(5)=X(4) | free zone
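The shifting shown on this slide can be sketched as follows. This is only an illustration: the storage area is modelled as a Python list with `None` marking the free zone, and `sequential_insert` is a hypothetical helper name.

```python
def sequential_insert(area, count, position, value):
    """Insert value at 0-based position among the first `count` used slots.

    Every element from `position` onward moves one slot to the right,
    consuming one slot of the free zone. Returns the new element count.
    """
    if count >= len(area):
        raise OverflowError("free zone is exhausted")
    if not 0 <= position <= count:
        raise IndexError("position outside the used part of the area")
    for i in range(count, position, -1):  # shift right, starting from the end
        area[i] = area[i - 1]
    area[position] = value
    return count + 1


area = ["X1", "X2", "X3", "X4", None, None]  # None = free zone
count = sequential_insert(area, 4, 1, "Y")
```

The cost is proportional to the number of elements behind the insertion point, which is what the pointer technique on the next slide avoids.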

6 Insertion with pointer technique
Elements shown: X(1) X(3) X(2) X(4) Y
Logical order after inserting Y: X’(1)=X(1) X’(2)=Y X’(3)=X(2) X’(4)=X(3) X’(5)=X(4)
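A minimal sketch of the pointer technique, assuming the list is kept in two parallel arrays (`data` and `nxt`, both hypothetical names) with -1 as the end-of-list marker. The new element goes into any free slot and only two pointers change; nothing is shifted.

```python
END = -1  # assumed end-of-list marker


def insert_after(data, nxt, free_slot, pred, value):
    """Store value in free_slot and link it directly after slot pred."""
    data[free_slot] = value
    nxt[free_slot] = nxt[pred]  # new element points to the old successor
    nxt[pred] = free_slot       # predecessor now points to the new element


def traverse(data, nxt, head):
    """Return the elements in logical (pointer) order."""
    out, i = [], head
    while i != END:
        out.append(data[i])
        i = nxt[i]
    return out


data = ["X1", "X2", "X3", "X4", None]
nxt = [1, 2, 3, END, END]
insert_after(data, nxt, free_slot=4, pred=0, value="Y")
```

After the call, Y sits physically in the last slot, while `traverse(data, nxt, 0)` yields it in second position.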

7 Multi-list structure record with pointer record length 10 address list1 list2 list empty places 2000 3000 2020 2030 2050 2040 2010 2000 2060 3000...... A B K L

8 Insertion at beginning of list 2 list1 list2 2000 3000 2020 2030 2010 2050 2040 2000 2060 3000...... A B K L M List1: A B List2: M K L
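The insertion at the head of list 2 can be sketched like this. The addresses and pointer layout below are illustrative assumptions, not a reconstruction of the slide's diagram; the list of empty places is modelled as a third chain named `free`.

```python
def insert_at_head(store, heads, list_name, value):
    """Take the first empty place and make it the new head of list_name."""
    slot = heads["free"]
    if slot is None:
        raise MemoryError("no empty places left")
    heads["free"] = store[slot]["next"]  # unlink the slot from the free chain
    store[slot] = {"value": value, "next": heads[list_name]}
    heads[list_name] = slot
    return slot


def walk(store, head):
    """Return the values of one list in pointer order."""
    out = []
    while head is not None:
        out.append(store[head]["value"])
        head = store[head]["next"]
    return out


# Hypothetical layout: records of length 10 starting at address 2000.
store = {
    2000: {"value": "A", "next": 2010},
    2010: {"value": "B", "next": None},
    2020: {"value": "K", "next": 2030},
    2030: {"value": "L", "next": None},
    2040: {"value": None, "next": 2050},  # empty place
    2050: {"value": None, "next": None},  # empty place
}
heads = {"list1": 2000, "list2": 2020, "free": 2040}
insert_at_head(store, heads, "list2", "M")
```

After the call, list 1 is still A B and list 2 reads M K L, matching the slide's outcome.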

9 General tree structure A B C DE FHJ KL MNPQR

10 Equivalent binary tree structure A B C DE F H JKL Q R M N P
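One common way to obtain an equivalent binary tree is the first-child / next-sibling encoding, sketched below. Whether the slide uses exactly this encoding cannot be confirmed from the transcript, and the small tree in the example is made up because the slide's figure did not survive extraction.

```python
def to_binary(root, children):
    """Convert a general tree into (left, right) links per node.

    children maps a node label to its ordered list of child labels.
    left  = first child in the general tree
    right = next sibling in the general tree
    """
    binary = {}

    def visit(node, next_sibling):
        kids = children.get(node, [])
        binary[node] = (kids[0] if kids else None, next_sibling)
        for i, kid in enumerate(kids):
            visit(kid, kids[i + 1] if i + 1 < len(kids) else None)

    visit(root, None)
    return binary


# Hypothetical general tree: A has children B, C, D; B has E, F; D has H.
general = {"A": ["B", "C", "D"], "B": ["E", "F"], "D": ["H"]}
binary = to_binary("A", general)
```

Every node ends up with at most two pointers, whatever its number of children in the general tree.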

11 Pointer Implementation A BC DE F H JKL QR MNP

12 Bi-directional tree X Y R S Z U T Entry -1 X Y -1 R -1 S -1 Z -1 U -1 T - first lower - higher - next

13 Ring structure X Y Z U V T R Entry X Y ZU V T R

14 File Organization
- The physical arrangement of data into records and pages on secondary storage
- Main types: heap (unordered), sorted, hash
Access method
- The steps involved in storing and retrieving records from a file

15 Sample Data: SUPPLIER file
SNUM  SNAME      STATUS  CITY
S1    De Smet    20      London
S2    Janssens   10      Paris
S3    Blanchart  30      Paris
S4    Clark      20      London
S5    Adams      30      Athens

16 Hash Files
Slots: 0 1 2 3 4 5 6 7 8 9 10 11 12
Records shown in the slots: S300 Blanchart 30 Paris; S200 Janssens 10 Paris; S500 Adams 30 Athens; S100 De Smet 20 London; S400 Clark 20 London
Hashing techniques
Duplicate handling
- open addressing
- unchained overflow
- chained overflow
- multiple hashing
Hashing algorithms
- folding
- mid-square
- division by prime number
Limitations
- unsuitable for retrieval by value range
- no support for retrieval on the non-hash fields
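The following sketch combines two of the techniques named on this slide: division by a prime number as the hashing algorithm and open addressing for duplicates. It assumes the divisor is 13 because the slide shows slots 0 to 12, uses the numeric part of the supplier number as the key, and picks linear probing as one simple form of open addressing; none of these choices is stated on the slide.

```python
NUM_SLOTS = 13  # assumed: prime divisor matching slots 0..12


def hash_division(key, prime=NUM_SLOTS):
    """Home slot = remainder of the key divided by a prime."""
    return key % prime


def insert(table, key, record):
    """Place the record in its home slot, or in the next free slot after it."""
    start = hash_division(key, len(table))
    for step in range(len(table)):
        slot = (start + step) % len(table)
        if table[slot] is None or table[slot][0] == key:
            table[slot] = (key, record)
            return slot
    raise OverflowError("hash file is full")


def lookup(table, key):
    """Probe from the home slot until the key or an empty slot is found."""
    start = hash_division(key, len(table))
    for step in range(len(table)):
        entry = table[(start + step) % len(table)]
        if entry is None:
            return None
        if entry[0] == key:
            return entry[1]
    return None


table = [None] * NUM_SLOTS
slots = {
    key: insert(table, key, name)
    for key, name in [(100, "De Smet"), (200, "Janssens"), (300, "Blanchart"),
                      (400, "Clark"), (500, "Adams")]
}
```

Under these assumptions the five suppliers land in slots 9, 5, 1, 10 and 6. A hypothetical sixth key 113 also hashes to 9 and would be placed in slot 11, the first free slot after 9 and 10.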

17 An Index
An index provides an ACCESS PATH to the file it is indexing
- a file may have several associated indexes
- the sequential access path is always available
- an index imposes an ordering on the file it is indexing
- it can be used for direct access
- it speeds up retrieval and slows down updating
- it is not the same thing as a key
- it can be built on combinations of fields
- it can be SRA or symbolic

18 Sample Data: SUPPLIER file
SNUM  SNAME      STATUS  CITY
S1    De Smet    20      London
S2    Janssens   10      Paris
S3    Blanchart  30      Paris
S4    Clark      20      London
S5    Adams      30      Athens

19 Supplier file with index on city
Supplier file
SNUM  SNAME      STATUS  CITY
S1    De Smet    20      London
S2    Janssens   10      Paris
S3    Blanchart  30      Paris
S4    Clark      20      London
S5    Adams      30      Athens
City-index
Athens -> S5
London -> S1, S4
Paris  -> S2, S3
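A dense index like the city index can be sketched as a sorted mapping from each field value to the positions of the matching records. The tuple layout and the name `build_index` are assumptions made for this illustration; list positions stand in for stored record addresses.

```python
SUPPLIERS = [
    ("S1", "De Smet", 20, "London"),
    ("S2", "Janssens", 10, "Paris"),
    ("S3", "Blanchart", 30, "Paris"),
    ("S4", "Clark", 20, "London"),
    ("S5", "Adams", 30, "Athens"),
]
CITY = 3  # column position of the CITY field


def build_index(records, field):
    """Map every value of the field to the positions of records holding it."""
    index = {}
    for position, record in enumerate(records):
        index.setdefault(record[field], []).append(position)
    return dict(sorted(index.items()))  # entries kept in index order


city_index = build_index(SUPPLIERS, CITY)
paris_suppliers = [SUPPLIERS[p][0] for p in city_index["Paris"]]
```

Retrieving the Paris suppliers touches only the two positions listed under "Paris" instead of scanning the whole file.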

20 Supplier file with two indexes
Status-index
10 -> S2
20 -> S1, S4
30 -> S3, S5
Supplier file
SNUM  SNAME      STATUS  CITY
S1    De Smet    20      London
S2    Janssens   10      Paris
S3    Blanchart  30      Paris
S4    Clark      20      London
S5    Adams      30      Athens
City-index
Athens -> S5
London -> S1, S4
Paris  -> S2, S3

21 Non-dense index
SNUM-index
S2 -> block 1
S4 -> block 2
S5 -> block 3
Supplier file
         SNUM  SNAME      STATUS  CITY
block 1  S1    De Smet    20      London
         S2    Janssens   10      Paris
block 2  S3    Blanchart  30      Paris
         S4    Clark      20      London
block 3  S5    Adams      30      Athens
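A lookup through a non-dense index can be sketched as below, assuming that each index entry holds the highest SNUM in its block and that the file is sorted on SNUM. Blocks are numbered from 0 here, while the slide counts from 1, and `find` is a hypothetical helper name.

```python
from bisect import bisect_left

BLOCKS = [
    [("S1", "De Smet"), ("S2", "Janssens")],
    [("S3", "Blanchart"), ("S4", "Clark")],
    [("S5", "Adams")],
]
# One index entry per block: the highest key stored in that block.
INDEX_KEYS = [block[-1][0] for block in BLOCKS]


def find(snum):
    """Return (block number, name) for snum, or None if it is not stored."""
    block_no = bisect_left(INDEX_KEYS, snum)  # first block whose high key >= snum
    if block_no == len(BLOCKS):
        return None
    for key, name in BLOCKS[block_no]:  # scan inside the single chosen block
        if key == snum:
            return block_no, name
    return None
```

Because the index has no entry for S3, an existence test cannot be answered from the index alone; the block itself has to be read, which is the price of the smaller index.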

22 Factoring out a field
Supplier file
SNUM  SNAME      STATUS  CITY-pointer
S1    De Smet    20      -> London
S2    Janssens   10      -> Paris
S3    Blanchart  30      -> Paris
S4    Clark      20      -> London
S5    Adams      30      -> Athens
CITY-file
CITY
Athens
London
Paris

23 Combining indexing and factoring out
Supplier records: S1 De Smet 20; S2 Janssens 10; S3 Blanchart 30; S4 Clark 20; S5 Adams 30
City entries: Athens, London, Paris

24 Parent - Child structure
CITY file: Athens, London, Paris
SUPPLIER file: S1 De Smet 20; S2 Janssens 10; S3 Blanchart 30; S4 Clark 20; S5 Adams 30

25 Fully inverted file
SNAME-index     STATUS-index    CITY-index          Supplier-file
De Smet   S1->  10 S2->         Athens S5->         S1
Janssens  S2->  20 S1->,S4->    London S1->,S4->    S2
Blanchart S3->  30 S3->,S5->    Paris  S2->,S3->    S3
Clark     S4->                                      S4
Adams     S5->                                      S5
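In a fully inverted file every field has its own index, so a query on several fields can be answered by intersecting pointer lists. The sketch below assumes records held as dictionaries and uses SNUM values as symbolic pointers; `invert` and `select` are hypothetical names.

```python
SUPPLIERS = [
    {"SNUM": "S1", "SNAME": "De Smet", "STATUS": 20, "CITY": "London"},
    {"SNUM": "S2", "SNAME": "Janssens", "STATUS": 10, "CITY": "Paris"},
    {"SNUM": "S3", "SNAME": "Blanchart", "STATUS": 30, "CITY": "Paris"},
    {"SNUM": "S4", "SNAME": "Clark", "STATUS": 20, "CITY": "London"},
    {"SNUM": "S5", "SNAME": "Adams", "STATUS": 30, "CITY": "Athens"},
]


def invert(records, key_field, fields):
    """Build one index per field: value -> set of key_field pointers."""
    inverted = {field: {} for field in fields}
    for record in records:
        for field in fields:
            inverted[field].setdefault(record[field], set()).add(record[key_field])
    return inverted


def select(inverted, **conditions):
    """Return the pointers of records matching all field=value conditions."""
    pointer_sets = [inverted[f].get(v, set()) for f, v in conditions.items()]
    return set.intersection(*pointer_sets) if pointer_sets else set()


inverted = invert(SUPPLIERS, "SNUM", ["SNAME", "STATUS", "CITY"])
```

For instance, `select(inverted, CITY="Paris", STATUS=30)` intersects {S2, S3} with {S3, S5} and leaves S3, without reading any supplier record.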

26 File organization: Indexed-sequential
Multi-level index blocks
Top level: Behr Dooms Fagin
Second level: Adams Albert Behr Bodoo Claes Codd Dooms Ernest Fagin
Data blocks: Ace Adamo Adams Ademar Aerts Alan Albert Alois Ball Behr Bens Bodoo
Parameters
- index block size
- data block size

27 B-tree concept: BALANCED tree
non-dense index:
[25 144]
[9 -] [64 100] [196 -]
dense index:
[1 4 -] [9 16 -] [25 36 49] [64 81 -] [100 121 -] [144 169 -] [196 225 256]
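A search through the tree on this slide can be sketched as follows. The nesting below is a reading of the flattened figure and should be treated as an assumption: index nodes are (keys, children) tuples, the lowest level is a list of keys, and a key equal to a separator is taken to live in the subtree to its right.

```python
from bisect import bisect_right

# Index nodes are (keys, children) tuples; lowest-level nodes are plain lists.
TREE = ([25, 144], [
    ([9], [[1, 4], [9, 16]]),
    ([64, 100], [[25, 36, 49], [64, 81], [100, 121]]),
    ([196], [[144, 169], [196, 225, 256]]),
])


def search(node, key):
    """Return (found, number of nodes visited) for key."""
    visited = 1
    while isinstance(node, tuple):
        keys, children = node
        node = children[bisect_right(keys, key)]  # pick the covering subtree
        visited += 1
    return key in node, visited
```

Because the tree is balanced, every search visits exactly three nodes here, whether the key is present or not.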

28 B-tree insertion: same B-tree after insertion of record 32
non-dense index:
[64 -]
[25 -] [144 -]
[9 -] [36 -] [100 -] [196 -]
dense index:
[1 4 -] [9 16 -] [25 32 -] [36 49 -] [64 81 -] [100 121 -] [144 169 -] [196 225 256]
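The splits triggered by inserting 32 can be traced with the sketch below. It assumes capacities of two keys per index node and three keys per lowest-level node, which is how the figure reads but is not stated on the slide; the class and function names are illustrative, and this is one possible split policy rather than the deck's own algorithm.

```python
from bisect import bisect_right, insort

MAX_INDEX_KEYS = 2  # assumed capacity of an index node
MAX_LEAF_KEYS = 3   # assumed capacity of a lowest-level node


class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children  # None marks a lowest-level node


def _insert(node, key):
    """Insert below node; return (separator, new right sibling) if node split."""
    if node.children is None:
        insort(node.keys, key)
        if len(node.keys) <= MAX_LEAF_KEYS:
            return None
        mid = len(node.keys) // 2
        right = Node(node.keys[mid:])
        node.keys = node.keys[:mid]
        return right.keys[0], right  # separator is copied up
    i = bisect_right(node.keys, key)
    split = _insert(node.children[i], key)
    if split is None:
        return None
    separator, right_child = split
    node.keys.insert(i, separator)
    node.children.insert(i + 1, right_child)
    if len(node.keys) <= MAX_INDEX_KEYS:
        return None
    mid = len(node.keys) // 2
    up = node.keys[mid]  # middle key moves up
    right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys, node.children = node.keys[:mid], node.children[:mid + 1]
    return up, right


def insert_key(root, key):
    """Insert key and return the (possibly new) root."""
    split = _insert(root, key)
    if split is None:
        return root
    separator, right = split
    return Node([separator], [root, right])  # tree grows by one level


def levels(root):
    """List the keys of every node, level by level."""
    out, level = [], [root]
    while level:
        out.append([n.keys for n in level])
        level = [c for n in level if n.children for c in n.children]
    return out


def leaf(*keys):
    return Node(list(keys))


root = Node([25, 144], [
    Node([9], [leaf(1, 4), leaf(9, 16)]),
    Node([64, 100], [leaf(25, 36, 49), leaf(64, 81), leaf(100, 121)]),
    Node([196], [leaf(144, 169), leaf(196, 225, 256)]),
])
root = insert_key(root, 32)
```

The full node [25 36 49] splits, its parent [64 100] overflows and splits in turn, and finally the root splits, so the tree gains a level while staying balanced.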

29 B-tree deletion: deletion of 64
non-dense index:
[25 81]
[9 -] [36 -] [144 196]
lowest level:
[1 4 -] [9 16 -] [25 32 -] [36 49 -] [81 100 121] [144 169 -] [196 225 256]

