File Organization Lecture 1


1 File Organization Lecture 1
Introduction to the Design and Specification of File Structures File Organization

2 Previous Lecture

3 Lecture 0 Course Outline
Course Aims and Objectives. Course Contents. Course Textbook and Schedule. Course Link

4 Today's Lecture

5 Lecture Objectives
Introduce the primary design issues that characterize file structure design. Survey the history of file structures. Introduce the notions of file structure literacy and of a conceptual toolkit for file structure design. Discuss the need for precise specification of data structures and operations. Develop an object-oriented toolkit that makes file structures easy to use.

6 Lecture Contents
The heart of file structure design. A short history of file structure design. A conceptual toolkit: File structure literacy. An object-oriented toolkit: Making file structure usable.

7 The heart of file structure design
Section 1.1 The heart of file structure design

8 File Structure Definition & Functions
Definition: a combination of representations for data in files and of operations for accessing the data. Functions: a file structure allows applications to read, write, and modify data. It might also support finding the data that matches some search criteria, or reading through the data in some particular order.

9 File Structure Design: Need of Study Data Storage
Computer data can be stored in three kinds of locations:
Primary storage: computer memory.
Secondary storage: online disk, tape, or CD-ROM ("accessed by the computer").
Tertiary storage: archival data on offline disk, tape, or CD-ROM ("not accessed by the computer").

10 File Structure Design: Need of Study Memory versus Secondary Storage
Secondary storage such as disks can pack thousands of megabytes into a small physical space, while computer memory (RAM) is limited. Compared to memory, however, access to secondary storage is extremely slow. Getting information from slow RAM takes 0.00000012 seconds (= 120 nanoseconds), while getting information from disk takes 0.03 seconds (= 30 milliseconds). Roughly, 20 seconds on RAM ≈ 58 days on disk.
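The "20 seconds ≈ 58 days" comparison can be checked with a few lines of arithmetic. This is only a sketch using the two access times quoted on the slide; the constant and function names are illustrative.

```python
# Access times quoted on the slide (illustrative constant names).
RAM_ACCESS_S = 120e-9   # 120 nanoseconds
DISK_ACCESS_S = 30e-3   # 30 milliseconds

SECONDS_PER_DAY = 24 * 60 * 60


def slowdown_ratio():
    """How many times slower one disk access is than one RAM access."""
    return DISK_ACCESS_S / RAM_ACCESS_S


def disk_equivalent_days(ram_seconds):
    """Scale a span of RAM time by the disk/RAM ratio and express it in days."""
    return ram_seconds * slowdown_ratio() / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"disk is about {slowdown_ratio():,.0f} times slower than RAM")
    print(f"20 s of RAM work is about {disk_equivalent_days(20):.1f} days at disk speed")
```

With these figures the disk is about 250,000 times slower, which is where the "58 days" on the slide comes from.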

11 File Structure Design: Need of Study Improve Secondary Storage Access Time
The representation of the data + the implementation of the operations ⇒ the efficiency of the file structure for particular applications.

12 File Structure Design General Goals
Get the information we need with one access to the disk. If that’s not possible, then get the information with as few accesses as possible. Group information so that we are likely to get everything we need with only one trip to the disk.

13 File Structure Design Fixed versus Dynamic Files
It is relatively easy to come up with file structure designs that meet the general goals when the files never change. When files grow or shrink as information is added and deleted, it is much more difficult.

14 A short history of file structure design
Section 1.2 A short history of file structure design

15 Early Work
Early work assumed that files were on tape. Access was sequential, and the cost of access grew in direct proportion to the size of the file.

16 The emergence of Disks and Indexes
As files grew very large, unaided sequential access was not a good solution. Disks allowed for direct access. Indexes made it possible to keep a list of keys and pointers in a small file that could be searched very quickly. With the key and pointer, the user had direct access to the large, primary file.
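A minimal sketch of the key-and-pointer idea, assuming newline-terminated `key|value` records and an in-memory dictionary standing in for the small index file. The record layout and the names `build_file_and_index` and `fetch` are assumptions made for illustration, not something the lecture specifies.

```python
import io


def build_file_and_index(records):
    """Write records to a byte stream and remember where each one starts.

    The BytesIO object stands in for the large primary file on disk;
    the dict stands in for the small index of keys and pointers.
    """
    data = io.BytesIO()
    index = {}
    for key, value in records:
        index[key] = data.tell()                 # pointer = byte offset of the record
        data.write(f"{key}|{value}\n".encode("utf-8"))
    return data, index


def fetch(data, index, key):
    """Look the key up in the index, then jump straight to the record."""
    data.seek(index[key])                        # direct access, no sequential scan
    line = data.readline().decode("utf-8").rstrip("\n")
    _stored_key, value = line.split("|", 1)
    return value


if __name__ == "__main__":
    data, index = build_file_and_index(
        [("k1", "alpha"), ("k2", "beta"), ("k3", "gamma")]
    )
    print(index)
    print(fetch(data, index, "k2"))
```

Searching happens in the small index; the primary file is touched only once, at the offset the index returns.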

17 The emergence of Tree Structures
As indexes also have a sequential flavor, they too became difficult to manage once they grew too large. The idea of using tree structures to manage the index emerged in the early 1960s. However, trees can grow very unevenly as records are added and deleted, resulting in long searches that require many disk accesses to find a record.

18 Balanced Trees
In 1963, researchers came up with the idea of AVL trees for data in memory. In an AVL tree, the heights of the two child subtrees of any node differ by at most one. The structure is named after its two inventors, G.M. Adelson-Velsky and E.M. Landis. AVL trees, however, did not apply to files, because they work well when tree nodes are composed of single records rather than dozens or hundreds of them.

19 Balanced Trees
In the 1970s came the idea of B-trees. In B-trees, internal nodes can have a variable number of child nodes within some pre-defined range. When data is inserted into or removed from a node, its number of child nodes changes. B-trees can guarantee that one can find one file entry among millions of others with only 3 or 4 trips to the disk.
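The "3 or 4 trips" figure follows from how quickly a wide tree fans out. The sketch below is a rough estimate only: it assumes every node has the same number of children, and the fan-out values 100 and 32 are illustrative choices, not figures from the lecture.

```python
def levels_needed(num_entries, fanout):
    """Smallest number of tree levels whose fan-out can reach num_entries.

    Assumes every node has exactly `fanout` children, so a tree with
    L levels can distinguish fanout ** L entries. Each level costs
    roughly one disk access during a search.
    """
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    levels = 1
    reachable = fanout
    while reachable < num_entries:
        reachable *= fanout
        levels += 1
    return levels


if __name__ == "__main__":
    for fanout in (2, 32, 100):
        print(fanout, levels_needed(1_000_000, fanout))
```

With a fan-out of 2 (a binary tree) a million entries need about 20 levels, while fan-outs of 32 or 100 bring that down to 4 or 3, which is why B-tree nodes hold many keys each.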

20 Hash Tables
Retrieving entries in 3 or 4 accesses is good, but it does not reach the goal of accessing data with a single request. From early on, hashing was a good way to reach this goal with files that do not change size greatly over time. More recently, extendible dynamic hashing has made it possible to guarantee one or at most two disk accesses no matter how big a file becomes.
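As a rough illustration of why hashing can reach a record in one access, the sketch below maps each key directly to a bucket address. It shows plain static hashing with a fixed number of buckets, not extendible hashing; the class name `StaticHashFile`, the use of CRC-32 as the hash function, and the access counter are all assumptions made for the example.

```python
import zlib


class StaticHashFile:
    """Toy static hash file: a fixed number of buckets, one 'disk access' per lookup."""

    def __init__(self, num_buckets):
        self.buckets = [[] for _ in range(num_buckets)]
        self.accesses = 0  # counts simulated bucket reads

    def _address(self, key):
        # Deterministic hash of the key, folded into the bucket range.
        return zlib.crc32(key.encode("utf-8")) % len(self.buckets)

    def insert(self, key, value):
        self.buckets[self._address(key)].append((key, value))

    def lookup(self, key):
        self.accesses += 1  # exactly one bucket is read, whatever the file size
        for stored_key, value in self.buckets[self._address(key)]:
            if stored_key == key:
                return value
        return None


if __name__ == "__main__":
    f = StaticHashFile(num_buckets=8)
    for i in range(100):
        f.insert(f"key{i}", i)
    print(f.lookup("key42"), f.accesses)
```

The catch is that the bucket count is fixed: as the file grows, buckets fill up and overflow, which is the problem the dynamic schemes mentioned on the slide address.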

21 A conceptual toolkit: File structure literacy
Section 1.3 A conceptual toolkit: File structure literacy

22 Conceptual tools For File Structure Design
Basic access methods: sequential access, direct access, and tree structures. Decrease the number of disk accesses by collecting data into buffers, blocks, or buckets. Manage their growth by splitting them. Find a way to increase our address or index space. Find new ways to combine the basic tools.
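The effect of collecting records into blocks can be sketched with simple counting. The model below assumes one disk access per block read and a sequential pass over the whole file; the function name and the sample sizes are illustrative.

```python
def disk_accesses(num_records, records_per_block):
    """Accesses needed to read every record when each access fetches one block."""
    if records_per_block < 1:
        raise ValueError("records_per_block must be at least 1")
    # Ceiling division: a partly filled last block still costs one access.
    return -(-num_records // records_per_block)


if __name__ == "__main__":
    print(disk_accesses(1000, 1))    # one record per access
    print(disk_accesses(1000, 50))   # fifty records per block
```

Reading 1000 records one at a time costs 1000 accesses under this model, while blocks of 50 records cut that to 20.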

23 An object-oriented toolkit: Making file structure usable
Section 1.4 An object-oriented toolkit: Making file structure usable

24 Object-Oriented Toolkit For File Structure Design
Making file structures usable requires application programming interfaces. In an object-oriented approach, data types and operations are defined as classes.

25 Object-Oriented Toolkit Difficulties
Describing the classes for file structure design is: Progressive: new classes are often modifications or extensions of other classes. Complicated: the details of the data representations and operations become more complex.

26 Next Lecture

27 Fundamental File Processing Operations
Physical and logical file. Opening and closing files. Reading and writing. Seeking. Special Characters in files. Physical devices and logical files. File-related header files.

28 Questions?

