CMPT 300: Operating Systems I

Name: CMPT 300: Operating Systems I
Uploaded: 2017-08-27T20:28:38+00:00
Duration: PTM24S7
Description: CMPT 300: Operating Systems I

School of Computing Science Simon Fraser University CMPT 300: Operating Systems I Chapters 10, 11, 12: File System and Disk Scheduling Dr. Mohamed Hefeeda

Objectives Understand how to store and manage information on secondary storage systems Understand file system: Interface Structure Implementation Note: file system is the most visible part of the OS to users

Secondary Storage Systems
Various storage media Magnetic disks Magnetic tapes Optical disks …. Each medium has different physical characteristics Storing bits on disks is different from storing them on CDs Yet, OS provides a uniform logical view of storage to users To efficiently store, locate, and retrieve data from a storage system, OS creates one or more file systems on it

File System Challenges
File systems involve two design problems File system interface: how file system looks to users Define a file, file attributes, operations on files, and how files are organized into directories File system implementation: algorithms and data structures to map logical file system onto physical devices Block allocation, free-space management, searching a directory, data caching, …

File System: Layered Structure
Layers, from top to bottom: Application Programs. Logical File System (interface: file and directory structure; maintains pointers to logical block addresses). File-organization Module (implementation: block allocation, …; maps logical into physical addresses). Device Drivers (implementation: device-specific instructions; writes specific bit patterns to device controller). Storage Devices.

File System Interface: File Concept
From user’s perspective, a file is the smallest storage unit A file is a named collection of related information recorded on a secondary storage Information stored in a file could be of various types: Text, numeric data Binary data Source code Executable programs …..

File Attributes
Name: the only information kept in human-readable form. Identifier: unique tag (number) that identifies the file within the file system. Type: needed for systems that support different types. Location: pointer to the file location on the device. Size: current file size. Protection: controls who can read, write, or execute. Time, date, and user identification: data for protection, security, and usage monitoring. Information about files is kept in a directory, which is also maintained on the disk. Each file has an entry in the directory.

File Operations Create Write Read Reposition within file Delete
Truncate More operations (e.g., copy) can be composed of these primitives To perform these operations, we open the file (details later)

File System Interface: Directory Concept
Directory is a logical grouping of files A directory contains an entry for each file under it Some systems (UNIX) treat directories just as files In fact, UNIX treats everything as a file Operations on a directory Search for a file Create a file Delete a file List a directory Rename a file Traverse the file system

Directory Structure Design the directory structure to achieve
Efficiency – locating a file quickly Naming – convenient to users Two users can have same name for different files The same file can have several different names (aliases, links) Grouping – logical grouping of files by properties, (e.g., all Java programs, all games, …) Tree-structured directories are the most common

Tree-Structured Directories

Tree-Structured Directories (cont’d)
Efficient searching. Grouping capability. Things get complicated when we start adding links: the directory is no longer a tree but an acyclic-graph structure.

Acyclic-Graph Directories
When a file is deleted while some links still point to it, we get dangling pointers!

Acyclic-Graph Directories (cont’d)
Solution for dangling links in Unix. Symbolic link: just leave the dangling pointer for the user to delete. Try: $ ln -s file.txt file_symLink.txt; ls -l; rm file.txt. Hard link: keep a reference count on the file, and only delete the physical file when all links to it are deleted. Try: $ ln file.txt file_link.txt; rm file.txt. Links may even create a cycle, which turns the directory into a general graph.
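The two link behaviours can be tried from a script as well as from the shell. The following is a minimal sketch, assuming a POSIX system where os.link and os.symlink are permitted; the helper name link_demo is hypothetical and the file names simply mirror the commands on the slide.

```python
import os
import tempfile

def link_demo():
    """Create a file plus a hard link and a symbolic link, then delete the original."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "file.txt")
        hard = os.path.join(d, "file_link.txt")
        soft = os.path.join(d, "file_symLink.txt")
        with open(original, "w") as f:
            f.write("hello")
        os.link(original, hard)        # like: ln file.txt file_link.txt
        os.symlink(original, soft)     # like: ln -s file.txt file_symLink.txt
        links_before = os.stat(original).st_nlink  # reference count kept on the file
        os.remove(original)            # like: rm file.txt
        with open(hard) as f:
            hard_data = f.read()       # data is still reachable through the hard link
        links_after = os.stat(hard).st_nlink
        soft_is_link = os.path.islink(soft)    # the symbolic link itself still exists
        soft_resolves = os.path.exists(soft)   # but its target is gone (dangling)
        return links_before, links_after, hard_data, soft_is_link, soft_resolves
```

The reference count drops from 2 to 1 when the original name is removed, while the symbolic link is left behind pointing at nothing.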

General Graph Directory

General Graph Directory (cont’d)
Suppose we are backing up the entire file system or searching for a file through the directory. With links, we may visit the same subdirectory several times. This is very costly (remember the directory is stored on disk). We may even loop forever if we have cycles! Solution? Simply bypass links during directory traversal!
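Bypassing links during traversal can be sketched with Python's os.walk, which does not descend into symbolic links to directories when followlinks is False. The function name list_files_skip_links is hypothetical, and the sketch only handles symbolic links.

```python
import os

def list_files_skip_links(root):
    """Return all regular file paths under root, never following symbolic links."""
    found = []
    # followlinks=False: linked directories are reported but not entered,
    # so a link that points back up the tree cannot cause an endless loop.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):   # skip aliases of files as well
                found.append(path)
    return sorted(found)
```

Each physical file is then visited once, even if the tree contains aliases or a link that forms a cycle.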

File System Mounting
A file system must be mounted before it can be accessed. The OS is given the name of the device and a mount point. The OS checks the device to make sure it has a valid file system. Then, the OS makes the new file system available. (Figure: a file system on a storage device attached under root; labels: users, bob, alice, code, data.)

Virtual File Systems
Multiple file systems can be mounted at the same time (typical). Disk: UFS (Unix), NTFS (Windows), ext2 (Linux), ext3, … CD: ISO 9660. File systems on other machines, e.g., Network File System (NFS). Each file system has its own file and directory structure, allocation methods, algorithms and data structures, … To shield users from all these differences, the OS implements a virtual file system (VFS) layer. VFS provides a common interface (API) to all file systems. E.g., applications use open(), read(), write(), … without worrying about which file system(s) is (are) being used.

Virtual File System

File System Implementation
To implement a file system, we need On-disk structures, e.g., directory structure, number of blocks, location of free blocks, boot information, … (In addition to data blocks, of course) In-memory structures to: improve performance (caching) manage file system

On-disk Structures Boot block: information to boot the OS
Volume control block: information about the volume (partition) number of blocks, block size, free block count, … UFS calls it superblock NTFS calls it master file table (relational database) File control block (FCB): per file, details about the file, e.g., size, location of data blocks, file permissions, ownership UFS calls it inode NTFS stores this info in the master file table Directory structure: how files are organized into directories UFS uses inodes

On-disk Structures (figure: disk layout showing boot block, superblock, directory structure, file control blocks, file data blocks, and free blocks)

In-memory Structures Mount table: info on each mounted volume (partition) Directory-structure cache: info on recently accessed directories System-wide open-file table: contains a copy of the FCB of each open file in the system And info on which process currently using which file Per-process open-file table: contains an entry for each file opened by this process, which has a pointer to the corresponding entry in the system-wide open file table and info regarding the usage of the file by this process, e.g., current file pointer, open mode (read, write), ..

Opening a File
Search the directory to find the file control block. This may require bringing multiple directory blocks from disk into memory, if they are not already cached; consider open("/dir1/dir2/dir3/file.txt"). Create an entry in the per-process open-file table (PFT). Check whether the system-wide open-file table has an entry for this file. If it does, increment its reference count and make the PFT entry point to it. If it does not, create a new entry, set its reference count to 1, and make the PFT entry point to the new entry. Return a pointer (file descriptor) to the entry in the PFT. Successive file operations (read, write, …) use the file descriptor.
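The bookkeeping in these steps can be sketched as a toy model. This is an illustration only, not how any real kernel lays out its tables: the class name OpenFileTables, the dictionary fields, and the use of the path as the lookup key are all assumptions, and the directory search and disk reads are omitted.

```python
class OpenFileTables:
    """Toy model of the system-wide and per-process open-file tables."""

    def __init__(self):
        self.system_wide = {}   # path -> {"fcb": ..., "refcount": n}
        self.per_process = {}   # pid -> list of entries; list index = file descriptor

    def open(self, pid, path, mode="r"):
        entry = self.system_wide.get(path)
        if entry is None:
            # A real system would find the FCB via the directory search here.
            entry = {"fcb": {"path": path}, "refcount": 0}
            self.system_wide[path] = entry
        entry["refcount"] += 1
        pft = self.per_process.setdefault(pid, [])
        # Per-process state: current file pointer and open mode.
        pft.append({"system_entry": entry, "offset": 0, "mode": mode})
        return len(pft) - 1     # the file descriptor

    def close(self, pid, fd):
        pft = self.per_process[pid]
        entry = pft[fd]["system_entry"]
        pft[fd] = None          # free the per-process slot
        entry["refcount"] -= 1
        if entry["refcount"] == 0:
            del self.system_wide[entry["fcb"]["path"]]
```

Two processes opening the same file share one system-wide entry with a reference count of 2, while each keeps its own file pointer and mode.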

Opening and Reading from a File

Creating a File
Allocate a new file control block (FCB). For faster file creation, FCBs are usually pre-allocated, so this means finding a free FCB. Read the relevant directory blocks into memory. Update them to reflect the new file and write them back to disk. Allocate free blocks for the file's data. How do we allocate free blocks to files? And how do we know where the free blocks are?

Allocation Methods Problem: Allocate free blocks to files
Given: Disks allow random access of blocks Objectives: Efficient disk space utilization, and fast file access Three common allocation methods Contiguous Linked Indexed

Contiguous Allocation
Each file occupies a set of contiguous blocks. Only the start address (block #) and length (number of blocks) are needed. Mapping of a logical address (LA), assuming a block size of 512 bytes: Q = LA div 512 (quotient), R = LA mod 512 (remainder). Physical block = Q + start. Offset within block = R.
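As a quick check of the mapping, here is a minimal sketch using the slide's 512-byte block size. The function name contiguous_map and the bounds check are additions for illustration.

```python
BLOCK_SIZE = 512  # bytes, as assumed on the slide

def contiguous_map(logical_address, start_block, length_blocks):
    """Map a logical byte address to (physical block, offset) for a contiguous file."""
    q, r = divmod(logical_address, BLOCK_SIZE)   # Q = LA div 512, R = LA mod 512
    if q >= length_blocks:
        raise ValueError("address beyond end of file")
    return start_block + q, r
```

For a file starting at block 14 with length 3, logical address 1300 falls in physical block 16 at offset 276, found with one division and no disk access.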

Contiguous Allocation (cont’d)
Pros: simple; supports random access efficiently; minimal disk head seeks, so access is fast. Cons: external fragmentation; files may not be able to grow.

Linked Allocation
Each file is a linked list of blocks. Blocks could be anywhere on the disk. Each block holds a pointer to the next block, followed by data. Need start block and end block (to append to file).

Linked Allocation (cont’d)
Mapping of logical addresses, assuming the pointer takes 1 byte and the block size is 512 bytes (so each block holds 511 bytes of data): Q = LA div 511, R = LA mod 511. The physical block is at the Qth location in the chain, but how do we get to it? Traverse the chain! Offset within block = R + 1.
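The chain traversal can be sketched as follows, with the on-disk pointers modelled as a dictionary from block number to next block number. The function name linked_map and the dictionary representation are assumptions; on a real disk each hop would be a block read.

```python
BLOCK_SIZE = 512                              # bytes, as on the slide
POINTER_SIZE = 1                              # bytes, as on the slide
DATA_PER_BLOCK = BLOCK_SIZE - POINTER_SIZE    # 511 data bytes per block

def linked_map(logical_address, start_block, next_block):
    """Map a logical byte address to (physical block, offset) by walking the chain."""
    q, r = divmod(logical_address, DATA_PER_BLOCK)   # Q = LA div 511, R = LA mod 511
    block = start_block
    for _ in range(q):                # Q hops along the chain
        block = next_block[block]
    return block, r + POINTER_SIZE    # skip the pointer stored at the start of the block
```

The loop is why random access is costly here: reaching the Qth block takes Q pointer lookups.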

Linked Allocation (cont’d)
Pros: no waste of space (except for pointers); simple, needing only start and end addresses; supports dynamic growing of files. Cons: no random access (or very costly to support); poor reliability, since if one block is corrupted the chain is broken.

Indexed Allocation Bring all pointers together into an index block

Indexed Allocation (cont'd)
Mapping of logical addresses, assuming a block size of 512 bytes: Q = LA div 512 is the displacement into the index block, and R = LA mod 512 is the offset within the data block. Pros: supports random access; supports dynamic growing of files; no external fragmentation. Cons: overhead of index blocks, since a file of one or a few data blocks still needs an index block. How do we choose the size of index blocks?
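A minimal sketch of the indexed lookup, with the index block modelled as a Python list of data-block numbers (an assumption for illustration; the name indexed_map is hypothetical):

```python
BLOCK_SIZE = 512  # bytes, as assumed on the slide

def indexed_map(logical_address, index_block):
    """Map a logical byte address to (physical block, offset) via an index block."""
    q, r = divmod(logical_address, BLOCK_SIZE)   # Q = displacement into index block
    return index_block[q], r                     # R = offset within the data block
```

Unlike the linked scheme, the block is found with a single lookup in the index, which is what makes random access cheap.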

Indexed Allocation (cont'd)
First, consider a file with one index block. Assume each pointer takes 4 bytes and the block size is 512 bytes. What is the maximum file size supported? The index block may have up to 512/4 = 128 entries, so the max file size = 128 * 512 bytes = 64 KB. Now how do we support larger files? Increasing the size of index blocks wastes space for small files. Better solutions?

Indexed Allocation (cont’d)
Linked index blocks: the last word in an index block points to another index block; may need to traverse the index linked list (long access time). Multilevel index: a first-level index block points to a set of second-level index blocks, which refer to data blocks; shorter access time but more space overhead. Combined (used in Unix File System): multilevel and linked. Each file has an index block (inode), which contains pointers that point to data blocks directly (for small files) and pointers that point to index blocks, which in turn may point to either data blocks or another level of index blocks. UNIX supports up to three levels of index blocks.

Combined Scheme: UNIX inode
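Assume a block size of 4 KB, 4-byte pointers, 12 direct entries, and 1 single, 1 double, and 1 triple indirect entry. What is the max file size supported? Each index block holds 4 KB / 4 B = 1024 pointers, so the max is (12 + 1024 + 1024*1024 + 1024*1024*1024) * 4 KB, roughly 4 TB, which is far more than what a 32-bit file pointer can address (= 4 GB)!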
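The arithmetic can be checked with a short function. This is a sketch under the slide's assumptions (one indirect pointer per level); the name max_file_size and its parameters are hypothetical.

```python
def max_file_size(block_size, pointer_size, direct, levels=3):
    """Maximum file size in bytes for an inode with `direct` direct pointers
    and one indirect pointer at each of levels 1..levels."""
    ptrs_per_block = block_size // pointer_size
    # A level-k indirect pointer reaches ptrs_per_block ** k data blocks.
    blocks = direct + sum(ptrs_per_block ** k for k in range(1, levels + 1))
    return blocks * block_size
```

With block_size=4096, pointer_size=4 and direct=12, the result exceeds 2**42 bytes, well beyond the 2**32 bytes a 32-bit file pointer can address. With block_size=512, pointer_size=4, direct=0 and levels=1, it reproduces the 64 KB limit of a single index block.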

How Do We Know Where Free Blocks Are?
Bit map: every block has a bit, 0 = occupied, 1 = free. Simple to implement. Easy to find contiguous blocks. Supported by hardware: a single instruction finds the offset of the first bit with value 1 in a word (of 32 bits), which makes searching fast. Disadvantages: the bit map is stored on disk, so it is slow to access (solution: cache it in memory). Bit maps are not small for large disks, which wastes space: a 40-GB disk with 1-KB blocks has 40 M blocks and needs a 5-MB bitmap. This makes it difficult to cache the entire bitmap.
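The word-at-a-time search and the bitmap size estimate can be sketched as below. The bit ordering (bit 0 of word 0 is block 0) and both function names are assumptions; the bit trick stands in for the single hardware instruction mentioned on the slide.

```python
WORD_BITS = 32  # word size assumed on the slide

def first_free_block(bitmap_words):
    """Return the number of the first free block (bit value 1), or None if none is free."""
    for i, word in enumerate(bitmap_words):
        if word != 0:                                   # skip fully occupied words
            offset = (word & -word).bit_length() - 1    # index of the lowest set bit
            return i * WORD_BITS + offset
    return None

def bitmap_bytes(disk_bytes, block_bytes):
    """Size of the bitmap in bytes: one bit per block."""
    return (disk_bytes // block_bytes) // 8
```

For a 40-GB disk with 1-KB blocks, bitmap_bytes(40 * 2**30, 2**10) gives 5 * 2**20 bytes, matching the 5-MB figure on the slide.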

How Do We Know Where Free Blocks Are?
Linked List No waste of disk space But, not easy to get contiguous space

Disk Scheduling Processes issue disk read/write requests
Kernel maps these requests to physical block addresses. These requests are sent to the disk controller. Problem: if there are multiple outstanding requests (in a disk queue), which one should be serviced first? Objectives: fast disk access time; high disk bandwidth (#bytes/sec transferred between disk and memory); fairness (maybe!). Before presenting scheduling algorithms, let us understand the structure and operation of magnetic disks.

Disk Physical Structure
Several platters, each divided into circular tracks, which are subdivided into sectors. Head moves horizontally from one track to another. Disk rotates at high speed ( times/sec). Tracks accessed at the same head position make a cylinder. Drive can be directly attached to the computer via an I/O bus (EIDE, ATA, SCSI), or it can be attached through the network (iSCSI).

Disk Logical Structure
Disk is viewed as a one-dimensional array of logical blocks The logical block is the smallest unit of transfer Block = sector The array of blocks is mapped into sectors of the disk sequentially: Block 0 is at the first sector of the first track on the outermost cylinder Mapping proceeds in order through that track, Then the rest of the tracks in that cylinder, Then through the rest of the cylinders from outermost to innermost Block Address: <cylinder, track, sector>

Disk Operation Accessing (reading/writing) a block
Move the head to desired track (seek time) Wait for desired sector to rotate under the head (rotational latency time) Transfer the block to a local buffer, then to main memory (transfer time) We try to minimize the seek time, which is proportional to the seek distance (distance moved by the head)

Disk Scheduling Algorithms
Several algorithms exist to schedule the servicing of disk I/O requests FCFS SSTF SCAN, C-SCAN LOOK, C-LOOK We illustrate them with a request queue ( cylinders) 98, 183, 37, 122, 14, 124, 65, 67 Assume initial head position at cylinder 53

First Come First Served
Find the total head movements to service the request queue: 98, 183, 37, 122, 14, 124, 65, 67 (head initially at cylinder 53). Let us work it out: 45 + 85 + 146 + 85 + 108 + 110 + 59 + 2. Total head movements = 640 cylinders.
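The FCFS total can be verified with a few lines of code. The function name fcfs_head_movement is hypothetical; the queue and starting cylinder are the ones given in the slides.

```python
def fcfs_head_movement(start, requests):
    """Total cylinders travelled when requests are served in arrival order."""
    total, pos = 0, start
    for cyl in requests:
        total += abs(cyl - pos)   # distance from the current position to the next request
        pos = cyl
    return total
```

Calling fcfs_head_movement(53, [98, 183, 37, 122, 14, 124, 65, 67]) gives 640, as stated above.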

Shortest Seek Time First
Select request with minimum seek time from current head position Total head movements = 236 cylinders May cause starvation of some requests
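A minimal sketch of SSTF on the same queue, assuming seek time is proportional to cylinder distance and that ties go to the request that arrived first (the slides do not specify tie-breaking; the name sstf is hypothetical):

```python
def sstf(start, requests):
    """Serve the pending request closest to the head each time.
    Returns (service order, total cylinders travelled)."""
    pending = list(requests)
    pos, total, order = start, 0, []
    while pending:
        nxt = min(pending, key=lambda c: abs(c - pos))   # closest pending request
        pending.remove(nxt)
        total += abs(nxt - pos)
        pos = nxt
        order.append(nxt)
    return order, total
```

Starting at cylinder 53, the service order is 65, 67, 37, 14, 98, 122, 124, 183 for a total of 236 cylinders. Requests far from the head (such as 183 here) wait longest, which is where starvation can arise if nearby requests keep arriving.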

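SCAN
Disk arm starts at one end and moves toward the other end, servicing requests. When it gets to the other end, movement is reversed. Total head movements = 208 cylinders if the arm, moving toward cylinder 0, reverses at the lowest request (14): (53 - 14) + (183 - 14) = 208. If it travels all the way to cylinder 0 before reversing, the total is 53 + 183 = 236 cylinders.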
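The difference between sweeping to the end of the disk and reversing at the last request can be sketched as follows. The sketch assumes the head first moves toward lower cylinder numbers and that the lowest cylinder is 0; the function name sweep_total and its parameters are hypothetical.

```python
def sweep_total(start, requests, reverse_at_end, low_end=0):
    """Total cylinders travelled when the head first sweeps downward, then upward.
    reverse_at_end=True  -> go all the way to low_end before reversing (SCAN)
    reverse_at_end=False -> reverse at the lowest pending request (LOOK)"""
    lower = [c for c in requests if c <= start]
    upper = [c for c in requests if c > start]
    turn = low_end if reverse_at_end else min(lower, default=start)
    total = start - turn              # downward sweep
    if upper:
        total += max(upper) - turn    # upward sweep to the highest request
    return total
```

On the queue 98, 183, 37, 122, 14, 124, 65, 67 with the head at 53, reversing at cylinder 0 gives 236 cylinders, while reversing at the lowest request gives 208.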

Circular SCAN (C-SCAN)
Provides a more uniform wait time than SCAN The head moves from one end to the other, servicing requests as it goes When it reaches the other end, however, it immediately returns to the beginning of the disk, without servicing any requests on the return trip Treats cylinders as a circular list that wraps around from the last cylinder to the first one

C-SCAN (cont’d)

C-LOOK Version of C-SCAN
Arm only goes as far as the last request in each direction, Then reverses direction immediately
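A minimal sketch of the C-LOOK service order, assuming the head sweeps toward higher cylinder numbers and then jumps back to the lowest pending request (the slide does not state the direction; the name c_look_order is hypothetical):

```python
def c_look_order(start, requests):
    """Service order for C-LOOK with an upward sweep."""
    upper = sorted(c for c in requests if c >= start)   # served on the upward sweep
    lower = sorted(c for c in requests if c < start)    # served after the jump back
    return upper + lower
```

For the queue 98, 183, 37, 122, 14, 124, 65, 67 with the head at 53, the order is 65, 67, 98, 122, 124, 183, then 14, 37. Requests are always served in the same direction, which is what gives the more uniform wait time.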

Selecting a Disk-Scheduling Algorithm
SSTF is common and has a natural appeal SCAN and C-SCAN perform better for systems that place a heavy load on the disk Performance depends on the number and types of requests Requests for disk service can be influenced by the file-allocation method The disk-scheduling algorithm should be written as a separate module, allowing it to be replaced with a different algorithm if necessary Either SSTF or LOOK is a reasonable choice for the default algorithm

Summary File system interface: File and Directory concepts
Directory structure: tree and general graph. Multiple file systems: common interface using Virtual FS. File system implementation. On-disk structures: directory structure, FCB, superblock, … In-memory structures: caches, open-file tables. Details of opening, closing, accessing, and creating files. Block allocation: contiguous, linked, indexed. Free-space management: bitmap, linked list. Disk structure: cylinders, tracks, sectors, logical blocks. Transfer time and positioning time (latency + seek). Disk scheduling to minimize seek time (head movements): FCFS, SSTF, SCAN, LOOK.
