Chapter 10 File-System Interface

Chapter 10: File-System Interface
File Concept; Access Methods; Directory Structure; File-System Mounting; File Sharing; Protection

Chapter Objectives
To explain the function of file systems.
To describe the interfaces to file systems.
To discuss file-system design tradeoffs, including access methods, file sharing, file locking, and directory structures.
To explore file-system protection.

File Concept OS provides a uniform logical view of information storage to make the computer system convenient to use. OS abstracts from the physical properties of its storage devices to define the file as a logical storage unit. A file is a named collection of related information that is recorded on secondary storage. Files represent programs (both source and object forms) and data (numeric, character, binary). Files may be free form, such as text files, or may be formatted rigidly.

File Attributes A file's attributes vary from one operating system to another but typically consist of: Name – the only information kept in human-readable form. Identifier – a unique tag (usually a number) that identifies the file within the file system. Type – needed for systems that support different types. Location – a pointer to the file's location on the device. Size – the current file size. Protection – controls who can read, write, and execute the file. Time, date, and user identification – useful data for protection, security, and usage monitoring. Information about all files is kept in the directory structure, which resides on secondary storage.

File Operations File is an abstract data type.
The OS must perform each of these six basic file operations: Create: find space for the file in the file system (ch. 11) and make an entry for the new file in the directory. Write: search the directory to find the file's location; keep a write pointer to the next write location in the file. Read: keep a read pointer to the next read location in the file.

File Operations (cont.)
The OS must perform each of these six basic file operations: Reposition within file (seek): search the directory for the appropriate entry and reposition the current-file-position pointer to a given value. Delete: search the directory for the named file, then release all file space and erase the directory entry. Truncate: erase the contents of a file but keep its attributes; the file is reset to length zero and its space is released.
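The six operations map fairly directly onto a typical file API. The following is a minimal sketch using Java's RandomAccessFile and java.nio.file.Files; the class name FileOpsDemo, the method run(), and the scratch file are illustrative assumptions, not part of the slides.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FileOpsDemo {
    // Exercises create, write, reposition, read, truncate, and delete on a
    // scratch file, and returns the five bytes read back after the seek.
    static String run() {
        try {
            Path path = Files.createTempFile("fileops", ".txt"); // create
            String readBack;
            try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "rw")) {
                // write: the file pointer advances past the bytes written
                raf.write("hello world".getBytes(StandardCharsets.US_ASCII));
                raf.seek(6);               // reposition within the file
                byte[] buf = new byte[5];
                raf.readFully(buf);        // read: the file pointer advances again
                readBack = new String(buf, StandardCharsets.US_ASCII);
                raf.setLength(0);          // truncate: contents erased, file kept
            }
            Files.delete(path);            // delete: directory entry removed
            return readBack;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Note that Java keeps a single file pointer per RandomAccessFile object, so reads and writes share the position that seek() sets.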

Open Files Many systems require an open() system call before a file is first used, so that most file operations do not have to search the directory for the file's entry. The OS keeps a small table, called the open-file table, containing information about all open files. When a file operation is requested, the file is specified via an index into the open-file table. The file is closed when it is no longer in use, and the OS then removes its entry from the open-file table. Some systems open a file implicitly when the first reference to it is made.

Open Files (cont.) Several pieces of data are needed to manage open files: File pointer: pointer to the last read/write location, kept for each process that has the file open. File-open count: counter of the number of times a file is open, which allows its entry to be removed from the open-file table when the last process closes it. Disk location of the file: cached so that the location does not have to be looked up for every operation. Access rights: per-process access mode information, so the operating system can allow or deny subsequent requests. The implementation of the open() and close() operations is more complicated when several processes may open the file at the same time.
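The role of the file-open count can be sketched as follows. This is a simplified, single-table illustration only; the class OpenFileTable, its methods, and the diskLocation field are hypothetical names, and a real system would also keep per-process file pointers and access rights.

```java
import java.util.HashMap;
import java.util.Map;

class OpenFileTable {
    // One entry per open file: cached disk location plus the open count.
    private static final class Entry {
        final long diskLocation;
        int openCount;
        Entry(long diskLocation) { this.diskLocation = diskLocation; }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    // open(): create the entry on first open, otherwise just bump the count.
    // Returns the open count after the call.
    int open(String name, long diskLocation) {
        Entry e = entries.computeIfAbsent(name, n -> new Entry(diskLocation));
        return ++e.openCount;
    }

    // close(): decrement the count and drop the entry when the last opener closes.
    // Returns the open count after the call.
    int close(String name) {
        Entry e = entries.get(name);
        if (e == null) throw new IllegalStateException("file not open: " + name);
        if (--e.openCount == 0) entries.remove(name);
        return e.openCount;
    }

    boolean isOpen(String name) { return entries.containsKey(name); }
}
```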

Open File Locking Some operating systems provide facilities for locking an open file (or sections of a file). File locks allow one process to lock a file and prevent other processes from gaining access to it. There are two types of locks, shared and exclusive. A shared lock is like a reader lock: several processes can acquire it concurrently. An exclusive lock is like a writer lock: only one process at a time can acquire it. Furthermore, operating systems may provide either mandatory or advisory mechanisms: Mandatory – access is denied depending on locks held and requested. Advisory – processes can find the status of locks and decide what to do.

File Locking Example – Java API
import java.io.*;
import java.nio.channels.*;

public class LockingExample {
    public static final boolean EXCLUSIVE = false;
    public static final boolean SHARED = true;

    public static void main(String[] args) throws IOException {
        FileLock sharedLock = null;
        FileLock exclusiveLock = null;
        try {
            RandomAccessFile raf = new RandomAccessFile("file.txt", "rw");
            // get the channel for the file
            FileChannel ch = raf.getChannel();

            // this locks the first half of the file - exclusive
            // (arguments: start position, region size, shared flag)
            exclusiveLock = ch.lock(0, raf.length() / 2, EXCLUSIVE);
            /* Now modify the data */
            // release the lock
            exclusiveLock.release();

            // this locks the second half of the file - shared
            sharedLock = ch.lock(raf.length() / 2 + 1, raf.length(), SHARED);
            /* Now read the data */
            // release the lock
            sharedLock.release();
        } catch (java.io.IOException ioe) {
            System.err.println(ioe);
        } finally {
            // releasing a lock that is no longer valid has no effect
            if (exclusiveLock != null)
                exclusiveLock.release();
            if (sharedLock != null)
                sharedLock.release();
        }
    }
}

File Types – Extension If an operating system recognizes the type of a file, it can then operate on the file in reasonable ways. A common technique for implementing file types is to include the type as part of the file name. The name is split into two parts - a name and an extension, usually separated by a period character. The system uses the extension to indicate the type of the file and the type of operations that can be done on that file. Application programs also use extensions to indicate file types in which they are interested. When the user opens a file, by double-clicking on the icon representing it, the associated application is invoked automatically and the file is loaded.

Common File Types

File Structure File types also can be used to indicate the internal structure of the file. Common file structures: None: a sequence of words or bytes. Simple record structure: lines, fixed-length records, or variable-length records. Complex structures: formatted document, relocatable load file. The last two can be simulated with the first by inserting appropriate control characters. Who decides the structure: the operating system or the program.

Access Methods Stored information in a file can be accessed in several ways. Some systems provide only one access method for files and other systems support many access methods. Choosing the right access method for a particular application is a major design problem. General access methods: Sequential access Direct access Indexed access

Sequential Access Simplest access method.
Based on a tape model of a file; works well on both sequential-access and random-access devices. Information in the file is processed in order, one record after the other. A read operation (read next) reads the next portion of the file and automatically advances a file pointer. Similarly, the write operation (write next) appends to the end of the file and advances to the new end of file. Such a file can be reset to the beginning, and on some systems a program may be able to skip forward or backward n records.

Sequential-access File

Direct Access Based on a disk model of a file, since disks allow random access to any file block. A file is made up of fixed-length records, which allows programs to read and write records in no particular order. File operations must be modified to include the block number as a parameter: we have read n and write n, where n is the block number. Direct-access files are of great use for immediate access to large amounts of information. Databases are often of this type. We can simulate sequential access on a direct-access file by keeping track of the current position in the file.

Simulation of Sequential Access on a Direct-access File
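One way to picture the simulation is a small wrapper that keeps a current-position variable and implements reset, read next, and write next on top of read n and write n. The sketch below is illustrative: the class SequentialOverDirect, the variable cp, and the use of an in-memory array to stand in for the file's blocks are all assumptions.

```java
class SequentialOverDirect {
    private final String[] blocks; // stands in for a direct-access file of fixed-length records
    private int cp = 0;            // current position (block number of the next access)

    SequentialOverDirect(int blockCount) { blocks = new String[blockCount]; }

    // Direct-access primitives: read n and write n.
    String read(int n) { return blocks[n]; }
    void write(int n, String data) { blocks[n] = data; }

    // Sequential operations built on the primitives.
    void reset() { cp = 0; }

    String readNext() {
        String data = read(cp);
        cp = cp + 1;
        return data;
    }

    void writeNext(String data) {
        write(cp, data);
        cp = cp + 1;
    }
}
```

The reverse simulation, direct access on top of a purely sequential file, is much less efficient because every access may require skipping over many records.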

Indexed Access Built on top of a direct-access method.
Involves the construction of an index for the file. The index contains pointers to the file blocks. To find a record in the file, we first search the index and then use the pointer to access the file directly and find the desired record. For a large file, the index itself may become too large; in that case it is better to create an index for the index file. The primary index file would contain pointers to secondary index files, which would point to the actual data items.
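The two-level lookup can be sketched with sorted maps. This is only an in-memory illustration under stated assumptions: the class TwoLevelIndex, the choice of keying the primary index by the largest key in each secondary index, and the string keys are all hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

class TwoLevelIndex {
    // Primary index: largest key held by a secondary index -> that secondary index.
    private final TreeMap<String, TreeMap<String, Integer>> primary = new TreeMap<>();

    // Registers a non-empty secondary index (record key -> block number).
    void addSecondary(TreeMap<String, Integer> secondary) {
        primary.put(secondary.lastKey(), secondary);
    }

    // Searches the primary index first, then the chosen secondary index.
    // Returns the block number of the record, or -1 if the key is not indexed.
    int lookup(String key) {
        Map.Entry<String, TreeMap<String, Integer>> slot = primary.ceilingEntry(key);
        if (slot == null) return -1;
        Integer block = slot.getValue().get(key);
        return block == null ? -1 : block;
    }
}
```

With the block number in hand, the record itself would be fetched with a single direct-access read.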

Example of Index and Relative Files

Directory Structure It is often desirable to place multiple file systems on a disk, each occupying the entire disk or a part of it known as a partition. Partitions can also be combined to form larger structures known as volumes, and file systems can be created on them. Each volume that contains a file system must also contain information about the files in that system. This information is kept in entries in a volume table of contents or a device directory (simply known as a directory). Directories are also used to organize the huge file systems of computers.

Typical File-system Organization

Directory Structure (cont.)
The directory can be viewed as a symbol table that translates file names into their directory entries. Operations performed on a directory: Search for a file Create a file Delete a file List a directory Rename a file Traverse the file system to access every directory and every file within a directory structure.

Directory Structure (cont.)
The directory itself must be organized (logically) to obtain: Efficiency – locating a file quickly. Naming – convenient to users: Two users can have same name for different files. The same file can have several different names. Grouping – logical grouping of files by properties, (e.g., all Java programs, all games, …) Common schemes for defining the logical structure of a directory: Single-Level Directory Two-Level Directory Tree-Structured Directories … … …

Single-Level Directory
The simplest directory structure. All files are contained in the same directory, which is easy to support and understand. Limitations appear when the number of files increases or when the system has more than one user: Naming problem: all files must have unique names. Grouping problem: files cannot be organized into groups.

Two-Level Directory Separate directory for each user:
Each user has his or her own user file directory (UFD). Each UFD lists the files of a single user. When a user logs in, the master file directory (MFD), which is indexed by user name, is searched; each MFD entry points to the UFD for that user. When a user refers to a particular file, only that user's own UFD is searched.

Two-Level Directory Although the name-collision problem is solved, the two-level structure still has disadvantages: It isolates one user from another, which prevents cooperation between users. To allow access, the system must make it possible to name a file in another user's directory: the user gives both the user name and the file name, which together define a path in the tree from the root (MFD) to a leaf (the specified file). System files are a special case of isolation. One solution would be to copy the system files into each UFD. Another solution is to define a special user directory that contains the system files and to search it when a file is not found in the user's own UFD; the sequence of directories searched is called the search path. There is no grouping capability.
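The MFD/UFD arrangement can be sketched as a map of maps. The class TwoLevelDirectory, its methods, and the integer file identifiers are illustrative assumptions; the point is only that the same file name can exist under two users, and that reaching another user's file requires both names.

```java
import java.util.HashMap;
import java.util.Map;

class TwoLevelDirectory {
    // MFD: user name -> UFD. Each UFD: file name -> file identifier.
    private final Map<String, Map<String, Integer>> mfd = new HashMap<>();
    private int nextId = 1;

    // Creates a file in the given user's UFD; names only need to be unique per user.
    int create(String user, String file) {
        Map<String, Integer> ufd = mfd.computeIfAbsent(user, u -> new HashMap<>());
        if (ufd.containsKey(file)) {
            throw new IllegalArgumentException(file + " already exists for " + user);
        }
        int id = nextId++;
        ufd.put(file, id);
        return id;
    }

    // Follows the path MFD -> UFD -> file. Returns null if the user or file is unknown.
    Integer lookup(String user, String file) {
        Map<String, Integer> ufd = mfd.get(user);
        return ufd == null ? null : ufd.get(file);
    }
}
```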

Tree-Structured Directories

Tree-Structured Directories (Cont)
Tree generalization allows users to create their own subdirectories and to organize their files accordingly. The tree has a root directory, and every file in the system has a unique path name. A directory (or subdirectory) contains a set of files or subdirectories. A directory is another file treated in a special way. One bit in each directory entry defines the entry as a file (0) or as a subdirectory (1). Special system calls are used to create and delete directories. All directories have the same internal format.

Normally, each process has a current directory. The accounting file contains a pointer to the user's initial directory. If a needed file is not in the current directory, then the user must: specify a path name or change the current directory to be a directory holding that file. Path names can be of two types: absolute and relative. An absolute path name begins at the root and follows a path down to the specified file. A relative path name defines a path from the current directory.
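The difference between the two kinds of path name can be sketched as a small resolver that turns either kind into an absolute path. The class PathNames is hypothetical, and the sketch assumes UNIX-style conventions that the slides do not state: "/" as the separator and root, "." for the current directory, and ".." for the parent.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class PathNames {
    // Resolves a path name against the current directory.
    // An absolute name (leading "/") ignores the current directory.
    static String resolve(String currentDir, String name) {
        String full = name.startsWith("/") ? name : currentDir + "/" + name;
        Deque<String> parts = new ArrayDeque<>();
        for (String part : full.split("/")) {
            if (part.isEmpty() || part.equals(".")) {
                continue;                      // skip empty components and "."
            }
            if (part.equals("..")) {
                if (!parts.isEmpty()) parts.removeLast(); // step up one level
            } else {
                parts.addLast(part);
            }
        }
        return "/" + String.join("/", parts);
    }
}
```

For instance, with a current directory of /spell/mail, the relative name prt/first and the absolute name /spell/mail/prt/first resolve to the same file.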

How to handle the deletion of a directory: If a directory is empty, it can simply be deleted. If the directory to be deleted is not empty but contains several files or subdirectories, then one of two approaches can be taken. Some systems, such as MS-DOS, will not delete a directory unless it is empty. An alternative approach, such as that offered as an option by the UNIX rm command, is that when a request is made to delete a directory, all of that directory's files and subdirectories are also deleted. Users can be allowed to access, in addition to their own files, the files of other users. Some operating systems automate the search for executable programs so that users can run programs without having to remember long paths.

Acyclic-Graph Directories

Acyclic-Graph Directories (Cont.)
An acyclic graph, that is, a graph with no cycles, allows directories to share subdirectories and files. It is a natural generalization of the tree-structured scheme and is more flexible, but it is also more complex. A shared directory or file will exist in two (or more) places at the same time. With a shared file, only one actual file exists, so any changes made in one place are immediately visible in the other. When people are working as a team, all the files they want to share can be put into one shared directory. Sharing is particularly important for subdirectories; a new file created by one person will automatically appear in all the shared subdirectories.

Acyclic-Graph Directories (Cont.)
Shared files and subdirectories can be implemented in several ways: A common way is to create a new directory entry called a link, which is a pointer to another file or subdirectory. Another approach is to duplicate all information about shared files in both sharing directories; a major problem is then keeping the copies consistent when the file is modified. Several problems must be considered carefully. Aliasing: distinct file names may refer to the same file, and we do not want to traverse shared structures more than once when accumulating statistics or copying all files to backup storage. Deletion: removing a file may leave dangling pointers to the now-nonexistent file. Solution: preserve the file until all references to it are deleted, by keeping a list (or a count) of all references to that file.
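The reference-count solution to the deletion problem can be sketched as follows. The class LinkCounts and its methods are hypothetical names, and the sketch tracks only counts, not file contents: the file's space is released only when the last directory entry referring to it is removed.

```java
import java.util.HashMap;
import java.util.Map;

class LinkCounts {
    private final Map<String, Integer> entries = new HashMap<>();   // path name -> file id
    private final Map<Integer, Integer> refCount = new HashMap<>(); // file id -> number of entries
    private int nextId = 1;

    // Creates a new file with a single directory entry.
    int create(String path) {
        int id = nextId++;
        entries.put(path, id);
        refCount.put(id, 1);
        return id;
    }

    // Adds a second name (a link) for an existing file and bumps its count.
    void link(String existing, String alias) {
        Integer id = entries.get(existing);
        if (id == null) throw new IllegalArgumentException("no such file: " + existing);
        entries.put(alias, id);
        refCount.merge(id, 1, Integer::sum);
    }

    // Removes one directory entry. Returns true only when this was the last
    // reference, which is the point at which the file's space can be released.
    boolean unlink(String path) {
        Integer id = entries.remove(path);
        if (id == null) throw new IllegalArgumentException("no such file: " + path);
        int remaining = refCount.merge(id, -1, Integer::sum);
        if (remaining == 0) {
            refCount.remove(id);
            return true;
        }
        return false;
    }
}
```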

General Graph Directory

General Graph Directory (Cont.)
A serious problem with using an acyclic-graph structure is ensuring that there are no cycles. If cycles are allowed to exist (a general graph), two problems arise. Traversal: a search must avoid looping forever through a cycle; one solution is to limit the number of directories that will be accessed during a search. Deletion: a file can normally be deleted when its reference count is 0, but with self-referencing (or a cycle) in the directory structure the count may be nonzero even when the file can no longer be reached. This anomaly can be solved using a garbage-collection scheme. Garbage collection involves traversing the entire file system, marking everything that can be accessed, and then collecting everything that is not marked onto a list of free space.
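The mark-and-collect idea can be sketched on a small in-memory graph. The class DirectoryGc, the method unreachable, and the representation of the directory structure as a map from each node to the names it references are illustrative assumptions.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class DirectoryGc {
    // Returns every node that cannot be reached from the root.
    static Set<String> unreachable(Map<String, List<String>> entries, String root) {
        // Mark phase: visit everything reachable from the root exactly once.
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            String node = pending.pop();
            if (!marked.add(node)) {
                continue; // already marked, so a cycle cannot cause an infinite loop
            }
            for (String child : entries.getOrDefault(node, Collections.emptyList())) {
                pending.push(child);
            }
        }
        // Collect phase: everything known but not marked goes on the free list.
        Set<String> free = new TreeSet<>(entries.keySet());
        for (List<String> children : entries.values()) {
            free.addAll(children);
        }
        free.removeAll(marked);
        return free;
    }
}
```

In the sketch, two directories that reference only each other have nonzero reference counts yet are still collected, which is exactly the case that counting alone cannot handle.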

File System Mounting A file system must be mounted before it can be accessed. Mounting builds a directory structure out of multiple volumes, making them available within the file-system name space. The mount procedure starts when the OS is given the name of the device and the mount point. A mount point is the location within the file structure where the file system is to be attached. Next, the OS verifies that the device contains a valid file system, that is, one with the expected directory format. Finally, the OS notes in its directory structure that a file system is mounted at the specified mount point.
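The bookkeeping behind these steps can be sketched as a mount table. Everything here is an illustrative assumption: the class MountTable, the boolean that stands in for the validity check, and the rule that a path belongs to the file system with the longest matching mount point.

```java
import java.util.HashMap;
import java.util.Map;

class MountTable {
    private final Map<String, String> mounts = new HashMap<>(); // mount point -> device name

    // Records a mount after the (simulated) check that the device holds a valid file system.
    void mount(String device, String mountPoint, boolean validFileSystem) {
        if (!validFileSystem) {
            throw new IllegalArgumentException(device + " does not contain a valid file system");
        }
        if (mounts.containsKey(mountPoint)) {
            throw new IllegalStateException("already mounted at " + mountPoint);
        }
        mounts.put(mountPoint, device);
    }

    // Finds the device whose mount point is the longest prefix of the path.
    // Returns null if no mounted file system covers the path.
    String deviceFor(String path) {
        String best = null;
        for (String mountPoint : mounts.keySet()) {
            String prefix = mountPoint.endsWith("/") ? mountPoint : mountPoint + "/";
            boolean covers = path.equals(mountPoint) || path.startsWith(prefix);
            if (covers && (best == null || mountPoint.length() > best.length())) {
                best = mountPoint;
            }
        }
        return best == null ? null : mounts.get(best);
    }
}
```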

File System Mounting (figures: file system, unmounted partition, mount point)

File Sharing Sharing of files on multi-user systems is desirable for users who want to collaborate and to reduce the effort required to achieve a computing goal. The system can either allow a user to access the files of other users by default or require that a user specifically grant access to the files. Most systems use the concepts of file (or directory) owner and group to implement sharing and protection. The owner is the user who has the most control over the file and can change its attributes and grant access. The group attribute defines a subset of users who can share access to the file.

File Sharing – Multiple Users
User IDs identify users, allowing permissions and protections to be per-user. Group IDs allow users to be in groups, permitting group access rights. The owner and group IDs of a given file (or directory) are stored with the other file attributes. When a user requests an operation on a file, the user ID can be compared with the owner attribute to determine if the requesting user is the owner of the file. Likewise, the group IDs can be compared. The result indicates which permissions apply to the requested operation, which is then allowed or denied.

Remote File Systems Networking allows sharing of remote file systems (and other resources) spread around the world. Files can be shared: Manually, by transferring files with programs like FTP. Automatically, using distributed file systems, in which remote directories are visible from a local machine. Semi-automatically, via the World Wide Web, where a browser is needed to gain access to the remote files and separate operations are used to transfer files. The client-server model allows clients to mount remote file systems from different servers. The server is the machine containing the file system. The client is the machine seeking access to the remote files.

Remote File Systems (cont.)
A server can serve multiple clients and a client can use multiple servers (many-to-many relationships), depending on the client-server facility. NFS is the standard UNIX client-server file-sharing protocol. Standard operating system file calls are translated into remote calls. Client and user-on-client identification is insecure or complicated. A client can be specified by a network name or other identifier, such as an IP address, but these can be spoofed and are therefore unsafe. More secure solutions include secure authentication of the client via encrypted keys and secure key exchange.

Remote File Systems (cont.)
Distributed Information Systems provide unified access to the information needed for remote computing. The domain name system (DNS) provides host-name to network-address translations for the entire Internet. Sun introduced the network information service (NIS), which centralizes storage of user names, host names, printer information, and the like. Microsoft's Active Directory (available in Windows XP and Windows 2000) provides a single name space for users and is used by all clients and servers to authenticate users.

Remote File Systems – Failure Modes
In addition to the failure modes of local file sharing, remote file systems add failure modes due to network failure and server failure. Consider a client in the midst of using a remote file system when, because of one of these failures, that file system is no longer reachable. The system can either terminate all operations to the lost server or delay operations until the server is reachable again. Recovery from failure requires maintaining state information about the status of each remote request on both the client and the server. Stateless DFS protocols, such as NFS, include all the needed information in each request, allowing easy recovery but providing less security.

File Sharing – Consistency Semantics
Consistency semantics specify how multiple users are to access a shared file simultaneously. They identify when modifications of data by one user will be observable by other users. They are directly related to process-synchronization algorithms, but tend to be less complex because of the long latencies of disk I/O and networks. The UNIX file system (UFS) implements the following semantics: Writes to an open file by a user are visible immediately to other users that have this file open. Users may share a file pointer, which lets multiple users read and write concurrently. The Andrew File System (AFS) uses the following session semantics: Writes to an open file by a user are not visible immediately to other users that have the same file open. Once a file is closed, the changes made to it are visible only in sessions that start later.

Protection Need to protect files is a direct result of the ability to access files of other users. Access is permitted or denied depending on several factors, one of which is the type of access requested. Read. Read from the file. Write. Write or rewrite the file. Execute. Load the file into memory and execute it. Append. Write new information at the end of the file. Delete. Delete the file and free its space. List. List the name and attributes of the file.

Protection (cont.) Many protection mechanisms have been proposed and each has advantages and disadvantages but must be appropriate for its intended application. Most common approach is to make access dependent on identity of the user. Different users may need different types of access to a file or directory. Associate with each file and directory an access-control list (ACL) specifying user names and the types of access allowed for each of them. When a user requests access to a particular file, the OS checks the access list associated with that file.

Protection (cont.) The main problem with access lists is their length.
If we want to allow everyone to read a file, we must list all users with read access. The directory entry needs to be of variable size, resulting in more complicated space management. To condense the length of the access-control list, many systems recognize three classifications of users associated with each file: Owner. The user who created the file. Group. A set of users who are sharing the file and need similar access. Universe. All other users in the system.

Protection (cont.) With this limited protection classification, only three fields are needed to define protection. A separate field is kept for the file's owner, for the file's group, and for all other users. Each field is a collection of bits, and each bit either allows or prevents the access associated with it. For example, the UNIX system defines: Nine bits per file to record protection information. Three fields of 3 bits each (rwx), where r controls read access, w controls write access, and x controls execution. Windows XP users typically manage access-control lists via the GUI.

Access Lists and Groups
Mode of access: read, write, execute
Three classes of users:
                      RWX
a) owner access    7  1 1 1
b) group access    6  1 1 0
c) public access   1  0 0 1
Ask the manager to create a group (unique name), say G, and add some users to the group.
For a particular file (say game) or subdirectory, define an appropriate access. The three digits give the owner, group, and public access, in that order: chmod 761 game
Attach a group to a file: chgrp G game
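The arithmetic behind a mode such as 761 can be sketched in a few lines. The class ModeBits and its two methods are hypothetical helpers written for illustration; they decode the nine protection bits and check one access for one class of user, and they do not model how any particular kernel performs the check.

```java
class ModeBits {
    // Renders the nine protection bits in rwx form, owner field first.
    static String toSymbolic(int mode) {
        String flags = "rwx";
        StringBuilder sb = new StringBuilder();
        for (int bit = 8; bit >= 0; bit--) {
            boolean set = ((mode >> bit) & 1) == 1;
            sb.append(set ? flags.charAt((8 - bit) % 3) : '-');
        }
        return sb.toString();
    }

    // Picks the owner, group, or public field, then tests the requested bit.
    // permission must be 'r', 'w', or 'x'.
    static boolean allows(int mode, boolean isOwner, boolean inGroup, char permission) {
        int shift = isOwner ? 6 : (inGroup ? 3 : 0);
        int field = (mode >> shift) & 7;
        int mask;
        switch (permission) {
            case 'r': mask = 4; break;
            case 'w': mask = 2; break;
            case 'x': mask = 1; break;
            default: throw new IllegalArgumentException("unknown permission: " + permission);
        }
        return (field & mask) != 0;
    }
}
```

Here toSymbolic(0761) yields rwxrw---x (the leading 0 makes the Java literal octal): the owner may read, write, and execute, the group may read and write, and everyone else may only execute.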

A Sample UNIX Directory Listing

Windows XP Access-control List

End of Chapter 10
