File Concept A file is a named collection of related information that is recorded on secondary storage. A file has a defined structure, which we must know in order to interpret its contents. Examples: text, image, executable, etc. Files have attributes, usually including the following: – Name: human-readable file name – Identifier: numeric identifier within the file system – Type: some systems formally support different file types – Location: address of the file in a storage device – Size: number of bytes (or words, or blocks) in the file – Protection: access-control information – Time, date, and user identification: may be useful for protection, security, and resource-monitoring
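As a rough illustration of the attribute list above, the following sketch reads several of these attributes through Python's standard os.stat call. It assumes a UNIX-like system, where the identifier is the inode number; the helper name describe_file is hypothetical and not part of the slides.

```python
import os
import stat
import tempfile
import time


def describe_file(path):
    """Return a few of the attributes listed above for the file at `path`."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),              # human-readable name
        "identifier": info.st_ino,                   # numeric id (inode number on UNIX)
        "is_regular": stat.S_ISREG(info.st_mode),    # coarse file type
        "size": info.st_size,                        # size in bytes
        "protection": stat.filemode(info.st_mode),   # e.g. '-rw-------'
        "owner_uid": info.st_uid,                    # user identification
        "modified": time.ctime(info.st_mtime),       # time and date of last change
    }


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"hello")
    print(describe_file(f.name))
    os.remove(f.name)
```

The location attribute (the address on the storage device) is deliberately missing: ordinary user programs usually cannot see it, which is part of the abstraction the file system provides.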
File Operations Since a file is an abstract data type, we should define the operations that can be performed on files. The operating system provides system calls to perform these operations – Create: the OS must find space in the file system and add an entry to the directory – Write: OS must find the location of a file and usually keeps a write pointer that indicates where the next write will occur – Read: OS must find the file and usually keeps a read pointer that indicates where the next read will occur – Reposition within a file: change the file position pointer to a given value (i.e. seek a given location) – Delete: release the space allocated to a file and update the directory – Truncate: erase the contents of a file, but keep its attributes
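The six operations above map fairly directly onto low-level calls. The sketch below uses Python's os module, which wraps the underlying system calls; the function name demo_file_operations and the file name demo.txt are made up for illustration.

```python
import os
import tempfile


def demo_file_operations(directory):
    """Exercise create, write, reposition, read, truncate, and delete."""
    path = os.path.join(directory, "demo.txt")
    # Create: O_CREAT asks the OS to find space and add a directory entry.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.write(fd, b"hello world")       # Write: advances the file position
        os.lseek(fd, 6, os.SEEK_SET)       # Reposition: seek to byte offset 6
        tail = os.read(fd, 5)              # Read: starts at the current position
        os.ftruncate(fd, 0)                # Truncate: erase contents, keep the file
        size_after_truncate = os.fstat(fd).st_size
    finally:
        os.close(fd)
    os.remove(path)                        # Delete: release space, update directory
    return tail, size_after_truncate, os.path.exists(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(demo_file_operations(d))     # (b'world', 0, False)
```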
File Operations To reduce directory searching, many systems require an open() system call before a file is first used. The OS maintains an open-file table with information about all open files. When a process finishes using a file, it calls a close() system call. The open-file table contains the following information for each file: – File pointer: stores the current read/write location within a file; unique to a process accessing the file – File-open count: the number of processes accessing the file – Disk location of the file: to improve access speed, the location of the file on disk is stored in memory – Access rights: indicates what operations a process is allowed to do to a file
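The bookkeeping described above can be modelled with a small toy class. This is only a sketch under simplifying assumptions: the class OpenFileTable, its method names, and the dictionary layout are all hypothetical, and a real kernel splits this state across per-process and system-wide tables.

```python
class OpenFileTable:
    """Toy system-wide table: one entry per open file, with an open count."""

    def __init__(self):
        # file name -> {"disk_location": ..., "open_count": ...}
        self._entries = {}

    def open(self, name, disk_location):
        """Record one more opener and return a per-process handle."""
        entry = self._entries.setdefault(
            name, {"disk_location": disk_location, "open_count": 0}
        )
        entry["open_count"] += 1
        # The file pointer lives in the handle, so each opener has its own.
        return {"name": name, "file_pointer": 0}

    def close(self, handle):
        """Drop one opener; forget the entry when nobody has the file open."""
        entry = self._entries[handle["name"]]
        entry["open_count"] -= 1
        if entry["open_count"] == 0:
            del self._entries[handle["name"]]

    def open_count(self, name):
        entry = self._entries.get(name)
        return entry["open_count"] if entry else 0


if __name__ == "__main__":
    table = OpenFileTable()
    a = table.open("notes.txt", disk_location=1024)
    b = table.open("notes.txt", disk_location=1024)
    a["file_pointer"] = 10                  # does not affect b's pointer
    print(table.open_count("notes.txt"))    # 2
    table.close(a)
    table.close(b)
    print(table.open_count("notes.txt"))    # 0
```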
File Locking The operating system may provide processes the ability to lock an open file to prevent other processes from gaining access to it. Locks may be shared by several processes or exclusive to one process. Locks may be mandatory or advisory: – Mandatory locks are enforced by the operating system – Advisory locks are not enforced by the OS; it is up to application programmers to ensure that locks are properly acquired and released. Windows systems generally use mandatory locking, while UNIX systems generally use advisory locking. File locking in Java is accomplished via the lock() method of the FileChannel object associated with a file.
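The advisory behaviour can be observed on a UNIX-like system with Python's fcntl.flock (a different API from the Java FileChannel mentioned above, used here only for illustration). In this sketch the same file is opened twice; the second opener cannot obtain the exclusive lock, yet its write still succeeds because nothing forces it to check the lock. The helper names try_lock and demo_advisory_lock are hypothetical.

```python
import fcntl
import os
import tempfile


def try_lock(fd):
    """Try to take an exclusive advisory lock without blocking."""
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False


def demo_advisory_lock(path):
    first = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    second = os.open(path, os.O_RDWR)      # independent open of the same file
    try:
        got_first = try_lock(first)        # lock is free, so this succeeds
        got_second = try_lock(second)      # refused while `first` holds the lock
        written = os.write(second, b"x")   # advisory: the write is not blocked
        fcntl.flock(first, fcntl.LOCK_UN)  # release the lock
        got_second_after_unlock = try_lock(second)
    finally:
        os.close(first)
        os.close(second)
    return got_first, got_second, written, got_second_after_unlock


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(demo_advisory_lock(os.path.join(d, "shared.txt")))
```

A cooperating program would call try_lock before writing and back off when it returns False; that discipline is exactly what "left to application programmers" means.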
File Types An operating system may be designed to recognize and support various file types. File types are often stored in the file name, as a file extension (the part of the file name following a period). The system may use the file extension to indicate the type of operations that can be done to a file. File extensions are often just hints, not guarantees that a certain file is of a given type. Mac OS X stores a creator attribute with each file: the name of the program that created the file, so that it can open files with the correct application. UNIX uses a magic number stored at the beginning of some files to indicate the general type; users may add file extension hints, but the OS does not use them.
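To make the magic-number idea concrete, here is a minimal sketch that guesses a file's type from its leading bytes rather than its extension. The table holds a handful of well-known signatures and is far from complete; the names MAGIC_NUMBERS and guess_type are hypothetical.

```python
# A few well-known leading byte sequences ("magic numbers").
MAGIC_NUMBERS = {
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"%PDF-": "PDF document",
    b"\x7fELF": "ELF executable",
    b"PK\x03\x04": "ZIP archive",
    b"GIF89a": "GIF image",
}


def guess_type(data):
    """Guess a file type from the first bytes of its contents."""
    for magic, description in MAGIC_NUMBERS.items():
        if data.startswith(magic):
            return description
    return "unknown"


if __name__ == "__main__":
    print(guess_type(b"%PDF-1.7 ..."))      # PDF document
    print(guess_type(b"plain text here"))   # unknown
```

Because the check looks only at content, renaming report.pdf to report.txt would not change the result, which is why an extension is only a hint.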
File Structure Some operating systems impose certain structure on files or require that files conform to predetermined file types. For example, UNIX considers each file to be a sequence of 8-bit bytes, though it does not impose an interpretation of the bytes. All operating systems must support some sort of executable file so that users can run programs. Mac OS requires that files contain two parts: a resource fork (containing user-specific information) and a data fork (containing program code or data). Since disk I/O is performed in blocks of set size (e.g. 512 bytes), the OS must pack logical records into physical blocks to be stored on the disk.
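The packing arithmetic can be worked through with a short sketch. It assumes fixed-length records that are never split across blocks, which is only one of several possible packing schemes; the function name pack_records is hypothetical.

```python
def pack_records(record_size, record_count, block_size=512):
    """Return (records per block, blocks needed, wasted bytes per full block).

    Assumes fixed-length records that may not span two blocks.
    """
    if record_size <= 0 or record_size > block_size:
        raise ValueError("record must fit inside one block")
    per_block = block_size // record_size
    blocks_needed = -(-record_count // per_block)    # ceiling division
    wasted = block_size - per_block * record_size    # internal fragmentation
    return per_block, blocks_needed, wasted


if __name__ == "__main__":
    # 100-byte records in 512-byte blocks: 5 fit per block, 12 bytes unused.
    print(pack_records(100, 12))   # (5, 3, 12)
```

The unused bytes at the end of each block are internal fragmentation; larger blocks or records that divide the block size evenly reduce the waste.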
Practice (10.2) Why do some systems keep track of the type of a file, while others leave it to the user and others simply do not implement multiple file types? Which system is “better”?
Practice (10.13) What are the advantages and disadvantages of recording the name of the creating program with the file’s attributes (as is done in the Macintosh Operating System)?
File Access – Sequential Access: information is processed in order, one record after another – Direct Access: views a file as a numbered sequence of blocks that may be accessed in any order – Direct access can be extended to use an index to help find locations within a file, which can reduce access time for finding information in a large file.
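The three access styles can be contrasted in a few lines. This sketch treats an in-memory byte stream as the file and uses a tiny 4-byte block so the output is easy to follow; BLOCK_SIZE and the function names are illustrative assumptions, and the index simply uses the first byte of each block as a stand-in for a record key.

```python
import io

BLOCK_SIZE = 4  # deliberately tiny so the example is easy to follow


def read_sequential(f):
    """Sequential access: yield every block in order from the start."""
    f.seek(0)
    while True:
        block = f.read(BLOCK_SIZE)
        if not block:
            break
        yield block


def read_block(f, n):
    """Direct access: jump straight to block number n (0-based)."""
    f.seek(n * BLOCK_SIZE)
    return f.read(BLOCK_SIZE)


def build_index(f):
    """Indexed access: map a key (first byte of each block) to its block number."""
    return {block[:1]: n for n, block in enumerate(read_sequential(f))}


if __name__ == "__main__":
    f = io.BytesIO(b"AAAABBBBCCCC")
    print(list(read_sequential(f)))   # [b'AAAA', b'BBBB', b'CCCC']
    print(read_block(f, 2))           # b'CCCC'
    index = build_index(f)
    print(read_block(f, index[b"B"])) # b'BBBB'
```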
Storage Structure Disk can be subdivided into partitions. – Partitions also known as minidisks or slices. – Disks or partitions can be RAID protected against failure. – Disk or partition can be used raw (without a file system) or formatted with a file system. Entity containing file system known as a volume. – Each volume containing a file system also tracks that file system’s info in a device directory or volume table of contents. As well as general-purpose file systems, there are many special-purpose file systems, frequently all within the same operating system or computer.
Directory Overview A directory can be viewed as a symbol table that translates file names into directory entries. The following operations are performed on directories: – Search for a file – Create a file – Delete a file – List the contents of a directory – Rename a file – Traverse the file system: access every directory and file within a directory structure (e.g. for backup)
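The symbol-table view translates almost directly into a dictionary. The toy class below is a sketch, not how any particular file system stores directories; the class name Directory, its method names, and the example entries are all hypothetical.

```python
class Directory:
    """Toy directory: a symbol table from file names to directory entries."""

    def __init__(self):
        self._entries = {}

    def create(self, name, entry):
        if name in self._entries:
            raise FileExistsError(name)
        self._entries[name] = entry

    def search(self, name):
        """Return the entry for `name`, or None if it is not present."""
        return self._entries.get(name)

    def delete(self, name):
        del self._entries[name]

    def rename(self, old, new):
        if new in self._entries:
            raise FileExistsError(new)
        self._entries[new] = self._entries.pop(old)

    def list_names(self):
        return sorted(self._entries)


if __name__ == "__main__":
    d = Directory()
    d.create("a.txt", {"size": 10})
    d.create("b.txt", {"size": 20})
    d.rename("a.txt", "c.txt")
    print(d.list_names())        # ['b.txt', 'c.txt']
    print(d.search("c.txt"))     # {'size': 10}
```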
Directory Structure Single-Level Directory – Simplest structure – Requires files to have unique names – Provides no facility for grouping files – Not suitable for organizing a large number of files or for multiuser systems
Directory Structure Two-Level Directory – Provides a master file directory (MFD) for the system and a user file directory (UFD) for each user – Isolates one user’s files from other users (good for protection, bad for collaboration)
Directory Structure Tree-Structured Directory Allows users to create their own directories to organize files. Each program must keep track of its current directory. Path names can be absolute or relative. A user could be allowed access to files of another user.
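The difference between absolute and relative path names can be sketched with Python's posixpath module, which manipulates UNIX-style paths as strings without touching the disk. The function name resolve and the example paths are hypothetical; note that normpath collapses ".." purely lexically and knows nothing about links.

```python
import posixpath


def resolve(current_directory, path):
    """Turn `path` into an absolute path name.

    An absolute path is used as-is; a relative path is interpreted
    starting from the current directory.
    """
    # posixpath.join discards current_directory when path is absolute.
    return posixpath.normpath(posixpath.join(current_directory, path))


if __name__ == "__main__":
    cwd = "/home/alice"
    print(resolve(cwd, "docs/a.txt"))     # /home/alice/docs/a.txt
    print(resolve(cwd, "../bob/b.txt"))   # /home/bob/b.txt
    print(resolve(cwd, "/etc/passwd"))    # /etc/passwd
```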
Directory Structure Acyclic-Graph Directory Allows subdirectories to be shared, existing in the file system in two (or more) places at once. This could be implemented by duplicating file information in different directories, but such an implementation may be hard to keep consistent. UNIX implements shared files and directories via links, which are pointers to other files or directories. If a file is deleted, what happens to any links pointing to it?
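One way to explore the deletion question on a UNIX-like system is to create both kinds of link and then remove the original name. The sketch below assumes a file system that supports hard and symbolic links; the function name demo_links and the file names are made up for illustration.

```python
import os
import tempfile


def demo_links(directory):
    """Create a hard link and a symbolic link, then delete the original name."""
    original = os.path.join(directory, "original.txt")
    hard = os.path.join(directory, "hard.txt")
    soft = os.path.join(directory, "soft.txt")

    with open(original, "w") as f:
        f.write("data")
    os.link(original, hard)        # hard link: a second name for the same file
    os.symlink(original, soft)     # symbolic link: stores the path name

    links_before = os.stat(original).st_nlink   # two names refer to the file
    os.remove(original)                          # removes one name only

    with open(hard) as f:
        hard_content = f.read()                  # data survives via the other name
    # The symbolic link still exists but now points at a missing path.
    soft_dangling = os.path.islink(soft) and not os.path.exists(soft)
    return links_before, hard_content, soft_dangling


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(demo_links(d))       # (2, 'data', True)
```

With hard links the space is reclaimed only when the link count drops to zero, while a symbolic link is simply left dangling.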
Directory Structure General Graph Directory Allowing links could produce cycles in the directory graph. How could we guarantee no cycles? If we allow cycles, then we must design search algorithms so that no part of the file system is searched repeatedly.
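A common way to avoid searching any part of the structure twice is to remember which nodes have already been visited. The sketch below models the directory graph as a plain dictionary of child lists, which is an assumption made for illustration; the function name traverse and the node names are hypothetical.

```python
def traverse(graph, root):
    """Depth-first traversal that visits each node once, even if cycles exist.

    `graph` maps a directory name to the list of names it links to.
    """
    visited = set()
    order = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node in visited:
            continue               # already searched: skip to avoid looping
        visited.add(node)
        order.append(node)
        # Push children in reverse so they are visited in listed order.
        stack.extend(reversed(graph.get(node, [])))
    return order


if __name__ == "__main__":
    # "c" links back to the root, creating a cycle.
    graph = {"/": ["a", "b"], "a": ["c"], "c": ["/"], "b": []}
    print(traverse(graph, "/"))    # ['/', 'a', 'c', 'b']
```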
Practice (10.4) Could you simulate a multilevel directory structure with a single-level directory structure in which arbitrarily long names can be used? If your answer is yes, explain how you can do so, and contrast this scheme with the multilevel directory scheme. If your answer is no, explain what prevents your simulation’s success. How would your answer change if file names were limited to seven characters?
File-System Mounting Each volume must be mounted before it can be available to processes on the system. To mount a volume, the OS must be given the device name and the mount point (the location within the file structure where the new file system will be attached). What happens if a file system is mounted over a directory that contains files? Example: Macintosh searches new devices for a file system; if it finds a file system, it automatically mounts it at root level. Example: Windows maintains a two-level directory structure, with devices and volumes assigned drive letters, though recent versions allow a file system to be mounted at any point in the directory tree.
File Sharing If a system supports multiple users, how does it provide protection and sharing of files? Many systems associate an owner and group with each file and directory. – The owner is the user who can change attributes and has the most control over the file. – The group is a set of users who share access to the file. – If a user requests an operation on a file, the OS determines whether the user is the owner of the file or part of the group, and thus whether the requested operation is permitted.
Remote File Systems In a distributed file system (DFS), remote directories are visible from a local machine. The client-server model allows clients to mount remote file systems from servers (e.g. H:\ drive at Huntington University). Standard operating system file calls are translated into remote calls. Distributed Information Systems (distributed naming services) such as LDAP, DNS, and Active Directory implement unified access to information needed for remote computing. Remote file systems add new failure modes, particularly due to network failure or server failure. Recovery from failure can involve state information about the status of each remote request.
Consistency Semantics Consistency semantics specify how multiple users are to access a shared file simultaneously. Similar to process synchronization algorithms from Chapter 6, but less complex due to slow speed of disk I/O and network latency (for remote file systems). Unix file system (UFS) implements: – Writes to an open file are immediately visible to other users of the same open file. – A shared file pointer allows multiple users to read and write concurrently. Andrew File System (AFS) implemented complex remote file sharing semantics: – Writes to an open file by one user are not immediately visible to other users who have the same file open. – Once a file is closed, the changes to it are visible only in sessions starting later; already open instances of the file do not change.
Protection Protection involves keeping files safe from improper access. Protection mechanisms limit the types of access that can be made. Operations that might be controlled include: read, write, execute, append, delete, list attributes. Access control is usually dependent on the identity of the user: – An access-control list (ACL) specifies the types of access allowed for each user – An ACL can be very long and difficult to manage – A simpler solution is to grant permissions to categories of users, such as owner, group, and universe. – ACLs may be combined with owner/group/universe permissions.
Protection UNIX allows read, write, and execute privileges to be granted or denied to each of owner, group, and universe. Windows provides protection options in the Security tab of the File Properties dialog box. Protection could also be achieved by requiring passwords in order to access files. Directories must be protected as well. If files may have numerous path names (when links exist), then a given user may have different access rights to a particular file, depending on the path name used.
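The owner/group/universe check can be sketched as a few bit operations on the UNIX mode word. This is a simplified model: it ignores the superuser, ACLs, and directory search permissions, and the function name is_allowed and its parameters are hypothetical.

```python
def is_allowed(mode, file_uid, file_gid, uid, gids, want):
    """Check a classic UNIX permission request.

    mode     -- permission bits, e.g. 0o640
    file_uid -- owner of the file
    file_gid -- group of the file
    uid      -- user making the request
    gids     -- set of groups that user belongs to
    want     -- 'r', 'w', or 'x'
    """
    bit = {"r": 4, "w": 2, "x": 1}[want]
    if uid == file_uid:
        shift = 6      # owner bits
    elif file_gid in gids:
        shift = 3      # group bits
    else:
        shift = 0      # universe ("other") bits
    # Exactly one class applies; an owner is never judged by the group bits.
    return bool((mode >> shift) & bit)


if __name__ == "__main__":
    mode = 0o640       # owner: rw-, group: r--, universe: ---
    print(is_allowed(mode, 1000, 100, 1000, {100}, "w"))   # True  (owner)
    print(is_allowed(mode, 1000, 100, 1001, {100}, "w"))   # False (group, read only)
    print(is_allowed(mode, 1000, 100, 1002, {200}, "r"))   # False (universe)
```

Because only three classes exist, a rule such as "everyone except ten named users" cannot be expressed directly with these bits, which is where access-control lists become useful.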
Practice (10.8) Consider a system that supports 5,000 users. Suppose that you want to allow 4,990 of these users to be able to access one file. a) How would you specify this protection scheme in UNIX? b) Can you suggest another protection scheme that can be used more effectively for this purpose than the scheme provided by UNIX?
Practice (10.19) What are some advantages and disadvantages of associating with remote file systems (stored on file servers) a different set of failure semantics from that associated with local file systems?