1 File Management
How much does the operating system know?
Some systems support different file types.
- Advantage: prevents you from trying to read executable files
- Disadvantage: added complexity; the system can’t cope with new file types, e.g. MP3
Both MS-DOS and UNIX don’t care: a file is considered to be a sequence of bytes with no structure.
However, UNIX recognises:
- Regular files: text, data, etc.
- Directories
- Char/block: files which refer to devices
- Pipes: FIFO buffers
MS-DOS only really has attributes:
- System
- Archive
- Hidden
- Read-only
Application packages do the rest.

2 File System Services
In one form or another, all file systems provide applications with the ability to:
- Create a file
- Remove a file
- Open an existing file
- Read from an open file
- Write to an open file
- Close an open file
- Fetch the metadata of a file
- Modify the metadata of a file
Metadata are the data about a file, e.g. file attributes (name, size, data type, etc.), data about records or data structures (length, fields, columns, etc.) and data about data (where it is located, how it is associated, ownership, etc.).
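
The services listed above map fairly directly onto calls in most programming environments. The following is a minimal sketch using Python's standard `os` module on a POSIX-style system; the helper name `exercise_file_services` and the file name `demo.txt` are made up for illustration.

```python
import os
import tempfile


def exercise_file_services(directory):
    """Run one file through the basic services listed above (hypothetical helper)."""
    path = os.path.join(directory, "demo.txt")

    # Create a file, write to it, and close it.
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
    os.write(fd, b"hello, file system\n")
    os.close(fd)

    # Open the existing file, read from it, and close it.
    fd = os.open(path, os.O_RDONLY)
    data = os.read(fd, 100)
    os.close(fd)

    # Modify metadata (the timestamps), then fetch metadata.
    os.utime(path, (1_000_000_000, 1_000_000_000))
    info = os.stat(path)

    # Remove the file.
    os.remove(path)
    return data, info.st_size, int(info.st_mtime), os.path.exists(path)


with tempfile.TemporaryDirectory() as scratch:
    print(exercise_file_services(scratch))
```

The timestamps are used as the example of modifiable metadata only because they behave the same way on most platforms; permissions or ownership would serve equally well.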

3 File Structure
In the simplest scenario the data is totally unstructured and appears as a stream of bytes.
- The disadvantage of this approach is that each application may treat data structures differently, e.g. one program may treat the fields in a database in a totally different way from another.
The second way is to store and process data in terms of records of 80 (or some other fixed number of) characters.
- E.g. the first nine characters might be the Social Security Number, the next 15 might be the first name, etc.
When dealing with fixed-size records, the record size is usually stored in the file’s metadata.
However, there are disadvantages in having the operating system know about file structures.
- The principal one is the resulting size and complexity of the system.
- Additionally, a new application may require a file structure or access facility not implemented by the supplied system.
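
To make the fixed-size record idea concrete, here is a small sketch that packs and unpacks 80-character records with the layout from the example (9 characters of Social Security Number, then 15 characters of first name). The function names and the sample values are invented for illustration, and the remaining 56 characters are simply left as padding.

```python
RECORD_SIZE = 80   # fixed record length in characters, as in the example above
SSN_WIDTH = 9
NAME_WIDTH = 15


def make_record(ssn, first_name):
    """Build one fixed-width record, padding or truncating each field."""
    fields = ssn.ljust(SSN_WIDTH)[:SSN_WIDTH] + first_name.ljust(NAME_WIDTH)[:NAME_WIDTH]
    return fields.ljust(RECORD_SIZE)


def split_records(data):
    """Cut a stream of characters into consecutive fixed-size records."""
    return [data[i:i + RECORD_SIZE] for i in range(0, len(data), RECORD_SIZE)]


def parse_record(record):
    """Recover the fields purely from their positions in the record."""
    if len(record) != RECORD_SIZE:
        raise ValueError("record has the wrong length")
    return {
        "ssn": record[:SSN_WIDTH],
        "first_name": record[SSN_WIDTH:SSN_WIDTH + NAME_WIDTH].rstrip(),
    }


stream = make_record("123456789", "Ada") + make_record("987654321", "Grace")
for record in split_records(stream):
    print(parse_record(record))
```

Because every record has the same length, record number n always starts at offset n x 80, which is what makes direct access to a given record cheap.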

4 In this respect, UNIX adopts an extreme position: files are considered to be sequences of bytes with no structure. UNIX recognises a limited number of ‘file types’, which are described below:
- regular: ‘ordinary’ files such as programs, text, data, etc.; in fact any file which is not one of the other types.
- directory: a file containing references to other files; described shortly.
- char/block: not ‘true’ files at all but directory entries which refer to devices.
- pipe: a file used as a queuing buffer which holds the standard output of one process and supplies this data as the standard input of another process.
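
On a Unix-like system the file type is recorded in the file's mode bits and can be inspected from a program. The sketch below uses Python's standard `os` and `stat` modules; the function name `unix_file_type` and the returned labels are illustrative choices, not system terminology.

```python
import os
import stat
import tempfile


def unix_file_type(path):
    """Classify a path into the UNIX file types described above."""
    mode = os.lstat(path).st_mode
    if stat.S_ISREG(mode):
        return "regular"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISCHR(mode):
        return "char device"
    if stat.S_ISBLK(mode):
        return "block device"
    if stat.S_ISFIFO(mode):
        return "pipe"
    return "other"


with tempfile.TemporaryDirectory() as scratch:
    plain = os.path.join(scratch, "notes.txt")
    with open(plain, "w") as f:
        f.write("just bytes\n")
    print(unix_file_type(scratch))
    print(unix_file_type(plain))
```

Note that nothing in the mode bits says whether a regular file holds text, a program or data; that distinction is left to applications.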

5 Both Microsoft and UNIX use directories, which are notional groupings of files; since directories reside on disk, they can be considered as special files.
With the exception of directories, the nearest that Microsoft comes to having different file types is that files can have certain attributes. The possible attributes are:
- System: assigned to system files such as the operating system files
- Archive: used by file back-up systems
- Hidden: a file with this attribute is ignored by many system commands
- Read-only: the file cannot be written to or deleted
Attributes are not mutually exclusive; e.g. a ‘read-only’ file can also be ‘hidden’.

6 File identification
Microsoft Windows’ original naming convention was the 8.3 filename convention: BASENAME.EXT
When the Internet first arrived, Windows systems were still restricted to 8.3 filenames, so web pages had to be created with names ending in .HTM, while Macintosh or Unix used the .html filename extension.
Java raised a similar problem, since its source code files have the extension .java and its compiled object code the extension .class.
Eventually, Windows introduced support for long file names and removed the 8.3 name/extension split. It changed the length restriction to 255 characters and allowed a mix of upper-case and lower-case letters.
The use of three-character extensions under Microsoft Windows has continued, mainly for backward compatibility (an extension could be longer, as long as the whole name stays within the 255-character limit).
A filename cannot contain the characters / \ ? : * " | or control characters.
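
The naming rules above can be expressed as two small checks. This is a simplified sketch: the function names are invented, the 8.3 check looks only at lengths and the single dot (real 8.3 names have further restrictions), and the forbidden set adds `<` and `>`, which Windows also reserves although they are not in the list above.

```python
# Characters listed above, plus < and > (also reserved by Windows).
FORBIDDEN = set('/\\?:*"|<>')
MAX_NAME_LENGTH = 255


def is_valid_long_name(name):
    """Check a name against the long-file-name rules described above."""
    if not name or len(name) > MAX_NAME_LENGTH:
        return False
    # Reject forbidden punctuation and control characters (codes below 32).
    return not any(ch in FORBIDDEN or ord(ch) < 32 for ch in name)


def is_8_3_name(name):
    """Simplified 8.3 check: base of 1 to 8 characters, extension of at most 3."""
    if not is_valid_long_name(name) or name.count(".") > 1:
        return False
    base, _, ext = name.partition(".")
    return 1 <= len(base) <= 8 and len(ext) <= 3


for candidate in ["INDEX.HTM", "index.html", "Program.java", "what?.txt"]:
    print(candidate, is_8_3_name(candidate), is_valid_long_name(candidate))
```

The two web-page names show the problem described above: `INDEX.HTM` fits the 8.3 scheme, while `index.html` is only acceptable once long file names are supported.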

7 Unix stored the file name as a single string, not split into base name and extension components, with the '.' being just another character. Some applications used suffixes to indicate file types, but not as heavily: for example, executables and ordinary text files had no suffixes in their names.

8 Directories
Early operating systems ‘lumped together’ the files on a disk. Files belonging to several different users and/or applications could not be readily distinguished, which caused problems with file naming, security and ‘housekeeping’.
- For example, if several people were using the disk, the names of files would need to be strictly controlled, either by some person assigned to this task or by enforcing conventions which avoided name conflicts.
- This was not a problem in earlier systems where, in effect, access to the computer was centralised in the data processing department.
Systems introduced directories as a logical grouping of files, managed by using a special directory file which contains a list of the directory’s member files.
The first directory systems were simply two-level: the top level contained user names, each with a pointer to another directory which held all the files for that user.

9 Simple Two Level Directory Structure

10 For each of its component files, the directory will generally hold information pertaining to the file, e.g.:
- filename
- file type, if the system recognises different file types
- file attributes
- information indicating the location of the file on the disk
- access rights, i.e. an indication of who can access the file and how it can be accessed
- file size in bytes
- date information, e.g. date of creation, date of last access, date of last amendment
Note that it is admissible to have two or more files with the same name within the system provided that they are in separate directories.

11 Managing file space
Generally, space is allocated in units of a fixed size, called an allocation unit or block, which is a simple multiple of the disk’s physical sector size (usually 512 bytes).
Typical block sizes are 512, 1024 and 2048 bytes. Unix generally uses 1 KB (1024 bytes).
Each disk block has a unique address or disk block number.
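
Because space is handed out in whole blocks, a file usually occupies slightly more disk space than its size in bytes. The sketch below works through that arithmetic; the function names are invented and the 1024-byte default simply follows the Unix figure above.

```python
def blocks_needed(file_size, block_size=1024):
    """Number of whole blocks required to hold file_size bytes."""
    return -(-file_size // block_size)   # ceiling division


def slack_bytes(file_size, block_size=1024):
    """Bytes allocated to the file but not used by its data."""
    return blocks_needed(file_size, block_size) * block_size - file_size


for size in (1, 1024, 1025):
    print(size, blocks_needed(size), slack_bytes(size))
```

A 1025-byte file therefore needs two 1024-byte blocks and leaves 1023 bytes unused, which is why larger block sizes waste more space on small files.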

12 The actual representation of the set of free blocks generally takes one of several forms:
Firstly, there is the free bitmap. In this representation, each block is represented by a single bit, which is 1 if the block is free and 0 if allocated.
The second representation is a free list, normally implemented as a linked list. The system needs to keep only a single pointer to the head of the list.
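
A free bitmap can be sketched in a few lines. The class below is illustrative only (the name `FreeBitmap` and the first-fit search are assumptions, not a description of any particular file system); it follows the convention above that a 1 bit means free and a 0 bit means allocated.

```python
class FreeBitmap:
    """One bit per disk block: 1 = free, 0 = allocated."""

    def __init__(self, n_blocks):
        self.n_blocks = n_blocks
        # Eight blocks per byte; every block starts out free.
        self.bits = bytearray([0xFF] * ((n_blocks + 7) // 8))

    def is_free(self, block):
        return (self.bits[block // 8] >> (block % 8)) & 1 == 1

    def allocate(self):
        """Find the first free block, mark it allocated, and return its number."""
        for block in range(self.n_blocks):
            if self.is_free(block):
                self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF
                return block
        raise OSError("no free blocks")

    def release(self, block):
        """Mark a block as free again."""
        self.bits[block // 8] |= 1 << (block % 8)

    def free_count(self):
        return sum(self.is_free(b) for b in range(self.n_blocks))


bitmap = FreeBitmap(10)
first = bitmap.allocate()
second = bitmap.allocate()
print(first, second, bitmap.free_count())
```

The appeal of the bitmap is its size: a disk of 1 KB blocks needs only one bit of bookkeeping per kilobyte of storage.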

13 The third representation for free blocks is a simple list of free blocks. If there is at least one free block on the disk then the list can be stored in the free blocks themselves. However, there must be a way to locate the other list blocks if the entire list doesn’t fit into one block. One approach is to create a linked list of these list blocks, using the last pointer in each block to point to the next block in the list.

14 Treatment of Devices and Files
As far as the user is concerned, all sources of input and output in a Unix system are represented as files. Terminals, disk drives, files, and communication mechanisms such as pipes and sockets all look alike to the systems programmer and are treated in the same way.
[Figure: layered I/O structure, top to bottom: Users; System Call Interface; I/O Subsystem; Device Driver Interface; Drivers; Terminal, Disk, Network]

15 File Management
File naming
- MS-DOS: up to 8-character name + dot + 3-character extension
- UNIX: typical maximum length = 14, but Linux = 255; no structure required; any character except / or the null character is OK, but avoid >, *, ? because they have special meaning to the shell
- Windows: up to 255 characters, can have spaces; also generates an MS-DOS filename
[Figure: Two Level Directory System. The master directory holds a list of user names (ADAMS, JONES, SMITH); the user directory for ADAMS holds the entry PROG1, which refers to the file PROG1.]

16 File Management
[Figure: hierarchical directory tree under ROOT; labels include system, cprogs, edit, spool, src, include, work and the files prog1.c, prog2.c, prog1.e, file.dat]
A directory normally holds the following information:
- filename
- file type
- file attributes
- location on disk
- access rights
- file size
- date information

17 File Management
Clusters
[Figure: disk space as an array of clusters, showing an allocated file and the unused portion of a cluster]
Cluster sizes range from 512 bytes to 64 KB.
LBA addressing uses a 32-bit address for each cluster, giving 2^32 = 4,294,967,296 addresses.
At 512 bytes per cluster: 4,294,967,296 x 512 = 2,199,023,255,552 bytes = 2 terabytes.
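
The capacity arithmetic above can be checked directly. In this short sketch the function name `max_capacity` is invented, and "terabyte" is taken to mean 2^40 bytes, which is the reading under which the figure of 2 terabytes comes out exactly.

```python
def max_capacity(address_bits, cluster_size):
    """Largest addressable volume in bytes for a given address width and cluster size."""
    return (2 ** address_bits) * cluster_size


TERABYTE = 2 ** 40   # binary terabyte, assumed here

print(2 ** 32)                              # 4294967296 addresses
print(max_capacity(32, 512))                # 2199023255552 bytes
print(max_capacity(32, 512) // TERABYTE)    # 2
```

The same function shows why larger clusters raise the limit: with 64 KB clusters, the same 32-bit address reaches 2^48 bytes.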

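18 File Management
Space allocation: chained clusters
[Figure: typical cluster allocation of several files. Directory entries point to the start of File A and File B; the clusters of files A, B and C are interleaved on disk (A A B A A C C C B B), each chain ends with an end-of-file marker, and unallocated clusters are marked as free.]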

19 File Management (FAT Table)
Directory entry:
ORDERS | DAT | no attribs | 9/12/00 | 11:23:44 | 40 | 11,230
- Field 1: filename
- Field 2: extension, e.g. .txt, .dat
- Field 3: attributes, e.g. hidden, read-only, directory
- Field 4: date last modified
- Field 5: time last modified
- Field 6: starting cluster
- Field 7: size in bytes
File Allocation Table (FAT):
Entry#  Value
...     ...
39      EOF
40      41
41      42
42      44
43      Bad
44      102
...     ...
102     103
103     EOF
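
Reading a file from a FAT volume means starting at the cluster named in the directory entry and following the table until an end-of-file marker. The sketch below does this for the table on this slide, read as entry 40 holding 41, entry 41 holding 42, and so on; the dictionary representation, the marker strings and the function name are illustrative assumptions.

```python
EOF = "EOF"   # end-of-chain marker
BAD = "Bad"   # cluster marked unusable

# FAT entries from the slide: entry number -> value.
fat = {39: EOF, 40: 41, 41: 42, 42: 44, 43: BAD, 44: 102, 102: 103, 103: EOF}


def cluster_chain(fat, start):
    """Follow the FAT from the starting cluster to the end-of-file marker."""
    chain = []
    cluster = start
    while cluster != EOF:
        if cluster == BAD or cluster in chain:
            raise ValueError("corrupt chain at cluster %r" % (cluster,))
        chain.append(cluster)
        cluster = fat[cluster]
    return chain


# ORDERS.DAT starts at cluster 40 according to its directory entry.
print(cluster_chain(fat, 40))   # [40, 41, 42, 44, 102, 103]
```

The chain skips cluster 43, which is marked bad, and jumps from 44 to 102, which illustrates that a file's clusters need not be contiguous.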

20 Current Windows File Systems
HPFS (High Performance File System) is used by OS/2 and is supported by Windows NT. It provides better performance than FAT on larger disk volumes and supports long file names.
NTFS (New Technology File System) is the standard file system of Windows NT, including its later versions Windows 2000, Windows XP, Windows Server 2003, Windows Server 2008, Windows Vista, and Windows 7. NTFS supersedes the FAT file system as the preferred file system for Microsoft’s Windows operating systems.
NTFS has several improvements over FAT and HPFS: it supports long file names including Unicode filenames, large volumes, data security, and universal file sharing.
Formatting a volume with the NTFS file system results in the creation of several system files and the Master File Table (MFT), which contains information about all the files and folders on the NTFS volume.

21 Master File Table
Logically, the disk consists of allocation units called clusters.
- A cluster is a power-of-two multiple of the physical disk block size.
- The cluster size is set when the disk is formatted.
- The free list is a bitmap, each of whose bits describes one cluster.
- Clusters on the disk are numbered from zero to the maximum number of clusters minus one.
- These numbers are called logical cluster numbers (LCNs) and are used to name blocks (clusters) on disk.

22 MFT
- Standard information: this attribute includes the information that was standard in the MS-DOS world (read/write permissions, creation time, last modification time) and a count of how many directories point to this file (the hard link count).
- File Name: this attribute describes the file's name in the Unicode character set.
- Security Descriptor: this attribute lists which user owns the file and which users can access it (and how they can access it).
- Data: this attribute either contains the actual file data, in the case of a small file, or points to the data.

23 MFT
When dealing with large data, the Data attribute contains pointers to the data. These are actually pointers to sequences of logical clusters on the disk. Each sequence is identified by three parts:
- the starting cluster in the file, called the virtual cluster number (VCN),
- the starting logical cluster number (LCN) of the sequence on disk,
- the length, counted as the number of clusters.
The run of clusters is called an extent.
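
The three-part description of an extent is enough to translate a position inside a file into a position on disk. The sketch below shows that lookup; the `Extent` type, the function name and the sample cluster numbers are all invented for illustration and do not reflect the on-disk NTFS encoding.

```python
from typing import NamedTuple


class Extent(NamedTuple):
    vcn: int      # first virtual cluster number (position within the file)
    lcn: int      # first logical cluster number (position on disk)
    length: int   # number of clusters in the run


def vcn_to_lcn(extents, vcn):
    """Translate a cluster position in the file to a cluster number on disk."""
    for extent in extents:
        if extent.vcn <= vcn < extent.vcn + extent.length:
            return extent.lcn + (vcn - extent.vcn)
    raise ValueError("VCN %d is not mapped by any extent" % vcn)


# A hypothetical 7-cluster file stored as two runs on disk.
runs = [Extent(vcn=0, lcn=1000, length=4), Extent(vcn=4, lcn=2500, length=3)]
print(vcn_to_lcn(runs, 3))   # 1003
print(vcn_to_lcn(runs, 5))   # 2501
```

Compared with a chain of per-cluster pointers, a file stored in a few long runs needs only a few such triples, however many clusters it occupies.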

24 Unix File Systems
- boot block: used to boot the operating system.
- super block: its main function is to tell the file system how big the various pieces of the file system are. The super block contains the following information to keep track of the entire file system:
  - Size of the file system
  - Number of free blocks on the system
  - A list of free blocks
  - Index to the next free block on the list
  - Size of the inode list
  - Number of free inodes
  - A list of free inodes
  - Index to the next free inode on the list
  - Lock fields for the free block and free inode lists
  - Flag to indicate modification of the super block
- i-nodes, followed by the blocks available for storage.
Note that the free space is maintained as a linked list of available blocks.

25 Inodes

26 Inode Pointer Structure


28 END

29 Distributed Link Tracking: maintains the integrity of shortcuts to files as well as OLE links within compound documents.
Sparse Files: allow programs to create very large files but consume disk space only as needed.

30 Encryption: the Encrypting File System (EFS) provides the core file encryption technology.
Disk Quotas: can be used to monitor and limit disk-space use.
Reparse Points: similar to Windows shortcuts and Unix symbolic links. For example, a reparse point would allow a folder such as C:\DVD to point to E:, the actual DVD drive.
Volume Mount Points: suppose you already have one hard disk (Drive 1) mapped as C and you don't want to map the second disk (Drive 2) as D. You can get around this problem by adding a mount point to the directory structure of Drive 1 that references Drive 2.


32 Shared and Exclusive Access
If a file is already open and another process wants access to it, the operating system has to decide whether to allow this or block it.
- In practice both cases may be desirable.
- For instance, if both processes are only reading the file, that is fine; however, if both processes want to write to the file, it may lead to inconsistent data.
- Consequently most file systems allow for both.
Two methods of requesting exclusive access are:
- The system call to open the file is passed a flag to say it is to be opened exclusively; if another process wants to access the file, it will have to wait.
- A system call locks the file or parts of it.
  - The difference between locking a file and locking an area of memory is that processes declare when they intend to write to a file.

33 Access Patterns
More often than not a process expects to open a file and begin reading and writing at the beginning. Each subsequent read or write continues where the last one left off. This type of sequential access requires that the operating system keeps track of the current location.
However, there are times when random access is required. This feature is sometimes provided through a rewind or seek operation. Both are available in the C programming language (rewind and fseek).
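
The difference between sequential and random access is easy to see with a ten-byte file. The sketch below uses Python's built-in file objects, whose `seek` and `tell` methods play the role of the seek and rewind operations mentioned above; the function name and file contents are invented for illustration.

```python
import os
import tempfile


def access_demo(path):
    """Show sequential reads followed by seek-based random access."""
    with open(path, "wb") as f:
        f.write(b"ABCDEFGHIJ")

    with open(path, "rb") as f:
        first = f.read(3)       # starts at the beginning of the file
        second = f.read(3)      # continues where the last read left off
        position = f.tell()     # the current location kept for this open file
        f.seek(8)               # random access: jump straight to byte offset 8
        tail = f.read(2)
        f.seek(0)               # "rewind" to the beginning
        again = f.read(1)
    return first, second, position, tail, again


with tempfile.TemporaryDirectory() as scratch:
    print(access_demo(os.path.join(scratch, "letters.bin")))
    # (b'ABC', b'DEF', 6, b'IJ', b'A')
```

The program never tells the system where the second read should start; the current position is remembered on its behalf, which is exactly the bookkeeping that sequential access requires.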

34 Uniform treatment of devices and files

35 File Management
[Figure: a DIRECTORY with entries for File A, File B and File C, showing the disk block numbers allocated to each file]

