
1 UNIX Internals – The New Frontiers Chapters 8 & 9 File Systems.


1 1 UNIX Internals – The New Frontiers Chapters 8 & 9 File Systems

2 2 Contents u The User Interface to Files u File System u File System Framework u The Vnode/VFS Architecture u Implementation Overview u File-System-Dependent Objects u Mounting a File System u Operations on Files u The System V File System(s5fs) u S5fs Kernel

3 3 8.2 The User Interface u Files, directories, file descriptors, file systems u Files & Directories u File: logically a container for data u A hierarchical, tree-structured name space u Pathname: all the components in the path from the root to the node, separated by “/” u “.” & “..” u Link: a directory entry for a file.

4 4 Directory tree

5 5 Operations on directories u dirp = opendir(const char *filename); u direntp = readdir(dirp); u rewinddir(dirp); u status = closedir(dirp); u struct dirent { ino_t d_ino; char d_name[NAME_MAX + 1]; };
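
The directory calls on this slide can be combined into a short loop. The following is a minimal sketch, assuming a POSIX system; the function name count_dir_entries is illustrative and does not come from the slides.

```c
#include <dirent.h>
#include <stdio.h>

/* Print the inode number and name of every entry in `path`.
 * Returns the number of entries seen, or -1 if the directory
 * could not be opened. */
int count_dir_entries(const char *path)
{
    DIR *dirp = opendir(path);
    if (dirp == NULL)
        return -1;

    int count = 0;
    struct dirent *direntp;
    /* readdir() returns NULL at the end of the directory. */
    while ((direntp = readdir(dirp)) != NULL) {
        printf("%lu %s\n", (unsigned long)direntp->d_ino, direntp->d_name);
        count++;
    }
    closedir(dirp);
    return count;
}
```

Calling count_dir_entries(".") lists the current directory; on most systems the entries "." and ".." are included in the output.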

6 6 File Attributes u Kept in the inode: index node u File attributes: u File type u Number of hard links u File size u Device ID u Inode number u User and Group Ids of the owner of the file. u Timestamps u Permissions and mode flags

7 7 Permissions and mode flags u Owner, group, others (3 x 3 bits) u Read, write, execute (3 bits) u Mode flags - apply to executable files - suid, sgid – to set the user’s effective UID (or GID) to that of the owner (or group) of the file, - sticky – to retain the file in the swap area

8 8 System calls u link, unlink – to create and delete hard links, u utimes – to change the access and modify timestamps, u chown – to change the owner UID and GID, u chmod – to change permissions and mode flags.

9 9 File Descriptors u fd = open (path, oflag, mode); u fd is a per-process object.

10 10 File descriptors

11 11 File I/O u Random and sequential access u lseek – random access u nread = read(fd, buf, count); u Write has similar semantics u Operations are serialized u In append mode offset pointer set to the end of the file
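
Random access on this slide amounts to moving the offset pointer with lseek and then calling read. A minimal sketch, assuming POSIX; read_at is an illustrative name, not part of the slides.

```c
#include <sys/types.h>
#include <unistd.h>

/* Read up to `count` bytes starting at absolute byte `offset`.
 * Returns the number of bytes read (0 at end of file), or -1 on error. */
ssize_t read_at(int fd, off_t offset, void *buf, size_t count)
{
    /* SEEK_SET positions the offset pointer relative to the file start. */
    if (lseek(fd, offset, SEEK_SET) == (off_t)-1)
        return -1;
    return read(fd, buf, count);
}
```

Because the offset lives in the open file object, the seek and the read are two separate steps; another process or thread sharing the same open file object could move the offset between them.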

12 12 Scatter-Gather I/O u nbytes = writev(fd, iov, iovcnt);
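
As an illustration of the writev call above, the sketch below gathers two separate buffers into a single write. It assumes POSIX <sys/uio.h>; the wrapper name write_two is hypothetical.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Write two separate strings with one writev() call.
 * Returns the total number of bytes written, or -1 on error. */
ssize_t write_two(int fd, const char *first, const char *second)
{
    struct iovec iov[2];

    /* Each iovec element describes one buffer: base address and length. */
    iov[0].iov_base = (void *)first;
    iov[0].iov_len  = strlen(first);
    iov[1].iov_base = (void *)second;
    iov[1].iov_len  = strlen(second);

    return writev(fd, iov, 2);   /* iovcnt = 2 */
}
```

The benefit is that the caller does not have to copy both pieces into one contiguous buffer, and the data goes out in a single system call.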

13 13 File Locking u Individual read and write calls are atomic. u Advisory locks: protect only among cooperating processes; flock() in 4BSD. u Mandatory locks: enforced by the kernel; in SVR3 they must first be enabled on the file through chmod. u SVR4: r/w locks. u C library function lockf

14 14 8.3 File systems u Mount-on - a directory is covered by the mounted file system. - mount table (original) & vfs list (modern) u Restrictions - a file cannot span file systems, - each file system must reside on a single logical disk

15 15

16 16 Logical Disks u A logical disk is a storage abstraction that the kernel sees as a linear sequence of fixed sized, randomly accessible blocks. u newfs, mkfs, u Traditional: partition – physical storage of a file system u Modern configurations: u Volume (several disks combined), u Disk mirroring u Stripe sets u RAID(Redundant Array of Inexpensive Disks)

17 17 Special files u Generalization to include all kinds of I/O-related objects, such as directories, symbolic links, hardware devices (disks, terminals, printers), pseudodevices such as the system memory, and communications abstractions such as pipes and sockets. u Problems with hard links – may not span file systems, links to directories can be created by the superuser only, ownership problems.

18 18 Special files u Symbolic links – special file that points to another file (linked-to file); the data portion of the file contains the pathname of the linked-to file; may be stored in the I-node of the symbolic link ( more on this in Practical UNIX Programming pp.90-96); u Pipes – created by pipe system call, deleted by the kernel automatically u FIFOs - created by mknod system call, must be explicitly deleted;

19 19 8.5 File System Framework u Traditional UNIX cannot support more than one type of file system. u The new developments (DOS, file sharing, RFS, NFS) require the framework to change. u AT&T: file system switch u Sun Microsystems: vnode/vfs u DEC: gnode u SVR4: (AT&T + vnode/vfs + NFS) -> de facto standard

20 20 8.6 The Vnode/Vfs Architecture u Objectives u Support several file system types simultaneously. u Different disk partitions may contain different types of file systems. u Support for sharing files over a network. u Vendors should be able to create their own file system types and add them to the kernel.

21 21 Lessons from Device I/O u Devices: block & character u Character device switch: struct cdevsw { int (*d_open)(); int (*d_close)(); int (*d_read)(); int (*d_write)(); } cdevsw[ ]; u Major device number: as the index

22 22 read system call (in traditional UNIX) 1) Use the file descriptor to get to the open file object; 2) Check the entry to see if the file is open for read; 3) Get the pointer to the in-core inode from this entry; 4) Lock the inode so as to serialize access to the file; 5) Check the inode mode field and find that the file is a character device file; 6) Use the major device number to index into a table of character devices and obtain the cdevsw entry for this device; 7) From the cdevsw, obtain the pointer to the d_read routine for this device; 8) Invoke the d_read operation to perform the device-specific processing of the read request; 9) Unlock the inode and return to the user.
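
Steps 6 to 8 above, indexing the switch table by major number and calling through d_read, can be modeled in user space. This is only a toy sketch: the two "devices", the argument lists, and the names chr_read and NCHRDEV are assumptions made for the example, not kernel code.

```c
/* Toy character device switch. Only cdevsw and the d_* member
 * names follow the slide; everything else is hypothetical. */
struct cdevsw {
    int (*d_open)(int minor);
    int (*d_close)(int minor);
    int (*d_read)(int minor, char *buf, int count);
    int (*d_write)(int minor, const char *buf, int count);
};

/* A "null" device: reads return end-of-file, writes are discarded. */
static int null_open(int minor)  { (void)minor; return 0; }
static int null_close(int minor) { (void)minor; return 0; }
static int null_read(int minor, char *buf, int count)
{
    (void)minor; (void)buf; (void)count;
    return 0;
}
static int null_write(int minor, const char *buf, int count)
{
    (void)minor; (void)buf;
    return count;
}

/* A "zero" device: reads fill the buffer with zero bytes. */
static int zero_read(int minor, char *buf, int count)
{
    (void)minor;
    for (int i = 0; i < count; i++)
        buf[i] = 0;
    return count;
}

static struct cdevsw cdevsw[] = {
    { null_open, null_close, null_read, null_write },   /* major 0 */
    { null_open, null_close, zero_read, null_write },   /* major 1 */
};
#define NCHRDEV ((int)(sizeof cdevsw / sizeof cdevsw[0]))

/* Device-independent read path: the major number selects the driver. */
int chr_read(int major, int minor, char *buf, int count)
{
    if (major < 0 || major >= NCHRDEV)
        return -1;
    return (*cdevsw[major].d_read)(minor, buf, count);
}
```

The caller never names a driver function directly, which is the property the vnode/vfs design later generalizes from devices to file systems.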

23 23 Lessons from Device I/O u It is necessary to separate the file subsystem code into file-system-independent code and file-system-dependent code u The interface between these two parts is defined by a set of generic functions that are called by the file-system-independent code

24 24 Object Oriented Design

25 25 Overview of the Vnode/Vfs Interface u Vnode represents a file in the UNIX kernel. u Vfs represents a file system

26 26

27 27 Base class data and operations pointers u v_data: inode (s5fs), rnode (NFS), tmpnode (tmpfs) u v_op: vnodeops Example: to close the file associated with the vnode u #define VOP_CLOSE(vp, …) (*((vp)->v_op->vop_close))(vp, …)
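
The macro dispatch on this slide can be tried out with a reduced vnode. The sketch below keeps only v_op and v_data and uses a made-up "toy" file system; the simplified one-argument macros, toy_inode, and toy_vnode_init are assumptions for illustration, not the SVR4 definitions.

```c
struct vnode;

/* Operations vector: one function pointer per generic operation. */
struct vnodeops {
    int (*vop_open)(struct vnode *vp);
    int (*vop_close)(struct vnode *vp);
};

/* Reduced vnode: only the two "base class" pointers from the slide. */
struct vnode {
    struct vnodeops *v_op;   /* file-system-dependent operations */
    void *v_data;            /* file-system-dependent private data */
};

/* File-system-independent code calls through these macros only. */
#define VOP_OPEN(vp)  ((*(vp)->v_op->vop_open)(vp))
#define VOP_CLOSE(vp) ((*(vp)->v_op->vop_close)(vp))

/* Hypothetical private data and operations of a toy file system. */
struct toy_inode { int open_count; };

static int toy_open(struct vnode *vp)
{
    ((struct toy_inode *)vp->v_data)->open_count++;
    return 0;
}

static int toy_close(struct vnode *vp)
{
    ((struct toy_inode *)vp->v_data)->open_count--;
    return 0;
}

static struct vnodeops toy_vnodeops = { toy_open, toy_close };

/* Bind a vnode to the toy file system and its private inode. */
void toy_vnode_init(struct vnode *vp, struct toy_inode *ip)
{
    ip->open_count = 0;
    vp->v_op = &toy_vnodeops;
    vp->v_data = ip;
}
```

After toy_vnode_init(&vn, &ino), the expression VOP_CLOSE(&vn) reaches toy_close without the caller knowing which file system owns the vnode.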

28 28 VFS base class

29 29 8.7 Implementation Overview u Objectives u Each operation must be carried out on behalf of the current process. u Certain operations may need to serialize access to the file. u The interface must be stateless and reentrant. u FS implementation should be allowed to use global resources, such as buffer cache. u The interface should be usable by the server side u The use of fixed-size static tables must be avoided.

30 30 Vnodes and Open Files u The vnode is the fundamental abstraction that represents an active file in the kernel. u access to a vnode: u by a file descriptor u by file-system-dependent data structures

31 31 Data structures Reference count

32 32 The Vnode struct vnode { u_short v_flag; u_short v_count; struct vfs *v_vfsmountedhere; struct vnodeops *v_op; struct vfs *v_vfsp; … }; // p242

33 33 Vnode Reference Count u It determines how long the vnode must remain in the kernel. u Reference versus lock: u Acquire a reference: u Open a file u A process holds a reference to its current directory. u When a new file system is mounted u Pathname traversal routine u An unlinked file is physically deleted only when its reference count becomes zero.
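
The hold and release pattern behind the reference count can be sketched as follows. The functions are modeled loosely on the idea of SVR4's VN_HOLD and VN_RELE, but the reduced struct and the reclaimed flag are assumptions made for this example.

```c
/* Reduced vnode: only the reference count matters here. */
struct vnode {
    unsigned short v_count;   /* number of active references */
    int reclaimed;            /* hypothetical flag: set when no references remain */
};

/* Take a reference, e.g. on open or when a directory becomes
 * some process's current directory. */
void vn_hold(struct vnode *vp)
{
    vp->v_count++;
}

/* Drop a reference. When the last one goes away, the vnode no longer
 * has to stay in the kernel; a real kernel would call into the file
 * system at this point so it can clean up or recycle the object. */
void vn_rele(struct vnode *vp)
{
    if (--vp->v_count == 0)
        vp->reclaimed = 1;
}
```

A reference only keeps the object alive; it does not serialize access, which is why the slide distinguishes a reference from a lock.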

34 34 The Vfs Object u struct vfs { u struct vfs *vfs_next; u struct vfsops * vfs_op; u struct vnode *vfs_vnodecovered; u int vfs_fstype; u caddr_t vfs_data; u dev_t vfs_dev; u … u }; //p243

35 35

36 36 8.8 File-System-Dependent Objects u The Per-File Private Data u Vnode is an abstract object.

37 37 The vnodeops Vector struct vnodeops { int (*vop_open)(); int (*vop_close)(); … }; //p245 For ufs: struct vnodeops ufs_vnodeops = { ufs_open, ufs_close, … }; //p246

38 38

39 39 File-System-Dependent Parts of the Vfs Layer struct vfsops { int (*vfs_mount)(); int (*vfs_unmount)(); int (*vfs_root)(); int (*vfs_statvfs)(); int (*vfs_sync)(); … }; //p246

40 40

41 41 8.9 Mounting a File System u mount(spec, dir, flags, type, dataptr, datalen) //SVR4 u Virtual File System Switch - a global table containing one entry for each file system type. struct vfssw { char *vsw_name; int (*vsw_init)(); struct vfsops *vsw_vfsops; … } vfssw[];

42 42 mount Implementation u Adds the structure to the linked list headed by rootvfs. u Sets the vfs_op field to the vfsops vector specified in the switch entry. u Sets the vfs_vnodecovered field to point to the vnode of the mount point directory.

43 43 VFS_MOUNT processing u Verify permissions for the operation. u Allocate and initialize the private data object of the file system. u Store a pointer to it in the vfs_data field of the vfs object. u Access the root directory of the file system and initialize its vnode in memory.
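
The mount steps on the last two slides can be put together in a user-space sketch: find the type in the switch, build a vfs, let the file system initialize its private data, and link the result to the mount point and the rootvfs list. The structs are reduced, and toyfs, toy_mount, and toy_mount_fs are hypothetical names. Permission checks and root vnode setup are left out.

```c
#include <stdlib.h>
#include <string.h>

struct vfs;
struct vnode { struct vfs *v_vfsmountedhere; };

struct vfsops { int (*vfs_mount)(struct vfs *vfsp); };

struct vfs {
    struct vfs *vfs_next;
    struct vfsops *vfs_op;
    struct vnode *vfs_vnodecovered;
    void *vfs_data;
};

struct vfssw {
    const char *vsw_name;
    struct vfsops *vsw_vfsops;
};

/* Hypothetical file system type "toyfs". */
static int toy_mount(struct vfs *vfsp)
{
    static int toy_private;          /* stands in for per-file-system data */
    vfsp->vfs_data = &toy_private;
    return 0;
}
static struct vfsops toy_vfsops = { toy_mount };

static struct vfssw vfssw[] = { { "toyfs", &toy_vfsops } };
struct vfs *rootvfs = NULL;          /* head of the list of mounted vfs */

/* Mount a file system of `type` on `mount_point`.
 * Returns the new vfs, or NULL on failure. */
struct vfs *toy_mount_fs(const char *type, struct vnode *mount_point)
{
    if (mount_point->v_vfsmountedhere != NULL)
        return NULL;                 /* directory is already covered */

    for (size_t i = 0; i < sizeof vfssw / sizeof vfssw[0]; i++) {
        if (strcmp(vfssw[i].vsw_name, type) != 0)
            continue;

        struct vfs *vfsp = calloc(1, sizeof *vfsp);
        if (vfsp == NULL)
            return NULL;
        vfsp->vfs_op = vfssw[i].vsw_vfsops;       /* from the switch entry */
        vfsp->vfs_vnodecovered = mount_point;

        /* File-system-dependent part: fills in vfs_data. */
        if ((*vfsp->vfs_op->vfs_mount)(vfsp) != 0) {
            free(vfsp);
            return NULL;
        }

        /* Append to the list headed by rootvfs. */
        struct vfs **pp = &rootvfs;
        while (*pp != NULL)
            pp = &(*pp)->vfs_next;
        *pp = vfsp;

        mount_point->v_vfsmountedhere = vfsp;
        return vfsp;
    }
    return NULL;                     /* unknown file system type */
}
```

The two cross pointers, vfs_vnodecovered and v_vfsmountedhere, are what pathname traversal later uses to cross a mount point in either direction.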

44 44 8.10 Operations on Files Pathname Traversal lookuppn(): u_cdir 1. v_type is of a directory 2. “..” & system root – move on 3. “..” & a mounted system root – access the mount point 4. VOP_LOOKUP 5. Not found, last one - success, else – error ENOENT 6. A mount point - go to the mounted vfs root 7. A symbolic link – translate it and append 8. Release the directory 9. Go back to the top of the loop 10. Terminate, do not release the reference of the final vnode //p250
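
A stripped-down model of the traversal loop may help with steps 2, 3, 4, and 6. It works on an in-memory tree and is only a sketch under heavy assumptions: struct node, dir_lookup (standing in for VOP_LOOKUP), and toy_lookuppn are invented names, and symbolic links, permission checks, and reference counting are omitted.

```c
#include <stddef.h>
#include <string.h>

#define MAXKIDS 4

struct node {
    const char *name;
    struct node *parent;          /* ".." inside the same file system */
    struct node *kids[MAXKIDS];   /* directory entries */
    struct node *mounted_root;    /* set if a file system is mounted here */
    struct node *covered;         /* for a mounted root: the mount point */
};

/* Stand-in for VOP_LOOKUP: find one component in directory dp. */
static struct node *dir_lookup(struct node *dp, const char *name, size_t len)
{
    for (int i = 0; i < MAXKIDS && dp->kids[i] != NULL; i++) {
        const char *kid = dp->kids[i]->name;
        if (strlen(kid) == len && strncmp(kid, name, len) == 0)
            return dp->kids[i];
    }
    return NULL;
}

/* Resolve `path` one component at a time. Returns NULL if not found. */
struct node *toy_lookuppn(struct node *root, struct node *cwd, const char *path)
{
    struct node *cur = (*path == '/') ? root : cwd;

    while (*path != '\0') {
        while (*path == '/')
            path++;
        size_t len = strcspn(path, "/");
        if (len == 0)
            break;

        if (len == 2 && strncmp(path, "..", 2) == 0) {
            /* At a mounted root, go back to the mount point first. */
            while (cur->covered != NULL)
                cur = cur->covered;
            /* At the system root, ".." leaves us where we are. */
            if (cur->parent != NULL)
                cur = cur->parent;
        } else if (!(len == 1 && *path == '.')) {
            cur = dir_lookup(cur, path, len);
            if (cur == NULL)
                return NULL;                  /* ENOENT */
            /* A mount point: continue from the mounted root. */
            while (cur->mounted_root != NULL)
                cur = cur->mounted_root;
        }
        path += len;
    }
    return cur;
}
```

With a file system mounted on /usr, looking up "/usr" yields the root of the mounted file system rather than the covered directory, and ".." from that root leads back above the mount point.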

45 45 Opening a file fd = open(pathname, mode) 1. Allocate a descriptor 2. Allocate an open file object 3. Call lookuppn() 4. Check the vnode for permissions 5. Check for the operations 6. If the file does not exist: with O_CREAT, call VOP_CREATE; otherwise return ENOENT 7. VOP_OPEN 8. If O_TRUNC, VOP_SETATTR 9. Initialize 10. Return the index of the file descriptor //p252

46 46 Other topics u File I/O u File attributes u User credentials u Analysis u Drawbacks of the SVR4 Implementation u The 4.4 BSD Model

47 47 Chapter 9 File System Implementations

48 48 9.2 The System V File System (s5fs) u The layout of an s5fs partition: boot area (B), superblock (S), inode list, data blocks u Directories: u An s5fs directory is a special file containing a list of files and subdirectories.

49 49 Inodes u The inode contains administrative information, or metadata. u The inode list contains all the inodes. u On-disk inode - see Tab. 9-1 u In-core inodes have more fields

50 50 Inode Fields

51 51 di_mode Bit-fields

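52 52 Block array of inode – di_addr u Direct: 10 blocks, 10 KB u Single indirect: 256 blocks, 256 KB u Double indirect: 256*256 = 65,536 blocks, 64 MB u Triple indirect: 256*256*256 = 16M blocks, 16 GB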
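
The figures on this slide follow from 1 KB blocks and 256 block addresses per indirect block. Under those assumptions, the sketch below works out how many indirect blocks must be traversed to reach a given logical block; the function name s5_indirection_level and the two constants are illustrative.

```c
#define NDIRECT 10    /* direct block addresses held in di_addr */
#define NINDIR  256   /* block addresses per indirect block (assumed 1 KB blocks) */

/* Return how many indirect blocks lie between the inode and logical
 * block `lbn`: 0 = direct, 1 = single, 2 = double, 3 = triple indirect.
 * Returns -1 if `lbn` is beyond what di_addr can map. */
int s5_indirection_level(unsigned long lbn)
{
    unsigned long span = NINDIR;   /* blocks covered at the current level */

    if (lbn < NDIRECT)
        return 0;
    lbn -= NDIRECT;

    for (int level = 1; level <= 3; level++) {
        if (lbn < span)
            return level;
        lbn -= span;
        span *= NINDIR;            /* each level covers 256 times more */
    }
    return -1;
}
```

Logical blocks 0 to 9 are direct, 10 to 265 need one indirect block, and block 266 is the first to need two, which is where the per-access disk I/O cost starts to grow for large files.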

53 53 The superblock u Size in blocks of the file system u Size in blocks of the inode list u Number of free blocks and inodes u Free block list u Free inode list

54 54 Free block list

55 55 9.3 s5fs Kernel Organization u In-core Inodes u The vnode u Device ID u Inode number of the file u Flags for synchronization and cache management u Pointers to keep the inode on a free list u Pointers to keep the inode on a hash queue. u Block number of last block read

56 56 Allocating and Reclaiming Inodes u Inode table (LRU) containing the active inodes u When the reference count of a vnode reaches 0, the inode is reclaimed as free u iget() (allocating):

57 57 Inode lookup u s5lookup() u Checks the directory name lookup cache u On a cache miss, reads the directory one block at a time, searching the entries for the specified file name u If the file is in the directory, get the inode number and use iget() to locate the inode u If the inode is in the table, use it; otherwise allocate a new inode, initialize it, copy in the on-disk inode, put it on the hash queue, and initialize the vnode (v_op, v_data, vfs) u Return the pointer to the inode
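
The "in the table, or allocate a new one" step of iget() can be sketched with a small hash table keyed by device and inode number. This is a simplified user-space model: toy_iget, the hash function, and the field set are assumptions, and reading the on-disk inode, the LRU free list, and locking are all omitted.

```c
#include <stdlib.h>

#define NHASH 8

struct inode {
    struct inode *i_hnext;   /* next inode on the same hash queue */
    int i_dev;               /* device ID */
    unsigned i_number;       /* inode number on that device */
    int i_count;             /* reference count */
};

static struct inode *ihash[NHASH];

/* Return the in-core inode for (dev, ino), creating it on a miss.
 * Returns NULL only if memory cannot be allocated. */
struct inode *toy_iget(int dev, unsigned ino)
{
    unsigned h = ((unsigned)dev + ino) % NHASH;

    /* Hit: the inode is already in core, so take another reference. */
    for (struct inode *ip = ihash[h]; ip != NULL; ip = ip->i_hnext) {
        if (ip->i_dev == dev && ip->i_number == ino) {
            ip->i_count++;
            return ip;
        }
    }

    /* Miss: allocate, initialize, and put it on the hash queue.
     * A real implementation would read the on-disk inode here. */
    struct inode *ip = calloc(1, sizeof *ip);
    if (ip == NULL)
        return NULL;
    ip->i_dev = dev;
    ip->i_number = ino;
    ip->i_count = 1;
    ip->i_hnext = ihash[h];
    ihash[h] = ip;
    return ip;
}
```

Two lookups of the same (device, inode number) pair return the same object, which is what lets every open of a file share one in-core inode.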

58 58 File I/O (1) u Read (to a user buffer address) u fd -> the open file object, verify mode -> vnode -> get the rw-lock -> call s5read() u Offset -> block number & the offset -> uiomove() -> call copyout() u The page not in memory? Page fault -> the handler -> s5getpage() -> call bmap() u Logical to physical mapping, search the vnode’s page list; if not there, allocate a free page and call the disk driver to read the data from disk u Sleeps until the I/O completes. Before copying to user data space, verifies the user has access u s5read() returns, unlock, advance the offset, return the number of bytes read

59 59 File I/O (2) u Write: u Not immediately to disk u May increase the file size u May require the allocation of data blocks u Read the entire block, write relevant data, write back all the block

60 60 Allocating and reclaiming Inodes u When the reference count drops to 0.. u When a file becomes inactive…. u It is better to reuse inodes…………

61 61 Analysis of s5fs u Reliability concern : super block u Performance: u 2 disk I/Os u Blocks randomly located u Block size: 512(SVR2), 1024(SVR3) u Name: 14 characters u Inodes limit: 65535

62 62 The Berkeley Fast File System u Hard disk structure u On-disk organization - Blocks and fragments - Allocation policy u FFS functionality enhancements – long file names, - symbolic links, - other enhancements; u Analysis

63 63 Other file systems u Temporary file systems (RAM disk, mfs, tmpfs) u The Specfs File System u The /proc File System

64 64 Linux Virtual File System u Uniform file system interface to user processes u Represents any conceivable file system’s general feature and behavior u Assumes files are objects that share basic properties regardless of the target file system

65 65

66 66

67 67 Primary Objects in VFS u Superblock object u Represents a specific mounted file system u Inode object u Represents a specific file u Dentry object u Represents a specific directory entry u File object u Represents an open file associated with a process
