P.J.Braam/CMU -- 1 Linux Virtual File System Peter J. Braam.


1 P.J.Braam/CMU -- 1 Linux Virtual File System Peter J. Braam

2 P.J.Braam/CMU -- 2 Aims
Present the data structures in Linux VFS
Provide information about flow of control
Describe methods and invariants needed to implement a new file system
Illustrate with some examples

3 P.J.Braam/CMU -- 3 File access: History
BSD implemented VFS for NFS; aim: dispatch to different filesystems
VMS had elaborate filesystem
NT/Win95 have VFS type interfaces
Newer systems integrate VM with buffer cache.

4 P.J.Braam/CMU -- 4 Linux Filesystems
Media based
– ext2 - Linux native
– ufs - BSD
– fat - DOS FS
– vfat - win 95
– hpfs - OS/2
– minix - well….
– isofs - CDROM
– sysv - Sysv Unix
– hfs - Macintosh
– affs - Amiga Fast FS
– NTFS - NT’s FS
– adfs - Acorn-strongarm
Network
– nfs
– Coda
– AFS - Andrew FS
– smbfs - LanManager
– ncpfs - Novell
Special ones
– procfs - /proc
– umsdos - Unix in DOS
– userfs - redirector to user

5 P.J.Braam/CMU -- 5 Linux Filesystems (ctd)
Forthcoming:
– devfs - device file system
– DFS - DCE distributed FS
Varia:
– cfs - crypt filesystem
– cfs - cache filesystem
– ftpfs - ftp filesystem
– mailfs - mail filesystem
– pgfs - Postgres versioning file system
Linux serves (unrelated to the VFS!)
– NFS - user & kernel
– Coda
– AppleShare - netatalk/CAP
– SMB - samba
– NCP - Novell

6 P.J.Braam/CMU -- 6 Usefulness
"Linux is Obsolete" (Andrew Tanenbaum)

7 P.J.Braam/CMU -- 7 File access: Linux VFS
Multiple interfaces build up VFS:
– files
– dentries
– inodes
– superblock
– quota
VFS can do all caching & provides utility functions to FS
FS provides methods to VFS; many are optional

8 P.J.Braam/CMU -- 8 User level file access
Typical user level types and code:
– pathnames: "/myfile"
– file descriptors: fd = open("/myfile"…)
– attributes in struct stat: stat("/myfile", &mybuf), chmod, chown...
– offsets: write, read, lseek
– directory handles: DIR *dh = opendir("/mydir")
– directory entries: struct dirent *ent = readdir(dh)
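The user level calls listed on this slide can be tried out with a short POSIX program. The sketch below is only an illustration: the helper names (is_directory, count_dir_entries, read_at) are made up for this example and are not part of the talk.

```c
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Attributes: returns 1 if path names a directory, 0 if not, -1 on error. */
int is_directory(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) != 0)
        return -1;
    return S_ISDIR(st.st_mode) ? 1 : 0;
}

/* Directory handles and entries: counts the entries readdir returns,
 * or returns -1 if the directory cannot be opened. */
long count_dir_entries(const char *path)
{
    DIR *dh = opendir(path);
    long n = 0;

    if (dh == NULL)
        return -1;
    while (readdir(dh) != NULL)
        n++;
    closedir(dh);
    return n;
}

/* File descriptors and offsets: reads up to len bytes starting at byte
 * offset off. Returns the number of bytes read, or -1 on error. */
ssize_t read_at(const char *path, off_t off, void *buf, size_t len)
{
    int fd = open(path, O_RDONLY);
    ssize_t n;

    if (fd < 0)
        return -1;
    if (lseek(fd, off, SEEK_SET) == (off_t)-1) {
        close(fd);
        return -1;
    }
    n = read(fd, buf, len);
    close(fd);
    return n;
}
```

Every one of these calls enters the kernel with a pathname or a file descriptor; the rest of the talk is about what the VFS does with them.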

9 P.J.Braam/CMU -- 9 VFS
Manages kernel level file abstractions in one format for all file systems
Receives system call requests from user level (e.g. write, open, stat, link)
Interacts with a specific file system based on mount point traversal
Receives requests from other parts of the kernel, mostly from memory management

10 P.J.Braam/CMU -- 10 File system level
Individual File Systems
– responsible for managing file & directory data
– responsible for managing meta-data: timestamps, owners, protection etc
– translates data between
    particular FS data: e.g. disk data, NFS data, Coda/AFS data
    VFS data: attributes etc in standard format
– e.g. nfs_getattr(….) returns attributes in VFS format, acquires attributes in NFS format to do so.

11 P.J.Braam/CMU -- 11 Anatomy of stat system call
    sys_stat(path, buf)
    {
        dentry = namei(path);                        /* establish VFS data */
        if ( dentry == NULL )
            return -ENOENT;
        inode = dentry->d_inode;
        rc = inode->i_op->i_permission(inode);       /* call into inode layer of filesystem */
        if ( rc )
            rc = -EPERM;
        else
            rc = inode->i_op->i_getattr(inode, buf);
        dput(dentry);                                /* release the dentry on every path */
        return rc;
    }
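The control flow of the stat call can be exercised outside the kernel with stand-in types. The following is a user-space sketch, not kernel code: struct attr, toy_namei, toy_dput, toy_stat and the toyfs_ methods are all hypothetical names, and the permission rule is invented for the example. It shows the two steps from the slide (establish the VFS data, then call into the filesystem's inode methods) and takes care to drop the dentry reference on every return path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct attr { long size; int mode; };   /* stand-in for struct stat */

struct inode;
struct inode_ops {                      /* methods supplied by the filesystem */
    int (*permission)(struct inode *);
    int (*getattr)(struct inode *, struct attr *);
};
struct inode  { const struct inode_ops *i_op; long size; int mode; };
struct dentry { struct inode *d_inode; int d_count; };

/* Toy namei: the "lookup" is already done by the caller; we only model
 * the reference that namei hands back with the dentry. */
struct dentry *toy_namei(struct dentry *d)
{
    if (d != NULL)
        d->d_count++;
    return d;
}

void toy_dput(struct dentry *d)
{
    assert(d->d_count > 0);
    d->d_count--;
}

int toy_stat(struct dentry *found, struct attr *buf)
{
    struct dentry *dentry = toy_namei(found);   /* establish VFS data */
    struct inode *inode;
    int rc;

    if (dentry == NULL)
        return -ENOENT;
    inode = dentry->d_inode;
    if (inode->i_op->permission(inode) != 0)    /* call into the inode layer */
        rc = -EPERM;
    else
        rc = inode->i_op->getattr(inode, buf);
    toy_dput(dentry);                           /* reference dropped on all paths */
    return rc;
}

/* A toy filesystem: readable if the owner read bit is set (invented rule). */
int toyfs_permission(struct inode *inode)
{
    return (inode->mode & 0400) ? 0 : -1;
}

int toyfs_getattr(struct inode *inode, struct attr *buf)
{
    buf->size = inode->size;
    buf->mode = inode->mode;
    return 0;
}

const struct inode_ops toyfs_iops = { toyfs_permission, toyfs_getattr };
```

Routing both outcomes through a single toy_dput keeps the use count balanced, which is the invariant a real filesystem author has to preserve as well.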

12 P.J.Braam/CMU -- 12 Anatomy of fstatfs system call
    sys_fstatfs(fd, buf)    /* for things like "df" */
    {
        file = fget(fd);                                /* translate fd to VFS data structure */
        if ( file == NULL )
            return -EBADF;
        superb = file->f_dentry->d_inode->i_sb;
        rc = superb->sb_op->sb_statfs(superb, buf);     /* call into superblock layer of filesystem */
        fput(file);
        return rc;
    }

13 P.J.Braam/CMU -- 13 Data structures
VFS data structures for:
– VFS handle to the file: inode (BSD: vnode)
– User instantiated file handle: file (BSD: file)
– The whole filesystem: superblock (BSD: vfs)
– A name to inode translation: dentry

14 P.J.Braam/CMU -- 14 Shorthand method notation
super block methods: sss_methodname
inode methods: iii_methodname
dentry methods: ddd_methodname
file methods: fff_methodname
instead of: inode -> i_op -> lookup
we write: iii_lookup
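The shorthand stands for a call through a table of function pointers that the filesystem supplies. A minimal user-space sketch of that dispatch follows; the struct layouts, the toyfs_ names and the fallback permission rule are assumptions made for illustration and are much smaller than the real kernel structures. It also shows how an optional method can be left NULL so the generic layer substitutes a default.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct inode;
struct inode_operations {
    struct inode *(*lookup)(struct inode *dir, const char *name);
    int (*permission)(struct inode *inode);    /* optional: may be NULL */
};
struct inode { const struct inode_operations *i_op; int mode; };

/* iii_lookup in the slide's shorthand: inode -> i_op -> lookup */
struct inode *iii_lookup(struct inode *dir, const char *name)
{
    assert(dir != NULL && dir->i_op != NULL);
    if (dir->i_op->lookup == NULL)
        return NULL;
    return dir->i_op->lookup(dir, name);
}

/* iii_permission: use the filesystem's method when present, otherwise
 * fall back to a generic check (a made-up rule for this sketch). */
int iii_permission(struct inode *inode)
{
    if (inode->i_op->permission != NULL)
        return inode->i_op->permission(inode);
    return (inode->mode & 0400) ? 0 : -1;
}

/* A toy filesystem holding a single file called "hello". */
static struct inode toyfs_hello;

static struct inode *toyfs_lookup(struct inode *dir, const char *name)
{
    (void)dir;
    return strcmp(name, "hello") == 0 ? &toyfs_hello : NULL;
}

/* permission is left NULL: this filesystem relies on the default. */
static const struct inode_operations toyfs_iops = { toyfs_lookup, NULL };
static struct inode toyfs_hello = { &toyfs_iops, 0644 };
static struct inode toyfs_root  = { &toyfs_iops, 0755 };
```

With these definitions, iii_lookup(&toyfs_root, "hello") reaches toyfs_lookup through the table, and iii_permission falls back to the generic rule because the table entry is empty.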

15 P.J.Braam/CMU -- 15 namei
[Diagram: flow of control between VFS and FS during a name lookup]
VFS:
    struct dentry *namei(parent, name) { if (dentry = d_lookup(parent,name)) … else … }
    struct inode *iget(ino, dev) { /* try cache else.. */ }
FS methods called from the VFS side:
– ddd_hash(parent, name)
– ddd_revalidate(dentry)
– iii_lookup(parent, name)
– sss_read_inode(…)

16 P.J.Braam/CMU -- 16 Superblocks
Handle metadata only (attributes etc)
Responsible for retrieving and storing metadata from the FS media or peers
Struct superblocks hold things like:
– device, blocksize, dirty flags, list of dirty inodes
– super operations
– wait queue
– pointer to the root inode of this FS

17 P.J.Braam/CMU -- 17 Super Operations (sss_)
Ops on Inodes:
– read_inode
– put_inode
– write_inode
– delete_inode
– clear_inode
– notify_change
Superblock manips:
– read_super (mount)
– put_super (unmount)
– write_super (unmount)
– statfs (attributes)

18 P.J.Braam/CMU -- 18 Inodes
Inodes are the VFS abstraction for the file
Inode has operations (iii_methods)
VFS maintains an inode cache, NOT the individual FS’s (compare NT, BSD etc)
Inodes contain an FS specific area where:
– ext2 stores disk block numbers etc
– AFS would store the FID
Extraordinary inode ops are good for dealing with stale NFS file handles etc.

19 P.J.Braam/CMU -- 19 What’s inside an inode - 1
Caching:
– list_head i_hash
– list_head i_list
– list_head i_dentry
– int i_count
Identifies file:
– long i_ino
– int i_dev
Usual stuff:
– {m,a,c}time
– {u,g}id
– mode
– size
– n_link

20 P.J.Braam/CMU -- 20 What’s inside an inode - 2
Which FS:
– superblock i_sb
– inode_ops i_op
For mmap, networking, waiting:
– wait objects, semaphore lock
– vm_area_struct
– pipe/socket info
– page information
FS specific info (blockno’s, fids etc):
– union { ext2fs_inode_info i_ext2 nfs_inode_info i_nfs coda_inode_info i_coda..} u
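The FS specific area is a union embedded in the generic inode, so every filesystem gets private storage without a separate allocation. A reduced sketch of the idea, with hypothetical names and field layouts (the toy_ structs and the fstype tag are invented here and do not match the kernel's definitions):

```c
#include <assert.h>

/* Hypothetical per-filesystem private data. */
struct toy_ext2_info { unsigned long blocks[4]; };             /* disk block numbers */
struct toy_coda_info { unsigned long volume, vnode, unique; };  /* FID-like triple */

enum toy_fstype { TOY_EXT2, TOY_CODA };

struct toy_inode_u {
    long i_ino;                 /* generic part, same for every FS */
    enum toy_fstype fstype;     /* stands in for "which FS" (i_sb, i_op) */
    union {                     /* FS specific part: one member is live */
        struct toy_ext2_info ext2_i;
        struct toy_coda_info coda_i;
    } u;
};

/* Only code belonging to the owning filesystem should touch its member. */
unsigned long toy_first_block(const struct toy_inode_u *inode)
{
    assert(inode->fstype == TOY_EXT2);
    return inode->u.ext2_i.blocks[0];
}
```

The union is as large as its largest member, which is the price of keeping one inode type for all filesystems.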

21 P.J.Braam/CMU -- 21 Inode state
Inode can be on one or two lists:
– (hash & in_use) or (hash & dirty) or unused
– inode has a use count i_count
Transitions
– unused -> hash: iget calls sss_read_inode
– dirty -> in_use: sss_write_inode
– hash -> unused: call on sss_clear_inode, but if i_nlink = 0: iput calls sss_delete_inode when i_count falls to 0
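These transitions can be modelled as a small state machine. The sketch below is a simplified user-space model under stated assumptions: a single inode, counters in place of real I/O, and an inode whose use count drops to 0 while it still has links simply stays cached (pruning via sss_clear_inode is left out). All toy_ and ST_ names are invented for the example.

```c
#include <assert.h>

enum toy_state { ST_UNUSED, ST_IN_USE, ST_DIRTY };

struct toy_inode {
    enum toy_state state;
    int i_count;                    /* use count */
    int i_nlink;                    /* links to the file */
    int reads, writes, deletes;     /* how often each FS method ran */
};

/* Stand-ins for the filesystem's super operations. */
static void sss_read_inode(struct toy_inode *i)   { i->reads++; }
static void sss_write_inode(struct toy_inode *i)  { i->writes++; }
static void sss_delete_inode(struct toy_inode *i) { i->deletes++; }

/* unused -> hash: the first iget fills the inode from the FS. */
void toy_iget(struct toy_inode *i)
{
    if (i->state == ST_UNUSED) {
        sss_read_inode(i);
        i->state = ST_IN_USE;
    }
    i->i_count++;
}

void toy_mark_dirty(struct toy_inode *i)
{
    assert(i->state != ST_UNUSED);
    i->state = ST_DIRTY;
}

/* dirty -> in_use: write the inode back. */
void toy_sync(struct toy_inode *i)
{
    if (i->state == ST_DIRTY) {
        sss_write_inode(i);
        i->state = ST_IN_USE;
    }
}

/* When the last user goes away and no links remain, delete the inode. */
void toy_iput(struct toy_inode *i)
{
    assert(i->i_count > 0);
    if (--i->i_count > 0)
        return;
    if (i->i_nlink == 0) {
        sss_delete_inode(i);
        i->state = ST_UNUSED;
    }
    /* otherwise the inode stays cached for the next iget */
}
```

The point of the model is that the filesystem's read and delete methods run once per lifetime, no matter how many iget and iput pairs happen in between.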

22 P.J.Braam/CMU -- 22 Inode Cache
[Diagram: the inode hashtable with used inodes and dirty inodes, the unused inodes, and FS storage on either side]
Players:
1. iget: if i_count>0 ++
2. iput: if i_count>1 --
3. free_inodes
4. syncing inodes
Arrows in the diagram:
– sss_read_inode (iget): from FS storage into the used inodes
– mark_inode_dirty: used inodes to dirty inodes (media fs only)
– sss_write_inode (sync one): dirty inodes to FS storage
– sss_clear_inode (freeing inos) or sss_delete_inode (iput): to FS storage

23 P.J.Braam/CMU -- 23 Sales
Red Hat Software sold 240,000 copies of Red Hat Linux in 1997 and expects to reach 400,000 in 1998.
Estimates of installed servers (InfoWorld):
- Linux: 7 million
- OS/2: 5 million
- Macintosh: 1 million

24 P.J.Braam/CMU -- 24 Inode operations (iii_)
lookup: return inode
– calls iget
creation/removal
– create
– link
– unlink
– symlink
– mkdir
– rmdir
– mknod
– rename
symbolic links
– readlink
– follow link
pages
– readpage, writepage, updatepage - read or write page. Generic for mediafs.
– bmap - return disk block number of logical block
special operations
– revalidate - see dentry sect
– truncate
– permission

25 P.J.Braam/CMU -- 25 Dentry world
Dentry is a name to inode translation structure
Cached aggressively by VFS
Eliminates lookups by FS & private caches
– timing on Coda FS: ls -lR 1000 files after priming cache
    linux 2.0.32: 7.2secs
    linux 2.1.92: 0.6secs
– disk fs: less benefit, NFS even more
Negative entries!
Namei is dramatically simplified

26 P.J.Braam/CMU -- 26 Inside dentry’s
name
pointer to inode
pointer to parent dentry
list head of children
chains for lots of lists
use count

27 P.J.Braam/CMU -- 27 Dentry associated lists
[Diagram: dentry inode relationship and dentry tree relationship. Legend: inode, dentry, list head, d_inode pointer, d_parent pointer]
d_alias chains (anchored at the inode i_dentry list head)
– place: d_instantiate
– remove: dentry_iput
d_child chains
– place: d_alloc
– remove: d_prune, d_invalidate, d_put

28 P.J.Braam/CMU -- 28 Dcache
[Diagram: dentry_hashtable (d_hash chains hanging off list heads indexed by dhash(parent, name)) and unused dentries (d_lru chains); entries arrive via namei, iii_lookup and d_add, and leave via prune, d_invalidate and d_drop]
namei tries cache: d_lookup
– ddd_compare
Success: ddd_revalidate
– d_invalidate if fails
– proceed if success
Failure: iii_lookup
– find inode
– iget sss_read_inode
– finish: d_add
– can give negative entry in dcache
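The lookup path above, including negative entries, fits in a small user-space model. This is a sketch under simplifying assumptions: a fixed-size entry pool, integer ids for parent directories, no ddd_revalidate step and no pruning. The toy_ names and the hash function are invented; only the names d_lookup, d_add and dhash are borrowed from the slide.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NBUCKETS 16
#define MAXDENT  32

struct toy_dentry {
    int parent;                 /* id of the parent directory */
    char name[32];
    long ino;                   /* 0 marks a negative entry */
    struct toy_dentry *next;    /* hash chain */
};

static struct toy_dentry pool[MAXDENT];
static int pool_used;
static struct toy_dentry *buckets[NBUCKETS];
static int fs_lookups;          /* how often the filesystem was asked */

static unsigned dhash(int parent, const char *name)
{
    unsigned h = (unsigned)parent;

    while (*name)
        h = h * 31u + (unsigned char)*name++;
    return h % NBUCKETS;
}

static struct toy_dentry *d_lookup(int parent, const char *name)
{
    struct toy_dentry *d;

    for (d = buckets[dhash(parent, name)]; d != NULL; d = d->next)
        if (d->parent == parent && strcmp(d->name, name) == 0)
            return d;
    return NULL;
}

static struct toy_dentry *d_add(int parent, const char *name, long ino)
{
    unsigned b = dhash(parent, name);
    struct toy_dentry *d;

    assert(pool_used < MAXDENT && strlen(name) < sizeof pool[0].name);
    d = &pool[pool_used++];
    d->parent = parent;
    strcpy(d->name, name);
    d->ino = ino;
    d->next = buckets[b];
    buckets[b] = d;
    return d;
}

/* Stand-in for iii_lookup: the toy FS knows one file in directory 1. */
static long toyfs_lookup(int parent, const char *name)
{
    fs_lookups++;
    return (parent == 1 && strcmp(name, "myfile") == 0) ? 100 : 0;
}

/* Returns the inode number, or 0 if the name does not exist. */
long toy_namei(int parent, const char *name)
{
    struct toy_dentry *d = d_lookup(parent, name);

    if (d == NULL)      /* cache miss: ask the FS, cache even a "no" */
        d = d_add(parent, name, toyfs_lookup(parent, name));
    return d->ino;
}
```

Looking up a missing name twice asks the filesystem only once; the second answer comes from the negative entry, which is where much of the speedup for network filesystems comes from.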

29 P.J.Braam/CMU -- 29 Dentry methods
ddd_revalidate: can force new lookup
ddd_hash: compute hash value of name
ddd_compare: are names equal?
ddd_delete, ddd_put, ddd_iput: FS cleanup opportunity

30 P.J.Braam/CMU -- 30 Dentry particulars
ddd_hash and ddd_compare have to deal with extraordinary cases for msdos/vfat:
– case insensitive
– long and short filename pleasantries
ddd_revalidate -- can force new lookup if inode not in use:
– used for NFS/SMBfs aging
– used for Coda/AFS callbacks
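For a case insensitive filesystem the hash and compare methods have to agree: two names that compare equal must land on the same hash chain, or the cache will miss entries that are there. A minimal sketch of such a pair follows. The function names, the signatures and the "0 means equal" return convention are assumptions for this example, and it handles ASCII case folding only, not the long and short name handling of vfat.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hash over the lower-cased bytes, so "README" and "readme" collide on purpose. */
unsigned long toy_ci_hash(const char *name, size_t len)
{
    unsigned long h = 0;
    size_t i;

    for (i = 0; i < len; i++)
        h = h * 31ul + (unsigned long)tolower((unsigned char)name[i]);
    return h;
}

/* Returns 0 when the names are equal ignoring case, 1 otherwise. */
int toy_ci_compare(const char *a, size_t alen, const char *b, size_t blen)
{
    size_t i;

    assert(a != NULL && b != NULL);
    if (alen != blen)
        return 1;
    for (i = 0; i < alen; i++)
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return 1;
    return 0;
}
```

Both functions fold case the same way, which is what keeps the invariant: equal under toy_ci_compare implies equal under toy_ci_hash.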

31 P.J.Braam/CMU -- 31 Style
"Dijkstra probably hates me" (Linus Torvalds)

32 P.J.Braam/CMU -- 32 Memory mapping
vm_area structure has
– vm_operations
– inode, addresses etc.
vm_operations
– map, unmap
– swapin, swapout
– nopage -- read when page isn’t in VM
mmap
– calls on iii_readpage
– keeps a use count on the inode until unmap
