Distributed file systems, Case studies n Sun’s NFS u history u virtual file system and mounting u NFS protocol u caching in NFS u V3 n Andrew File System.

Slides:



Advertisements
Similar presentations
Serverless Network File Systems. Network File Systems Allow sharing among independent file systems in a transparent manner Mounting a remote directory.
Advertisements

Distributed Storage March 12, Distributed Storage What is Distributed Storage?  Simple answer: Storage that can be shared throughout a network.
CS-550: Distributed File Systems [SiS]1 Resource Management in Distributed Systems: Distributed File Systems.
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Emery Berger University of Massachusetts Amherst Operating Systems CMPSCI 377 Lecture.
Copyright © Clifford Neuman - UNIVERSITY OF SOUTHERN CALIFORNIA - INFORMATION SCIENCES INSTITUTE CS582: Distributed Systems Lecture 13, 14 -
Caching in Distributed File System Ke Wang CS614 – Advanced System Apr 24, 2001.
File System Implementation
Other File Systems: LFS and NFS. 2 Log-Structured File Systems The trend: CPUs are faster, RAM & caches are bigger –So, a lot of reads do not require.
Chapter 11: File System Implementation. File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management.
Other File Systems: AFS, Napster. 2 Recap NFS: –Server exposes one or more directories Client accesses them by mounting the directories –Stateless server.
Distributed File System: Design Comparisons II Pei Cao Cisco Systems, Inc.
NFS. The Sun Network File System (NFS) An implementation and a specification of a software system for accessing remote files across LANs. The implementation.
University of Pennsylvania 11/21/00CSE 3801 Distributed File Systems CSE 380 Lecture Note 14 Insup Lee.
Dr. Kalpakis CMSC 421, Operating Systems File System Implementation.
Distributed File System: Design Comparisons II Pei Cao.
Case Study - GFS.
File Systems (2). Readings r Silbershatz et al: 11.8.
DESIGN AND IMPLEMENTATION OF THE SUN NETWORK FILESYSTEM R. Sandberg, D. Goldberg S. Kleinman, D. Walsh, R. Lyon Sun Microsystems.
Distributed File Systems Sarah Diesburg Operating Systems CS 3430.
Lecture 23 The Andrew File System. NFS Architecture client File Server Local FS RPC.
Distributed File Systems Concepts & Overview. Goals and Criteria Goal: present to a user a coherent, efficient, and manageable system for long-term data.
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Distributed File Systems Steve Ko Computer Sciences and Engineering University at Buffalo.
CSC 456 Operating Systems Seminar Presentation (11/13/2012) Leon Weingard, Liang Xin The Google File System.
Distributed Systems Principles and Paradigms Chapter 10 Distributed File Systems 01 Introduction 02 Communication 03 Processes 04 Naming 05 Synchronization.
Distributed File Systems 1 CS502 Spring 2006 Distributed Files Systems CS-502 Operating Systems Spring 2006.
Networked File System CS Introduction to Operating Systems.
Distributed Systems. Interprocess Communication (IPC) Processes are either independent or cooperating – Threads provide a gray area – Cooperating processes.
Distributed File Systems
Distributed File Systems Case Studies: Sprite Coda.
Distributed File Systems Distributed file system (DFS) – a distributed implementation of the classical time-sharing model of a file system, where multiple.
Chapter 20 Distributed File Systems Copyright © 2008.
What is a Distributed File System?? Allows transparent access to remote files over a network. Examples: Network File System (NFS) by Sun Microsystems.
Introduction to DFS. Distributed File Systems A file system whose clients, servers and storage devices are dispersed among the machines of a distributed.
CSE 486/586 CSE 486/586 Distributed Systems Distributed File Systems Steve Ko Computer Sciences and Engineering University at Buffalo.
Ridge Xu 12.1 Operating System Concepts Chapter 12: File System Implementation File System Structure File System Implementation Directory Implementation.
ITEC 502 컴퓨터 시스템 및 실습 Chapter 11-2: File System Implementation Mi-Jung Choi DPNM Lab. Dept. of CSE, POSTECH.
Chapter 11: File System Implementation Silberschatz, Galvin and Gagne ©2005 Operating System Concepts Chapter 11: File System Implementation Chapter.
Advanced Operating Systems - Spring 2009 Lecture 20 – Wednesday April 1 st, 2009 Dan C. Marinescu Office: HEC 439 B.
Distributed File Systems
Chapter 11: File System Implementation Silberschatz, Galvin and Gagne ©2005 Operating System Concepts – 7 th Edition, Jan 1, 2005 File-System Structure.
Silberschatz, Galvin and Gagne ©2009 Operating System Concepts – 8 th Edition File System Implementation.
Information Management NTU Distributed File Systems.
GLOBAL EDGE SOFTWERE LTD1 R EMOTE F ILE S HARING - Ardhanareesh Aradhyamath.
Distributed File Systems 11.2Process SaiRaj Bharath Yalamanchili.
4P13 Week 9 Talking Points
COT 4600 Operating Systems Fall 2009 Dan C. Marinescu Office: HEC 439 B Office hours: Tu-Th 3:00-4:00 PM.
Distributed File Systems Group A5 Amit Sharma Dhaval Sanghvi Ali Abbas.
Lecture 25 The Andrew File System. NFS Architecture client File Server Local FS RPC.
Review CS File Systems - Partitions What is a hard disk partition?
Distributed File Systems Questions answered in this lecture: Why are distributed file systems useful? What is difficult about distributed file systems?
Chapter Five Distributed file systems. 2 Contents Distributed file system design Distributed file system implementation Trends in distributed file systems.
Distributed Systems: Distributed File Systems Ghada Ahmed, PhD. Assistant Prof., Computer Science Dept. Web:
DISTRIBUTED FILE SYSTEM- ENHANCEMENT AND FURTHER DEVELOPMENT BY:- PALLAWI(10BIT0033)
Chapter 12: File System Implementation
File System Implementation
NFS and AFS Adapted from slides by Ed Lazowska, Hank Levy, Andrea and Remzi Arpaci-Dussea, Michael Swift.
CSE 451: Operating Systems Winter Module 22 Distributed File Systems
Chapter 11: File System Implementation
Distributed File Systems
Distributed File Systems
CSE 451: Operating Systems Spring Module 21 Distributed File Systems
DESIGN AND IMPLEMENTATION OF THE SUN NETWORK FILESYSTEM
Distributed File Systems
CSE 451: Operating Systems Winter Module 22 Distributed File Systems
Chapter 15: File System Internals
Today: Distributed File Systems
Outline Review of Quiz #1 Distributed File Systems 4/20/2019 COP5611.
Distributed File Systems
Distributed File Systems
Presentation transcript:

Distributed file systems, Case studies n Sun’s NFS u history u virtual file system and mounting u NFS protocol u caching in NFS u V3 n Andrew File System u history u organization u caching u DFS n AFS vs. NFS

Sun’s network file systems (NFS) n Designed by Sun Microsystems u First distributed file service designed as a project, introduced in 1985 u To encourage its adoption as a standard F Definitions of the key interfaces were placed in the public domain in 1989 F Source code for a reference implementation was made available to other computer vendors under license F Currently the de facto standard for LANs n Provides transparent access to remote files on a LAN, for clients running on UNIX and other operating systems u A UNIX computer typically has a NFS client and server module in its OS kernel F Available for almost any UNIX F Client modules are available for Macintosh and PCs

NFS - mounting remote file system n NFS supports mounting of remote file systems by client machines u Name space seen by each client may be different u Same file on server may have different path names on different clients u NFS does not enforce a single network-wide name space, but a uniform name space (and location transparency) can be established if desired / (root) export people bin robinbill / (root) usr profsstudents etc han / (root) nfs users janejimbob remote mount remote mount client server 1 server 2

NFS - mounting remote file system (cont.) n Remote file systems may be u Hard mounted — when a user-level process accesses a file, it is suspended until the request can be completed F If a server crashes, the user-level process will be suspended until recovers u Soft mounted — after a small number of retries, the NFS client returns a failure code to the user process F Most UNIX utilities don’t check this code… n Automounting u The automounter dynamically mounts a file system whenever an “empty” mount point is referenced by a client F Further accesses do not result in further requests to the automounter… F Unless there are no references to the remote file system for several minutes, in which case the automounter unmounts it

NFS - mounting remote file system (cont.) n Virtual file system: u Separates generic file-system operations from their implementation (can have different types of local file systems) u Based on a file descriptor called a vnode that is unique networkwide (UNIX inodes are only unique on a single file system) local disk UNIX file system NFS client localremote UNIX kernel user-level client process system calls client computer UNIX file system NFS server UNIX kernel server computer network NFS protocol virtual file system local disk

NFS protocol n NFS protocol provides a set of RPCs for remote file operations u Looking up a file within a directory u Manipulating links and directories u Creating, renaming, and removing files u Getting and setting file attributes u Reading and writing files n NFS is stateless u Servers do not maintain information about their clients from one access to the next F There are no open-file tables on the server u There are no open and close operations F Each request must provide a unique file identifier, and an offset within the file u Easy to recover from a crash, but file operations must be idempotent

NFS protocol (cont.) n Because NFS is stateless, all modified data must be written to the server’s disk before results are returned to the client u Server crash and recovery should be invisible to client —data should be intact u Lose benefits of caching F Solution — RAM disks with battery backup (un-interruptable power supply), written to disk periodically n A single NFS write is guaranteed to be atomic, and not intermixed with other writes to the same file u However, NFS does not provide concurrency control  A write system call may be decomposed into several NFS writes, which may be interleaved F Since NFS is stateless, this is not considered to be an NFS problem

Caching in NFS n Traditional UNIX u Caches file blocks, directories, and file attributes u Uses read-ahead (prefetching), and delayed-write (flushes every 30 seconds) n NFS servers u Same as in UNIX, except server’s write operations perform write-through F Otherwise, failure of server might result in undetected loss of data by clients n NFS clients u Caches results of read, write, getattr, lookup, and readdir operations u Possible inconsistency problems F Writes by one client do not cause an immediate update of other clients’ caches

Caching in NFS (cont.) n NFS clients (cont.) u File reads F When a client caches one or more blocks from a file, it also caches a timestamp indicating the time when the file was last modified on the server F Whenever a file is opened, and the server is contacted to fetch a new block from the file, a validation check is performed Client requests last modification time from server, and compares that time to its cached timestamp If modification time is more recent, all cached blocks from that file are invalidated Blocks are assumed to valid for next 3 seconds (30 seconds for directories) u File writes F When a cached page is modified, it is marked as dirty, and is flushed when the file is closed, or at the next periodic flush u Now two sources of inconsistency: delay after validation, delay until flush

NFS v2 performance improvements, v3 n Client-side caching: u every NFS operation requires network access - slow. client can cache the data it currently works on to speed up access to it. u problem - multiple clients access the data - multiple copies of cached data may become inconsistent. u Solutions - cache only read-only files; cache files where inconsistency is not vital (inodes) and check consistently frequently n deferral of writes u client side - the clients bunch writes together (if possible) before sending them to server (if the client crushes - it knows where to restart) u a sever must commit the writes to stable storage before reporting them to a client. Battery backed non-volatile memory (NVRAM) is used (rather than the disk). NVRAM -> disk transfers are optimized for disk head movements and written to disk later. n v3 allows delayed writes by introducing commit operation - all writes are “volatile” until the server processes commit operation. n Requires changes in semantics - applications programmer cannot assume that all the writes are done and should explicitly issue commit.

NFS network appliances (dedicated NFS servers) n Auspex NS500 n NW and FS run functional multi- processing kernel (FMK) - a “stripped down” version of Unix (no shared memory, processes never terminate) which allows NW and FS to context switch fast and service requests quickly. NW receives a request - if its NFS it passes it to FS if not a processor running regular Unix (SunOS) services it.

Andrew file system (AFS) n Designed by Carnegie Mellon University u Developed during mid-1980s as part of the Andrew distributed computing environment u Designed to support a WAN of more than 5000 workstations u Much of the core technology is now part of the Open Software Foundation (OSF) Distributed Computing Environment (DCE), available for most UNIX and some other operating systems n AFS was made to span large campuses and scale well therefore the emphasis was placed on offloading the work to the clients n as much as possible data is cached on clients, uses session semantics - cache consistency operations are done when file is opened or closed n Provides transparent access to remote files on a WAN, for clients running on UNIX and other operating systems u Access to all files is via the usual UNIX file primitives u Compatible with NFS — servers can mount NFS file systems

AFS (cont.) n AFS is stateful, when a client reads a file from a server it holds a callback, the server keeps track of callbacks and when one of the clients closes the file (and synchronizes it’s cached copy) and updates it, the server notifies all the callback holders of the change breaking the callback, callbacks can be also broken to conserve storage at server n problems with AFS: u even if the data is in local cache - if the client performs a write a complex protocol of local callback verification with the server must be used; cache consistency preservation leads to deadlocks u in a stateful model, it is hard to deal with crushes.

Caching in Andrew n When a remote file is accessed, the server sends the entire file to the client u The entire file is then stored in a disk cache on the client computer F Cache is big enough to store several hundred files n Implements session semantics u Files are cached when opened u Modified files are flushed to the server when they are closed u Writes may not be immediately visible to other processes n When client caches a file, server records that fact — it has a callback on the file u When a client modifies and closes a file, other clients lose their callback, and are notified by server that their copy is invalid

How can Andrew perform well? n Most file accesses are to files that are infrequently updated, or are accessed by only a single user, so the cached copy will remain valid for a long time n Local cache can be big — maybe 100 MB — which is probably sufficient for one user’s working set of files n Typical UNIX workloads: u Files are small, most are less than 10kB u Read operations are 6 times more common than write operations u Sequential access is common, while random access is rare u Most files are read and written by only one user; if a file shared, usually only one user modifies it u Files are referenced in bursts

DCE Distributed File System (DCE DFS) n Modification of AFS by Open Software Foundation for its Distributed Computing Environment (DCE) n switch to Unix file sharing semantics, implement using tokens u tokens provide finer degree of control than callbacks of AFS u tokens F fine grained: tokens (read/write) on portions of a file F type specific: open, read, write, lock, status, update u a client can “check out” a token from a server for a particular file. Only if a client holds a lock token the client is allowed to modify the file; if a client holds status token it is allowed to modify status of the file u token expires in 2 minutes (for fault-tolerance) n replication – easy replication of filesystems on several machines, transparent to the user, u one server – primary does all updates u others – read-only

AFS vs. NFS n NFS is simpler to implement, AFS is cumbersome due to cache consistency synchronization mechanisms, etc n NFS does not scale well - it is usually limited to a LAN, AFS scales well - it may span the Internet n NFS performs better than AFS under light to medium load. AFS does better under heavy load n NFS does writes faster(?)