Presentation is loading. Please wait.

Presentation is loading. Please wait.

Chapter 17: Distributed-File Systems Part 2. 17.2 Silberschatz, Galvin and Gagne ©2005 Operating System Concepts Chapter 17 Distributed-File Systems Chapter.

Similar presentations


Presentation on theme: "Chapter 17: Distributed-File Systems Part 2. 17.2 Silberschatz, Galvin and Gagne ©2005 Operating System Concepts Chapter 17 Distributed-File Systems Chapter."— Presentation transcript:

1 Chapter 17: Distributed-File Systems Part 2

2 17.2 Silberschatz, Galvin and Gagne ©2005 Operating System Concepts Chapter 17 Distributed-File Systems Chapter 17.1 Background Naming and Transparency Remote File Access Chapter 17.2 Stateful versus Stateless Service File Replication An Example: AFS

3 17.3 Silberschatz, Galvin and Gagne ©2005 Operating System Concepts Chapter Objectives To contrast stateful and stateless distributed file servers To show how replication of files on different machines in a distributed file system is a useful redundancy for improving availability

4 17.4 Silberschatz, Galvin and Gagne ©2005 Operating System Concepts Stateful File Service Stateful File Service : Server either tracks each file being accessed by a each client or else it is stateless. Stateful File Service: Client opens a file Server  fetches information about the file from its disk,  stores it in its memory, and  gives the client a connection identifier unique to the client and the open file – Identifier is used for subsequent accesses until session ends A stateful service is characterized by a connection between the client and the server during a ‘session.’ Server keeps main memory information about all of its clients. Server reclaims main-memory space used by clients when no longer active.

5 17.5 Silberschatz, Galvin and Gagne ©2005 Operating System Concepts Stateful File Service - 2 Advantages : We certainly get Increased performance Fewer disk accesses because file information is cached in main memory in the server and can be readily accessed by the connection identifier  (In Unix, the server fetches an inode and provides the client a file descriptor which serves as an index to an in-core table of inodes) Stateful server knows  if a file was opened for sequential access and can thus  read ahead the next blocks

6 17.6 Silberschatz, Galvin and Gagne ©2005 Operating System Concepts Stateless File Server Stateless : This mechanism only provides blocks of data as / when requested with no knowledge of how the client uses or modifies the data. Here, each request is self-contained; by itself. Thus no state information is maintained by the server. Each request must thus tell the server the exact file and position in the file it wishes transmitted to the client. Server does not keep track of open files – although it may do so for its own internal efficiency considerations. Establishing or closing a connection is not necessary, as each action is separate and there is no notion of a ‘session.’

7 17.7 Silberschatz, Galvin and Gagne ©2005 Operating System Concepts Stateless File Server - 2 Operationally, a client opens a file. Then reads and writes take place as remote messages. The final ‘close’ message is simply a local operation. Stateless servers have no knowledge of the purpose of a client’s requests. The older version of NFS by SUN use this mechanism.

8 17.8 Silberschatz, Galvin and Gagne ©2005 Operating System Concepts Distinctions Between Stateful & Stateless Service (1 of 3) Failure Recovery, a.k.a. ‘crashes’ with a Stateful Server: A stateful server loses all its volatile state in a crash  In most cases, operations underway will be aborted.  State can be restored by a recovery protocol based on a dialog with clients  This can take some time…  Server needs to be aware of client failures in order to reclaim space allocated to record the state of crashed client processes Failure Recover with a Stateless Server is simpler. With stateless server, the effects of server failure and recovery are almost unnoticeable A newly reincarnated server can respond to a self-contained request without any difficulty But, of course, there are its downsides too. 

9 17.9 Silberschatz, Galvin and Gagne ©2005 Operating System Concepts Distinctions (2 of 3) Downsides for Stateless Servers – real downsides: There’s no in-core information about a session going on, So we get  longer request messages,  lower request processing, and  additional constraints imposed on DFS design – Since each request targets a file, some kind of uniform, system-wide, low-level naming scheme must be used. – Translating these to local names for each request takes time. – Requests for services (transmitting a file) as well as file deletes must be idempotent. » The same thing must happen if repeated. » Be careful regarding deletes! – “Not difficult to implement if reads and writes use an absolute byte count to indicate exact position within a file they are trying to access and do not rely on an offset (like Unix read() and write() system calls in Unix).”

10 17.10 Silberschatz, Galvin and Gagne ©2005 Operating System Concepts Distinctions (3 of 3) Some platforms actually require stateful service Consider:  A server that undertakes server-initiated cache validation can not provide stateless service, because it maintains a record of which files are cached by which clients.  We saw the advantages of stateful service earlier. Example of Stateful service:  Unix is stateful:  UNIX uses file descriptors and implicit offsets.  Such servers must maintain tables to map the file descriptors to inodes, and store the current offset within a file Example of Stateless service:  Sun’s NFS is stateless.  It does not use file descriptors but does include an explicit offset for every access.

11 17.11 Silberschatz, Galvin and Gagne ©2005 Operating System Concepts File Replication Replicas: We often have replicas of files on different machines. Provides for redundancy and increased availability. Big Issue: consistency. Better availability: One can download files from multiple sites more quickly than if only a single site had the desired file. Failure Independence: Replicas of the same file provide for what is called a failure-independent machine, which means that if one machine goes down, others are available. Naming scheme need to be invisible to higher levels but must be distinguishable via lower-level names. Naming Scheme must map a replicated file name to a particular replica, where a location may be part of the name.

12 17.12 Silberschatz, Galvin and Gagne ©2005 Operating System Concepts File Replicas – the big issue Updates and Consistency – replicas of a file denote the same logical entity, So, if we update any file, all copies (replicas) must be similarly updated. But: Sometimes consistency is not critically important, as might occur in times when ‘progress in the face of some disaster’ is imminent. So it may be that we sacrifice consistency for availability and performance. But: (another but) If consistency is paramount, and recognizing that all replica need to be updated, it is feasible that we could experience indefinite blocking. On the other hand: If consistency is not of paramount importance, and perhaps a slightly ‘not absolutely current’ version of a file may suffice, then inconsistency for some finite length of time may be acceptable.

13 17.13 Silberschatz, Galvin and Gagne ©2005 Operating System Concepts Andrew File System This will be your chapter assignment. Andrew addresses all of the things we talked about earlier in this chapter and tries to embody all the most important features into this system. See assignment on my web page for Chapter 17. Some additional notes / slides are provided ahead. that may help you focus your homework discussion. This information is largely from the textbook.

14 17.14 Silberschatz, Galvin and Gagne ©2005 Operating System Concepts An Example: AFS A distributed computing environment (Andrew) under development since 1983 at Carnegie-Mellon University, purchased by IBM and released as Transarc DFS, now open sourced as OpenAFS AFS tries to solve complex issues such as uniform name space, location-independent file sharing, client-side caching (with cache consistency), and secure authentication (via Kerberos) Also includes server-side caching (via replicas), high availability Can span 5,000 workstations

15 17.15 Silberschatz, Galvin and Gagne ©2005 Operating System Concepts ANDREW (Cont.) Clients are presented with a partitioned space of file names: a local name space and a shared name space Dedicated servers, called Vice, present the shared name space to the clients as an homogeneous, identical, and location transparent file hierarchy The local name space is the root file system of a workstation, from which the shared name space descends Workstations run the Virtue protocol to communicate with Vice, and are required to have local disks where they store their local name space Servers collectively are responsible for the storage and management of the shared name space

16 17.16 Silberschatz, Galvin and Gagne ©2005 Operating System Concepts ANDREW (Cont.) Clients and servers are structured in clusters interconnected by a backbone LAN A cluster consists of a collection of workstations and a cluster server and is connected to the backbone by a router A key mechanism selected for remote file operations is whole file caching Opening a file causes it to be cached, in its entirety, on the local disk

17 17.17 Silberschatz, Galvin and Gagne ©2005 Operating System Concepts ANDREW Shared Name Space Andrew’s volumes are small component units associated with the files of a single client A fid identifies a Vice file or directory - A fid is 96 bits long and has three equal-length components: volume number vnode number – index into an array containing the inodes of files in a single volume uniquifier – allows reuse of vnode numbers, thereby keeping certain data structures, compact Fids are location transparent; therefore, file movements from server to server do not invalidate cached directory contents Location information is kept on a volume basis, and the information is replicated on each server

18 17.18 Silberschatz, Galvin and Gagne ©2005 Operating System Concepts ANDREW File Operations Andrew caches entire files form servers A client workstation interacts with Vice servers only during opening and closing of files Venus – caches files from Vice when they are opened, and stores modified copies of files back when they are closed Reading and writing bytes of a file are done by the kernel without Venus intervention on the cached copy Venus caches contents of directories and symbolic links, for path- name translation Exceptions to the caching policy are modifications to directories that are made directly on the server responsibility for that directory

19 17.19 Silberschatz, Galvin and Gagne ©2005 Operating System Concepts ANDREW Implementation Client processes are interfaced to a UNIX kernel with the usual set of system calls Venus carries out path-name translation component by component The UNIX file system is used as a low-level storage system for both servers and clients The client cache is a local directory on the workstation’s disk Both Venus and server processes access UNIX files directly by their inodes to avoid the expensive path name-to-inode translation routine

20 17.20 Silberschatz, Galvin and Gagne ©2005 Operating System Concepts ANDREW Implementation (Cont.) Venus manages two separate caches: one for status one for data LRU algorithm used to keep each of them bounded in size The status cache is kept in virtual memory to allow rapid servicing of stat (file status returning) system calls The data cache is resident on the local disk, but the UNIX I/O buffering mechanism does some caching of the disk blocks in memory that are transparent to Venus

21 End of Chapter 17.2


Download ppt "Chapter 17: Distributed-File Systems Part 2. 17.2 Silberschatz, Galvin and Gagne ©2005 Operating System Concepts Chapter 17 Distributed-File Systems Chapter."

Similar presentations


Ads by Google