1 CS 194: Distributed Systems Distributed File Systems Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and.

Slides:



Advertisements
Similar presentations
DISTRIBUTED FILE SYSTEMS Computer Engineering Department Distributed Systems Course Asst. Prof. Dr. Ahmet Sayar Kocaeli University - Fall 2013.
Advertisements

Computer Science Lecture 20, page 1 CS677: Distributed OS Today: Coda, xFS Case Study: Coda File System Brief overview of other recent file systems –xFS.
CS-550: Distributed File Systems [SiS]1 Resource Management in Distributed Systems: Distributed File Systems.
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Emery Berger University of Massachusetts Amherst Operating Systems CMPSCI 377 Lecture.
1 Network File System (NFS) a)The remote access model. b)The upload/download model.
Computer Science Lecture 20, page 1 CS677: Distributed OS Today: Distributed File Systems Issues in distributed file systems Sun’s Network File System.
Copyright © Clifford Neuman - UNIVERSITY OF SOUTHERN CALIFORNIA - INFORMATION SCIENCES INSTITUTE CS582: Distributed Systems Lecture 13, 14 -
Distributed File Systems Chapter 11
Coda file system: Disconnected operation By Wallis Chau May 7, 2003.
Computer Science Lecture 18, page 1 CS677: Distributed OS Last Class: Fault tolerance Reliable communication –One-one communication –One-many communication.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS.
Other File Systems: AFS, Napster. 2 Recap NFS: –Server exposes one or more directories Client accesses them by mounting the directories –Stateless server.
Computer Science Lecture 21, page 1 CS677: Distributed OS Today: Coda, xFS Case Study: Coda File System Brief overview of other recent file systems –xFS.
Distributed File System: Design Comparisons II Pei Cao Cisco Systems, Inc.
Distributed Systems CS Distributed File Systems- Part II Lecture 20, Nov 16, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1.
Jeff Chheng Jun Du.  Distributed file system  Designed for scalability, security, and high availability  Descendant of version 2 of Andrew File System.
NFS. The Sun Network File System (NFS) An implementation and a specification of a software system for accessing remote files across LANs. The implementation.
1 CS 194: Distributed Systems Security Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences.
File Systems (2). Readings r Silbershatz et al: 11.8.
Sun NFS Distributed File System Presentation by Jeff Graham and David Larsen.
Distributed File Systems Concepts & Overview. Goals and Criteria Goal: present to a user a coherent, efficient, and manageable system for long-term data.
CSC 456 Operating Systems Seminar Presentation (11/13/2012) Leon Weingard, Liang Xin The Google File System.
Distributed Systems Principles and Paradigms Chapter 10 Distributed File Systems 01 Introduction 02 Communication 03 Processes 04 Naming 05 Synchronization.
Networked File System CS Introduction to Operating Systems.
Distributed Systems. Interprocess Communication (IPC) Processes are either independent or cooperating – Threads provide a gray area – Cooperating processes.
Distributed File Systems
Distributed File Systems Case Studies: Sprite Coda.
Distributed File Systems Overview  A file system is an abstract data type – an abstraction of a storage device.  A distributed file system is available.
Chapter 20 Distributed File Systems Copyright © 2008.
What is a Distributed File System?? Allows transparent access to remote files over a network. Examples: Network File System (NFS) by Sun Microsystems.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS.
NFS : Network File System SMU CSE8343 Prof. Khalil September 27, 2003 Group 1 Group members: Payal Patel, Malka Samata, Wael Faheem, Hazem Morsy, Poramate.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS.
DISCONNECTED OPERATION IN THE CODA FILE SYSTEM J. J. Kistler M. Sataynarayanan Carnegie-Mellon University.
CODA: A HIGHLY AVAILABLE FILE SYSTEM FOR A DISTRIBUTED WORKSTATION ENVIRONMENT M. Satyanarayanan, J. J. Kistler, P. Kumar, M. E. Okasaki, E. H. Siegel,
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS.
Presented By: Samreen Tahir Coda is a network file system and a descendent of the Andrew File System 2. It was designed to be: Highly Highly secure Available.
Computer Science Lecture 19, page 1 CS677: Distributed OS Last Class: Fault tolerance Reliable communication –One-one communication –One-many communication.
CS425 / CSE424 / ECE428 — Distributed Systems — Fall 2011 Some material derived from slides by Prashant Shenoy (Umass) & courses.washington.edu/css434/students/Coda.ppt.
Distributed File Systems Architecture – 11.1 Processes – 11.2 Communication – 11.3 Naming – 11.4.
Distributed File Systems Group A5 Amit Sharma Dhaval Sanghvi Ali Abbas.
Distributed File Systems Questions answered in this lecture: Why are distributed file systems useful? What is difficult about distributed file systems?
Highly Available Services and Transactions with Replicated Data Jason Lenthe.
Computer Science Lecture 19, page 1 CS677: Distributed OS Last class: Distributed File Systems Issues in distributed file systems Sun’s Network File System.
THE EVOLUTION OF CODA M. Satyanarayanan Carnegie-Mellon University.
Chapter Five Distributed file systems. 2 Contents Distributed file system design Distributed file system implementation Trends in distributed file systems.
Distributed Systems: Distributed File Systems Ghada Ahmed, PhD. Assistant Prof., Computer Science Dept. Web:
Computer Science Lecture 20, page 1 CS677: Distributed OS Today: Distributed File Systems Issues in distributed file systems Sun’s Network File System.
Computer Science Lecture 19, page 1 CS677: Distributed OS Last Class: Fault tolerance Reliable communication –One-one communication –One-many communication.
Mobility Victoria Krafft CS /25/05. General Idea People and their machines move around Machines want to share data Networks and machines fail Network.
DISTRIBUTED FILE SYSTEM- ENHANCEMENT AND FURTHER DEVELOPMENT BY:- PALLAWI(10BIT0033)
Distributed File Systems
Nache: Design and Implementation of a Caching Proxy for NFSv4
NFS and AFS Adapted from slides by Ed Lazowska, Hank Levy, Andrea and Remzi Arpaci-Dussea, Michael Swift.
Distributed Systems CS
Distributed Systems CS
Today: Coda, xFS Case Study: Coda File System
DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S
Distributed File Systems
Distributed File Systems
CSE 451: Operating Systems Spring Module 21 Distributed File Systems
Distributed File Systems
CSE 451: Operating Systems Winter Module 22 Distributed File Systems
Today: Distributed File Systems
DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S
Outline Review of Quiz #1 Distributed File Systems 4/20/2019 COP5611.
Distributed File Systems
Distributed File Systems
Distributed File Systems
Presentation transcript:

1 CS 194: Distributed Systems Distributed File Systems Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences University of California, Berkeley Berkeley, CA

2 Outline  Network File System (NFS)  CODA

3 Main Goal  Provide a client transparent access to a file system stored at a remote server  Why would you want to store files remotely?

4 NFS  A specification for a distributed file system (by Sun, 1984)  Implemented on various OS’s  De facto standard in the UNIX community  Latest version is 4 (2000)  Client-server file system

5 Access Model  Two access models : -Remote access -Upload/Download Remote access Upload/Download

6 NFS Architecture  Virtual File System (VFS) provide a uniform access to local and remote files

7 File System Model  Similar to UNIX: files are treated as uninterpreted sequences of bytes  Each files has a name, but usually referred by a file handle -Client use a name service to get file handle  Files organized into a naming graphs -Nodes  directories or files  First three versions were stateless; version 4 is stateful

8 Stateful vs. Stateless  Stateless model: each call contains complete information to execute operation  Stateful model: server maintain context (info) shared by consecutive operations  Discussion: compare stateless and stateful design

9 Communication  RPC based  One operation per RPC (NFS v. 1,2,3)  Multiple operations per RPC (NFS v. 4) NFS v1,v2,v3 (stateless) NFS v4. (stateful)

10 Naming  Allow a client to mount a remote file system into its own local file system  Pathnames are not globally unique; what’s the implication?

11 Example: Mounting Nested Directories

12 File Handles  File handle: created by server hosting the file  Unique with respect to all file systems exported by servers  Persistent: doesn’t change during file’s lifetime  Length: 32b in v2, 64b in v3, and 64b in v4

13 Automounting  Mount file system transparently when client accesses it

14 Mandatory File Attributes AttributeDescription TYPEThe type of the file (regular, directory, symbolic link) SIZEThe length of the file in bytes CHANGE Indicator for a client to see if and/or when the file has changed FSIDServer-unique identifier of the file's file system

15 Semantics of File Sharing a) On a single processor, when a read follows a write, the value returned by the read is the value just written b) In a distributed system with caching, obsolete values may be returned.

16 Semantics of File Sharing MethodComment UNIX semantics Every operation on a file is instantly visible to all processes Session semantics No changes are visible to other processes until the file is closed Immutable files No updates are possible; simplifies sharing and replication TransactionAll changes occur atomically  NFS implements session semantics

17 File Locking in NFS OperationDescription LockCreates a lock for a range of bytes LocktTest whether a conflicting lock has been granted LockuRemove a lock from a range of bytes RenewRenew the leas on a specified lock  NFS v1-v3: use a separate (stateful) lock manager  NFS v4: integrated in the file system

18 Client Caching  NFS v1-v3: mainly left outside the protocol (see book)  NFS v4: use file delegation

19 Fault Tolerance  Need to maintain state consistent in v4, e.g., -Locking -Delegation  Challenge: eliminate duplicate operations in case of failure  Solution: use transaction identifiers (XID)

20 Handling Retransmissions a) Request still in progress b) Reply has just been returned c) Reply has been some time ago, but was lost

21 Security  Secure RPC: three methods -System authentication: trust the user has passed a proper login procedure -Diffie-Hellman key exchange (but not very secure—uses only 192b Keys) -Kerberos  File access control -Use access control list (ACL)

22 Security Architecture

23 Outline  Network File System (NFS)  CODA

24 The Coda File System  Developed at CMU  Based on Andrew File System (AFS), another distributed system developed at CMU -Community wide system

25 AFS Goals  Scalability: system should grow without major problems  Fault-Tolerance: system should remain usable in the presence of server failures, communication failures and voluntary disconnections  Unix Emulation  Design philosophy: Scalability and Accessibility more important than consistency

26 Coda Goals  AFS goals, plus  Disconnected mode for portable computers

27 System Model  Client workstations are personal computers owned by their users -Fully autonomous -Cannot be trusted  Coda allows laptops that operate in disconnected mode

28 Overall Organization of AFS & Coda

29 Internal Organization of Virtue

30 Communication  Based on RPC2: provides reliable transmission on top of UDP  RPC2 supports side-effects, i.e., user defined protocols  RPC2 provides support for multicast -Transparent for the client

31 Side Effects in Coda’s RPC

32 Support for Multicast in RCP2  Example: send invalidation a)One at a time b)Multicast

33 Naming  Single shared naming space (vs. client-based in NFS)

34 File Identifiers  Globally unique (vs server unique in NFS)  Coda distinguishes between physical and logical volumes  Logical volume (identified by RVID) a possible replicated physical volume (identified VID)

35 Sharing Files in Coda  Transaction semantics -Session is treated like a transaction

36 Caching and Replication  Caching: -Achieve scalability -Increases fault tolerance  Challenge: how to maintain data consistency?  Solution: use callbacks to notify clients when a file changes -If a client modifies a copy, server sends a callback break to all clients maintaining copies of same file

37 Example: Client Caching

38 Server Replication  Unit of replication: volume  Volume Storage Group (VSG): set of servers that have a copy of a volume  Accessible Volume Storage Group (AVSG): set of servers in VSG that the client can contact  Use vector versioning -One entry for each server in VSG -When file updated, corresponding version in AVSG is updated

39 Example: Handling Network Partition  Versioning vector when partition happens: [1,1,1]  Client A updates file  versioning vector in its partition: [2,2,1]  Client B updates file  versioning vector in its partition: [1,1,2]  Partition repaired  compare versioning vectors: conflict!

40 Disconnected Operation  HOARDING: File cache in advance with all files that will be accessed when disconnected -Best effort  EMULATION: when disconnected, behavior of server emulated at client  REINTEGRATION: transfer updates to server; resolves conflicts

41 Security  Set-up a secure channel between client and server -Use secure RPC  System-level authentication

42 Mutual Authentication in RPC2  Based on Needham-Schroeder protocol

43 Establishing a Secure Channel  Upon authentication AS (authentication server) returns: -Clear token: CT = [Alice, TID, K S, T start, T end ] -Secret token: ST = K vice ([CT]* Kvice ) -K S : secret key obtained by client during login procedure -K vice : secret key shared by vice servers  Token is similar to the ticket in Kerberos Client (Venus) Vice Server

44 NFS vs Coda IssueNFSCoda Design goalsAccess transparencyHigh availability Access modelRemoteUp/Download CommunicationRPC Server groupsNoYes Mount granularityDirectoryFile system Name spacePer clientGlobal Sharing sem.SessionTransactional Cache consist.write-back Fault toleranceReliable comm.Replication and caching RecoveryClient-basedReintegration Secure channelsExisting mechanismsNeedham-Schroeder