CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Distributed File Systems Steve Ko Computer Sciences and Engineering University at Buffalo.

Slides:



Advertisements
Similar presentations
CSE 486/586, Spring 2014 CSE 486/586 Distributed Systems Consistency Steve Ko Computer Sciences and Engineering University at Buffalo.
Advertisements

Serverless Network File Systems. Network File Systems Allow sharing among independent file systems in a transparent manner Mounting a remote directory.
Distributed Storage March 12, Distributed Storage What is Distributed Storage?  Simple answer: Storage that can be shared throughout a network.
CS-550: Distributed File Systems [SiS]1 Resource Management in Distributed Systems: Distributed File Systems.
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Emery Berger University of Massachusetts Amherst Operating Systems CMPSCI 377 Lecture.
CS 425 / ECE 428 Distributed Systems Fall 2014 Indranil Gupta (Indy) Lecture 27: Distributed File Systems All slides © IG.
Distributed Systems 2006 Styles of Client/Server Computing.
File System Implementation
Other File Systems: LFS and NFS. 2 Log-Structured File Systems The trend: CPUs are faster, RAM & caches are bigger –So, a lot of reads do not require.
Other File Systems: AFS, Napster. 2 Recap NFS: –Server exposes one or more directories Client accesses them by mounting the directories –Stateless server.
NFS. The Sun Network File System (NFS) An implementation and a specification of a software system for accessing remote files across LANs. The implementation.
University of Pennsylvania 11/21/00CSE 3801 Distributed File Systems CSE 380 Lecture Note 14 Insup Lee.
Case Study - GFS.
File Systems (2). Readings r Silbershatz et al: 11.8.
DESIGN AND IMPLEMENTATION OF THE SUN NETWORK FILESYSTEM R. Sandberg, D. Goldberg S. Kleinman, D. Walsh, R. Lyon Sun Microsystems.
Sun NFS Distributed File System Presentation by Jeff Graham and David Larsen.
Distributed File Systems Concepts & Overview. Goals and Criteria Goal: present to a user a coherent, efficient, and manageable system for long-term data.
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Case Study: Amazon Dynamo Steve Ko Computer Sciences and Engineering University at Buffalo.
CSC 456 Operating Systems Seminar Presentation (11/13/2012) Leon Weingard, Liang Xin The Google File System.
Distributed Systems Principles and Paradigms Chapter 10 Distributed File Systems 01 Introduction 02 Communication 03 Processes 04 Naming 05 Synchronization.
Distributed File Systems 1 CS502 Spring 2006 Distributed Files Systems CS-502 Operating Systems Spring 2006.
CSE 486/586, Spring 2013 CSE 486/586 Distributed Systems Distributed File Systems Steve Ko Computer Sciences and Engineering University at Buffalo.
Networked File System CS Introduction to Operating Systems.
Distributed File Systems
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Distributed Shared Memory Steve Ko Computer Sciences and Engineering University at Buffalo.
Distributed file systems, Case studies n Sun’s NFS u history u virtual file system and mounting u NFS protocol u caching in NFS u V3 n Andrew File System.
CSE 486/586 CSE 486/586 Distributed Systems Case Study: Amazon Dynamo Steve Ko Computer Sciences and Engineering University at Buffalo.
What is a Distributed File System?? Allows transparent access to remote files over a network. Examples: Network File System (NFS) by Sun Microsystems.
Introduction to DFS. Distributed File Systems A file system whose clients, servers and storage devices are dispersed among the machines of a distributed.
1 Chap8 Distributed File Systems  Background knowledge  8.1Introduction –8.1.1 Characteristics of File systems –8.1.2 Distributed file system requirements.
CSE 486/586 CSE 486/586 Distributed Systems Distributed File Systems Steve Ko Computer Sciences and Engineering University at Buffalo.
Presented By: Samreen Tahir Coda is a network file system and a descendent of the Andrew File System 2. It was designed to be: Highly Highly secure Available.
Computer Science Lecture 19, page 1 CS677: Distributed OS Last Class: Fault tolerance Reliable communication –One-one communication –One-many communication.
Silberschatz, Galvin and Gagne ©2009 Operating System Concepts – 8 th Edition File System Implementation.
Information Management NTU Distributed File Systems.
GLOBAL EDGE SOFTWERE LTD1 R EMOTE F ILE S HARING - Ardhanareesh Aradhyamath.
Distributed File Systems Architecture – 11.1 Processes – 11.2 Communication – 11.3 Naming – 11.4.
EE324 INTRO TO DISTRIBUTED SYSTEMS. Distributed File System  What is a file system?
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Replication Steve Ko Computer Sciences and Engineering University at Buffalo.
Distributed File Systems Group A5 Amit Sharma Dhaval Sanghvi Ali Abbas.
CSE 486/586 CSE 486/586 Distributed Systems Consistency Steve Ko Computer Sciences and Engineering University at Buffalo.
Distributed File Systems Questions answered in this lecture: Why are distributed file systems useful? What is difficult about distributed file systems?
CSE 486/586, Spring 2014 CSE 486/586 Distributed Systems Transactions on Replicated Data Steve Ko Computer Sciences and Engineering University at Buffalo.
Chapter Five Distributed file systems. 2 Contents Distributed file system design Distributed file system implementation Trends in distributed file systems.
Distributed Systems: Distributed File Systems Ghada Ahmed, PhD. Assistant Prof., Computer Science Dept. Web:
CSE 486/586, Spring 2013 CSE 486/586 Distributed Systems Consistency Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 486/586 Distributed Systems Distributed File Systems
CSE 486/586 Distributed Systems Case Study: Amazon Dynamo
CSE 486/586 Distributed Systems Consistency --- 1
Lecture 25: Distributed File Systems
NFS and AFS Adapted from slides by Ed Lazowska, Hank Levy, Andrea and Remzi Arpaci-Dussea, Michael Swift.
Outline Midterm results summary Distributed file systems – continued
CSE 486/586 Distributed Systems Consistency --- 1
CSE 451: Operating Systems Winter Module 22 Distributed File Systems
Distributed File Systems
Distributed File Systems
CSE 451: Operating Systems Spring Module 21 Distributed File Systems
DESIGN AND IMPLEMENTATION OF THE SUN NETWORK FILESYSTEM
Distributed File Systems
Lecture 25: Distributed File Systems
Distributed File Systems
CSE 451: Operating Systems Winter Module 22 Distributed File Systems
CSE 486/586 Distributed Systems Distributed File Systems
DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S
Outline Review of Quiz #1 Distributed File Systems 4/20/2019 COP5611.
Distributed File Systems
CSE 486/586 Distributed Systems Distributed File Systems
Distributed File Systems
CSE 486/586 Distributed Systems Case Study: Amazon Dynamo
Network File System (NFS)
Presentation transcript:

CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Distributed File Systems Steve Ko Computer Sciences and Engineering University at Buffalo

CSE 486/586, Spring 2012 Recap Amazon Dynamo –Distributed key-value storage with eventual consistency Gossiping for membership and failure detection –Eventual view of membership Consistent hashing for node & key distribution –Virtual nodes Object versioning for eventually-consistent data objects –Reconciliation Quorums for partition/failure tolerance –N, R, W all configurable Merkel tree for resynchronization after failures/partitions –Just compare hashes 2

CSE 486/586, Spring 2012 Local File Systems File systems provides file management. –Name space –API for file operations (create, delete, open, close, read, write, append, truncate, etc.) –Physical storage management & allocation (e.g., block storage) –Security and protection (access control) Name space is usually hierarchical. –Files and directories File systems are mounted. –Different file systems can be in the same name space. 3

CSE 486/586, Spring 2012 Traditional Distributed File Systems Goal: emulate local file system behaviors –Files not replicated –No hard performance guarantee But, –Files located remotely on servers –Multiple clients access the servers Why? –Users with multiple machines –Data sharing for multiple users –Consolidated data management (e.g., in an enterprise) 4

CSE 486/586, Spring 2012 Requirements Transparency: a distributed file system should appear as if it’s a local file system –Access transparency: it should support the same set of operations, i.e., a program that works for a local file system should work for a DFS. –(File) Location transparency: all clients should see the same name space. –Migration transparency: if files move to another server, it shouldn’t be visible to users. –Performance transparency: it should provide reasonably consistent performance. –Scaling transparency: it should be able to scale incrementally by adding more servers. 5

CSE 486/586, Spring 2012 Requirements Concurrent updates should be supported. Fault tolerance: servers may crash, msgs can be lost, etc. Consistency needs to be maintained. Security: access-control for files & authentication of users 6

CSE 486/586, Spring 2012 File Server Architecture 7 Client computerServer computer Application program Application program Client module Flat file service Directory service

CSE 486/586, Spring 2012 Components Directory service –Meta data management –Creates and updates directories (hierarchical file structures) –Provides mappings between user names of files and the unique file ids in the flat file structure. Flat file service –Actual data management –File operations (create, delete, read, write, access control, etc.) These can be independently distributed. –E.g., centralized directory service & distributed flat file service 8

CSE 486/586, Spring 2012 Sun NFS 9 Application Program Virtual File System UNIX File System Other File System NFS Client System Client Computer Virtual File System NFS Server System UNIX File System Server Computer NFS Protocol UNIX Kernel

CSE 486/586, Spring 2012 VFS A translation layer that makes file systems pluggable & co-exist –E.g., NFS, EXT2, EXT3, ZFS, etc. Keeps track of file systems that are available locally and remotely. Passes requests to appropriate local or remote file systems Distinguishes between local and remote files. 10

CSE 486/586, Spring 2012 NFS Mount Service / student usr … / users nfs pet jimbob staff / people org mth johnbob Each server keeps a record of local files available for remote mounting. Clients use a mount command for remote mounting, providing name mappings Remote Mount Server 1 ClientServer 2

CSE 486/586, Spring 2012 NFS Basic Operations Client –Transfers blocks of files to and from server via RPC Server –Provides a conventional RPC interface at a well-known port on each host –Stores files and directories Problems? –Performance –Failures 12

CSE 486/586, Spring 2012 CSE 486/586 Administrivia Project 2 has been released. –Simple DHT based on Chord –Please, please start right away! –Deadline: 4/13 2:59PM 13

CSE 486/586, Spring 2012 Improving Performance Let’s cache! Server-side –Typically done by OS & disks anyway –A disk usually has a cache built-in. –OS caches file pages, directories, and file attributes that have been read from the disk in a main memory buffer cache. Client-side –On accessing data, cache it locally. What’s a typical problem with caching? –Consistency: cached data can become stale. 14

CSE 486/586, Spring 2012 (General) Caching Strategies Read-ahead (prefetch) –Read strategy –Anticipates read accesses and fetches the pages following those that have most recently been read. Delayed-write –Write strategy –New writes stored locally. –Periodically or when another client accesses, send back the updates to the server Write-through –Write strategy –Writes go all the way to the server’s disk This is not an exhaustive list! 15

CSE 486/586, Spring 2012 NFS Client-Side Caching Write-through, but only at close() –Not every single write –Helps performance Other clients periodically check if there’s any new write (next slide). Multiple writers –No guarantee –Could be any combination of writes Leads to inconsistency 16

CSE 486/586, Spring 2012 Validation A client checks with the server about cached blocks. Each block has a timestamp. –If the remote block is new, then the client invalidates the local cached block. Always invalidate after some period of time –3 seconds for files –30 seconds for directories Written blocks are marked as “dirty.” 17

CSE 486/586, Spring 2012 Failures Two design choices: stateful & stateless Stateful –The server maintains all client information (which file, which block of the file, the offset within the block, file lock, etc.) –Good for the client-side process (just send requests!) –Becomes almost like a local file system (e.g., locking is easy to implement) Problem? –Server crash  lose the client state –Becomes complicated to deal with failures 18

CSE 486/586, Spring 2012 Failures Stateless –Clients maintain their own information (which file, which block of the file, the offset within the block, etc.) –The server does not know anything about what a client does. –Each request contains complete information (file name, offset, etc.) –Easier to deal with server crashes (nothing to lose!) NFS’s choice Problem? –Locking becomes difficult. 19

CSE 486/586, Spring 2012 NFS Client-side caching for improved performance Write-through at close() –Consistency issue Stateless server –Easier to deal with failures –Locking is not supported (later versions of NFS support locking though) Simple design –Led to simple implementation, acceptable performance, easier maintenance, etc. –Ultimately led to its popularity 20

CSE 486/586, Spring 2012 Comparison: AFS AFS: Andrew File System Two unusual design principles: –Whole file serving: not in blocks –Whole file caching: permanent cache, survives reboots Based on (validated) assumptions that –Most file accesses are by a single user –Most files are small –Even a client cache as “large” as 100MB is supportable (e.g., in RAM) –File reads are much more often that file writes, and typically sequential Active invalidation –On write, the server tells each client to invalidate cached copies 21

CSE 486/586, Spring 2012 Summary Goal: emulate local file system behaviors Requirements –Transparency –Concurrent updates –Fault tolerance –Consistency –Security NFS –Caching with write-through policy at close() –Stateless server 22

CSE 486/586, Spring Acknowledgements These slides contain material developed and copyrighted by Indranil Gupta (UIUC).