Introduction to DFS. Distributed File Systems A file system whose clients, servers and storage devices are dispersed among the machines of a distributed.


1 Introduction to DFS

2 Distributed File Systems
A file system whose clients, servers and storage devices are dispersed among the machines of a distributed system
File system operations have to be carried out over the network
A good DFS should ensure transparency
  – Clients should have the look and feel of a conventional file system

3 Naming and Transparency
Naming is the mapping between logical and physical objects
Location transparency
  – The name of a file does not reveal its physical storage location
  – Hides the association between names and physical storage
Location independence
  – Name and physical storage location are independent
  – The name need not be changed if the physical location is changed
  – Location-independent files are essentially logical data containers

4 Naming Schemes
Combination of host name and local name
  – Local name is a path similar to Unix
  – Neither transparent nor independent
Attaching remote directories to the local directory
  – Popularized by Sun’s NFS
  – Appears as a coherent directory tree
Globally unique names
  – Truly transparent
  – Global naming structure spans all names
  – Difficult to achieve due to special files
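The second scheme (attaching remote directories to the local tree) can be sketched as a mount-table lookup. This is only an illustration: the `MOUNTS` table, the host names, and the `resolve` helper are hypothetical and are not taken from NFS itself.

```python
# Hypothetical mount table: local mount point -> (server host, exported directory).
MOUNTS = {
    "/usr/shared": ("server1", "/export/shared"),
    "/home": ("server2", "/export/home"),
}


def resolve(path):
    """Return (host, path on that host) for a path in the client's directory tree."""
    # Try the longest mount point first so that nested mounts take precedence.
    for mount_point in sorted(MOUNTS, key=len, reverse=True):
        if path == mount_point or path.startswith(mount_point + "/"):
            host, export = MOUNTS[mount_point]
            return host, export + path[len(mount_point):]
    # No mount point matches: the file lives on the local machine.
    return "localhost", path
```

The client sees one coherent tree, but the mapping differs per machine, which is why this scheme is less transparent than globally unique names.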

5 Implementing Naming Schemes
Transparent naming requires a mapping between names and their associated locations
Aggregating files into components for scalability and manageability
  – Hierarchical directory trees
Replication and caching
  – Maintaining consistency of the cached view
Location-independent file identifiers
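One way to picture location-independent file identifiers is a two-level mapping: names map to identifiers, and identifiers map to the hosts currently holding a copy. The sketch below is a minimal illustration under that assumption; the `NameService` class and its method names are hypothetical.

```python
class NameService:
    """Two-level mapping: name -> file identifier -> hosts holding a replica."""

    def __init__(self):
        self.name_to_id = {}     # textual name -> location-independent identifier
        self.id_to_hosts = {}    # identifier -> list of hosts storing the file

    def create(self, name, file_id, hosts):
        self.name_to_id[name] = file_id
        self.id_to_hosts[file_id] = list(hosts)

    def locate(self, name):
        """Resolve a name to the hosts that currently store the file."""
        return list(self.id_to_hosts[self.name_to_id[name]])

    def migrate(self, file_id, old_host, new_host):
        """Move one replica; only the second-level mapping changes."""
        hosts = self.id_to_hosts[file_id]
        hosts[hosts.index(old_host)] = new_host
```

Because `migrate` never touches `name_to_id`, a file can move between machines without its name changing, which is the property the slides call location independence.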

6 Accessing Remote Files
Needs network data transfer
Remote service mechanism
Remote procedure call
Caching for improved performance

7 Caching
Idea is fetch once, use multiple times
If requested data is not available locally, get it from the server
Store fetched data
Perform accesses on local data
Replace data when the cache becomes full
One master copy at the server, several secondary copies at clients
Granularity
  – From file blocks to an entire file
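The fetch-once, use-many idea can be sketched as a small client-side block cache. The slides do not name a replacement policy, so least-recently-used eviction is an assumption here, as are the `BlockCache` class and the `fetch_from_server` callback.

```python
from collections import OrderedDict


class BlockCache:
    """Client-side cache of file blocks with LRU replacement (assumed policy)."""

    def __init__(self, fetch_from_server, capacity):
        self.fetch_from_server = fetch_from_server  # callable(file_name, block_no) -> data
        self.capacity = capacity                    # maximum number of cached blocks
        self.blocks = OrderedDict()                 # (file_name, block_no) -> data

    def read(self, file_name, block_no):
        key = (file_name, block_no)
        if key in self.blocks:
            # Cache hit: serve the local copy and mark it as recently used.
            self.blocks.move_to_end(key)
            return self.blocks[key]
        # Cache miss: get the block from the server and store it locally.
        data = self.fetch_from_server(file_name, block_no)
        self.blocks[key] = data
        if len(self.blocks) > self.capacity:
            # Cache is full: replace the least recently used block.
            self.blocks.popitem(last=False)
        return data
```

Caching whole files instead of blocks would only change the key from `(file_name, block_no)` to `file_name`.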

8 Cache Location
Main memory
  – Workstations can be diskless
  – Faster access
  – Technology trend: memory accesses are becoming faster
  – Server caches will be in main memory anyway, so the same caching code can be reused on clients
Local disks
  – Reliability via persistence
Hybrid schemes
  – Best of both worlds

9 Cache Update Policy
Policy regarding when modified data is reflected on the master copy
Can have a significant impact on performance
Write-through policy
  – All writes are reflected immediately on the master copy
  – Blocking
Delayed writes
  – Write on flush
  – Periodic writes
  – Write on close
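The difference between write-through and delayed writes can be shown with a toy cached file. This is a sketch only: a plain dictionary stands in for the server's master copies, and the `CachedFile` class is hypothetical.

```python
class CachedFile:
    """Client-side copy of one file with a selectable update policy."""

    def __init__(self, server, name, write_through):
        self.server = server                # dict standing in for the master copies
        self.name = name
        self.write_through = write_through  # True: write-through, False: delayed write
        self.data = server.get(name, "")
        self.dirty = False

    def write(self, data):
        self.data = data
        if self.write_through:
            # Write-through: the master copy is updated before write() returns.
            self.server[self.name] = data
        else:
            # Delayed write: only remember that the local copy has changed.
            self.dirty = True

    def flush(self):
        """Push pending changes to the master copy (write on flush)."""
        if self.dirty:
            self.server[self.name] = self.data
            self.dirty = False

    def close(self):
        # Write on close: pending changes reach the server when the file is closed.
        self.flush()
```

A periodic-write policy would call `flush()` from a timer instead of waiting for `close()`.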

10 Ensuring Consistency
Ensuring that the data being read is consistent with the master copy
Client-initiated approach
  – The client validates with the server whether its data is up to date
  – Frequency of validation is the main issue
  – Check on first access
  – Check on every access
  – Periodic checking
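The three validation frequencies can be expressed with a single `max_age` parameter, as in the sketch below. Version numbers as the validation mechanism, the `Server` and `ValidatingClient` classes, and the injectable `clock` are all assumptions made for illustration.

```python
import time


class Server:
    """Holds the master copies, each tagged with a version number."""

    def __init__(self):
        self.files = {}  # name -> (version, data)

    def write(self, name, data):
        version = self.files.get(name, (0, None))[0] + 1
        self.files[name] = (version, data)

    def version(self, name):
        return self.files[name][0]

    def read(self, name):
        return self.files[name]


class ValidatingClient:
    """max_age=0: check on every access; max_age=float('inf'): check on first
    access only; any other value: periodic checking."""

    def __init__(self, server, max_age, clock=time.monotonic):
        self.server = server
        self.max_age = max_age
        self.clock = clock
        self.cache = {}  # name -> [version, data, time of last validation]

    def read(self, name):
        now = self.clock()
        entry = self.cache.get(name)
        if entry is None:
            # First access: fetch the file, which also validates it.
            version, data = self.server.read(name)
            self.cache[name] = [version, data, now]
            return data
        if now - entry[2] >= self.max_age:
            # Validation is due: compare versions and refetch if stale.
            if self.server.version(name) != entry[0]:
                entry[0], entry[1] = self.server.read(name)
            entry[2] = now
        return entry[1]
```

A small `max_age` gives fresher data at the cost of more validation traffic, which is the frequency trade-off the slide refers to.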

11 Server-Initiated Approaches
Server records the files each client is accessing
Detects potential inconsistency and notifies clients
Conflicts occur when at least two clients cache the same file and one of them is writing
Invalidation- or update-based mechanisms
Session semantics
  – Consistency enforced upon file closing
Unix semantics
  – Consistency enforced upon write
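An invalidation-based mechanism can be sketched as follows: the server remembers which clients cache each file and notifies the others when one of them writes. The `CallbackServer` and `Client` classes are hypothetical, and invalidating on every write (rather than on close) is a choice made here to mimic write-time consistency.

```python
class CallbackServer:
    """Tracks which clients cache each file and invalidates stale copies."""

    def __init__(self):
        self.data = {}     # name -> current contents (master copy)
        self.cachers = {}  # name -> set of clients holding a cached copy

    def read(self, client, name):
        # Record that this client now caches the file.
        self.cachers.setdefault(name, set()).add(client)
        return self.data.get(name)

    def write(self, writer, name, value):
        self.data[name] = value
        # Notify every other client that its cached copy is out of date.
        for client in self.cachers.get(name, set()) - {writer}:
            client.invalidate(name)
        self.cachers[name] = {writer}


class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}  # name -> cached contents

    def read(self, name):
        if name not in self.cache:
            self.cache[name] = self.server.read(self, name)
        return self.cache[name]

    def write(self, name, value):
        self.cache[name] = value
        self.server.write(self, name, value)

    def invalidate(self, name):
        # Called by the server; the next read will refetch the file.
        self.cache.pop(name, None)
```

Under session semantics the call to `server.write` would move from `Client.write` to a close operation, so other clients would see changes only after the writer closes the file.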

12 Why or Why Not Caching
Locality of accesses
  – Gains in performance and scalability
Transferring data in big chunks lowers the per-request overhead
Disk accesses can be optimized for larger chunks of data
Consistency maintenance is the cost
Memory and disk space are required at clients

13 Stateful vs. Stateless Servers
Stateful servers maintain information about the files being accessed by clients
  – Clients are given connection ids, which act as indexes into the inode tables
  – Performance gains, e.g. prefetching file blocks
Stateless servers maintain no state
  – Each request is self-contained
Reliability is the issue: a stateful server loses its volatile state in a crash, whereas a stateless server can simply restart
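The contrast shows up most clearly in the shape of a read request. In the sketch below, both server classes and the `crash` method are hypothetical; file contents are held in a plain dictionary.

```python
class StatefulServer:
    """Remembers open files; a read request only carries a connection id."""

    def __init__(self, files):
        self.files = files    # name -> contents
        self.open_files = {}  # connection id -> [name, current offset]
        self.next_id = 0

    def open(self, name):
        self.next_id += 1
        self.open_files[self.next_id] = [name, 0]
        return self.next_id

    def read(self, conn_id, count):
        entry = self.open_files[conn_id]
        name, offset = entry
        chunk = self.files[name][offset:offset + count]
        entry[1] = offset + len(chunk)  # the server tracks the position
        return chunk

    def crash(self):
        # Volatile state is lost; existing connection ids become meaningless.
        self.open_files.clear()


class StatelessServer:
    """Keeps nothing between requests; each request is self-contained."""

    def __init__(self, files):
        self.files = files

    def read(self, name, offset, count):
        return self.files[name][offset:offset + count]
```

After `crash()`, a stateful client must reopen its files, while a stateless client can simply resend the same request.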

