
1 Distributed System

2 Distributed System
A distributed system (DS) consists of a collection of autonomous computers linked by a computer network and equipped with distributed system software.
DS software enables the computers to coordinate their activities and to share the resources of the system, i.e., hardware, software and data.
Users of a DS should perceive a single, integrated computing facility even though it may be implemented by many computers in different locations.

3 Characteristics of Distributed Systems
The following characteristics are primarily responsible for the usefulness of distributed systems:
Resource sharing
Openness
Concurrency
Scalability
Fault tolerance
Transparency
They are not automatic consequences of distribution; system and application software must be carefully designed

4 DESIGN GOALS
Key design goals: performance, reliability, consistency, scalability, security
Basic design issues
Naming
Communication: optimize the implementation while retaining a high-level programming model
Software structure: structure the system so that new services can be introduced that will interwork fully with existing services
Workload allocation: deploy the processing and communication resources for optimum effect in the processing of a changing workload
Consistency maintenance: maintain consistency at reasonable cost

5 Naming
Distributed systems are based on the sharing of resources and on the transparency of resource distribution
Names assigned to resources must:
have global meanings that are independent of location
be supported by a name interpretation system that can translate names to enable programs to access the resources
Design issue: design a naming scheme that will scale, and that translates names efficiently enough to meet appropriate performance goals (a toy resolution sketch follows)
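A minimal sketch of location-independent name resolution, assuming a hypothetical NameService class; the example name, server and resource identifiers are invented for illustration and are not part of any system described in these slides.

```python
# Minimal sketch of location-independent naming (illustrative only).
# The NameService class and the example names are hypothetical.

class NameService:
    def __init__(self):
        self._table = {}          # global name -> (server, local resource id)

    def bind(self, name, server, local_id):
        self._table[name] = (server, local_id)

    def resolve(self, name):
        # Translate a location-independent name into a location that the
        # client runtime can contact; the program only ever uses the name.
        try:
            return self._table[name]
        except KeyError:
            raise LookupError(f"unbound name: {name}")

ns = NameService()
ns.bind("/global/projects/report.txt", "fileserver-3", "vol7:inode42")
print(ns.resolve("/global/projects/report.txt"))   # ('fileserver-3', 'vol7:inode42')
```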

6 Communication Communication between a pair of processes involves:
transfer of data from the sending process to the receiving process
synchronization of the receiving process with the sending process, which may be required
Programming primitives
Communication structure: client-server, group communication (a minimal client-server sketch follows)
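A minimal client-server sketch of the two elements named above, data transfer and synchronization, using plain TCP sockets; the port number and the message contents are arbitrary choices for illustration, not anything prescribed by the slides.

```python
# Minimal sketch: data transfer plus synchronization between two processes
# (modelled here as threads). The port 50007 and the messages are arbitrary.
import socket, threading, time

def server():
    with socket.socket() as srv:
        srv.bind(("localhost", 50007))
        srv.listen(1)
        conn, _ = srv.accept()        # synchronization: blocks until the client connects
        with conn:
            data = conn.recv(1024)    # data transfer from sender to receiver
            conn.sendall(b"ack:" + data)

threading.Thread(target=server, daemon=True).start()
time.sleep(0.5)                       # crude start-up synchronization, for the sketch only

with socket.socket() as cli:
    cli.connect(("localhost", 50007))
    cli.sendall(b"request")           # send
    print(cli.recv(1024))             # receive blocks until the reply arrives: b'ack:request'
```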

7 Software Structure
Addition of new services should be easy
The main categories of software in a distributed system (layered from top to bottom):
Applications
Open services
Distributed programming support
Operating system kernel services
Computer and network hardware

8 Workload Allocation
How is work allocated amongst resources in a DS?
Workstation-server model
'putting the processor cycles near the user' is good for interactive applications
the capacity of the workstation determines the size of the largest task that can be performed on behalf of the user
does not optimize the use of processing and memory resources
a single user with a large computing task is not able to obtain additional resources
Some modifications of the workstation-server model: processor pool model, shared-memory multiprocessor

9 Processor Pool Model
Processor pool model
processors are allocated dynamically to users
a processor pool usually consists of a collection of low-cost computers
each processor in a pool has an independent network connection
processors do not have to be homogeneous
processors are allocated to processes for their lifetime
Users
use a simple computer or X-terminal
a user's work can be performed partly or entirely on the pool processors
Examples: Amoeba, Clouds, Plan 9

10 Use of Idle Workstations
A significant proportion of workstations on a network may be unused or used only for lightweight activities at any given time, especially overnight
Idle workstations can be used to run jobs for users who are logged on at other stations and do not have sufficient capacity at their own machines
In the Sprite OS the target workstation is chosen transparently by the system, which includes a facility for process migration
NOW (Networks of Workstations)
MPPs are expensive and workstations are not
the network is getting faster than any other component
used for: network RAM, cooperative file caching, software RAID, parallel computing, etc.

11 Consistency Maintenance
Update consistency
arises when several processes access and update data concurrently
changing a data value cannot be performed instantaneously
desired effect: the update looks atomic; a related set of changes made by a given process should appear to all other processes as if it were done instantaneously (a minimal sketch follows)
Significant because
many processes share data
the operation of the system itself depends on the consistency of file directories managed by file services, naming databases, etc.
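A minimal single-machine sketch of what "the update looks atomic" means, assuming an invented Account class: a lock ensures that a related pair of changes is never observed half-done by another thread. This only illustrates the desired effect; it is not a distributed update protocol.

```python
# Minimal sketch of update consistency (illustrative, not from the slides):
# the lock makes the related set of changes appear atomic to other threads,
# so no reader ever observes the account half-updated.
import threading

class Account:
    def __init__(self, balance):
        self.balance = balance
        self.updates = 0
        self._lock = threading.Lock()

    def deposit(self, amount):
        with self._lock:              # both fields change "at once" as seen by readers
            self.balance += amount
            self.updates += 1

    def snapshot(self):
        with self._lock:              # readers see either the old state or the new one
            return (self.balance, self.updates)

acct = Account(100)
acct.deposit(50)
print(acct.snapshot())                # (150, 1)
```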

12 Consistency Maintenance (cont’d)
Replication consistency
motivations for data replication: increased availability and performance
if data have been copied to several computers and subsequently modified at one or more of them, the possibility of inconsistencies arises between the values of data items at different computers

13 Consistency Maintenance (cont’d)
Cache consistency
caching vs replication: the same consistency problem as replication
examples: multiprocessor caches, file caches, cluster web servers

14 User Requirements
Functionality: what the system should do for users
Quality of service: issues of performance, reliability and security
Reconfigurability: accommodate changes without causing disruption to existing services

15 Distributed File System
Introduction
The Sun Network File System
The Andrew File System
The Coda File System
xFS

16 Introduction Three practical implementations.
Sun Network File System
Andrew File System
Coda File System
These systems aim to emulate the UNIX file system interface
caching of file data in client computers is an essential design feature, but the conventional UNIX file system offers one-copy update semantics
one-copy update semantics: the file contents seen by all concurrent processes are those they would see if only a single copy of the file contents existed
all three implementations allow some deviation from one-copy semantics; the one-copy model is not strictly adhered to

17 Server Structure
Connectionless
Connection-oriented
Iterative server
Concurrent server

18 Stateful Server
[Diagram] Client A calls fopen(...) and receives a file descriptor; subsequent fread(fp, nbytes) calls return data from the file system, and the file position is updated at the server, which keeps the file descriptor for client A.

19 Stateless Server
[Diagram] Client A issues fopen(fp, read), fread(..., position, ...) and fclose(fp); each read carries the file position explicitly, the file position is updated at the client, and the file descriptor for client A is held at the client rather than at the server.
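A minimal sketch contrasting the two server structures in the diagrams above; the class and method names are invented, and a real server would of course serve requests over the network rather than via direct method calls.

```python
# Minimal sketch of stateful vs stateless file servers (illustrative only;
# class and method names are not from any real system).

class StatefulFileServer:
    """The server remembers an open-file table; clients pass a small handle."""
    def __init__(self, files):
        self.files, self.table, self.next_fd = files, {}, 0
    def fopen(self, name):
        self.table[self.next_fd] = {"name": name, "pos": 0}
        self.next_fd += 1
        return self.next_fd - 1
    def fread(self, fd, nbytes):
        entry = self.table[fd]                   # server-side state per client
        data = self.files[entry["name"]][entry["pos"]:entry["pos"] + nbytes]
        entry["pos"] += len(data)                # file position updated at the server
        return data

class StatelessFileServer:
    """Every request is self-contained; safe to retry after a server crash."""
    def __init__(self, files):
        self.files = files
    def read(self, name, position, nbytes):
        return self.files[name][position:position + nbytes]   # no server-side state

files = {"a.txt": b"hello world"}
sf = StatefulFileServer(files)
fd = sf.fopen("a.txt")
print(sf.fread(fd, 5), sf.fread(fd, 6))          # b'hello' b' world'
sl = StatelessFileServer(files)
print(sl.read("a.txt", 0, 5))                    # the client tracks the position itself
```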

20 The Sun NFS
provides transparent access to remote files for client programs
each computer has client and server modules in its kernel
the client and server relationship is symmetric: each computer in an NFS installation can act as both a client and a server
larger installations may be configured with dedicated servers
available for almost every major system

21 The Sun NFS (cont’d) Design goals with respect to transparency
Access transparency
the API is identical to the local OS's file interface; thus, in a UNIX client, no modifications to existing programs are required for access to remote files
Location transparency
each client establishes its own file name space by adding remote file systems to its local name space (mount)
NFS does not enforce a single network-wide file name space; each client may see a unique name space

22 The Sun NFS (cont’d)
Failure transparency
the NFS server is stateless and most file access operations are idempotent
UNIX file operations are translated to NFS operations by an NFS client module
the stateless and idempotent nature of NFS ensures that failure semantics for remote file access are similar to those for local file access
Performance transparency
both the client and the server employ caching to achieve satisfactory performance
for clients, the maintenance of cache coherence is somewhat complex, because several clients may be using and updating the same file

23 The Sun NFS (cont’d)
Mount service
establishes the file name space in client computers
Migration transparency
file systems may be moved between servers, but the remote mount tables in each client must then be separately updated to enable the clients to access the file system in its new location
migration transparency is not fully achieved by NFS
Automounter
runs in each NFS client and enables pathnames to be used that refer to unmounted file systems

24 The Sun NFS (cont’d)
Replication transparency
NFS does not support file replication in a general sense
Concurrency transparency
UNIX supports only rudimentary locking facilities for concurrency control
NFS does not aim to improve upon the UNIX approach to the control of concurrent updates to files

25 The Sun NFS (cont’d)
Scalability
the scalability of NFS is limited, due to the lack of replication
the number of clients that can simultaneously access a shared file is restricted by the performance of the server that holds the file
that server can become a system-wide performance bottleneck for heavily used files

26 Implementation of NFS User-level client process: process using NFS
NFS client and server modules communicate using remote procedure calling.
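The slides give no code; the sketch below only illustrates the remote-procedure-call pattern named here, using Python's standard xmlrpc module as a stand-in. Real NFS uses Sun RPC with XDR marshalling, and the procedure name, export path and port below are invented.

```python
# Illustrative only: NFS uses Sun RPC/XDR, not XML-RPC; this sketch just shows
# the client-module -> server-module RPC pattern. Names and port are made up.
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

FILES = {"/export/notes.txt": b"remote file contents"}

def nfs_read(path, offset, count):
    """Server-side procedure: return `count` bytes of `path` starting at `offset`."""
    return FILES[path][offset:offset + count].decode()

server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
server.register_function(nfs_read, "read")
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client module: a local call is turned into a remote procedure call.
proxy = ServerProxy("http://localhost:8000")
print(proxy.read("/export/notes.txt", 0, 6))   # 'remote'
```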

27 The Andrew File System
Andrew
a distributed computing environment developed at CMU
Andrew File System (AFS)
reflects an intention to support information sharing on a large scale
provides transparent access to remote shared files for UNIX programs
scalability is the most important design goal
implemented on workstations and servers running BSD 4.3 UNIX or Mach

28 The Andrew File System (cont’d)
Two unusual design characteristics
whole-file serving: the entire contents of files are transmitted to client computers by AFS servers
whole-file caching: a copy of a file is stored in a cache on the client's local disk; the cache is permanent, surviving reboots of the client computer

29 The Andrew File System (cont’d)
The design strategy is based on some assumptions
files are small
reads are much more common than writes (about 6 times more common)
sequential access is common and random access is rare
most files are read and written by only one user
temporal locality of reference for files is high
Databases do not fit the design assumptions of AFS
they are typically shared by many users and are often updated quite frequently
databases are handled by their own storage mechanisms in any case

30 Implementation Some questions about the implementation of AFS
How does AFS gain control when an open or close system call referring to a file in the shared file space is issued by a client?
How is the server holding the required file located?
What space is allocated to cached files in workstations?
How does AFS ensure that the cached copies of files are up to date when files may be updated by several clients?

31 Implementation (cont’d)
Vice: name given to the server software that runs as a user-level UNIX process in each server computer. Venus: a user-level process that runs in each client computer.

32 Cache coherence Callback promise
a mechanism for ensuring that cached copies of files are updated when another client closes the same file after updating it
Vice supplies a copy of a file to a Venus process together with a callback promise
callback promises are stored with the cached files
the state of a callback promise is either valid or cancelled
when a Vice server performs an update on a file, it notifies all of the Venus processes to which it has issued callback promises by sending a callback
a callback is an RPC from a server to a client (i.e., Venus)
when a Venus process receives a callback, it sets the callback promise token for the relevant file to cancelled

33 Cache coherence (cont’d)
Handling open in Venus
if the required file is found in the cache, its token is checked
if the token is cancelled, a fresh copy of the file is fetched
if it is valid, the cached copy is used (see the sketch below)
Restart of a client computer after a failure
some callbacks may have been missed
for each file with a valid token, Venus sends a timestamp to the server
if the timestamp is current, the server responds with valid; otherwise it responds with cancelled
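A minimal sketch of the callback-promise interplay between Venus and Vice described on the last two slides; the class and method names are invented, the cache is an in-memory dict rather than a local disk, and writes go straight to the server for brevity.

```python
# Minimal sketch of AFS-style callback promises (illustrative only).

VALID, CANCELLED = "valid", "cancelled"

class Venus:
    def __init__(self, server):
        self.server = server
        self.cache = {}                     # name -> {"data": ..., "token": ...}

    def open(self, name):
        entry = self.cache.get(name)
        if entry is None or entry["token"] == CANCELLED:
            data = self.server.fetch(name, client=self)   # comes with a callback promise
            entry = self.cache[name] = {"data": data, "token": VALID}
        return entry["data"]                # valid token: use the cached copy

    def callback(self, name):
        # Callback "RPC" from the server: another client closed an updated copy.
        if name in self.cache:
            self.cache[name]["token"] = CANCELLED

class Vice:
    def __init__(self, files):
        self.files = files
        self.promises = {}                  # name -> set of clients holding promises

    def fetch(self, name, client):
        self.promises.setdefault(name, set()).add(client)
        return self.files[name]

    def store(self, name, data, client):
        self.files[name] = data
        for other in self.promises.get(name, set()) - {client}:
            other.callback(name)            # break the other clients' promises

server = Vice({"f": "v1"})
a, b = Venus(server), Venus(server)
print(a.open("f"), b.open("f"))             # v1 v1
server.store("f", "v2", client=b)
print(a.open("f"))                          # a's token was cancelled, so it refetches: v2
```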

34 Cache coherence (cont’d)
Callback promise renewal interval
callback promises must be renewed before an open if a time T (say, 10 minutes) has elapsed without communication from the server for a cached file
this deals with communication failures

35 Update semantics
For a client C operating on a file F held at a server S, the following guarantees are maintained
Update semantics for AFS-1
after a successful open: latest(F, S)
after a failed open: failure(S)
after a successful close: updated(F, S)
after a failed close: failure(S)
where
latest(F, S): the current value of F at C is the same as the value at S
failure(S): the open or close has not been performed at S
updated(F, S): C's value of F has been successfully propagated to S

36 Update semantics (2)
Update semantics for AFS-2: the currency guarantee for open is slightly weaker
after a successful open: latest(F, S, 0) or (lostCallback(S, T) and inCache(F) and latest(F, S, T))
where
latest(F, S, T): the copy of F seen by the client is no more than T seconds out of date
lostCallback(S, T): a callback message from S to C has been lost during the last T seconds
inCache(F): F was in the cache at C before the open was attempted

37 Update semantics (3)
AFS does not provide any further concurrency control mechanism
if clients on different workstations open, write and close the same file concurrently, only the updates resulting from the last close remain and all others are silently lost (no error is reported)
clients must implement concurrency control independently if they require it
when two client processes on the same workstation open a file, they share the same cached copy, and updates are performed in the normal UNIX fashion: block by block

38 The Coda File System
Coda File System
a descendant of AFS that addresses several new requirements [CMU]
replication for large-scale systems
improved fault tolerance
mobile use of portable computers
Goal: constant data availability
provide users with the benefits of a shared file repository, but allow them to rely entirely on local resources when the repository is partially or totally inaccessible
retain the original goals of AFS with regard to scalability and the emulation of UNIX

39 The Coda File System (cont’d)
Read-write volumes can be stored on several servers
gives higher throughput of file accesses and a greater degree of fault tolerance
Support for disconnected operation
an extension of the AFS mechanism for caching copies of files at workstations
enables workstations to operate when disconnected from the network

40 The Coda File System (cont’d)
Volume storage group (VSG): the set of servers holding replicas of a file volume
Available volume storage group (AVSG): the subset of the VSG currently accessible to a client wishing to open a file
Callback promise mechanism
clients are notified of changes, as in AFS
updates are propagated instead of invalidations

41 The Coda File System (cont’d)
Coda version vector (CVV)
attached to each version of a file
a vector of integers with one element for each server in the VSG
each element of the CVV counts the modifications performed on the version of the file held at the corresponding server
provides information about the update history of each file version, enabling inconsistencies to be detected and corrected automatically if the updates do not conflict, or with manual intervention if they do

42 The Coda File System (cont’d)
Repair of inconsistency
if every element of the CVV at one site is greater than or equal to the corresponding element at all other sites, that version dominates and the inconsistency can be repaired automatically
otherwise the conflict cannot, in general, be resolved automatically
the file is marked as 'inoperable' and the owner of the file is informed of the conflict; manual intervention is needed

43 The Coda File System (cont’d)
Scenario
when a modified file is closed, Venus sends an update message (the new contents of the file and the CVV) to each site in the AVSG
the Vice process at each site checks the CVV; if it is consistent, it stores the new contents and returns an acknowledgement
Venus then increments the elements of the CVV for the servers that responded positively to the update message and distributes the new CVV to the members of the AVSG

44 The Coda File System: Example
F is a file in a volume replicated at servers S1, S2 and S3; C1 and C2 are clients
VSG for F = {S1, S2, S3}; AVSG for C1 = {S1, S2}, AVSG for C2 = {S3}
1. Initially, the CVVs for F at all three servers are [1, 1, 1]
2. C1 modifies F: the CVVs for F at S1 and S2 become [2, 2, 1]
3. C2 modifies F: the CVV for F at S3 becomes [1, 1, 2]
4. No CVV dominates all the others: a conflict requiring manual intervention
Suppose instead that F is not modified in step 3. Then [2, 2, 1] dominates [1, 1, 1], so the version of the file at S1 or S2 should replace that at S3 (see the sketch below)
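A minimal sketch of the dominance check that this example relies on; the function names dominates and compare are invented, and this is only the vector comparison, not the full Coda repair protocol.

```python
# Minimal sketch of CVV comparison (illustrative; function names are made up).
# One vector dominates another if every element is >= the corresponding element.

def dominates(v1, v2):
    return all(a >= b for a, b in zip(v1, v2))

def compare(cvvs):
    """Return a dominant CVV if one exists, else None (conflict)."""
    for candidate in cvvs:
        if all(dominates(candidate, other) for other in cvvs):
            return candidate
    return None

# The example from this slide:
print(compare([[2, 2, 1], [2, 2, 1], [1, 1, 2]]))   # None -> manual repair needed
# If F is not modified at S3 in step 3:
print(compare([[2, 2, 1], [2, 2, 1], [1, 1, 1]]))   # [2, 2, 1] -> automatic repair
```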

45 Update semantics
The currency guarantees made by Coda when a file is opened at a client are weaker than those of AFS
Guarantee offered by a successful open
it provides the most recent copy of the file from the current AVSG
if no server is accessible, a locally cached copy of the file is used if one is available
Guarantee offered by a successful close
the file has been propagated to the currently accessible set of servers
if no server is available, the file has been marked for propagation at the earliest opportunity

46 Update semantics (cont’d)
S: the set of servers holding the file (the file's VSG); s: the AVSG for the file as seen by client C
after a successful open: (s ≠ ∅ and (latest(F, s, 0) or (latest(F, s, T) and lostCallback(s, T) and inCache(F)))) or (s = ∅ and inCache(F))
after a failed open: (s ≠ ∅ and conflict(F, s)) or (s = ∅ and ¬inCache(F))
after a successful close: (s ≠ ∅ and updated(F, s)) or (s = ∅)
after a failed close: s ≠ ∅ and conflict(F, s)
conflict(F, s) means that the values of F at some servers in s are currently in conflict

47 Cache coherence Venus at each client must detect the following events within T seconds enlargement of AVSG due to accessibility of a previously inaccessible server shrinking of an AVSG due to a server becoming inaccessible a lost callback Multicast messages to VSG

48 xFS
xFS: Serverless Network File System
described in the papers "A Case for NOW" and "Experience with a ..."
idea: treat the file system as a parallel program and exploit fast LANs
Cooperative caching
use remote memory to avoid going to disk
manage client memory as a global resource, since much of client memory is unused
the server fetches a file from another client's memory instead of from disk
it is better to send a replaced file copy to an idle client than to discard it
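A minimal sketch of the cooperative-caching idea, assuming an invented Manager/Client pair of classes; this only illustrates the "look in a peer's memory before going to disk" policy, not the xFS protocol itself.

```python
# Minimal sketch of cooperative caching (illustrative; not the xFS design).
# On a miss, the manager redirects the request to another client's memory
# before falling back to disk.

class Client:
    def __init__(self, name):
        self.name, self.memory = name, {}      # block id -> data cached in RAM

class Manager:
    def __init__(self, disk):
        self.disk, self.locations = disk, {}   # block id -> client caching it

    def read(self, client, block):
        if block in client.memory:                       # local hit
            return client.memory[block]
        holder = self.locations.get(block)
        if holder is not None and block in holder.memory:
            data = holder.memory[block]                  # remote-memory hit
        else:
            data = self.disk[block]                      # last resort: disk
        client.memory[block] = data
        self.locations[block] = client
        return data

mgr = Manager(disk={"b1": "data-on-disk"})
c1, c2 = Client("c1"), Client("c2")
mgr.read(c1, "b1")            # disk read, now cached in c1's memory
print(mgr.read(c2, "b1"))     # served from c1's memory, not from disk
```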

49 xFS Cache Coherence
Write-ownership cache coherence
each node can own a file; the owner has the most up-to-date copy
the server just keeps track of who owns the file
any request for the file is forwarded to the owner
a file is either owned (only one copy exists) or read-only (multiple copies exist)
to modify a file, a node first secures the file as owned; it can then modify it as many times as it wants
if another node reads the file, the owner sends the up-to-date version and the file is marked read-only
(the state transitions are summarized on the next slide)

50 xFS Cache Coherence
[State diagram] Per-copy states: invalid, read-only, owned. Transitions: invalid → read-only on a read; read-only → owned on a write; owned → read-only on a read by another node; read-only or owned → invalid on a write by another node.
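A minimal sketch of the state machine in the diagram above; the State enum and next_state function are invented names, the invalid → owned transition on a local write is an added assumption, and this is a per-copy sketch rather than the xFS implementation.

```python
# Minimal sketch of the write-ownership state machine for one cached copy
# (illustrative only).
from enum import Enum

class State(Enum):
    INVALID = "invalid"
    READ_ONLY = "read-only"
    OWNED = "owned"

def next_state(state, event):
    """Events: 'read', 'write', 'read_by_other', 'write_by_other'."""
    transitions = {
        (State.INVALID,   "read"):            State.READ_ONLY,
        (State.INVALID,   "write"):           State.OWNED,     # assumption, see lead-in
        (State.READ_ONLY, "write"):           State.OWNED,
        (State.OWNED,     "read_by_other"):   State.READ_ONLY,
        (State.READ_ONLY, "write_by_other"):  State.INVALID,
        (State.OWNED,     "write_by_other"):  State.INVALID,
    }
    return transitions.get((state, event), state)   # otherwise stay in the same state

s = State.INVALID
for e in ["read", "write", "read_by_other", "write_by_other"]:
    s = next_state(s, e)
    print(e, "->", s.value)
# read -> read-only, write -> owned, read_by_other -> read-only, write_by_other -> invalid
```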

51 xFS Software RAID
Cooperative caching makes availability a nightmare: any crash can damage part of the file system
Solution: stripe data redundantly over multiple disks (software RAID)
a missing part is reconstructed from the remaining parts
logging makes reconstruction easy

52 xFS Software RAID
Motivations
high bandwidth requirements from multimedia and parallel computing
economical workstations and high-speed networks
lessons from RAID: parallel I/O from inexpensive hard disks, fault management
limitations of hardware RAID: a single server, the small-write problem

53 xFS Software RAID
Approach: stripe each file across multiple file servers
Small-file problems
when the striping unit is too small (the ideal size is tens of kilobytes), a write requires two reads and two writes to check and rebuild the parity
when a whole file fits in a single striping unit, the parity consumes the same space as the data and the load cannot be spread across servers
(a minimal parity sketch follows)
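A minimal sketch of the XOR parity that striping with software RAID relies on; the helper xor_blocks is an invented name, and real implementations operate on fixed-size disk blocks rather than short byte strings.

```python
# Minimal sketch of RAID-style parity over stripe units (illustrative only).

def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks; this is the parity block."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

# Stripe a small file across three data servers plus one parity server.
stripe = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(stripe)

# One server crashes: reconstruct its stripe unit from the survivors plus parity.
lost = 1
survivors = [u for i, u in enumerate(stripe) if i != lost]
reconstructed = xor_blocks(survivors + [parity])
print(reconstructed == stripe[lost])    # True
```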

54 xFS Experiences
A formal method is needed for cache coherence
it is much more complicated than it looks
there are lots of transient states: 3 formal states expand to 22 implementation states
ad hoc test-and-retry leaves unknown errors in permanently
no one is sure about the correctness
software portability is poor

55 xFS Experiences
Threads in a server
a nice concept, but it introduces too much concurrency and too many data races
the most difficult thing in the world to understand, and difficult to debug
solution: an iterative server, which is difficult to design but simple to debug, less error-prone and efficient
RPC
not suitable for multi-party communication
gather/scatter RPC across servers is needed

