Presentation is loading. Please wait.

Presentation is loading. Please wait.

Distributed Systems. Interprocess Communication (IPC) Processes are either independent or cooperating – Threads provide a gray area – Cooperating processes.

Similar presentations


Presentation on theme: "Distributed Systems. Interprocess Communication (IPC) Processes are either independent or cooperating – Threads provide a gray area – Cooperating processes."— Presentation transcript:

1 Distributed Systems

2 Interprocess Communication (IPC) Processes are either independent or cooperating – Threads provide a gray area – Cooperating processes can affect other processes execution Reasons for cooperation – Information sharing – Parallelization – Modularity – Convenience Cooperating processes must communicate (IPC) Two models: – Shared memory – Message passing

3 Communication Models Process A Process B Kernel Process A Process B Kernel Shared Memory

4 Processes – Each process has private address space – Must explicitly setup shared memory segments inside each process address space Threads – Default behavior, always on Advantages – Fast and easy to share data Disadvantages – Must synchronize data accesses; error prone

5 Signals Signal – Software based notification of event – Available signals for applications SIGUSR1, SIGUSR2, etc… Signal Reception – Catch: Process specifies a handler to call – Default is to ignore signal – Signals can also be masked Disadvantages – Does not exchange data – Complex thread semantics

6 Message Passing Processes exchange messages – Messages are arbitrary pieces of information – Explicit send and receive operations Advantages – Explicit sharing (easier to reason about) – Improves modularity (well defined interfaces) – Improved isolation Disadvantage – Performance overhead for handling messages

7 Message Passing Where do you send a message? – How do you NAME recipients? Depends on environment: IP + port, PID, UNIX pipe file How do you parse a message? – Requires a protocol Message format Message order Process states Directionality – One way: one process always sends data, other receives – Two way: Receiver sends a reply for each message Example: Remote Procedure Calls

8 Remote Procedure Calls Invocation of a function in a separate process – Identifies which function to invoke – Provides function arguments – Sends return value back to caller Function invocation info must be encapsulated into message – Marshaling Transform arguments and return values into message data Requires knowledge about data types Either by hand or automated

9 Distributed File systems Goal: View a distributed system as a file system – Storage is distributed Issues not common to local file systems – Naming transparency – Load balancing – Scalability – Location and network transparency – Fault tolerance

10 Transfer Model Upload/Download model – Client downloads file and modifies local copy – Uploads final result back to server – Simple with good performance Remote Access model – File only exists on server, all I/O operations forwarded by client

11 Naming Transparency Naming is a mapping from logical to physical objects Ideally client interface should be transparent – No difference between local and remote files – No partitioning of namespace 2 types of transparency – Location transparency: path provides no info about location – Location independence: Move files without changing names Name space separate from storage device hierarchy

12 Caching Keep repeatedly accessed blocks in cache – Improves performance of further accesses Very similar to buffer cache – But cached copy can reside on local disk – Synchronization occurs much less frequently – Cache consistency becomes a much larger issue Multiple levels of caching – Memory: Faster accesses, less state to track (NFS) – Disk: Reliable, allows disconnected operation (AFS)

13 Network File System (NFS) Developed by Sun in 1984 – Joined FSes on multiple systems into one logical whole Most common modern Distributed File System – Used by Facebook until 2009 Assumptions – Allows arbitrary collection of users to share a file system – Clients and servers can be on different networks – Systems can act as both clients and servers simultaneously Architecture – Server exports one or more directories to remote clients – Clients access exported directories by mounting them into local file system

14 Example

15 NFS Protocol NFS operates over RPC operations for remote file access All UNIX system calls OTHER than open and close – Searching a directory – Read directory entries – Manipulate links and directories – Read/Write files Every operation is transmitted over network and performed on NFS server

16 No Open or Close? Open and Close are intentionally left out – To read a file, a client sends a “lookup” message to server – Server finds file and returns a handle – Lookup does add entry to open file table No file descriptor – Read RPCs include handle, offset, number of bytes, and data – Each message is independent and self contained Pros: – Server is stateless, – Does not need to track open files Cons: – Locking is difficult – No concurrency control

17 NFS implementation System Call layer – Handles calls like open, read and close Virtual File System (VFS) Layer – Maintains table with one entry (v-node) for each open file – V-nodes indicate if a file is remote or local Remote files include server information Local files include inode information NFS Service layer – Lowest layer that implements network protocol

18 NFS Layer structure

19 Cache Coherency Clients cache file attributes and data – If two clients cache the same data, cache coherency is lost Solutions – Each block has a timer (data: 3 sec, dirs: 30 sec) Entries discarded when time expires – When opening cached files, check modification time on server If cache is old, discard data – Flush cache to server every 30 seconds

20 Andrew File System (AFS) Developed at CMU for student computing Consists of workstation clients and dedicated file servers Workstations have local disks to cache used files – Originally entire files, now just 64KB chunks Single namespace for every system in the world – /afs/cs.pitt.edu Very good for widely distributed operation – Local disk caching allows very fast performance on slow connections – Startup is slow, but afterwards most accesses are local

21 AFS Overview Based on upload/download model – Clients download and cache files – Server keeps track of clients with cached copies – Clients upload files at end of session Whole file caching is central idea behind AFS – Later changed to block operations – Simple and effective AFS servers are stateful – Keep track of clients that have cached files – Recall files that have been modified

22 AFS Details Based on dedicated server machines Clients see partitioned namespace – Local name space and shared name space – Cluster of dedicated AFS servers present shared name space AFS file names work anywhere in the world – /afs/cs.pitt.edu/usr0/jacklange

23 AFS: Operations and Consistency AFS caches entire files from servers – Client interacts with servers only for open/close OS on client intercepts calls, and passes it to local AFS process – Local process caches files from servers – Contacts AFS servers for each open/close Only if file is not in cache – File reads/writes operate on local cached copy

24 AFS Caching and Consistency Need for scaling led to reduction of client-server message traffic Once a file is cached, all operations are performed locally – Cache is on disk, so normal FS operations are used On close, if file is modified, it is replaced on the server – What happens if multiple clients share a file? Client assumes its cache is up to date… – Unless it receives a callback message from server – On file open, if client has received callback for that file, it must fetch a new copy

25 Summary NFS – Simple distributed file system protocol – Stateless (No open/close) – Has problems with cache consistency and locking AFS – More complex protocol – Stateful server – Session semantics – Consistency on close


Download ppt "Distributed Systems. Interprocess Communication (IPC) Processes are either independent or cooperating – Threads provide a gray area – Cooperating processes."

Similar presentations


Ads by Google