
Distributed Operating Systems


1 Distributed Operating Systems

2 Coverage
Distributed Systems (DS)
- DS & OS's
- Services and Models
- Communication
- Distributed File Systems
Paradigms
Coordination
- Distributed Mutual Exclusion (ME)
- Distributed Co-ordination
- Synchronization
DS Scheduling & Misc. Issues

3 What is a Distributed System?
"A distributed system is one that prevents you from working because of the failure of a machine you have never heard of." (Leslie Lamport)
"A distributed system is a collection of independent computers that appear to the users of the system as a single computer." (Tanenbaum)
Figure: shared-memory multiprocessor vs. message-passing multiprocessor vs. distributed system.
=> Multiple computers sharing (the same) state, interconnected by a network.

4 Distribution: Example Pros/Cons
All the good stuff: high performance, distributed access, scalability, heterogeneity, sharing (concurrency), load balancing (migration, relocation), fault tolerance, ...
Bank account database (DB) example:
- Naturally centralized: easy consistency and performance
- Fragment the DB among regions: exploit locality of reference, improve security, and reduce reliance on the network for remote access
- Replicate each fragment for fault tolerance
But we now need additional DS techniques to:
- Route each request to the right fragment (see the sketch below)
- Maintain access to, and consistency of, the fragments as a whole database
- Maintain access to, and consistency of, each fragment's replicas
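A minimal Python sketch of the last three points, with made-up fragment and replica names: a request is routed to the regional fragment that owns the account, and a write fans out to that fragment's replicas for fault tolerance.

```python
# Hypothetical sketch: route a bank-account request to its region's fragment,
# then apply writes to every replica of that fragment.

FRAGMENTS = {
    "eu": ["eu-db-1", "eu-db-2"],   # fragment -> replica hosts (made-up names)
    "us": ["us-db-1", "us-db-2"],
}

def fragment_for(account_id: str) -> str:
    """Pick the fragment by the account's region prefix, e.g. 'eu:12345'."""
    return account_id.split(":", 1)[0]

def write(account_id: str, balance: int, stores: dict) -> None:
    """Apply the update to every replica of the owning fragment."""
    for replica in FRAGMENTS[fragment_for(account_id)]:
        stores.setdefault(replica, {})[account_id] = balance

stores = {}
write("eu:12345", 100, stores)
print(stores)   # both eu replicas now hold the new balance
```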

5 Transparency: Global Access
The illusion of a single computer across a DS.
Fragmentation transparency: hide whether the resource is fragmented or not.
Transparency means that a distributed system hides its distributed nature from its users, appearing and functioning as a normal centralized system.
Distribution transparency = access transparency + location transparency + replication transparency + fragmentation transparency.

6 Multiprocessor OS Types (1)
Each CPU has its own operating system; all CPUs share a common bus.
Shared bus => communication blocking & CPU idling!

7 Multiprocessor OS Types (2)
Master-slave multiprocessors (shared bus).
The master is a bottleneck!

8 Multiprocessor OS Types (3)
Symmetric multiprocessors (SMP model, shared bus).
Eliminates the CPU bottleneck, but raises issues of mutual exclusion (ME) and synchronization: a mutex on the whole OS?

9 OS’s for DS’s Loosely-coupled OS Tightly-coupled OS
A collection of computers each running their own OS, OS’s allow sharing of resources across machines A.K.A. Network Operating System (NOS) Manages heterogeneous multicomputer DS Difference: provides local services to remote clients via remote logging Data transfer from remote OS to local OS via FTP (File Transfer Protocols) Tightly-coupled OS OS tries to maintain single global view of resources it manages A.K.A. Distributed Operating System (DOS) Manages multiprocessors & homogeneous multicomputers Similar “local access feel” as a non-distributed, standalone OS Data migration or computation migration modes (entire process or threads)

10 Network Operating Systems (NOS)
ACCENT: Network OS kernel developed at Carnegie-Mellon U. for the PERQ workstation. Early 1980s [Rashid & Robertson 1981].
BOS/NET: Multitasking, multiprocessing version of BOS/5.
COCANET UNIX: A local network operating system based on UNIX, developed for the COCANET local area network at U.C. Berkeley. Early 1980s [Rowe & Birman 1982].
CP/NET: Networking version of CP/M. Digital Research, early 1980s [Kildall 1981, Rolander 1981].
CP/NOS: A memory-resident, diskless version of CP/NET. Digital Research, early 1980s [Kildall 1981].
HetNOS
LahNOS
MP/NET: Version of MP/M with networking facilities. Digital Research, early 1980s [Kildall 1981].
MP/NOS: Memory-resident, diskless version of MP/NET. Digital Research, early 1980s [Kildall 1981].
NetWare: Network OS for local area network and server control, by Novell.
Newcastle Connection: A network OS layer for UNIX systems providing transparent distributed access. Early 1980s [Brownbridge et al 1982].
NSW: National Software Works. Late 1970s [Millstein 1977].
PC/NOS: Network OS for MS-DOS or CP/M. Applied Intelligence [Row & Daugherty 1984].
RIO/CP: Network operating system for the ZNET. Late 1970s [Zarella 1981].
RSEXEC: Network OS for the ARPANET, based principally on TENEX. Early 1970s [Thomas 1973].
TRIX: A network-oriented OS. Late 1970s [Ward 1980].
uNETix: Network OS for the 8086 and 68000 families. Multitasking, with transparent remote file access, load balancing, and multiple windows. UNIX and PC-DOS compatible. Lantech Systems, mid-1980s [Foster 1984].

11 Distributed Operating Systems (DOS)
Distributed operating systems differ from network operating systems in supporting a transparent view of the entire network, in which users normally do not distinguish local resources from remote resources.
AEGIS: OS for the Apollo DOMAIN distributed system. Early 1980s.
AMOEBA: A distributed OS based partly on UNIX, built on passive data objects protected by encrypted capabilities. 1980s [Tanenbaum & Mullender 1981, Mullender & Tanenbaum 1986].
Arachne: A distributed operating system developed at the U. of Wisconsin. Late 1970s [Finkel 1980].
Charlotte: Distributed OS for the Crystal multicomputer project at the U. of Wisconsin; explores coarse-grained parallelism without shared memory for computationally intensive tasks. 1980s [Finkel et al 1989].
CHOICES: Distributed, object-oriented OS featuring a high degree of customization. U. of Illinois, 1990s [Campbell et al 1993].
Clouds: A distributed object-based operating system developed at the Georgia Institute of Technology. Early 1990s [DasGupta 1991].
CMDS: The Cambridge Model Distributed System. U. of Cambridge (England). Late 1970s [Wilkes & Needham 1980].
CONDOR: A distributed OS described as a "hunter of idle workstations," which distributes large, computationally intensive jobs among available processors in a workstation pool. U. of Wisconsin at Madison, 1980s [Litzkow 1988].
Cronus: Object-oriented distributed computing system for heterogeneous environments. BBN Systems, 1980s [Schantz et al 1986].
DEMOS/MP: A distributed version of the DEMOS operating system; message-based, featuring process migration. U.C. Berkeley, early 1980s [Miller et al 1984].
DISTOS: A distributed OS for a network of 68000s.
DISTRIX: Message-based distributed version of UNIX. Early 1980s.
DUNIX: A distributed version of UNIX developed at Bell Labs. Late 1980s [xxx 1988].
Eden: A distributed object-oriented OS at the U. of Washington, based on an integrated distributed network of bit-mapped workstations; capability-based. Early 1980s [Almes et al 1985].
Galaxy: A distributed UNIX-compatible system featuring multi-level IPC and variable-weight processes. U. of Tokyo, late 1980s [Sinha et al 1991].
LOCUS: Distributed OS based on UNIX. Mid 1980s [Popek & Walker 1985].
MDX
MICROS: Distributed OS for MICRONET, a reconfigurable network computer. Late 1970s [Wittie & van Tilborg 1980].
MOS: An early version of MOSIX; controls four linked PDP-11s. Mid 1980s [Barak & Litman 1985].
MOSIX: A distributed version of UNIX supporting full transparency and dynamic process migration for load balancing. Developed at the Hebrew U. of Jerusalem. Mid 1980s to 1990s [Barak et al 1993].
Newark: Early version of Eden developed for the VAX environment; the name was chosen because it was "far from Eden."
NSMOS: A version of MOSIX for National Semiconductor VR32 systems. Late 1980s [Barel 1987].
Plan9: Distributed UNIX-like system developed at Bell Labs by the originators of UNIX. Features per-process name spaces, allowing each process a customized view of the resources in the system. 1990s [Pike et al 1995].
REPOS: Operating system for small PDP-11s attached to a host computer. Late 1970s [Maegaard & Andreasan 1979].
RIG: Rochester Intelligent Gateway. Network OS developed at the U. of Rochester; influenced Accent and Mach. Early 1970s [Ball et al 1976].
Roscoe: Distributed OS for multiple identical processors (LSI-11s). U. of Wisconsin, late 1970s [Solomon & Finkel 1979].
Saguaro: Distributed OS at the U. of Arizona, supporting varying degrees of transparency. Mid 1980s [Andrews et al 1987].
SODA: A Simplified OS for Distributed Applications. Mid 1980s [Kepecs & Solomon 1985].
SODS/OS: OS for a distributed system, developed on the IBM Series/1 at the U. of Delaware. Late 1970s [Sincoskie & Farber 1980].
Spring: Distributed multiplatform OS developed by Sun; not related to the Spring Kernel, a real-time system. 1990s [Mitchell et al 1994].
Uniflex: Multitasking, multiprocessing OS for the family. Technical Systems Consultants, early 1980s [Mini-Micro 1986].
V: Experimental distributed OS linking powerful bit-mapped workstations at Stanford U. Early 1980s [Cheriton 1984, Berglund 1986].

12 Client-Server Model for DOS & NOS

13 Middleware
Can we have the best of both worlds?
- Scalability and openness of a NOS
- Transparency and relative ease of use of a DOS
Solution: an additional layer of SW above the NOS that
- masks heterogeneity
- improves distribution transparency (and more)
=> "Middleware" (MW)

14 Middleware (& Openness)
- Document-based middleware (e.g., the WWW)
- Coordination-based MW (e.g., Linda, publish/subscribe, Jini, etc.)
- File-system-based MW
- Shared-object-based MW

15 File System-Based Middleware (1)
Approach: make a DS look like a big file system.
Transfer models (contrasted in the sketch below):
- Download/upload model (work done locally)
- Remote access model (work done remotely)
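A toy Python illustration of the two transfer models (the FileServer class and its file contents are hypothetical): download/upload moves the whole file to the client and back, while remote access performs each operation at the server.

```python
# Toy contrast of the two transfer models (hypothetical FileServer class).

class FileServer:
    def __init__(self):
        self.files = {"notes.txt": "v1"}

    # Download/upload model: client fetches the whole file, works on it
    # locally, and writes the result back.
    def download(self, name): return self.files[name]
    def upload(self, name, data): self.files[name] = data

    # Remote access model: every operation runs at the server.
    def append(self, name, data): self.files[name] += data

server = FileServer()
local = server.download("notes.txt")          # work done locally
server.upload("notes.txt", local + "+edit")
server.append("notes.txt", "+more")           # work done remotely
print(server.files["notes.txt"])              # v1+edit+more
```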

16 File System-Based Middleware (2)
Figure: (a) two separate file systems; (b) naming transparency: all clients have the same view of the FS; (c) some clients with different FS views.

17 File System-Based Middleware (3)
Semantics of file sharing (ordering and session semantics):
(a) A single processor gives sequential consistency.
(b) A distributed system may return an obsolete value.

18 Shared Object-Based Middleware (1)
Approach: make a DS look like objects (variables + methods)
- Easy scaling to large systems
- Replicated objects (C++, Java)
- Flexibility
Example 1: main elements of a CORBA (Common Object Request Broker Architecture) based system:
- Object Request Broker (ORB)
- Internet Inter-ORB Protocol (IIOP)

19 Shared Object-Based Middleware (2)
Example 2: Globe [1]
- Provides a model of distributed shared objects and basic support services (naming, locating objects, etc.)
- An object is completely self-contained
- Designed to scale to a billion users and a trillion objects around the world
- GIDS: Globe Infrastructure Directory Service
[1] M. van Steen, P. Homburg, and A. Tanenbaum, "Globe: A Wide-Area Distributed System", 1999.

20 Shared Object-Based Middleware (3)
A distributed shared object in Globe can have its state copied on multiple computers at once.
How can sequential consistency of write operations be maintained? (One common approach is sketched below.)
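One common answer, sketched in Python below, is primary-based ordering: all writes are funneled through a single primary replica so every copy applies them in the same order. This is a generic illustration of sequentially consistent writes, not Globe's actual protocol.

```python
# Sketch: primary-based write ordering for a replicated object.

class Backup:
    def __init__(self): self.state = {}
    def apply(self, key, value): self.state[key] = value

class PrimaryReplica:
    def __init__(self, backups):
        self.state, self.log, self.backups = {}, [], backups

    def write(self, key, value):
        self.log.append((key, value))   # the primary fixes one global order
        self.state[key] = value
        for b in self.backups:          # backups replay writes in that order
            b.apply(key, value)

backups = [Backup(), Backup()]
primary = PrimaryReplica(backups)
primary.write("x", 1)
primary.write("x", 2)
assert all(b.state == {"x": 2} for b in backups)   # all copies agree
```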

21 Shared Object-Based Middleware (4)
2.1 The structure of Globe objects
The local representatives of a Globe object are composed of subobjects: modules that each take care of a particular aspect of the object's implementation. A replica local representative of an object minimally consists of five subobjects, as illustrated in Figure 2.

Semantics subobject: the actual implementation of the distributed object's methods; it logically holds the state of the object (which may physically be on persistent storage). It can be developed without taking many distribution or replication issues into account. Accesses to a semantics subobject are serialized: at most one thread is active in a semantics subobject at any point in time.

Replication subobject: a Globe object may have semantics subobjects in multiple local representatives (i.e., be replicated) for reasons of fault tolerance or performance. The replication subobject is responsible for keeping the state of these replicas consistent according to the consistency model chosen for this particular distributed object. It may also manage the degree and placement of replicas (how many replicas the object should have and where they should be located); in other words, it is in charge of the object's distribution strategy. To perform this task, the subobject communicates with its peers in other local representatives using an object-specific replication protocol, and different distributed objects use different (sets of) replication subobjects depending on the protocol chosen. An important aspect of replication subobjects is that there is one standardized interface for all of them. The same holds for the communication subobject, allowing a large library of such subobjects to be built and reused by other applications.

Security subobject: implements the security policy for the Globe object. In particular, a security subobject controls access to the local representative and ensures that the local representative talks only to authorized peers. To perform these tasks it can use different security protocols and keying schemes. The Globe security framework is described in detail in [10], and is currently being built into our Globe implementation.

Communication subobject: responsible for handling communication between the local representatives of the distributed object residing in different address spaces, usually on different machines. Depending on what the other subobjects need, a communication subobject may offer (reliable or unreliable) primitives for point-to-point communication, group communication, or both. This is generally a system-provided subobject (i.e., taken from a library).

Control subobject: takes care of invocations from client processes and controls the interaction between the semantics subobject and the replication subobject. It bridges the gap between the programmer-defined interfaces of the semantics subobject and the standardized interface of the replication subobject; for example, the control subobject marshals and unmarshals method invocations and replies. Because of this bridging function, a control subobject has both a standardized interface used by the replication subobject and programmer-defined interfaces used by client processes (in Globe, as in Java, a (sub)object can have multiple interfaces).

Instead of using a stub compiler to generate control subobjects, Globe uses Java's reflection APIs [11, 12] to generate the application-defined interfaces at run time and connect them to a generic (i.e., application-independent) control subobject.
Figure: internal structure of a Globe object.
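A loose Python analogue of this reflection idea (all class names here are hypothetical, not Globe's API): a single generic control subobject unmarshals a message and uses runtime reflection (getattr) to invoke the named method on the semantics subobject, so no per-class stub compiler is needed.

```python
# Sketch: a generic, application-independent control subobject that
# dispatches marshalled calls via reflection.

import json

class Semantics:                    # programmer-defined semantics subobject
    def __init__(self): self.total = 0
    def add(self, n): self.total += n; return self.total

class Control:
    """Generic control subobject: unmarshals a call, invokes by name."""
    def __init__(self, semantics):
        self.semantics = semantics

    def invoke(self, message: str):
        call = json.loads(message)                        # unmarshal
        method = getattr(self.semantics, call["method"])  # reflection
        return method(*call["args"])

ctl = Control(Semantics())
print(ctl.invoke(json.dumps({"method": "add", "args": [5]})))   # -> 5
```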

22 OS’s, DS’s & MW

23 Network Hardware: The Internet

24 Network Services and Protocols
Network services may be blocking or non-blocking.
- Internet Protocol (IP)
- Transmission Control Protocol (TCP): connection-oriented
- User Datagram Protocol (UDP): connectionless

25 Client-Server Communications
Unbuffered message passing: send(addr, msg), recv(addr, msg)
- All requests and replies are at the client/server (C/S) level
- All message acks are between kernels only
- The message is directed at a process
Buffered message passing:
- The message is sent to a kernel mailbox or to a kernel/user interface (a socket)
Blocking or non-blocking? (See the socket sketch below.)
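A minimal sketch of blocking message passing with Python's standard socket module: each recv() blocks its caller until a message directed at its socket arrives.

```python
# Blocking message passing over a connected socket pair; the "server" runs
# in a second thread so the blocking calls can complete in one process.

import socket
import threading

client, server = socket.socketpair()

def serve():
    request = server.recv(1024)              # blocks until a request arrives
    server.sendall(b"reply to " + request)

threading.Thread(target=serve).start()
client.sendall(b"request-1")
print(client.recv(1024))                     # blocks until the reply arrives
```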

26 Remote Procedure Calls (RPC)
Synchronous/asynchronous (blocking/non-blocking) communication:
- [Sync] The client generates a request; stub -> kernel
- [Sync] The kernel blocks the process until the reply is received from the server (see the sketch below)
- [Async] The kernel buffers the msg and returns control immediately
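A synchronous RPC round trip can be demonstrated with Python's standard xmlrpc modules (the registered function name and use of an ephemeral port are arbitrary choices): the client-side call blocks until the server's reply arrives.

```python
# Synchronous (blocking) RPC using the standard library's XML-RPC support.

import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

# Server: register a procedure and serve requests in a background thread.
server = SimpleXMLRPCServer(("localhost", 0), logRequests=False)
server.register_function(lambda a, b: a + b, "add")
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client: the proxy call is an ordinary-looking procedure call that
# blocks until the server's reply is received.
host, port = server.server_address
proxy = xmlrpc.client.ServerProxy(f"http://{host}:{port}")
print(proxy.add(2, 3))   # [Sync] caller blocks here until the reply arrives
```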

27 RPC & Stubs (a stub is a dummy procedure standing in for the remote call)
Key: C: client; CS: client stub; K: kernel; S: server; SS: server stub; NS: network service. Steps in execution order (a marshalling sketch follows below):
[C] call "client stub" procedure
[CS] prepare msg. buffer
[CS] load parameters into buffer
[CS] prepare msg. header
[CS] send trap to kernel
[K] context switch to kernel
[K] copy msg. to kernel
[K] determine server address (NS)
[K] put address in header
[K] set up network interface
[K] start timer for msg
... then, on the server side ...
[K] process interrupt (save PC, kernel state)
[K] check packet for validity
[K] decide which stub to assign
[K] see if stub is waiting
[K] copy msg. to stub
[K] context switch to server stub
[SS] set up parameter stack / unbundle parameters
[SS] call server
[S] process request
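A hypothetical, stripped-down pair of stubs showing only the marshalling steps (the "ADD!" header format and the procedure itself are made up): the client stub packs parameters into a message buffer, and the server stub unbundles them and calls the real procedure.

```python
# Sketch of the [CS] and [SS] marshalling steps of an RPC.

import struct

def add_stub(a: int, b: int) -> bytes:
    """Client stub: looks like a local call, actually builds a message."""
    header = b"ADD!"                      # [CS] prepare msg. header (made up)
    body = struct.pack("!ii", a, b)       # [CS] load parameters into buffer
    return header + body                  # [CS] message handed to the kernel

def server_stub(msg: bytes) -> int:
    """Server stub: unbundles parameters and calls the real procedure."""
    a, b = struct.unpack("!ii", msg[4:])  # [SS] unbundle the parameters
    return a + b                          # [SS] call the server procedure

print(server_stub(add_stub(2, 3)))        # -> 5
```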

28 Remote Procedure Call Implementation Issues
Can we pass pointers? (A pointer is only meaningful in its local context.)
- Call by reference becomes copy/restore, but this might fail, e.g., when two pointers alias the same object (see the sketch below)
Weakly typed languages: e.g., C allows computations, say a product of arrays, without array-size specifications
- The client stub cannot always determine an unspecified size to pass on
It is not always possible to determine parameter types.
Global variables cannot be used: the client or server may get moved to a remote machine.
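A small sketch of why call by reference degrades to copy/restore (the function names are illustrative): the argument is copied to the callee, mutated there, and copied back over the caller's variable, which mimics, but does not equal, true reference semantics.

```python
# Copy/restore ("call by value-result") in miniature.

def remote_increment(xs):
    """Runs "remotely" on a copied argument."""
    xs[0] += 1
    return xs                      # the mutated copy travels back in the reply

def call_with_copy_restore(proc, arg):
    copied = list(arg)             # copy in: marshal the argument's value
    result = proc(copied)          # remote work happens on the copy
    arg[:] = result                # restore: overwrite the caller's variable

data = [41]
call_with_copy_restore(remote_increment, data)
print(data)   # [42]: looks like call by reference, but diverges from it
              # when the same object is reachable through two pointers
```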

29 RPC Failures
C/S failure vs. communication failure: who detects it, and how? Timeouts?
Does it matter if a node (C/S) failed BEFORE or AFTER a request arrived? BEFORE or AFTER the request was processed?
Client failure: orphan requests? Add expiration counters.
Server crash? Choose semantics: at-least-once or at-most-once (sketched below).
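A sketch of at-most-once semantics under stated assumptions (unique client-chosen request IDs, server state kept in memory): the server records each request ID so that a client retry after a timeout cannot execute the operation twice.

```python
# At-most-once request filtering with a duplicate table.

class Server:
    def __init__(self):
        self.balance = 0
        self.seen = {}                      # request ID -> cached reply

    def deposit(self, request_id, amount):
        if request_id in self.seen:         # duplicate retry: replay reply
            return self.seen[request_id]
        self.balance += amount              # executed at most once
        self.seen[request_id] = self.balance
        return self.balance

s = Server()
print(s.deposit("req-1", 50))   # 50
print(s.deposit("req-1", 50))   # still 50: the retransmission was filtered
```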

30 Communication
Goal: deliver messages despite
- communication link failure(s)
- process failures
Main kinds of failures to tolerate:
- timing (link and process)
- omission (link and process)
- value

31 Communication: Reliable Delivery
Omission-failure tolerance (degree k). Design choices:
- Error masking (spatial): use several (> k) links
- Error masking (temporal): repeat the transmission k+1 times (see the sketch below)
- Error recovery: detect the error and recover
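A tiny sketch of temporal masking with an assumed independent-loss link model: sending k+1 copies tolerates up to k omission failures, with duplicates filtered at the receiver by sequence number.

```python
# Temporal error masking, degree k: repeat the send k+1 times.

import random

def send_masked(link_loss_prob, k, seq, delivered):
    for _ in range(k + 1):                     # repeat k+1 times
        if random.random() > link_loss_prob:   # this copy survives the link
            delivered.add(seq)                 # the set filters duplicates

delivered = set()
send_masked(link_loss_prob=0.3, k=3, seq=1, delivered=delivered)
print(1 in delivered)   # True unless all 4 copies were lost (p = 0.3**4)
```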

32 Reliable Delivery (cont.)
Error detection and recovery: ACK’s and timeouts Positive ACK: sent when a message is received Timeout on sender without ACK: sender retransmits Negative ACK: sent when a message loss detected Needs sequence #s or time-based reception semantics Tradeoffs Positive ACKs faster failure detection usually NACKs : fewer msgs… Q: what kind of situations are good for Spatial error masking? Temporal error masking? Error detection and recovery with positive ACKs? Error detection and recovery with NACKs?
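A stop-and-wait sketch of error detection and recovery with positive ACKs, using in-process queues as a stand-in for a lossy link (the loss probability and timeout values are arbitrary): the sender retransmits whenever no ACK arrives before the timeout.

```python
# Positive ACKs + timeouts: retransmit until the message is acknowledged.

import queue
import random
import threading

def unreliable_send(ch, msg, loss=0.5):
    """Link with omission failures: drops msg with probability `loss`."""
    if random.random() > loss:
        ch.put(msg)

def receiver(data_ch, ack_ch):
    while True:
        msg = data_ch.get()
        ack_ch.put(msg)        # positive ACK: sent when a message is received

def send_reliably(msg, data_ch, ack_ch, timeout=0.01):
    while True:
        unreliable_send(data_ch, msg)               # may be lost
        try:
            if ack_ch.get(timeout=timeout) == msg:  # ACK received
                return
        except queue.Empty:
            pass                                    # timeout: retransmit

data_ch, ack_ch = queue.Queue(), queue.Queue()
threading.Thread(target=receiver, args=(data_ch, ack_ch), daemon=True).start()
send_reliably("m1", data_ch, ack_ch)
print("m1 delivered and acknowledged")
```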

33 Resilience to Sender Failure
Multicast FT-communication is harder than point-to-point; the basic problem is failure detection: a subset of the receivers may receive the msg, and then the sender fails.
Solutions depend on the flavor of multicast reliability:
- Unreliable: no effort to overcome link failures
- Best-effort: some steps are taken to overcome link failures
- Reliable: participants coordinate to ensure that all or none of the correct recipients get the message (in the figure, the sender failed in case (b))
Best-effort broadcast (beb) properties:
- BEB1 (Validity): if pi and pj are correct, then every message broadcast by pi is eventually delivered by pj
- BEB2 (No duplication): no message is delivered more than once
- BEB3 (No creation): no message is delivered unless it was broadcast
Reliable broadcast (rb): RB1 = BEB1, RB2 = BEB2, RB3 = BEB3, plus
- RB4 (Agreement): for any message m, if a correct process delivers m, then every correct process delivers m (see the sketch below)
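A sketch of the standard eager transformation from best-effort to reliable broadcast (the process and group structures are simplified, and links here are assumed reliable): each process relays a message the first time it delivers it, so even if the original sender fails mid-broadcast, any message delivered by a correct process reaches every correct process (RB4).

```python
# Eager reliable broadcast: relay on first delivery.

class Process:
    def __init__(self, pid, group):
        self.pid, self.group = pid, group
        self.delivered = set()

    def broadcast(self, m):
        for p in self.group:            # best-effort send to everyone
            p.receive(m)

    def receive(self, m):
        if m not in self.delivered:     # BEB2/RB2: no duplicate delivery
            self.delivered.add(m)
            self.broadcast(m)           # relay, so no correct peer is left out

group = []
group.extend(Process(i, group) for i in range(3))
group[0].broadcast("m")
assert all("m" in p.delivered for p in group)   # RB4: agreement holds
```

The relay step is what buys agreement: a sender that crashes after reaching only a subset is harmless, because that subset re-broadcasts on its behalf.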

