Presentation on theme: "Amoeba – A Distributed Operating System for the 1990s"— Presentation transcript:
1 Amoeba – A Distributed Operating System for the 1990s
Authors: Sape J. Mullender, Guido van Rossum, Andrew S. Tanenbaum, Robbert van Renesse, and Hans van Staveren
Presented by: Oliver Hampton
Wednesday, October 1, 2003
2 Amoeba
Developed at: Vrije Universiteit (Free University), Amsterdam
In cooperation with: Centrum voor Wiskunde en Informatica (Center for Mathematics and Computer Science), Amsterdam
Research began in 1980.
First prototype release in 1983 (V1.0); last official release in 1996 (V5.3).
Platforms: MicroSPARC and SPARC stations; Sun 3/50 and Sun 3/60 workstations; Intel 386/486/Pentium.
Amoeba is a general-purpose distributed operating system. It is designed to take a collection of machines and make them act together as a single integrated system, so that users are not aware of the number and location of the processors that execute their commands, nor of the number and location of the file servers that store their files.
3 Components
(1) Workstations: user interface computers or terminals. They support editing and other tasks that require fast interactive response.
(2) Processor Pool: a large number of single-board computers, each with several megabytes of RAM and a network interface. The pool does most of the heavy-duty processing.
(3) Specialized Servers: machines running dedicated processes that have unusual resource demands, such as the file server and the database server.
(4) Gateway: links Amoeba domains together, allowing Amoeba to incorporate several domains; for example, Amoeba spans several European countries (Holland, England, Norway, and Germany).
Put It All Together: A powerful microkernel-based operating system that turns a collection of workstations into a transparent distributed system. The basic idea is to provide users with the illusion of a single powerful timesharing system, when in fact the system is implemented on a collection of machines.
4 Network vs. Transparent Distributed System
Network System:
Each user logs into one specific machine (the home machine).
Programs execute on the home machine unless the user gives an explicit command to run them elsewhere.
Transparent Distributed System:
The user logs into the system, not into any specific machine.
When a program is run, the system, not the user, determines the best place to run it.
Other attributes of the Transparent Distributed System:
There is a single, system-wide file system.
Files in a user directory may be stored on different machines, i.e. distributed.
There is no concept of file transfer, of uploading to or downloading from a server, or of mounting remote file systems; all of this is abstracted away from the user.
A file’s position in the directory hierarchy has no relation to its location.
5 Design Goals
Transparency: Hiding the complexities of a distributed system from the user. Amoeba users should not be concerned about the number of processors in the system, nor must they know the location of the other machines or servers.
Distribution: Several machines connected over a network operate as a single system. The user has the illusion of interacting with a single, powerful system.
Parallelism: A single program or command may use multiple processors. The user simply requests an operation, and the Amoeba OS decides the best way to execute the request.
High Performance: The Bullet File Server, and FLIP (Fast Local Internet Protocol), introduced after V4.0. FLIP provides clean, simple, and efficient communication between distributed nodes.
6 Objects and Capabilities
Amoeba is an object-based system.
Object: an abstract data type on which well-defined operations may be performed. Examples: a file, a process, or a directory.
Capability: a handle on an object. It identifies and protects the object and allows the holder to perform operations on it.
The Amoeba system may be thought of as a collection of objects, each of which has a set of operations that can be performed on it. For example, a file object typically allows the operations read, write, append, and delete.
Each object has a capability associated with it. A capability may be thought of as a key that allows its holder to perform some (not necessarily all) operations on that object. For example, a user process might have a capability for a file that permits it to read but not modify the file.
Capabilities are protected cryptographically to prevent users from tampering with them.
Capability Structure (128 bits):
Service Port: identifies the service that manages the object.
Object Number: identifies which object is being referred to, since a service may manage more than one object.
Rights Field: specifies which operations are allowed (e.g. a capability for a file may be read-only).
Check Field: needed for protection.
Objects are identified at the system level by their binary capability structure, yet at the programming/user level objects are given string path names. The Directory Service maps an object's string name to its corresponding binary capability.
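To make the four-field capability layout concrete, here is a minimal Python sketch. The slide states only the 128-bit total; the 48/24/8/48-bit split used below is the one usually given in the Amoeba literature and should be treated as an assumption here, as should the rights-bit names, which are purely illustrative. The check field is carried as opaque bytes, and no cryptographic protection is modelled.

```python
from dataclasses import dataclass

# Assumed field widths in bytes (6 + 3 + 1 + 6 = 16 bytes = 128 bits).
PORT_LEN, OBJECT_LEN, RIGHTS_LEN, CHECK_LEN = 6, 3, 1, 6

# Hypothetical rights bits, for illustration only.
RIGHT_READ = 0x01
RIGHT_WRITE = 0x02
RIGHT_DELETE = 0x04


@dataclass(frozen=True)
class Capability:
    port: bytes       # identifies the service that manages the object
    object_num: int   # which of the service's objects is meant
    rights: int       # bitmap of permitted operations
    check: bytes      # protection field, treated as opaque here

    def pack(self) -> bytes:
        """Serialize to the 16-byte binary form."""
        if len(self.port) != PORT_LEN or len(self.check) != CHECK_LEN:
            raise ValueError("port and check fields must be 6 bytes each")
        return (
            self.port
            + self.object_num.to_bytes(OBJECT_LEN, "big")
            + self.rights.to_bytes(RIGHTS_LEN, "big")
            + self.check
        )

    @classmethod
    def unpack(cls, raw: bytes) -> "Capability":
        """Parse the 16-byte binary form back into its four fields."""
        if len(raw) != 16:
            raise ValueError("a capability is exactly 128 bits")
        return cls(
            port=raw[:6],
            object_num=int.from_bytes(raw[6:9], "big"),
            rights=raw[9],
            check=raw[10:],
        )

    def allows(self, right: int) -> bool:
        """True if every bit in `right` is set in the rights field."""
        return (self.rights & right) == right


read_only = Capability(b"\x01" * 6, 42, RIGHT_READ, b"\x02" * 6)
print(len(read_only.pack()), read_only.allows(RIGHT_WRITE))
```

A holder of `read_only` can ask the service to read object 42 but not to modify it, which mirrors the read-but-not-modify example on the slide.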
7 Communication Primitives
The Remote Procedure Call (RPC) model is used between client and server. Amoeba uses a request/response communication model that requires only three communication primitives:
do_operation: used by the client to send a request to a server and to wait for the reply from the server.
get_request: used by the server to receive the requests that are sent by clients.
send_reply: used by the server to send a reply message to the client after the request has been processed.
Communication in Amoeba proceeds as follows: first the client sends a request to the server and blocks; then the server accepts the request, does the work, and sends back the reply, allowing the client to unblock.
Multiple Inheritance via Classes: objects may be organized hierarchically, meaning that an object may inherit interfaces from one or more underlying classes.
Amoeba Interface Language (AIL): a user-oriented interface has been built on top of the simple communications model to provide interfaces for object manipulation. AIL marshals and unmarshals RPC parameters and uses the request/reply transport mechanism. It handles only simple data types (Boolean, Integer, Floating Point, Character, String).
AIL interfaces offer object manipulations such as read and write, as seen in the paper's example (bio_read and bio_write).
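The blocking request/reply exchange built from the three primitives can be imitated inside a single Python process. This is only a sketch of the control flow: the `Port` class, the use of in-memory queues in place of the network, and the toy `echo_server` are all assumptions made for illustration and are not Amoeba's actual interface.

```python
import queue
import threading


class Port:
    """In-process stand-in for a service port (hypothetical helper)."""

    def __init__(self):
        self._requests = queue.Queue()

    def do_operation(self, request):
        """Client side: send a request, then block until the reply arrives."""
        reply_box = queue.Queue(maxsize=1)
        self._requests.put((request, reply_box))
        return reply_box.get()

    def get_request(self):
        """Server side: block until some client has sent a request."""
        return self._requests.get()

    @staticmethod
    def send_reply(reply_box, reply):
        """Server side: deliver the reply, which unblocks the client."""
        reply_box.put(reply)


def echo_server(port, n_requests):
    # Accept a request, do the work, send the reply; repeat.
    for _ in range(n_requests):
        request, reply_box = port.get_request()
        Port.send_reply(reply_box, request.upper())


port = Port()
server = threading.Thread(target=echo_server, args=(port, 2))
server.start()
replies = [port.do_operation("read"), port.do_operation("write")]
server.join()
print(replies)  # ['READ', 'WRITE']
```

Each `do_operation` call returns only after the server thread has called `send_reply`, so the client is blocked for the whole round trip, as described on the slide.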
8 Now that we know what objects are, how do we find them?
Objects in Amoeba may be physically located anywhere there is disk space.
Locating Procedure:
(1) The client issues a do_operation, which arrives at the kernel.
(2) The kernel checks whether the Service Port from the object's capability is already known.
(3) If the Service Port is not known, the kernel broadcasts a locate packet onto the network, asking whether any server has an outstanding get_request for the Service Port in question.
(4) Once it has determined the network address, the kernel stores the (Service Port, Network Address) pair for future reference.
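The locate-then-cache behaviour can be sketched as follows. The `Kernel` class, the dictionary that plays the role of the network, and the host names are all hypothetical; a real broadcast is replaced by a loop over the known hosts.

```python
class Kernel:
    """Toy model of the kernel's port lookup (illustrative names only)."""

    def __init__(self, network):
        # Hypothetical network model: address -> set of ports on which a
        # server at that address has an outstanding get_request.
        self.network = network
        self.port_cache = {}   # Service Port -> Network Address
        self.broadcasts = 0    # number of locate packets sent

    def locate(self, port):
        # Step 2: is the Service Port already known?
        if port in self.port_cache:
            return self.port_cache[port]
        # Step 3: "broadcast" a locate packet to every host.
        self.broadcasts += 1
        for address, listening in self.network.items():
            if port in listening:
                # Step 4: remember the pair for future reference.
                self.port_cache[port] = address
                return address
        raise LookupError(f"no server is listening on port {port!r}")


kernel = Kernel({"host-a": {"file-port"}, "host-b": {"dir-port"}})
first = kernel.locate("dir-port")    # triggers a broadcast
second = kernel.locate("dir-port")   # served from the cache
print(first, second, kernel.broadcasts)  # host-b host-b 1
```

Only the first request for a port pays for the broadcast; later requests are answered from the cached pair.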
9 Amoeba File System
Public capabilities are accessible by all users, for example command executables, public files, and databases.
Hierarchical Directory Structure:
Directories are objects.
A directory is a set of (name, capability) pairs.
Basic operations: Lookup, Enter, Delete.
Directories may be shared between users. In the paper's example, two research groups want to share data in a directory but keep other users out. Members of the research groups are granted access by giving them the capability for the directory.
Directory Server:
Crucial: all applications depend on it to find their capabilities, so it must never stop or crash.
Replicated: it saves its internal tables on multiple disks so that a single-site failure does not crash the entire system.
Security: it is trusted not to divulge capabilities to those not intended to see them; directories may be encrypted.
Atomic Transactions: also provided by the directory server. They prevent the lost update that may result when two users request the same object, each modify a local copy, and both try to copy their version back to the file system.
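A directory as a set of (name, capability) pairs, with the three basic operations, might look like the sketch below. Capabilities are shown as opaque placeholder byte strings, and the `resolve` helper with its `directories` table is a hypothetical stand-in for asking the directory server to open a subdirectory; none of these names come from Amoeba itself.

```python
class Directory:
    """A directory object: a set of (name, capability) pairs."""

    def __init__(self):
        self._entries = {}

    def enter(self, name, capability):
        """Add a new (name, capability) pair."""
        if name in self._entries:
            raise FileExistsError(name)
        self._entries[name] = capability

    def lookup(self, name):
        """Return the capability stored under `name` (KeyError if absent)."""
        return self._entries[name]

    def delete(self, name):
        """Remove the pair stored under `name`."""
        del self._entries[name]


def resolve(root, path, directories):
    """Map a string path to a capability, one lookup per path component.

    `directories` maps a directory's capability to its Directory object; it
    stands in for presenting that capability to the directory server.
    """
    parts = path.strip("/").split("/")
    current = root
    for part in parts[:-1]:
        current = directories[current.lookup(part)]
    return current.lookup(parts[-1])


root, home = Directory(), Directory()
HOME_CAP, NOTES_CAP = b"cap-home", b"cap-notes"  # opaque placeholders
root.enter("home", HOME_CAP)
home.enter("notes.txt", NOTES_CAP)
found = resolve(root, "/home/notes.txt", {HOME_CAP: home})
print(found)  # b'cap-notes'
```

Because a subdirectory is reached only through its capability, handing someone `HOME_CAP` is what grants them access to everything entered in `home`, which is the sharing mechanism described on the slide.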
10 Bullet File Server
Increased performance (FAST): a 3-fold increase over the Sun Network File System (NFS).
Files are stored contiguously (see next slide).
The cost is increased fragmentation. Example: an 800 MB disk may be required to store 500 MB worth of files.
Files are immutable. The Bullet Server supports only 3 principal operations:
read_file
create_file
delete_file
For most applications this system works well. However, consider a log file: each append to the log would require the whole file to be copied, which is inefficient. For this reason, Amoeba log files are implemented on a different server.
Databases may also cause problems for the Bullet File Server. A small update to a large database might incur large overhead and Ethernet traffic. In this case, the database can be subdivided over many smaller Bullet files, perhaps based on the identifying keys.
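The cost of appending to an immutable file can be shown with a small in-memory model. The `BulletStore` class and the `append` helper are illustrative assumptions, not the Bullet Server's real interface; only the three operation names are taken from the slide.

```python
import itertools


class BulletStore:
    """Toy immutable file store offering only the three principal operations."""

    def __init__(self):
        self._files = {}
        self._ids = itertools.count(1)
        self.bytes_written = 0  # total bytes stored by create_file calls

    def create_file(self, data: bytes) -> int:
        file_id = next(self._ids)
        self._files[file_id] = bytes(data)  # stored whole, never modified
        self.bytes_written += len(data)
        return file_id

    def read_file(self, file_id: int) -> bytes:
        return self._files[file_id]

    def delete_file(self, file_id: int) -> None:
        del self._files[file_id]


def append(store: BulletStore, file_id: int, extra: bytes) -> int:
    """Emulate an append: read the old file, create a new one, delete the old."""
    new_id = store.create_file(store.read_file(file_id) + extra)
    store.delete_file(file_id)
    return new_id


store = BulletStore()
log = store.create_file(b"x" * 1000)   # 1000-byte log file
log = append(store, log, b"y" * 10)    # add a 10-byte record
print(store.bytes_written)             # 2010
```

Adding a 10-byte record rewrites all 1010 bytes, so the write cost of each append grows with the size of the log. This is the behaviour that motivates keeping log files on a different server.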
11 Bullet File Server Continued
Files are stored contiguously on disk and in the server’s RAM cache.
Processors may only operate on files that fit in their physical memory.
Question: Suppose we have an Amoeba system composed of Sun 3/60 machines with the following configuration (note: these are the minimum requirements for running Amoeba on a Sun 3/60):
(a) file server = 12 MB RAM and 300 MB disk
(b) workstation = 4 MB RAM
(c) processor pool = 4 MB RAM each
Given the above constraints, assume that after installing Amoeba, the system administrator installs a flat-file database object of 10 MB directly into the file server. Are users able to request this database object over the LAN from a workstation (of course, this would have to happen in multiple RPCs of 30 KB each)?
12 Process Management
A process is an object.
Process information is contained in capabilities and in a data structure called the Process Descriptor.
A process is created by sending a process descriptor to the kernel in an execute-process request.
The Process Descriptor consists of 4 components:
Host Descriptor: states the requirements for the system on which the process must run, by describing the machine class/type, kernel type, and instruction set.
Capabilities: includes the capability for the process, which every client needs, and the capability of a handler (which deals with process exit, signals, and exceptions).
Segment Component: describes the layout of the address space (virtual address, segment length, and the capability of a file or segment from which the new segment should be initialized, for mapped-file I/O).
Thread Component: describes the state of each of the threads in the process (processor status word, program counter, stack pointer, etc.).
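The four components of the Process Descriptor map naturally onto a small record type. The sketch below uses Python dataclasses; every field name is an illustrative assumption, and the actual binary layout used by Amoeba is not modelled. The one check in `validate` reflects the rule, stated later in the talk, that a process starts with at least one thread.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HostDescriptor:
    machine_class: str    # e.g. workstation or pool processor
    kernel_type: str
    instruction_set: str


@dataclass
class Segment:
    virtual_address: int
    length: int
    # Capability of the file or segment used to initialize this segment
    # (mapped-file I/O); None means the segment starts uninitialized.
    init_capability: Optional[bytes] = None


@dataclass
class ThreadState:
    program_counter: int
    stack_pointer: int
    status_word: int = 0


@dataclass
class ProcessDescriptor:
    host: HostDescriptor
    process_capability: bytes   # the capability clients use for the process
    handler_capability: bytes   # handler for exit, signals, and exceptions
    segments: List[Segment]
    threads: List[ThreadState]

    def validate(self) -> None:
        """Reject descriptors that could not describe a startable process."""
        if not self.threads:
            raise ValueError("a process needs at least one thread")


descriptor = ProcessDescriptor(
    host=HostDescriptor("pool", "amoeba", "mc68020"),
    process_capability=b"cap-process",
    handler_capability=b"cap-handler",
    segments=[Segment(virtual_address=0x1000, length=4096,
                      init_capability=b"cap-text-file")],
    threads=[ThreadState(program_counter=0x1000, stack_pointer=0x8000)],
)
descriptor.validate()
print(len(descriptor.segments), len(descriptor.threads))  # 1 1
```

In this model, creating a process would amount to handing a validated `descriptor` to the kernel in an execute-process request.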
13 Process Management Continued
Processes have explicit control over their address space. They can add new segments to their address space by mapping them in and remove segments by mapping them out. This allows mapped-file I/O.
Process States:
Running
Stunned: the process exists but does not execute instructions, for example while it is being debugged.
14 Process Management Continued
Threads:
At start-up, a process has at least one thread.
The number of threads is dynamic. During execution, a process may create additional threads and may terminate existing threads.
Threads are managed by the kernel. When a thread does an RPC, the kernel can block it and schedule another thread in its place if one is available.
Having multiple threads in a process increases performance through parallelism.
Example: a file server could be programmed as a process with multiple threads. When a request comes in, it is given to a thread to handle. The thread first checks the internal RAM cache to see whether the needed data is present. If it is not, the thread performs an RPC with a remote disk server to acquire the data. While it waits for the reply from the disk server, the thread is blocked, and new requests can be given to other threads.
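The multithreaded file server example can be approximated in Python with a thread pool. This is a sketch under several assumptions: `disk_rpc` simulates the remote disk server with a short sleep, `DISK` is made-up data, and Python's standard thread pool stands in for kernel-managed Amoeba threads.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

DISK = {"a": b"alpha", "b": b"beta", "c": b"gamma"}  # made-up disk contents
cache = {}                      # the server's internal RAM cache
cache_lock = threading.Lock()   # protects `cache` across threads


def disk_rpc(name):
    """Simulated RPC to a remote disk server; the calling thread blocks."""
    time.sleep(0.05)
    return DISK[name]


def handle_request(name):
    # First check the RAM cache.
    with cache_lock:
        if name in cache:
            return cache[name]
    # Cache miss: this thread blocks in the RPC while other threads
    # continue to pick up and serve new requests.
    data = disk_rpc(name)
    with cache_lock:
        cache[name] = data
    return data


with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(handle_request, ["a", "b", "c"]))

print(results)               # [b'alpha', b'beta', b'gamma']
print(handle_request("a"))   # b'alpha', now served from the cache
```

With three worker threads the three simulated disk waits overlap rather than run back to back, which is the benefit the slide attributes to blocking one thread while others proceed.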
15 Performance
Amoeba performance data was measured on two 16.7-megahertz Motorola MC68020s, with a user process on each, communicating over a 10-megabit-per-second Ethernet.
Native Amoeba remote communication at 8 KB (Case 2) is 3.05 times faster than Sun RPC remote (NFS) and 2.78 times faster than the Unix driver remote.
16 Conclusions
Review of Design Goals:
Transparency: The user is unaware of the location and number of processors available in the Amoeba system.
Distribution: Objects are distributed, as is computational power.
Parallelism: Multi-threaded processes.
High Performance: Bullet File Server, performance data.