Presented by: Sagnik Bhattacharya. Paper by: Kinshuk Govil, Dan Teodosiu, Yongqiang Huang, Mendel Rosenblum.

Overview
Problems of current shared-memory multiprocessors and our requirements
Cellular Disco as a solution
–architecture
–prototype
–hardware-fault containment
–CPU management
–memory management
–statistics
Cellular Disco and ubiquitous environments
Conclusion

Problem
Extending modern operating systems to run efficiently on shared-memory multiprocessors.
Software development has not kept pace with hardware development.
Common operating systems fail beyond 12 processors.

What we need…
The system should be reliable.
It should be scalable.
It should be fault-tolerant.
It should not take too much development time or effort.

Traditional approaches
Hardware partitioning: lacks resource sharing, makes physical clusters.
Software-centric approaches (significant development time and cost):
–modify existing OS
–develop new OS

A scenario… (diagram)
Labels: control unit, smart space, processors. No rebooting necessary.

Solution: Cellular Disco
Extension of previous work (Disco).
Uses the concept of virtual machine monitors.
Partitions the multiprocessor system into virtual clusters.

Virtual Machine Monitor (diagram)
Labels: virtual machine monitor, virtual machine, hardware.
VM1 uses µPs 1, 2, 3; VM2 uses µPs 1, 3, 5, 8.
Guest operating systems: OS (Win NT) and OS (IRIX 6.2).
Frames: I/O request; trap I/O request and perform I/O; perform I/O and send interrupt.

Issues it addresses
Scalability
NUMA awareness
Hardware fault containment
Resource management

Basic Cellular Disco Architecture

Prototype
Runs on a 32-processor SGI Origin 2000.
Supports shared-memory systems based on the MIPS R10000 architecture.
The prototype runs piggybacked on IRIX 6.4.
The host OS is made dormant and is only used to invoke some device drivers.

Hardware Virtualization
Physical resources: the resources visible to a virtual machine.
Machine resources: the actual resources, allocated by Cellular Disco.
CD operates in the kernel mode of the MIPS processor.
CD intercepts all system calls.
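The split between physical and machine resources can be pictured as a per-VM translation table. The sketch below is only an illustration: the class name, the `pmap` dictionary, and the allocate-on-first-touch policy are assumptions made for the example, not details taken from the slides.

```python
class VirtualMachineMemory:
    """Maps the physical pages a VM sees to the machine pages that back them."""

    def __init__(self, free_machine_pages):
        # Machine pages the monitor may hand out (hypothetical pool).
        self.free = list(free_machine_pages)
        # physical page number -> machine page number
        self.pmap = {}

    def translate(self, physical_page):
        # Assumption: a machine page is allocated the first time a
        # physical page is touched, and the mapping is then reused.
        if physical_page not in self.pmap:
            if not self.free:
                raise MemoryError("no machine pages left")
            self.pmap[physical_page] = self.free.pop(0)
        return self.pmap[physical_page]
```

The guest only ever names physical pages; which machine page backs each one stays hidden behind `translate`.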

Resource Management
CPU management: each processor maintains its own run queue.
Memory management: memory-borrowing mechanism.
Each OS instance is given only as many resources as it can handle.
Large applications are split, and communication between the parts is established through shared-memory regions.

CPU Management
VCPU migration:
- Intra node (37 µsec)
- Inter node (520 µsec)
- Inter cell (1520 µsec)
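The three migration levels can be told apart by comparing where the source and destination CPUs sit. The following is a minimal sketch: the costs are the ones quoted on the slide, while `CpuLocation` and the function names are hypothetical and exist only for the example.

```python
from dataclasses import dataclass

# Migration costs in microseconds, as quoted on the slide.
MIGRATION_COST_US = {"intra_node": 37, "inter_node": 520, "inter_cell": 1520}


@dataclass(frozen=True)
class CpuLocation:
    """Hypothetical address of a CPU: which cell, node, and CPU slot."""
    cell: int
    node: int
    cpu: int


def migration_kind(src, dst):
    # Crossing a cell boundary is the most expensive case, so test it first.
    if src.cell != dst.cell:
        return "inter_cell"
    if src.node != dst.node:
        return "inter_node"
    return "intra_node"


def migration_cost_us(src, dst):
    # Staying on the same CPU is not a migration at all.
    if src == dst:
        return 0
    return MIGRATION_COST_US[migration_kind(src, dst)]
```

A balancer could use `migration_cost_us` as the cost side of a benefit-versus-cost check before moving a VCPU.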

VCPU migration (diagram)
Cellular Disco runs above nodes joined by an interconnect; each node holds CPUs, and the nodes are grouped into cells.
Frames: a VCPU migrates intra node, inter node, and inter cell.

CPU Management (contd.)
CPU balancing:
- Idle balancer
- Periodic balancer
Load-balancing scenario

Idle balancer (diagram)
CPU0, CPU1, CPU2, CPU3 with VCPUs VC B0, VC A1, VC B1, VC A0.
Frames: the idle CPU asks, "Does this have enough cache affinity to CPU2?"; the answer is "NO!!"; the last frame shows VC B1 once more.

Periodic Balancer
Does a depth-first traversal of the load tree.
Checks the difference between two siblings and ignores it if diff < 2 (diagram: Diff=1, Diff=1).
If diff >= 2, does load balancing if benefit > cost (diagram: Diff=2, Diff=2).
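The periodic balancer's rule (traverse the load tree depth-first, compare two siblings, act only when the difference is at least 2 and the benefit outweighs the cost) can be sketched as below. This is an illustration under assumptions: the load tree is modelled as a binary tree over a list of per-CPU run-queue lengths, the check happens before descending, and `benefit_exceeds_cost` is a hypothetical callback standing in for the real cost model.

```python
IMBALANCE_THRESHOLD = 2  # from the slides: act only if diff >= 2


def periodic_balance(loads, benefit_exceeds_cost=lambda src, dst: True):
    """Return (new_loads, moves) after one depth-first balancing pass.

    loads[i] is the number of runnable VCPUs on CPU i.
    """
    loads = list(loads)
    moves = []

    def visit(lo, hi):
        if hi - lo < 2:
            return  # a single CPU has no sibling to compare with
        mid = (lo + hi) // 2
        left, right = sum(loads[lo:mid]), sum(loads[mid:hi])
        if abs(left - right) >= IMBALANCE_THRESHOLD:
            if left > right:
                heavy, light = range(lo, mid), range(mid, hi)
            else:
                heavy, light = range(mid, hi), range(lo, mid)
            src = max(heavy, key=lambda c: loads[c])  # busiest CPU
            dst = min(light, key=lambda c: loads[c])  # least loaded CPU
            if benefit_exceeds_cost(src, dst):
                loads[src] -= 1
                loads[dst] += 1
                moves.append((src, dst))
        # Depth-first: finish the left subtree before the right one.
        visit(lo, mid)
        visit(mid, hi)

    visit(0, len(loads))
    return loads, moves
```

A sibling difference of 1 is left alone, which keeps a lone extra VCPU from bouncing between two otherwise balanced halves.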

Gang Scheduling
For each physical CPU, we select the VCPU that is to run on it.
The VCPU selected is the highest-priority gang-runnable VCPU.
–A VCPU is gang-runnable if all non-idle VCPUs of its VM are either running or waiting on run queues of processors running lower-priority VMs.
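The gang-runnable test can be written out directly from that definition. The sketch below is a simplified model: the dictionary layout (`cpus`, `vm_of`, `priority`, `idle`) is hypothetical, larger numbers mean higher priority, and treating a CPU with nothing running as acceptable is an assumption of the example.

```python
def gang_runnable(vm, cpus, vm_of, priority, idle):
    """True if every non-idle VCPU of `vm` is running, or waits on a CPU
    that is currently running a lower-priority VM."""

    def placed_ok(vcpu):
        for cpu in cpus.values():
            if cpu["running"] == vcpu:
                return True
            if vcpu in cpu["queue"]:
                running = cpu["running"]
                # Assumption: an empty CPU never blocks a waiting VCPU.
                return running is None or priority[vm_of[running]] < priority[vm]
        return False  # a non-idle VCPU that is not placed anywhere

    return all(placed_ok(v) for v, owner in vm_of.items()
               if owner == vm and v not in idle)


def pick_next(cpu_name, cpus, vm_of, priority, idle):
    """Pick the highest-priority gang-runnable VCPU for one physical CPU."""
    cpu = cpus[cpu_name]
    candidates = [v for v in [cpu["running"], *cpu["queue"]] if v is not None]
    runnable = [v for v in candidates
                if gang_runnable(vm_of[v], cpus, vm_of, priority, idle)]
    return max(runnable, key=lambda v: priority[vm_of[v]], default=None)
```

If no candidate on a CPU belongs to a gang-runnable VM, `pick_next` returns `None`, which in this model leaves that CPU idle for the time slice.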

Example (diagram)
µP1, µP2, µP3: each with its currently executing VCPU and its wait queue, ordered by priority.
VM1 VCs: 1, 3, 8 (idle)
VM2 VCs: 2, 4, 6 (idle), 7
VM3 VCs: 5, 9
Before (executing VCPUs, then wait queues): VC1 VC2 VC5 VC7 VC5 VC1 VC9 VC3 VC4
Middle frame: the gang-runnable VCPUs are marked.
After (new executing VCPUs, then new wait queues): VC5 VC9 VC5 VC7 VC1 VC1 VC2 VC3 VC4

Memory Management
Each cell maintains its own freelist and lends memory on request (RPC); a cell borrows from the cells in its allocation preference list.
Speed µsec for 4 MB.
A threshold is set for the minimum amount of local free memory.
Paging is avoided as far as possible.

Memory Borrowing
freelist: list of free pages in the cell.
allocation preference list: list of cells from which borrowing memory is more beneficial than paging.

Memory Borrowing (diagram)
Cell 1 to Cell 5 with their freelist sizes, a borrowing threshold, and a lending threshold (16 MB and 32 MB marked).
Frames: a cell asks and is refused; it cannot ask the next cell; it asks another cell, which gives 4 MB.
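The borrowing sequence in the diagram (ask, refused, ask again, receive 4 MB) can be sketched as a loop over the allocation preference list. Several things here are assumptions: that 16 MB is the borrowing threshold and 32 MB the lending threshold, that a lender must stay at or above its lending threshold after giving, and the function and variable names themselves.

```python
BORROW_THRESHOLD_MB = 16  # assumed: borrow when local free memory drops below this
LEND_THRESHOLD_MB = 32    # assumed: lend only while staying at or above this
CHUNK_MB = 4              # the diagram shows 4 MB being given


def try_borrow(borrower, free_mb, preference):
    """Try to borrow one chunk for `borrower`; return the lender or None.

    free_mb maps cell id -> free memory in MB and is updated in place.
    preference maps cell id -> ordered list of cells it may ask.
    """
    if free_mb[borrower] >= BORROW_THRESHOLD_MB:
        return None  # enough local memory, no need to ask anyone
    for lender in preference[borrower]:
        if free_mb[lender] - CHUNK_MB >= LEND_THRESHOLD_MB:
            free_mb[lender] -= CHUNK_MB
            free_mb[borrower] += CHUNK_MB
            return lender
        # Otherwise the request is refused and the next cell is asked.
    return None  # every cell refused; paging would be the fallback
```

Cells that are absent from a borrower's preference list are never asked, which corresponds to the "cannot ask" frame.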

Memory Management (contd.)
Paging algorithm: second-chance FIFO.
Page-sharing information is kept in a control data structure.
Cellular Disco traps all read and write requests made by the operating systems.

Second-chance FIFO
A reference bit is added to each page in the FIFO scheme.
Every time the page is accessed, the bit is set to 1.
If the page is selected by FIFO and the reference bit is 1, the bit is set to 0 and another page is looked for.
A page is the target page if it is selected by FIFO and its reference bit is 0.
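The rule above is the textbook second-chance algorithm, and a compact version looks like the sketch below. It is a generic illustration rather than Cellular Disco's code; the class name is hypothetical, and two simplifications are assumed: a newly loaded page starts with its reference bit at 0, and a page that gets a second chance moves to the back of the queue.

```python
from collections import deque


class SecondChanceFIFO:
    """Second-chance FIFO page replacement over a fixed number of frames."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()  # oldest page on the left
        self.ref = {}         # page -> reference bit (0 or 1)

    def access(self, page):
        """Touch `page`; return the evicted page, or None if none was evicted."""
        if page in self.ref:
            self.ref[page] = 1  # every access sets the reference bit
            return None
        victim = self._evict() if len(self.queue) >= self.capacity else None
        self.queue.append(page)
        self.ref[page] = 0  # assumption: loading alone does not set the bit
        return victim

    def _evict(self):
        while True:
            page = self.queue.popleft()  # FIFO selects the oldest page
            if self.ref[page]:
                # Referenced since last check: clear the bit and look further.
                self.ref[page] = 0
                self.queue.append(page)
            else:
                del self.ref[page]
                return page  # selected by FIFO with reference bit 0
```

With three frames, loading pages 1, 2, 3, touching page 1 again, and then faulting on page 4 evicts page 2: page 1 is the oldest but its set reference bit earns it a second chance.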

Example (diagram)
Page table with a reference bit (RB) per page; the oldest and second-oldest pages are marked.
Frames: a page fault under FIFO with the oldest page's RB at 1; the same fault under second-chance FIFO, where that RB becomes 0; the resulting page table.

Hardware fault-containment
Failure rate increases with the number of processors.
Cellular Disco is internally structured as a set of semi-independent cells.
A failure in one cell does not impact VMs running in other cells (localization of faults).
Assumption: CD is a trusted software layer.

Cellular Structure
A fault in one cell does not affect the others.

Hardware fault-containment (contd.)
Communication modes:
- Fast inter-processor RPC
- Message
Side benefit: software fault containment, i.e., individual OS crashes do not impact the system.

Hardware-Fault recovery
liveset: the set of still-functioning nodes.
Failure: removal from the liveset.
Recovery: insertion back into the liveset.
Virtual machines dependent on the failed cell are terminated.
Memory dependencies are updated when a cell fails.
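The liveset bookkeeping can be sketched with plain sets. This is only an illustration: the function names and the `vm_nodes` map (which nodes each VM depends on) are hypothetical, and updating memory dependencies is left out.

```python
def handle_failure(failed_nodes, liveset, vm_nodes):
    """Remove failed nodes from the liveset and terminate dependent VMs.

    Returns (new_liveset, terminated_vms, surviving_vm_nodes).
    """
    failed = set(failed_nodes)
    new_liveset = set(liveset) - failed
    # A VM is terminated if any node it depends on has failed.
    terminated = sorted(vm for vm, nodes in vm_nodes.items() if nodes & failed)
    survivors = {vm: nodes for vm, nodes in vm_nodes.items() if not nodes & failed}
    return new_liveset, terminated, survivors


def handle_recovery(recovered_nodes, liveset):
    """Insert recovered nodes back into the liveset."""
    return set(liveset) | set(recovered_nodes)
```

Starting from a liveset of six nodes, losing nodes 1 to 4 leaves a liveset of {5, 6}, and only VMs confined to those two nodes keep running until the failed nodes are reinserted.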

Example (diagram)
Cellular Disco over six nodes (Node1 to Node6) joined by an interconnect and grouped into cells; VM 1, VM 2, and VM 3 run on top.
Frames: liveset 1,2,3,4,5,6; a fault occurs (BOOM); liveset 5,6 with only VM 2 remaining; an interrupt; liveset 1,2,3,4,5,6 again with VM 2 running.

Fault-Recovery overhead

Virtualization Overheads (the first column shows the execution time on IRIX 6.4 and the second shows the execution time on Cellular Disco).

Cellular Disco and Ubiquitous environments
Provides raw computational power for our smart spaces.
More importantly, it does not fail. Fault recovery is present.
Adaptable to new operating systems.

Grey Areas
Will the source simplicity remain if it is not piggybacked on IRIX 6.4?
Will it work on non-uniform multiprocessor systems?
–Probable solution: development of a hardware virtualization standard

In conclusion…
Cellular Disco presents a midway path between hardware- and software-directed techniques.
It can be used on the central control unit for our smart spaces because it is scalable and fault-tolerant.