The Multikernel: A New OS Architecture for Scalable Multicore Systems By (last names): Baumann, Barham, Dagand, Harris, Isaacs, Peter, Roscoe, Schupbach,

Slides:

Advertisements

Similar presentations

Operating Systems Components of OS

Advertisements

CT213 – Computing system Organization

More on Processes Chapter 3. Process image _the physical representation of a process in the OS _an address space consisting of code, data and stack segments.

Chap 2 System Structures.

Introduction CSCI 444/544 Operating Systems Fall 2008.

Multiple Processor Systems

Using DSVM to Implement a Distributed File System Ramon Lawrence Dept. of Computer Science

The Multikernel: A new OS architecture for scalable multicore systems Andrew Baumann et al CS530 Graduate Operating System Presented by.

Distributed Processing, Client/Server, and Clusters

1. Overview  Introduction  Motivations  Multikernel Model  Implementation – The Barrelfish  Performance Testing  Conclusion 2.

CS533 Concepts of Operating Systems Class 20 Summary.

Contiki A Lightweight and Flexible Operating System for Tiny Networked Sensors Presented by: Jeremy Schiff.

G Robert Grimm New York University Disco.

Active Messages: a Mechanism for Integrated Communication and Computation von Eicken et. al. Brian Kazian CS258 Spring 2008.

3.5 Interprocess Communication

User-Level Interprocess Communication for Shared Memory Multiprocessors Brian N. Bershad, Thomas E. Anderson, Edward D. Lazowska, and Henry M. Levy Presented.

1 I/O Management in Representative Operating Systems.

Xen and the Art of Virtualization Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, Andrew Warfield.

Chapter 51 Threads Chapter 5. 2 Process Characteristics  Concept of Process has two facets.  A Process is: A Unit of resource ownership:  a virtual.

Stack Management Each process/thread has two stacks  Kernel stack  User stack Stack pointer changes when exiting/entering the kernel Q: Why is this necessary?

Ch4: Distributed Systems Architectures. Typically, system with several interconnected computers that do not share clock or memory. Motivation: tie together.

Chapter 2 Operating System Overview Patricia Roy Manatee Community College, Venice, FL ©2008, Prentice Hall Operating Systems: Internals and Design Principles,

Operating System A program that controls the execution of application programs An interface between applications and hardware 1.

Lecture 4: Parallel Programming Models. Parallel Programming Models Parallel Programming Models: Data parallelism / Task parallelism Explicit parallelism.

Microkernels, virtualization, exokernels Tutorial 1 – CSC469.

The Multikernel: A new OS architecture for scalable multicore systems

Silberschatz, Galvin and Gagne ©2009 Operating System Concepts – 8 th Edition, Chapter 2: System Structures.

1 Operating System Overview Chapter 2 Advanced Operating System.

Introduction and Overview Questions answered in this lecture: What is an operating system? How have operating systems evolved? Why study operating systems?

Multiple Processor Systems. Multiprocessor Systems Continuous need for faster and powerful computers –shared memory model ( access nsec) –message passing.

OS provide a user-friendly environment and manage resources of the computer system. Operating systems manage: –Processes –Memory –Storage –I/O subsystem.

MIDeA :A Multi-Parallel Instrusion Detection Architecture Author: Giorgos Vasiliadis, Michalis Polychronakis,Sotiris Ioannidis Publisher: CCS’11, October.

SplitX: Split Guest/Hypervisor Execution on Multi-Core 林孟諭 Dept. of Electrical Engineering National Cheng Kung University Tainan, Taiwan, R.O.C.

Computers Operating System Essentials. Operating Systems PROGRAM HARDWARE OPERATING SYSTEM.

Ihr Logo Operating Systems Internals & Design Principles Fifth Edition William Stallings Chapter 2 (Part II) Operating System Overview.

Multiple Processor Systems. Multiprocessor Systems Continuous need for faster computers –shared memory model ( access nsec) –message passing multiprocessor.

Supporting Multi-Processors Bernard Wong February 17, 2003.

Disco : Running commodity operating system on scalable multiprocessor Edouard et al. Presented by Vidhya Sivasankaran.

The Mach System Abraham Silberschatz, Peter Baer Galvin, Greg Gagne Presentation By: Agnimitra Roy.

CS533 - Concepts of Operating Systems 1 The Mach System Presented by Catherine Vilhauer.

Operating System Organization Chapter 3 Michelle Grieco.

The Performance of μ-Kernel-Based Systems H. Haertig, M. Hohmuth, J. Liedtke, S. Schoenberg, J. Wolter Presenter: Sunita Marathe.

The Mach System Abraham Silberschatz, Peter Baer Galvin, and Greg Gagne Presented by: Jee Vang.

System Components ● There are three main protected modules of the System  The Hardware Abstraction Layer ● A virtual machine to configure all devices.

The Mach System Silberschatz et al Presented By Anjana Venkat.

Operating Systems: Internals and Design Principles

Disco: Running Commodity Operating Systems on Scalable Multiprocessors Presented by: Pierre LaBorde, Jordan Deveroux, Imran Ali, Yazen Ghannam, Tzu-Wei.

Operating Systems Unit 2: – Process Context switch Interrupt Interprocess communication – Thread Thread models Operating Systems.

Background Computer System Architectures Computer System Software.

Page 1 2P13 Week 1. Page 2 Page 3 Page 4 Page 5.

The Multikernel: A new OS architecture for scalable multicore systems Dongyoung Seo Embedded Software Lab. SKKU.

Running Commodity Operating Systems on Scalable Multiprocessors Edouard Bugnion, Scott Devine and Mendel Rosenblum Presentation by Mark Smith.

1 Chapter 2: Operating-System Structures Services Interface provided to users & programmers –System calls (programmer access) –User level access to system.

1.3 Operating system services An operating system provide services to programs and to the users of the program. It provides an environment for the execution.

Silberschatz, Galvin and Gagne ©2009Operating System Concepts – 8 th Edition Chapter 4: Threads.

Chapter 2 Operating System Overview Dave Bremer Otago Polytechnic, N.Z. ©2008, Prentice Hall Operating Systems: Internals and Design Principles, 6/E William.

T HE M ULTIKERNEL : A NEW OS ARCHITECTURE FOR SCALABLE MULTICORE SYSTEMS Presented by Mohammed Mustafa Distributed Systems CIS*6000 School of Computer.

Introduction to Operating Systems Concepts

Computer System Structures

The Multikernel: A New OS Architecture for Scalable Multicore Systems

Alternative system models

Operating System Structure

The Multikernel A new OS architecture for scalable multicore systems

Chapter 2: System Structures

Presented by Neha Agrawal

Introduction to Operating Systems

Presented by: SHILPI AGARWAL

CS510 - Portland State University

Chapter 2 Operating System Overview

Operating System Overview

Presentation transcript:

The Multikernel: A New OS Architecture for Scalable Multicore Systems By (last names): Baumann, Barham, Dagand, Harris, Isaacs, Peter, Roscoe, Schupbach, Singhania EECS 582 – W161 Presented by: Clyde Byrd III

The Modern Kernel(s) EECS 582 – W162 MonolithicMicrokernel

The Problem with Modern Kernels Modern Operating systems can no longer take serious advantage of the hardware they are running on There exists a scalability issue in the shared memory model that many modern kernels abide by Cache coherence overhead restricts the ability to scale to many-cores EECS 582 – W163

Solution: MultiKernel Treat the machine as a network of independent cores Make all inter-core communication explicit; use message passing Make OS structure hardware-neutral View state as replicated instead of shared EECS 582 – W164

But wait! Isn’t message passing slower than Shared Memory? Not at scale EECS 582 – W165

But wait! Isn’t message passing slower than Shared Memory? At scale it has been show that message passing has surpassed shared memory efficiency Shared memory at scale seems to be plagued by cache misses which cause core stalls Hardware is starting to resemble a message-passing network EECS 582 – W166

But wait! Isn’t message passing slower than Shared Memory? (cont.) EECS 582 – W167

But wait! Isn’t message passing slower than Shared Memory? (cont.) EECS 582 – W168

The MultiKernel Model EECS 582 – W169

Make inter-core communication explicit All inter-core communication is performed using explicit messages No shared memory between cores aside from the memory used for messaging channels Explicit communication allows the OS to deploy well-known networking optimizations to make more efficient use of the interconnect EECS 582 – W1610

Make OS structure hardware-neutral A multikernel separates the OS structure as much as possible from the hardware Hardware-independence in a multikernel means that we can isolate the distributed communication algorithms from hardware details Enable late binding of both the protocol implementation and message transport EECS 582 – W1611

View state as replicated Shared OS state across cores is replicated and consistency maintained by exchanging messages Updates are exposed in API as non-blocking and split-phase as they can be long operations Reduces load on system interconnect, contention for memory, overhead for synchronization; improves scalability Preserve OS structure as hardware evolves EECS 582 – W1612

In practice Model represents an ideal which may not be fully realizable Certain platform-specific performance optimizations may be sacrificed – shared L2 cache Cost and penalty of ensuring replica consistency varies on workload, data volumes and consistency model EECS 582 – W1613

Barrelfish EECS 582 – W1614

Barrelfish Goals EECS 582 – W1615 Comparable performance to existing commodity OS on multicore hardware Scalability to large number of cores under considerable workload Ability to be re-targeted to different hardware without refactoring Exploit message-passing abstraction to achieve good performance by pipelining and batching messages Exploit modularity of OS and place OS functionality according to hardware topology or load

System Structure Multiple independent OS instances communicating via explicit messages OS instance on each core factored into privileged-mode CPU driver which is hardware dependent user-mode Monitor process: responsible for intercore communication, hardware independent System of monitors and CPU drivers provide scheduling, communication and low-level resource allocation Device drivers and system services run in user-level processes EECS 582 – W1616

CPU Drivers Enforces protection, performs authorization, time-slices processes and mediates access to core and hardware Completely event-driven, single-threaded and nonpremptable Serially processes events in the form of traps from user processes or interrupts from devices or other cores Performs dispatch and fast local messaging between processes on core Implements lightweight, asynchronous (split-phase) same-core IPC facility EECS 582 – W1617

Monitors Schedulable, single-core user-space processes Collectively coordinate consistency of replicated data structures through agreement protocols Responsible for IPC setup Idle the core when no other processes on the core are runnable, waiting for IPI EECS 582 – W1618

Process Structure Process is represented by collection of dispatcher objects, one on each core which might execute it Communication is between dispatchers Dispatchers are scheduled by local CPU driver through upcall interface Dispatcher runs a core local user-level thread scheduler EECS 582 – W1619

Inter-core communication Variant of URPC for cache coherent memory – region of shared memory used as channel for cache-line-sized messages Implementation tailored to cache-coherence protocol to minimize number of interconnect messages Dispatchers poll incoming channels for predetermined time before blocking with request to notify local monitor when message arrives EECS 582 – W1620

Memory Management Manage set of global resources: physical memory shared by applications and system services across multiple cores OS code and data stored in same memory - allocation of physical memory must be consistent Capability system – memory managed through system calls that manipulate capabilities All virtual memory management performed entirely by user-level code EECS 582 – W1621

System Knowledge Base System knowledge base (SKB) maintains knowledge of underlying hardware in subset of first-order logic Populated with information gathered through hardware discovery, online measurement, pre-asserted facts SKB allows concise expression of optimization queries Allocation of device drivers to cores, NUMA-aware memory allocation in topology aware manner Selection of appropriate message transports for inter- core communication EECS 582 – W1622

Experiences from Barrelfish implementation Separation of CPU driver and monitor adds constant overhead of local RPC rather than system calls Moving monitor into kernel space is at the cost of complex kernel-mode code base Differs from current OS designs on reliance on shared data as default communication mechanism Engineering effort to partition data is prohibitive Requires more effort to convert to replication model Shared-memory single-kernel model cannot deal with heterogeneous cores at ISA level EECS 582 – W1623

Evaluation of Barrelfish The testing setup was not accurate making any quantitative conclusions from their benchmarks would be bad Barrelfish performs reasonably on contemporary hardware Barrelfish can scale well with core count Gives authors confidence that multikernel can be a feasible alternative EECS 582 – W1624

Related Work MicroKernels Reduction of shared memory/data structures Distributed Systems EECS 582 – W1625

Q &A? EECS 582 – W1626