Mirage: an OCaml Exokernel Anil Madhavapeddy University of Cambridge Computer Laboratory, 15 JJ Thomson Avenue, Cambridge, UK with Dr. Thomas Gazagnaire.


Mirage: an OCaml Exokernel
Anil Madhavapeddy
University of Cambridge Computer Laboratory, 15 JJ Thomson Avenue, Cambridge, UK
with Dr. Thomas Gazagnaire (OCamlPro), Dr. Richard Mortier (Nottingham), Dr. Steven Hand (Cambridge), and Prof. Jon Crowcroft (Cambridge)

Motivation: Layers
[Diagram labels: Hardware, Processes, OS Kernel, Threads, Application]

Motivation: Layers
[Diagram labels: Hardware, Processes, OS Kernel, Threads, Application, Language Runtime]

Motivation: Layers
[Diagram labels: Hardware, Processes, OS Kernel, Threads, Application, Hypervisor, Language Runtime]

Motivation: In Search of Simplicity
[Diagram labels: Hardware, Processes, OS Kernel, Threads, Application, Hypervisor, Language Runtime]
Linux Kernel
  Mar 1994: 176,250 LoC
  May 2010: 13,320,934 LoC

Architecture: Exokernel
[Diagram, first stack: Hardware, Processes, OS Kernel, Threads, Application, Hypervisor, Language Runtime]
[Diagram, second stack: Hardware, Application, Hypervisor, Language Runtime]

Architecture: Workflow
[Diagram, first stack: Hardware, Processes, OS Kernel, Threads, Application, Hypervisor, Language Runtime]
[Diagram, second stack: Hardware, Application, Hypervisor, Language Runtime]
[Stack captions: Develop, Deploy]

Layer 1: Separation Kernel
Assume { Xen, KVM, L4 } exists
Abstract Hardware I/O interfaces
Resource Isolation for memory, CPU
Concurrency and Timers
[Diagram labels: Hardware, Application, Hypervisor, Language Runtime]

Layer 1: Minimal OS “signature”
module Console : sig
  type t
  val create : unit -> t
  val write : t -> string -> unit
end
[Diagram labels: Hardware, Application, Hypervisor, Language Runtime]
let rec fib n = if n < 2 then 1 else fib (n-1) + fib (n-2)
let _ = fib 40
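
The following is a minimal sketch of how a module might satisfy the Console signature shown on this slide. The Buffer-backed implementation, the module name Buffer_console and the contents helper are assumptions made for illustration; the slide does not show the real backend, which would presumably hand the output to the hypervisor.

```ocaml
(* The signature from the slide, written as a module type. *)
module type CONSOLE = sig
  type t
  val create : unit -> t
  val write : t -> string -> unit
end

(* Hypothetical in-memory console: output is collected in a Buffer.
   A real backend would pass the bytes to the hypervisor instead. *)
module Buffer_console : sig
  include CONSOLE
  val contents : t -> string (* assumption: added only for inspection *)
end = struct
  type t = Buffer.t
  let create () = Buffer.create 80
  let write t s = Buffer.add_string t s
  let contents t = Buffer.contents t
end

(* The function from the slide, called here with a smaller argument. *)
let rec fib n = if n < 2 then 1 else fib (n - 1) + fib (n - 2)

let console = Buffer_console.create ()

let () =
  Buffer_console.write console (Printf.sprintf "fib 20 = %d\n" (fib 20))
```

Because the signature keeps the type t abstract, application code that only uses create and write does not need to change when a different console backend is linked in.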

Layer 1: A simple “hello world” kernel
Xen runs para-virtualized kernels that cooperate with the hypervisor.
  Most code runs unmodified
  Privileged instructions go via Xen hypercalls
Linked to a small C library to make a kernel
Boots in 64-bit mode directly, with starting memory all mapped.
Is approximately KB in size.
[Diagram labels: Hardware, Application, Hypervisor, Language Runtime]

Mirage: 64-bit Xen Memory Layout
[Diagram of the 64-bit address space, labels: OS Text and Data, Network Buffers, Reserved, OCaml minor heap, OCaml major heap, 120 TB, 128 TB]
Single 64-bit address space
Specialize regions of memory
No support for:
  Dynamic shared libraries
  Address Space Randomization
  Multiple runtimes (for now)

Mirage: Network Buffers
[Diagram of the 64-bit address space, labels: OS Text and Data, Network Buffers, Reserved, OCaml minor heap, OCaml major heap, 120 TB, 128 TB]
[Buffer diagram, 4 KB: IP Header, TCP Header, Transmit packet data; IP Header, TCP Header, Receive packet data]

Mirage: x86 superpages for OCaml heap
[Diagram of the 64-bit address space, labels: OS Text and Data, Network Buffers, Reserved, OCaml minor heap, OCaml major heap, 120 TB, 128 TB, 4MB]
Reduces TLB pressure significantly.
Is_in_heap check is much simpler
Q: Improve GC/cache interaction using PAT registers?
Q: co-operative GC?
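
A back-of-envelope sketch of the two claims on this slide. The 1 GiB heap size and the heap base address are assumptions picked for illustration, not figures from the talk, and the range check is only one plausible reading of why an in-heap test becomes simpler when the heap lives in a single reserved region.

```ocaml
(* Sizes in bytes; assumes a 64-bit OCaml, where int is 63 bits wide. *)
let kib = 1024
let mib = 1024 * kib
let gib = 1024 * mib
let tib = 1024 * gib

(* Pages (and so distinct address translations) needed to map [bytes]. *)
let pages_needed ~page_size bytes = (bytes + page_size - 1) / page_size

let heap_size = 1 * gib (* assumption: illustrative heap size *)
let small_pages = pages_needed ~page_size:(4 * kib) heap_size
let super_pages = pages_needed ~page_size:(4 * mib) heap_size

(* Hypothetical contiguous heap region; the base address is made up. *)
let heap_base = 64 * tib
let heap_limit = heap_base + heap_size

(* With one reserved region, membership is a pair of comparisons. *)
let is_in_heap addr = addr >= heap_base && addr < heap_limit

let () =
  Printf.printf "4 KB pages: %d, 4 MB superpages: %d\n" small_pages super_pages
```

Under these assumptions a 1 GiB heap needs 262144 mappings with 4 KB pages but only 256 with 4 MB superpages, a factor of 1024 fewer translations competing for TLB entries.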

MirageOS: memory performance vs PV Linux

Layer 2: Concurrency and Parallelism
[Diagram labels: Core, Kernel, Core, Hypervisor, Process, Thread]

Layer 2: Concurrency
Xen provides a low-level event interface.
No need for interrupts: a perfect fit for co-operative threading!
We always know our next timeout (priority queue)
So we adapted the LWT threading library
[Diagram label: Block 5s]
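
The idea of always knowing the next timeout can be sketched with a toy co-operative scheduler. This is not LWT and not Mirage code: the names now, sleepers, sleep and run are hypothetical, the clock is simulated, and a sorted list stands in for the priority queue.

```ocaml
(* Simulated clock in seconds; a real kernel would ask the hypervisor. *)
let now = ref 0.0

(* Sleeping tasks, kept sorted by wake-up time (a simple priority queue). *)
let sleepers : (float * (unit -> unit)) list ref = ref []

(* Register continuation [k] to run [delay] seconds from now. *)
let sleep delay k =
  let wake = !now +. delay in
  let rec insert = function
    | [] -> [ (wake, k) ]
    | ((w, _) as hd) :: tl when w <= wake -> hd :: insert tl
    | rest -> (wake, k) :: rest
  in
  sleepers := insert !sleepers

(* Run until nothing is left. The head of the queue is always the next
   timeout, so the loop knows exactly how long it could block. *)
let rec run () =
  match !sleepers with
  | [] -> ()
  | (wake, k) :: rest ->
      sleepers := rest;
      now := wake; (* stands in for blocking until the timeout fires *)
      k ();
      run ()

let trace : string list ref = ref []

let () =
  sleep 5.0 (fun () -> trace := "five" :: !trace);
  sleep 1.0 (fun () ->
      trace := "one" :: !trace;
      sleep 2.0 (fun () -> trace := "three" :: !trace));
  run ()
```

Tasks run to completion and yield only by calling sleep, which is why no interrupts are needed; the same property means a task that never yields stalls everything else.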

Layer 2: OS Signature with Timing
module Console : sig
  type t
  val create : unit -> t
  val sync_write : t -> string -> unit Lwt.t
  val write : t -> string -> unit
end
module Clock : sig
  val time : unit -> float
end
module Time : sig
  val sleep : float -> unit Lwt.t
end
module Main : sig
  val run : unit Lwt.t -> unit
end

…and parallelism?
Xen divides up cores into vCPUs, LWT multiplexes on a single core
Mirage “process” is a separate OS, communicating via event channels
Open Question: parallelism model (JoCaml, OPIS, CIEL futures)
[Diagram labels: vCPU 1, vCPU 2, Mem 1, Mem 2, SHM]

Layer 3: Abstract I/O
module type FLOW = sig
  type t
  type mgr
  type src
  type dst
  val read : t -> view option Lwt.t
  val write : t -> view -> unit Lwt.t
  val close : t -> unit Lwt.t
end
module type DATAGRAM = sig
  type mgr
  type src
  type dst
  type msg
end

Layer 3: Abstract I/O
module type FLOW = sig
  type t
  type mgr
  type src
  type dst
  val read : t -> view option Lwt.t
  val write : t -> view -> unit Lwt.t
  val close : t -> unit Lwt.t
  val listen : mgr -> src -> (dst -> t -> unit Lwt.t) -> unit Lwt.t
  val connect : mgr -> src -> dst -> (t -> unit Lwt.t) -> unit Lwt.t
end
module type DATAGRAM = sig
  type mgr
  type src
  type dst
  type msg
  val recv : mgr -> src -> (dst -> msg -> unit Lwt.t) -> unit Lwt.t
  val send : mgr -> dst -> msg -> unit Lwt.t
end
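
One way to see what the FLOW abstraction buys is a functor written once against it. The sketch below is an illustration under several assumptions: the local Lwt module is a stand-in identity monad so the example compiles without the real library, view is taken to be string because the slide leaves it undefined, only the read/write/close subset of FLOW is used, and Echo and Mem_flow are hypothetical names.

```ocaml
(* Stand-in for the real Lwt library so the sketch is self-contained:
   an identity monad in which every "thread" completes immediately. *)
module Lwt = struct
  type 'a t = 'a
  let return x = x
  let bind m f = f m
end

type view = string (* assumption: the slide leaves [view] undefined *)

(* Data-path subset of the FLOW signature from the slide. *)
module type FLOW = sig
  type t
  val read : t -> view option Lwt.t
  val write : t -> view -> unit Lwt.t
  val close : t -> unit Lwt.t
end

(* An echo service written once against FLOW, whatever the transport. *)
module Echo (F : FLOW) = struct
  let rec serve flow =
    Lwt.bind (F.read flow) (function
      | None -> F.close flow
      | Some v -> Lwt.bind (F.write flow v) (fun () -> serve flow))
end

(* Hypothetical in-memory flow, used only to exercise the functor. *)
module Mem_flow = struct
  type t = {
    mutable input : view list;
    mutable output : view list;
    mutable closed : bool;
  }
  let make input = { input; output = []; closed = false }
  let read t =
    match t.input with
    | [] -> Lwt.return None
    | v :: rest ->
        t.input <- rest;
        Lwt.return (Some v)
  let write t v =
    t.output <- v :: t.output;
    Lwt.return ()
  let close t =
    t.closed <- true;
    Lwt.return ()
end

module Mem_echo = Echo (Mem_flow)

let flow = Mem_flow.make [ "hello"; "world" ]
let () = Mem_echo.serve flow
```

Any module that matches the signature could be passed to Echo in place of Mem_flow without touching the service code.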

Layer 3: Concrete I/O Modules
module TCPv4 : sig
  type t
  type mgr = Manager.t
  type src = (ipv4_addr option * int)
  type dst = (ipv4_addr * int)
  val read : t -> view option Lwt.t
  val write : t -> view -> unit Lwt.t
  val close : t -> unit Lwt.t
  val listen : mgr -> src -> (dst -> t -> unit Lwt.t) -> unit Lwt.t
  val connect : mgr -> src -> dst -> (t -> unit Lwt.t) -> unit Lwt.t
end
module Shmem : sig
  type t
  type mgr = Manager.t
  type src = domid
  type dst = domid
  val read : t -> view option Lwt.t
  val write : t -> view -> unit Lwt.t
  val close : t -> unit Lwt.t
  val listen : mgr -> src -> (dst -> t -> unit Lwt.t) -> unit Lwt.t
  val connect : mgr -> src -> dst -> (t -> unit Lwt.t) -> unit Lwt.t
end

Layer 3: Multiple OS modules
[Diagram labels: OS (Unix), OS (Xen), Stdlib; module list repeated on the slide: Istring, Time, Clock, Console, Ethif, Main]

Layer 3: Multiple OS modules
[Diagram labels: OS (Unix), OS (Xen), Stdlib; module list repeated on the slide: Istring, Time, Clock, Console, Ethif, Main; second module list: Gnttab, Evtchn, Ring, Xenbus, Xenstore; captions: Kernel bindings, Xen bindings]

Layer 3: Standard Library Combinations
[Diagram labels: OS (Unix), OS (Xen), Stdlib, Net (direct), Net (socket), Application]
Unix/socket (ELF binary)
Unix/direct (ELF binary)
Xen/direct (microkernel)

Layer 3: Ocamlbuild Compilation
[Build diagram labels: Application (ml, mli, camlp4, cmi, cmx, a); Stdlib (ml, mli, camlp4, cmi, cmx, a); ocamlopt -output-obj; asmrun.a; minios.a; xen.lds; Mirage kernel]

Layer 3: Ethernet I/O
I/O arrives as Ethernet frames via shared memory, and is parsed via a DSL
We have Ethernet, ARP, ICMP, IPv4, DHCP, TCPv4, HTTP, DNS, SSH in pure OCaml.
Performance in user-space is excellent (EuroSys 2007), now benchmarking under Xen.
Zero-copy, bounds optimisation is vital to performance.
[Diagram labels: Ethernet, IP, TCP, Data]

Meta Packet Language (MPL)
packet tcp {
  source_port: uint16;
  dest_port: uint16;
  sequence: uint32;
  ack_number: uint32;
  offset: bit[4] value(offset(header_end) / 4);
  reserved: bit[4] const(0);
  cwr: bit[1] default(0);
  ece: bit[1] default(0);
  urg: bit[1] default(0);
  ack: bit[1] default(0);
  psh: bit[1] default(0);
  rst: bit[1] default(0);
  syn: bit[1] default(0);
  fin: bit[1] default(0);
  window: uint16;
  checksum: uint16;
  urgent: uint16 default(0);
  header_end: label;
  options: byte[(offset * 4) - offset(header_end)] align(32);
  data: byte[remaining()];
}
OCaml output can both construct and parse packets from this DSL.
Melange: Towards a ‘Functional’ Internet, EuroSys 2007, Madhavapeddy et al.
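
To give a feel for what code derived from such a specification has to do, here is a hand-written sketch that parses a few of the tcp fields from a byte buffer. It is not the output of the MPL compiler: the record type tcp_header and the function parse_tcp are hypothetical, only a subset of the fields is decoded, and the Bytes accessors used need OCaml 4.08 or later.

```ocaml
(* Hand-written counterpart of a few fields of the MPL tcp packet. *)
type tcp_header = {
  source_port : int;
  dest_port : int;
  sequence : int32;
  ack_number : int32;
  offset : int; (* header length in 32-bit words *)
  ack : bool;
  syn : bool;
  fin : bool;
  window : int;
}

(* Returns None when the buffer is too short for the header it claims. *)
let parse_tcp (b : Bytes.t) : tcp_header option =
  if Bytes.length b < 20 then None
  else
    let offset = Bytes.get_uint8 b 12 lsr 4 in (* top 4 bits of byte 12 *)
    let flags = Bytes.get_uint8 b 13 in        (* cwr .. fin, one bit each *)
    if offset < 5 || offset * 4 > Bytes.length b then None
    else
      Some
        { source_port = Bytes.get_uint16_be b 0;
          dest_port = Bytes.get_uint16_be b 2;
          sequence = Bytes.get_int32_be b 4;
          ack_number = Bytes.get_int32_be b 8;
          offset;
          ack = flags land 0x10 <> 0;
          syn = flags land 0x02 <> 0;
          fin = flags land 0x01 <> 0;
          window = Bytes.get_uint16_be b 14 }

(* A made-up 20-byte SYN segment from port 12345 to port 80. *)
let sample =
  let b = Bytes.make 20 '\000' in
  Bytes.set_uint16_be b 0 12345;
  Bytes.set_uint16_be b 2 80;
  Bytes.set_int32_be b 4 1l;
  Bytes.set_uint8 b 12 (5 lsl 4);
  Bytes.set_uint8 b 13 0x02;
  Bytes.set_uint16_be b 14 65535;
  b
```

Every field read here repeats a length or bounds decision by hand; a declarative specification lets a compiler generate those checks once and consistently for both parsing and construction.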

Research Directions
A more general solution that can handle ABNF, XML, JSON, etc.
  Yakker (AT&T Research)
  Dependently typed DSLs (Idris)
  LinearML (quasi-linear, reference-counted ML)
Goals: 10GB/s type-safe network I/O. Specify file-systems in this way also.

Research Directions
Platforms
  Bytecode: Simple interpreted runtime
  ELF binary: Native code binary running in user-space
  Kernel module: Native code binary running in kernel mode
  Javascript: Web browser via ocamljs or js_of_ocaml
  JVM: virtual machine via ocamljava
  8-bit PIC: via ocamlpic
  Microkernel: Xen / KVM / VMWare
Optimisation
  Whole OS compilation
  LLVM: needed badly for interoperability, not performance
  Profiling

Mirage: roadmap
WWW: self-hosting, so it might be down :)
Code: First developer release: soon! “Early adopters” welcome, you just need an Amazon EC2 account for the Xen backend, or Linux/*BSD/MacOS X for POSIX.
Goal: practical, open, safe, fast Internet services
IRC: #mirage
Twitter: avsm
This work is supported by Horizon Digital Economy Research, RCUK grant EP/G065802/1

Backup Slides

Mirage: concurrency using LWT
Advantages:
  Core library is pure OCaml with no magic
  Excellent camlp4 extension to hide the bind monad.
  Function type now clearly indicates that it blocks.
Open Issues:
  Creates a lot of runtime closures (lambda lifting, whole program opt?)
  Threat model: malicious code can now hang whole OS

Moving on from the Socket API (ii)
type packet = | Stream | Datagram
type direction = | Uni | Bi
type consumption = | Blaster | Congestion
val target : packet -> direction -> consumption -> ip_addr -> sockaddr
module Flow : sig
  type t
  val read: t -> string -> int -> int -> int Lwt.t
  val write: t -> string -> int -> int -> int Lwt.t
  val connect: sockaddr -> (t -> unit Lwt.t) -> unit Lwt.t
  val listen: sockaddr -> (sockaddr -> t -> unit Lwt.t) -> unit Lwt.t
end

Mirage: Typed Memory Allocators
[Diagram of the 64-bit address space, labels: OS Text and Data, Network Buffers, Reserved, OCaml minor heap, OCaml major heap, 120 TB, 128 TB]
Buddy Allocator
  dyn_init(type)
  dyn_malloc(type, size)
  dyn_realloc(size)
  dyn_free(type)
Heap Allocator
  heap_init(type, pages)
  heap_extend(type, pages)
  heap_shrink(type, pages)
Page Grant Allocator
  grant_alloc_page(type)
  grant_free_page(type)

DNS: Performance of BIND (C) vs Deens (ML)

DNS: with functional memoisation

SQL performance vs PV Linux