Active Messages: a Mechanism for Integrated Communication and Computation (von Eicken et al.). Brian Kazian, CS258, Spring 2008.


Introduction
Gap between processor and network utilization
  – Need to maximize overlap to ensure efficiency of program
High message overhead
  – Requires batching of messages to compensate
H/W development neglects interaction between processor and network

Active Messages
Mechanism for sending messages
  – Message header specifies instruction address for integration into computation
  – Handler retrieves message, cannot block
  – No buffering available
Idea of making a simple interface to match hardware
Allow for overlap of computation and communication
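
The mechanism on this slide can be sketched as a small in-process simulation. This is only an illustration under stated assumptions: `AMNode`, `send`, `poll`, and `add_handler` are hypothetical names, a Python function reference stands in for the instruction address carried in the message header, and a `deque` stands in for the network interface rather than for any software buffering.

```python
from collections import deque


class AMNode:
    """Hypothetical node; `inbox` models the network interface FIFO."""

    def __init__(self, name):
        self.name = name
        self.inbox = deque()  # stands in for the network, not a software buffer
        self.total = 0        # state owned by the ongoing computation

    def send(self, dest, handler, *args):
        # The message "header" names the handler to run on arrival.
        dest.inbox.append((handler, args))

    def poll(self):
        # Pull each arrived message and run its handler to completion.
        handled = 0
        while self.inbox:
            handler, args = self.inbox.popleft()
            handler(self, *args)  # handlers must not block
            handled += 1
        return handled


def add_handler(node, value):
    # Integrates the payload directly into the computation's state.
    node.total += value


a, b = AMNode("a"), AMNode("b")
a.send(b, add_handler, 5)
a.send(b, add_handler, 7)
handled = b.poll()
```

The point of the sketch is that the receiver never inspects or stores the message; the handler named by the sender moves the data straight into the computation.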

Existing Send/Receive Models
Blocking send/receive (3-Phase Protocol)
  – Simple, yet inefficient computationally
  – No buffering needed
Asynchronous send/receive
  – Communication encapsulates computation
  – Buffer space allocated throughout computation

Active Message Protocol
Protocol
  – Sender sends a message to a receiver
      Asynchronous send while still computing
  – Receiver pulls message, integrates into computation through handler
      Handler executes without blocking
      Handler provides data to ongoing computation
        – Does not perform any computation itself
      Handler can only reply to sender, if necessary
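
A minimal sketch of this request/reply protocol, assuming a toy two-node setup. All names (`ReplyNode`, `send`, `read_request`, `read_reply`) are hypothetical. The restriction that a handler may only reply to the sender is modelled by having the handler return an optional reply instead of giving it a general send call.

```python
from collections import deque


class ReplyNode:
    """Hypothetical node with some local memory and a network input queue."""

    def __init__(self, node_id, memory=None):
        self.node_id = node_id
        self.memory = dict(memory or {})
        self.inbox = deque()
        self.results = {}

    def poll(self, nodes):
        while self.inbox:
            handler, src, args = self.inbox.popleft()
            reply = handler(self, *args)  # runs to completion, never blocks
            if reply is not None:
                reply_handler, reply_args = reply
                # The only destination a handler can reach is the original sender.
                nodes[src].inbox.append((reply_handler, self.node_id, reply_args))


def send(nodes, src, dest, handler, *args):
    # Asynchronous send: enqueue and return immediately.
    nodes[dest].inbox.append((handler, src, args))


def read_request(node, key):
    # Request handler: fetch the value and hand it back as a reply.
    return (read_reply, (key, node.memory[key]))


def read_reply(node, key, value):
    # Reply handler: deposit the data for the ongoing computation.
    node.results[key] = value
    return None


nodes = {0: ReplyNode(0), 1: ReplyNode(1, {"x": 42})}
send(nodes, 0, 1, read_request, "x")
partial = sum(range(10))  # node 0 keeps computing while the request is in flight
nodes[1].poll(nodes)      # node 1 services the request and replies
nodes[0].poll(nodes)      # node 0 receives the reply
```

Neither handler does any of the application's work; each only moves data between the network and the computation.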

Why Active Messages
Asynchronous communication
  – Non-blocking send/receive for overlap
No buffering
  – Only buffering within the network is needed
      Software handles other necessary buffers
Improved Performance
  – Close association with network protocol
Handlers are kept simple
  – Serve as an interface between network and computation
Concern becomes overhead, not latency

Message Passing Machines
Computation is via threads
Discrepancy between H/W and programming models
  – Higher level 3-phase send/recv used
      Active Messages provide better low-level interaction
Little overlap of communication/computation
  – Active Messages could allow for this
      No need for complicated scheduling
Large messages may still need to be buffered
AM provides performance increase solely with software

Message Passing Architectures – nCUBE/2 and CM-5
Overhead reduction
  – nCUBE/2: 160 us blocking -> 30 us Active Message
  – CM-5: … us blocking -> 23 us Active Message
Deadlock
  – nCUBE/2 uses multiple user buffers to prevent deadlock
  – CM-5 has dual identical networks
      Split for requests and replies

Message Driven Machines
Computation is within message handlers
Network is integrated into the processor
Developed for fine-grain parallelism
  – Utilizes small messages with low overhead
May buffer messages upon receipt
  – Buffers can grow to any size depending on amount of excess parallelism
State of computation is very temporal
  – Small number of registers, little locality

Hardware Support
Network Modifications:
  – Data reuse
      Store pieces of data in network interface for reuse
  – Protection
      Enforce message restrictions at network level
  – Message Accelerators
      Frequent messages launched quickly

Processor Support
Interrupts only way to handle asynchronous events
  – Flushes pipeline, very expensive!
Compiler can insert polling for messages
Use multithreading to switch between PCs
Use two separate processors
  – Handler and main computation separated
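
The polling alternative to interrupts might look roughly like the sketch below, where the poll is written by hand at the point a compiler could insert it. The names (`drain`, `record`, `compute_with_polling`) and the `poll_interval` parameter are assumptions made for illustration, not part of any real compiler or runtime.

```python
from collections import deque


def drain(inbox, state):
    """Run every arrived handler to completion; return how many ran."""
    count = 0
    while inbox:
        handler, payload = inbox.popleft()
        handler(state, payload)
        count += 1
    return count


def record(state, payload):
    # Trivial handler: hand the payload to the computation's state.
    state.append(payload)


def compute_with_polling(work_items, inbox, state, poll_interval=4):
    acc = 0
    polls = 0
    for i, item in enumerate(work_items, start=1):
        acc += item * item           # the main computation
        if i % poll_interval == 0:   # poll placed every few iterations
            drain(inbox, state)
            polls += 1
    drain(inbox, state)              # final poll so nothing is left behind
    return acc, polls


inbox = deque([(record, "m1"), (record, "m2")])
state = []
acc, polls = compute_with_polling(range(1, 9), inbox, state)
```

Messages are only serviced at poll points, so the interval trades message latency against the cost of checking the network.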

Split-C
Extension of C for SPMD Programs
  – Global address space is partitioned into local and remote
  – Maps shared memory benefits to distributed memory
      Dereference of remote pointers
      Keep events associated with message passing models
  – Split-phase access
      Enables dereferencing without interruption of processor
Active Messages serve as interface for Split-C
  – PUT/GET instructions utilized by compiler through prefetching
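
Split-phase access can be illustrated with a toy model in which a remote read or write is initiated, local work continues, and a later synchronization point waits for completion. This is Python rather than Split-C, and `SplitPhaseMemory`, `get`, `put`, and `sync` are hypothetical names. For simplicity the transfers here complete inside `sync()`, whereas in a real system they would proceed in the network while the processor computes.

```python
class SplitPhaseMemory:
    """Hypothetical view of a partitioned global address space."""

    def __init__(self, partitions):
        self.partitions = partitions  # processor id -> {address: value}
        self.local = {}               # local variables filled in by gets
        self.pending = []             # operations issued but not yet complete

    def get(self, local_key, proc, addr):
        # Initiate a remote read; return without waiting for the data.
        self.pending.append(("get", local_key, proc, addr))

    def put(self, proc, addr, value):
        # Initiate a remote write; return without waiting for the ack.
        self.pending.append(("put", value, proc, addr))

    def sync(self):
        # Complete all outstanding operations; return how many finished.
        completed = 0
        for kind, item, proc, addr in self.pending:
            if kind == "get":
                self.local[item] = self.partitions[proc][addr]
            else:
                self.partitions[proc][addr] = item
            completed += 1
        self.pending.clear()
        return completed


partitions = {0: {}, 1: {"a": 3, "b": 4}}
mem = SplitPhaseMemory(partitions)
mem.get("x", 1, "a")         # issue both reads back to back
mem.get("y", 1, "b")
mem.put(1, "c", 10)
local_work = sum(range(5))   # useful work while the accesses are outstanding
completed = mem.sync()       # values are only safe to use after this point
total = mem.local["x"] + mem.local["y"]
```

Issuing several accesses before a single synchronization point is what lets the communication time be hidden behind local computation.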

Active Messaging in its Current Form
Active Message 2 API
  – Naming updated to allow for models other than SPMD
      Paper implementation requires uniform code image
  – Support for multi-threaded applications
  – Multiple communication endpoints
      Controlling communication allows for handling messages that are returned
Additional robust forms of AM
  – AMMPI, LAPI

Titanium Implementation
Similar to Split-C, Java-based
  – Utilizes GASNet for network communication
      GASNet: higher-level abstraction of core API with AM
  – Global address space allows for portability
  – Skips JVM by translating to C

Conclusion
Active Messages provide a low-level interface for asynchronous messaging
  – Match hardware well on both message passing/driven machines
      Handlers are simple, keeping complexity low
Allows for overlap between computation and communication
Model is the basis for many different communication stacks

Questions?