UPC Runtime Layer Jason Duell. The Big Picture The Runtime layer handles everything that is both: 1) Platform/Environment specific —So compiler can output.

Slides:



Advertisements
Similar presentations
Objects and Classes David Walker CS 320. Advanced Languages advanced programming features –ML data types, exceptions, modules, objects, concurrency,...
Advertisements

Part IV: Memory Management
C Structures and Memory Allocation There is no class in C, but we may still want non- homogenous structures –So, we use the struct construct struct for.
Lecture 10: Part 1: OO Issues CS 540 George Mason University.
Garbage Collection  records not reachable  reclaim to allow reuse  performed by runtime system (support programs linked with the compiled code) (support.
(1) ICS 313: Programming Language Theory Chapter 10: Implementing Subprograms.
Chapter 8 Runtime Support. How program structures are implemented in a computer memory? The evolution of programming language design has led to the creation.
Chapter 4: Threads. Overview Multithreading Models Threading Issues Pthreads Windows XP Threads.
George Blank University Lecturer. CS 602 Java and the Web Object Oriented Software Development Using Java Chapter 4.
Contiki A Lightweight and Flexible Operating System for Tiny Networked Sensors Presented by: Jeremy Schiff.
Processes CSCI 444/544 Operating Systems Fall 2008.
1 1 Lecture 4 Structure – Array, Records and Alignment Memory- How to allocate memory to speed up operation Structure – Array, Records and Alignment Memory-
Memory Management. 2 How to create a process? On Unix systems, executable read by loader Compiler: generates one object file per source file Linker: combines.
Run time vs. Compile time
PZ09A Programming Language design and Implementation -4th Edition Copyright©Prentice Hall, PZ09A - Activation records Programming Language Design.
Pointers and Dynamic Variables. Objectives on completion of this topic, students should be able to: Correctly allocate data dynamically * Use the new.
The environment of the computation Declarations introduce names that denote entities. At execution-time, entities are bound to values or to locations:
Chapter 3.7 Memory and I/O Systems. 2 Memory Management Only applies to languages with explicit memory management (C or C++) Memory problems are one of.
Run-time Environment and Program Organization
1 Run time vs. Compile time The compiler must generate code to handle issues that arise at run time Representation of various data types Procedure linkage.
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science Computer Systems Principles C/C++ Emery Berger and Mark Corner University of Massachusetts.
. Memory Management. Memory Organization u During run time, variables can be stored in one of three “pools”  Stack  Static heap  Dynamic heap.
Chapter 8 :: Subroutines and Control Abstraction
LWIP TCP/IP Stack 김백규.
Unified Parallel C at LBNL/UCB The Berkeley UPC Compiler: Implementation and Performance Wei Chen the LBNL/Berkeley UPC Group.
Chapter 3.5 Memory and I/O Systems. 2 Memory Management Memory problems are one of the leading causes of bugs in programs (60-80%) MUCH worse in languages.
The Procedure Abstraction, Part V: Support for OOLs Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in.
Chapter 0.2 – Pointers and Memory. Type Specifiers  const  may be initialised but not used in any subsequent assignment  common and useful  volatile.
Copyright © 2005 Elsevier Chapter 8 :: Subroutines and Control Abstraction Programming Language Pragmatics Michael L. Scott.
June-Hyun, Moon Computer Communications LAB., Kwangwoon University Chapter 26 - Threads.
Topic 2d High-Level languages and Systems Software
© Janice Regan, CMPT 300, May CMPT 300 Introduction to Operating Systems Memory: Relocation.
Data Structures Using C++ 2E Chapter 3 Pointers. Data Structures Using C++ 2E2 Objectives Learn about the pointer data type and pointer variables Explore.
CS 326 Programming Languages, Concepts and Implementation Instructor: Mircea Nicolescu Lecture 9.
Moving Arrays -- 1 Completion of ideas needed for a general and complete program Final concepts needed for Final Review for Final – Loop efficiency.
RUN-Time Organization Compiler phase— Before writing a code generator, we must decide how to marshal the resources of the target machine (instructions,
Operating Systems Lecture 14 Segments Adapted from Operating Systems Lecture Notes, Copyright 1997 Martin C. Rinard. Zhiqing Liu School of Software Engineering.
Processes CS 6560: Operating Systems Design. 2 Von Neuman Model Both text (program) and data reside in memory Execution cycle Fetch instruction Decode.
Slides created by: Professor Ian G. Harris Hello World #include main() { printf(“Hello, world.\n”); }  #include is a compiler directive to include (concatenate)
1 CS503: Operating Systems Spring 2014 Part 0: Program Structure Dongyan Xu Department of Computer Science Purdue University.
CSC 8505 Compiler Construction Runtime Environments.
Computer Graphics 3 Lecture 1: Introduction to C/C++ Programming Benjamin Mora 1 University of Wales Swansea Pr. Min Chen Dr. Benjamin Mora.
Operating Systems ECE344 Ashvin Goel ECE University of Toronto Memory Management Overview.
Week 4 - Friday.  What did we talk about last time?  Some extra systems programming stuff  Scope.
Memory Management. 2 How to create a process? On Unix systems, executable read by loader Compiler: generates one object file per source file Linker: combines.
Unified Parallel C at LBNL/UCB Berkeley UPC Runtime Report Jason Duell LBNL September 9, 2004.
LECTURE 3 Translation. PROCESS MEMORY There are four general areas of memory in a process. The text area contains the instructions for the application.
1 Compiler Construction Run-time Environments,. 2 Run-Time Environments (Chapter 7) Continued: Access to No-local Names.
LECTURE 19 Subroutines and Parameter Passing. ABSTRACTION Recall: Abstraction is the process by which we can hide larger or more complex code fragments.
Binding & Dynamic Linking Presented by: Raunak Sulekh(1013) Pooja Kapoor(1008)
Code Generation Instruction Selection Higher level instruction -> Low level instruction Register Allocation Which register to assign to hold which items?
Lecture 3 Translation.
CSC 253 Lecture 8.
Array Lists Chapter 6 Section 6.1 to 6.3
Chapter 9 :: Subroutines and Control Abstraction
CSC 253 Lecture 8.
Page Replacement.
Chap. 8 :: Subroutines and Control Abstraction
Chap. 8 :: Subroutines and Control Abstraction
Handling Arrays Completion of ideas needed for a general and complete program Final concepts needed for Final.
PZ09A - Activation records
Handling Arrays Completion of ideas needed for a general and complete program Final concepts needed for Final.
Activation records Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section
Handling Arrays Completion of ideas needed for a general and complete program Final concepts needed for Final.
point when a program element is bound to a characteristic or property
Lecture 4: Instruction Set Design/Pipelining
Activation records Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section
Activation records Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section
Activation records Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section
Presentation transcript:

UPC Runtime Layer Jason Duell

The Big Picture The Runtime layer handles everything that is both: 1) Platform/Environment specific —So compiler can output one version of code for all platforms. 2) But also specific to UPC —So GASNet can remain language-independent.

Runtime Layer Laundry List 1) Shared pointer representation and manipulation 2) Pthread creation and management 3) Memory Management 4) Synchronization 5) Initialization code

Supported Runtime Environments 2 Main Axes: 1) Threads vs. Processes 2) Network vs. Shared Memory vs. both — Also; network vs. local synchronization mechanisms We will support: —Threads on a single SMP (all shared memory) —Processes on a single SMP (all shared memory) —Processes w/network (all network) —Threads w/network (use both) We won’t support (at least for now) —Processes on SMP & network (using both) —Will only use network communications

Implementation goals Speed: compile time resolution instead of run-time checks wherever possible. Parsable by compiler (for compilers that generate straight to assembly) —Inline functions instead of macros where possible. Clean, maintainable implementation —But have it done yesterday

Shared Pointer Representation struct naïve_shared_ptr { void * addr; uint thread; uint phase; }; Provide phaseless shared pointer type (for both phaseless and default cyclic). —Can omit phase field. —If pure shared memory, this can just be a pointer 64 bit platforms: may be able to stuff some fields into unused top bits of pointer. Using offset instead of address may save space —But might make casts to local slower… Solution: provide abstract type and operations.

Thread-specific data All unshared global & static declarations must be have a copy per pthread. Solution #1: Put all variables in a big struct, and make 1 copy of it per thread. —Need to effectively eliminate separate compilation (slow). —Data no longer initialized by linker —Object files not readable by nm, etc. Solution #2: Put all variables in single link section—make 1 copy of section per thread, and use pointer & offset into section to reference variables. —Solves initialization, separate compilation, object format. —But involves nonstandard compiler and linker directives.

Heap Management GASNet provides a single, fixed network-accessible shared memory region to the Runtime. The Runtime must divide it among threads, and manage separate local and global heap for each thread. Also must prevent regular C heap from expanding into shared region: hook malloc/free to our own, bounded heap.

Shared Memory Allocation

Synchronization Pure Shared Memory environments: —Runtime provides synchronization via pthreads or System V IPC mechanisms. Networked environments: —GASNet provides synchronization across processes via the network —Runtime provides it between threads in the same process.

Allocating/Initializing Shared Data Initialization of shared data can be tricky: extern shared int array[THREADS]; shared int *p = &array[8]; Thread-specific data: can no longer trust linker to initialize addresses for unshared global/static pointers: int foo; int *pfoo = &foo; Solution: per file initialization functions to handle complex cases —Must be able to run in arbitrary order —Runtime may provide helper functions for compiler.

Implementation Plan 1) Processes with shared memory: —In progress: should be done by mid-June. —Allow compiler correctness testing and optimization work to proceed. 2) Processes with network —Less than a month additional effort. —Allow GASNet testing, and ports to multiple networks. 3) Threads support —Trickiest implementation issues. 4) Ports to other platforms trivial given a GASNet implementation for the network. —Mainly compiler/linker-specific hooks for TSD.