Compiler Back-End Panel
IDC HPC User Forum, April 2009
Don Kretsch, Director, Sun Developer Tools, Sun Microsystems



Are compiler code-generation techniques going to transition along with the hardware transition from multi-core to many-core and hybrid systems, and at what speed?

Single core:
> Classic performance optimizations
Multi-core:
> Auto-parallelization
> Alternate code paths
> Directive extensions
Many-core:
> Fine-grained parallelization
> Speculative parallelization
Hybrid-core:
> Cooperating processor-specific code generators
> Assembly of general- and special-purpose elements

What information do you need from a Compiler Intermediate Format to efficiently utilize multi-core, many-core, and hybrid systems that is not available from traditional languages like C, C++, or F90? Are you looking at directive-based or library-based approaches, or is there another approach that you like?

Data that helps compilers improve multi-core utilization includes: aliasing assertions, alias levels for pointers, variable scoping, and the expected runtime environment (processors, architectures, caches, cores, binding of threads).

Multiple approaches will evolve over time:
> Directive-based: standard and extended OpenMP (e.g., auto-scoping of variables), transactional memory
> Library-based: various commercial and open-source options (Boost, Sun MCFX, Intel TBB)
> Other: just-in-time (binary) parallelization, profile feedback, compiler commentary review

Is embedded global memory addressing (like Co-Array Fortran) going to be widely available and supported, even on distributed-memory systems?

Steps for adoption of global memory addressing:
> Standard syntax defined and supported on a majority of vendor compilers
> Tuned interactions with the network run-time layer
> Widespread use by ISVs in their library offerings
> Knowledgeable programmers

Alternatives with no compiler impact and less topology sensitivity:
> MPI library
> Job schedulers (Sun Grid Engine, LSF, etc.)
> Map-Reduce (Hadoop)
> Others

What kinds of hybrid systems or processor extensions are going to be supported by your compiler's code-generation suite?

Sun's compilers support the latest:
> UltraSPARC processors with 64 threads (8 cores x 8 hardware threads)
> Intel and AMD processors with various SSE instructions

Hybrid transactional memory* support in Sun's tool suite is planned for release in 2009:
> Hardware assistance (new instructions and microarchitectural structures) is being implemented in a future Sun processor

Co-processors:
> GPGPU support is being considered

* Transactional memory is the ability to perform a set of instructions atomically without expensive atomic instructions.

What new run-time libraries will be available to utilize multi-core, many-core, and hybrid systems, and will they work seamlessly through dynamic linking?

First, basic OS libraries need to be MT-safe and tuned for many-core systems:
> libc, libm, etc.
> Memory allocation (mtmalloc)
> Hybrid transactional memory routines

Next, domain-specific libraries can extend the system libraries:
> Solvers (BLAS, LAPACK, ...)
> C++ STL, Boost, etc.
> Berkeley "Dwarfs"

Resource contention is possible when multiple such libraries are used together.