Compiler Back End Panel

Presentation transcript:

Compiler Back End Panel
Robert Geva, Intel Compiler Lab
HPC User Forum, April 2009

Back End Compiler Panel

1) Are compiler code generation techniques going to transition along with the hardware transition from multi-core to many-core and hybrid systems, and at what speed?
2) What information do you need from a compiler intermediate format to efficiently utilize multi-core, many-core, and hybrid systems that is not available from traditional languages like C, C++, or F90? Are you looking at directive-based or library-based approaches, or is there another approach that you like?
3) Is embedded global memory addressing (like Co-Array Fortran) to be widely available and supported even on distributed-memory systems?
4) What kind of hybrid systems or processor extensions are going to be supported by your compiler's code generation suite?
5) What new run-time libraries will be available to utilize multi-core, many-core, and hybrid systems, and will they work seamlessly through dynamic linking?

Are compiler code generation techniques going to transition along with the hardware transition from multi-core to many-core and hybrid systems, and at what speed?

- JIT compiling
- Ct: research technology for a data parallel language
  - Forward scaling to future architectures
  - Save costly memory copying by delayed code generation (see the sketch below)
  - Validation
  - Multiple targets
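The memory-copy saving comes from deferring evaluation: code for an expression is generated only once the whole expression is known, so intermediate results never have to be written to memory. Below is a minimal C++ sketch of that idea using compile-time expression templates; it illustrates the technique, not the Ct API, and every name in it is invented for this example.

```cpp
#include <cstddef>
#include <vector>

// Deferred elementwise addition: 'a + b' builds a lightweight node
// instead of computing a result, so no temporary buffer is allocated.
template <typename L, typename R>
struct AddExpr {
    const L& lhs;
    const R& rhs;
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

struct Vec {
    std::vector<double> data;
    explicit Vec(std::size_t n) : data(n) {}
    double  operator[](std::size_t i) const { return data[i]; }
    double& operator[](std::size_t i)       { return data[i]; }
    std::size_t size() const { return data.size(); }

    // Code for the whole expression is generated only here, on
    // assignment: one fused loop, one pass over memory.
    template <typename E>
    Vec& operator=(const E& e) {
        for (std::size_t i = 0; i < size(); ++i) data[i] = e[i];
        return *this;
    }
};

AddExpr<Vec, Vec> operator+(const Vec& l, const Vec& r) { return {l, r}; }

template <typename L, typename R>
AddExpr<AddExpr<L, R>, Vec> operator+(const AddExpr<L, R>& l, const Vec& r) {
    return {l, r};
}

int main() {
    std::size_t n = 1 << 20;
    Vec a(n), b(n), c(n), out(n);
    out = a + b + c;  // single fused loop, zero temporary vectors
}
```

A JIT-based system like Ct can perform the same fusion at run time instead of compile time, which is also what allows the generated code to be retargeted to future architectures ("forward scaling") without recompiling the application.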

Are you looking at directive-based or library-based approaches, or is there another approach that you like?

- Interactive compiler: technology to guide the programmer to write serial code with directives and restructuring, leading to automatic parallelism
- The programmer and compiler iterate in a dialogue (see the example below):
  - Programmer: "Here's my code"
  - Compiler: "Is 'limit' loop invariant?"
  - Compiler: "Needs more work" — until the loop can be parallelized
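As an illustration of why the compiler has to ask (this example is mine, not from the slides): if `limit` is reached through a pointer that may alias the output array, every store could change the loop bound, so the compiler cannot hoist it or parallelize the loop without the programmer's help.

```cpp
// Hypothetical instance of the "is 'limit' loop invariant?" exchange.
// The compiler cannot prove *limit invariant: 'out' may alias 'limit',
// so the bound must be reloaded every iteration and the trip count is
// unknown, blocking vectorization and parallelization.
void scale(float* out, const float* in, const int* limit) {
    for (int i = 0; i < *limit; ++i)
        out[i] = 2.0f * in[i];
}

// The restructuring an interactive compiler would suggest: the
// programmer asserts invariance by hoisting the bound into a local,
// making the loop countable and safe to parallelize automatically.
void scale_restructured(float* out, const float* in, const int* limit) {
    const int n = *limit;  // programmer's answer: "yes, it's invariant"
    for (int i = 0; i < n; ++i)
        out[i] = 2.0f * in[i];
}
```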

Is embedded global memory addressing (like Co-Array Fortran) to be widely available and supported even on distributed-memory systems?

Goals
- Minimize changes to programming practices
- Use familiar concepts and existing programming languages; integrate into advanced platform technologies

Solutions
- Data parallel language: start programming from a data-parallelism perspective, using array notations; the tool chain will transform the code and target a CPU and Larrabee combination
- Offload language: start with task parallelism, and use directives to offload computation (both styles are sketched below)
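Neither of these existed as shipping syntax at the time, so the sketch below only illustrates the two styles. The array notation is modeled on the `a[0:n]` notation Intel later shipped in Cilk Plus, and the offload directive is hypothetical, loosely modeled on the `#pragma offload target(...)` Intel later provided for Xeon Phi; a standard compiler simply ignores the unknown pragma, so the function still runs (serially).

```cpp
// (1) Data-parallel style: whole-array operations that the tool chain
//     can retarget to CPU SIMD or to Larrabee. Cilk Plus-style array
//     notation, shown as a comment because it is a language extension:
//
//         y[0:n] = a * x[0:n] + y[0:n];

// (2) Offload style: task parallelism on the host, plus a directive
//     that moves one computation to the accelerator. The directive and
//     its clauses are hypothetical (modeled on the later #pragma offload).
void saxpy_offload(int n, float a, const float* x, float* y) {
    #pragma offload target(larrabee) in(x : length(n)) inout(y : length(n))
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```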

What kind of hybrid systems or processor extensions are going to be supported by your compiler's code generation suite?

- Existing SIMD: SSE, SSE2, SSE3, SSE4
- LRB (Larrabee) new instructions
  - 512-bit width
  - Masked operations
  - Broadcasts, swizzles
- Advanced Vector eXtensions (AVX) — compared with SSE in the sketch below
  - 256-bit width
  - 3-operand, non-destructive encoding
  - Enhanced data re-arrangement
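The width and encoding differences show up directly in intrinsics. A small sketch using the standard `<immintrin.h>` intrinsics (my illustration, not slide material): the SSE version processes 4 floats per instruction and the underlying `addps` overwrites one source register, while the AVX version processes 8 floats and the VEX-encoded `vaddps` writes a separate destination, leaving both sources intact.

```cpp
#include <immintrin.h>

// SSE: 128-bit vectors, 4 floats per operation; the legacy encoding
// is destructive (addps xmm1, xmm2 overwrites xmm1).
void add4(const float* a, const float* b, float* out) {
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb));
}

// AVX: 256-bit vectors, 8 floats per operation; the VEX encoding is
// 3-operand and non-destructive (vaddps ymm0, ymm1, ymm2).
void add8(const float* a, const float* b, float* out) {
    __m256 va = _mm256_loadu_ps(a);
    __m256 vb = _mm256_loadu_ps(b);
    _mm256_storeu_ps(out, _mm256_add_ps(va, vb));
}
```

Compile with AVX enabled (e.g. `gcc -mavx`) for `add8`. The Larrabee instruction set widened this further to 512 bits and added the per-lane masked operations listed above.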

What new run-time libraries will be available to utilize multi-core, many-core, and hybrid systems, and will they work seamlessly through dynamic linking?

- Resource coordination and task scheduling (http://channel9.msdn.com/pdc2008/TL22/)
- Generic algorithms, equivalent to language extensions (see the sketch below)
- Domain-specific libraries
  - Including math libraries
  - New domains: natural language processing, gesture recognition
- DLL hell? No good news
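The slide names no specific library, but Intel's Threading Building Blocks, already shipping at the time, is a concrete instance of "generic algorithms, equivalent to language extensions": a template library whose task scheduler does the resource coordination, with no compiler changes required. A minimal sketch:

```cpp
#include <cstddef>
#include <tbb/blocked_range.h>
#include <tbb/parallel_for.h>

// A generic algorithm in library form: parallel_for splits the index
// range into tasks and the TBB runtime schedules them across cores —
// the effect of a parallel-loop language extension, delivered as a
// library instead.
void scale_all(float* data, std::size_t n, float factor) {
    tbb::parallel_for(
        tbb::blocked_range<std::size_t>(0, n),
        [=](const tbb::blocked_range<std::size_t>& r) {
            for (std::size_t i = r.begin(); i != r.end(); ++i)
                data[i] *= factor;
        });
}
```

Because such a library is shipped as a shared object and linked at run time, it also illustrates the dynamic-linking half of the question, and why the versioning concern on the slide ("DLL hell") has no easy answer.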