2012-10-26 FSOSS Dr. Chris Szalwinski Professor School of Information and Communication Technology Seneca College, Toronto, Canada GPU Research Capabilities.

FSOSS Dr. Chris Szalwinski Professor School of Information and Communication Technology Seneca College, Toronto, Canada GPU Research Capabilities at Seneca

2 A Fresh Initiative From Some Personal History To Heterogeneous Computing

3 A Fresh Initiative The 80287

4 A Fresh Initiative Floating-Point Co-Processor (1985)

5 A Fresh Initiative ATI 3D Rage II Co-Processor (1996)

6 A Fresh Initiative A Paradigm Shift In Programming

7 Paradigm Shift The Turn Towards Concurrency

8 Paradigm Shift

12 Paradigm Shift Can still increase: transistor density – but it's getting more expensive. Can't increase: processor frequencies (< 10 GHz chips); power consumption – can't melt chips. The Free Lunch is Over: we can't just wait for improvement like we did before; we need new routes to improvement.

13 Paradigm Shift Use Different Computational Units For Distinctly Different Tasks

14 Heterogeneous Computing Intel Core i7 (2008), NVIDIA GeForce GTX580 (2010)

15 Heterogeneous Computing

16 Heterogeneous Computing

17 Heterogeneous Computing Serial processing + Parallel processing

18 Heterogeneous Computing NVIDIA many-core GPUs vs Intel multi-core CPUs: floating point operations per sec (GFLOP/s); memory bandwidth (GB/s)

24 Industry Momentum STI (Sony + Toshiba + IBM): Cell Processor – CPU + GPU on one chip. Intel: Xeon Phi – MIC (Many Integrated Core). AMD: APUs (Fusion) – CPU + GPU on a single chip; HSA Foundation (2012) – AMD + ARM + TI + Imagination + MediaTek + Samsung + Arteris + MulticoreWare + Apical + Sonics + Symbio + Vivante; Radeon – Discrete GPUs. NVIDIA – Discrete GPUs: GeForce (digital gaming); Quadro (engineering workstations – graphics); Tesla (scientific computations – double precision).

25 Industry Momentum Discrete GPUs - Add-in board shipments

26 Industry Momentum Predictions

27 Industry Predictions Computer Graphics Market

28 Industry Predictions Computer Graphics Market: traditional processors + low-cost graphics processors enable combinations of science and entertainment

29 Industry Predictions Embedded Graphics Processors (EGPs) are killing off Integrated Graphics Processors (IGPs)

30 Industry Predictions Embedded Graphics Processors (EGPs) are no threat to Discrete Graphics

32 Programming Heterogeneous Computers Concurrency-Oriented Programming (COP). Core Languages: Fortran, C, C++. Extensions for COP: Cilk Plus (Intel); OpenCL (Khronos Group – AMD and HSA); CUDA: C/C++ (NVIDIA), Fortran 2008, C-x86 (PGI); DirectCompute (Microsoft).

33 Programming Heterogeneous Computers CUDA Teaching Centers in Ontario. McMaster University (2010): High Performance Parallel Computing on Graphical Processing Units – ECE709 – part of Master's Degree. University of Toronto (2011): Special Topics in Software Engineering: Programming Massively Parallel Graphics Processors – ECE1724H – part of Master's Degree. Seneca College (2012): Introduction to Parallel Programming – Professional Option – GPU610/DPS915 – CPA Diploma and BSD Degree.

34 Programming Heterogeneous Computers School of Information and Communications Technology (ICT) Our Capabilities and Plans

35 ICT Facilities Fully Equipped Teaching Classroom and Lab: 40 seats; 38 CUDA-enabled desktops with GTX480s (480 cores). Maximus Workstation: Quadro 600 for visualization; Tesla C2075 for computation. SCI-Net Research: Accelerator Research Cluster – research testbed; 8 x [2 Intel Xeon X NVIDIA Tesla M2070]

36 ICT Facilities The 80287

37 ICT Courses Introductory Course – Student Skill Set: solid tested background in both C and C++; profile for computationally intensive code; move critical code to the GPU using CUDA; optimize to hide memory latency with computations. Programmer Training Workshops – on demand. Advanced Course (in the planning stage): Interactive Real-Time Computations + Visualization; Parallelizing Fortran Applications; OpenGL, DirectX Graphics Interoperability.

38 ICT Faculty Areas of Interest or Domain Expertise: Big Data – Geocomputation; Cognition – Cognitive Tutors; Intrusion Detection – Information Security; Finite Element Analysis – Soft Matter

39 ICT Scope Areas of Application (source: NVIDIA): Image Processing, Big Data Mining, Gaming, Advertising, Genetics, Quantum Chemistry, Mathematics, Product Design, Scientific Computing, Computational Finance

41 Science and Entertainment: Science, Art, Computation, Visualization