GPU Architectural Considerations for Cellular Automata Programming: A comparison of performance between an x86 CPU and an NVIDIA graphics card. Stephen Orchowski

Presentation transcript:

GPU Architectural Considerations for Cellular Automata Programming: A comparison of performance between an x86 CPU and an NVIDIA graphics card. Stephen Orchowski, CSE 520, 12/3/2008

Project Goals: To study the architecture of a GPU. To study a programming model based on that architecture and gain experience using it. To determine how the various architectural features affect performance, and to what degree. To suggest an optimum configuration of a particular algorithm for the selected GPU.

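Cellular Automata: A 2-dimensional grid in which each cell's value in one generation is computed by simple rules from the values of its adjacent neighbors in the previous generation. The grid can start with random or pre-defined patterns. Successive generations evolve into complex patterns. Because every cell in a generation can be updated independently of the others, the computation is highly parallelizable!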
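As a rough illustration of the update described above, here is a minimal serial sketch in Python. The slides do not say which rule set was used, so Conway's Game of Life rules and wrap-around (toroidal) edges are assumptions here, and the function names `count_live_neighbors`, `next_cell`, and `step` are hypothetical. The point is that `next_cell` reads only the previous generation, so every cell could be computed by its own thread.

```python
def count_live_neighbors(grid, row, col):
    """Count live cells among the 8 neighbors of (row, col).

    Assumption: the grid wraps around at its edges (toroidal boundary).
    """
    rows, cols = len(grid), len(grid[0])
    total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the cell itself
            total += grid[(row + dr) % rows][(col + dc) % cols]
    return total


def next_cell(grid, row, col):
    """Return the next-generation value of one cell.

    Assumption: Conway's Game of Life rules (survive on 2 or 3 live
    neighbors, birth on exactly 3). Only the old grid is read, so calls
    for different cells are independent of each other.
    """
    live = count_live_neighbors(grid, row, col)
    if grid[row][col] == 1:
        return 1 if live in (2, 3) else 0
    return 1 if live == 3 else 0


def step(grid):
    """Compute one generation into a new grid, leaving the old one untouched."""
    rows, cols = len(grid), len(grid[0])
    return [[next_cell(grid, r, c) for c in range(cols)] for r in range(rows)]


# A pre-defined starting pattern: a vertical "blinker" on a 5x5 grid.
blinker = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
next_generation = step(blinker)
```

On a GPU, the two nested loops in `step` would typically be replaced by one thread per cell, each thread doing the work of `next_cell`; writing into a separate output grid is what keeps those threads from interfering with each other.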

Results…so far…

Conclusions…so far… GPUs offer a speedup over a serial processor, but there is no "silver bullet" among programming techniques. Developers have to tune the program to get maximum performance out of the architecture. The return on programming effort doesn't always justify use of the GPU: implementing CA on a GPU probably isn't worth the effort unless the grid is very large. CA does, however, have applications to more complex algorithms in Computational Fluid Dynamics, and it provides a good framework for studying the architecture. Different versions of a GPU will necessitate further tweaks to the program, even when the cards belong to the same GPU hardware family. CUDA vs. Cell programming and other architectures: what are the tradeoffs?