CSCI 4125 Programming for Performance
Andrew Rau-Chaplin
firstname.lastname@example.org
www.cs.dal.ca/~arc
Course Objectives
- Explore techniques for designing, implementing, and evaluating efficient programs for sequential computers, Shared-Memory Multiprocessors, and Distributed Memory Multicomputers.
- Make it go fast!
Performance-oriented dev cycle
Techniques and tools for a performance-oriented development cycle:
- Algorithm design
- Implementation
- Benchmarking/evaluation
- Performance tuning
Quantifying performance
Themes include:
- evaluation of performance
- design of test data sets
- issues of stability/reliability
- scalability
- common performance-enhancing techniques
- parallel algorithm design techniques
- identification and elimination of dependencies
Skills Development
- how to design experiments/benchmarks
- how to use statistics in performance evaluation
- how to instrument code to obtain reliable timings
- how to use compiler switches
- how to use a profiler and performance tuning tools
- how to use a debugger/tracing tools
- how to plot performance results
Topics
- Introduction to Parallelism
- Parallel Programming
- Parallel Architectures
- Parallel Algorithms
- Parallel Applications
- Other Parallel Architectures & Algorithms
Official Outline
This course explores the design, implementation, and evaluation of computer programs for applications in which performance is a central issue. In the sequential and multi-core settings, it explores topics such as profiling, cache effects, I/O performance, floating-point issues, multi-threading, and performance tuning techniques. It introduces techniques for the design, implementation, and evaluation of programs for Multicore processors, Shared-Memory Multiprocessors (SMPs), and Distributed Memory Multicomputers (Clusters).
Resources
- Course web page: www.cs.dal.ca/~arc/teaching/CSc4125
  - All notes, readings, assignments
- Parallel Machines
  - Your laptop!
  - CGM6 & CGM7
  - Hugh
Readings
- Sorry, no textbook!
- Readings will be assigned.
Books
- Introduction to High Performance Computing for Scientists and Engineers, by Georg Hager and Gerhard Wellein
- Parallel Programming, by Peter Pacheco (Morgan Kaufmann)
- Structured Parallel Programming, by Michael McCool, Arch D. Robison, and James Reinders
- Parallel Programming in C with MPI and OpenMP, by Quinn
- Parallel Programming with Intel Parallel Studio XE, by S. Blair-Chappell and A. Stokes
- Using OpenMP: Portable Shared Memory Parallel Programming, by Barbara Chapman, Gabriele Jost, and Ruud van der Pas
- Parallel Programming in OpenMP, by Rohit Chandra, Dave Kohr, Jeff McDonald (Morgan Kaufmann)
Prerequisites
- Knowledge of C
- CSCI 3120: Operating Systems
- Good to have: CSCI 3110, Analysis of Algorithms
Course Evaluation
- Assignments: 50%
- Midterm: 25%
- Final Project: 20%
- Participation: 5%
See course web page for assignment copies and due dates.
Assignments
Selected from:
- Sequential Optimization
- OpenMP
- Cilk
- Threading Building Blocks
- MPI
- Hadoop
- CUDA/OpenCL
Best 4 out of 5 count towards final grade!
Midterm
- About two-thirds of the way through the course
- Tests conceptual knowledge gained from classes and readings
- If you have not done the readings, you will not pass the midterm
Final Project
- Select your own topic
- Either:
  - optimize an existing codebase, or
  - design and implement an efficient new code
- Components: literature/code review, some research or programming work, final paper, presentation
- Main deliverable: conference-style paper plus short in-class talk
Questions
- Why are you taking this course?
- Which performance-oriented technologies are you interested in?
- How will you know if the course has been a success for you?