Combining the strengths of UMIST and The Victoria University of Manchester
Matthew Livesey, Hemanth John Jose and Yongping Men
COMP60611 – Patterns of parallel programming

Presentation transcript:

Task parallelism

Turning a serial program into a parallel one by understanding the dependencies between the tasks in the program.
- Traditionally, tasks were mapped statically, but dynamic mapping algorithms may improve performance.
- It may be possible to re-organise a serial program to reduce dependencies, but care is needed that the re-organisation does not hurt the performance of the algorithm.
- Scheduling algorithms which focus on the critical path perform best.

For example, in the fragment below the first two statements are independent and can execute in parallel, while the third depends on both:

a = b + c;
d = e + f;
z = a + d;
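A minimal sketch of that dependency structure in Python (not from the original slides; the add helper and its argument values are invented for illustration), using concurrent.futures to run the two independent tasks in parallel:

from concurrent.futures import ThreadPoolExecutor

def add(x, y):
    # Stands in for an arbitrary task in the task graph.
    return x + y

with ThreadPoolExecutor(max_workers=2) as pool:
    fut_a = pool.submit(add, 1, 2)   # a = b + c: independent task
    fut_d = pool.submit(add, 3, 4)   # d = e + f: independent task
    # z = a + d lies on the critical path: it cannot start until
    # both independent tasks have finished.
    z = add(fut_a.result(), fut_d.result())

print(z)  # 10

A ThreadPoolExecutor keeps the sketch short; on CPython, CPU-bound tasks would need a ProcessPoolExecutor to run truly in parallel.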

Pipeline parallelism

A stream of data items is processed one after another by a sequence of threads, each thread performing one stage of the processing before passing the item on to the next.

Figure 4 from Pautasso and Alonso 2006 [9]
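As an illustration (not part of the original slides), a minimal two-stage pipeline in Python: each stage runs in its own thread, items flow through a queue, and a None sentinel marks the end of the stream. The stage functions and the squaring workload are invented for the example.

import threading
import queue

q12 = queue.Queue()  # connects stage 1 to stage 2

def stage1(items, out_q):
    # First pipeline stage: square each item and pass it on.
    for x in items:
        out_q.put(x * x)
    out_q.put(None)  # sentinel: no more items

def stage2(in_q):
    # Second pipeline stage: consume items until the sentinel arrives.
    while True:
        item = in_q.get()
        if item is None:
            break
        print(item)

t1 = threading.Thread(target=stage1, args=(range(5), q12))
t2 = threading.Thread(target=stage2, args=(q12,))
t1.start(); t2.start()
t1.join(); t2.join()

Both stages run concurrently, so stage 2 can start consuming while stage 1 is still producing; with more stages, each item moves down the chain one stage at a time.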

Data parallelism

General-purpose programming on graphics processors (GPGPU): exploiting the parallel power of graphics chips to execute data-parallel algorithms, e.g. with OpenCL.

OpenCL:
- Allows parallel programs to run on whatever (supported) hardware is available, e.g. a multi-core CPU and a GPU.
- Provides bindings in many popular languages.
- Cross-platform, open standard.

MapReduce
- Based on a paper from Google [2]; designed for processing large data sets in parallel.
- Data is processed in parallel by "mappers", and the results are recombined by "reducers" (see the sketch after this slide).
- Distributed parallelism across heterogeneous, commodity hardware.
- Requires the problem at hand to fit the map-reduce pattern.

Combining patterns
It is possible to combine patterns: for example, a task-graph approach may guide the order in which pipeline steps are applied. However, data parallelism and task parallelism are two different ways of looking at a parallel problem. They may be applied hierarchically or sequentially, but they represent two distinct approaches.
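To make the MapReduce pattern concrete, a minimal word-count sketch in Python (this illustrates the pattern only, not the Hadoop API; the documents list and the mapper/reducer helpers are invented):

from collections import Counter
from multiprocessing import Pool

documents = ["the quick brown fox", "the lazy dog", "the fox"]

def mapper(doc):
    # Map phase: emit a partial word count for one document.
    return Counter(doc.split())

def reducer(partials):
    # Reduce phase: recombine the mappers' partial results.
    total = Counter()
    for c in partials:
        total += c
    return total

if __name__ == "__main__":
    with Pool() as pool:
        partials = pool.map(mapper, documents)  # mappers run in parallel
    print(reducer(partials))
    # Counter({'the': 3, 'fox': 2, 'quick': 1, 'brown': 1, 'lazy': 1, 'dog': 1})

A real framework such as Hadoop adds what this sketch omits: distributing the mappers and reducers across many machines, moving the data to them, and recovering from failures.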

Presentation and report available at:

References:

1. Michael D. McCool (Intel). Structured Parallel Programming with Deterministic Patterns. To appear in HotPar '10: 2nd USENIX Workshop on Hot Topics in Parallelism, June 2010, Berkeley, CA.
2. Jeffrey Dean and Sanjay Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. OSDI '04: Sixth Symposium on Operating System Design and Implementation, San Francisco, CA, December 2004.
3. The Apache Software Foundation. Hadoop Map Reduce Tutorial. [Accessed 15th October 2010]
4. David Tarditi et al. Accelerator: Using Data Parallelism to Program GPUs for General-Purpose Uses. Microsoft Research.
5. Nickolls et al. Scalable Parallel Programming. ACM Queue, Volume 6, Issue 2 (March/April 2008).
6. Aaftab Munshi. OpenCL: Parallel Computing on the GPU and CPU. SIGGRAPH 2008: Beyond Programmable Shading (presentation).
7. A. Grama et al. Introduction to Parallel Computing. 2nd Ed. Addison-Wesley.
8. Thomas Rauber and Gudula Rünger. Parallel Programming for Multicore and Cluster Systems. 1st edition, Springer. Page 280.
9. Pautasso, C. and Alonso, G. Parallel Computing Patterns for Grid Workflows. WORKS '06: Workshop on Workflows in Support of Large-Scale Science, June 2006.
10. C. Charles Law, William J. Schroeder, Kenneth M. Martin and Joshua Temkin. A Multi-Threaded Streaming Pipeline Architecture for Large Structured Data Sets. VIS '99: Proceedings of the Conference on Visualization '99: Celebrating Ten Years, 1999.
11. R. Cytron, M. Hind and W. Hsieh. Automatic Generation of DAG Parallelism. Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, June 1989. SIGPLAN Notices, Volume 24, Number 7.
12. Y. Kwok and I. Ahmad. Benchmarking and Comparison of the Task Graph Scheduling Algorithms. Journal of Parallel and Distributed Computing, 59 (1999).