1/16 CALCULATING PREFIX SUMS Vladimir Jocović 2012/0011.

Presentation transcript:

1/16 CALCULATING PREFIX SUMS Vladimir Jocović 2012/0011

2/16 WHAT ARE ALL-PREFIX SUMS? The all-prefix-sums operation takes a binary associative operator ⊕ and an ordered set of n elements [a0, a1, ..., an−1], and returns the ordered set [a0, (a0 ⊕ a1), ..., (a0 ⊕ a1 ⊕ ... ⊕ an−1)]. This is the inclusive type.
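The definition above can be sketched sequentially in a few lines of Python. This is only an illustration of the operation, not the presentation's kernel; the function name `inclusive_scan` and its parameters are assumptions made for this example.

```python
from operator import add, mul
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def inclusive_scan(values: Sequence[T], op: Callable[[T, T], T]) -> List[T]:
    """Return [a0, a0 op a1, ..., a0 op a1 op ... op a(n-1)].

    `op` is assumed to be associative, as the definition requires.
    (Hypothetical helper, named for this sketch only.)
    """
    result: List[T] = []
    for i, value in enumerate(values):
        # The first output is a0 itself; every later output combines
        # the previous output with the current element.
        result.append(value if i == 0 else op(result[-1], value))
    return result


# Any associative operator works: addition, multiplication, max, concatenation.
running_sum = inclusive_scan([1, 2, 3, 4], add)
running_product = inclusive_scan([1, 2, 3, 4], mul)
running_max = inclusive_scan([2, 5, 1, 7, 3], max)
running_concat = inclusive_scan(["a", "b", "c"], add)
```

Associativity is what later allows the partial results to be regrouped and computed in parallel; the sequential loop here does not depend on it.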

3/16 WHAT ARE ALL-PREFIX SUMS? Example: when the operation ⊕ is addition, the input array [3, 1, 7, 0, 4, 1, 6, 3] returns the output array [3, 4, 11, 11, 15, 16, 22, 25].
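The example can be checked with Python's standard `itertools.accumulate`, which computes the inclusive type. The exclusive variant shown next to it is not on the slides; it is included here only for contrast, and assumes 0 as the identity element for addition.

```python
from itertools import accumulate

values = [3, 1, 7, 0, 4, 1, 6, 3]

# Inclusive: output i includes input i.
inclusive = list(accumulate(values))

# Exclusive (added for contrast): output i covers inputs 0..i-1 only,
# so the result is shifted right by one and starts with the identity 0.
exclusive = [0] + inclusive[:-1]

print(inclusive)  # [3, 4, 11, 11, 15, 16, 22, 25]
print(exclusive)  # [0, 3, 4, 11, 11, 15, 16, 22]
```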

4/16 WHERE ARE ALL-PREFIX SUMS USED? To lexically compare strings of characters. To evaluate polynomials. In sorting algorithms (radix sort, quicksort).
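As one illustration of the sorting use case, the sketch below shows where a prefix sum appears inside a least-significant-digit radix sort: the exclusive prefix sum of the digit histogram gives the first output position for each digit. The function names, the 4-bit digit width, and the 32-bit key width are assumptions chosen for this example; keys are assumed to be non-negative integers.

```python
from itertools import accumulate
from typing import List


def counting_pass(keys: List[int], shift: int, bits: int = 4) -> List[int]:
    """Stable pass that orders keys by one `bits`-wide digit."""
    radix = 1 << bits
    mask = radix - 1

    # Histogram: how many keys have each digit value.
    counts = [0] * radix
    for key in keys:
        counts[(key >> shift) & mask] += 1

    # Exclusive prefix sum of the histogram: offsets[d] is the index
    # where the first key with digit d must be written.
    offsets = [0] + list(accumulate(counts))[:-1]

    out = [0] * len(keys)
    for key in keys:
        digit = (key >> shift) & mask
        out[offsets[digit]] = key
        offsets[digit] += 1
    return out


def radix_sort(keys: List[int], key_bits: int = 32, bits: int = 4) -> List[int]:
    """LSD radix sort for non-negative integers below 2**key_bits."""
    for shift in range(0, key_bits, bits):
        keys = counting_pass(keys, shift, bits)
    return keys


sorted_keys = radix_sort([170, 45, 75, 90, 802, 24, 2, 66])
```

In a parallel setting the histogram and the scatter step can be distributed, and the prefix sum is the step that ties the per-digit counts together.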

5/16 WHAT DOES THE HARDWARE LOOK LIKE? Graph representing PrefixSumKernel, with these labels:
Io.input(“x”, type, …
Io.output(“z”, type, …
result = x + (cnt < loopVal ? 0 : sum);
Storing partial sum
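The kernel expression on this slide can be modelled in software under one possible reading. The slides do not define `sum`, `cnt`, or `loopVal`, so the following is an assumption: `cnt` counts input elements from 0, `loopVal` is the length of the feedback loop in elements, and `sum` is the result produced `loopVal` elements earlier. The function name `kernel_model` is hypothetical.

```python
from typing import List


def kernel_model(xs: List[int], loop_val: int = 1) -> List[int]:
    """Software model of: result = x + (cnt < loopVal ? 0 : sum).

    ASSUMPTION: `sum` is the result from `loop_val` elements earlier.
    This is an interpretation of the slide, not the author's kernel.
    """
    if loop_val < 1:
        raise ValueError("loop_val must be at least 1")
    results: List[int] = []
    for cnt, x in enumerate(xs):
        # During the first loop_val elements no partial sum exists yet.
        partial = 0 if cnt < loop_val else results[cnt - loop_val]
        results.append(x + partial)
    return results


xs = [3, 1, 7, 0, 4, 1, 6, 3]
plain = kernel_model(xs, loop_val=1)        # ordinary inclusive prefix sum
interleaved = kernel_model(xs, loop_val=2)  # two interleaved running sums
```

Under this reading, a loop length of 1 gives the ordinary prefix sum, while a longer loop yields `loop_val` independent running sums over interleaved elements, which would then need a further combining step.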

6/16 WHAT DOES THE HARDWARE LOOK LIKE? Graph representing PrefixSumKernel at its final step

7/16 WHAT DOES THE HARDWARE LOOK LIKE? Manager graph

8/16 ALGORITHM

9/16 ALGORITHM

10/16 ALGORITHM

11/16 ALGORITHM

12/16 KERNEL CODE

13/16 BUILD AND RUN

14/16 CONCLUSION Poor Maxeler results? This was just a simulation, not real hardware.

15/16 REFERENCES
Varbanescu, Ana Lucia, "Dataflow Programming with MaxCompiler," Delft University of Technology, Netherlands.
Blelloch, Guy E., "Prefix Sums and Their Applications," Carnegie Mellon University, USA.
Milutinovic, V., editor, "Computer Architecture" (Chapter 9, DataFlow Computation, Dennis, J.), North Holland.
Milutinovic, V., Salom, J., Trifunovic, N., Giorgi, R., "Guide to DataFlow SuperComputing," Springer.
Milutinovic, V., editor, "Advances in Computers: DataFlow," Elsevier, 2015.

16/16 QUESTIONS AND ANSWERS