Nonlinear and Symbolic Data Dependence Testing, by William Blume and Rudolf Eigenmann. Presented by Chen-Yong Cher.



Background
- 1980s-90s: Banerjee test, Omega test
- These cannot handle symbolic and nonlinear expressions
- Examples: i^3, i^2, and c*i where c is not a known constant
- Such expressions often arise after compiler transformations
- Range Test (1998): checks whether certain symbolic inequalities hold
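As a hedged illustration of how a nonlinear subscript can arise from a transformation, the sketch below uses a hypothetical triangular loop (not taken from the slides). A counter `k` is bumped in the inner loop; replacing it with its closed form gives the subscript `i*(i+1)/2 + j`, which is quadratic in `i`. The function names are invented for this example.

```python
def subscripts_with_counter(n):
    """Subscripts touched when A[k] is written and k is incremented each iteration."""
    k = 0
    touched = []
    for i in range(n):
        for j in range(i + 1):      # triangular inner loop: j = 0..i
            touched.append(k)       # stands for the access A[k]
            k += 1
    return touched


def subscripts_closed_form(n):
    """Same accesses after substituting the closed form of k (quadratic in i)."""
    return [i * (i + 1) // 2 + j for i in range(n) for j in range(i + 1)]
```

Both functions enumerate the same subscripts, so the rewritten loop no longer carries `k` from iteration to iteration, but a dependence test now has to reason about an `i^2` term.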

Range Test: high-level view
- Disproves dependences
- There is no dependence carried from iteration i to iteration i+1 if Range(i) does not overlap Range(i+1)
- The ranges do not overlap if max(Range(i)) < min(Range(i+1))
- Key: being able to evaluate these inequalities symbolically
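The overlap check can be sketched numerically. This is only an illustration under stated assumptions: the Range Test itself evaluates the inequality symbolically, whereas this sketch enumerates a fixed iteration space, and the helper names `access_span` and `adjacent_ranges_disjoint` are made up for the example.

```python
def access_span(subscript, i, inner_values):
    """(min, max) subscript touched by outer iteration i across its inner loop."""
    values = [subscript(i, j) for j in inner_values(i)]
    return min(values), max(values)


def adjacent_ranges_disjoint(subscript, outer_values, inner_values):
    """True if max(Range(i)) < min(Range(next i)) for every consecutive pair."""
    outer = list(outer_values)
    for cur, nxt in zip(outer, outer[1:]):
        _, cur_max = access_span(subscript, cur, inner_values)
        nxt_min, _ = access_span(subscript, nxt, inner_values)
        if not cur_max < nxt_min:
            return False
    return True


# Hypothetical subscripts used only for this illustration.
triangular = lambda i, j: i * (i + 1) // 2 + j   # inner loop j = 0..i
sliding = lambda i, j: i + j                     # inner loop j = 0..2
```

For the triangular subscript the check succeeds, because the largest element of one iteration's range is always one less than the smallest element of the next. For `i + j` it fails, since neighbouring iterations share elements.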

Examples
[Figure: number lines marked in multiples of n showing the subscript ranges f(*,*) and g(*,*) for the whole nest, f(i,*) and g(i,*) for i = 0, 1, 2, and single elements such as f(0,1) and g(0,1).]
- Case 1: disprove the dependence with Theorem 1
- Case 2: disprove the dependence with Theorem 2 or Theorem 3
- Case 3: reduce to Case 1 or Case 2 by permuting loops
Array subscripts under test:
  for i
    for j
      A[f(i,j)] = …
      … = A[g(i,j)]

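Theorem 1
If $f_j^{\max}(i_1,\ldots,i_j) < g_j^{\min}(i_1,\ldots,i_j)$ for all $(i_1,\ldots,i_j) \in R_j$, then there is no dependence.
[Figure: number line starting at 0 with the range of f(i1,…,ij) up to its max, followed by the range of g(i1,…,ij) from its min.]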
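A minimal numeric sketch of the Theorem 1 condition, taken over the whole loop nest (no outer indices held fixed). It enumerates a small concrete iteration space instead of reasoning symbolically, and `theorem1_holds` and the example subscripts are hypothetical names chosen for illustration.

```python
from itertools import product


def theorem1_holds(f, g, loop_ranges):
    """True if the largest subscript of f lies below the smallest subscript of g."""
    points = list(product(*loop_ranges))
    f_max = max(f(*p) for p in points)
    g_min = min(g(*p) for p in points)
    return f_max < g_min


n = 8
low_half = lambda i, j: i + j            # touches 0 .. 2n-2
high_half = lambda i, j: 2 * n + i + j   # touches 2n .. 4n-2
```

With these subscripts the two references touch separate regions of the array, so the condition holds in the f->g direction. Swapping the arguments makes it fail, which is why a full test would also check the symmetric condition with f and g exchanged.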

Theorem 2
If $g_j^{\min}(i_1,\ldots,i_j)$ is monotonically non-decreasing in $i_j$, and $f_j^{\max}(i_1,\ldots,i_j) < g_j^{\min}(i_1,\ldots,i_j + \mathit{stride}_j)$ for all $(i_1,\ldots,i_j) \in R_j$ with $\mathit{lower}_j \le i_j \le \mathit{upper}_j - \mathit{stride}_j$, then there is no dependence.
Note: the theorem must be applied in both directions, f->g and g->f.
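The two conditions of Theorem 2 can be checked numerically for a single loop index. This is a sketch under assumptions: `f_max` and `g_min` are supplied as plain Python functions of that index (the real test derives and compares them symbolically), and all names here are hypothetical.

```python
def theorem2_holds(f_max, g_min, lower, upper, stride=1):
    """Check the monotonicity and separation conditions over lower..upper."""
    iterations = range(lower, upper + 1, stride)
    # g_min must not decrease as the loop index advances.
    non_decreasing = all(
        g_min(a) <= g_min(b) for a, b in zip(iterations, iterations[1:])
    )
    # The range of f at index i must end before the range of g at i + stride begins.
    separated = all(
        f_max(i) < g_min(i + stride)
        for i in range(lower, upper - stride + 1, stride)
    )
    return non_decreasing and separated


# Triangular subscript i*(i+1)/2 + j with j = 0..i, used for both references.
tri_max = lambda i: i * (i + 1) // 2 + i
tri_min = lambda i: i * (i + 1) // 2
```

Because `g_min` never decreases, separating each iteration from the next one is enough to separate it from all later iterations, which is what lets the test compare only neighbours.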

Theorem 3
If $g_j^{\min}(i_1,\ldots,i_j)$ is monotonically non-increasing in $i_j$, and $f_j^{\max}(i_1,\ldots,i_j) < g_j^{\min}(i_1,\ldots,i_j - \mathit{stride}_j)$ for all $(i_1,\ldots,i_j) \in R_j$ with $\mathit{lower}_j + \mathit{stride}_j \le i_j \le \mathit{upper}_j$, then there is no dependence.
Note: the theorem must be applied in both directions, f->g and g->f.
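For the non-increasing case, a comparable numeric sketch compares each iteration with the previous one instead of the next. As before, this enumerates concrete values and is not the symbolic procedure; `theorem3_holds` and the descending subscripts are assumptions made for the example.

```python
def theorem3_holds(f_max, g_min, lower, upper, stride=1):
    """Check the non-increasing and separation conditions over lower..upper."""
    iterations = range(lower, upper + 1, stride)
    # g_min must not increase as the loop index advances.
    non_increasing = all(
        g_min(a) >= g_min(b) for a, b in zip(iterations, iterations[1:])
    )
    # The range of f at index i must end before the range of g at i - stride begins.
    separated = all(
        f_max(i) < g_min(i - stride)
        for i in range(lower + stride, upper + 1, stride)
    )
    return non_increasing and separated


# Hypothetical descending blocks of three elements: 100-3i .. 100-3i+2.
desc_max = lambda i: 100 - 3 * i + 2
desc_min = lambda i: 100 - 3 * i
```

Here each iteration touches a block that sits entirely below the block of the preceding iteration, so the check succeeds.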

Permuting Loops for Testing
- For Case 3, all three theorems fail
- Permute the loops to reduce the problem to Case 1 or Case 2
- Does not try all possible permutations
- Tries to move a loop inwards
- Goal: make the range contiguous
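A small sketch of why permutation can help, using a hypothetical strided subscript `i + n*j` that does not come from the slides. With `i` outermost, each iteration's accesses are spread across the array and interleave with its neighbours; with `j` outermost, each iteration touches one contiguous block. The helper name `ranges_separated` is invented for this example.

```python
def ranges_separated(subscript, outer_values, inner_values):
    """True if each outer iteration's span ends before the next one's begins."""
    spans = []
    for o in outer_values:
        touched = [subscript(o, k) for k in inner_values]
        spans.append((min(touched), max(touched)))
    return all(
        prev_max < next_min
        for (_, prev_max), (next_min, _) in zip(spans, spans[1:])
    )


n, m = 4, 3
i_outer = lambda i, j: i + n * j   # original order: i is the outer loop
j_outer = lambda j, i: i + n * j   # permuted order: j is the outer loop
```

In the original order the span of iteration i is `i .. i + n*(m-1)`, which overlaps the next span. After permuting, the span of iteration j is `n*j .. n*j + n-1`, so neighbouring spans are disjoint and the neighbour-comparison theorems become applicable.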

Algorithm
- Refer to the paper

Generalizing the Range Test
- Multidimensional arrays
- Negative strides
- Loop-variant variables
- Loops that are not perfectly nested

Symbolic Range Propagation
- Collects and propagates symbolic ranges of variables
- Two parts:
  - Range propagation algorithm
  - Symbolic expression comparison facility
- Example code segment:
  IF (a < 100) THEN
    {body: know a < 100}
  ELSE IF (a < 200) THEN
    {body: know a < 200 (and a >= 100)}
  {merge point of these two branches: know a < 200}
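The idea of narrowing a range at a branch and widening it again at a merge point can be sketched with plain integer intervals. This is a simplification under stated assumptions: ranges here have numeric bounds (with infinities) instead of symbolic expressions, the variable is an integer, and the helper names are hypothetical.

```python
import math

UNKNOWN = (-math.inf, math.inf)   # nothing known about the variable


def assume_less_than(rng, bound):
    """Range inside a branch guarded by `var < bound` (integer variable)."""
    lo, hi = rng
    return (lo, min(hi, bound - 1))


def assume_at_least(rng, bound):
    """Range on the path where `var < bound` was false, i.e. var >= bound."""
    lo, hi = rng
    return (max(lo, bound), hi)


def merge(a, b):
    """Range at a merge point: the smallest interval covering both inputs."""
    return (min(a[0], b[0]), max(a[1], b[1]))


then_branch = assume_less_than(UNKNOWN, 100)                    # a < 100
not_taken = assume_at_least(UNKNOWN, 100)                       # a >= 100
elif_branch = assume_less_than(not_taken, 200)                  # 100 <= a < 200
fall_through = assume_at_least(not_taken, 200)                  # a >= 200
after_two_branches = merge(then_branch, elif_branch)            # a <= 199
after_all_paths = merge(after_two_branches, fall_through)       # nothing known
```

In this sketch, merging only the two guarded branches leaves the bound `a <= 199`, while also merging the path where neither guard held returns to an unknown range. A comparison facility would then answer queries such as "is a < 200 here?" from the propagated range.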

Conclusions
- Identifies parallel loops effectively
- Handles nonlinear and symbolic expressions
- Is the only dependence test in Polaris
- Parallelizes the Perfect codes as well as hand parallelization
- Acceptable execution time with memoization