Dependence Analysis: Important and Difficult


Dependence Analysis: Important and Difficult
- Why: to parallelize programs and to optimize the memory hierarchy
- Scope: loops --- and non-loops?
- Application: program transformation
  - Should yield "better" code
  - Should NOT alter code semantics

Which Loop(s) Are Parallelizable?

1) Do I = 1, n
     A[I] = 5 * B[I] + A[I]
   Enddo

2) Do I = 1, n
     A[I-1] = 5 * B[I] + A[I]
   Enddo

3) Do I = 1, n
     temp = 5 * B[I]
     A[I] = temp
   Enddo

4) Do I = 1, n
     A[I+1] = 5 * B[I] + A[I]
   Enddo
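One way to probe loops 1 and 4 empirically (a sketch, not a substitute for dependence analysis): run the loop body in forward and in reversed iteration order and compare results. If reordering changes the result, the loop carries a dependence. This uses 0-based indexing instead of the slides' 1-based Fortran style, and the array contents are made up for illustration; loops 2 and 3 are omitted here because their anti- and scalar dependences need renaming/privatization arguments, not just reordering.

```python
def run(body, a, b, order):
    """Execute the loop body over the given iteration order on a copy of a."""
    a = list(a)
    for i in order:
        body(a, b, i)
    return a

def loop1(a, b, i):   # A[I] = 5*B[I] + A[I]: no loop-carried dependence
    a[i] = 5 * b[i] + a[i]

def loop4(a, b, i):   # A[I+1] = 5*B[I] + A[I]: loop-carried true dependence
    a[i + 1] = 5 * b[i] + a[i]

n = 6
a0 = list(range(10, 10 + n + 1))  # one extra slot so A[I+1] stays in bounds
b0 = list(range(1, n + 1))

same1 = run(loop1, a0, b0, range(n)) == run(loop1, a0, b0, reversed(range(n)))
same4 = run(loop4, a0, b0, range(n)) == run(loop4, a0, b0, reversed(range(n)))
print(same1, same4)  # True False
```

Loop 1 gives the same answer in any order, so its iterations can run in parallel; loop 4 does not, because iteration I+1 reads the A[I+1] that iteration I wrote.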

Data Dependence
Statement S2 is dependent on S1 iff:
- S2 follows S1 in execution
- S2 and S1 access the same memory location, M
- at least one of S1, S2 writes M

Dependence "types":
- True (or flow) dependence -- S2 reads M set by S1 (read-after-write)
- Anti-dependence -- S2 writes M read by S1 (write-after-read)
- Output dependence -- S1 and S2 both write M (write-after-write)

Dependence graph: nodes are statements, edges are the dependences between them
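The three types translate directly into checks over each statement's read and write sets. A minimal sketch (the encoding of a statement as a (reads, writes) pair of location sets is my own, for illustration):

```python
def dependence_type(s1, s2, loc):
    """Classify the dependence of s2 on s1 with respect to location loc.
    Each statement is a (reads, writes) pair of sets of locations;
    s2 is assumed to follow s1 in execution order."""
    reads1, writes1 = s1
    reads2, writes2 = s2
    if loc in writes1 and loc in reads2:
        return "true"    # read-after-write (flow)
    if loc in reads1 and loc in writes2:
        return "anti"    # write-after-read
    if loc in writes1 and loc in writes2:
        return "output"  # write-after-write
    return None          # no dependence (e.g. read-after-read)

# S1: a = 3.14        -> writes {a}
# S3: c = a * b * b   -> reads {a, b}, writes {c}
s1 = (set(), {"a"})
s3 = ({"a", "b"}, {"c"})
print(dependence_type(s1, s3, "a"))  # true
```

If several conditions hold for the same pair, this sketch reports only the first; a real analyzer would record all of them.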

Example
S1: a = 3.14
S2: b = 5.0
S3: c = a * b * b

Dependence graph: S1 -> S3 and S2 -> S3 (true dependences: S3 reads the a and b written by S1 and S2)

Example
S1: a = 3.14
S2: b = 5.0
S3: c = a * b * b
S4: b = a * c

Dependence graph: true dependences S1 -> S3, S2 -> S3, S1 -> S4, S3 -> S4 (on c); anti-dependence S3 -> S4 (S3 reads b, S4 writes it); output dependence S2 -> S4 (both write b)
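The second example's graph can be computed mechanically from the statements' read and write sets, pairing every earlier statement with every later one. A sketch (the (name, reads, writes) encoding is mine):

```python
stmts = [
    ("S1", set(), {"a"}),       # a = 3.14
    ("S2", set(), {"b"}),       # b = 5.0
    ("S3", {"a", "b"}, {"c"}),  # c = a * b * b
    ("S4", {"a", "c"}, {"b"}),  # b = a * c
]

edges = []
for i, (n1, r1, w1) in enumerate(stmts):
    for n2, r2, w2 in stmts[i + 1:]:         # n2 follows n1 in execution
        for loc in w1 & r2:
            edges.append((n1, n2, "true", loc))
        for loc in r1 & w2:
            edges.append((n1, n2, "anti", loc))
        for loc in w1 & w2:
            edges.append((n1, n2, "output", loc))

for e in edges:
    print(e)
```

This reproduces all six edges listed above, including the anti-dependence S3 -> S4 and the output dependence S2 -> S4 on b.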

Program Order and Dependence
- The sequential order of imperative programs is too restrictive
- Only the partial order of dependences is needed to ensure "correctness"
- A reordering transformation must not violate any dependence
- Can we always determine dependence?

Dependence in Loops
- Parameterize statements by the loop iterations in which they execute
- Loop nesting level: number of surrounding loops + 1
- Iteration number: the value of the iteration (index) variable
- Iteration vector I: the vector of index values of all enclosing loops at a particular execution of a statement
- Iteration space: the set of all possible iteration vectors for a statement
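For a concrete picture (the example nest is mine): a statement S inside the nest Do I = 1, 2 / Do J = 1, 3 has nesting level 2, iteration vectors (I, J), and an iteration space of six points, which sequential execution visits in lexicographic order:

```python
from itertools import product

# Iteration space of S in:
#   Do I = 1, 2
#     Do J = 1, 3
#       S
# Each element is an iteration vector (I, J); product() enumerates them
# in the same lexicographic order the sequential loop nest executes them.
iteration_space = list(product(range(1, 3), range(1, 4)))
print(iteration_space)
# [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3)]
```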