PSUCS322 HM 1 Languages and Compiler Design II IR Code Generation I Material provided by Prof. Jingke Li Stolen with pride and modified by Herb Mayer PSU.


PSUCS322 HM 1 Languages and Compiler Design II IR Code Generation I Material provided by Prof. Jingke Li Stolen with pride and modified by Herb Mayer PSU Spring 2010 rev.: 4/16/2010

PSUCS322 HM 2 Agenda
    Grammar G1
    CodeGen Overview
    Arithmetic Expression Translation
    Boolean Expression Translation
    Various Statement Translations

PSUCS322 HM 3 Grammar G1
Input: AST representation of MINI source
Output: Three-address code or IR tree code
Approach: Syntax-directed translation

Generic Grammar G1, start symbol: S
    E -> E arithop E | E relop E | E logicop E
    E -> ‘-’ E | ‘!’ E
    E -> ‘newArray’ E                 // new int array, size given by E
    E -> E ‘[’ E ‘]’                  // indexed element
    E -> id | num                     // end-nodes
    S -> E ‘:=’ E ‘;’                 // assignment statement
    S -> ‘if’ ‘(’ E ‘)’ ‘then’ S ‘else’ S
    S -> ‘while’ ‘(’ E ‘)’ S
    S -> ‘print’ E ‘;’
    S -> ‘return’ E ‘;’

PSUCS322 HM 4 CodeGen Overview
Arithmetic Expressions:
  – preserve precedence and associativity
  – check whether the language requires a test for zero-divide
Boolean Expressions:
  – define short-circuit evaluation vs. complete evaluation
  – discern bit-wise vs. logical and, or, xor
  – multiple unary NOT allowed?
Array definition:
  – 1D is simple for the compiler
  – per dimension, record: element size, low-bound, high-bound, total size, index type and type
Array element reference:
  – L-value or r-value?
  – nested array reference: the index expression can in turn be an array element
  – discern pass by value, by reference, or other
Statements:
  – Goto into another scope or out of the current scope is an issue in FTN, C
  – Return: long-jump in C is non-trivial
Parameters:
  – function parameters in PL/I and Pascal are hard
  – easy to confuse pointer type parameters with reference parameters (in C)

PSUCS322 HM 5 Arithmetic Expression Translation
Generate three-address code: get a new temp per operation
  – E.s holds the statements that evaluate E
  – E.t is the temp that holds E’s value

E -> E1 arithop E2
    t = new Temp();
    E.s := [ E1.s; E2.s; t := E1.t arithop E2.t; ]
    E.t := t;

E -> unaryop E1
    t = new Temp();
    E.s := [ E1.s; t := unaryop E1.t; ]
    E.t := t;
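The two rules above can be sketched in a few lines of Python. This is only an illustration under assumed conventions: expressions are nested tuples such as `("binop", op, e1, e2)` and `("unop", op, e1)`, leaves are plain strings, and the helper names `make_gen`, `new_temp`, and `gen` are hypothetical, not part of the course material.

```python
import itertools

def make_gen():
    """Return a generator function with its own temp counter."""
    counter = itertools.count(1)

    def new_temp():
        return f"t{next(counter)}"

    def gen(e):
        """Return (E.s, E.t): the statement list and the place holding E's value."""
        if isinstance(e, str):            # id or num: no code, the name is the place
            return [], e
        if e[0] == "unop":                # E -> unaryop E1
            _, op, e1 = e
            s1, p1 = gen(e1)
            t = new_temp()
            return s1 + [f"{t} := {op}{p1}"], t
        _, op, e1, e2 = e                 # E -> E1 arithop E2
        s1, p1 = gen(e1)
        s2, p2 = gen(e2)
        t = new_temp()
        return s1 + s2 + [f"{t} := {p1} {op} {p2}"], t

    return gen

# Sample expression: b * -c + b * d / e
expr = ("binop", "+",
        ("binop", "*", "b", ("unop", "-", "c")),
        ("binop", "/", ("binop", "*", "b", "d"), "e"))
code, place = make_gen()(expr)
print("\n".join(code))
```

Because the AST already encodes precedence and associativity, the generator only needs a post-order walk; the temp is allocated after both operands have been translated so that temps are numbered in evaluation order.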

PSUCS322 HM 6 Arithmetic Expression Translation, Cont’d
To generate IR trees, embed the expression subtrees into the current root. Attribute E.tr holds the IR tree for E.

E -> E1 arithop E2
    E.tr := ( BINOP arithop E1.tr E2.tr )

E -> unaryop E1
    E.tr := ( UNOP unaryop E1.tr null )

Example:
    b * -c + b * d / e        // assume left-to-right associativity of * / %
=>
    t1 := -c
    t2 := b * t1
    t3 := b * d
    t4 := t3 / e
    t5 := t2 + t4
=>
    (BINOP + (BINOP * b (UNOP - c ) ) (BINOP / (BINOP * b d ) e ) )

The tree form puts the operator before its operands, similar to Polish (prefix) notation, after Łukasiewicz, 1920s.
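A minimal sketch of the IR tree construction, printing the tree in the parenthesized form used on the slide. The tuple encoding of the AST and the function name `ir_tree` are assumptions made for this illustration; the `null` placeholder of the UNOP rule is omitted, as in the slide's own example.

```python
def ir_tree(e):
    """Return the IR tree of expression e as a parenthesized string."""
    if isinstance(e, str):                # id or num leaf
        return e
    if e[0] == "unop":                    # E -> unaryop E1
        _, op, e1 = e
        return f"(UNOP {op} {ir_tree(e1)})"
    _, op, e1, e2 = e                     # E -> E1 arithop E2
    return f"(BINOP {op} {ir_tree(e1)} {ir_tree(e2)})"

# b * -c + b * d / e, with * and / grouped left to right
expr = ("binop", "+",
        ("binop", "*", "b", ("unop", "-", "c")),
        ("binop", "/", ("binop", "*", "b", "d"), "e"))
tree = ir_tree(expr)
print(tree)
```

Unlike the three-address version, no temps are needed here: each subtree is simply embedded as an operand of its parent.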

PSUCS322 HM 7 Boolean Expression Translation
Rely on target machine with conditional branches:
  – condition can be part of the instruction
  – or condition can be inquired by using machine flags
  – or condition can be evaluated separately (canonical execution) and then be provided as one of the arguments
  – operands are: condition, target address, and *+1 (the fall-through address)
CodeGen uses temps for intermediate booleans
Or CodeGen uses flow of control, so “code locations” imply the state of boolean subexpressions
Or a combination of both
Target computer may provide boolean or even bitwise operations:
  – And
  – Or
  – Xor
  – Not, etc.

PSUCS322 HM 8 Boolean Expression Translation, Quads
Relational operations need to record their result, e.g. in machine flags. Logical operations can be realized through computation proper or via control flow. See sample expression:
    a < 5 || b > 2             // source
=> Pure code mapping, with logical and, or, xor instruction:
    t1 := a < 5                // e.g. encode 0 as false, 1 as true
    t2 := b > 2
    t3 := t1 || t2             // generate jump out if t3 is false
=> Control flow mapping, w/o logical and, or, xor:
        t1 := 1                // guess t1 is true, override if needed
        if a < 5 goto l1       // could be quad: Cond_jump_if_less
        t1 := 0                // guess was wrong, override to false
    l1: t2 := 1                // guess: set t2 to true initially
        if b > 2 goto l2
        t2 := 0                // wrong guess, set t2 to false
    l2: t3 := 1                // final guess
        if t1 goto l3
        if t2 goto l3
        t3 := 0                // final guess computed as false
    l3:                        // use t3

PSUCS322 HM 9 Better Representation of Booleans, and IR
Use the target machine’s native logical operations for and, or, xor; again for the sample expression a < 5 || b > 2:
        t1 := 1
        if a < 5 goto l1
        t1 := 0
    l1: t2 := 1
        if b > 2 goto l2
        t2 := 0
    l2: t3 := t1 or t2         // use t3

IR tree:
    (MOVE t3
      (BINOP ||
        (ESEQ [ [MOVE t1 (CONST 1)]
                [CJUMP < (NAME a) (CONST 5) l1]
                [MOVE t1 (CONST 0)]
                [LABEL l1] ] t1)
        (ESEQ [ [MOVE t2 (CONST 1)]
                [CJUMP > (NAME b) (CONST 2) l2]
                [MOVE t2 (CONST 0)]
                [LABEL l2] ] t2)))

PSUCS322 HM 10 Value Representation, Relational
E -> E1 relop E2

Three-Address Code:
    L := new Label(); t := new Temp();
    E.s := [ E1.s; E2.s;
             t := 1;
             if ( E1.t relop E2.t ) goto L;
             t := 0;
             L: ... ]
    E.t := t;

IR Tree Code:
    L := new NAME(); t := new TEMP();
    E.tr := ( ESEQ [ [MOVE t (CONST 1)]
                     [CJUMP relop E1.tr E2.tr L]
                     [MOVE t (CONST 0)]
                     [LABEL L] ] t )

PSUCS322 HM 11 Value Representation, Three-Address Code
E -> E1 ‘||’ E2
    L := new Label(); t := new Temp();
    E.s := [ E1.s; E2.s;
             t := 1;
             if ( E1.t == 1 ) goto L;
             if ( E2.t == 1 ) goto L;
             t := 0;
             L: ... ]
    E.t := t;

E -> E1 ‘&&’ E2
    L := new Label(); t := new Temp();
    E.s := [ E1.s; E2.s;
             t := 0;
             if ( E1.t == 0 ) goto L;
             if ( E2.t == 0 ) goto L;
             t := 1;
             L: ... ]
    E.t := t;

E -> ‘!’ E1
    t := new Temp();
    E.s := [ E1.s; t := 1 - E1.t; ]
    E.t := t;
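The value-representation rules for relational and logical operators can be sketched as one small generator. The class name `ValueGen`, the tuple encoding `(op, e1, e2)` or `("!", e1)`, and the choice to allocate the temp and label after translating the operands are assumptions of this sketch, chosen so the output is numbered in evaluation order.

```python
class ValueGen:
    """Translate boolean expressions into three-address code that computes 0 or 1."""

    def __init__(self):
        self.ntemp = 0
        self.nlabel = 0

    def new_temp(self):
        self.ntemp += 1
        return f"t{self.ntemp}"

    def new_label(self):
        self.nlabel += 1
        return f"L{self.nlabel}"

    def gen(self, e):
        """Return (E.s, E.t) for expression e."""
        if isinstance(e, str):                      # id or num leaf
            return [], e
        op = e[0]
        if op == "!":                               # E -> '!' E1
            s1, p1 = self.gen(e[1])
            t = self.new_temp()
            return s1 + [f"{t} := 1 - {p1}"], t
        s1, p1 = self.gen(e[1])
        s2, p2 = self.gen(e[2])
        t, lab = self.new_temp(), self.new_label()
        if op in ("||", "&&"):
            # '||' guesses 1 and keeps it if either operand is 1;
            # '&&' guesses 0 and keeps it if either operand is 0.
            hit, miss = (1, 0) if op == "||" else (0, 1)
            body = [f"{t} := {hit}",
                    f"if {p1} == {hit} goto {lab}",
                    f"if {p2} == {hit} goto {lab}",
                    f"{t} := {miss}"]
        else:                                       # E -> E1 relop E2
            body = [f"{t} := 1",
                    f"if {p1} {op} {p2} goto {lab}",
                    f"{t} := 0"]
        return s1 + s2 + body + [f"{lab}:"], t

expr = ("||", ("<", "a", "5"), (">", "b", "2"))
code, place = ValueGen().gen(expr)
print("\n".join(code))
```

Note that both operands are always evaluated here; this is complete evaluation, not short-circuit evaluation.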

PSUCS322 HM 12 Value Representation, IR Tree Code
E -> E1 ‘||’ E2
    L = new NAME(); t = new TEMP();
    E.tr := (ESEQ [ [MOVE t (CONST 1)]
                    [CJUMP == E1.tr (CONST 1) L]
                    [CJUMP == E2.tr (CONST 1) L]
                    [MOVE t (CONST 0)]
                    [LABEL L] ] t )

E -> E1 ‘&&’ E2
    L = new NAME(); t = new TEMP();
    E.tr := (ESEQ [ [MOVE t (CONST 0)]
                    [CJUMP == E1.tr (CONST 0) L]
                    [CJUMP == E2.tr (CONST 0) L]
                    [MOVE t (CONST 1)]
                    [LABEL L] ] t )

E -> ‘!’ E1
    t = new TEMP();
    E.tr := (ESEQ [MOVE t (BINOP - (CONST 1) E1.tr)] t )

PSUCS322 HM 13 Control-Flow Mapping, Long If Version
Booleans are used in programs to direct flow of control, e.g.
    if ( a < 5 || b > 2 ) S1; else S2;
Frequently, the Boolean result is not needed afterwards. Thus it is possible to generate positional code, instead of:
        // assume ( a < 5 || b > 2 ) is stored in t3
        // code to compute t3, includes boolean OR ||
        if ( t3 == 0 ) goto L2
    L1: code for S1
        goto L3
    L2: code for S2
    L3: ...                    // successor of if-statement
t3 is not needed afterwards; if it is a machine flag, it is overridden by S1, S2.

PSUCS322 HM 14 Control-Flow Mapping, Shorter If Version
1. No need to create temps to compute the boolean value
2. How does code-gen know where to branch to? Use “back-patching”! This can be done by buffering code, or after CodeGen. The ramifications would make a good midterm question.

        if ( a < 5 ) goto L4
        if ( b > 2 ) goto L4
        goto L5
    L4: code for S1
        goto L6
    L5: code for S2
    L6: ...                    // code after if-statement

PSUCS322 HM 15 Control-Flow Mapping, Nested If Statements
Data structures needed to back-patch?

Source Code Skeleton
    if ( a < 5 )
        if ( b > 2 ) S1; else S2; //end if
    else
        if ( c < 6 ) S3; else S4; //end if

Object Code Skeleton
         if ( a >= 5 ) goto L8     // back-patch
         if ( b <= 2 ) goto L7     // back-patch
         code for S1
         goto L10                  // back-patch
    L7:                            // L7 resolved
         code for S2
         goto L10                  // back-patch
    L8:                            // L8 resolved
         if ( c >= 6 ) goto L9     // back-patch
         code for S3
         goto L10                  // back-patch
    L9:                            // L9 resolved
         code for S4
    L10:                           // L10 resolved

PSUCS322 HM 16 Control-Flow Mapping, Elsif Clauses (Ada)
How can a linked list of to-be-back-patched addresses be created?

Source Code Skeleton
    if a < 5 then S1;
    elsif b > 2 then S2;
    elsif c < 6 then S3;
    else S4;
    end if;

Object Code Skeleton
         if ( a >= 5 ) goto L11
         code for S1
         goto L14
    L11: if ( b <= 2 ) goto L12
         code for S2
         goto L14
    L12: if ( c >= 6 ) goto L13
         code for S3
         goto L14
    L13: code for S4
    L14:

PSUCS322 HM 17 Control-Flow Mapping, While
Are there size (of code) limitations to back-patching?

Source Code Skeleton
    while ( i < 10 ) {
        S;
        i++;
    } //end while
    // assume i NOT needed after
    // “i” is pure IV

Object Code Skeleton
         // R1 holds induction variable
    L15: if ( R1 >= 10 ) goto L16
         code for S
         R1++
         goto L15
    L16:

PSUCS322 HM 18 Control-Flow Mapping, Repeat (Pascal)
“Fall-Through” in Repeat vs. initial test in While: the body runs once before the condition is first tested.

Source Code Skeleton
    //Pascal source
    repeat
        S;
        i++;
    until i >= 10;
    // again no use of “i” after

Object Code Skeleton
         // R1 holds induction variable
    L15: code for S
         R1++
         if ( R1 < 10 ) goto L15

PSUCS322 HM 19 Control-Flow Mapping, For
  – What happens if “i” (induction variable AKA IV) is defined outside, and used as loop parameter?
  – What is its value after for loop completion? Can it be referenced, i.e. can its value be printed?
  – What happens if the IV is assigned inside the loop?
  – What should happen if the IV value is > end-value at start?

Source Code Skeleton
    for( int i=0; i<10; i++ ) {
        S;
    } //end for
    // i is undefined/not used
    // can be IV in reg

Object Code Skeleton
         mov R1, #0
    L17: if ( R1 >= 10 ) goto L18
         code for S
         R1++
         goto L17
    L18:

PSUCS322 HM 20 Back Patching Example
    if ( a < 5 || b > 2 ) S1; else S2;
(___ marks a jump target that is not yet known.)

Handling a<5:
    if (a < 5) goto ___;       // needs to be patched; addr. insertion
Handling b>2:
    if (b > 2) goto ___;       // needs to be patched
Handling ..||..:
    if (a < 5) goto ___;       // .. else fall through
    if (b > 2) goto ___;       // patched to the same target as the jump above
    goto ___;                  // needs to be patched
Handling if.. S1 else S2:
        if (a < 5) goto L4;    // ___ is patched to L4
        if (b > 2) goto L4;
        goto L5;               // ___ is patched to L5
    L4: [code for S1]
        goto L6;               // then clause
    L5: [code for S2]          // else clause
    L6:                        // end of If Statement
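One common way to organize back-patching is to keep, for each boolean expression, a list of the jumps that leave it on "true" and a list of those that leave it on "false", and to fill in their targets once the destination is known. The sketch below follows that list-based idea with numeric instruction indices as targets; the names `emit`, `backpatch`, `gen_rel`, and `gen_or` are hypothetical, and the scheme emits one more jump than the hand-written code above (a jump to the very next instruction), which a later cleanup pass could remove.

```python
code = []   # each entry is [text, target]; target None means "not yet patched"

def emit(text, target=None):
    """Append an instruction and return its index."""
    code.append([text, target])
    return len(code) - 1

def backpatch(holes, target):
    """Fill in the jump target of every instruction index in holes."""
    for i in holes:
        code[i][1] = target

def gen_rel(lhs, op, rhs):
    """Emit a test with two unresolved jumps; return (truelist, falselist)."""
    t = emit(f"if ({lhs} {op} {rhs}) goto")
    f = emit("goto")
    return [t], [f]

def gen_or(gen_left, gen_right):
    """E1 || E2: if E1 is false, continue with E2; the true exits are merged."""
    true1, false1 = gen_left()
    backpatch(false1, len(code))          # E1 false: fall into E2's code
    true2, false2 = gen_right()
    return true1 + true2, false2

# if ( a < 5 || b > 2 ) S1; else S2;
truelist, falselist = gen_or(lambda: gen_rel("a", "<", "5"),
                             lambda: gen_rel("b", ">", "2"))
backpatch(truelist, len(code))            # then-clause starts here
emit("code for S1")
skip_else = emit("goto")
backpatch(falselist, len(code))           # else-clause starts here
emit("code for S2")
backpatch([skip_else], len(code))         # first instruction after the if

for i, (text, target) in enumerate(code):
    print(i, text, "" if target is None else target)
```

Keeping only index lists makes the data structure question simple: a Python list per expression is enough, and merging two lists is concatenation.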

PSUCS322 HM 21 Back Patching: Jump Labels
Three-Address Code: add two attributes
  – E.true: position to jump to when E evaluates to true
  – E.false: position to jump to when E evaluates to false

E -> E1 relop E2
    E.s := [ E1.s; E2.s;
             if ( E1.t relop E2.t ) goto E.true;
             goto E.false; ]

E -> E1 ‘||’ E2
    E1.true := E.true;  E1.false := new Label();
    E2.true := E.true;  E2.false := E.false;
    E.s := [ E1.s; E1.false: E2.s; ]

PSUCS322 HM 22 Back Patching: Three-Address Code Cont’d
E -> E1 ‘&&’ E2
    E1.true := new Label();  E1.false := E.false;
    E2.true := E.true;       E2.false := E.false;
    E.s := [ E1.s; E1.true: E2.s; ]

E -> ‘!’ E1
    E1.true := E.false;  E1.false := E.true;
    E.s := E1.s;
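The jump-label rules for relop, ‘||’, ‘&&’, and ‘!’ translate almost directly into a recursive function that receives the two target labels from its caller. This sketch assumes that every relational test ends with an explicit jump to its false label, and it uses a hypothetical tuple encoding `(op, e1, e2)` with plain strings as operands; `jump_code` and `new_label` are illustrative names.

```python
import itertools

_labels = itertools.count(1)

def new_label():
    return f"L{next(_labels)}"

def jump_code(e, true, false):
    """Return code that jumps to `true` if e holds and to `false` otherwise."""
    op = e[0]
    if op == "||":
        e1_false = new_label()            # E1 false: go on and test E2
        return (jump_code(e[1], true, e1_false)
                + [f"{e1_false}:"]
                + jump_code(e[2], true, false))
    if op == "&&":
        e1_true = new_label()             # E1 true: go on and test E2
        return (jump_code(e[1], e1_true, false)
                + [f"{e1_true}:"]
                + jump_code(e[2], true, false))
    if op == "!":
        return jump_code(e[1], false, true)   # swap the two targets
    # relational leaf: (relop, lhs, rhs)
    return [f"if {e[1]} {op} {e[2]} goto {true}", f"goto {false}"]

code = jump_code(("||", ("<", "a", "5"), (">", "b", "2")), "Ltrue", "Lfalse")
print("\n".join(code))
```

No temp is created anywhere: the truth value exists only as the position reached in the code. Negation costs nothing, because it merely swaps the labels passed down.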

PSUCS322 HM 23 Back Patching: Jump Labels Cont’d
IR Tree Code:

E -> E1 relop E2
    E.tr := ( ESEQ [CJUMP relop E1.tr E2.tr E.true ] null )

E -> E1 ‘||’ E2
    E1.true := E.true;  E1.false := new NAME();
    E2.true := E.true;  E2.false := E.false;
    E.tr := (ESEQ [ stmt( E1.tr ); LABEL( E1.false ); stmt( E2.tr ); ] null )

PSUCS322 HM 24 Back Patching: IR Tree Cont’d
E -> E1 ‘&&’ E2
    E1.true := new NAME();  E1.false := E.false;
    E2.true := E.true;      E2.false := E.false;
    E.tr := (ESEQ [ stmt( E1.tr ); LABEL( E1.true ); stmt( E2.tr ); ] null )

E -> ‘!’ E1
    E1.true := E.false;  E1.false := E.true;
    E.tr := E1.tr;

PSUCS322 HM 25 Converting Back to Value
Actual Boolean values are also needed in programs, e.g.
    boolean x = a < 5 || b > 2;
We still need to generate a value for the Boolean expression! This can be implemented by patching the two labels E.true and E.false of the Boolean expression E with two assignment statements that assign 1 and 0, respectively.

Boolean expression E
    t = new Temp();
    E.true := new Label();
    E.false := new Label();
    L := new Label();
    E.s := [ E.true:  t := 1; goto L;
             E.false: t := 0;
             L: ]
    E.t := t;
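To see that the two patched assignments really recover the value, the sketch below appends them to hand-written jump code for `a < 5 || b > 2` and runs the result in a toy interpreter. Everything here is an assumption made for illustration: the label names `T`, `F`, `N`, `J`, the string instruction format, and the `run` function, which understands only the four instruction shapes that appear.

```python
def run(code, env):
    """Execute a tiny three-address program; env maps names to ints."""
    labels = {s[:-1]: i for i, s in enumerate(code) if s.endswith(":")}
    ops = {"<": lambda x, y: x < y, ">": lambda x, y: x > y}

    def val(tok):
        return env[tok] if tok in env else int(tok)

    pc = 0
    while pc < len(code):
        w = code[pc].split()
        if w[0] == "if":                       # if X op Y goto L
            if ops[w[2]](val(w[1]), val(w[3])):
                pc = labels[w[5]]
                continue
        elif w[0] == "goto":                   # goto L
            pc = labels[w[1]]
            continue
        elif len(w) == 3 and w[1] == ":=":     # t := value
            env[w[0]] = val(w[2])
        pc += 1                                # label lines are no-ops
    return env

# Jump code for a < 5 || b > 2 with E.true = T and E.false = F
cond = ["if a < 5 goto T", "goto N", "N:", "if b > 2 goto T", "goto F"]
# The two labels are patched with assignments of 1 and 0
program = cond + ["T:", "t := 1", "goto J", "F:", "t := 0", "J:"]

print(run(program, {"a": 3, "b": 0})["t"])
```

With `a = 3` the first test succeeds, control lands on `T`, and `t` becomes 1; the second operand is never evaluated, which is the short-circuit behavior of the jump-label scheme.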

PSUCS322 HM 26 New Arrays
Storage allocation for E: follow Java’s array storage convention. The length of the array is stored as the 0th element, so storage for a 10-element array actually requires 11 cells.
Cell initialization: all elements are automatically initialized to 0; you emit the code.

Pseudo IR Code:
E -> ‘newArray’ E1
    L: new Label; t1, t2, t3: new Temps;        // wdSize == 4
    E.s := [ E1.s;
             t1 := ( E1.t + 1 ) * wdSize;       // size in bytes, incl. length cell
             t2 := malloc( t1 );                // t2 points to cell 0
             t2[0] := E1.t;                     // store array length
             t3 := t2 + ( E1.t * wdSize );      // t3 points to last cell
          L: t3[0] := 0;                        // init a cell to 0
             t3 := t3 - wdSize;                 // move down a cell
             if ( t3 > t2 ) goto L; ]           // loop back
    E.t := t2;
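The allocation and initialization loop can be checked by simulating it over a toy heap. The `Heap` class, its bump-pointer `malloc`, and the start address 1000 are hypothetical stand-ins for the runtime; only the arithmetic on `t1`, `t2`, `t3` follows the pseudo IR code above.

```python
WD_SIZE = 4   # bytes per word, as assumed on the slide

class Heap:
    """Toy heap: a dict from byte address to word value, with bump allocation."""

    def __init__(self):
        self.mem = {}
        self.next_free = 1000          # arbitrary start address

    def malloc(self, nbytes):
        base = self.next_free
        self.next_free += nbytes
        return base

def new_array(heap, n):
    """Allocate an n-element int array; return the address of cell 0."""
    t1 = (n + 1) * WD_SIZE             # bytes needed, including the length cell
    t2 = heap.malloc(t1)               # t2 points to cell 0
    heap.mem[t2] = n                   # store array length
    t3 = t2 + n * WD_SIZE              # t3 points to the last cell
    while True:                        # test-at-bottom loop, as in the IR code
        heap.mem[t3] = 0               # init a cell to 0
        t3 -= WD_SIZE                  # move down a cell
        if not (t3 > t2):
            break
    return t2

heap = Heap()
base = new_array(heap, 3)
print(base, heap.mem)
```

Because the loop tests at the bottom, a zero-length array still executes the body once; that write lands on cell 0 and stores 0, which equals the length, so the result is still consistent.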

PSUCS322 HM 27 Arrays Element Reference
Calculate address for E: addr( a[i] ) = base(a) + (i+1) * wdSize
Bounds check: i >= 0 and i < num-elements. i is a general expression!

E -> E1 ‘[’ E2 ‘]’
    L1, L2: new Label; t1, t2, t3, t4: new Temps;
    E.s := [ E1.s; E2.s;
             t1 := E1.t[ 0 ];               // t1 holds num elements
             if ( E2.t < 0 ) goto L1;       // too low?
             if ( E2.t >= t1 ) goto L1;     // too high?
             t2 := E2.t + 1;                // must be OK
             t3 := t2 * wdSize;             // compute offset
             t4 := E1.t[ t3 ];              // address = start + offset
             goto L2;                       // bypass exception handler
         L1: param E1.t; param E2.t;
             call arrayError, 2;
         L2: ]                              // t4 holds final address
    E.t := t4;
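The address formula and the bounds check can be sketched as a single function. The memory model (a dict from byte address to word) and the name `element_address` are assumptions; raising `IndexError` stands in for the call to `arrayError`.

```python
WD_SIZE = 4   # bytes per word

def element_address(mem, base, i):
    """Return the byte address of a[i], where mem[base] holds the array length."""
    n = mem[base]                        # cell 0 holds the number of elements
    if i < 0 or i >= n:                  # too low or too high
        raise IndexError(f"index {i} out of bounds for length {n}")
    return base + (i + 1) * WD_SIZE      # skip the length cell

# A 3-element array at address 1000 holding 10, 20, 30
mem = {1000: 3, 1004: 10, 1008: 20, 1012: 30}
addr = element_address(mem, 1000, 2)
print(addr, mem[addr])
```

Returning the address rather than the loaded value keeps the function usable for both cases raised earlier in the deck: an r-value reads `mem[addr]`, an l-value writes to it.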

PSUCS322 HM 28 Statements
Assignment Statement
S -> E1 ‘:=’ E2 ‘;’
    S.s := [ E1.s; E2.s; E1.t := E2.t; ]

If Statement with Else Clause
S -> ‘if’ ‘(’ E ‘)’ ‘then’ S1 ‘else’ S2 ‘;’
    L1, L2, L3: new Labels;
    E.true := L1; E.false := L2;
    S.s := [ E.s;
             L1: S1.s; goto L3;
             L2: S2.s;
             L3: ; ]

PSUCS322 HM 29 Statements, Cont’d
While Statement
S -> ‘while’ ‘(’ E ‘)’ S1 ‘;’
    L1, L2, L3: new Labels;            // no explicit jump to L2
    E.true := L2; E.false := L3;
    S.s := [ L1: E.s;
             L2: S1.s; goto L1;
             L3: ]

Print Statement with 1 argument
S -> ‘print’ E ‘;’
    S.s := [ E.s; param E.t; call prInt, 1; ]
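The statement rules for assignment, if, while, and print can be combined into one small generator. This is a sketch under simplifying assumptions: conditions are single relational tests given as `(lhs, op, rhs)`, every test ends with an explicit jump to its false label, operands are plain names or numbers, and `StmtGen` with its methods is a hypothetical name.

```python
import itertools

class StmtGen:
    """Translate a tiny statement AST into three-address code."""

    def __init__(self):
        self._n = itertools.count(1)

    def label(self):
        return f"L{next(self._n)}"

    def cond(self, e, true, false):
        lhs, op, rhs = e                     # relational conditions only
        return [f"if {lhs} {op} {rhs} goto {true}", f"goto {false}"]

    def stmt(self, s):
        kind = s[0]
        if kind == "assign":                 # ("assign", target, source)
            return [f"{s[1]} := {s[2]}"]
        if kind == "print":                  # ("print", operand)
            return [f"param {s[1]}", "call prInt, 1"]
        if kind == "if":                     # ("if", cond, then_stmt, else_stmt)
            _, e, s1, s2 = s
            l1, l2, l3 = self.label(), self.label(), self.label()
            return (self.cond(e, l1, l2)
                    + [f"{l1}:"] + self.stmt(s1) + [f"goto {l3}"]
                    + [f"{l2}:"] + self.stmt(s2)
                    + [f"{l3}:"])
        if kind == "while":                  # ("while", cond, body_stmt)
            _, e, body = s
            l1, l2, l3 = self.label(), self.label(), self.label()
            return ([f"{l1}:"] + self.cond(e, l2, l3)
                    + [f"{l2}:"] + self.stmt(body)
                    + [f"goto {l1}", f"{l3}:"])
        raise ValueError(f"unknown statement kind: {kind}")

loop = ("while", ("i", "<", "10"), ("print", "i"))
code = StmtGen().stmt(loop)
print("\n".join(code))
```

The three labels per statement mirror L1, L2, L3 in the rules; a real code generator would typically drop the jump to L2 when it targets the very next instruction.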