Presentation is loading. Please wait.

Presentation is loading. Please wait.

PSUCS322 HM 1 Languages and Compiler Design II IR Code Generation I Material provided by Prof. Jingke Li Stolen with pride and modified by Herb Mayer PSU.

Similar presentations


Presentation on theme: "PSUCS322 HM 1 Languages and Compiler Design II IR Code Generation I Material provided by Prof. Jingke Li Stolen with pride and modified by Herb Mayer PSU."— Presentation transcript:

1 PSUCS322 HM 1 Languages and Compiler Design II IR Code Generation I Material provided by Prof. Jingke Li Stolen with pride and modified by Herb Mayer PSU Spring 2010 rev.: 4/16/2010

2 PSUCS322 HM 2 Agenda Grammar G1 CodeGen Overview Arithmetic Expression Translation Boolean Expression Translation Various Statement Translations

3 PSUCS322 HM 3 Grammar G1 Input:AST representation of MINI source Output:Three-address code or IR tree code Approach:Syntax-directed translation Generic Grammar G1, start symbol: S E -> E arithop E | E relop E | E logicop E E -> ‘-’ E | ‘!’ E E -> ‘newArray’ E// new int array size E1 E -> E ‘[’ E ‘]’// indexed element E -> id | num// end-nodes S -> E ‘:=’ E ‘;’// assignment statement S -> ‘if’ ‘(‘ E ‘)’ ‘then’ S ‘else’ S S -> ‘while’ ‘(‘ E ‘)’ S S -> ‘print’ E ‘;’ S -> ‘return’ E ‘;’

4 PSUCS322 HM 4 CodeGen Overview Arithmetic Expressions: –preserve precedence and associativity –Pay attention, whether language requires check for zero-divide Boolean Expressions: –define short-circuit evaluation vs. complete evaluation –discern bit-wise vs. logical and, or, xor –multiple unary NOT allowed? Array definition: –1D is simple for compiler –Per dimension, record: element size, low-bound, high-bound, total size, index type and type Array element reference: –L-value or r-value? –Nested array reference: index expression can in turn be array element –Discern pass by value or reference, other Statements: –Goto into other scope, out of current scope issue in FTN, C –Return: long-jump in C non-trivial Parameters: –Function parameters in PL/I and Pascal hard –Easy to confuse pointer type parameters with reference parameters (in C)

5 PSUCS322 HM 5 Arithmetic Expression Translation Generate tree-address code: get new temp per operation E.s holds statements that evaluate E E.t is temp that holds E’s value t = new Temp(); E.s := [ E1.s; E2.s; t := E1.t arithop E2.t; ] E.t := t; t = new Temp(); E.s := [ E1.s; t := unaryop E1.t; ] E.t := t; E -> E1 arithop E2 E -> unaryop E1

6 PSUCS322 HM 6 Arithmetic Expression Translation, Cont’d To generate IR trees, embed expression subtrees into current root. Attribute E.tr holds IR tree for E E.tr := ( BINOP arithop E1.tr E2.tr ) E.tr := ( UNOP unaryop E1.tr null ) b * -c + b * d / e // assume l-2-r associativity of * / % => t1 := -c t2 := b * t1 t3 := b * d t4 := t3 / e t5 := t2 + t4 => (BINOP + (BINOP * b (UNOP – c ) ) (BINOP / (BINOP * b d ) e ) ) similar to Polish Postfix notation, after Lukasiewicz, 1920 E -> E1 arithop E2 E -> unaryop E1

7 PSUCS322 HM 7 Boolean Expression Translation Rely on target machine with conditional branches –Condition can be part of instruction –Or condition can be inquired by using machine flags –Or condition can be evaluated separately (canonical execution) and then be provided as one of the arguments –Operands are: condition, target address, and *+1 CodeGen uses temps for intermediate booleans Or CodeGen uses flow of control, so “code locations” imply state of boolean subexpressions Or combination of both Target computer may provide boolean or even bitwise operations: –And –Or –Xor –Not, etc

8 PSUCS322 HM 8 Boolean Expression Translation, Quads Relational operations need to record their result, e.g. in machine flags. Logical operations can be realized through computation proper or via control flow. See sample expression: a 2 a 2// source => Pure code mapping, with logical and, or, xor instruction: t1 := a < 5// e.g. encode 0 as false, 1 as true t2 := b > 2 t3 := t1 || t2// generate jump out if t3 is false => Control flow mapping, w/o logical and, or, xor : t1 := 1// guess t1 is true, override if needed if a < 5 goto l1// could be quad: Cond_jump_if_less t1 := 0// guess was wrong, override to false l1:t2 := 1// guess: set t2 to true initially if b > 2 goto l2 t2 := 0// wrong guess, set t2 to false l2:t3 := 1// final guess if t1 goto l3 if t2 goto l3 t3 := 0// final guess computed as false l3: // use t3

9 PSUCS322 HM 9 Better Representation of Booleans, and IR Use target machine’s native logical operations for: and, or, xor ; also in sample expression: a 2 t1 := 1 if a < 5 goto l1 t1 := 0 l1: t2 := 1 if b > 2 goto l2 t2 := 0 l2: t3 := t1 or t2 // use t3 MOVE t3 ( (BINOP || (ESEQ [ [MOVE t1 (CONST 1) ] [CJUMP < (NAME a) (CONST 5) l1 ] [MOVE t1 (CONST 0) ] [LABEL l1] ] t1 ) (ESEQ [ [MOVE t2 (CONST 1)] [CJUMP > (NAME b) (CONST 2) l2 ] [MOVE t2 (CONST 0) ] [LABEL l2] ] t2 )

10 PSUCS322 HM 10 Value Representation, Relational L := new Label(); t := new Temp(); E.s := [ E1.s; E2.s; t := 1; if ( E1.t relop E2.t ) goto L; t := 0; L:... ] E.t := t; E -> E1 relop E2 Three-Address Code: IR Tree Code: L := new NAME(); t := new TEMP(); E.tr := ( ESEQ [ [MOVE t (CONST 1 ) ] [ CJUMP relop E1.tr E2.tr L ] [MOVE t (CONST 0) ] [LABEL L] t )

11 PSUCS322 HM 11 Value Representation, Three-Address Code L := new Label(); t := new Temp(); E.s := [ E1.s; E2.s; t := 1; if ( E1.t == 1 ) goto L; if ( E2.t == 1 ) goto L; t := 0; L:... ] E.t := t; E -> E1 ‘||’ E2 L := new Label(); t := new Temp(); E.s := [ E1.s; E2.s; t := 0; if ( E1.t == 0 ) goto L; if ( E2.t == 0 ) goto L; t := 1; L:... ] E.t := t; E -> E1 ‘&&’ E2 t := new Temp(); E.s := [ E1.s; t := 1 – E1.t; ] E.t := t; E -> E1 ‘!’ E2

12 PSUCS322 HM 12 Value Representation, IR Tree Code L = new NAME(); t = new TEMP(); E.tr := (ESEQ [ [MOVE t (CONST 1) ] [CJUMP == E1.tr (CONST 1) L ] [CJUMP == E2.tr (CONST 1) L ] [MOVE t (CONST 0) ] [LABEL L] t ) E -> E1 ‘||’ E2 E -> E1 ‘&&’ E2 t = new TEMP(); E.tr := (ESEQ [MOVE t (BINOP – (CONST 1) E1.tr)] t ) E -> E1 ‘!’ E2 L = new NAME(); t = new TEMP(); E.tr := (ESEQ [ [MOVE t (CONST 0) ] [CJUMP == E1.tr (CONST 0) L ] [CJUMP == E2.tr (CONST 0) L ] [MOVE t (CONST 1) ] [LABEL L] t )

13 PSUCS322 HM 13 Control-Flow Mapping, Long If Version Booleans used in programs to direct flow of control, e.g. if ( a 2 ) S1; else S2; Frequently, the Boolean result is not needed afterwards. Thus possible to generate positional code. Instead of: // assume ( a 2 ) stored in t3 // code to compute t3, includes boolean OR || if ( t3 == 0 ) goto l2 L1: code for S1 goto L3 l2: code for S2 L3:... Successor of if-statement... t3 not needed; if machine flag: overridden by S1, S2

14 PSUCS322 HM 14 Control-Flow Mapping, Shorter If Version 1. No need to create temps to compute boolean value 2. How does code-gen know where to branch to? Use “back- patching!” Can be done by buffering code, or after CodeGen. Ramifications would be good Midterm question if ( a < 5 ) goto L4 if ( b > 2 ) goto L4 goto L5 L4: code for S1 goto L6 l5: code for S2 L6:... Code after if-statement

15 PSUCS322 HM 15 Control-Flow Mapping, Nested If Statements Data structures needed to back-patch? Object Code Skeleton if ( a >= 5 ) goto L8 // back-patch if ( b <= 2 ) goto L7 // back-patch code for S1 goto L10 // back-patch L7: // L7 resolved code for S2 goto L10 // back-patch L8: // L8 resolved if ( c >= 6 ) goto L9 // back-patch code for S3 goto L10 // back-patch L9: // L9 resolved code for S4 L10: // L10 resolved Source Code Skeleton if ( a < 5 ) if ( b > 2 ) S1; else S2; //end if else if ( c < 6 ) S3; else S4; //end if

16 PSUCS322 HM 16 Control-Flow Mapping, Elsif Clauses (Ada) How can linked-list of to-be-back-patched addresses be created? Object Code Skeleton if ( a >= 5 ) goto L11 code for S1 goto L14 L11: if ( b <= 2 ) goto L12 code for S2 goto L14 L12: if ( c >= 6 ) goto L13 code for S3 goto L14 L13: code for S4 L14: Source Code Skeleton if a < 5 then S1; elsif b > 2 then S2; elsif c < 6 then S3; else S4; end if;

17 PSUCS322 HM 17 Control-Flow Mapping, While Are there size (of code) limitations to back-patching? Object Code Skeleton // R1 holds induction variable L15: if ( R1 >= 10 ) goto L16 code for S R1++ goto L15 L16: Source Code Skeleton while ( i < 10 ) { S; i++; } //end while // assume i NOT needed after // “i” is pure IV

18 PSUCS322 HM 18 Control-Flow Mapping, Repeat (Pascal) Object Code Skeleton // R1 holds induction variable L15: if ( R1 >= 10 ) goto L16 code for S R1++ goto L15 L16: Source Code Skeleton //Pascal source repeat S; i++; until i >= 10; // again no use of “i” after “Fall-Through” in Repeat vs. initial test in While

19 PSUCS322 HM 19 Control-Flow Mapping, For What happens if “i” (induction variable AKA IV) is defined outside, and used as loop parameter? What is its value after for loop completion? Can it be referenced? i.e. value be printed? What happens if IV is assigned inside loop? What should happen, if IV value is > end-value at start? Object Code Skeleton mov R1, #0 L17: If ( R1 >= 10 ) goto L18 code for S R1++ goto L17 L18: Source Code Skeleton for( int i=0; i<10; i++ ) { S; } //end for // i is undefined/not used // can be IV in reg

20 PSUCS322 HM 20 Back Patching Example if ( a 2 ) S1; else S2; Handling a<5: if (a ; // needs to be patched; addr. insertion Handling b>2: if (b > 2) goto ;// needs to be patched Handling..||..: if (a ;//.. else fall through if (b > 2) goto ;// is patched to goto // needs to be patched Handling if.. S1 else S2: if (a is patched to L4 if (b > 2) goto L4; goto L5; // is patched to L5 L4: [code for S1] goto L6;// then clause L5: [code for S2]// else clause L6:// end of If Statement

21 PSUCS322 HM 21 Back Patching: Jump Labels Three-Address Code: Add two attributes E.true — position to jump to when E evaluates to true; E.false — position to jump to when E evaluates to false. E.s := [ E1.s; E2.s; if ( E1.t relop E2.t ) goto E.true; E.false: ] E1.true := E.true; E1.false := new Label(); E2.true := E.true; E2.false := E.false; E.s := [ E1.s; E1.false: E2.s; ] E -> E1 relop E2 E -> E1 ‘||’ E2

22 PSUCS322 HM 22 Back Patching: Three-Address Code Cont’d E1.true := new Label(); E1.false := E.false; E2.true := E.true; E2.false := E.false; E.s := [ E1.s; E1.true: E2.s; ] E1.true := E.false; E1.false := E.true; E.s := E1.s; E -> E1 ‘&&’ E2 E -> ‘!’ E1

23 PSUCS322 HM 23 Back Patching: Jump Labels Cont’d IR Tree Code: E.tr := ( ESEQ [CJUMP relop E1.tr E2.tr E.true ] null ) E1.true := E.true; E1.false := new NAME(); E2.true := E.true; E2.false := E.false; E.tr := (ESEQ [stmt( E1.tr); LABEL( E1.false ); stmt( E2.tr); ] null ) E -> E1 relop E2 E -> E1 ‘||’ E2

24 PSUCS322 HM 24 Back Patching: IR Tree Cont’d E1.true := new NAME(); E1.false := E.false; E2.true := E.true; E2.false := E.false; E.tr := (ESEQ [stmt( E1.tr ); LABEL( E1.true ); stmt( E2.tr ); ] null) E1.true := E.false; E1.false := E.true; E.tr := E1.tr; E -> E1 ‘&&’ E2 E -> ‘!’ E1

25 PSUCS322 HM 25 Converting Back to Value Actual Boolean value are needed in programs, e.g. boolean x = a 2; We still need to generate a value for the Boolean expression! This can be implemented by patching the two labels E.true and E.false for the Boolean expression E with two assignment statements for assigning 1 and 0, respectively. t = new Temp(); E.true := new Label(); E.false := new Label(); L := new Label(); E.s := [ E.true: t := 1; goto L; E.false: t := 0; L: ] E.t := t; Boolean expression E

26 PSUCS322 HM 26 New Arrays Storage allocation for E: — Follow Java’s array storage convention. The length of array is stored as the 0 th element. So storage for a 10-element array actually requires 11 cells Cell initialization — All elements automatically initialized to 0; you emit code Pseudo IR Code: L: new Label; t1,t2,t3: new Temps;// wdSize == 4 E.s := [ E1.s; t1 := ( E1.t + 1 ) * wdSize; // number of elements t2 := malloc( t1 ); // t2 points to cell 0 t2[0] := E1.t; // store array length t3 := t2 + ( E1.t * wdSize ); // t3 points to last cell L: t3[0] := 0; // init a cell to 0 t3 := t3 - wdSize; // move down a cell if ( t3 > t2 ) goto L; ] // loop back E.t := t2; E => ‘newArray’ E1

27 PSUCS322 HM 27 Arrays Element Reference Calculate address for E: addr( a[i] ) = base a + (i+1) * wdSize Bounds check: i >= 0 and i < num-elements. i is general expression! L1,L2: new Label; t1,t2,t3,t4: new Temps; E.s := [ E1.s; E2.s; t1 := E1.t[ 0 ];// t1 holds num elements if ( E2.t < 0 ) goto L1;// too low? if ( E2.t >= t1 ) goto L1;// too high? t2 := E2.t + 1;// must be OK t3 := t2 * wdSize;// compute offset t4 := E1.t[ t3 ];// address = start + offset goto L2;// bypass exception handler L1: param E1.t; param E2.t; call arrayError, 2; L2: ]// t4 holds final address E.t := t4; E => E1 ‘[‘ E2 ‘]’

28 PSUCS322 HM 28 Statements Assignment Statement => S.s :=[ E1.s; E2.s; E1.t := E2.t; ] S => E1 ‘:=‘ E2 ‘;’ If Statement with Else Clause => L1, L2, L3: new Labels; E.true := L1; E.false := L2; S.s :=[ E.s; L1: S1.s; goto L3; L2: S2.s; L3: ; ] S => ‘if’ ‘(‘ E ‘)’ then’ S1 ‘else’ S2 ‘;’

29 PSUCS322 HM 29 Statements, Cont’d While Statement => L1, L2, L3: new labels;// no explicit jump to L2 E.true := L2; E.false := L3; S.s :=[ L1: E.s; L2: S1.s; goto L1; L3: ] S => ‘while’ ‘(‘ E ‘)’ S1 ‘;’ Print Statement with 1 argument => S.s :=[ E.s; param E.t; call prInt, 1; ] S => ‘print’ E ‘;’


Download ppt "PSUCS322 HM 1 Languages and Compiler Design II IR Code Generation I Material provided by Prof. Jingke Li Stolen with pride and modified by Herb Mayer PSU."

Similar presentations


Ads by Google