8 Intermediate code generation


1 8 Intermediate code generation
Zhang Zhizheng

2 8.0 Overview 1. Position of the intermediate code generator
token stream → parser → syntax tree → static checker → syntax tree → intermediate code generator → intermediate code → code generator

3 2. Benefits of using a machine-independent intermediate form
Retargeting is facilitated; a compiler for a different machine can be created by attaching a back end for the new machine to an existing front end. A machine-independent code optimizer can be applied to the intermediate representation.

4 3.Implementation of Intermediate code generator
Syntax-directed translation, folded into parsing:
Top-down parsing
Bottom-up parsing

5 8. 1 Intermediate languages
1. Intermediate representations
Syntax tree
Directed acyclic graph (DAG)
Postfix notation
Three-address code
Quadruple

6 A syntax tree and DAG for the assignment statement a:=b*-c+b*-c
Syntax tree: assign(a, +(*(b, uminus(c)), *(b, uminus(c))))
DAG: assign(a, +(n, n)), where the single shared node n = *(b, uminus(c))

7 Production of Syntax Tree (and DAG)
Semantic rules on value attribution of assignment statement
Production        Semantic rules
S → id:=E         id.value=E.value
E → E1 + E2       E.value=E1.value+E2.value
E → E1 * E2       E.value=E1.value*E2.value
E → - E1          E.value=-E1.value
E → ( E1 )        E.value=E1.value
E → id            E.value=id.lexval

8 Production Semantic rules
Semantic rules on producing syntax tree of assignment statement
Production        Semantic rules
S → id:=E         S.nptr:=mknode('assign', mkleaf(id, id.place), E.nptr)
E → E1 + E2       E.nptr:=mknode('+', E1.nptr, E2.nptr)
E → E1 * E2       E.nptr:=mknode('*', E1.nptr, E2.nptr)
E → - E1          E.nptr:=mknode('uminus', E1.nptr)
E → ( E1 )        E.nptr:=E1.nptr
E → id            E.nptr:=mkleaf(id, id.place)
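The rules above build a tree; a DAG results if mknode and mkleaf return an existing node whenever an identical one has already been built. A minimal Python sketch of that idea follows. The class name DagBuilder and the tuple encoding of nodes are assumptions made for illustration and are not part of the slides.

```python
class DagBuilder:
    """Node table with sharing: structurally equal nodes get one id."""

    def __init__(self):
        self.nodes = []   # node id -> key describing the node
        self.index = {}   # key -> node id, used to detect repeats

    def _intern(self, key):
        # Create the node only if an identical one does not exist yet.
        if key not in self.index:
            self.index[key] = len(self.nodes)
            self.nodes.append(key)
        return self.index[key]

    def mkleaf(self, kind, value):
        return self._intern(("leaf", kind, value))

    def mknode(self, op, *children):
        return self._intern(("node", op, children))


# Build the DAG for a := b * -c + b * -c
dag = DagBuilder()
a, b, c = (dag.mkleaf("id", name) for name in "abc")
left = dag.mknode("*", b, dag.mknode("uminus", c))
right = dag.mknode("*", b, dag.mknode("uminus", c))  # finds the existing nodes
root = dag.mknode("assign", a, dag.mknode("+", left, right))
```

Both operands of + end up as the same node id, so the DAG has 7 nodes where the full syntax tree would repeat the b * -c subtree.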

9 Representations of the syntax tree: the data structure of the graph
See Fig. 8.4

10 2.Three-address code(TAC)
A sequence of statements of the general form x= y op z Here, x, y, z are names, constants, or compiler-generated temporaries; op stands for any operator

11 Notes: 1)There is only one operator on the right side of a statement
2) Three address code is a linearized representation of a syntax tree or a DAG in which explicit names correspond to the interior nodes of the graph 3) Each three-address code statement contains three addresses, two for the operands and one for the result

12 E.g., three-address code corresponding to the above tree and DAG
Code for the syntax tree:
t1:=-c
t2:=b*t1
t3:=-c
t4:=b*t3
t5:=t2+t4
a:=t5
Code for the DAG:
t1:=-c
t2:=b*t1
t5:=t2+t2
a:=t5

13 3. Types of TAC
x:=y op z //assignment statement; op is a binary arithmetic or logical operation//
x:=op y //assignment statement; op is a unary operation such as minus, logical negation, or a conversion operator//
x:=y //copy assignment statement//
goto L //unconditional jump//
if x relop y goto L //conditional jump: if x stands in relation relop to y, execute the statement with label L next; otherwise execute the following statement//
param x1 …… param xn
call p,n
return y //call procedure p with n parameters (x1,……,xn)//

14 x=y[i] x[i]=y x=&y //the value of x is the location of y// x=*y *x=y

15 4.Syntax-directed Translation into TAC

16 E.g., a:=b*-c+b*-c can be translated into
t1:=-c
t2:=b*t1
t3:=-c
t4:=b*t3
t5:=t2+t4
a:=t5
How is this translation produced?

17 Production Semantic Rules
E.place: the name that will hold the value of E
E.code: the sequence of three-address statements evaluating E
S → id:=E      S.code:=E.code || gen(id.place ':=' E.place)
E → E1+E2      E.place:=newtemp(); E.code:=E1.code || E2.code || gen(E.place ':=' E1.place '+' E2.place)
E → E1*E2      E.place:=newtemp(); E.code:=E1.code || E2.code || gen(E.place ':=' E1.place '*' E2.place)
E → -E1        E.place:=newtemp(); E.code:=E1.code || gen(E.place ':=' 'uminus' E1.place)
E → id         E.place:=id.place; E.code:=''
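The E.place and E.code attributes can be mimicked by a recursive function that returns a (place, code) pair for each subexpression. The sketch below is only an illustration: the nested-tuple encoding of expressions and the name translate_assignment are assumptions, while newtemp follows the slides.

```python
import itertools


def translate_assignment(target, expr):
    """Return the three-address code for `target := expr`."""
    counter = itertools.count(1)

    def newtemp():
        return f"t{next(counter)}"

    def translate(node):
        # Returns (place, code), mirroring E.place and E.code.
        kind = node[0]
        if kind == "id":
            return node[1], []
        if kind == "uminus":
            place1, code1 = translate(node[1])
            temp = newtemp()
            return temp, code1 + [f"{temp}:=-{place1}"]
        # Binary operators '+' and '*': E.code = E1.code || E2.code || gen(...)
        place1, code1 = translate(node[1])
        place2, code2 = translate(node[2])
        temp = newtemp()
        return temp, code1 + code2 + [f"{temp}:={place1}{kind}{place2}"]

    place, code = translate(expr)
    return code + [f"{target}:={place}"]


b_times_minus_c = ("*", ("id", "b"), ("uminus", ("id", "c")))
tac = translate_assignment("a", ("+", b_times_minus_c, b_times_minus_c))
```

For a:=b*-c+b*-c this yields the six statements t1:=-c through a:=t5 listed on the earlier slide.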

18 Production Semantic Rules
S → while E do S1
S.begin=newlabel(); S.after=newlabel();
S.code=gen(S.begin ':') || E.code || gen('if' E.place '=' '0' 'goto' S.after) || S1.code || gen('goto' S.begin) || gen(S.after ':')

19 Production Semantic Rules
S → if E then S1
S.after=newlabel();
S.code=E.code || gen('if' E.place '=' '0' 'goto' S.after) || S1.code || gen(S.after ':')

20 Production Semantic Rules
S → if E then S1 else S2
S.after=newlabel(); E.false=newlabel();
S.code=E.code || gen('if' E.place '=' '0' 'goto' E.false) || S1.code || gen('goto' S.after) || gen(E.false ':') || S2.code || gen(S.after ':')
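The while and if-then-else rules amount to concatenating code lists around freshly generated labels. A small sketch under that reading follows; the function names gen_while and gen_if_else and the string form of each statement are assumptions made for illustration.

```python
import itertools

_labels = itertools.count(1)


def newlabel():
    return f"L{next(_labels)}"


def gen_while(e_code, e_place, s1_code):
    """S -> while E do S1, with E evaluated numerically into e_place."""
    begin, after = newlabel(), newlabel()
    return ([f"{begin}:"] + e_code
            + [f"if {e_place} = 0 goto {after}"]
            + s1_code
            + [f"goto {begin}", f"{after}:"])


def gen_if_else(e_code, e_place, s1_code, s2_code):
    """S -> if E then S1 else S2."""
    after, false = newlabel(), newlabel()
    return (e_code
            + [f"if {e_place} = 0 goto {false}"]
            + s1_code
            + [f"goto {after}", f"{false}:"]
            + s2_code
            + [f"{after}:"])


# while a<b do x:=y, assuming E.code leaves the truth value in t1
loop = gen_while(["t1:=a<b"], "t1", ["x:=y"])
```

Every conditional here first computes a value into E.place and then tests it against 0, which is the cost the short-circuit scheme later in the deck avoids.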

21 5.Addressing array elements
1)One-dimensional array Addr(A[i])=base+(i-low)*w=i*w+(base-low*w) Notes: 1)Here, we assume the width of each array element is w and the start address of the array block is base. 2)The array is defined as array[low..upper] of type 3)The sub-expression c=base-low*w can be evaluated when the declaration of the array is seen and we assume that c is saved in the symbol table entry for the array.

22 2) Two-dimensional array
(1) Row-major form
Addr(A[i1, i2])=base+((i1-low1)*n2+i2-low2)*w =(i1*n2+i2)*w+base-(low1*n2+low2)*w
where n2=upper2-low2+1
t1=low1*n2
t2=t1+low2
t3=t2*w
t4=base-t3
t5=i1*n2
t6=t5+i2
t7=t6*w
t4[t7]=x (store into the element) or x=t4[t7] (load from the element)
(2) Column-major form

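23 3) n-dimensional array
Array[l1:u1, l2:u2, …, ln:un]
Let $d_i = u_i - l_i + 1$ for $i = 1, 2, \dots, n$, let $a$ be the start address of the array, and let $m$ be the width of each element.
$$D = a + \bigl((i_1-l_1)\,d_2 d_3 \cdots d_n + (i_2-l_2)\,d_3 d_4 \cdots d_n + \cdots + (i_{n-1}-l_{n-1})\,d_n + (i_n-l_n)\bigr)\,m$$
Change into D = conspart + varpart:
$$\mathit{conspart} = a - C, \qquad C = \bigl((\cdots((l_1 d_2 + l_2)\,d_3 + l_3)\,d_4 + \cdots + l_{n-1})\,d_n + l_n\bigr)\,m$$
$$\mathit{varpart} = \bigl((\cdots((i_1 d_2 + i_2)\,d_3 + i_3)\,d_4 + \cdots + i_{n-1})\,d_n + i_n\bigr)\,m$$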
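The split into a constant part and a variable part can be checked numerically. The sketch below evaluates both nested products with the same Horner-style loop and compares the result with the direct formula for D. All function names are hypothetical; bounds is assumed to be a list of (l, u) pairs, one per dimension.

```python
def dims(bounds):
    """d_i = u_i - l_i + 1 for each dimension."""
    return [u - l + 1 for l, u in bounds]


def horner(values, d):
    """((...(v1*d2 + v2)*d3 + v3)... )*dn + vn"""
    acc = values[0]
    for value, dk in zip(values[1:], d[1:]):
        acc = acc * dk + value
    return acc


def const_part(a, bounds, m):
    # conspart = a - C, computable once the declaration has been seen
    return a - horner([l for l, _ in bounds], dims(bounds)) * m


def address(a, bounds, m, index):
    # D = conspart + varpart
    return const_part(a, bounds, m) + horner(list(index), dims(bounds)) * m


def address_direct(a, bounds, m, index):
    # D computed straight from the first formula, used as a cross-check
    d = dims(bounds)
    offset = 0
    for k, (i, (low, _)) in enumerate(zip(index, bounds)):
        stride = 1
        for dk in d[k + 1:]:
            stride *= dk
        offset += (i - low) * stride
    return a + offset * m


bounds_2d = [(1, 10), (1, 20)]   # ARRAY[1:10, 1:20], element width 1
c_value = 1000 - const_part(1000, bounds_2d, 1)
```

With a start address of 1000 (an arbitrary value chosen for the example), c_value is 21, the same C that the later worked example for ARRAY[1:10,1:20] computes.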

24 6.Short-circuit code of Boolean expressions
Translate a boolean expression into intermediate code without evaluating the entire expression.

25 7. Translation methods of flow-of-control statements in short-circuit code
1) Associate E with two labels
E.true: the label to which control flows if E is true
E.false: the label to which control flows if E is false

26 2)Associate S with a label S.next
Following S.code is a jump to some label

27 Production Semantic Rules
S → if E then S1
E.true=newlabel(); E.false=S.next; S1.next=S.next;
S.code=E.code || gen(E.true ':') || S1.code
S → if E then S1 else S2
E.true=newlabel(); E.false=newlabel(); S1.next=S.next; S2.next=S.next;
S.code=E.code || gen(E.true ':') || S1.code || gen('goto' S.next) || gen(E.false ':') || S2.code

28 Production Semantic Rules
S → while E do S1
S.begin=newlabel(); E.true=newlabel(); E.false=S.next; S1.next=S.begin;
S.code=gen(S.begin ':') || E.code || gen(E.true ':') || S1.code || gen('goto' S.begin)

29 Production Semantic Rules
E → E1 or E2
E1.true=E.true; E1.false=newlabel(); E2.true=E.true; E2.false=E.false;
E.code=E1.code || gen(E1.false ':') || E2.code
E → E1 and E2
E1.true=newlabel(); E1.false=E.false; E2.true=E.true; E2.false=E.false;
E.code=E1.code || gen(E1.true ':') || E2.code
E → id1 relop id2
E.code=gen('if' id1.place relop.op id2.place 'goto' E.true) || gen('goto' E.false)

30 3) Examples (1) a<b or c<d and e<f
if a<b goto Ltrue
goto L1
L1: if c<d goto L2
goto Lfalse
L2: if e<f goto Ltrue
goto Lfalse
Here, we assume that the true and false exits for the entire expression are Ltrue and Lfalse respectively.
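The inherited attributes E.true and E.false map naturally onto function parameters. The following sketch applies the or, and, and relop rules to the example a<b or c<d and e<f; the tuple encoding of expressions and the name translate_bool are assumptions, and each label is emitted on its own line for simplicity.

```python
import itertools


def translate_bool(expr, true_label, false_label):
    """Short-circuit code for expr with the given true and false exits."""
    labels = itertools.count(1)

    def newlabel():
        return f"L{next(labels)}"

    def gen(e, t, f):
        kind = e[0]
        if kind == "rel":
            _, op, x, y = e
            return [f"if {x}{op}{y} goto {t}", f"goto {f}"]
        if kind == "or":
            e1_false = newlabel()            # E1.false = newlabel()
            return gen(e[1], t, e1_false) + [f"{e1_false}:"] + gen(e[2], t, f)
        if kind == "and":
            e1_true = newlabel()             # E1.true = newlabel()
            return gen(e[1], e1_true, f) + [f"{e1_true}:"] + gen(e[2], t, f)
        raise ValueError(f"unknown node kind: {kind}")

    return gen(expr, true_label, false_label)


expr = ("or",
        ("rel", "<", "a", "b"),
        ("and", ("rel", "<", "c", "d"), ("rel", "<", "e", "f")))
code = translate_bool(expr, "Ltrue", "Lfalse")
```

The resulting list contains the same jumps as the example above, including the redundant goto L1 directly before L1, which a later optimization pass could remove.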

31 (2) while a<b do if c<d then x=y+z else x=y-z
L1: if a<b goto L2
    goto Lnext
L2: if c<d goto L3
    goto L4
L3: t1=y+z
    x=t1
    goto L1
L4: t2=y-z
    x=t2
    goto L1
Lnext:

32 8. Implementations of three-address statements
Quadruples: (op, arg1, arg2, result)
Triples: (n) (op, arg1, arg2)
         (m) (op, (n), arg)
Notes: A three-address statement is an abstract form of intermediate code.
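The difference between the two forms is where the result lives: a quadruple names it, while a triple is referred to by its position. The sketch below converts the quadruples for a:=b*-c+b*-c into triples. The function quads_to_triples, the ("ref", n) encoding of a position reference, the 0-based numbering, and the convention that temporaries start with "t" are all assumptions made for illustration.

```python
def quads_to_triples(quads, is_temp):
    """Rewrite (op, arg1, arg2, result) quadruples as position-referenced triples."""
    position = {}   # temporary name -> index of the triple that computes it
    triples = []

    def ref(arg):
        # Replace a temporary by a reference to the triple that produced it.
        return ("ref", position[arg]) if arg in position else arg

    for op, arg1, arg2, result in quads:
        if op == ":=":
            triples.append((":=", result, ref(arg1)))
            continue
        triples.append((op, ref(arg1), ref(arg2)))
        if is_temp(result):
            position[result] = len(triples) - 1
        else:
            # A named result needs an explicit copy triple.
            triples.append((":=", result, ("ref", len(triples) - 1)))
    return triples


quads = [
    ("uminus", "c", "_", "t1"),
    ("*", "b", "t1", "t2"),
    ("uminus", "c", "_", "t3"),
    ("*", "b", "t3", "t4"),
    ("+", "t2", "t4", "t5"),
    (":=", "t5", "_", "a"),
]
triples = quads_to_triples(quads, is_temp=lambda name: name.startswith("t"))
```

Because triples refer to each other by position, moving one during optimization forces every reference to it to be updated, which is one reason quadruples are often preferred for optimizing.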

33 9.Advantages of quadruples
Easy to generate target code Good for optimizing

34 Exercises Please translate the following program fragment into three-address code using the form of short-circuit code.
i=2; loop=0;
while (loop==0 && i<=10)
{ j=1;
  while (loop==0 && j<i)
    if (a[i,j] == x) loop=1;
    else j=j+1;
  if (loop==0) i=i+1;
}
Notes: Here we assume that the declaration of array A is array [1..10,1..10], each data element of array A would use 2 storage units, and the start address of array A's storage area is addrA.

35 Translate the following program fragment into three-address code.
loop=0;
while (loop==0 && i<=10)
{ j=1;
  while (loop==0 && j<=i)
    if (a[i,j] != a[j,i]) //"!=" means "not equal to"
    { loop=1; m=1; }
    else j=j+1;
  if (loop==0) i=i+1;
}
Notes: Here we assume that the declaration of array A is array [1..10,1..10], each data element of array A would only use 1 storage unit, and the start address of array A's storage area is addrA.

36 8. 2 Assignment statements
1. Assignment statements with only id
1) Functions: NEWTEMP(), GEN(OP,ARG1,ARG2,RESULT)
2) Semantic rules for quadruple code generation

37 (1) A → i=E {GEN(=, E•PLACE, _, i.entry)}
(2) E → -E(1) {T=NEWTEMP(); GEN(uminus, E(1)•PLACE, _, T); E•PLACE=T}
(3) E → E(1)*E(2) {T=NEWTEMP(); GEN(*, E(1)•PLACE, E(2)•PLACE, T); E•PLACE=T}
(4) E → E(1)+E(2) {T=NEWTEMP(); GEN(+, E(1)•PLACE, E(2)•PLACE, T); E•PLACE=T}
(5) E → (E(1)) {E•PLACE=E(1)•PLACE}
(6) E → i {E•PLACE=i.entry}

38 Translating A=-B*(C+D)#
Input          SYM       PLACE      quadruples
A=-B*(C+D)#
=-B*(C+D)#     i
-B*(C+D)#      i=        --
B*(C+D)#       i=-       ---
*(C+D)#        i=-i
*(C+D)#        i=-E      ---B
*(C+D)#        i=E       --T1
(C+D)#         i=E*      --T1-
C+D)#          i=E*(     --T1--
+D)#           i=E*(i
+D)#           i=E*(E    --T1--C

39 3.The translation scheme for addressing array elements
1) Grammar
A → V:=E
V → i[Elist] | i
Elist → Elist,E | E
E → E op E | (E) | V

40 3.The translation scheme for addressing array elements
2) Rewriting of the grammar
A → V:=E
V → Elist] | i
Elist → Elist(1),E | i[E
E → E op E | (E) | V
Notes: The rewriting makes the dimension limits nj of the array available as the index expressions are grouped into an Elist.

41 3.The translation scheme for addressing array elements
3) semantic variables ARRAY DIM PLACE OFFSET

42 3.The translation scheme for addressing array elements
4) Translation code
(1) A → V:=E
{if (V•OFFSET=null) GEN(=, E•PLACE, _, V•PLACE);
 else GEN([ ]=, E•PLACE, _, V•PLACE[V•OFFSET])}

43 (2) E → E(1) op E(2)
{T=NEWTEMP(); GEN(op, E(1)•PLACE, E(2)•PLACE, T); E•PLACE=T}
(3) E → (E(1)) {E•PLACE=E(1)•PLACE}
(4) E → V
{if (V•OFFSET=null) E•PLACE=V•PLACE;
 else {T=NEWTEMP(); GEN(=[ ], V•PLACE[V•OFFSET], _, T); E•PLACE=T;}}

44 (5) V → Elist]
{if (TYPE[ARRAY]<>1)
   {T=NEWTEMP(); GEN(*, Elist•PLACE, TYPE[ARRAY], T); Elist•PLACE=T;}
 V•OFFSET=Elist•PLACE;
 T=NEWTEMP(); GEN(-, HEAD[ARRAY], CONS[ARRAY], T);
 V•PLACE=T}
(6) V → i {V•PLACE=ENTRY[i]; V•OFFSET=null}

45 (7) Elist → Elist(1),E
{T=NEWTEMP();
 k=Elist(1)•DIM+1; dk=LIMIT(Elist(1)•ARRAY, k);
 GEN(*, Elist(1)•PLACE, dk, T);
 T1=NEWTEMP(); GEN(+, T, E•PLACE, T1);
 Elist•ARRAY=Elist(1)•ARRAY; Elist•PLACE=T1; Elist•DIM=k;}

46 (8) Elist → i[E {Elist•PLACE=E•PLACE; Elist•DIM=1; Elist•ARRAY=ENTRY(i)}

47 E.g. Let A be an array ARRAY[1:10,1:20]; the start address of the array is a, and m=1.
C is computed as (low1*n2+low2)*m=(1*20+1)*1=21
The quadruples for X=A[I,J] are:
(1) (*, I, 20, T1)
(2) (+, T1, J, T2)
(3) (-, a, 21, T3)
(4) (=[ ], T3[T2], _, T4)
(5) (=, T4, _, X)

48 8.3 Boolean expressions
1. Primary purposes of boolean expressions
Compute logical values
Used as conditional expressions in statements that alter the flow of control, such as if or while statements.
2. Grammar
E → E and E | E or E | not E | (E) | i | Ea rop Ea

49 3.Numerical representation
(1) E → Ea(1) rop Ea(2) {T=NEWTEMP(); GEN(rop, Ea(1)•PLACE, Ea(2)•PLACE, T); E•PLACE=T}
(2) E → E(1) bop E(2) {T=NEWTEMP(); GEN(bop, E(1)•PLACE, E(2)•PLACE, T); E•PLACE=T}

50 3.Numerical representation
(3) E → not E(1) {T=NEWTEMP(); GEN(not, E(1)•PLACE, _, T); E•PLACE=T}
(4) E → (E(1)) {E•PLACE=E(1)•PLACE}
(5) E → i {E•PLACE=ENTRY(i)}

51 3.Numerical representation
E.g. X+Y>Z or A and (not B or C)
(+, X, Y, T1) ; E+E
(>, T1, Z, T2) ; E>E
(not, B, _, T3) ; not E
(or, T3, C, T4) ; E or E
(and, A, T4, T5) ; E and E
(or, T2, T5, T6) ; E or E

52 4.Short-circuit code Translate a boolean expression into intermediate code without evaluating the entire expression. Represent the value of an expression by a position in the code sequence.

53 E.g. if A or B<D then S1 else S2
(1) (jnz,A,_,(5)) ; E.true, to S1
(2) (j,_,_,(3)) ; E.false, look at the right operand of or
(3) (j<,B,D,(5)) ; Ea.true, to S1
(4) (j,_,_,(p+1)) ; Ea.false, to S2
(5) S1 ……
(p) (j,_,_,(q)) ; jump over S2
(p+1) S2
(q) the code after S2

54 8.4 Backpatching 1.Why and what is backpatching?
When generating code for boolean expressions and flow-of-control statements, we may not yet know the labels that control must go to. We can get around this problem by generating a series of branching statements with the targets of the jumps temporarily left unspecified. Each such statement is put on a list of goto statements whose labels will be filled in when the proper label can be determined. This subsequent filling in of labels is called backpatching.

55 2.Functions to manipulate lists of labels related to backpatching
Makelist(i): creates a new list containing only i, an index into the array of quadruples; makelist returns a pointer to the list it has made.
Merge(p1,p2): concatenates the lists pointed to by p1 and p2, and returns a pointer to the concatenated list.
Backpatch(p,i): inserts i as the target label for each of the statements on the list pointed to by p.
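One common way to implement these lists without extra storage is to thread each list through the still-empty target fields of the jump quadruples, with 0 marking the end of a chain. The sketch below assumes that representation; the global quads array, the gen helper, and 1-based quadruple numbering are illustrative choices, not something the slides prescribe.

```python
quads = [None]   # index 0 unused, so quadruple numbers start at 1


def gen(op, arg1, arg2, target=0):
    """Append a quadruple and return its number."""
    quads.append([op, arg1, arg2, target])
    return len(quads) - 1


def makelist(i):
    # A one-element chain: quadruple i, whose target field is still 0.
    return i


def merge(p1, p2):
    """Link chain p1 behind chain p2 and return the head of the result."""
    if p2 == 0:
        return p1
    q = p2
    while quads[q][3] != 0:      # walk to the end of chain p2
        q = quads[q][3]
    quads[q][3] = p1
    return p2


def backpatch(p, target):
    """Fill `target` into every quadruple on the chain headed by p."""
    while p != 0:
        following = quads[p][3]  # remember the link before overwriting it
        quads[p][3] = target
        p = following


first = gen("j", "_", "_")
second = gen("j", "_", "_")
third = gen("j", "_", "_")
chain = merge(merge(makelist(first), makelist(second)), makelist(third))
chain_head_link = quads[chain][3]   # third links to second before patching
backpatch(chain, 7)
```

After the call to backpatch, all three jumps target quadruple 7 and the chain links are gone, since the link and the final target share one field.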

56 3. Boolean expression
1) Modify the grammar
E → EA E | E0 E | not E | (E) | i | Ea rop Ea
EA → E and
E0 → E or
2) Semantic Rules
(1) E → i {E•TC=NXQ; E•FC=NXQ+1; GEN(jnz, ENTRY(i), _, 0); GEN(j, _, _, 0)}

57 3. Boolean expression 2) Semantic Rules
(2) E → Ea(1) rop Ea(2) {E•TC=NXQ; E•FC=NXQ+1; GEN(jrop, Ea(1)•PLACE, Ea(2)•PLACE, 0); GEN(j, _, _, 0)}
(3) E → (E(1)) {E•TC=E(1)•TC; E•FC=E(1)•FC}
(4) E → not E(1) {E•TC=E(1)•FC; E•FC=E(1)•TC}

58 3. Boolean expression 2) Semantic Rules
(5) EA → E(1) and {BACKPATCH(E(1)•TC, NXQ); EA•FC=E(1)•FC;}
(6) E → EA E(2) {E•TC=E(2)•TC; E•FC=MERG(EA•FC, E(2)•FC)}

59 3. Boolean expression 2) Semantic Rules
(7) E0 → E(1) or {BACKPATCH(E(1)•FC, NXQ); E0•TC=E(1)•TC;}
(8) E → E0 E(2) {E•FC=E(2)•FC; E•TC=MERG(E0•TC, E(2)•TC)}

60 Translate A and B or not C
INPUT                SYM           TC      FC      quadruple
A and B or not C#    #             -       -
and B or not C#      #i            --      --
and B or not C#      #E            -1      -2      1.(jnz,A,-,(3))
B or not C#          #E and        -1-     -2-     2.(j,-,-,(5))
B or not C#          #EA           --      -2
or not C#            #EAi          ---     -2-

61 Translate A and B or not C (continued)
INPUT                SYM           TC      FC      quadruple
or not C#            #EAE          --3     -24     3.(jnz,B,-,0)
or not C#            #E            -3      -4      4.(j,-,-,(5))
not C#               #E or         -3-     -4-
not C#               #E0           -3      --
C#                   #E0 not       -3-     ---
#                    #E0 not i     -3--    ----
#                    #E0 not E     -3-5    ---6    5.(jnz,C,-,0)
#                    #E0 E         -36     --5     6.(j,-,-,3)
#                    #E            -6      -5      success
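The translation of A and B or not C can also be reproduced by a short recursive translator that follows the TC/FC rules. The sketch below makes one simplifying assumption: it keeps the true and false chains as Python lists instead of threading them through the target fields, so a jump that is still unresolved shows target 0 rather than a chain link. The class name BoolTranslator and the tuple encoding of expressions are hypothetical.

```python
class BoolTranslator:
    """Backpatching translation of boolean expressions into jump quadruples."""

    def __init__(self):
        self.quads = [None]          # index 0 unused; numbering starts at 1

    @property
    def nxq(self):
        return len(self.quads)       # number of the next quadruple

    def gen(self, op, arg1, arg2):
        self.quads.append([op, arg1, arg2, 0])

    def backpatch(self, chain, target):
        for q in chain:
            self.quads[q][3] = target

    def translate(self, e):
        """Return (TC, FC): lists of quadruples awaiting the true/false exit."""
        kind = e[0]
        if kind == "id":
            tc, fc = [self.nxq], [self.nxq + 1]
            self.gen("jnz", e[1], "_")
            self.gen("j", "_", "_")
            return tc, fc
        if kind == "not":
            tc, fc = self.translate(e[1])
            return fc, tc                        # swap the two exits
        if kind == "and":
            tc1, fc1 = self.translate(e[1])
            self.backpatch(tc1, self.nxq)        # true exit of E1 enters E2
            tc2, fc2 = self.translate(e[2])
            return tc2, fc1 + fc2
        if kind == "or":
            tc1, fc1 = self.translate(e[1])
            self.backpatch(fc1, self.nxq)        # false exit of E1 enters E2
            tc2, fc2 = self.translate(e[2])
            return tc1 + tc2, fc2
        raise ValueError(f"unknown node kind: {kind}")


translator = BoolTranslator()
expr = ("or", ("and", ("id", "A"), ("id", "B")), ("not", ("id", "C")))
true_chain, false_chain = translator.translate(expr)
```

The true chain ends up holding quadruples 3 and 6 and the false chain quadruple 5, which corresponds to the final TC of 6 (linked to 3) and FC of 5 in the trace.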

62 4.Flow of control statements
1) Modify the grammar
S → if E then S(1) else S(2) becomes:
C → if E then
T → C S(1) else
S → T S(2)
S → if E then S(1) becomes:
S → C S(1)

63 4.Flow of control statements
2) Semantic Rules
C → if E then {BACKPATCH(E•TC, NXQ); C•CHAIN=E•FC;}
T → C S(1) else {q=NXQ; GEN(j, _, _, 0); BACKPATCH(C•CHAIN, NXQ); T•CHAIN=MERG(S(1)•CHAIN, q)}
S → T S(2) {S•CHAIN=MERG(T•CHAIN, S(2)•CHAIN)}
S → C S(1) {S•CHAIN=MERG(C•CHAIN, S(1)•CHAIN)}

64 e.g. if a then if b then A:=2 else A:=3 else if c then A:=4 else A:=5
(1) (jnz,a,_,0)
(2) (j,_,_,0)

65 if a then if b then A:=2 else A:=3 else if c then A:=4 else A:=5
(1) (jnz,a,_,(3))
(2) (j,_,_,0)
(3) (jnz,b,_,0)
(4) (j,_,_,0)
Ca•CHAIN→2

66 if a then if b then A:=2 else A:=3 else if c then A:=4 else A:=5
(1) (jnz,a,_,(3))
(2) (j,_,_,0)
(3) (jnz,b,_,(5))
(4) (j,_,_,0)
(5) (:=,2,_,A)
Ca•CHAIN→2 Cb•CHAIN→4

67 if a then if b then A:=2 else A:=3 else if c then A:=4 else A:=5
(1) (jnz,a,_,(3))
(2) (j,_,_,0)
(3) (jnz,b,_,(5))
(4) (j,_,_,(7))
(5) (:=,2,_,A)
(6) (j,_,_,0)
Ca•CHAIN→2 Cb•CHAIN→6

68 Answer
(1) (jnz,a,_,(3))
(2) (j,_,_,(9))
(3) (jnz,b,_,(5))
(4) (j,_,_,(7))
(5) (:=,2,_,A)
(6) (j,_,_,0)
(7) (:=,3,_,A)
(8) (j,_,_,6)
(9) (jnz,c,_,(11))
(10) (j,_,_,(13))
(11) (:=,4,_,A)
(12) (j,_,_,8)
(13) (:=,5,_,A)
S•CHAIN→6→8→12

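69 [Flowchart of the generated code, with TRUE and FALSE edges: test a (quadruples 1,2); test b (3,4); S1 (A:=2) at 5; jump at 6; S2 (A:=3) at 7; jump at 8; test c (9,10); S3 (A:=4) at 11; jump at 12; S4 (A:=5) at 13]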

70 4.Flow of control statements
3) While statement
S → while E do S(1) becomes:
W → while
Wd → W E do
S → Wd S(1)

71 4.flow of control statements
3) While statement
W → while {W•QUAD=NXQ}
Wd → W E do {BACKPATCH(E•TC, NXQ); Wd•CHAIN=E•FC; Wd•QUAD=W•QUAD;}
S → Wd S(1) {BACKPATCH(S(1)•CHAIN, Wd•QUAD); GEN(j, _, _, Wd•QUAD); S•CHAIN=Wd•CHAIN}

72 4.flow of control statements 3) While statement
[Figure: code layout for the while statement, showing the code of E, the code of S(1), and S.CHAIN]

73 e.g. while (A<B) do if (C<D) then X:=Y+Z;
(100) (j<,A,B,0)

74 e.g. while (A<B) do if (C<D) then X:=Y+Z;
(100) (j<,A,B,(102))
(101) (j,_,_,0)
(102) (j<,C,D,0)
(103) (j,_,_,(100))

75 e.g. while (A<B) do if (C<D) then X:=Y+Z;
(100) (j<,A,B,(102))
(101) (j,_,_,0)
(102) (j<,C,D,(104))
(103) (j,_,_,(100))
(104) (+,Y,Z,T1)
(105) (:=,T1,_,X)

76 e.g. while (A<B) do if (C<D) then X:=Y+Z;
(100) (j<,A,B,(102))
(101) (j,_,_,(107))
(102) (j<,C,D,(104))
(103) (j,_,_,(100))
(104) (+,Y,Z,T1)
(105) (:=,T1,_,X)
(106) (j,_,_,(100))
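The final quadruples for this example can be regenerated with a small generator that applies the while and if-then backpatching rules. In the sketch below, chains are plain Python lists, conditions and bodies are passed as callables so that they emit code at the right moment, and the class name QuadGen and its method names are assumptions made for illustration.

```python
class QuadGen:
    def __init__(self, start=100):
        self.start = start
        self.quads = []

    @property
    def nxq(self):
        return self.start + len(self.quads)

    def gen(self, op, arg1, arg2, result):
        self.quads.append([op, arg1, arg2, result])

    def backpatch(self, chain, target):
        for q in chain:
            self.quads[q - self.start][3] = target

    def rel(self, op, x, y):
        """E -> Ea rop Ea: returns (TC, FC)."""
        tc, fc = [self.nxq], [self.nxq + 1]
        self.gen("j" + op, x, y, 0)
        self.gen("j", "_", "_", 0)
        return tc, fc

    def if_then(self, cond, body):
        tc, fc = cond()
        self.backpatch(tc, self.nxq)       # true exit enters S(1)
        return fc + body()                 # S.CHAIN = MERG(C.CHAIN, S(1).CHAIN)

    def while_do(self, cond, body):
        begin = self.nxq                   # W.QUAD
        tc, fc = cond()
        self.backpatch(tc, self.nxq)
        self.backpatch(body(), begin)      # S(1).CHAIN goes back to the test
        self.gen("j", "_", "_", begin)
        return fc                          # S.CHAIN = Wd.CHAIN


g = QuadGen(start=100)


def assign_x():
    g.gen("+", "Y", "Z", "T1")
    g.gen(":=", "T1", "_", "X")
    return []                              # an assignment leaves no open jumps


exit_chain = g.while_do(
    lambda: g.rel("<", "A", "B"),
    lambda: g.if_then(lambda: g.rel("<", "C", "D"), assign_x),
)
g.backpatch(exit_chain, g.nxq)             # the statement after the loop is 107
```

The last call plays the role of the enclosing statement, which fills the loop's remaining false exit with the number of the next quadruple.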

