# Translation

## Presentation transcript

### Translation

code → translation → converted code

In compilers, translation is the task of creating executable code from source code. How is it done? So far, we have analysed the code, identifying the "words" and the syntactic form. Together, these help us understand the meaning of the code, so we will use the structure we have identified to create the target code.

### Postfix Notation

Postfix notation is a method for writing expressions that is unambiguous and corresponds to the processing order we use in bottom-up parsing.

- `a+b` is written as `ab+`
- `a*b` is written as `ab*`

In general, we define it recursively:

- postfix for `E1 op E2` is (postfix for `E1`)(postfix for `E2`) `op`
- postfix for `(E)` is (postfix for `E`)

### Postfix example

We can write `a+(b*c)*(b*(a+b))` as `abc*bab+**+`.

To read a postfix expression, start from the left and move right. Each time we reach an operator, we take the correct number of the operands we have most recently recognised and combine them into a new expression. Above, `b` and `c` are the operands of the first `*`, `a` and `b` are the operands of the first `+`, and so on.

Labelling the operators, we get `a +₁ (b *₁ c) *₂ (b *₃ (a +₂ b))`, which translates to `a b c *₁ b a b +₂ *₃ *₂ +₁`.
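The recursive definition of postfix can be sketched in Python. This is a minimal illustration, not part of the original slides: the name `to_postfix`, the single-character operands, and the restriction to `+`, `*` and parentheses are all assumptions made for the example.

```python
def to_postfix(expr: str) -> str:
    """Convert an infix expression over + and * to postfix.

    Assumes single-character operands and the usual precedence
    (* binds tighter than +); both are illustrative choices.
    """
    tokens = [c for c in expr if not c.isspace()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        tok = advance()
        if tok == "(":
            inner = expression()         # postfix for (E) is postfix for E
            if advance() != ")":
                raise ValueError("expected ')'")
            return inner
        return tok                       # an operand is its own postfix form

    def term():
        out = factor()
        while peek() == "*":
            advance()
            out = out + factor() + "*"   # (postfix E1)(postfix E2) op
        return out

    def expression():
        out = term()
        while peek() == "+":
            advance()
            out = out + term() + "+"
        return out

    result = expression()
    if pos != len(tokens):
        raise ValueError("unexpected input after expression")
    return result


print(to_postfix("a+(b*c)*(b*(a+b))"))  # abc*bab+**+
```

The operator is appended only after both operand strings have been produced, which is the same order in which a bottom-up parser completes its reductions.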

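### Translating while parsing

| No. | Production | Action |
|---|---|---|
| 1 | S -> S + T | print("+") |
| 2 | S -> T | |
| 3 | T -> T * F | print("*") |
| 4 | T -> F | |
| 5 | F -> ( S ) | |
| 6 | F -> a | print(a) |

Parsing `a+a*a+a` produces the following sequence of reductions. Each `<=` is one reduction; a number marks a production that has an action, and the unnumbered steps use productions 4 and 2.

    a+a*a+a  <=6  F+a*a+a  <=  T+a*a+a  <=  S+a*a+a
             <=6  S+F*a+a  <=  S+T*a+a
             <=6  S+T*F+a  <=3 S+T+a    <=1 S+a
             <=6  S+F      <=  S+T      <=1 S

The productions with actions were applied in the order 6, 6, 6, 3, 1, 6, 1, which causes the output `aaa*+a+`.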
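As a rough illustration of how the actions fire, the sketch below replays the full reduction sequence for `a+a*a+a` (including rules 4 and 2, which have no action) and runs the action attached to each production. The names `ACTIONS` and `replay` are hypothetical, and the reduction order is supplied by hand; a real parser would choose the reductions itself.

```python
# Actions attached to productions; rules 2, 4 and 5 have none.
ACTIONS = {
    1: lambda out: out.append("+"),   # S -> S + T
    3: lambda out: out.append("*"),   # T -> T * F
    6: lambda out: out.append("a"),   # F -> a
}


def replay(reductions):
    """Run the action of each production in the order it was applied."""
    out = []
    for rule in reductions:
        action = ACTIONS.get(rule)
        if action is not None:
            action(out)
    return "".join(out)


# Reduction order for a+a*a+a, read off the parse sequence.
order = [6, 4, 2, 6, 4, 6, 3, 1, 6, 4, 1]
print(replay(order))  # aaa*+a+
```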

### The Value Stack

All we can do with the previous method is execute an action whenever a rule is used. We can't store up actions for future use. However, we can extend the idea by associating a value with each symbol on the stack. The actions we then carry out can use the values that have been stored.

Suppose that we are about to reduce by A -> x1 x2 ... xn. This means that the rightmost symbols on the symbol stack are x1 x2 ... xn. Call the values associated with those symbols `$1`, `$2`, ..., `$n`. When we carry out the reduction, we remove those symbols and replace them by A.

### Changing the Value Stack

Remove the top n values from the value stack, and replace them by a new value for A, which we will call `$$`. The value we want to store for A will depend on the values we stored for the xi. That is, `$$ = f($1, $2, ..., $n)` for some function f.

The only other case we need to consider is when we place a terminal on the stack. Where do we get its value? Generally, we expect the lexical analyser to find the value for us. This means that in your Lex script, every time you recognise an integer or a real, you must translate it into a number of the appropriate form.

### Computing the value of expressions

| No. | Production | Action |
|---|---|---|
| 1 | S -> S + T | `$$ := $1 + $3` |
| 2 | S -> T | `$$ := $1` |
| 3 | T -> T * F | `$$ := $1 * $3` |
| 4 | T -> F | `$$ := $1` |
| 5 | F -> ( S ) | `$$ := $2` |
| 6 | F -> 1 | `$$ := 1` |
| 7 | F -> 2 | `$$ := 2` |
| 8 | F -> 3 | `$$ := 3` |

Note: this is a simplification. In practice, we would have 6) F -> a, and expect the lexical analyser to return the different integer values.

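### Parsing the expressions

| Stack | Symbols | Values | Input | Action |
|---|---|---|---|---|
| 0 | | | `1+2*3#` | S5 |
| 0 5 | `1` | 1 | `+2*3#` | R6 |
| 0 3 | `F` | 1 | `+2*3#` | R4 |
| 0 2 | `T` | 1 | `+2*3#` | R2 |
| 0 1 | `S` | 1 | `+2*3#` | S6 |
| 0 1 6 | `S+` | 1 | `2*3#` | S5 |
| 0 1 6 5 | `S+2` | 1 2 | `*3#` | R6 |
| 0 1 6 3 | `S+F` | 1 2 | `*3#` | R4 |
| 0 1 6 9 | `S+T` | 1 2 | `*3#` | S7 |
| 0 1 6 9 7 | `S+T*` | 1 2 | `3#` | S5 |
| 0 1 6 9 7 5 | `S+T*3` | 1 2 3 | `#` | R6 |
| 0 1 6 9 7 10 | `S+T*F` | 1 2 3 | `#` | R3 |
| 0 1 6 9 | `S+T` | 1 6 | `#` | R1 |
| 0 1 | `S` | 7 | `#` | A |

Every digit is reduced by R6 in this trace, as in the practical grammar with 6) F -> a. The Values column lists only the numeric values; the operators carry none.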
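The effect of this trace on the value stack can be replayed in a few lines of Python. This is only a sketch under stated assumptions: the action sequence is copied from the trace rather than derived from a parse table, the state numbers in the shift actions are ignored, rule 6 is taken to be the practical form "F -> digit", and operators occupy a value-stack slot holding `None`. The names `RULES` and `run` are hypothetical.

```python
# Rule number -> (length of right-hand side, function computing $$ from [$1..$n]).
RULES = {
    1: (3, lambda v: v[0] + v[2]),   # S -> S + T    $$ := $1 + $3
    2: (1, lambda v: v[0]),          # S -> T        $$ := $1
    3: (3, lambda v: v[0] * v[2]),   # T -> T * F    $$ := $1 * $3
    4: (1, lambda v: v[0]),          # T -> F        $$ := $1
    5: (3, lambda v: v[1]),          # F -> ( S )    $$ := $2
    6: (1, lambda v: v[0]),          # F -> digit    value supplied by the lexer
}


def run(tokens, actions):
    """Replay shift/reduce actions while maintaining the value stack."""
    values = []
    tokens = list(tokens)
    for act in actions:
        if act == "A":                        # accept: result is on top
            return values[-1]
        kind, number = act[0], int(act[1:])
        if kind == "S":                       # shift: push the token's value
            tok = tokens.pop(0)
            values.append(int(tok) if tok.isdigit() else None)
        else:                                 # reduce by rule `number`
            n, f = RULES[number]
            args = values[-n:]                # $1 .. $n
            del values[-n:]
            values.append(f(args))            # push $$
    raise ValueError("actions ended without accept")


trace = ["S5", "R6", "R4", "R2", "S6", "S5", "R6", "R4",
         "S7", "S5", "R6", "R3", "R1", "A"]
print(run("1+2*3", trace))  # 7
```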

### Value Stack in Lex

Lex must place the values in `yylval`:

1. digit string: compute the value and place it in `yylval`
2. char string: copy it to a string array and place the index of its start point in `yylval`
3. real string: convert it to a floating point number, store it in an array of reals, and place the index of its start point in `yylval`
4. identifier: store as for strings

### Lex and yylval

    %{
    #include "y.tab.h"
    #include <stdlib.h>   /* for atoi */
    extern int yylval;
    %}
    %%
    [0-9]+   { yylval = atoi(yytext);   /* convert string to integer */
               return INT_T; }
    [ \t]    ;                          /* ignore space */
             /* ... */
    %%
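For readers without Lex to hand, the same idea (each token arrives with a value already computed by the lexical analyser) can be sketched in Python. The token names follow the slides; the regular expressions, the `tokenize` function and the use of `None` for valueless tokens are assumptions made for the illustration.

```python
import re

# Token names mirror those used in the slides; the patterns are illustrative.
TOKEN_SPEC = [
    ("INT_T",  r"[0-9]+"),
    ("PLUS_T", r"\+"),
    ("MUL_T",  r"\*"),
    ("OB_T",   r"\("),
    ("CB_T",   r"\)"),
    ("SKIP",   r"[ \t]+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))


def tokenize(text):
    """Return (token, value) pairs; value plays the role of yylval."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r}")
        pos = m.end()
        kind = m.lastgroup
        if kind == "SKIP":
            continue                      # ignore spaces and tabs
        # Only integers carry a value, converted from the digit string.
        value = int(m.group()) if kind == "INT_T" else None
        tokens.append((kind, value))
    return tokens


print(tokenize("12 + 3*4"))
```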

### Value Stack in Yacc

Yacc allows an action after each production. The action is performed immediately before the reduction. Values are represented using the `$$` and `$i` notation. When Yacc reaches the statement, it translates the different `$i`s into their appropriate types.

### Using Yacc's Value Stack

    %%
    Finish : Expr                  { printf("%d", $1); }
           ;
    Expr   : Expr PLUS_T Term      { $$ = $1 + $3; }
           | Term
           ;
    Term   : Term MUL_T Factor     { $$ = $1 * $3; }
           | Factor
           ;
    Factor : OB_T Expr CB_T        { $$ = $2; }
           | INT_T
           ;
    %%

### Syntax-Directed Translation

Yacc allows us to use the value stack. However, this method only allows us to associate a single value with each symbol. We may want to record more information:

- data types
- places in the symbol table
- code fragments

We will extend the idea of the value stack by associating multiple values with symbols.

### Attributes

With each symbol in the grammar, associate a set of attributes. The attributes can be of any type, and represent any information we can express.

With each production in the grammar, associate a set of semantic rules, determining how the values of the attributes are to be computed. The computation can modify the values of the attributes, can have side-effects that modify some external structure (e.g. the symbol table), or can output results to the screen or to a file.

### Formal attribute definition

p) A -> α is a grammar rule.

p) has associated with it a set of semantic functions of the form b := f(c1, c2, ..., cn), where b, c1, c2, ..., cn are attributes of any symbol appearing in p).

- If b is an attribute of A, then b is a synthesised attribute.
- If b is an attribute of one of the symbols in α, then b is an inherited attribute.

### Syntax-directed Definition: Example

| No. | Production | Semantic rule |
|---|---|---|
| 1 | S -> E | print(E.val) |
| 2 | E₁ -> E₂ + T | E₁.val := E₂.val + T.val |
| 3 | E -> T | E.val := T.val |
| 4 | T₁ -> T₂ * F | T₁.val := T₂.val * F.val |
| 5 | T -> F | T.val := F.val |
| 6 | F -> ( E ) | F.val := E.val |
| 7 | F -> digit | F.val := digit.lexval |

### Synthesised Attributes

The value of a synthesised attribute either comes from the child nodes, or from the properties of the symbol itself. As soon as a symbol is recognised in bottom-up parsing, the values of its synthesised attributes can be obtained. Thus, if a derivation of a string uses only symbols with synthesised attributes, we can evaluate all the attributes as we carry out the parse.

A syntax-directed definition that uses only synthesised attributes is called an S-attributed definition.

### Annotated Parse Tree

[Figure: annotated parse tree for `6 + 2 * 3`. Each digit leaf carries its lexval (6, 2, 3); val is passed up through F and T, giving val = 6 for the subtree `2 * 3` and val = 12 at the root.]
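The bottom-up computation of `val` for `6 + 2 * 3` can be sketched as a post-order walk over the parse tree. The tree encoding (tuples of a symbol and a list of children, with operators as plain strings) and the helper names `val` and `leaf` are assumptions made for this example, not something the slides prescribe.

```python
def val(node):
    """Compute the synthesised attribute val for a parse-tree node.

    A node is (symbol, children); a digit leaf is ("digit", lexval).
    Operators and parentheses appear among the children as strings.
    """
    symbol, children = node
    if symbol == "digit":
        return children                      # digit.lexval
    vals = [val(c) for c in children if isinstance(c, tuple)]
    ops = [c for c in children if isinstance(c, str)]
    if "+" in ops:
        return vals[0] + vals[1]             # E1.val := E2.val + T.val
    if "*" in ops:
        return vals[0] * vals[1]             # T1.val := T2.val * F.val
    return vals[0]                           # copy rules, including F -> ( E )


def leaf(n):
    """Parse-tree fragment for F -> digit."""
    return ("F", [("digit", n)])


# Parse tree for 6 + 2 * 3
tree = ("E", [
    ("E", [("T", [leaf(6)])]),
    "+",
    ("T", [("T", [leaf(2)]), "*", leaf(3)]),
])
print(val(tree))  # 12
```

Because every attribute here is synthesised, each node's value depends only on its children, so a single post-order traversal is enough.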

### Inherited Attributes

An inherited attribute has its value determined by the attribute values of its parent or siblings. Inherited attributes are useful for describing the way in which the meaning of a symbol depends upon the context in which it appears. For example, the meaning of the identifier "num" is different in the two cases below:

- `real num;`
- `int num;`

Thus a "type" attribute cannot be determined from the symbol alone, but must be derived from the attributes of parent or sibling symbols.

### Inherited Attribute Example

| No. | Production | Semantic rules |
|---|---|---|
| 1 | D -> T L | L.t := T.t |
| 2 | T -> int | T.t := integer |
| 3 | T -> real | T.t := real |
| 4 | L₁ -> L₂ , id | L₂.t := L₁.t; addtype(id.entry, L₁.t) |
| 5 | L -> id | addtype(id.entry, L.t) |
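A small Python sketch shows how the inherited attribute `t` travels down the list of identifiers in a declaration such as `real id1, id2, id3`. The tuple encoding of the tree, the dictionary used as a symbol table, and the functions `declare` and `declare_list` are hypothetical; only `addtype` is named in the slides, and its body here is an assumed stand-in.

```python
symbol_table = {}


def addtype(entry, t):
    """Record type t for a symbol-table entry (here, simply the name)."""
    symbol_table[entry] = t


def declare_list(L, t):
    """Process an identifier list with inherited attribute L.t = t.

    L is ("L", sub_list, ident) for L1 -> L2 , id
    or   ("L", None, ident)     for L -> id.
    """
    _, sub_list, ident = L
    if sub_list is not None:
        declare_list(sub_list, t)     # L2.t := L1.t
    addtype(ident, t)                 # addtype(id.entry, L.t)


def declare(D):
    """D is (type_name, L) for D -> T L."""
    type_name, L = D
    t = {"int": "integer", "real": "real"}[type_name]   # T.t
    declare_list(L, t)                                  # L.t := T.t


# real id1, id2, id3
L = ("L", ("L", ("L", None, "id1"), "id2"), "id3")
declare(("real", L))
print(symbol_table)
```

The type is known only at the `D -> T L` node, so it has to be handed down to each `L` before any `addtype` call can run.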

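### Augmented Parse Tree

[Figure: augmented parse tree for `real id1, id2, id3`, showing t = real on T and on each L node, and the entry attribute of each id feeding an addentry(...) call.]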

### Information Flow

[Figure: the same tree for `real id1, id2, id3`, with arrows showing how t = real and each entry flow into the addentry(...) calls.]

### Dependency Graph

The augmented parse tree on the previous slide is called a dependency graph. We use dependency graphs to determine the order in which we must evaluate the attributes to get a completely evaluated parse tree.

A topological sort of the graph is an ordering of its attributes in which every attribute comes after the attributes it depends on, so it is a valid order in which to evaluate them.
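One way to obtain such an order is a topological sort of the dependency graph; the sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later). The node labels are invented for the declaration `real id1, id2, id3`, with `L1` as the topmost list node, and are not taken from the slides.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each attribute (or action) maps to the set of attributes it depends on.
# The labels are illustrative names for the nodes of the dependency graph.
deps = {
    "L1.t": {"T.t"},                        # D  -> T L       L.t  := T.t
    "L2.t": {"L1.t"},                       # L1 -> L2 , id3  L2.t := L1.t
    "L3.t": {"L2.t"},                       # L2 -> L3 , id2  L3.t := L2.t
    "addtype(id3)": {"L1.t", "id3.entry"},
    "addtype(id2)": {"L2.t", "id2.entry"},
    "addtype(id1)": {"L3.t", "id1.entry"},
}

# static_order() yields every node after all of its predecessors.
order = list(TopologicalSorter(deps).static_order())
print(order)
```

Several orders are valid for this graph, so the exact list printed may vary; what matters is that `T.t` precedes every `L.t`, and each `addtype` call comes after the attributes it reads.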

### Topological Sort

[Figure: the dependency graph for `real id1, id2, id3` with its nodes numbered 1 to 10 in a topological order.]

### Evaluation methods

- parse-tree based: at compile time, construct a parse tree, then a dependency graph, then a topological sort. Evaluate the attributes in that order.
- rule based: when the compiler is constructed, analyse the rules for dependencies between attributes, and fix the order of evaluation before compilation begins.
- oblivious: use a fixed evaluation order without analysing the dependencies. This limits the class of grammars that can be implemented.

### Syntax Trees

A syntax tree is a condensed parse tree in which the operators and keywords do not appear as leaves. Instead, they are associated with the interior nodes that would have been their parents in the parse tree.

Example: S => if B then S1 else S2 has a syntax tree consisting of a single `if-then-else` node with three children: B, S1 and S2.

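[Figure: the parse tree for `6 + 2 * 3` (with E, T and F nodes) next to its syntax tree, a `+` node with children 6 and a `*` node, which in turn has children 2 and 3.]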

### Using Syntax Trees

A syntax tree allows the translation process to be separated from the parsing process:

- A grammar that is best for parsing might not explicitly represent the hierarchical nature of the programs it describes.
- The parsing method imposes an order in which the nodes are considered, which might not be the best order for translation.

### Constructing Syntax Trees

We can use a syntax-directed definition to create syntax trees, much as we created postfix expressions. We will represent each node as a simple data structure. Operator structures will have a name and a number of fields containing pointers to each operand. Simple operand structures will have a type and a value.

E.g. `2+3` will be represented by a `+` node whose two fields point to the leaves (num, 2) and (num, 3).

### Functions

We require the following three functions:

- `mknode(op, left, right)`: creates an internal node for the operator "op", with two fields for pointers to the left and right operands.
- `mkleaf_id(id, entry)`: creates a leaf node for the identifier "id", with a field for a pointer to the symbol table entry for "id".
- `mkleaf_num(num, val)`: creates a leaf node, labelled "num", with a field for the value of the number.

Each function returns a pointer to the node just created.

### Example Definition

| No. | Production | Semantic rule |
|---|---|---|
| 1 | E₁ -> E₂ + T | E₁.ptr := mknode("+", E₂.ptr, T.ptr) |
| 2 | E -> T | E.ptr := T.ptr |
| 3 | T₁ -> T₂ * F | T₁.ptr := mknode("*", T₂.ptr, F.ptr) |
| 4 | T -> F | T.ptr := F.ptr |
| 5 | F -> ( E ) | F.ptr := E.ptr |
| 6 | F -> id | F.ptr := mkleaf_id(id, id.entry) |
| 7 | F -> num | F.ptr := mkleaf_num(num, num.val) |

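### Constructing 6+2*x

[Figure: parse tree for `6+2*x` annotated with ptr attributes. The resulting syntax tree is a `+` node whose children are the leaf (num, 6) and a `*` node with children (num, 2) and the id leaf for x.]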
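The three node-building functions and the construction of the tree for `6+2*x` can be sketched in Python. The `Node` dataclass, its field names and the `show` helper are assumptions made for the illustration; object references stand in for the pointers described in the slides, and the string "x" stands in for a symbol table entry.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Node:
    label: str            # operator name, "id" or "num"
    left: Any = None      # left operand, or None for a leaf
    right: Any = None     # right operand, or None for a leaf
    value: Any = None     # symbol-table entry or numeric value for a leaf


def mknode(op, left, right):
    """Internal node for operator op with two operand fields."""
    return Node(op, left, right)


def mkleaf_id(id_, entry):
    """Leaf for an identifier, holding its symbol-table entry."""
    return Node(id_, value=entry)


def mkleaf_num(num, val):
    """Leaf labelled num, holding the value of the number."""
    return Node(num, value=val)


def show(n):
    """Render a tree in prefix form, e.g. (+ 6 (* 2 x))."""
    if n.left is None and n.right is None:
        return str(n.value)
    return f"({n.label} {show(n.left)} {show(n.right)})"


# Nodes are created in the order the reductions happen for 6+2*x.
p6 = mkleaf_num("num", 6)        # F -> num
p2 = mkleaf_num("num", 2)        # F -> num
px = mkleaf_id("id", "x")        # F -> id
times = mknode("*", p2, px)      # T1 -> T2 * F
root = mknode("+", p6, times)    # E1 -> E2 + T
print(show(root))  # (+ 6 (* 2 x))
```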

### Compound Statements

- CStat -> Stat ; CStat
- CStat -> Stat
- Stat -> s

[Figure: parse tree for the input `s ; s ; s ; s`, a right-leaning chain of CStat nodes, each with a Stat child deriving s.]

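Semantic rules that build a syntax tree of nested `;` nodes:

| Production | Semantic rule |
|---|---|
| CStat₁ -> Stat ; CStat₂ | CStat₁.ptr := mknode(";", Stat.ptr, CStat₂.ptr) |
| CStat -> Stat | CStat.ptr := Stat.ptr |
| Stat -> s | Stat.ptr := mkleaf_id(id, s) |

[Figure: syntax tree for `s ; s ; s ; s`, a right-leaning chain of three `;` nodes with the four s leaves.]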

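Alternative rules that collect the statements as children of a single seq node:

| Production | Semantic rule |
|---|---|
| Seq -> CStat | Seq.ptr := CStat.ptr |
| CStat₁ -> Stat ; CStat₂ | CStat₁.ptr := CStat₂.ptr; addChild(CStat₁.ptr, Stat.ptr) |
| CStat -> Stat | CStat.ptr := mkXnode(Stat.ptr) |
| Stat -> s | Stat.ptr := mkleaf_id(id, s) |

[Figure: a seq node whose children are the four s leaves, each an id ... (s) ... leaf.]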
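The flat sequence node built by these rules can be sketched in Python. The slides do not define `mkXnode` or `addChild`, so the behaviour below is assumed: `mk_seq_node` creates a node labelled "seq" with one child, and `add_child` inserts at the front so that the children end up in source order. All names and the dictionary representation are hypothetical.

```python
def mk_seq_node(stat):
    """Assumed behaviour of mkXnode: a 'seq' node holding one child."""
    return {"label": "seq", "children": [stat]}


def add_child(seq, stat):
    """Assumed behaviour of addChild: prepend, keeping source order."""
    seq["children"].insert(0, stat)


def build(stats):
    """Mimic the bottom-up reductions for Stat ; Stat ; ... ; Stat."""
    # CStat -> Stat is reduced first, for the last statement.
    seq = mk_seq_node(stats[-1])
    # CStat1 -> Stat ; CStat2 is then reduced from right to left.
    for stat in reversed(stats[:-1]):
        add_child(seq, stat)
    return seq


print(build(["s1", "s2", "s3", "s4"]))
```

Because the grammar is right-recursive, the last statement is reduced first, which is why the earlier statements have to be added in front of it.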

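Alternative rules that chain the statements together as siblings:

| Production | Semantic rule |
|---|---|
| Seq -> CStat | Seq.ptr := CStat.ptr |
| CStat₁ -> Stat ; CStat₂ | CStat₁.ptr := Stat.ptr; addSib(Stat.ptr, CStat₂.ptr) |
| CStat -> Stat | CStat.ptr := Stat.ptr |
| Stat -> s | Stat.ptr := mkleaf_id(id, s) |

[Figure: a seq node pointing to the first s leaf, with each id ... (s) ... leaf linked to the next by a sibling pointer.]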

### Sample Program

    int a b c;
    int g[5];

    int testFunc(int x)
    {
        real y;
        y := (x+a)/2;
        print(y);
        return a;
    }

    main()
    {
        a := 1;
        while (a < 3) do {
            testFunc(a);
            a := a + 1;
        }
