Bernd Fischer COMP2010: Compiler Engineering Abstract Syntax Trees.


1 Bernd Fischer b.fischer@ecs.soton.ac.uk COMP2010: Compiler Engineering Abstract Syntax Trees

2 Parse trees represent derivations.

Grammar:
    E  → F EC
    EC → + F EC | - F EC | ε
    F  → T FC
    FC → * T FC | / T FC | ε
    T  → ( E ) | ID | NUM

Example input: (a+1)*b

[Figure: parse tree of (a+1)*b under the grammar above]

A parse tree:
    has paths that each correspond to a possible call stack
    contains "punctuation" tokens: (, ), begin, ... ⇒ concrete syntax tree
    contains redundant non-terminal symbols ⇒ chain rules: E → F → T
    ⇒ too much detail!

[Figure: desired tree for (a+1)*b: root *, with children + (over ID(a) and NUM(1)) and ID(b)]

How do we get there?

3 Abstract syntax trees represent the essential structure of derivations.

Abstract syntax drops detail:
    punctuation tokens
    chain productions

Abstract syntax rules can be ambiguous:
    they only describe the structure of legal trees
    they are not meant for parsing
    they usually allow unparsing (text reconstruction)

⇒ the abstract syntax tree (AST) is a clean interface

Concrete syntax:
    E  → F EC
    EC → + F EC | - F EC | ε
    F  → T FC
    FC → * T FC | / T FC | ε
    T  → ( E ) | ID | NUM

Abstract syntax:
    E → E + E | E - E | E * E | E / E | ID | NUM

[Figure: AST for (a+1)*b: root *, with children + (over ID(a) and NUM(1)) and ID(b)]
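The unparsing claim above can be illustrated with a small Java sketch. It is not from the slides: the class names (UnparseDemo, Node, Op, Id, Num) and the numeric precedence scheme are assumptions made for illustration. The AST for (a+1)*b stores no brackets, and unparse() re-inserts them only where the tree structure requires them.

```java
// Sketch only: all names and the precedence scheme are illustrative assumptions.
class UnparseDemo {
    static abstract class Node {
        abstract int prec();       // binding strength; higher binds tighter
        abstract String unparse(); // reconstructs source text from the tree
    }

    static class Num extends Node {
        final int val;
        Num(int val) { this.val = val; }
        int prec() { return 3; }
        String unparse() { return Integer.toString(val); }
    }

    static class Id extends Node {
        final String name;
        Id(String name) { this.name = name; }
        int prec() { return 3; }
        String unparse() { return name; }
    }

    static class Op extends Node {
        final char op;
        final Node left, right;
        Op(char op, Node left, Node right) { this.op = op; this.left = left; this.right = right; }
        int prec() { return (op == '*' || op == '/') ? 2 : 1; }
        String unparse() {
            // Operators are left-associative, so the right operand must bind strictly tighter.
            return wrap(left, prec()) + op + wrap(right, prec() + 1);
        }
    }

    // Unparses a child and brackets it if it binds more weakly than required.
    static String wrap(Node child, int minPrec) {
        String s = child.unparse();
        return child.prec() < minPrec ? "(" + s + ")" : s;
    }

    public static void main(String[] args) {
        Node ast = new Op('*', new Op('+', new Id("a"), new Num(1)), new Id("b"));
        System.out.println(ast.unparse()); // prints (a+1)*b
    }
}
```

The brackets are recovered from the tree shape alone, which is why the AST does not need to keep punctuation tokens.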

4 Manually building ASTs in Java

Design principle based on the abstract syntax grammar:
    one abstract class per non-terminal
    one concrete class per rule
        one field per non-terminal on the rhs

    public abstract class Expr {}

    public class Num extends Expr {
        public int val;
        public Num(int v) { val = v; }
    }

    public class Sum extends Expr {
        public Expr left, right;
        public Sum(Expr l, Expr r) { left = l; right = r; }
    }

    public class Diff extends Expr { …

Alternatively:

    public class Binop extends Expr {
        public Expr left, right;
        public int op;
        public Binop(Expr l, Expr r, int o) { left = l; right = r; op = o; }
    }

5 Manually building ASTs in Java

For error reporting, each node also records the source positions it spans:

    public abstract class Expr {
        public FilePos start, end;
    }

    public class Sum extends Expr {
        public Expr left, right;
        public Sum(Expr l, Expr r) {
            left = l; right = r;
            start = l.start; end = r.end;
        }
    }
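A self-contained sketch of the position-tracking idea follows. The slides do not define FilePos, so the line/column representation here is an assumption, as are the names PositionDemo and span. It shows how a Sum node inherits its start from its left child and its end from its right child.

```java
// Sketch only: FilePos as a (line, column) pair is an assumption; the slides leave it undefined.
class PositionDemo {
    static class FilePos {
        final int line, col; // both 1-based in this sketch
        FilePos(int line, int col) { this.line = line; this.col = col; }
        @Override public String toString() { return line + ":" + col; }
    }

    static abstract class Expr {
        FilePos start, end; // first and last source position covered by the node
    }

    static class Num extends Expr {
        final int val;
        Num(int val, FilePos start, FilePos end) {
            this.val = val; this.start = start; this.end = end;
        }
    }

    static class Sum extends Expr {
        final Expr left, right;
        Sum(Expr l, Expr r) {
            left = l; right = r;
            start = l.start; // a sum begins where its left operand begins
            end = r.end;     // and ends where its right operand ends
        }
    }

    // Formats the source range of a node for use in an error message.
    static String span(Expr e) { return e.start + "-" + e.end; }

    public static void main(String[] args) {
        // Positions for the text "12 + 7" on line 1.
        Expr left = new Num(12, new FilePos(1, 1), new FilePos(1, 2));
        Expr right = new Num(7, new FilePos(1, 6), new FilePos(1, 6));
        System.out.println(span(new Sum(left, right))); // prints 1:1-1:6
    }
}
```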

6 Manually building ASTs in Java (II)

    /* T -> ( E ) | Num */
    public static Expr T() throws ParseException {
        Expr r;
        switch (token) {
            case '(':
                advance();
                r = E();
                eat(')');
                return r;
            case '0': case '1': case '2': case '3': case '4':
            case '5': case '6': case '7': case '8': case '9':
                return Num();
            default:
                throw new ParseException("in T");
        }
    }

Changes compared to a parser that only recognizes the input:
    change return value type from void
    add explicit returns
    add auxiliary variables for results of recursive calls

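7 Problem: left-factorization moves left arguments upwards.

Grammar:
    E  → T EC
    EC → + T EC | - T EC | ε
    T  → ( E ) | ID | NUM

Example input: 3 + 2 - 1

[Figure: parse tree E( T(NUM(3)), EC( +, T(NUM(2)), EC( -, T(NUM(1)), EC(ε) ) ) )]

[Figure: desired AST: root -, with children + (over 3 and 2) and 1]

How do we get from the parse tree to the AST? The left operand of each operator sits higher in the parse tree than the operator itself.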

8 Manually building ASTs in Java (III)

Add the semantic value as an argument to the functions for left-factorized symbols:

    /* E -> F EC */
    public static Expr E() throws ParseException {
        Expr left = F();
        return EC(left);
    }

    /* EC -> + F EC | - F EC | epsilon */
    public static Expr EC(Expr left) throws ParseException {
        Expr right;
        switch (token) {
            case ')': case '\n':
                return left;
            case '+':
                advance();
                right = F();
                return EC(new Binop(left, right, PLUS));
            case '-':
                advance();
                right = F();
                return EC(new Binop(left, right, MINUS));
            default: ...
        }
    }
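The fragments on the preceding slides can be assembled into a complete program. The following is a minimal sketch, not the lecture's code: it scans characters directly instead of using a separate lexer, stores the operator as a char rather than an int constant, and throws IllegalArgumentException where the slides use ParseException. The names AstParserDemo, parse and peek are assumptions. The grammar is the expression grammar E → F EC, F → T FC, T → ( E ) | ID | NUM.

```java
// Sketch only: a character-level recursive-descent parser that threads the
// left operand through EC and FC, as described on the slide above.
class AstParserDemo {
    static abstract class Expr {}

    static class Num extends Expr {
        final int val;
        Num(int val) { this.val = val; }
        @Override public String toString() { return "NUM(" + val + ")"; }
    }

    static class Id extends Expr {
        final String name;
        Id(String name) { this.name = name; }
        @Override public String toString() { return "ID(" + name + ")"; }
    }

    static class Binop extends Expr {
        final Expr left, right;
        final char op;
        Binop(Expr left, Expr right, char op) { this.left = left; this.right = right; this.op = op; }
        @Override public String toString() { return op + "(" + left + ", " + right + ")"; }
    }

    private final String src;
    private int pos = 0;

    private AstParserDemo(String src) { this.src = src.replace(" ", ""); }

    // Parses a whole expression and rejects trailing input.
    static Expr parse(String text) {
        AstParserDemo p = new AstParserDemo(text);
        Expr e = p.E();
        if (p.peek() != '\0') throw new IllegalArgumentException("unexpected input at " + p.pos);
        return e;
    }

    // Current character, or '\0' at end of input.
    private char peek() { return pos < src.length() ? src.charAt(pos) : '\0'; }

    /* E -> F EC */
    private Expr E() { return EC(F()); }

    /* EC -> + F EC | - F EC | epsilon; 'left' is the tree built so far */
    private Expr EC(Expr left) {
        char c = peek();
        if (c == '+' || c == '-') {
            pos++;
            return EC(new Binop(left, F(), c));
        }
        return left; // epsilon: nothing more to attach
    }

    /* F -> T FC */
    private Expr F() { return FC(T()); }

    /* FC -> * T FC | / T FC | epsilon */
    private Expr FC(Expr left) {
        char c = peek();
        if (c == '*' || c == '/') {
            pos++;
            return FC(new Binop(left, T(), c));
        }
        return left;
    }

    /* T -> ( E ) | ID | NUM */
    private Expr T() {
        char c = peek();
        if (c == '(') {
            pos++;
            Expr e = E();
            if (peek() != ')') throw new IllegalArgumentException("expected ')' at " + pos);
            pos++;
            return e; // brackets leave no trace in the AST
        }
        int start = pos;
        if (Character.isDigit(c)) {
            while (Character.isDigit(peek())) pos++;
            return new Num(Integer.parseInt(src.substring(start, pos)));
        }
        if (Character.isLetter(c)) {
            while (Character.isLetter(peek())) pos++;
            return new Id(src.substring(start, pos));
        }
        throw new IllegalArgumentException("in T at " + pos);
    }

    public static void main(String[] args) {
        System.out.println(parse("3 + 2 - 1")); // prints -(+(NUM(3), NUM(2)), NUM(1))
        System.out.println(parse("(a+1)*b"));   // prints *(+(ID(a), NUM(1)), ID(b))
    }
}
```

Because each call to EC wraps the tree built so far in a new Binop before recursing, the resulting tree is left-associative even though the grammar itself is right-recursive.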

9 ANTLR automates building ASTs.

Design principle: add tree building instructions to rules

    rule: rule-elems1 -> build-instr1
        | rule-elems2 -> build-instr2
        ...
        | rule-elemsn -> build-instrn
        ;

    build instructions are automatically executed when the rule is applied
    build instructions return an AST node or an AST node list
    use with options{output=AST;ASTLabelType=CommonTree;}

10 Basic AST building instructions

reference: use the AST node from a parse element
    trm: '(' exp ')' -> exp;
    returns the exp AST, ignores the brackets

named reference: resolve ambiguities
    add: l=exp '+' r=exp -> $l $r;
    returns a list with both exp ASTs

node construction: build a tagged node
    ext: 'exit' exp -> ^('exit' exp);
    'exit' is the tag token, exp is the child
    dcl: type ID -> ^(VARDCL ID type);
    VARDCL is a virtual tag token (must be defined in tokens), ID and type are the children

[Figure: tree with root 'exit' and child exp]

11 Collecting and duplicating elements

list elements can be collected into a single list:
    args: arg (',' arg)* -> arg+;

individual elements can be copied into lists:
    dcl: type ID (',' ID)* -> ^(VARDCL type ID+);
    → one VARDCL node with children type, ID, ID, ID, ...
vs.
    dcl: type ID (',' ID)* -> ^(VARDCL type ID)+;
    → a list of VARDCL nodes, each with children type and ID
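The difference between the two dcl rewrites can be mimicked in plain Java. The sketch below does not use the ANTLR runtime; Tree, oneNode and nodePerId are hypothetical names, and trees are printed in a LISP-like form chosen for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch only: a plain-Java emulation of the two rewrites, not ANTLR API calls.
class VarDclDemo {
    static class Tree {
        final String label;
        final List<Tree> children = new ArrayList<>();
        Tree(String label, Tree... kids) {
            this.label = label;
            Collections.addAll(children, kids);
        }
        @Override public String toString() {
            if (children.isEmpty()) return label;
            StringBuilder sb = new StringBuilder("(" + label);
            for (Tree c : children) sb.append(' ').append(c);
            return sb.append(')').toString();
        }
    }

    // Emulates -> ^(VARDCL type ID+): one node that collects every ID.
    static Tree oneNode(String type, List<String> ids) {
        Tree t = new Tree("VARDCL", new Tree(type));
        for (String id : ids) t.children.add(new Tree(id));
        return t;
    }

    // Emulates -> ^(VARDCL type ID)+: one node per ID, with the type copied into each.
    static List<Tree> nodePerId(String type, List<String> ids) {
        List<Tree> out = new ArrayList<>();
        for (String id : ids) out.add(new Tree("VARDCL", new Tree(type), new Tree(id)));
        return out;
    }

    public static void main(String[] args) {
        List<String> ids = new ArrayList<>();
        Collections.addAll(ids, "x", "y", "z");
        System.out.println(oneNode("int", ids));   // prints (VARDCL int x y z)
        System.out.println(nodePerId("int", ids)); // prints [(VARDCL int x), (VARDCL int y), (VARDCL int z)]
    }
}
```

The second form is often easier for later phases, since every declaration node then has the same two-child shape.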

12 Building alternative trees

nodes can be null:
    init: exp? -> ^(INIT exp)?;

nodes can be built for empty input:
    skip: -> ^SKIP;

sub-trees can be added:
    for: 'for' '(' dcl? ';' c=exp? ';' i=exp? ')' stmts
         -> ^('for' dcl? ^(COND $c)? ^(ITER $i)? stmts);

[Figure: tree with root 'for' and children dcl, COND (over c), ITER (over i), stmts]

nodes can be built in rule alternatives:
    if: 'if' '(' expr ')' s1=stmt
        ( 'else' s2=stmt -> ^(IFELSE expr $s1 $s2)
        |                -> ^('if' expr $s1)
        );

13 Updating trees

nodes can be initialized in rule parts and updated:
    exp: (INT -> INT) ('+' i=INT -> ^('+' $exp $i))*;

Trees built for successive inputs ($exp refers to the tree built so far):
    1:      INT(1)
    1+2:    ^('+' INT(1) INT(2))
    1+2+3:  ^('+' ^('+' INT(1) INT(2)) INT(3))
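The update step can be read as a loop that repeatedly replaces the current tree with a new '+' node on top of it. The following plain-Java sketch emulates that behaviour with strings standing in for trees. It is an illustration under that simplification, not ANTLR-generated code, and the names UpdateTreeDemo and build are assumptions.

```java
// Sketch only: strings stand in for trees; this is not ANTLR output.
class UpdateTreeDemo {
    // Emulates exp: (INT -> INT) ('+' i=INT -> ^('+' $exp $i))*;
    // Requires at least one value, just as the rule requires at least one INT.
    static String build(int... ints) {
        String exp = "INT(" + ints[0] + ")"; // (INT -> INT): the initial tree
        for (int k = 1; k < ints.length; k++) {
            // -> ^('+' $exp $i): the old tree becomes the left child of a new '+' node
            exp = "^('+' " + exp + " INT(" + ints[k] + "))";
        }
        return exp;
    }

    public static void main(String[] args) {
        System.out.println(build(1));       // prints INT(1)
        System.out.println(build(1, 2));    // prints ^('+' INT(1) INT(2))
        System.out.println(build(1, 2, 3)); // prints ^('+' ^('+' INT(1) INT(2)) INT(3))
    }
}
```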

