Published by Budi Gunardi. Modified over 5 years ago.
1
CS 153: Concepts of Compiler Design October 22 Class Meeting
Department of Computer Science San Jose State University Fall 2019 Instructor: Ron Mak
2
ANTLR 4 Review
Feed ANTLR a .g4 grammar file.
ANTLR generates (in Java or C++):
    a parser
    a lexer (scanner)
    parse tree utilities
Therefore, for your compiler projects, you don’t have to write that code.
You must have a correct grammar file.
3
ANTLR 4 Plugin for Eclipse
If you use the ANTLR 4 plugin, Eclipse will automatically generate the parser and the lexer from the grammar.
    Create an ANTLR 4 project.
The plugin will generate a syntax diagram from the grammar.
The plugin will generate a parse tree from a source program, according to the grammar.
4
ANTLR 4 Plugin for Eclipse, cont’d
5
ANTLR 4 Plugin for Eclipse, cont’d
6
ANTLR Workflow
Source: The Definitive ANTLR 4 Reference by Terence Parr, The Pragmatic Programmers, 2012
7
Syntax Error Handling
An ANTLR-generated parser has basic syntax error handling and recovery.
You can improve the error handling.
Input:
    193
    a = 5
    b = 6
    (a+b*2
    (1+2)*3
Parse tree (Lisp format):
    (prog
      (stat (expr 193) \n)
      (stat a = (expr 5) \n)
      (stat b = (expr 6) \n)
      (stat (expr ( (expr (expr a) + (expr (expr b) * (expr 2))) <missing ')'>) \n)
      (stat (expr (expr ( (expr (expr 1) + (expr 2)) )) * (expr 3)) \n))
Error message:
    line 4:6 missing ')' at '\n'
Demo
8
Resolving Ambiguities
Is f() a function call as a standalone statement, or a function call in an expression?
    stat: expr ';'
        | ID '(' ')' ';'
        ;
    expr: ID '(' ')'
        | INT
        ;
Source: The Definitive ANTLR 4 Reference by Terence Parr, The Pragmatic Programmers, 2012
9
Resolving Ambiguities, cont’d
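Is begin a reserved word or an identifier?
ANTLR resolves an ambiguity by choosing the first alternative in the grammar that matches.
For lexer rules, the rule that matches the longest input wins. When two rules match the same text, as BEGIN and ID both do for begin, ANTLR chooses the rule listed first, so BEGIN must appear before ID.
    BEGIN : 'begin' ;
    ID    : [a-z]+ ;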
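The rule-order behavior for BEGIN and ID can be imitated in a few lines of plain Java. This is only a sketch of the idea, not ANTLR's implementation: the class name RuleOrderSketch and the method tokenType are hypothetical, and java.util.regex stands in for the generated lexer.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch (not ANTLR's implementation) of how a lexer picks a rule:
// the longest match wins, and a tie goes to the rule listed first.
class RuleOrderSketch {
    // Rules in grammar order: BEGIN is listed before ID.
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("BEGIN", Pattern.compile("begin"));
        RULES.put("ID", Pattern.compile("[a-z]+"));
    }

    // Returns the name of the rule that wins at the start of the input,
    // or null if no rule matches there.
    static String tokenType(String input) {
        String best = null;
        int bestLength = 0;
        for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
            Matcher m = rule.getValue().matcher(input);
            // lookingAt() anchors the match at the start of the input.
            // The comparison is strict, so an equal-length match
            // never displaces a rule that was listed earlier.
            if (m.lookingAt() && m.end() > bestLength) {
                best = rule.getKey();
                bestLength = m.end();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(tokenType("begin"));     // tie on 5 characters: first rule wins
        System.out.println(tokenType("beginning")); // ID matches 9 characters, BEGIN only 5
    }
}
```

Swapping the two RULES.put calls would make every begin come out as ID, which is why the keyword rule has to come first in the grammar.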
10
ANTLR Parse Trees
A token stream is the “pipe” between the lexer and the parser.
Each token object records the start and stop character indexes into the character stream.
Source: The Definitive ANTLR 4 Reference by Terence Parr, The Pragmatic Programmers, 2012
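To make the start and stop indexes concrete, here is a minimal Java sketch of tokens that store only positions in the character stream and recover their text on demand. The names TokenSpanSketch, Token, and tokenize are hypothetical, the splitting on whitespace is a simplification, and the stop index is treated as inclusive; none of this is ANTLR's own token class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: tokens that store character indexes instead of copied text.
class TokenSpanSketch {
    static final class Token {
        final int start; // index of the token's first character
        final int stop;  // index of the token's last character (inclusive)

        Token(int start, int stop) {
            this.start = start;
            this.stop = stop;
        }

        // Recover the token text from the character stream on demand.
        String text(String chars) {
            return chars.substring(start, stop + 1);
        }
    }

    // Split the input into whitespace-separated tokens, recording only indexes.
    static List<Token> tokenize(String chars) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < chars.length()) {
            if (Character.isWhitespace(chars.charAt(i))) {
                i++;
                continue;
            }
            int start = i;
            while (i < chars.length() && !Character.isWhitespace(chars.charAt(i))) {
                i++;
            }
            tokens.add(new Token(start, i - 1));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String chars = "sp = 100;";
        for (Token t : tokenize(chars)) {
            System.out.println(t.start + ".." + t.stop + " '" + t.text(chars) + "'");
        }
    }
}
```

Because each token carries only two integers, the parser and later passes can always map a token back to its exact place in the source, for example when reporting an error position.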
11
ANTLR Parse Trees, cont’d
ANTLR generates a RuleNode subclass for each grammar rule.
These are called context objects because they record everything about the recognition phase of a rule.
Source: The Definitive ANTLR 4 Reference by Terence Parr, The Pragmatic Programmers, 2012
12
ANTLR Parse Trees, cont’d
The parse tree node classes in the ANTLR-generated parser are named after the grammar rules. For example, a rule named stat yields a node class named StatContext.
Source: The Definitive ANTLR 4 Reference by Terence Parr, The Pragmatic Programmers, 2012
13
ANTLR Pcl
Pcl, a tiny subset of Pascal.
Use ANTLR to generate a Pcl parser and lexer and integrate them with our Pascal interpreter’s symbol table code.
    ANTLR doesn’t do symbol tables.
Parse a Pcl program and print the symbol table.
Sample program sample.pas:
    PROGRAM sample;

    VAR
        i, j : integer;
        alpha, beta5x : real;

    BEGIN
        REPEAT
            j := 3;
            i := 2 + 3*j
        UNTIL i >= j + 2;

        IF i <= j THEN i := j;

        IF j > i THEN i := 3*j
        ELSE BEGIN
            alpha := 9;
            beta5x := alpha/3 - alpha*2;
        END
    END.
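Since ANTLR leaves symbol tables to you, the following Java sketch shows roughly what one has to do for the VAR section of sample.pas: record each declared name with its type and print the result. The class SymTabSketch and its methods are hypothetical; the course's actual Pascal interpreter symbol table classes are not shown on the slide and are certainly richer than this.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of a symbol table; this is NOT the course's
// Pascal interpreter symbol table, which the slide does not show.
class SymTabSketch {
    // TreeMap keeps the entries sorted by name for printing.
    private final Map<String, String> entries = new TreeMap<>();

    // Enter a variable with its type; reject a redeclaration.
    void enter(String name, String type) {
        if (entries.containsKey(name)) {
            throw new IllegalStateException("Redeclared identifier: " + name);
        }
        entries.put(name, type);
    }

    // Look up a variable's type; null means "not declared".
    String lookup(String name) {
        return entries.get(name);
    }

    // Print one "name : type" line per entry.
    void print() {
        for (Map.Entry<String, String> e : entries.entrySet()) {
            System.out.println(e.getKey() + " : " + e.getValue());
        }
    }

    public static void main(String[] args) {
        SymTabSketch symTab = new SymTabSketch();
        // The VAR declarations of sample.pas, entered by hand here;
        // in the real project a parse tree visitor or listener would do this.
        symTab.enter("i", "integer");
        symTab.enter("j", "integer");
        symTab.enter("alpha", "real");
        symTab.enter("beta5x", "real");
        symTab.print();
    }
}
```

In the integrated version, the calls to enter would be driven by the parse tree nodes for the declarations rather than written out by hand.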
14
Pcl.g4
    grammar Pcl; // A tiny subset of Pascal

    program : header block '.' ;
    header  : PROGRAM IDENTIFIER ';' ;
    block   : declarations compound_stmt ;

    declarations : VAR decl_list ';' ;
    decl_list    : decl ( ';' decl )* ;
    decl         : var_list ':' type_id ;
    var_list     : var_id ( ',' var_id )* ;
    var_id       : IDENTIFIER ;
    type_id      : IDENTIFIER ;

    compound_stmt : BEGIN stmt_list END ;

    stmt : compound_stmt    # compoundStmt
         | assignment_stmt  # assignmentStmt
         | repeat_stmt      # repeatStmt
         | if_stmt          # ifStmt
         |                  # emptyStmt
         ;
15
Pcl.g4, cont’d
    stmt_list       : stmt ( ';' stmt )* ;
    assignment_stmt : variable ':=' expr ;
    repeat_stmt     : REPEAT stmt_list UNTIL expr ;
    if_stmt         : IF expr THEN stmt ( ELSE stmt )? ;

    variable : IDENTIFIER ;

    expr : expr mul_div_op expr  # mulDivExpr
         | expr add_sub_op expr  # addSubExpr
         | expr rel_op expr      # relExpr
         | number                # numberConst
         | IDENTIFIER            # identifier
         | '(' expr ')'          # parens
         ;

    number : sign? INTEGER ;
    sign   : '+' | '-' ;

    mul_div_op : MUL_OP | DIV_OP ;
    add_sub_op : ADD_OP | SUB_OP ;
    rel_op     : EQ_OP | NE_OP | LT_OP | LE_OP | GT_OP | GE_OP ;
16
Pcl.g4, cont’d
    PROGRAM : 'PROGRAM' ;
    BEGIN   : 'BEGIN' ;
    END     : 'END' ;
    VAR     : 'VAR' ;
    REPEAT  : 'REPEAT' ;
    UNTIL   : 'UNTIL' ;
    IF      : 'IF' ;
    THEN    : 'THEN' ;
    ELSE    : 'ELSE' ;

    IDENTIFIER : [a-zA-Z][a-zA-Z0-9]* ;
    INTEGER    : [0-9]+ ;

    MUL_OP : '*' ;
    DIV_OP : '/' ;
    ADD_OP : '+' ;
    SUB_OP : '-' ;

    EQ_OP : '=' ;
    NE_OP : '<>' ;
    LT_OP : '<' ;
    LE_OP : '<=' ;
    GT_OP : '>' ;
    GE_OP : '>=' ;

    NEWLINE : '\r'? '\n' -> skip ;
    WS      : [ \t]+ -> skip ;
17
Pcl Syntax Diagrams
18
Pcl Syntax Diagrams, cont’d
19
Pcl Syntax Diagrams, cont’d
20
Pcl Syntax Diagrams, cont’d
21
Pcl Syntax Diagrams, cont’d
22
Assignment #6
Write the first draft of the ANTLR 4 grammar file for your source language.
Generate a syntax diagram.
Use the Eclipse ANTLR plugin.
Generate the parser and lexer.
Compile a sample source program.
Generate a parse tree from the source program.
Use either grun on the command line or the Eclipse ANTLR plugin.
Due: Friday, November 1.
23
Starter Main Program for Assignment #6
    import org.antlr.v4.runtime.*;
    import org.antlr.v4.runtime.tree.ParseTree;

    import java.io.FileInputStream;
    import java.io.InputStream;

    public class Pcl
    {
        public static void main(String[] args) throws Exception
        {
            // Read from the file named on the command line, or else from standard input.
            String inputFile = null;
            if (args.length > 0) inputFile = args[0];
            InputStream is = (inputFile != null)
                                 ? new FileInputStream(inputFile)
                                 : System.in;

            // Feed the characters to the generated lexer.
            CharStream cs = CharStreams.fromStream(is);
            PclLexer lexer = new PclLexer(cs);
            CommonTokenStream tokens = new CommonTokenStream(lexer);

            // Print every token that the lexer produces.
            System.out.println("Tokens:");
            tokens.fill();
            for (Token token : tokens.getTokens())
            {
                System.out.println(token.toString());
            }

            // Parse starting at the grammar's program rule.
            PclParser parser = new PclParser(tokens);
            ParseTree tree = parser.program();

            // CompilerVisitor is not shown on the slide; it is presumably
            // a parse tree visitor class that you write for the assignment.
            CompilerVisitor compiler = new CompilerVisitor();
            compiler.visit(tree);
        }
    }