CS 153: Concepts of Compiler Design October 27 Class Meeting Department of Computer Science San Jose State University Fall 2014 Instructor: Ron Mak www.cs.sjsu.edu/~mak.


1 CS 153: Concepts of Compiler Design October 27 Class Meeting Department of Computer Science San Jose State University Fall 2014 Instructor: Ron Mak www.cs.sjsu.edu/~mak

Computer Science Dept. Fall 2014: October 27 CS 153: Concepts of Compiler Design © R. Mak

2 Tesla Motors Headquarters Visit
- Palo Alto
- Friday afternoon, November 14
- See Piazza for details!

3 Review: JavaCC Compiler-Compiler
- Feed JavaCC the grammar for a source language and it will automatically generate a scanner and a parser.
  - Specify the source language tokens with regular expressions, and JavaCC generates a scanner for the source language.
  - Specify the source language syntax rules with Extended BNF, and JavaCC generates a parser for the source language.

4 Review: JavaCC Compiler-Compiler, cont’d
- The generated scanner and parser are written in Java.
- Note: JavaCC calls the scanner the “tokenizer” (its documentation also uses the term “token manager”).

5 Review: JavaCC Regular Expressions
- Literals
- Character classes
- Character ranges
- Alternates
Labels on the example: token name, token string

6 Review: JavaCC Regular Expressions, cont’d
- Negation
- Repetition
- Quantifiers

7 JavaCC Parser Specification
- Use JavaCC regular expressions to specify tokens.
- Use EBNF to specify JavaCC production rules.
- Phone number example from Chapter 3 of the JavaCC book.
Example phone number: 408-123-4567
EBNF: ::= 0|1|2|3|4|5|6|7|8|9 ::= ::= - -

8 JavaCC Parser Specification, cont’d
EBNF: ::= 0|1|2|3|4|5|6|7|8|9 ::= ::= - -
JavaCC (phone.jj):
    TOKEN : { ){4}> | ){3}> | }
    void PhoneNumber() : {} { "-" "-" }
Labels on the example: token specifications; production rule; “Java statements can go in here!”; terminal; literal; nonterminal
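The token names in the specification above did not survive, so the following is only a hand-written Java sketch of the idea, not phone.jj and not the code JavaCC generates. It assumes the `{3}` and `{4}` quantifiers describe a three-digit group and a four-digit group, and that a phone number is three digits, "-", three digits, "-", four digits, as in the example 408-123-4567. The names `PhoneSketch`, `tokenize`, and `isPhoneNumber` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hand-written illustration only; JavaCC's generated classes look different.
class PhoneSketch {
    // Assumed token shapes: a run of digits, or the literal "-".
    private static final Pattern TOKEN = Pattern.compile("\\d+|-");

    // Scanner: split the input into tokens, rejecting any other character.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            // The next token must begin exactly at the current position.
            if (!m.find(pos) || m.start() != pos) {
                throw new IllegalArgumentException("Bad character at " + pos);
            }
            tokens.add(m.group());
            pos = m.end();
        }
        return tokens;
    }

    // Parser: three digits, "-", three digits, "-", four digits, end of input.
    static boolean isPhoneNumber(String input) {
        List<String> t;
        try {
            t = tokenize(input);
        } catch (IllegalArgumentException e) {
            return false;
        }
        return t.size() == 5
            && isDigits(t.get(0), 3) && t.get(1).equals("-")
            && isDigits(t.get(2), 3) && t.get(3).equals("-")
            && isDigits(t.get(4), 4);
    }

    // A token is either all digits or "-", so checking the first character is enough.
    private static boolean isDigits(String token, int length) {
        return token.length() == length && Character.isDigit(token.charAt(0));
    }
}
```

Calling `PhoneSketch.isPhoneNumber("408-123-4567")` accepts the slide's example, while an input with a missing dash or a stray space is rejected.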

9 JavaCC Production Rule Methods
- JavaCC generates a top-down recursive-descent parser.
  - Each production rule becomes a Java method of the parser class.
  - You can pass parameters to the methods.
phone_method_param.jj, with and without parser debug:
    void PhoneNumber() : { StringBuffer sb = new StringBuffer(); }
    {
        AreaCode(sb) "-" {sb.append(token.image);} "-" {sb.append(token.image);}
        {System.out.println("Number: " + sb.toString());}
    }
    void AreaCode(StringBuffer buf) : {}
    {
        {buf.append(token.image);}
    }
Labels on the example: Java statement; syntactic action
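To make "each production rule becomes a Java method" concrete, here is a hand-written recursive-descent sketch. It is not JavaCC output: the class name `PhoneMethodSketch`, the helper methods, and the pre-split token list are assumptions, and it uses `StringBuilder` where the slide uses `StringBuffer`. The point it illustrates is the parameter: `areaCode` appends to a buffer owned by its caller.

```java
import java.util.List;

// Hand-written illustration; not the code JavaCC generates.
class PhoneMethodSketch {
    private final List<String> tokens;
    private int pos = 0;

    PhoneMethodSketch(List<String> tokens) {
        this.tokens = tokens;
    }

    // Method for the PhoneNumber rule.
    String phoneNumber() {
        StringBuilder sb = new StringBuilder();
        areaCode(sb);                 // the callee appends to the caller's buffer
        expect("-");
        sb.append(digits(3));
        expect("-");
        sb.append(digits(4));
        if (pos != tokens.size()) {
            throw new IllegalStateException("Unexpected token: " + tokens.get(pos));
        }
        return sb.toString();
    }

    // Method for the AreaCode rule, taking a parameter.
    void areaCode(StringBuilder buf) {
        buf.append(digits(3));
    }

    // Consume one token that must consist of exactly `count` digits.
    private String digits(int count) {
        String t = next();
        if (!t.matches("\\d{" + count + "}")) {
            throw new IllegalStateException("Expected " + count + " digits, got: " + t);
        }
        return t;
    }

    // Consume one token that must equal the given literal.
    private void expect(String literal) {
        String t = next();
        if (!t.equals(literal)) {
            throw new IllegalStateException("Expected " + literal + ", got: " + t);
        }
    }

    private String next() {
        if (pos >= tokens.size()) {
            throw new IllegalStateException("Unexpected end of input");
        }
        return tokens.get(pos++);
    }
}
```

Given the tokens "408", "-", "123", "-", "4567", `phoneNumber()` returns the digits collected along the way; a truncated token list ends in an exception rather than a partial result.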

10 Grammar Problems
- Be very careful when specifying grammars!
- JavaCC will not be able to generate a correct parser for a faulty grammar.
- Common grammar faults include:
  - choice conflict
  - left recursion

11 Choice Conflict
- Suppose we want to parse both local phone numbers and long-distance phone numbers:
  - Local: 123-4567
  - Long-distance: 201-456-7890
EBNF: ::= - ::= - - ::=
- Choice conflict! While attempting to parse “123-4567”, the parser cannot tell whether the initial “123” is a prefix or an area code, since both are three digits.
phone_choice.jj

12 Choice Conflict Resolution: Left Factoring
- One way to resolve a choice conflict is by left factoring.
  - Factor out the common head from the productions.
phone_left_factored.jj:
    void PhoneNumber() : {} { Head() "-" ( LocalNumber() | LongDistanceNumber() ) }
    void LocalNumber() : {} { }
    void LongDistanceNumber() : {} { "-" }
    void Head() : {} { }
- How does this fix the problem?
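One possible answer to that question, as a hand-written Java sketch rather than the contents of phone_left_factored.jj: once the shared three-digit head and the dash have been consumed, the two alternatives begin with different tokens, so a single token of lookahead is enough. The sketch assumes a local number continues with four digits and a long-distance number with three digits, a dash, and four digits (matching 123-4567 and 201-456-7890); `LeftFactoredSketch` and its helpers are hypothetical names.

```java
import java.util.List;

// Hand-written illustration of the left-factored choice.
class LeftFactoredSketch {
    // Returns "local" or "long-distance"; throws if the tokens fit neither.
    static String classify(List<String> tokens) {
        // Head() "-": the part shared by both alternatives.
        requireDigits(tokens, 0, 3);
        requireDash(tokens, 1);

        // The alternatives now start differently, so one token decides.
        String next = tokens.size() > 2 ? tokens.get(2) : "";
        if (next.matches("\\d{4}")) {
            requireEnd(tokens, 3);
            return "local";
        }
        requireDigits(tokens, 2, 3);
        requireDash(tokens, 3);
        requireDigits(tokens, 4, 4);
        requireEnd(tokens, 5);
        return "long-distance";
    }

    private static void requireDigits(List<String> tokens, int index, int count) {
        if (index >= tokens.size() || !tokens.get(index).matches("\\d{" + count + "}")) {
            throw new IllegalArgumentException("Expected " + count + " digits at token " + index);
        }
    }

    private static void requireDash(List<String> tokens, int index) {
        if (index >= tokens.size() || !tokens.get(index).equals("-")) {
            throw new IllegalArgumentException("Expected \"-\" at token " + index);
        }
    }

    private static void requireEnd(List<String> tokens, int index) {
        if (tokens.size() != index) {
            throw new IllegalArgumentException("Unexpected token at " + index);
        }
    }
}
```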

13 Lookahead
- A top-down parser naturally “looks ahead” one token.
  - This token tells the parser which nonterminal it will parse next.
  - “IF”: next parse an IF statement
  - “REPEAT”: next parse a REPEAT statement
- A choice conflict occurs if a one-token lookahead is not sufficient to determine which nonterminal to parse next.
  - Next parse a local number or a long-distance number?

14 Backtracking
- The parser cannot backtrack.
- Suppose the parser has parsed “123-”.
  - It decides that’s an area code, so it must be parsing a long-distance number.
- Now it sees “4567”.
  - Oops! It cannot backtrack and reparse “123-” as the prefix to a local number.

15 Choice Conflict Resolution: Lookahead
- Another way to resolve a choice conflict is by telling the parser to look ahead more than just one token.
- To decide between parsing a local number and a long-distance telephone number:
  - One-token lookahead is insufficient: “123”
  - Two-token lookahead is insufficient: “123-”
  - Three-token lookahead will distinguish a local number from a long-distance number: “123-4567”
phone_lookahead.jj:
    void PhoneNumber() : {}
    {
        ( LOOKAHEAD(3) LocalNumber() | LongDistanceNumber() )
    }
- By looking ahead three tokens, the parser can successfully choose between LocalNumber() and LongDistanceNumber().
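The effect of LOOKAHEAD(3) can be imitated by hand: peek at the next three tokens without consuming them, and take the first alternative only if they fit the start of a local number. The Java below is a sketch under the assumption that a local number is three digits, "-", four digits; `LookaheadSketch`, `localAhead`, and `choose` are hypothetical names, and JavaCC's generated lookahead code is organized differently.

```java
import java.util.List;

// Hand-written illustration of a three-token lookahead decision.
class LookaheadSketch {
    // True if the three tokens starting at pos fit the start of a local number:
    // three digits, "-", four digits. Nothing is consumed.
    static boolean localAhead(List<String> tokens, int pos) {
        return pos + 2 < tokens.size()
            && tokens.get(pos).matches("\\d{3}")
            && tokens.get(pos + 1).equals("-")
            && tokens.get(pos + 2).matches("\\d{4}");
    }

    // Mirrors ( LOOKAHEAD(3) LocalNumber() | LongDistanceNumber() ):
    // take the first alternative only if the lookahead succeeds.
    static String choose(List<String> tokens) {
        return localAhead(tokens, 0) ? "LocalNumber" : "LongDistanceNumber";
    }
}
```

With the tokens of “123-4567” the third token has four digits, so the first alternative is chosen; with “201-456-7890” the third token has three digits and the parser falls through to the second alternative.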

16 Lookahead
- Global lookahead
  - Major performance penalty. Avoid if possible!
- Syntactic lookahead
- Semantic lookahead
- Nested lookahead
- Too convoluted! Minimize the need for these.
- Why would you design a grammar that needed these?

17 Lookahead
- Lookahead will slow down parsing.
- Try to design grammars that do not require more than one token of lookahead.
- For example, Pascal only requires one-token lookahead.

18 Left Recursion
- Suppose we want to parse very simple expressions like “1+2”, “1+2+3”, “9+4+7+2”, etc.
    <expression> ::= <expression> + <term> | <term>
  with <term> defined by its own rule.
- Left recursion! The nonterminal <expression> refers to itself recursively such that the recursion will never end.
    <expression> ::= <expression> + <term>
- Because the recursive reference is at the left end of the rule, no tokens are consumed.
expression_left_recursion.jj

19 Left Recursion Resolution: Iteration
- Resolve left recursion by replacing it with iteration.
  Instead of:
    <expression> ::= <expression> + <term> | <term>
  Use EBNF:
    <expression> ::= <term> { + <term> }
  In both cases <term> has its own rule.
expression_iteration.jj:
    void Expression() : {}
    {
        Term() ("+" Term())*
        { System.out.println("Parsed expression"); }
    }
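The iterative rule corresponds to a loop in a hand-written parser: parse one term, then keep consuming "+" and another term for as long as a "+" follows. The Java sketch below is not JavaCC output. It assumes a term is an unsigned integer literal (the transcript does not show the term rule) and, purely to give the parse a visible result, returns the sum; `ExpressionIterationSketch` is a hypothetical name.

```java
// Hand-written illustration of Term() ("+" Term())* as a loop.
class ExpressionIterationSketch {
    private final String input;
    private int pos = 0;

    private ExpressionIterationSketch(String input) {
        this.input = input;
    }

    // Parse the whole input as an expression and return the sum of its terms.
    static int parse(String input) {
        ExpressionIterationSketch p = new ExpressionIterationSketch(input);
        int value = p.expression();
        if (p.pos != input.length()) {
            throw new IllegalArgumentException("Unexpected input at " + p.pos);
        }
        return value;
    }

    // expression ::= term { "+" term }
    private int expression() {
        int sum = term();
        // Every pass through the loop consumes a "+" and a term, so it terminates.
        while (pos < input.length() && input.charAt(pos) == '+') {
            pos++;
            sum += term();
        }
        return sum;
    }

    // Assumption: a term is an unsigned integer literal.
    private int term() {
        int start = pos;
        while (pos < input.length() && Character.isDigit(input.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Expected a number at " + start);
        }
        return Integer.parseInt(input.substring(start, pos));
    }
}
```

Unlike a left-recursive method, which would call itself before reading anything, `expression()` always consumes a term before it decides whether to continue.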

20 Right Recursion
- Right recursion:
    <expression> ::= <term> + <expression> | <term>
  with <term> defined by its own rule.
- Right recursion is not a problem for JavaCC.
  - Because there are non-recursive references to the left of the recursive reference, tokens are consumed by the scanner.
- The parser continues to make forward progress.
- The recursion ends as soon as the parser sees a token that doesn’t fit the production rule.

21 Right Recursion
expression_right_recursion.jj
- However, there may be choice conflicts.
- Does a <term> start <term> + <expression> or simply <term>?
- How much lookahead do we need?
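One way to explore these questions is a hand-written right-recursive method. In the Java sketch below, which is not expression_right_recursion.jj, the term common to both alternatives is parsed first, and a single token of lookahead (is the next character a "+"?) decides whether to recurse. That amounts to left factoring the choice rather than adding lookahead. As assumptions, a term is an unsigned integer literal and the method returns the sum so the result can be checked; `RightRecursionSketch` is a hypothetical name.

```java
// Hand-written illustration of a right-recursive expression rule.
class RightRecursionSketch {
    private final String input;
    private int pos = 0;

    private RightRecursionSketch(String input) {
        this.input = input;
    }

    // Parse the whole input as an expression and return the sum of its terms.
    static int parse(String input) {
        RightRecursionSketch p = new RightRecursionSketch(input);
        int value = p.expression();
        if (p.pos != input.length()) {
            throw new IllegalArgumentException("Unexpected input at " + p.pos);
        }
        return value;
    }

    // expression ::= term "+" expression | term
    private int expression() {
        int value = term();   // consumes input before any recursive call
        // After the shared term, one token of lookahead picks the alternative.
        if (pos < input.length() && input.charAt(pos) == '+') {
            pos++;
            return value + expression();
        }
        return value;
    }

    // Assumption: a term is an unsigned integer literal.
    private int term() {
        int start = pos;
        while (pos < input.length() && Character.isDigit(input.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Expected a number at " + start);
        }
        return Integer.parseInt(input.substring(start, pos));
    }
}
```

Each recursive call happens only after a term and a "+" have been consumed, so the recursion depth is bounded by the number of terms in the input.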

22 JJDoc
- JJDoc produces documentation for your grammar.
- Right-click in the .jj edit window.
- It generates an HTML file from a .jj grammar file.
- Read Chapter 5 of the JavaCC book.
- Ideal for your project documentation!
- Demo

