CS 153: Concepts of Compiler Design October 10 Class Meeting Department of Computer Science San Jose State University Fall 2015 Instructor: Ron Mak www.cs.sjsu.edu/~mak.




1 CS 153: Concepts of Compiler Design
October 10 Class Meeting
Department of Computer Science, San Jose State University, Fall 2015
Instructor: Ron Mak
www.cs.sjsu.edu/~mak

2 Computer Science Dept. Fall 2015: October 19 CS 153: Concepts of Compiler Design © R. Mak
Midterm Question 1
Describe how the interpreter can recover from:
- division by zero: Set the quotient to 0 and continue execution.
- uninitialized variable: Assume the variable has the value 0 and continue execution.
- value out of range: Use the minimum value of the range and continue execution.
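
The three recovery rules above can be sketched in Java as follows. This is only an illustration: the class and method names (`RecoverySketch`, `divide`, `fetch`, `checkRange`) are hypothetical and are not taken from the course's interpreter.

```java
// Hypothetical helpers illustrating the recovery rules; not the course's code.
class RecoverySketch {
    // Division by zero: flag the error, then use 0 as the quotient.
    static int divide(int dividend, int divisor) {
        if (divisor == 0) {
            System.err.println("*** Runtime error: division by zero; using 0");
            return 0;
        }
        return dividend / divisor;
    }

    // Uninitialized variable: a null memory slot is treated as the value 0.
    static int fetch(Integer slot) {
        if (slot == null) {
            System.err.println("*** Runtime error: uninitialized variable; using 0");
            return 0;
        }
        return slot;
    }

    // Value out of range: flag the error, then use the minimum of the range.
    static int checkRange(int value, int min, int max) {
        if (value < min || value > max) {
            System.err.println("*** Runtime error: value out of range; using " + min);
            return min;
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(divide(7, 0));
        System.out.println(fetch(null));
        System.out.println(checkRange(15, 1, 10));
    }
}
```

In each case execution continues with a substituted value, which is exactly why later results may be wrong.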

3 Midterm Question 2
Why is runtime error recovery potentially more dangerous for an application program than syntax error recovery?
- If there are any syntax errors, the program does not execute.
- After recovering from a runtime error, the program continues to execute with incorrect values. This may subsequently cause more serious errors.

4 Midterm Question 3
To add the complex data type to Pascal:
- What changes are required in the front end?
  - TypeDefinitionsParser: Parse the complex data type.
  - ExpressionParser: Parse complex expressions.
- What changes are required in the intermediate tier?
  - TypeSpecImpl: New complex data type.
  - TypeChecker: Type check complex expressions.
  - ICodeNodeTypeImpl: New complex node.
- What changes are required in the interpreter back end?
  - ExpressionExecutor: Execute complex expressions.

5 Midterm Question 4
Represent a complex constant by a parenthesized ordered pair ( expr1, expr2 ).
- What are the challenges of parsing such complex constants?
  When ExpressionParser.parseFactor() sees the opening left parenthesis, it will call parseExpression() to parse what appears to be the start of a parenthesized expression. Then the parser will get into trouble at the comma.
- How can you overcome those challenges?
  Don't use parentheses to enclose a complex constant. Invent new tokens such as the two-character tokens ($ and $).
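
As a rough sketch of the second answer, a scanner can recognize the two-character tokens ($ and $) with one character of lookahead, so the parser never mistakes a complex constant for a parenthesized expression. The class and method names here are hypothetical, and the method is a stand-in for a real token extractor.

```java
// Hypothetical sketch: recognize "($" and "$)" with one character of lookahead.
class ComplexTokenSketch {
    // Returns the text of the token that starts at position pos.
    static String tokenAt(String source, int pos) {
        char current = source.charAt(pos);
        char next = (pos + 1 < source.length()) ? source.charAt(pos + 1) : '\0';

        if (current == '(' && next == '$') return "($";  // opens a complex constant
        if (current == '$' && next == ')') return "$)";  // closes a complex constant
        return String.valueOf(current);                  // any other single character
    }

    public static void main(String[] args) {
        String source = "($1,2$)";
        System.out.println(tokenAt(source, 0));
        System.out.println(tokenAt(source, 5));
    }
}
```

Because a plain `(` is still returned as its own token, ordinary parenthesized expressions continue to parse as before.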

6 Midterm Question 5
In the body of proc2b, what prevents the interpreter from accessing the value of variable i declared in proc2a?
- Variable i is not locally declared in proc2b; it is a level 2 variable. Therefore, the interpreter looks in the topmost level 2 activation record on the runtime stack, and it uses the i declared in proc1.
    PROGRAM prog;
    VAR i, j : integer;
    PROCEDURE proc1;
        VAR i : integer;
        PROCEDURE proc2b; forward;
        PROCEDURE proc2a;
            VAR i, j : integer;
        BEGIN {proc2a}
            i := 100; j := 200;
            proc2b;
        END;
        PROCEDURE proc2b;
            VAR k : integer;
        BEGIN {proc2b}
            k := i + j;
            writeln(k);
        END;
    BEGIN {proc1}
        i := 10;
        proc2a;
    END;
    BEGIN {prog}
        i := 1; j := 2;
        proc1;
    END.

7 Midterm Question 5
When executing the same assignment statement, how is the interpreter able to access the value of variable j declared at the top level?
- Variable j is not locally declared in proc2b; it is a level 1 variable. Therefore, the interpreter looks in the topmost level 1 activation record on the runtime stack, and it uses the j declared in prog.
What value of k is written?
- 12
    PROGRAM prog;
    VAR i, j : integer;
    PROCEDURE proc1;
        VAR i : integer;
        PROCEDURE proc2b; forward;
        PROCEDURE proc2a;
            VAR i, j : integer;
        BEGIN {proc2a}
            i := 100; j := 200;
            proc2b;
        END;
        PROCEDURE proc2b;
            VAR k : integer;
        BEGIN {proc2b}
            k := i + j;
            writeln(k);
        END;
    BEGIN {proc1}
        i := 10;
        proc2a;
    END;
    BEGIN {prog}
        i := 1; j := 2;
        proc1;
    END.
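
The lookup described in the two Question 5 slides can be sketched as a runtime stack whose activation records carry a nesting level. This is a simplified model with hypothetical class names; the nesting levels of i and j are hard-coded here, whereas a real interpreter would take them from the symbol table built by the parser.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified activation record: routine name, nesting level, local memory.
class ActivationRecord {
    final String routine;
    final int nestingLevel;
    final Map<String, Integer> memory = new HashMap<>();

    ActivationRecord(String routine, int nestingLevel) {
        this.routine = routine;
        this.nestingLevel = nestingLevel;
    }
}

class RuntimeStackSketch {
    private final Deque<ActivationRecord> stack = new ArrayDeque<>();

    void push(ActivationRecord ar) { stack.push(ar); }

    // Find the topmost record at the given nesting level.
    // ArrayDeque iterates from the most recently pushed element downward.
    ActivationRecord topmost(int level) {
        for (ActivationRecord ar : stack) {
            if (ar.nestingLevel == level) return ar;
        }
        throw new IllegalStateException("No activation record at level " + level);
    }

    // Replays the calls prog -> proc1 -> proc2a -> proc2b, then evaluates k := i + j.
    static int demo() {
        RuntimeStackSketch rs = new RuntimeStackSketch();

        ActivationRecord prog = new ActivationRecord("prog", 1);
        prog.memory.put("i", 1);
        prog.memory.put("j", 2);
        rs.push(prog);

        ActivationRecord proc1 = new ActivationRecord("proc1", 2);
        proc1.memory.put("i", 10);
        rs.push(proc1);

        ActivationRecord proc2a = new ActivationRecord("proc2a", 3);
        proc2a.memory.put("i", 100);
        proc2a.memory.put("j", 200);
        rs.push(proc2a);

        rs.push(new ActivationRecord("proc2b", 3));

        // In proc2b, i is a level 2 variable and j is a level 1 variable.
        int i = rs.topmost(2).memory.get("i");
        int j = rs.topmost(1).memory.get("j");
        return i + j;
    }

    public static void main(String[] args) {
        System.out.println(demo());  // prints 12
    }
}
```

The record for proc2a is on the stack but is never consulted, because neither i nor j resolves to level 3 inside proc2b.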

8 Minimum Acceptable Compiler Project
- At least two data types with type checking.
- Basic arithmetic operations with operator precedence.
- Assignment statements.
- At least one conditional control statement (e.g., IF).
- At least one looping control statement.
- Procedures or functions with calls and returns.
- Parameters passed by value or by reference.
- Basic error recovery (skip to semicolon or end of line).
- "Nontrivial" sample programs written in the source language.
- Generate Jasmin code that can be assembled.
- Execute the resulting .class file standalone (preferred) or with a test harness.
- No crashes (e.g., null pointer exceptions).
70 points/100

9 Ideas for Programming Languages
- A language that works with a database such as MySQL
  - Combines Pascal and SQL for writing database applications.
  - Compiled code hides JDBC calls from the programmer.
  - Not PL/SQL: use the language to write client programs.
- A language that can access web pages
  - Statements that "scrape" pages to extract information.
- A language for generating business reports
  - A Pascal-like language with features that make it easy to generate reports.
- A string-processing language
  - Combines Pascal and Perl for writing applications that involve pattern matching and string transformations.
DSL = Domain-Specific Language

10 Can We Build a Better Scanner?
- Our scanner in the front end is relatively easy to understand and follow.
  - Separate scanner classes for each token type.
- However, it's big and slow.
  - Separate scanner classes for each token type.
  - Creates lots of objects and makes lots of method calls.
- We can write a more compact and faster scanner.
  - However, it may be harder to understand and follow.

11 Deterministic Finite Automata (DFA)
- Pascal identifier
  - Regular expression: <letter> ( <letter> | <digit> )*
  - Implement the regular expression with a finite automaton (AKA finite state machine):
[Diagram: states 1, 2, and 3 with transitions labeled letter, digit, and [other]]

12 Deterministic Finite Automata (DFA)
[Diagram: the same automaton with states 1, 2, and 3, annotated with its start state, accepting state, and transitions labeled letter, digit, and [other]]
- This automaton is a deterministic finite automaton (DFA).
  - At each state, the next input character uniquely determines which transition to take to the next state.

13 State-Transition Matrix
- Represent the behavior of a DFA by a state-transition matrix:
[Figure: state-transition matrix for the DFA with states 1, 2, and 3 and inputs letter, digit, and [other]]
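
The matrix on this slide did not survive in the transcript. As a sketch of the idea, the identifier DFA could be encoded as below. The transitions are an assumption reconstructed from the diagram labels (state 1 is the start state, a letter leads to state 2, letters and digits stay in state 2, and any other character leads to the accepting state 3); the class name and the `ERR` value are hypothetical.

```java
// Hypothetical reconstruction of a state-transition matrix for the identifier DFA.
class IdentifierDFA {
    static final int LETTER = 0, DIGIT = 1, OTHER = 2;  // input classes (columns)
    static final int ERR = -1;                          // no valid transition

    // Rows are states 1..3 (row 0 is unused so that indexes match state numbers).
    static final int[][] MATRIX = {
        /* unused    */ { ERR, ERR, ERR },
        /* 1 start   */ {   2, ERR, ERR },
        /* 2         */ {   2,   2,   3 },
        /* 3 accept  */ { ERR, ERR, ERR },
    };

    static int classOf(char ch) {
        return Character.isLetter(ch) ? LETTER
             : Character.isDigit(ch)  ? DIGIT
             :                          OTHER;
    }

    // Returns the identifier at the start of the input, or null if there is none.
    static String scan(String input) {
        int state = 1;
        int pos = 0;
        StringBuilder buffer = new StringBuilder();

        while (state != 3 && state != ERR) {
            // Past the end of the input, feed a NUL character, which counts as [other].
            char ch = (pos < input.length()) ? input.charAt(pos) : '\0';
            state = MATRIX[state][classOf(ch)];
            if (state == 2) {       // still inside the identifier: consume the character
                buffer.append(ch);
                pos++;
            }
        }
        return (state == 3) ? buffer.toString() : null;
    }

    public static void main(String[] args) {
        System.out.println(scan("abc12+"));
        System.out.println(scan("9x"));
    }
}
```

The character that triggers the transition to the accepting state is not consumed, so it remains available as the start of the next token.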

14 DFA for a Pascal Number
[Diagram: DFA with states 0 and 3 through 12; transitions labeled digit, +, -, ., E, and [other]]
Note that this diagram allows only an upper-case E for an exponent. What changes are required to also allow a lower-case e?

15 DFA for a Pascal Identifier or Number
[Diagram: the number DFA (states 0 and 3 through 12) combined with the identifier states 1 and 2; transitions labeled letter, digit, +, -, ., E, and [other]]
    private static final int matrix[][] = {
        /*        letter digit    +     -     .     E  other */
        /*  0 */ {   1,    4,    3,    3,  ERR,    1,  ERR },
        /*  1 */ {   1,    1,   -2,   -2,   -2,    1,   -2 },
        /*  2 */ { ERR,  ERR,  ERR,  ERR,  ERR,  ERR,  ERR },
        /*  3 */ { ERR,    4,  ERR,  ERR,  ERR,  ERR,  ERR },
        /*  4 */ {  -5,    4,   -5,   -5,    6,    9,   -5 },
        /*  5 */ { ERR,  ERR,  ERR,  ERR,  ERR,  ERR,  ERR },
        /*  6 */ { ERR,    7,  ERR,  ERR,  ERR,  ERR,  ERR },
        /*  7 */ {  -8,    7,   -8,   -8,   -8,    9,   -8 },
        /*  8 */ { ERR,  ERR,  ERR,  ERR,  ERR,  ERR,  ERR },
        /*  9 */ { ERR,   11,   10,   10,  ERR,  ERR,  ERR },
        /* 10 */ { ERR,   11,  ERR,  ERR,  ERR,  ERR,  ERR },
        /* 11 */ { -12,   11,  -12,  -12,  -12,  -12,  -12 },
        /* 12 */ { ERR,  ERR,  ERR,  ERR,  ERR,  ERR,  ERR },
    };
Negative numbers in the matrix are the accepting states.
Notice how the letter E is handled!

16 A Simple DFA Scanner
    public class SimpleDFAScanner
    {
        // Input characters.
        private static final int LETTER = 0;
        private static final int DIGIT  = 1;
        private static final int PLUS   = 2;
        private static final int MINUS  = 3;
        private static final int DOT    = 4;
        private static final int E      = 5;
        private static final int OTHER  = 6;
        private static final int ERR = -99999;  // error state
        private static final int matrix[][] = {... };
        private char ch;    // current input character
        private int state;  // current state
        ...
    }

17 A Simple DFA Scanner, cont'd
    int typeOf(char ch)
    {
        return (ch == 'E')            ? E
             : Character.isLetter(ch) ? LETTER
             : Character.isDigit(ch)  ? DIGIT
             : (ch == '+')            ? PLUS
             : (ch == '-')            ? MINUS
             : (ch == '.')            ? DOT
             :                          OTHER;
    }
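
The number DFA slide asks what would change to allow a lower-case e in an exponent. One possible answer, offered as a sketch rather than as the instructor's solution, is to map both 'E' and 'e' to the same input class in typeOf(), which leaves the transition matrix untouched. The wrapper class name is hypothetical.

```java
// Hypothetical variant of typeOf() that treats 'e' like 'E'.
class InputClassSketch {
    static final int LETTER = 0;
    static final int DIGIT  = 1;
    static final int PLUS   = 2;
    static final int MINUS  = 3;
    static final int DOT    = 4;
    static final int E      = 5;
    static final int OTHER  = 6;

    static int typeOf(char ch) {
        // The E test must come first, because 'E' and 'e' are also letters.
        return (ch == 'E' || ch == 'e') ? E
             : Character.isLetter(ch)   ? LETTER
             : Character.isDigit(ch)    ? DIGIT
             : (ch == '+')              ? PLUS
             : (ch == '-')              ? MINUS
             : (ch == '.')              ? DOT
             :                            OTHER;
    }

    public static void main(String[] args) {
        System.out.println(typeOf('e') == E);
        System.out.println(typeOf('x') == LETTER);
    }
}
```

Identifiers are unaffected, since the matrix rows for states 0 and 1 send the E column to state 1 just as they do for any other letter.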

18 A Simple DFA Scanner, cont'd
    private String nextToken()
        throws IOException
    {
        while (Character.isWhitespace(ch)) nextChar();
        if (ch == 0) return null;  // EOF?
        state = 0;  // start state
        StringBuilder buffer = new StringBuilder();
        while (state >= 0) {  // not accepting state
            state = matrix[state][typeOf(ch)];  // transit
            if ((state >= 0) || (state == ERR)) {
                buffer.append(ch);  // build token string
                nextChar();
            }
        }
        return buffer.toString();
    }
This is the heart of the scanner. Table-driven scanners can be very fast!

19 A Simple DFA Scanner, cont'd
    private void scan()
        throws IOException
    {
        nextChar();
        while (ch != 0) {  // EOF?
            String token = nextToken();
            if (token != null) {
                System.out.print("=====> \"" + token + "\" ");
                String tokenType =
                      (state ==  -2) ? "IDENTIFIER"
                    : (state ==  -5) ? "INTEGER"
                    : (state ==  -8) ? "REAL (fraction only)"
                    : (state == -12) ? "REAL"
                    :                  "*** ERROR ***";
                System.out.println(tokenType);
            }
        }
    }
How do we know which token we just got?
Demo
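
The pieces from the scanner slides can be assembled into one runnable sketch. The matrix, typeOf(), and the nextToken() loop follow the slides; everything else is an assumption made to keep the example self-contained: the class name, reading from a String instead of a file, the nextChar() implementation (not shown on the slides), and the scanAll() helper that collects results in a list.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch assembled from the slides; nextChar() and scanAll() are assumptions.
class DFAScannerSketch {
    private static final int LETTER = 0, DIGIT = 1, PLUS = 2, MINUS = 3,
                             DOT = 4, E = 5, OTHER = 6;
    private static final int ERR = -99999;  // error state

    private static final int[][] matrix = {
        /*  0 */ {   1,   4,   3,   3, ERR,   1, ERR },
        /*  1 */ {   1,   1,  -2,  -2,  -2,   1,  -2 },
        /*  2 */ { ERR, ERR, ERR, ERR, ERR, ERR, ERR },
        /*  3 */ { ERR,   4, ERR, ERR, ERR, ERR, ERR },
        /*  4 */ {  -5,   4,  -5,  -5,   6,   9,  -5 },
        /*  5 */ { ERR, ERR, ERR, ERR, ERR, ERR, ERR },
        /*  6 */ { ERR,   7, ERR, ERR, ERR, ERR, ERR },
        /*  7 */ {  -8,   7,  -8,  -8,  -8,   9,  -8 },
        /*  8 */ { ERR, ERR, ERR, ERR, ERR, ERR, ERR },
        /*  9 */ { ERR,  11,  10,  10, ERR, ERR, ERR },
        /* 10 */ { ERR,  11, ERR, ERR, ERR, ERR, ERR },
        /* 11 */ { -12,  11, -12, -12, -12, -12, -12 },
        /* 12 */ { ERR, ERR, ERR, ERR, ERR, ERR, ERR },
    };

    private final String source;
    private int pos = 0;
    private char ch;    // current input character; 0 means end of input
    private int state;  // current state

    DFAScannerSketch(String source) {
        this.source = source;
        nextChar();
    }

    // Assumed implementation: advance through the string, yielding 0 at the end.
    private void nextChar() {
        ch = (pos < source.length()) ? source.charAt(pos++) : '\0';
    }

    private static int typeOf(char ch) {
        return (ch == 'E') ? E
             : Character.isLetter(ch) ? LETTER
             : Character.isDigit(ch)  ? DIGIT
             : (ch == '+') ? PLUS
             : (ch == '-') ? MINUS
             : (ch == '.') ? DOT
             : OTHER;
    }

    private String nextToken() {
        while (Character.isWhitespace(ch)) nextChar();
        if (ch == 0) return null;  // end of input

        state = 0;  // start state
        StringBuilder buffer = new StringBuilder();
        while (state >= 0) {  // not yet in an accepting (negative) state
            state = matrix[state][typeOf(ch)];
            if ((state >= 0) || (state == ERR)) {
                buffer.append(ch);
                nextChar();
            }
        }
        return buffer.toString();
    }

    // The final (negative) state identifies the kind of token just scanned.
    private String tokenType() {
        return (state ==  -2) ? "IDENTIFIER"
             : (state ==  -5) ? "INTEGER"
             : (state ==  -8) ? "REAL (fraction only)"
             : (state == -12) ? "REAL"
             : "*** ERROR ***";
    }

    static List<String> scanAll(String source) {
        DFAScannerSketch scanner = new DFAScannerSketch(source);
        List<String> results = new ArrayList<>();
        while (scanner.ch != 0) {
            String token = scanner.nextToken();
            if (token != null) results.add(token + ":" + scanner.tokenType());
        }
        return results;
    }

    public static void main(String[] args) {
        for (String result : scanAll("x1 42 3.14 6.02E23")) {
            System.out.println(result);
        }
    }
}
```

This answers the slide's closing question: the accepting state reached when the loop exits tells the scanner which token it just got.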

