CMPE 152: Compiler Design August 28 Class Meeting

Presentation transcript:

CMPE 152: Compiler Design August 28 Class Meeting Department of Computer Engineering San Jose State University Fall 2018 Instructor: Ron Mak www.cs.sjsu.edu/~mak

Pascal-Specific Front End Classes
PascalParserTD is a subclass of Parser and implements the parse() and get_error_count() methods for Pascal. TD stands for “top down”.
PascalScanner is a subclass of Scanner and implements the extract_token() method for Pascal.
Strategy Design Pattern

The Pascal Parser Class
The initial version of method parse() does hardly anything, but it forces the scanner into action and serves our purpose of doing end-to-end testing.
frontend/pascal/PascalParserTD.cpp
    void PascalParserTD::parse() throw (string)
    {
        steady_clock::time_point start_time = steady_clock::now();

        int last_line_number = 0;  // stays 0 if the source has no tokens
        Token *token = nullptr;

        // What does this while loop do?
        while ((token = next_token(token)) != nullptr)
        {
            last_line_number = token->get_line_number();
        }

        steady_clock::time_point end_time = steady_clock::now();
        double elapsed_time =
                duration_cast<duration<double>>(end_time - start_time).count();

        // Send the parser summary message.
        Message message(PARSER_SUMMARY,
                        LINE_COUNT, to_string(last_line_number),
                        ERROR_COUNT, to_string(get_error_count()),
                        ELAPSED_TIME, to_string(elapsed_time));
        send_message(message);
    }
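The elapsed-time measurement in parse() can be isolated into a small helper. This is a minimal sketch, not part of the course framework: the name timed_seconds is hypothetical, and only the std::chrono calls come from the slide.

```cpp
#include <chrono>

// Hypothetical helper (not in the course code): runs `work` once and
// returns the elapsed wall-clock time in seconds, measured with
// steady_clock as in parse().
template <typename Work>
double timed_seconds(Work work)
{
    using namespace std::chrono;

    steady_clock::time_point start_time = steady_clock::now();
    work();
    steady_clock::time_point end_time = steady_clock::now();

    return duration_cast<duration<double>>(end_time - start_time).count();
}
```

steady_clock is monotonic, so the measured interval cannot go negative if the system clock is adjusted while the work is running.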

The Pascal Scanner Class
The initial version of method extract_token() doesn’t do much either: it creates and returns a default token, or returns nullptr at the end of the file.
frontend/pascal/PascalScanner.cpp
    Token *PascalScanner::extract_token() throw (string)
    {
        Token *token;
        char current_ch = current_char();

        // Construct the next token.  The current character determines the
        // token type.
        if (current_ch == Source::END_OF_FILE)
        {
            token = nullptr;
        }
        else
        {
            token = new Token(source);
        }

        return token;
    }
Remember that the Scanner method next_token() calls the abstract method extract_token(). Here, the Scanner subclass PascalScanner implements method extract_token().
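The relationship between next_token() and extract_token() can be sketched in isolation. The classes below (SimpleToken, SimpleScanner, OneCharScanner) and the function count_tokens are hypothetical, simplified stand-ins, not the framework's actual declarations; they assume the source is an in-memory string and use smart pointers instead of raw new.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the framework's Token class.
struct SimpleToken
{
    std::string text;
};

// Hypothetical stand-in for the language-independent Scanner.
class SimpleScanner
{
public:
    explicit SimpleScanner(std::string input) : input_(std::move(input)) {}
    virtual ~SimpleScanner() = default;

    // Language-independent entry point: delegates to the subclass.
    std::unique_ptr<SimpleToken> next_token() { return extract_token(); }

protected:
    // Each language-specific scanner supplies its own extraction rule.
    virtual std::unique_ptr<SimpleToken> extract_token() = 0;

    bool at_end() const { return pos_ >= input_.size(); }
    char current_char() const { return input_[pos_]; }
    void next_char() { ++pos_; }

private:
    std::string input_;
    std::size_t pos_ = 0;
};

// Placeholder scanner: one token per character, nullptr at end of input.
class OneCharScanner : public SimpleScanner
{
public:
    using SimpleScanner::SimpleScanner;

protected:
    std::unique_ptr<SimpleToken> extract_token() override
    {
        if (at_end()) return nullptr;

        auto token = std::make_unique<SimpleToken>();
        token->text = std::string(1, current_char());
        next_char();  // consume the character
        return token;
    }
};

// Drives the scanner until it runs out of tokens, as the initial
// parse() loop does, and reports how many tokens were produced.
int count_tokens(SimpleScanner &scanner)
{
    int count = 0;
    while (scanner.next_token() != nullptr) ++count;
    return count;
}
```

Because count_tokens() only sees the SimpleScanner interface, swapping in a scanner for another language would not require changing it.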

The Token Class
The Token class’s default extract() method extracts just one character from the source. This method will be overridden by the various token subclasses. It serves our purpose of doing end-to-end testing.
frontend/Token.cpp
    void Token::extract() throw (string)
    {
        // string(1, ch) builds a one-character string; std::to_string(ch)
        // would promote the char to int and yield its numeric code.
        text = string(1, current_char());
        next_char();  // consume current character
    }

The Token Class, cont’d A character (or a token) is “consumed” after it has been read and processed, and the next one is about to be read. If you forget to consume, you will loop forever on the same character or token.
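A minimal sketch of the consume rule, assuming the source is an in-memory string and the position is a plain index (split_words is a hypothetical name, not part of the course code). Every branch of the loop advances the position, which is what guarantees termination.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Splits the input into runs of letters, consuming every character
// exactly once.
std::vector<std::string> split_words(const std::string &input)
{
    std::vector<std::string> words;
    std::size_t pos = 0;

    while (pos < input.size())
    {
        if (std::isalpha(static_cast<unsigned char>(input[pos])))
        {
            std::string word;
            while (pos < input.size() &&
                   std::isalpha(static_cast<unsigned char>(input[pos])))
            {
                word += input[pos];
                ++pos;  // consume the letter just appended
            }
            words.push_back(word);
        }
        else
        {
            ++pos;  // consume the non-letter; without this, the loop never ends
        }
    }

    return words;
}
```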

A Front End Factory Class A language-specific parser goes together with a scanner for the same language. But we don’t want the framework classes to be tied to a specific language. Framework classes should be language-independent. We use a factory class to create a matching parser-scanner pair. Factory Method Design Pattern

A Front End Factory Class, cont’d
Good: Arguments to the create_parser() method enable it to create and return a parser bound to an appropriate scanner.
    Parser *parser = FrontendFactory::create_parser( … );
Variable parser doesn’t have to know what kind of parser subclass the factory created. Once again, the idea is to maintain loose coupling.
“Coding to the interface.”

A Front End Factory Class, cont’d
Good:
    Parser *parser = FrontendFactory::create_parser( … );
Bad:
    PascalParserTD *parser = new PascalParserTD( … );
Why is this bad? Now variable parser is tied to a specific language.

A Front End Factory Class, cont’d
frontend/FrontendFactory.cpp
    Parser *FrontendFactory::create_parser(string language, string type,
                                           Source *source)
        throw (string)
    {
        if ((language == "Pascal") && (type == "top-down"))
        {
            Scanner *scanner = new PascalScanner(source);
            return new PascalParserTD(scanner);
        }
        else if (language != "Pascal")
        {
            // Throw by value to match the throw (string) specification.
            throw string("Parser factory: Invalid language '" +
                         language + "'");
        }
        else
        {
            throw string("Parser factory: Invalid type '" +
                         type + "'");
        }
    }
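The factory idea can be tried out in a self-contained form. The sketch below is an assumption-laden simplification, not the framework code: the namespace sketch, the name() method, the two-argument create_parser(), std::unique_ptr ownership, and std::invalid_argument are all illustrative choices.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

namespace sketch
{
    // Hypothetical stand-in for the language-independent Parser.
    class Parser
    {
    public:
        virtual ~Parser() = default;
        virtual std::string name() const = 0;
    };

    // Hypothetical stand-in for the Pascal-specific subclass.
    class PascalParserTD : public Parser
    {
    public:
        std::string name() const override { return "Pascal top-down"; }
    };

    // Factory: callers receive a Parser and never name the subclass.
    std::unique_ptr<Parser> create_parser(const std::string &language,
                                          const std::string &type)
    {
        if (language == "Pascal" && type == "top-down")
        {
            return std::make_unique<PascalParserTD>();
        }
        if (language != "Pascal")
        {
            throw std::invalid_argument(
                "Parser factory: Invalid language '" + language + "'");
        }
        throw std::invalid_argument(
            "Parser factory: Invalid type '" + type + "'");
    }
}
```

A caller writes `auto parser = sketch::create_parser("Pascal", "top-down");` and works only through the Parser interface, so supporting another language would mean extending the factory rather than editing its callers.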

Initial Back End Subclasses The CodeGenerator and Executor subclasses will only be (do-nothing) stubs for now. Strategy Design Pattern

The Code Generator Class
All the process() method does for now is send the COMPILER_SUMMARY message:
  number of instructions generated (none for now)
  code generation time (nearly no time at all for now)
backend/compiler/CodeGenerator.cpp
    void CodeGenerator::process(ICode *icode, SymTab *symtab) throw (string)
    {
        steady_clock::time_point start_time = steady_clock::now();
        int instruction_count = 0;

        // Send the compiler summary message.
        steady_clock::time_point end_time = steady_clock::now();
        double elapsed_time =
            duration_cast<duration<double>>(end_time - start_time).count();
        Message message(COMPILER_SUMMARY,
                        INSTRUCTION_COUNT, to_string(instruction_count),
                        ELAPSED_TIME, to_string(elapsed_time));
        send_message(message);
    }

The Executor Class
All the process() method does for now is send the INTERPRETER_SUMMARY message:
  number of statements executed (none for now)
  number of runtime errors (none for now)
  execution time (nearly no time at all for now)
backend/interpreter/Executor.cpp
    void Executor::process(ICode *icode, SymTab *symtab) throw (string)
    {
        steady_clock::time_point start_time = steady_clock::now();
        int execution_count = 0;
        int runtime_errors = 0;

        // Send the interpreter summary message.
        steady_clock::time_point end_time = steady_clock::now();
        double elapsed_time =
                duration_cast<duration<double>>(end_time - start_time).count();
        Message message(INTERPRETER_SUMMARY,
                        EXECUTION_COUNT, to_string(execution_count),
                        ERROR_COUNT, to_string(runtime_errors),
                        ELAPSED_TIME, to_string(elapsed_time));
        send_message(message);
    }

A Back End Factory Class
backend/BackendFactory.cpp
    Backend *BackendFactory::create_backend(string operation) throw (string)
    {
        if (operation == "compile")
        {
            return new CodeGenerator();
        }
        else if (operation == "execute")
        {
            return new Executor();
        }
        else
        {
            // Throw by value to match the throw (string) specification.
            throw string("Backend factory: Invalid operation '" +
                         operation + "'");
        }
    }

End-to-End: Program Listings
Here’s the heart of the main Pascal class’s constructor:
Pascal.cpp
    source = new Source(input);
    source->add_message_listener(this);

    parser = FrontendFactory::create_parser("Pascal", "top-down", source);
    parser->add_message_listener(this);
    parser->parse();
    source->close();

    symtab = parser->get_symtab();
    icode = parser->get_icode();

    backend = BackendFactory::create_backend(operation);
    backend->add_message_listener(this);
    backend->process(icode, symtab);
The front end parser creates the intermediate code and the symbol table of the intermediate tier.
The back end processes the intermediate code and the symbol table.

Listening to Messages
Class Pascal implements the MessageListener interface.
Pascal.cpp
    const string Pascal::SOURCE_LINE_FORMAT = "%03d %s\n";

    void Pascal::message_received(Message& message)
    {
        MessageType type = message.get_type();

        switch (type)
        {
            case SOURCE_LINE:
            {
                string line_number = message[LINE_NUMBER];
                string line_text = message[LINE_TEXT];

                printf(SOURCE_LINE_FORMAT.c_str(),
                       stoi(line_number), line_text.c_str());
                break;
            }
        }
    }
Demo
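The listener mechanism can be sketched without the rest of the framework. Everything here is a hypothetical simplification: SketchMessage, SketchListener, SketchProducer, and ListingCollector are illustrative names, message types and field keys are plain strings rather than the framework's enumerations, and the formatted lines are collected in a vector instead of being printed.

```cpp
#include <cstdio>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified message: a type tag plus named string fields.
struct SketchMessage
{
    std::string type;
    std::map<std::string, std::string> fields;
};

class SketchListener
{
public:
    virtual ~SketchListener() = default;
    virtual void message_received(const SketchMessage &message) = 0;
};

// Anything that sends messages keeps a list of registered listeners.
class SketchProducer
{
public:
    void add_message_listener(SketchListener *listener)
    {
        listeners_.push_back(listener);
    }

    void send_message(const SketchMessage &message) const
    {
        for (SketchListener *listener : listeners_)
        {
            listener->message_received(message);
        }
    }

private:
    std::vector<SketchListener *> listeners_;
};

// Formats SOURCE_LINE messages as "%03d %s" and ignores all others.
class ListingCollector : public SketchListener
{
public:
    void message_received(const SketchMessage &message) override
    {
        if (message.type != "SOURCE_LINE") return;

        char number[16];
        std::snprintf(number, sizeof number, "%03d",
                      std::stoi(message.fields.at("LINE_NUMBER")));
        lines.push_back(std::string(number) + " " +
                        message.fields.at("LINE_TEXT"));
    }

    std::vector<std::string> lines;
};
```

The producer knows nothing about listings; it only calls message_received(), which keeps the source and parser loosely coupled to whatever consumes their messages.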

Is it Really Worth All this Trouble?
Major software engineering challenges:
  Managing change.
  Managing complexity.
To help manage change, use the open-closed principle.
  Close the code for modification.
  Open the code for extension.
Closed: The language-independent framework classes.
Open: The language-specific subclasses.

Is it Really Worth All this Trouble? cont’d
Techniques to help manage complexity:
  Partitioning
  Loose coupling
  Incremental development: always build upon working code.
  Good object-oriented design with design patterns.

Source Files from the Book Download the Java source code from each chapter of the book: http://www.cs.sjsu.edu/~mak/CMPE152/sources/ You will not survive this course if you use a simple text editor like Notepad to view and edit the Java code. The complete Pascal interpreter in Chapter 12 contains over 120 classes.

Integrated Development Environment (IDE) You can use either Eclipse CDT or NetBeans. Eclipse is preferred because later you will be able to use an ANTLR plug-in. Learn how to create projects, edit source files, single-step execution, set breakpoints, examine variables, read stack dumps, etc.

Pascal-Specific Front End Classes

The Payoff
Now that we have …
  Source language-independent framework classes
  Pascal-specific subclasses (mostly just placeholders for now)
  An end-to-end test (the program listing generator)
… we can work on the individual components without worrying (too much) about breaking the rest of the code.

Front End Framework Classes

Pascal-Specific Subclasses

Pascal-Specific Token Classes
Each of the classes PascalWordToken, PascalNumberToken, PascalStringToken, PascalSpecialSymbolToken, and PascalErrorToken is a subclass of class PascalToken. PascalToken is a subclass of class Token.
Each Pascal token subclass overrides the default extract() method of class Token. The default method could only create single-character tokens.
Loosely coupled. Highly cohesive.
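A minimal sketch of what overriding extract() buys, using hypothetical stand-ins (Cursor, BaseToken, WordToken) rather than the framework's Source, Token, and PascalWordToken. The base class takes one character; the subclass takes a whole run of letters and digits.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical cursor over an in-memory source string.
class Cursor
{
public:
    explicit Cursor(std::string input) : input_(std::move(input)) {}

    bool at_end() const { return pos_ >= input_.size(); }
    char current() const { return input_[pos_]; }
    void advance() { ++pos_; }

private:
    std::string input_;
    std::size_t pos_ = 0;
};

// Default behavior: a token is exactly one character.
// The caller must not call extract() at the end of the input.
class BaseToken
{
public:
    virtual ~BaseToken() = default;

    virtual void extract(Cursor &cursor)
    {
        text = std::string(1, cursor.current());
        cursor.advance();  // consume the character
    }

    std::string text;
};

// Override: a word token is a maximal run of letters and digits.
class WordToken : public BaseToken
{
public:
    void extract(Cursor &cursor) override
    {
        text.clear();
        while (!cursor.at_end() &&
               std::isalnum(static_cast<unsigned char>(cursor.current())))
        {
            text += cursor.current();
            cursor.advance();  // consume each character of the word
        }
    }
};
```

Code that holds a BaseToken reference calls extract() the same way in both cases; the subclass decides how much of the source one token covers.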

PascalTokenType Details

PascalTokenType
Each token is an enumerated value.
frontend/pascal/PascalToken.h
    enum class PascalTokenType
    {
        // Reserved words.
        AND, ARRAY, BEGIN, CASE, CONST, DIV, DO, DOWNTO, ELSE, END,
        FILE, FOR, FUNCTION, GOTO, IF, IN, LABEL, MOD, NIL, NOT,
        OF, OR, PACKED, PROCEDURE, PROGRAM, RECORD, REPEAT, SET,
        THEN, TO, TYPE, UNTIL, VAR, WHILE, WITH,

        // Special symbols.
        PLUS, MINUS, STAR, SLASH, COLON_EQUALS,
        DOT, COMMA, SEMICOLON, COLON, QUOTE,
        EQUALS, NOT_EQUALS, LESS_THAN, LESS_EQUALS,
        GREATER_EQUALS, GREATER_THAN, LEFT_PAREN, RIGHT_PAREN,
        LEFT_BRACKET, RIGHT_BRACKET, LEFT_BRACE, RIGHT_BRACE,
        UP_ARROW, DOT_DOT,

        IDENTIFIER, INTEGER, REAL, STRING,
        ERROR, END_OF_FILE,
    };

PascalTokenType, cont’d
The static map RESERVED_WORDS contains all of Pascal’s reserved word strings. Each entry’s key is the reserved word string, and its value is the corresponding enumerated value.
frontend/pascal/PascalToken.cpp
    vector<string> rw_strings =
    {
        "AND", "ARRAY", "BEGIN", "CASE", "CONST", "DIV", "DO", "DOWNTO",
        "ELSE", "END", "FILE", "FOR", "FUNCTION", "GOTO", "IF", "IN",
        "LABEL", "MOD", "NIL", "NOT", "OF", "OR", "PACKED", "PROCEDURE",
        "PROGRAM", "RECORD", "REPEAT", "SET", "THEN", "TO", "TYPE",
        "UNTIL", "VAR", "WHILE", "WITH"
    };

    vector<PascalTokenType> rw_keys =
    {
        PascalTokenType::AND,
        PascalTokenType::ARRAY,
        PascalTokenType::BEGIN,
        PascalTokenType::CASE,
        ...
    };

PascalTokenType, cont’d
frontend/pascal/PascalToken.cpp
    for (int i = 0; i < rw_strings.size(); i++)
    {
        RESERVED_WORDS[rw_strings[i]] = rw_keys[i];
    }

PascalTokenType, cont’d
We can test whether a token is a reserved word or an identifier:
frontend/pascal/tokens/PascalWordToken.cpp
    // Is it a reserved word or an identifier?
    string upper_case(text);
    transform(upper_case.begin(), upper_case.end(),
              upper_case.begin(), ::toupper);

    if (PascalToken::RESERVED_WORDS.find(upper_case)
            != PascalToken::RESERVED_WORDS.end())
    {
        // Reserved word.
        type = (TokenType) PascalToken::RESERVED_WORDS[upper_case];
        value = new DataValue(upper_case);
    }
    else
    {
        // Identifier.
        type = (TokenType) PT_IDENTIFIER;
    }
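The reserved-word test can be run on its own with a cut-down table. This sketch is illustrative only: WordKind, reserved_words(), and classify_word() are hypothetical names, the table holds just four of Pascal's reserved words, and it is built from an initializer list rather than from two parallel vectors.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Hypothetical, abbreviated token kinds for the sketch.
enum class WordKind { BEGIN, END, IF, THEN, IDENTIFIER };

// Lookup table keyed by the upper-case spelling of each reserved word.
const std::map<std::string, WordKind> &reserved_words()
{
    static const std::map<std::string, WordKind> table = {
        {"BEGIN", WordKind::BEGIN},
        {"END",   WordKind::END},
        {"IF",    WordKind::IF},
        {"THEN",  WordKind::THEN},
    };
    return table;
}

// Pascal is case-insensitive, so fold to upper case before the lookup.
WordKind classify_word(std::string text)
{
    std::transform(text.begin(), text.end(), text.begin(),
                   [](unsigned char ch) {
                       return static_cast<char>(std::toupper(ch));
                   });

    auto it = reserved_words().find(text);
    return it != reserved_words().end() ? it->second
                                        : WordKind::IDENTIFIER;
}
```

Keeping the iterator returned by find() avoids a second lookup through operator[], and keeping each key next to its value removes the risk of two parallel vectors drifting out of step.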

PascalTokenType, cont’d
Static hash table SPECIAL_SYMBOLS contains all of Pascal’s special symbols.
Each entry’s key is the string, such as "<", "=", or "<=".
Each entry’s value is the corresponding enumerated value.
frontend/pascal/PascalToken.cpp
    vector<string> ss_strings =
    {
        "+", "-", "*", "/", ":=", ".", ",", ";", ":", "'", "=", "<>",
        "<", "<=", ">=", ">", "(", ")", "[", "]", "{", "}",  "^", ".."
    };

    vector<PascalTokenType> ss_keys =
    {
        PascalTokenType::PLUS,
        PascalTokenType::MINUS,
        PascalTokenType::STAR,
        PascalTokenType::SLASH,
        ...
    };

PascalTokenType, cont’d
frontend/pascal/PascalToken.cpp
    for (int i = 0; i < ss_strings.size(); i++)
    {
        SPECIAL_SYMBOLS[ss_strings[i]] = ss_keys[i];
    }
We can test whether a token is a special symbol:
frontend/pascal/PascalScanner.cpp
    if (PascalToken::SPECIAL_SYMBOLS.find(string_ch)
           != PascalToken::SPECIAL_SYMBOLS.end())
    {
        token = new PascalSpecialSymbolToken(source);
    }
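Several special symbols share a first character (":" and ":=", "." and "..", "<" and "<=" and "<>"), so a scanner has to prefer the longer spelling. The sketch below shows one way to do that over an in-memory string; special_symbols() and match_symbol() are hypothetical names, and the slides do not show how PascalSpecialSymbolToken actually resolves these cases.

```cpp
#include <cstddef>
#include <set>
#include <string>

// The special-symbol spellings listed on the slide.
const std::set<std::string> &special_symbols()
{
    static const std::set<std::string> symbols = {
        "+", "-", "*", "/", ":=", ".", ",", ";", ":", "'", "=", "<>",
        "<", "<=", ">=", ">", "(", ")", "[", "]", "{", "}", "^", ".."
    };
    return symbols;
}

// Returns the longest special symbol that starts at `pos`,
// or an empty string if no symbol starts there.
std::string match_symbol(const std::string &input, std::size_t pos)
{
    // Try the two-character spelling first, then fall back to one.
    for (std::size_t len = 2; len >= 1; --len)
    {
        if (pos + len <= input.size())
        {
            std::string candidate = input.substr(pos, len);
            if (special_symbols().count(candidate) > 0)
            {
                return candidate;
            }
        }
    }
    return "";
}
```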

An Apt Quote? Before I came here, I was confused about this subject. Having listened to your lecture, I am still confused, but on a higher level. Enrico Fermi, physicist, 1901-1954