Presentation is loading. Please wait.

Presentation is loading. Please wait.

CMPE 152: Compiler Design September 3 Class Meeting

Similar presentations


Presentation on theme: "CMPE 152: Compiler Design September 3 Class Meeting"— Presentation transcript:

1 CMPE 152: Compiler Design September 3 Class Meeting
Department of Computer Engineering San Jose State University Fall 2019 Instructor: Ron Mak

2 Chapter 3 Syntax Diagrams

3 How to Scan for Tokens Suppose the source line contains IF (index >= 10) THEN The scanner skips over the leading blanks. The current character is I, so the next token must be a word. The scanner extracts a word token by copying characters up to but not including the first character that is not valid for a word, which in this case is a blank. The blank becomes the current character. The scanner determines that the word is a reserved word.

4 How to Scan for Tokens, cont’d
The scanner skips over any blanks between tokens. The current character is (. The next token must be a special symbol. After extracting the special symbol token, the current character is i. The next token must be a word. After extracting the word token, the current character is a blank.

5 How to Scan for Tokens, cont’d
Skip the blank. The current character is >. Extract the special symbol token. The current character is a blank. Skip the blank. The current character is 1, so the next token must be a number. After extracting the number token, the current character is ).

6 How to Scan for Tokens, cont’d
Extract the special symbol token. The current character is a blank. Skip the blank. The current character is T, so the next token must be a word. Extract the word token. Determine that it’s a reserved word. The current character is \n, so the scanner is done with this line.

7 Basic Scanning Algorithm
Skip any blanks until the current character is nonblank. In Pascal, a comment and the end-of-line character each should be treated as a blank. The current (nonblank) character determines what the next token is and becomes that token’s first character. Extract the rest of the next token by copying successive characters up to but not including the first character that does not belong to that token. Extracting a token consumes all the source characters that constitute the token. After extracting a token, the current character is the first character after the last character of that token.

8 Pascal-Specific Subclasses

9 Class PascalScanner The first character determines the type
Token *PascalScanner::extract_token() throw (string) {     skip_white_space();     Token *token;     char current_ch = current_char();     string string_ch = " ";     string_ch[0] = current_ch;     // Construct the next token.  The current character determines the     // token type.     if (current_ch == Source::END_OF_FILE)     {         token = nullptr;     }     else if (isalpha(current_ch))         token = new PascalWordToken(source);     else if (isdigit(current_ch))         token = new PascalNumberToken(source); ... return token; } The first character determines the type of the next token. wci/frontend/pascal/PascalScanner.cpp

10 Pascal-Specific Token Classes
Each class PascalWordToken, PascalNumberToken, PascalStringToken, PascalSpecial-SymbolToken, and PascalErrorToken is is a subclass of class PascalToken. PascalToken is a subclass of class Token. Each Pascal token subclass overrides the default extract() method of class Token. The default method could only create single-character tokens. Loosely coupled. Highly cohesive.

11 Class PascalWordToken
void PascalWordToken::extract() throw (string) {     char current_ch = current_char();     // Get the word characters (letter or digit). The scanner has     // already determined that the first character is a letter.     while (isalnum(current_ch))     {         text += current_ch;         current_ch = next_char();  // consume character     }     // Is it a reserved word or an identifier?     string upper_case = to_upper(text);     if (PascalToken::RESERVED_WORDS.find(upper_case)             != PascalToken::RESERVED_WORDS.end())         // Reserved word.         type = (TokenType) PascalToken::RESERVED_WORDS[upper_case];         value = new DataValue(upper_case);     else         // Identifier.         type = (TokenType) PT_IDENTIFIER; } frontend/pascal/tokens/PascalWordToken.cpp

12 Syntax Error Handling Error handling is a three-step process:
Detect the presence of a syntax error. Flag the error by pointing it out or highlighting it, and display a descriptive error message. Recover by moving past the error and resume parsing. For now, we’ll just move on, starting with the current character, and attempt to extract the next token. SYNTAX_ERROR message source line number beginning source position token text syntax error message

13 Class PascalParserTD frontend/pascal/PascalParserTD.cpp
void PascalParserTD::parse() throw (string) {     ...     // Loop over each token until the end of file.     while ((token = next_token(token)) != nullptr)     {         TokenType token_type = token->get_type();         last_line_number = token->get_line_number();         Object value = token->get_value();         string type_str;         string value_str;         switch ((PascalTokenType) token_type)         {             case PT_STRING:             {                 type_str = "STRING";                 value_str = cast(value, string);                 break;             } frontend/pascal/PascalParserTD.cpp

14 Class PascalParserTD, cont’d
frontend/pascal/PascalParserTD.cpp             case PT_IDENTIFIER:             {                 type_str = "IDENTIFIER";                 value_str = "";                 break;             }             case PT_INTEGER:                 type_str = "INTEGER";                 value_str = to_string(cast(value, int));             case PT_REAL:                 type_str = "REAL";                 value_str = to_string(cast(value, float));             case PT_ERROR: break;

15 Class PascalParserTD, cont’d
            default:  // reserved word or special character             {                 // Reserved word                 if (!value.empty())                 {                     value_str = cast(value, string);                     type_str = value_str;                 }                 // Special symbol                 else                     type_str =                         PascalToken::SPECIAL_SYMBOL_NAMES[                                            (PascalTokenType) token_type];                 break;             }         } frontend/pascal/PascalParserTD.cpp

16 Class PascalParserTD, cont’d
frontend/pascal/PascalParserTD.cpp         if (token_type != (TokenType) PT_ERROR)         {             // Format and send a message about each token.             Message message(TOKEN);             message.content.token.token_type = type_str.c_str();             message.content.token.line_number = token->get_line_number();             message.content.token.position = token->get_position();             message.content.token.token_text = strdup(token->get_text().c_str());             message.content.token.token_value = value_str.c_str();             send_message(message);             free(const_cast<char *>(message.content.token.token_text));         }         else             PascalErrorCode error_code = (PascalErrorCode) cast(value, int);             error_handler.flag(token, error_code, this);     } ... }

17 Program: Pascal Tokenizer
Verify the correctness of the Pascal token subclasses. Verify the correctness of the Pascal scanner. Demo (Chapter 3)

18 Lab Assignment #2 Modify the Pascal scanner by adding the following new tokens: Reserved words LOOP, WHEN, and AGAIN. Special symbol (3 characters) arrow: ==> Run the Pascal tokenizer on the following source program. Team assignment, due Monday, Sept. 9 Only the code from Chapter 3. Remember that Pascal is case insensitive. Which Pascal compiler source files do you need to modify?

19 Lab Assignment #2, cont’d
PROGRAM LoopTest; VAR i : integer; BEGIN {main} i := 1; LOOP writeln('i = ', i); When i = 10 ==>; i := i + 1; agaIn; writeln('Done!'); END.


Download ppt "CMPE 152: Compiler Design September 3 Class Meeting"

Similar presentations


Ads by Google