CSC 3315 (Spring 2009): Lexical and Syntax Analysis. Hamid Harroud, School of Science and Engineering, Akhawayn University

1 CSC 3315 Lexical and Syntax Analysis (Spring 2009). Hamid Harroud, School of Science and Engineering, Akhawayn University. http://www.aui.ma/~H.Harroud/csc3315/

2 Constructing a Lexical Analyzer

    state = S                          // S is the start state
    repeat {
        k = next character from the input
        if k == EOF then               // the end of input
            if state is a final state then accept else reject
        state = T[state, k]
        if state == empty then reject  // got stuck
    }
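The table-driven loop above can be sketched in Java. This is only an illustration under stated assumptions: the two-state automaton for identifiers, the state numbers, the character classes, and the names DfaSketch, classify, and accepts are all hypothetical and do not come from the slides.

```java
/** Sketch of the table-driven loop for a DFA that accepts identifiers. */
class DfaSketch {
    // Hypothetical state numbers; STUCK plays the role of "empty" in the pseudocode.
    static final int START = 0, IN_ID = 1, STUCK = -1;

    // Character classes (table columns): 0 = letter, 1 = digit, 2 = anything else.
    static int classify(char c) {
        if (Character.isLetter(c)) return 0;
        if (Character.isDigit(c)) return 1;
        return 2;
    }

    // T[state][class] gives the next state.
    static final int[][] T = {
        /* START */ { IN_ID, STUCK, STUCK },
        /* IN_ID */ { IN_ID, IN_ID, STUCK },
    };

    static boolean isFinal(int state) {
        return state == IN_ID;
    }

    /** Runs the automaton over the whole input and reports accept or reject. */
    static boolean accepts(String input) {
        int state = START;
        for (int i = 0; i < input.length(); i++) {
            state = T[state][classify(input.charAt(i))];
            if (state == STUCK) return false;   // got stuck
        }
        return isFinal(state);                  // end of input reached
    }

    public static void main(String[] args) {
        System.out.println(accepts("x1"));
        System.out.println(accepts("1x"));
    }
}
```

Grouping characters into classes keeps the table small; a table indexed by raw character codes would behave the same way.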

3 Constructing a Lexical Analyzer

4

    int LexAnalyzer() {
        getChar();
        if (isLetter(nextChar)) {
            addChar();
            getChar();
            while (isLetter(nextChar) || isDigit(nextChar)) {
                addChar();
                getChar();
            }
            return lookup(lexeme);
        }
        ...

5 Constructing a Lexical Analyzer

    int LexAnalyzer() {
        getChar();
        if (isLetter(nextChar)) {
            ...
        } else if (isDigit(nextChar)) {
            addChar();
            getChar();
            while (isDigit(nextChar)) {
                addChar();
                getChar();
            }
            return INT_LIT;
        }
        ...
    }
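A self-contained Java version of the hand-written analyzer might look like the sketch below. The helper names getChar, addChar, lookup, lexeme, nextChar, and INT_LIT follow the slides; the class name HandLexer, the Token enum, the keyword table, the whitespace skipping, and the UNKNOWN and EOF tokens are assumptions added so the example runs. Unlike the slide version, this sketch reads one character ahead in the constructor and keeps that lookahead between calls, so the character that ended one token is not lost when the next call begins.

```java
import java.util.Map;

/** Sketch of a hand-written lexer for identifiers and integer literals. */
class HandLexer {
    enum Token { IDENT, INT_LIT, IF, WHILE, UNKNOWN, EOF }

    // Hypothetical reserved-word table consulted by lookup().
    static final Map<String, Token> KEYWORDS =
        Map.of("if", Token.IF, "while", Token.WHILE);

    private final String input;
    private int pos = 0;
    private int nextChar;                         // -1 at end of input
    final StringBuilder lexeme = new StringBuilder();

    HandLexer(String input) {
        this.input = input;
        getChar();                                // prime the one-character lookahead
    }

    private void getChar() {
        nextChar = pos < input.length() ? input.charAt(pos++) : -1;
    }

    private void addChar() {
        lexeme.append((char) nextChar);
    }

    private static boolean isLetter(int c) { return c >= 0 && Character.isLetter(c); }
    private static boolean isDigit(int c)  { return c >= 0 && Character.isDigit(c); }

    // Reserved words win over plain identifiers.
    private Token lookup(String s) {
        return KEYWORDS.getOrDefault(s, Token.IDENT);
    }

    /** Returns the next token; its text is left in lexeme. */
    Token lex() {
        lexeme.setLength(0);
        while (nextChar == ' ' || nextChar == '\t' || nextChar == '\n') getChar();
        if (nextChar == -1) return Token.EOF;
        if (isLetter(nextChar)) {
            addChar(); getChar();
            while (isLetter(nextChar) || isDigit(nextChar)) { addChar(); getChar(); }
            return lookup(lexeme.toString());
        } else if (isDigit(nextChar)) {
            addChar(); getChar();
            while (isDigit(nextChar)) { addChar(); getChar(); }
            return Token.INT_LIT;
        }
        addChar(); getChar();                     // consume the unrecognized character
        return Token.UNKNOWN;
    }
}
```

Calling lex() repeatedly until it returns Token.EOF walks through the input one token at a time.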

6 Lexical Errors Consider the following two programs:

7 Lexical Errors

8 Jlex: a scanner generator

    jlex specification (xxx.jlex) -> JLex.Main (java) -> generated scanner (xxx.jlex.java)
    xxx.jlex.java -> javac -> Yylex.class
    Yylex.class + input program (test.sim) -> P.main (java) -> Output of P.main

9 Jlex: a scanner generator

    import java.io.FileReader;
    import java.io.IOException;

    public class P {
        public static void main(String[] args) throws IOException {
            FileReader inFile = new FileReader(args[0]);
            Yylex scanner = new Yylex(inFile);
            Symbol token = scanner.next_token();
            while (token.sym != sym.EOF) {
                switch (token.sym) {
                case sym.INTLITERAL:
                    System.out.println("INTLITERAL ("
                        + ((IntLitTokenVal) token.value).intVal + ")");
                    break;
                …
                }
                token = scanner.next_token();
            }
        }
    }

10 Regular expression rules

    regular-expression    { action }

The regular expression is the pattern to be matched; the action is the code to be executed when the pattern is matched.

When the next_token() method is called, it repeats the following two steps until an action executes a return:
Find the longest sequence of characters in the input (starting with the current character) that matches a pattern.
Perform the associated action.
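The match-then-act loop can be imitated in plain Java. The sketch below is not what JLex generates: it uses java.util.regex as a stand-in for the generated automaton, and the class RuleScanner, the Rule type, the three rules, and the string tokens are all hypothetical. An action that returns null corresponds to an action without a return, so scanning continues.

```java
import java.util.List;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch of a next-token loop driven by (pattern, action) rules. */
class RuleScanner {
    /** A rule pairs a pattern with an action; the action returns a token or null. */
    static final class Rule {
        final Pattern pattern;
        final Function<String, String> action;
        Rule(String regex, Function<String, String> action) {
            this.pattern = Pattern.compile(regex);
            this.action = action;
        }
    }

    static final List<Rule> RULES = List.of(
        new Rule("[a-zA-Z][a-zA-Z0-9]*", text -> "ID(" + text + ")"),
        new Rule("[0-9]+",               text -> "INT(" + text + ")"),
        new Rule("[ \\t\\n]+",           text -> null)   // no return: skip whitespace
    );

    private final String input;
    private int pos = 0;

    RuleScanner(String input) {
        this.input = input;
    }

    /** Repeats match-then-act until an action returns a token; "EOF" at end of input. */
    String nextToken() {
        while (pos < input.length()) {
            Rule best = null;
            int bestLen = 0;
            for (Rule r : RULES) {
                Matcher m = r.pattern.matcher(input).region(pos, input.length());
                if (m.lookingAt() && m.end() - pos > bestLen) {   // keep the longest match
                    best = r;
                    bestLen = m.end() - pos;
                }
            }
            if (best == null) {
                throw new IllegalStateException("no rule matches at offset " + pos);
            }
            String text = input.substring(pos, pos + bestLen);
            pos += bestLen;
            String token = best.action.apply(text);
            if (token != null) return token;      // the action "returned"
        }
        return "EOF";
    }
}
```

With the input "  x1 42", the leading whitespace is consumed by the third rule without producing a token, and the same call goes on to return the identifier.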

11 Matching rules

If several patterns match at the current position, the one that matches the longest sequence of characters is chosen.
If several patterns match the same (longest) sequence of characters, the first such pattern is chosen, so the order of the patterns can be important!
If an input character is not matched by any pattern, the scanner throws an exception.
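The two tie-breaking rules can be checked with a small Java sketch. It again uses java.util.regex rather than a generated scanner; the class MatchRules, the method choose, and the IF keyword rule are hypothetical, while the "=" and "==" rules mirror operators used elsewhere in these slides.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch of rule selection: longest match first, then earliest rule. */
class MatchRules {
    // Insertion order stands in for the order of rules in a specification.
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("IF",     Pattern.compile("if"));   // hypothetical keyword rule
        RULES.put("ID",     Pattern.compile("[a-zA-Z][a-zA-Z0-9]*"));
        RULES.put("ASSIGN", Pattern.compile("="));
        RULES.put("EQUALS", Pattern.compile("=="));
    }

    /** Returns "NAME:lexeme" for the rule chosen at the start of the input. */
    static String choose(String input) {
        String bestName = null;
        int bestLen = -1;
        for (Map.Entry<String, Pattern> e : RULES.entrySet()) {
            Matcher m = e.getValue().matcher(input);
            // Strict '>' keeps the earlier rule when two rules match the same length.
            if (m.lookingAt() && m.end() > bestLen) {
                bestName = e.getKey();
                bestLen = m.end();
            }
        }
        if (bestName == null) {
            throw new IllegalArgumentException("no pattern matches: " + input);
        }
        return bestName + ":" + input.substring(0, bestLen);
    }
}
```

Here "==" is taken as EQUALS even though ASSIGN is listed first, because the longer match wins; "if" is taken as IF rather than ID only because the keyword rule comes first; and "iffy" is an ID because that match is longer.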

12 An Example

    %%
    DIGIT=       [0-9]
    LETTER=      [a-zA-Z]
    WHITESPACE=  [ \t\n]    // space, tab, newline
    %%
    {LETTER}({LETTER}|{DIGIT})*  {System.out.println(yyline+1 + ": ID " + yytext());}
    {DIGIT}+                     {System.out.println(yyline+1 + ": INT");}
    "="                          {System.out.println(yyline+1 + ": ASSIGN");}
    "=="                         {System.out.println(yyline+1 + ": EQUALS");}
    {WHITESPACE}*                { }
    .                            {System.out.println(yyline+1 + ": bad char");}

