LEX SUNG-DONG KIM, DEPT. OF COMPUTER ENGINEERING, HANSUNG UNIVERSITY.

1 LEX SUNG-DONG KIM, DEPT. OF COMPUTER ENGINEERING, HANSUNG UNIVERSITY

2 LEX
- 1975, Lesk
- Input: regular expressions + action code (tiny.l)
- Output: C program (lex.yy.c or lexyy.c)
  - Procedure yylex
  - Table-driven implementation of a DFA
  - Similar to "getToken"
- Flow: RE + action → Lex → Scanner (C code)
(2013-1) Compiler 2
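To make "table-driven implementation of a DFA" concrete, here is a minimal hand-written sketch in C for the single pattern [0-9]+. It is not the code Lex generates; the state names, the table layout, and the function match_number are assumptions chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical DFA for the pattern [0-9]+ (illustration only, not Lex output). */
enum { START = 0, IN_NUM = 1, DEAD = 2, NUM_STATES = 3 };
enum { C_DIGIT = 0, C_OTHER = 1, NUM_CLASSES = 2 };

/* transition[state][character class] gives the next state. */
static const int transition[NUM_STATES][NUM_CLASSES] = {
    /* START  */ { IN_NUM, DEAD },
    /* IN_NUM */ { IN_NUM, DEAD },
    /* DEAD   */ { DEAD,   DEAD },
};
static const int accepting[NUM_STATES] = { 0, 1, 0 };

static int char_class(char c) {
    return (c >= '0' && c <= '9') ? C_DIGIT : C_OTHER;
}

/* Returns the length of the longest prefix of s that matches [0-9]+,
   or 0 if no prefix matches. */
size_t match_number(const char *s) {
    int state = START;
    size_t i = 0, last_accept = 0;
    while (s[i] != '\0' && state != DEAD) {
        state = transition[state][char_class(s[i])];
        i++;
        if (accepting[state]) last_accept = i;  /* remember the longest match so far */
    }
    return last_accept;
}
```

The scanning loop never mentions the pattern itself; only the two tables change when the regular expression changes, which is what lets a generator such as Lex emit the scanner mechanically.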

3 LEX CONVENTION (1)
- Metacharacters
- Quotes: match the enclosed characters literally
  - For non-metacharacters, quotes are optional: "if" and if are equivalent
  - For metacharacters, quotes are required: "("
- Backslash: escapes a single character
  - \(\* = "(*"
  - \n, \t
- (aa|bb)(a|b)*c? = ("aa"|"bb")("a"|"b")*"c"?

4 LEX CONVENTION (2)
- [...]: any one of the enclosed characters
  - [abxz]: any one of the characters a, b, x, z
  - (aa|bb)(a|b)*c? can be written (aa|bb)[ab]*c?
- Hyphen: ranges of characters
  - [0-9]

5 LEX CONVENTION (3)
- . (period): represents a set of characters
  - Any character except a newline
- ^ (as the first character inside square brackets): complementary sets
  - [^0-9abc]: any character that is not a digit and is not one of the letters a, b, c

6 LEX CONVENTION (4)
- Square brackets: most metacharacters lose their special status
  - [-+] == ("+"|"-")
  - [+-]: some Lex versions may read the hyphen as a range operator (all characters from "+" on), so write a literal hyphen first
  - [."?]: any of the three characters ., ", ?
  - [\^\\]: ^ or \
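The bracket conventions above can be sketched as a small C helper that decides whether one character belongs to a character class. The function name class_matches and its interface are hypothetical. The sketch supports a leading ^, ranges such as 0-9, and a literal hyphen in first or last position; it deliberately ignores backslash escapes, and it treats a trailing hyphen as literal, which may differ from older Lex versions.

```c
#include <string.h>

/* Returns 1 if c is matched by the bracket body `spec` (the text between
   [ and ]), otherwise 0. Hypothetical helper for illustration:
   - a leading ^ complements the set,
   - x-y denotes a range,
   - a hyphen that is first or last is taken literally,
   - backslash escapes are NOT handled. */
int class_matches(const char *spec, char c) {
    int negate = 0, found = 0;
    size_t i = 0, n = strlen(spec);
    unsigned char uc = (unsigned char)c;

    if (n > 0 && spec[0] == '^') { negate = 1; i = 1; }
    while (i < n) {
        if (i + 2 < n && spec[i + 1] == '-') {      /* range lo-hi */
            unsigned char lo = (unsigned char)spec[i];
            unsigned char hi = (unsigned char)spec[i + 2];
            if (uc >= lo && uc <= hi) found = 1;
            i += 3;
        } else {                                    /* single literal character */
            if (spec[i] == c) found = 1;
            i++;
        }
    }
    return negate ? !found : found;
}
```

For instance, class_matches("^0-9abc", 'x') corresponds to asking whether x is matched by [^0-9abc].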

7 LEX CONVENTION (5)
- Curly brackets: names of regular expressions
- Regular-expression notation:
    nat = [0-9]+
    signedNat = ("+"|"-")? nat
- Lex notation:
    nat [0-9]+
    signedNat ("+"|"-")? {nat}

8 FORMAT OF LEX INPUT (1)
- Input file = regular expressions + C code
- Definitions
  - Any C code that must be inserted external to any function: %{ … %}
  - Names of regular expressions
- Rules
  - Regular expressions + C code (actions)
- Auxiliary routines (optional)
  - C code + main program (if needed)

9 FORMAT OF LEX INPUT (2)
- Layout
    {definitions}
    %%
    {rules}
    %%
    {auxiliary routines}

10 EXAMPLE 1: SCANNER THAT ADDS LINE NUMBERS TO TEXT
    %{
    /* a Lex program that adds line numbers
       to lines of text, printing the new text
       to the standard output */
    #include <stdio.h>
    int lineno = 1;
    %}
    line .*\n
    %%
    {line} { printf("%5d %s", lineno++, yytext); }
    %%
    int main(void)
    { yylex(); return 0; }

11
    %{
    /* a Lex program that changes all numbers
       from decimal to hexadecimal notation,
       printing a summary statistic to stderr */
    #include <stdlib.h>
    #include <stdio.h>
    int count = 0;
    %}
    digit [0-9]
    number {digit}+
    %%
    {number} { int n = atoi(yytext);
               printf("%x", n);
               if (n > 9) count++; }
    %%

12
    int main(void)
    { yylex();
      fprintf(stderr, "number of replacements = %d", count);
      return 0;
    }
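For comparison, the effect of the decimal-to-hexadecimal scanner can be sketched in plain C without Lex. The function dec_to_hex is a hypothetical stand-in that works on a string rather than on yyin, and it uses strtoul where the slide uses atoi; like the slide, it counts only numbers greater than 9, since smaller ones look the same in both notations.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Copies `in` to `out` (capacity out_size), replacing every maximal run of
   decimal digits with its hexadecimal form. Returns how many numbers > 9
   were replaced. Hypothetical helper for illustration, not Lex output. */
int dec_to_hex(const char *in, char *out, size_t out_size) {
    int count = 0;
    size_t o = 0;
    if (out_size == 0) return 0;
    while (*in != '\0' && o + 1 < out_size) {
        if (isdigit((unsigned char)*in)) {
            char *end;
            unsigned long n = strtoul(in, &end, 10);   /* longest run of digits */
            int w = snprintf(out + o, out_size - o, "%lx", n);
            if (w < 0 || (size_t)w >= out_size - o) break;  /* no room left */
            o += (size_t)w;
            if (n > 9) count++;
            in = end;
        } else {
            out[o++] = *in++;      /* the default action: copy the character */
        }
    }
    out[o] = '\0';
    return count;
}
```

The else branch plays the role of the default Lex action: anything that no rule matches is copied through unchanged.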

13
    %{
    /* Selects only lines that end or
       begin with the letter 'a'.
       Deletes everything else. */
    #include <stdio.h>
    %}
    ends_with_a .*a\n
    begins_with_a a.*\n
    %%
    {ends_with_a} ECHO;
    {begins_with_a} ECHO;
    .*\n ;
    %%
    int main(void)
    { yylex(); return 0; }
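The decision made by the three rules on this slide can be written as a single C predicate. The name keep_line is hypothetical, and the sketch assumes the caller hands it one line at a time; unlike the Lex patterns, it also accepts a final line that lacks a trailing newline.

```c
#include <string.h>

/* Returns 1 if the line begins or ends with 'a' (ignoring one trailing
   newline), 0 otherwise. Hypothetical helper for illustration. */
int keep_line(const char *line) {
    size_t n = strlen(line);
    if (n > 0 && line[n - 1] == '\n') n--;   /* the patterns end in \n; skip it */
    if (n == 0) return 0;                    /* an empty line is deleted */
    return line[0] == 'a' || line[n - 1] == 'a';
}
```

A line such as "banana\n" is kept because of its last letter, while "cherry\n" falls through to the third rule and is discarded.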

14 SUMMARY (1)
- Ambiguity resolution
  - Principle of longest substring: the rule that matches the longest string wins
  - Substrings of equal length: the rule listed first in the input file wins
  - No match: the next character is copied to the output and scanning continues
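The three disambiguation rules can be sketched in C with two illustrative rules, the keyword "if" and an identifier [a-zA-Z]+. All names here (match_if, match_identifier, select_rule, RULE_IF, RULE_ID, NO_RULE) are hypothetical; a real Lex scanner reaches the same decision through a single DFA rather than by trying each rule in turn.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef size_t (*matcher_fn)(const char *s);

/* Illustrative matchers: each returns the length of its match at s, or 0. */
static size_t match_if(const char *s) {            /* "if" */
    return strncmp(s, "if", 2) == 0 ? 2 : 0;
}
static size_t match_identifier(const char *s) {    /* [a-zA-Z]+ */
    size_t n = 0;
    while (isalpha((unsigned char)s[n])) n++;
    return n;
}

/* Rules in the order they would appear in the Lex input file. */
static const matcher_fn rules[] = { match_if, match_identifier };
enum { NO_RULE = -1, RULE_IF = 0, RULE_ID = 1 };

/* Chooses the rule to apply at s and stores the lexeme length in *len. */
int select_rule(const char *s, size_t *len) {
    int best = NO_RULE;
    size_t best_len = 0;
    for (int i = 0; i < (int)(sizeof rules / sizeof rules[0]); i++) {
        size_t n = rules[i](s);
        if (n > best_len) {   /* strictly longer wins; a tie keeps the earlier rule */
            best = i;
            best_len = n;
        }
    }
    if (best == NO_RULE)      /* default: consume one character (if any) */
        best_len = (*s != '\0') ? 1 : 0;
    *len = best_len;
    return best;
}
```

On "iffy" the identifier rule wins because its match is longer; on "if x" both rules match two characters and the earlier rule is chosen; on "123" no rule matches, so one character is consumed by the default action.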

15 SUMMARY (2)
- Insertion of C code
  - %{ … %}: copied exactly, external to any function
  - Auxiliary procedure section: copied exactly at the end of the output file
  - Any code following a regular expression (action): inserted at the appropriate place in yylex

16 LEX INTERNAL NAMES
- lex.yy.c or lexyy.c: Lex output file name
- yylex: Lex scanning routine
- yytext: string matched on the current action
- yyin: Lex input file (default: stdin)
- yyout: Lex output file (default: stdout)
- input: Lex buffered input routine
- ECHO: Lex default action (print yytext to yyout)

17 LEX FOR TINY
    %{
    #include "globals.h"
    #include "util.h"
    #include "scan.h"
    /* lexeme of identifier or reserved word */
    char tokenString[MAXTOKENLEN+1];
    %}

    digit       [0-9]
    number      {digit}+
    letter      [a-zA-Z]
    identifier  {letter}+
    newline     \n
    whitespace  [ \t]
    %%

18
    "if"      { return IF; }
    "then"    { return THEN; }
    "else"    { return ELSE; }
    "end"     { return END; }
    "repeat"  { return REPEAT; }
    "until"   { return UNTIL; }
    "read"    { return READ; }
    "write"   { return WRITE; }
    ":="      { return ASSIGN; }
    "="       { return EQ; }
    "<"       { return LT; }
    "+"       { return PLUS; }
    "-"       { return MINUS; }
    "*"       { return TIMES; }
    "/"       { return OVER; }
    "("       { return LPAREN; }
    ")"       { return RPAREN; }
    ";"       { return SEMI; }

19
    {number}      { return NUM; }
    {identifier}  { return ID; }
    {newline}     { lineno++; }
    {whitespace}  { /* skip whitespace */ }
    "{"           { char c;
                    do
                    { c = input();
                      if (c == '\n') lineno++;
                    } while (c != '}');
                  }
    .             { return ERROR; }
    %%
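The comment-skipping action above loops until it sees '}', so an unterminated comment would keep reading past the end of the input. The following C sketch shows the same loop with an end-of-input guard. It scans a string instead of calling input(), and the name skip_comment and its interface are assumptions made for illustration.

```c
/* Skips the body of a TINY comment. `s` points just past the opening '{'.
   Every newline inside the comment increments *lineno. Returns a pointer
   just past the closing '}', or to the terminating '\0' if the comment is
   never closed. Hypothetical helper for illustration. */
const char *skip_comment(const char *s, int *lineno) {
    while (*s != '\0') {          /* guard: stop at end of input */
        char c = *s++;
        if (c == '\n') (*lineno)++;
        if (c == '}') break;      /* end of comment */
    }
    return s;
}
```

In a Lex action the same guard would compare the value returned by input() against its end-of-file value, which differs between Lex implementations.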

20
    TokenType getToken(void)
    { static int firstTime = TRUE;
      TokenType currentToken;
      if (firstTime)
      { firstTime = FALSE;
        lineno++;
        yyin = source;
        yyout = listing;
      }
      currentToken = yylex();
      strncpy(tokenString, yytext, MAXTOKENLEN);
      if (TraceScan) {
        fprintf(listing, "\t%d: ", lineno);
        printToken(currentToken, tokenString);
      }
      return currentToken;
    }

21 REFERENCES
- Textbook: John R. Levine, Tony Mason, Doug Brown, Lex & Yacc, 2nd Edition, O'Reilly, 1992
- Examples: http://myweb.stedwards.edu/laurab/cosc4342/lex-examples.html
- Descriptions of lex functions, etc.: http://docs.sun.com/app/docs/doc/801-6734/6i13drksb?l=ko&a=view

22 EXERCISES
- Write a Lex input file for a program that counts the number of words in the input file.
- Write a Lex input file for a program that counts the number of sentences in the input file.

