January 18, 2016, Azusa, CA. Sheldon X. Liang, Ph.D., Computer Science at Azusa Pacific University.

Presentation transcript:

1 January 18, 2016, Azusa, CA. Sheldon X. Liang, Ph.D., Computer Science at Azusa Pacific University. Azusa Pacific University, Azusa, CA 91702, Tel: (800) Department of Computer Science, CS400 Compiler Construction

2 Lexical Analysis & Lexical Analyzer Generators
- Formalization
- Regular Expressions
- Finite Automata
- RE, Conversion, FA
- Lexer Design

3 Keep the following questions in mind
Regular Expressions
- a concise and flexible means for identifying strings of text
- written in a formal language
- interpreted by a RegEx processor
Why RegEx
- Precise definition of language
- Layered definition of language
- Lexical/Syntax/Semantic
Further use of RegEx
- Supportive foundation of Lexer
- Formal communication
- Common application

4 Why: Language Definition
Problem: how to precisely define a language
Layered structure of language definition
- Start with a set of letters in the language
- Lexical structure: identifies "words" in the language (each word is a sequence of letters)
- Syntactic structure: identifies "sentences" in the language (each sentence is a sequence of words)
- Semantics: the meaning of a program (specifies what the result should be for each input)
- Today's topic: lexical and syntactic structures

5 Specification of Patterns for Tokens: Regular Expressions
Basis symbols:
- ε is a regular expression denoting the language {ε}
- a ∈ Σ is a regular expression denoting {a}
If r and s are regular expressions denoting languages L(r) and M(s) respectively, then
- r | s is a regular expression denoting L(r) ∪ M(s)
- rs is a regular expression denoting L(r)M(s)
- r* is a regular expression denoting L(r)*
- (r) is a regular expression denoting L(r)
A language defined by a regular expression is called a regular set
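
The inductive definition on this slide can be turned almost line by line into a matcher. The following is a minimal sketch, not the course's implementation: the type `re`, the functions `re_match` and `matches_ab_star_abb`, and the example expression (a|b)*abb are all illustrative assumptions. It checks membership by brute force (trying every split point), which is exponential in the worst case and is meant only to mirror the definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One node kind per rule of the inductive definition (hypothetical names). */
typedef enum { RE_EPS, RE_SYM, RE_ALT, RE_CAT, RE_STAR } re_kind;

typedef struct re {
    re_kind kind;
    char sym;                 /* used by RE_SYM only */
    const struct re *left;    /* operand of RE_STAR, or left operand */
    const struct re *right;   /* right operand of RE_ALT / RE_CAT */
} re;

/* Returns true when the len characters starting at s form a string in L(r). */
bool re_match(const re *r, const char *s, size_t len)
{
    assert(r != NULL && s != NULL);
    switch (r->kind) {
    case RE_EPS:    /* the language {epsilon}: only the empty string */
        return len == 0;
    case RE_SYM:    /* the language {a}: exactly one symbol */
        return len == 1 && s[0] == r->sym;
    case RE_ALT:    /* union of the two operand languages */
        return re_match(r->left, s, len) || re_match(r->right, s, len);
    case RE_CAT:    /* concatenation: try every split point */
        for (size_t k = 0; k <= len; k++)
            if (re_match(r->left, s, k) && re_match(r->right, s + k, len - k))
                return true;
        return false;
    case RE_STAR:   /* zero copies, or one non-empty copy followed by r* */
        if (len == 0)
            return true;
        for (size_t k = 1; k <= len; k++)
            if (re_match(r->left, s, k) && re_match(r, s + k, len - k))
                return true;
        return false;
    }
    return false;
}

/* Builds the illustrative expression (a|b)*abb and tests s against it. */
bool matches_ab_star_abb(const char *s)
{
    const re a      = { RE_SYM, 'a', NULL, NULL };
    const re b      = { RE_SYM, 'b', NULL, NULL };
    const re a_or_b = { RE_ALT, 0, &a, &b };
    const re star   = { RE_STAR, 0, &a_or_b, NULL };
    const re bb     = { RE_CAT, 0, &b, &b };
    const re abb    = { RE_CAT, 0, &a, &bb };
    const re whole  = { RE_CAT, 0, &star, &abb };
    return re_match(&whole, s, strlen(s));
}
```

Calling `matches_ab_star_abb("babaabb")` accepts, while `matches_ab_star_abb("abab")` rejects. A production lexer would instead convert the expression to a finite automaton, which is the topic of the later slides.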

8 Specification of Patterns for Tokens: Regular Definitions
Regular definitions introduce a naming convention:
d_1 → r_1
d_2 → r_2
…
d_n → r_n
where each r_i is a regular expression over Σ ∪ {d_1, d_2, …, d_{i-1}}
Any d_j in r_i can be textually substituted in r_i to obtain an equivalent set of definitions

9 Specification of Patterns for Tokens: Regular Definitions
Example:
letter → A | B | … | Z | a | b | … | z
digit → 0 | 1 | … | 9
id → letter ( letter | digit )*
Regular definitions are not recursive:
digits → digit digits | digit   (wrong!)
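
As a rough illustration of how the regular definition for id maps to code, the sketch below writes one C function per definition. The function names `is_letter`, `is_digit`, and `is_id` are assumptions made for this example, and the letter test assumes an ASCII-compatible character set. Note that the repetition in id is coded as a loop, which matches the point that regular definitions use `*` rather than recursion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* letter -> A | B | ... | Z | a | b | ... | z  (assumes ASCII ordering) */
bool is_letter(char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

/* digit -> 0 | 1 | ... | 9 */
bool is_digit(char c)
{
    return c >= '0' && c <= '9';
}

/* id -> letter ( letter | digit )* */
bool is_id(const char *s)
{
    assert(s != NULL);
    if (!is_letter(s[0]))      /* must start with a letter; also rejects "" */
        return false;
    for (size_t i = 1; s[i] != '\0'; i++)   /* the starred part is a loop */
        if (!is_letter(s[i]) && !is_digit(s[i]))
            return false;
    return true;
}
```

With these definitions, `is_id("x1")` holds and `is_id("1x")` does not, because the first symbol must be a letter.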

10 Specification of Patterns for Tokens: Notational Shorthand
The following shorthands are often used:
r+ = rr*
r? = r | ε
[a-z] = a | b | c | … | z
Examples:
digit → [0-9]
num → digit+ ( . digit+ )? ( E ( + | - )? digit+ )?
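
The num pattern reads naturally as a left-to-right recognizer: each `+` becomes a loop that must run at least once, and each `?` becomes an optional branch. The sketch below is one way to hand-code it. The helper names `scan_digits` and `is_num` are assumptions for this example, and the function checks a whole string rather than scanning a token out of a longer input.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Consumes digit+ starting at s[*i]; returns false if no digit is present. */
bool scan_digits(const char *s, size_t *i)
{
    size_t start = *i;
    while (s[*i] >= '0' && s[*i] <= '9')
        (*i)++;
    return *i > start;
}

/* num -> digit+ ( . digit+ )? ( E ( + | - )? digit+ )? */
bool is_num(const char *s)
{
    assert(s != NULL);
    size_t i = 0;
    if (!scan_digits(s, &i))            /* digit+ */
        return false;
    if (s[i] == '.') {                  /* ( . digit+ )? */
        i++;
        if (!scan_digits(s, &i))
            return false;
    }
    if (s[i] == 'E') {                  /* ( E ( + | - )? digit+ )? */
        i++;
        if (s[i] == '+' || s[i] == '-')
            i++;
        if (!scan_digits(s, &i))
            return false;
    }
    return s[i] == '\0';                /* the whole string must be consumed */
}
```

For example, `is_num("6.02E+23")` holds, while `is_num("1.")` and `is_num(".5")` do not, since both the integer part and any fraction require at least one digit.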

11 Regular Definitions and Grammars
Grammar:
stmt → if expr then stmt | if expr then stmt else stmt | ε
expr → term relop term | term
term → id | num
Regular definitions:
if → if
then → then
else → else
relop → < | <= | <> | > | >= | =
id → letter ( letter | digit )*
num → digit+ ( . digit+ )? ( E ( + | - )? digit+ )?

12 Coding Regular Definitions in Transition Diagrams

13 Coding Regular Definitions in Transition Diagrams: Code
token nexttoken()
{
    while (1) {
        switch (state) {
        case 0:
            c = nextchar();
            if (c == blank || c == tab || c == newline) {
                state = 0;
                lexeme_beginning++;
            }
            else if (c == '<') state = 1;
            else if (c == '=') state = 5;
            else if (c == '>') state = 6;
            else state = fail();
            break;
        case 1:
            …
        case 9:
            c = nextchar();
            if (isletter(c)) state = 10;
            else state = fail();
            break;
        case 10:
            c = nextchar();
            if (isletter(c)) state = 10;
            else if (isdigit(c)) state = 10;
            else state = 11;
            break;
        …
/* Decides the next start state to check */
int fail()
{
    forward = token_beginning;
    switch (start) {
    case 0: start = 9; break;
    case 9: start = 12; break;
    case 12: start = 20; break;
    case 20: start = 25; break;
    case 25: recover(); break;
    default: /* error */ break;
    }
    return start;
}
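
The slide's code is a fragment with elided states, so it cannot be compiled as shown. The following is a small self-contained sketch of the same idea. It is a simplification rather than the slide's method: instead of falling back through start states with `fail()`, it chooses a transition diagram from the first character. The names `token_kind`, `token`, and `next_token` are hypothetical, the two-character operators `<=`, `<>`, and `>=` are an assumption because the slide elides those states, and only the integer part of num is recognized.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef enum { TOK_EOF, TOK_RELOP, TOK_ID, TOK_NUM, TOK_ERROR } token_kind;

typedef struct {
    token_kind kind;
    size_t start;   /* index of the first character of the lexeme */
    size_t length;  /* number of characters in the lexeme */
} token;

/* Scans one token from input starting at *pos and advances *pos past it. */
token next_token(const char *input, size_t *pos)
{
    assert(input != NULL && pos != NULL);
    size_t i = *pos;

    /* Like state 0 on the slide: skip blanks, tabs, and newlines. */
    while (input[i] == ' ' || input[i] == '\t' || input[i] == '\n')
        i++;

    token t = { TOK_ERROR, i, 0 };
    unsigned char c = (unsigned char)input[i];

    if (c == '\0') {
        t.kind = TOK_EOF;
    } else if (c == '<') {                 /* <, <=, <> (assumed set) */
        i++;
        if (input[i] == '=' || input[i] == '>')
            i++;
        t.kind = TOK_RELOP;
    } else if (c == '>') {                 /* >, >= (assumed set) */
        i++;
        if (input[i] == '=')
            i++;
        t.kind = TOK_RELOP;
    } else if (c == '=') {
        i++;
        t.kind = TOK_RELOP;
    } else if (isalpha(c)) {               /* id: letter (letter | digit)* */
        while (isalnum((unsigned char)input[i]))
            i++;
        t.kind = TOK_ID;
    } else if (isdigit(c)) {               /* integer part of num only */
        while (isdigit((unsigned char)input[i]))
            i++;
        t.kind = TOK_NUM;
    } else {
        i++;                               /* skip one unrecognized character */
    }

    t.length = i - t.start;
    *pos = i;
    return t;
}
```

Calling `next_token` repeatedly on the input `x1 <= 42` with a position that starts at 0 yields an id, a relop, a num, and then the end-of-input token. Dispatching on the first character works here because the token classes start with disjoint characters. The slide's `fail()` chain is more general, since it retries the same input against the next diagram.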

14 Common Application of Regular Expressions
- Validate passwords and addresses
- Extract specific sections from an HTML page
- Parse data files
- Replace values (strings)
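
As one possible illustration of the validation use case, the sketch below wraps the POSIX `regcomp`, `regexec`, and `regfree` functions from `<regex.h>`. It assumes a POSIX system. The wrapper name `matches_pattern` and the sample password rule (at least eight letters or digits) are made up for this example and do not come from the slides.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Returns 1 if text matches the POSIX extended regular expression pattern,
   0 if it does not, and -1 if the pattern fails to compile. */
int matches_pattern(const char *pattern, const char *text)
{
    assert(pattern != NULL && text != NULL);
    regex_t re;
    /* REG_NOSUB: only a yes/no answer is needed, not match positions. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    int status = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return status == 0 ? 1 : 0;
}
```

With the hypothetical rule `^[A-Za-z0-9]{8,}$`, the call `matches_pattern("^[A-Za-z0-9]{8,}$", "abc12345")` reports a match, while a six-character input such as `"short1"` does not.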

15 Got it? Check with the following questions
Regular Expressions
- a concise and flexible means for identifying strings of text
- written in a formal language
- interpreted by a RegEx processor
Why RegEx
- Precise definition of language
- Layered definition of language
- Lexical/Syntax/Semantic
Further use of RegEx
- Supportive foundation of Lexer
- Formal communication
- Common application

16 Thank you very much! Questions?