Intro to Lexing & Parsing CS 153. Two pieces conceptually: – Recognizing syntactically valid phrases. – Extracting semantic content from the syntax. E.g.,

Slides:



Advertisements
Similar presentations
Translator Architecture Code Generator ParserTokenizer string of characters (source code) string of tokens abstract program string of integers (object.
Advertisements

1 Parsing The scanner recognizes words The parser recognizes syntactic units Parser operations: Check and verify syntax based on specified syntax rules.
Functional Design and Programming Lecture 9: Lexical analysis and parsing.
Top-Down Parsing.
1 Chapter 5 Compilers Source Code (with macro) Macro Processor Expanded Code Compiler or Assembler obj.
CS 330 Programming Languages 09 / 13 / 2007 Instructor: Michael Eckmann.
ISBN Chapter 4 Lexical and Syntax Analysis.
ISBN Chapter 4 Lexical and Syntax Analysis.
Fall 2007CS 2251 Miscellaneous Topics Deque Recursion and Grammars.
ISBN Chapter 4 Lexical and Syntax Analysis The Parsing Problem Recursive-Descent Parsing.
Prof. Bodik CS 164 Lecture 61 Building a Parser II CS164 3:30-5:00 TT 10 Evans.
Syntax Analysis Mooly Sagiv html:// Textbook:Modern Compiler Design Chapter 2.2 (Partial) Hashlama 11:00-14:00.
Parsing — Part II (Ambiguity, Top-down parsing, Left-recursion Removal)
Prof. Fateman CS 164 Lecture 91 Bottom-Up Parsing Lecture 9.
1 The Parser Its job: –Check and verify syntax based on specified syntax rules –Report errors –Build IR Good news –the process can be automated.
Professor Yihjia Tsai Tamkang University
Table-driven parsing Parsing performed by a finite state machine. Parsing algorithm is language-independent. FSM driven by table (s) generated automatically.
COP4020 Programming Languages
1 Chapter 3 Context-Free Grammars and Parsing. 2 Parsing: Syntax Analysis decides which part of the incoming token stream should be grouped together.
CPSC Compiler Tutorial 3 Parser. Parsing The syntax of most programming languages can be specified by a Context-free Grammar (CGF) Parsing: Given.
CSC3315 (Spring 2009)1 CSC 3315 Lexical and Syntax Analysis Hamid Harroud School of Science and Engineering, Akhawayn University
Joey Paquet, 2000, 2002, Lecture 3 Syntactic Analysis Part I.
Parsing. Goals of Parsing Check the input for syntactic accuracy Return appropriate error messages Recover if possible Produce, or at least traverse,
Top-Down Parsing - recursive descent - predictive parsing
4 4 (c) parsing. Parsing A grammar describes the strings of tokens that are syntactically legal in a PL A recogniser simply accepts or rejects strings.
LANGUAGE TRANSLATORS: WEEK 3 LECTURE: Grammar Theory Introduction to Parsing Parser - Generators TUTORIAL: Questions on grammar theory WEEKLY WORK: Read.
AN IMPLEMENTATION OF A REGULAR EXPRESSION PARSER
Syntax Specification and BNF © Allan C. Milne Abertay University v
Parsing III (Top-down parsing: recursive descent & LL(1) )
COMP Parsing 2 of 4 Lecture 22. How do we write programs to do this? The process of getting from the input string to the parse tree consists of.
Profs. Necula CS 164 Lecture Top-Down Parsing ICOM 4036 Lecture 5.
Lesson 3 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
Lexical and Syntax Analysis
CS 153 A little bit about LR Parsing. Background We’ve seen three ways to write parsers:  By hand, typically recursive descent  Using parsing combinators.
1 Syntax In Text: Chapter 3. 2 Chapter 3: Syntax and Semantics Outline Syntax: Recognizer vs. generator BNF EBNF.
Syntax and Semantics Structure of programming languages.
COP4020 Programming Languages Parsing Prof. Xin Yuan.
CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 3: Introduction to Syntactic Analysis.
Muhammad Idrees, Lecturer University of Lahore 1 Top-Down Parsing Top down parsing can be viewed as an attempt to find a leftmost derivation for an input.
Joey Paquet, 2000, Lecture 2 Lexical Analysis.
Recursive Data Structures and Grammars Themes –Recursive Description of Data Structures –Grammars and Parsing –Recursive Definitions of Properties of Data.
Syntax and Grammars.
Syntax Analysis (Chapter 4) 1 Course Overview PART I: overview material 1Introduction 2Language processors (tombstone diagrams, bootstrapping) 3Architecture.
Top-Down Parsing CS 671 January 29, CS 671 – Spring Where Are We? Source code: if (b==0) a = “Hi”; Token Stream: if (b == 0) a = “Hi”; Abstract.
Syntax Analysis - Parsing Compiler Design Lecture (01/28/98) Computer Science Rensselaer Polytechnic.
TOP-DOWN PARSING Recursive-Descent, Predictive Parsing.
Top-Down Parsing.
LECTURE 4 Syntax. SPECIFYING SYNTAX Programming languages must be very well defined – there’s no room for ambiguity. Language designers must use formal.
LECTURE 7 Lex and Intro to Parsing. LEX Last lecture, we learned a little bit about how we can take our regular expressions (which specify our valid tokens)
Chapter 4: Syntax analysis Syntax analysis is done by the parser. –Detects whether the program is written following the grammar rules and reports syntax.
COMP 3438 – Part II-Lecture 5 Syntax Analysis II Dr. Zili Shao Department of Computing The Hong Kong Polytechnic Univ.
Copyright © 2004 Pearson Addison-Wesley. All rights reserved.3-1 Language Specification and Translation Lecture 8.
1 Topic #4: Syntactic Analysis (Parsing) CSC 338 – Compiler Design and implementation Dr. Mohamed Ben Othman ( )
NATURAL LANGUAGE PROCESSING
UMBC  CSEE   1 Chapter 4 Chapter 4 (b) parsing.
Parsing III (Top-down parsing: recursive descent & LL(1) )
COMP 3438 – Part II-Lecture 6 Syntax Analysis III Dr. Zili Shao Department of Computing The Hong Kong Polytechnic Univ.
CS 154 Formal Languages and Computability March 22 Class Meeting Department of Computer Science San Jose State University Spring 2016 Instructor: Ron Mak.
Chapter 3 – Describing Syntax
Chapter 4 - Parsing CSCE 343.
Programming Languages Translator
Basic Parsing with Context Free Grammars Chapter 13
4 (c) parsing.
Syntax One - Hybrid CMSC 331.
Compiler Design 4. Language Grammars
Lexical and Syntax Analysis
ABNF in ACL2 Alessandro Coglio Kestrel Institute Workshop 2017.
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Parsing CSCI 432 Computer Science Theory
Presentation transcript:

Intro to Lexing & Parsing CS 153

Two pieces conceptually: – Recognizing syntactically valid phrases. – Extracting semantic content from the syntax. E.g., What is the subject of the sentence? E.g., What is the verb phrase? E.g., Is the syntax ambiguous? If so, which meaning do we take? – “Fruit flies like a banana” – “2 * 3 + 4” – “x ^ f y” In practice, solve both problems at the same time.

We use grammars to specify the syntax of a language. Exp ::= int | var | exp ‘+’ exp | exp ‘*’ exp | ‘let’ var ‘=‘ exp ‘in’ exp ‘end’ Int ::= ‘-’?digit+ Var ::= alpha(alpha|digit)* Digit ::= ‘0’ | ‘1’ | ‘2’ | ‘3’ | ‘4’ | … | ‘9’ Alpha ::= [a-zA-Z]

To see if a sentence is legal, start with the first non-terminal), and keep expanding non- terminals until you can match against the sentence. N ::= ‘a’ | ‘(’ N ‘)’ “((a))” N -> ‘(’N ‘)’ -> ‘(’‘(’N ‘)’ ‘)’ -> ‘(’‘(’‘a’‘)’‘)’ = “((a))”

Start with the sentence, and replace phrases with corresponding non-terminals, repeating until you derive the start non-terminal. N ::= ‘a’ | ‘(‘ N ‘)’ “((a))” ‘(’‘(‘ ‘a’‘)’‘)’ -> ‘(‘ ‘(‘ N ‘)’ ‘)’ -> ‘(‘ N ‘)' -> N

For real grammars, automating this non-deterministic search is non-trivial. – As we’ll see, naïve implementations must do a lot of back-tracking in the search. Ideally, given a grammar, we would like an efficient, deterministic algorithm to see if a string matches it. – There is a very general cubic time algorithm. – Only linguists use it. – (In part, we don’t because recognition is only half the problem.) Certain classes of grammars have much more efficient implementations.

Manual parsing (say, recursive descent). – Tedious, error prone, hard to maintain. – But fast & good error messages. Parsing combinators – Encode grammars as (higher-order) functions. Basically, functions that generate recursive-descent parsers. Makes it easy to write & maintain grammars. – But can do a lot of back-tracking, and requires a limited form of grammar (e.g., no left-recursion.) Lex and Yacc – Domain-Specific-Languages that generate very efficient, table-driven parsers for general classes of grammars. – Learn about the theory in 121 – Need to know a bit here to understand how to effectively use these tools.