Chapter 4 - Parsing CSCE 343.

The Parsing Problem
Goals of the parser:
  - Find syntax errors
  - Produce good error messages
  - Recover quickly so that as many errors as possible are found
  - Produce the parse tree, which is used for language translation
Parsers that work for any unambiguous grammar are O(n^3), where n is the length of the input.
We restrict the grammars and examine O(n) parsers.

Parsers
Two categories of parsers:
  - Top down
      Produce the parse tree beginning at the root
      Order is that of a leftmost derivation
      Builds the parse tree in pre-order
  - Bottom up
      Produce the parse tree beginning at the leaves
      Order is the reverse of a rightmost derivation
Parsers look only one token ahead.
Where do the tokens come from?

Top-Down Parsers
Basic process: given a sentential form xAα, the parser must
  - Find the correct A-rule to get the next sentential form in the leftmost derivation
  - Look ahead only one token
Top-down algorithms (LL algorithms):
  - Recursive descent: a coded implementation
  - Table-driven implementation
LL: Left-to-right scan of input, Leftmost derivation

Recursive-Descent Parsing
  - Well suited to EBNF, but only works for restricted forms of EBNF
  - One subprogram (function/method) for each nonterminal
  - Each subprogram parses the sentences generated by its nonterminal and creates the corresponding part of the parse tree
  - All subprograms have access to the lexical analyzer method lex(), which puts the next token code in nextToken

Recursive Descent
// method for S --- S has only one rule
// S → aBc
S():
    if (nextToken == a)
        lex()
        B()
    else
        error
    if (nextToken == c)
        lex()      // consume c before returning
        return
    else
        error

Example
  E → A { (+ | -) A }
  A → F { (* | /) F }
  F → ( E ) | a | b | c
Write the code for E, A, and F.
Trace a call to E with this string: a + ( b - c ) * a
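One possible answer to the exercise above, as a minimal Python sketch. It follows the one-subprogram-per-nonterminal scheme and keeps the slide's lex() name; the class name Parser, the helper parse, the attribute next_token (standing in for nextToken), and the choice to return nested tuples instead of a full parse tree are assumptions made for illustration. The "lexer" simply treats every non-blank character as a token.

```python
class Parser:
    """Recursive-descent parser for
         E -> A { (+ | -) A }
         A -> F { (* | /) F }
         F -> ( E ) | a | b | c
    """

    def __init__(self, text):
        # Toy lexical analysis: one token per non-blank character.
        # None marks the end of the input.
        self.tokens = [ch for ch in text if not ch.isspace()] + [None]
        self.pos = 0
        self.next_token = self.tokens[0]

    def lex(self):
        # Advance to the next token (the slides' lex()).
        self.pos += 1
        self.next_token = self.tokens[self.pos]

    def E(self):
        node = self.A()
        while self.next_token in ("+", "-"):   # the { ... } repetition
            op = self.next_token
            self.lex()
            node = (op, node, self.A())
        return node

    def A(self):
        node = self.F()
        while self.next_token in ("*", "/"):
            op = self.next_token
            self.lex()
            node = (op, node, self.F())
        return node

    def F(self):
        # The next token alone decides which RHS of F to use.
        if self.next_token == "(":
            self.lex()
            node = self.E()
            if self.next_token != ")":
                raise SyntaxError("expected ')'")
            self.lex()
            return node
        if self.next_token in ("a", "b", "c"):
            node = self.next_token
            self.lex()
            return node
        raise SyntaxError(f"unexpected token {self.next_token!r}")


def parse(text):
    parser = Parser(text)
    tree = parser.E()
    if parser.next_token is not None:
        raise SyntaxError(f"unexpected token {parser.next_token!r}")
    return tree


print(parse("a + ( b - c ) * a"))
```

Tracing the call by hand mirrors the call chain E, A, F: the parenthesized part re-enters E recursively, and the * is absorbed by the loop in A before control returns to the + loop in E.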

RD Parsing / EBNF Restrictions
If a nonterminal has more than one RHS, A → α1 | α2, the parser must determine which one to use:
  - Choose based on the next token of input
  - The next token is compared with the first token that can be generated by each RHS:
      First(α) = { a | α =>* aβ }
  - If there is no match, report an error
Pairwise disjointness test: First(α1) ∩ First(α2) = Ø
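The pairwise disjointness test can be mechanized. The sketch below is one way to do it in Python, under assumptions that are not in the slides: a grammar is a dict mapping each nonterminal to a list of right-hand sides, each RHS is a list of symbols, any symbol that is not a key of the dict is treated as a terminal, and the empty list stands for ε. The function names are hypothetical. It checks First sets only, as the test above is stated.

```python
EPS = "ε"


def first_of(rhs, grammar, first):
    """First set of one right-hand side, given current First sets of nonterminals."""
    result = set()
    for sym in rhs:
        if sym not in grammar:          # terminal: it is the first token
            result.add(sym)
            return result
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:       # this nonterminal cannot vanish
            return result
    result.add(EPS)                     # empty RHS, or every symbol can vanish
    return result


def first_sets(grammar):
    """Compute First for every nonterminal by iterating to a fixed point."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of(rhs, grammar, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def pairwise_disjoint(nt, grammar):
    """True if the First sets of all RHSs of nt are pairwise disjoint."""
    first = first_sets(grammar)
    sets = [first_of(rhs, grammar, first) for rhs in grammar[nt]]
    return all(a.isdisjoint(b)
               for i, a in enumerate(sets)
               for b in sets[i + 1:])


# A -> a | aB : both alternatives start with a, so the test fails.
print(pairwise_disjoint("A", {"A": [["a"], ["a", "B"]]}))
```

A grammar that fails this test forces a one-token-lookahead parser to guess, which is why recursive descent requires it to pass.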

RD Parsing / EBNF Restrictions
Left recursion problem: a grammar with direct or indirect left recursion cannot be parsed by a top-down parser.
  - Direct: A → Ab
  - Indirect: B → Ca, C → Bc
The grammar can be modified to remove left recursion. For example,
  A → A bT | A AX | bc | X | aa
becomes
  A → bcA' | XA' | aaA'
  A' → bTA' | AXA' | ε
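The transformation for direct left recursion is mechanical, and can be sketched in a few lines of Python. The representation is an assumption made for illustration: each RHS is a list of symbols, the empty list stands for ε, and the new nonterminal is named by appending a prime. The function name is hypothetical, and indirect left recursion is not handled here.

```python
def remove_direct_left_recursion(nt, alternatives):
    """Rewrite  nt -> nt α1 | ... | nt αm | β1 | ... | βn  as
         nt  -> β1 nt' | ... | βn nt'
         nt' -> α1 nt' | ... | αm nt' | ε
    """
    # The α parts: what follows the leading nt in each left-recursive RHS.
    recursive = [rhs[1:] for rhs in alternatives if rhs and rhs[0] == nt]
    # The β parts: every RHS that does not start with nt.
    others = [rhs for rhs in alternatives if not rhs or rhs[0] != nt]
    if not recursive:
        return {nt: alternatives}       # nothing to do
    new_nt = nt + "'"
    return {
        nt: [beta + [new_nt] for beta in others],
        new_nt: [alpha + [new_nt] for alpha in recursive] + [[]],  # [] is ε
    }


# The example from the slide: A -> A bT | A AX | bc | X | aa
rules = remove_direct_left_recursion("A", [
    ["A", "b", "T"], ["A", "A", "X"], ["b", "c"], ["X"], ["a", "a"],
])
for lhs, alternatives in rules.items():
    print(lhs, "->", " | ".join(" ".join(rhs) or "ε" for rhs in alternatives))
```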

Grammars for RD Parsing
  - Must pass the pairwise disjointness test
      Left factoring can often resolve a failure:
        <var> → <ident> | <ident> [<expr>]     // array reference
      becomes
        <var> → <ident> <new>
        <new> → ε | [<expr>]
  - Cannot have direct or indirect left recursion
      This problem can always be solved, but the result can get messy

RD Parsing
For each grammar, determine whether the RHSs of A are pairwise disjoint:
  1. A → a | bB | cAb
  2. A → a | aB
  3. A → B | C
     B → bC
     C → ( E ) | d

Bottom-up Parsing
Bottom-up parsing does not have the same restrictions as top-down parsing:
  - left recursion
  - pairwise disjoint RHS sentential forms
Uses BNF, not EBNF.
Parse order is the reverse of a rightmost derivation.

Bottom-up Parsing
Problem: find the correct RHS in a right-sentential form to reduce in order to get the previous right-sentential form. That RHS is the handle.
Def: β is the handle of the right-sentential form γ = αβw if and only if S =>*rm αAw =>rm αβw
Look at a parse tree:
  - phrase: the leaves of the subtree rooted at an internal node
  - simple phrase: the leaves of an internal node whose children are all leaves (a subtree of height 1)
Handle: the handle of a right-sentential form is its leftmost simple phrase.
Given a parse tree, it is easy to find the handle.
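To make "given a parse tree, easy to find handle" concrete, here is a small Python sketch. The tree encoding is an assumption for illustration: an internal node is a tuple (label, child, child, ...), a leaf is a string, and every symbol is a single character so that phrases can be shown as joined strings. The function names are hypothetical. The sample tree is for the sentence aabbbbcc under S → AB, A → aAb | ab, B → bBc | bc.

```python
def leaves(node):
    """Leaf symbols below a node, left to right."""
    if isinstance(node, str):
        return [node]
    _label, *children = node
    return [leaf for child in children for leaf in leaves(child)]


def phrases(node, start=0):
    """List (start_index, phrase, is_simple) for every internal node, in pre-order."""
    if isinstance(node, str):
        return []
    _label, *children = node
    is_simple = all(isinstance(child, str) for child in children)
    found = [(start, "".join(leaves(node)), is_simple)]
    offset = start
    for child in children:
        found += phrases(child, offset)
        offset += len(leaves(child))
    return found


def handle(tree):
    """The leftmost simple phrase of the sentential form at the tree's leaves."""
    simple = [(pos, text) for pos, text, is_simple in phrases(tree) if is_simple]
    return min(simple)[1]


tree = ("S",
        ("A", "a", ("A", "a", "b"), "b"),
        ("B", "b", ("B", "b", "c"), "c"))

print([text for _, text, _ in phrases(tree)])           # all phrases
print([text for _, text, simple in phrases(tree) if simple])  # simple phrases
print(handle(tree))                                      # the handle
```

Because leaves may be nonterminals, the same functions work on a partial parse tree for a right-sentential form that is not yet a sentence.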

Bottom-up Parsing Examples
Example 1: Textbook Figure 4.3
Example 2:
  S → AB
  B → bBc | bc
  A → aAb | ab
  Find the parse tree, rightmost derivation, phrases, simple phrases, and handle for aabbbbcc
Example 3:
  E → A { (+ | -) A }
  A → F { (* | /) F }
  F → ( E ) | a | b | c
  Find the (partial) parse tree, rightmost derivation, phrases, simple phrases, and handle for A+(E)*a

Grammars and PDA
  - Formal language theory has a machine called a pushdown automaton (PDA) that can recognize the strings generated by a context-free grammar (CFG)
  - In general, these machines are nondeterministic
  - Knuth realized that this problem could be resolved by storing parser states, drawn from a finite set, on the stack
  - The result is the canonical LR algorithm (Knuth 1965)

Shift-Reduce Algorithms
Canonical LR (Knuth 1965) is a shift-reduce algorithm with two basic actions:
  - Reduce: replace the handle on top of the parse stack with its LHS
  - Shift: move the next input token onto the top of the parse stack
The LR parsing table is constructed with a tool such as yacc.

Canonical LR State:

Parser Table

LR Example
  1. E → E + T
  2. E → T
  3. T → T * F
  4. T → F
  5. F → ( E )
  6. F → id
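A table-driven shift-reduce parser for this grammar can be sketched in Python as follows. The parser table did not survive in this transcript, so the ACTION and GOTO entries below are an assumption: they are the table commonly given for this expression grammar, with states numbered 0 to 11 and rules numbered 1 to 6 in the order E → E + T, E → T, T → T * F, T → F, F → ( E ), F → id. The names RULES, ACTION, GOTO, and lr_parse are hypothetical.

```python
# Rule number -> (LHS, number of symbols on the RHS)
RULES = {
    1: ("E", 3),  # E -> E + T
    2: ("E", 1),  # E -> T
    3: ("T", 3),  # T -> T * F
    4: ("T", 1),  # T -> F
    5: ("F", 3),  # F -> ( E )
    6: ("F", 1),  # F -> id
}

# sN = shift and go to state N, rN = reduce by rule N, acc = accept.
ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}

GOTO = {
    0: {"E": 1, "T": 2, "F": 3},
    4: {"E": 8, "T": 2, "F": 3},
    6: {"T": 9, "F": 3},
    7: {"F": 10},
}


def lr_parse(tokens):
    """Return the rule numbers in the order they are reduced; raise SyntaxError on bad input."""
    tokens = list(tokens) + ["$"]       # $ marks the end of the input
    stack = [0]                         # parse stack of states
    pos = 0
    reductions = []
    while True:
        action = ACTION[stack[-1]].get(tokens[pos])
        if action is None:
            raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
        if action == "acc":
            return reductions
        if action[0] == "s":            # shift: push the new state, consume the token
            stack.append(int(action[1:]))
            pos += 1
        else:                           # reduce: pop the handle, push the GOTO state
            rule = int(action[1:])
            lhs, length = RULES[rule]
            del stack[-length:]
            stack.append(GOTO[stack[-1]][lhs])
            reductions.append(rule)


print(lr_parse(["id", "+", "id", "*", "id"]))
```

Reading the returned list backwards gives the rules of a rightmost derivation, which is the sense in which a bottom-up parse is the reverse of a rightmost derivation. The stack here holds only states; a fuller version would also keep grammar symbols or tree nodes alongside them.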

Advantages of LR Parsers
  - They work for nearly all grammars that describe programming languages.
  - They work on a larger class of grammars than other bottom-up algorithms, yet are as efficient as any other bottom-up parser.
  - They can detect syntax errors as soon as it is possible to do so.
  - The LR class of grammars is a superset of the class parsable by LL parsers.