Parsing Costas Busch - LSU.

Slides:



Advertisements
Similar presentations
Simplifications of Context-Free Grammars
Advertisements

CPSC Compiler Tutorial 4 Midterm Review. Deterministic Finite Automata (DFA) Q: finite set of states Σ: finite set of “letters” (input alphabet)
Fall 2004COMP 3351 Simplifications of Context-Free Grammars.
Courtesy Costas Busch - RPI1 Positive Properties of Context-Free languages.
Courtesy Costas Buch - RPI1 Simplifications of Context-Free Grammars.
Costas Busch - RPI1 NPDAs Accept Context-Free Languages.
Costas Buch - RPI1 Simplifications of Context-Free Grammars.
1 Normal Forms for Context-free Grammars. 2 Chomsky Normal Form All productions have form: variable and terminal.
Costas Busch - RPI1 Positive Properties of Context-Free languages.
Fall 2003Costas Busch - RPI1 Decidable Problems of Regular Languages.
Costas Busch - RPI1 Context-Free Languages. Costas Busch - RPI2 Regular Languages.
1 Simplifications of Context-Free Grammars. 2 A Substitution Rule substitute B equivalent grammar.
1 Normal Forms for Context-free Grammars. 2 Chomsky Normal Form All productions have form: variable and terminal.
1 Simplifications of Context-Free Grammars. 2 A Substitution Rule Substitute Equivalent grammar.
1 Compilers. 2 Compiler Program v = 5; if (v>5) x = 12 + v; while (x !=3) { x = x - 3; v = 10; } Add v,v,0 cmp v,5 jmplt ELSE THEN: add x, 12,v.
Fall 2006Costas Busch - RPI1 Non-Deterministic Finite Automata.
Costas Busch - LSU1 Non-Deterministic Finite Automata.
Prof. Busch - LSU1 Linear Grammars Grammars with at most one variable at the right side of a production Examples:
Fall 2004COMP 3351 Context-Free Languages. Fall 2004COMP 3352 Regular Languages.
Prof. Busch - LSU1 Context-Free Languages. Prof. Busch - LSU2 Regular Languages Context-Free Languages.
Prof. Busch - RPI1 Decidable Problems of Regular Languages.
CPSC Compiler Tutorial 3 Parser. Parsing The syntax of most programming languages can be specified by a Context-free Grammar (CGF) Parsing: Given.
1 CD5560 FABER Formal Languages, Automata and Models of Computation Lecture 7 Mälardalen University 2010.
1 CD5560 FABER Formal Languages, Automata and Models of Computation Lecture 5 Mälardalen University 2005.
Lexical Analysis - An Introduction. The Front End The purpose of the front end is to deal with the input language Perform a membership test: code  source.
1 CDT314 FABER Formal Languages, Automata and Models of Computation Lecture 5 Mälardalen University 2010.
Membership problem CYK Algorithm Project presentation CS 5800 Spring 2013 Professor : Dr. Elise de Doncker Presented by : Savitha parur venkitachalam.
CYK Algorithm for Parsing General Context-Free Grammars
Prof. Busch - LSU1 NFAs accept the Regular Languages.
1 Lex. 2 Lex is a lexical analyzer Var = ; if (test > 20) temp = 0; else while (a < 20) temp++; Lex Ident: Var Integer: 12 Oper: + Integer: 9 Semicolumn:
11 CDT314 FABER Formal Languages, Automata and Models of Computation Lecture 7 School of Innovation, Design and Engineering Mälardalen University 2012.
CYK Algorithm for Parsing General Context-Free Grammars.
Context-Free Languages. Regular Languages Context-Free Languages.
Costas Busch - LSU1 Time Complexity. Costas Busch - LSU2 Consider a deterministic Turing Machine which decides a language.
Costas Busch - LSU1 Parsing. Costas Busch - LSU2 Compiler Program File v = 5; if (v>5) x = 12 + v; while (x !=3) { x = x - 3; v = 10; } Add v,v,5.
Grammar Set of variables Set of terminal symbols Start variable Set of Production rules.
Costas Busch - LSU1 Linear Grammars Grammars with at most one variable at the right side of a production Examples:
Mid-Terms Exam Scope and Introduction. Format Grades: 100 points -> 20% in the final grade Multiple Choice Questions –8 questions, 7 points each Short.
Costas Busch - LSU1 PDAs Accept Context-Free Languages.
Chapter 3 – Describing Syntax
lec02-parserCFG May 8, 2018 Syntax Analyzer
Non-regular languages
Context-Free Languages
G. Pullaiah College of Engineering and Technology
CS510 Compiler Lecture 4.
PDAs Accept Context-Free Languages
Time Complexity Costas Busch - LSU.
Reductions Costas Busch - LSU.
NPDAs Accept Context-Free Languages
Simplifications of Context-Free Grammars
Simplifications of Context-Free Grammars
PDAs Accept Context-Free Languages
Syntax versus Semantics
NPDAs Accept Context-Free Languages
CSE 3302 Programming Languages
Non-Deterministic Finite Automata
R.Rajkumar Asst.Professor CSE
Deterministic PDAs - DPDAs
The Post Correspondence Problem
Pumping Lemma for Context-free Languages
Closure Properties of Context-Free languages
Designing a Predictive Parser
Decidable Problems of Regular Languages
Teori Bahasa dan Automata Lecture 9: Contex-Free Grammars
BNF 9-Apr-19.
Applications of Regular Closure
Lec00-outline May 18, 2019 Compiler Design CS416 Compiler Design.
lec02-parserCFG May 27, 2019 Syntax Analyzer
Normal Forms for Context-free Grammars
Faculty of Computer Science and Information System
Presentation transcript:

Parsing Costas Busch - LSU

Machine Code Program File Add v,v,5 cmp v,5 jmplt ELSE THEN: add x, 12,v ELSE: WHILE: cmp x,3 ... v = 5; if (v>5) x = 12 + v; while (x !=3) { x = x - 3; v = 10; } ...... Compiler Costas Busch - LSU

Compiler Lexical analyzer parser Input String Output Program file machine code Costas Busch - LSU

Lexical analyzer: Recognizes the lexemes of the input program file: Keywords (if, then, else, while,…), Integers, Identifiers (variables), etc It is built with DFAs (based on the theory of regular languages) Costas Busch - LSU

Parser: Knows the grammar of the programming language to be compiled Constructs derivation (and derivation tree) for input program file (input string) Converts derivation to machine code Costas Busch - LSU

Example Parser PROGRAM STMT_LIST STMT_LIST STMT; STMT_LIST | STMT; STMT EXPR | IF_STMT | WHILE_STMT | { STMT_LIST } EXPR EXPR + EXPR | EXPR - EXPR | ID IF_STMT if (EXPR) then STMT | if (EXPR) then STMT else STMT WHILE_STMT while (EXPR) do STMT Costas Busch - LSU

The parser finds the derivation of a particular input file Example Parser E => E + E => E + E * E => 10 + E*E => 10 + 2 * E => 10 + 2 * 5 Input string E -> E + E | E * E | INT 10 + 2 * 5 Costas Busch - LSU

derivation derivation tree b E => E + E => E + E * E => 10 + 2 * 5 a + E E 10 E * E 2 5 machine code Derivation trees are used to build Machine code mult a, 2, 5 add b, 10, a Costas Busch - LSU

A simple (exhaustive) parser Costas Busch - LSU

We will build an exhaustive search parser that examines all possible derivations Exhaustive Parser input string derivation grammar Costas Busch - LSU

Find derivation of string Example: Find derivation of string Exhaustive Parser derivation Input string ? Costas Busch - LSU

Exhaustive Search Phase 1: All possible derivations of length 1 Find derivation of All possible derivations of length 1 Costas Busch - LSU

Cannot possibly produce Phase 1: Find derivation of Cannot possibly produce Costas Busch - LSU

In Phase 2, explore the next step of each derivation Phase 1 from Phase 1 Phase 1 Costas Busch - LSU

Phase 2 Phase 1 Find derivation of Costas Busch - LSU

all possible derivations Phase 2 Find derivation of In Phase 3 explore all possible derivations Costas Busch - LSU

Phase 2 A possible derivation of Phase 3 Find derivation of Costas Busch - LSU

Final result of exhaustive search Exhaustive Parser Input string derivation Costas Busch - LSU

Time Complexity Suppose that the grammar does not have productions of the form ( -productions) (unit productions) Costas Busch - LSU

Since the are no -productions For any derivation of a string of terminals it holds that for all Costas Busch - LSU

Since the are no unit productions 1. At most derivation steps are needed to produce a string with at most variables 2. At most derivation steps are needed to convert the variables of to the string of terminals Costas Busch - LSU

Therefore, at most derivation steps are required to produce The exhaustive search requires at most phases Costas Busch - LSU

Suppose the grammar has productions Possible derivation choices to be examined in phase 1: at most Costas Busch - LSU

Choices for phase 2: at most Choices of phase 1 Number of Productions In General Choices for phase i: at most Choices of phase i-1 Number of Productions Costas Busch - LSU

Extremely bad!!! Total exploration choices for string : phase 1 phase 2|w| phase 2 Exponential to the string length Extremely bad!!! Costas Busch - LSU

Faster Parsers Costas Busch - LSU

There exist faster parsing algorithms for specialized grammars S-grammar: Symbol String of variables Each pair of variable, terminal appears once in a production (a restricted version of Greinbach Normal form) Costas Busch - LSU

Each string has a unique derivation S-grammar example: Each string has a unique derivation Costas Busch - LSU

In the exhaustive search parsing For S-grammars: In the exhaustive search parsing there is only one choice in each phase Steps for a phase: Total steps for parsing string : Costas Busch - LSU

For general context-free grammars: Next, we give a parsing algorithm that parses a string in time (this time is very close to the worst case optimal since parsing can be used to solve the matrix multiplication problem) Costas Busch - LSU

The CYK Parsing Algorithm Input: Arbitrary Grammar in Chomsky Normal Form String Output: Determine if Number of Steps: Can be easily converted to a Parser Costas Busch - LSU

Basic Idea Consider a grammar In Chomsky Normal Form Denote by the set of variables that generate a string if Costas Busch - LSU

Suppose that we have computed Check if : YES NO Costas Busch - LSU

can be computed recursively: prefix suffix Write If and and there is production Then Costas Busch - LSU

Examine all prefix-suffix decompositions of Set of Variables that generate Length 1 2 |w|-1 Result: Costas Busch - LSU

At the basis of the recursion we have strings of length 1 symbol Very easy to find Costas Busch - LSU

The whole algorithm can be implemented with dynamic programming: Remark: The whole algorithm can be implemented with dynamic programming: First compute for smaller substrings and then use this to compute the result for larger substrings of Costas Busch - LSU

Example: Grammar : Determine if Costas Busch - LSU

to all possible substrings Decompose the string to all possible substrings Length 1 2 3 4 5 Costas Busch - LSU

Costas Busch - LSU

Costas Busch - LSU

There is no production of form Thus, prefix suffix There is no production of form Thus, prefix suffix There are two productions of form Thus, Costas Busch - LSU

Costas Busch - LSU

There is no production of form There are 2 productions of form Decomposition 1 prefix suffix There is no production of form There are 2 productions of form Costas Busch - LSU

There is no production of form Decomposition 2 prefix suffix There is no production of form Costas Busch - LSU

Since Costas Busch - LSU

Approximate time complexity: Number of substrings Number of Prefix-suffix decompositions for a string Costas Busch - LSU