Extensible syntax analyzers Pavel Egorov

Slides:



Advertisements
Similar presentations
Translator Architecture Code Generator ParserTokenizer string of characters (source code) string of tokens abstract program string of integers (object.
Advertisements

1 Parsing The scanner recognizes words The parser recognizes syntactic units Parser operations: Check and verify syntax based on specified syntax rules.
May 2006CLINT-LN Parsing1 Computational Linguistics Introduction Approaches to Parsing.
Grammars, Languages and Parse Trees. Language Let V be an alphabet or vocabulary V* is set of all strings over V A language L is a subset of V*, i.e.,
CSE 425: Semantic Analysis Semantic Analysis Allows rigorous specification of a program’s meaning –Lets (parts of) programming languages be proven correct.
Understanding Natural Language
Transparency No. P2C4-1 Formal Language and Automata Theory Part II Chapter 4 Parse Trees and Parsing.
A basis for computer theory and A means of specifying languages
Discussion #31/20 Discussion #3 Grammar Formalization & Parse-Tree Construction.
Slide 1 Chapter 2-b Syntax, Semantics. Slide 2 Syntax, Semantics - Definition The syntax of a programming language is the form of its expressions, statements.
Parsing — Part II (Ambiguity, Top-down parsing, Left-recursion Removal)
Parsing SLP Chapter 13. 7/2/2015 Speech and Language Processing - Jurafsky and Martin 2 Outline  Parsing with CFGs  Bottom-up, top-down  CKY parsing.
Parsing II : Top-down Parsing Lecture 7 CS 4318/5331 Apan Qasem Texas State University Spring 2015 *some slides adopted from Cooper and Torczon.
CPSC Compiler Tutorial 3 Parser. Parsing The syntax of most programming languages can be specified by a Context-free Grammar (CGF) Parsing: Given.
CSC3315 (Spring 2009)1 CSC 3315 Lexical and Syntax Analysis Hamid Harroud School of Science and Engineering, Akhawayn University
Problem of the DAY Create a regular context-free grammar that generates L= {w  {a,b}* : the number of a’s in w is not divisible by 3} Hint: start by designing.
The College of Saint Rose CIS 433 – Programming Languages David Goldschmidt, Ph.D. from Concepts of Programming Languages, 9th edition by Robert W. Sebesta,
Syntax Directed Definitions Synthesized Attributes
BİL 744 Derleyici Gerçekleştirimi (Compiler Design)1 Syntax Analyzer Syntax Analyzer creates the syntactic structure of the given source program. This.
Syntax Analysis The recognition problem: given a grammar G and a string w, is w  L(G)? The parsing problem: if G is a grammar and w  L(G), how can w.
-Mandakinee Singh (11CS10026).  What is parsing? ◦ Discovering the derivation of a string: If one exists. ◦ Harder than generating strings.  Two major.
Classification of grammars Definition: A grammar G is said to be 1)Right-linear if each production in P is of the form A  xB or A  x where A and B are.
Context-Free Grammars
Lesson 3 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
Copyright © by Curt Hill Grammar Types The Chomsky Hierarchy BNF and Derivation Trees.
Phrase-structure grammar A phrase-structure grammar is a quadruple G = (V, T, P, S) where V is a finite set of symbols called nonterminals, T is a set.
11 Chapter 4 Grammars and Parsing Grammar Grammars, or more precisely, context-free grammars, are the formalism for describing the structure of.
6/4/2016IT 3271 The most practical Parsers: Predictive parser: 1.input (token string) 2.Stacks, parsing table 3.output (syntax tree, intermediate codes)
Parsing Introduction Syntactic Analysis I. Parsing Introduction 2 The Role of the Parser The Syntactic Analyzer, or Parser, is the heart of the front.
Bernd Fischer RW713: Compiler and Software Language Engineering.
CFG1 CSC 4181Compiler Construction Context-Free Grammars Using grammars in parsers.
Parsing — Part II (Top-down parsing, left-recursion removal) Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students.
Chapter 3 Context-Free Grammars and Parsing. The Parsing Process sequence of tokens syntax tree parser Duties of parser: Determine correct syntax Build.
Grammars Hopcroft, Motawi, Ullman, Chap 5. Grammars Describes underlying rules (syntax) of programming languages Compilers (parsers) are based on such.
Grammars CS 130: Theory of Computation HMU textbook, Chap 5.
Unit-3 Parsing Theory (Syntax Analyzer) PREPARED BY: PROF. HARISH I RATHOD COMPUTER ENGINEERING DEPARTMENT GUJARAT POWER ENGINEERING & RESEARCH INSTITUTE.
1 A Simple Syntax-Directed Translator CS308 Compiler Theory.
Introduction Finite Automata accept all regular languages and only regular languages Even very simple languages are non regular (  = {a,b}): - {a n b.
Syntax Analyzer (Parser)
Programming Languages and Design Lecture 2 Syntax Specifications of Programming Languages Instructor: Li Ma Department of Computer Science Texas Southern.
1 February 23, February 23, 2016February 23, 2016February 23, 2016 Azusa, CA Sheldon X. Liang Ph. D. Computer Science at Azusa Pacific University.
Chapter 4: Syntax analysis Syntax analysis is done by the parser. –Detects whether the program is written following the grammar rules and reports syntax.
Comp 311 Principles of Programming Languages Lecture 2 Syntax Corky Cartwright August 26, 2009.
Lecture # 10 Grammar Problems. Problems with grammar Ambiguity Left Recursion Left Factoring Removal of Useless Symbols These can create problems for.
Compiler Construction Lecture Five: Parsing - Part Two CSC 2103: Compiler Construction Lecture Five: Parsing - Part Two Joyce Nakatumba-Nabende 1.
Organization of Programming Languages Meeting 3 January 15, 2016.
Syntax Analysis By Noor Dhia Left Recursion: Example1: S → S0s1s | 01 The grammar after eliminate left recursion is: S → 01 S’ S' → 0s1sS’
Chapter 3: Describing Syntax and Semantics
Chapter 3 – Describing Syntax
Parsing & Context-Free Grammars
Chapter 4 - Parsing CSCE 343.
Unit-3 Bottom-Up-Parsing.
Table-driven parsing Parsing performed by a finite state machine.
Parsing — Part II (Top-down parsing, left-recursion removal)
Context-Free Grammars
Compiler Construction
PARSE TREES.
Context-Free Grammars
Context-Free Grammars
Chapter 2: A Simple One Pass Compiler
Thinking about grammars
CSC 4181Compiler Construction Context-Free Grammars
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
CSC 4181 Compiler Construction Context-Free Grammars
Context-Free Grammars
Context-Free Grammars
Parsing & Context-Free Grammars Hal Perkins Autumn 2005
Programming Languages and Compilers (CS 421)
COP 4620 / 5625 Programming Language Translation / Compiler Writing Fall 2003 Lecture 2, 09/04/2003 Prof. Roy Levow.
Parsing CSCI 432 Computer Science Theory
Presentation transcript:

Extensible syntax analyzers Pavel Egorov

Problem statements L – some formal language; T : L  R – it’s translator; L ext  L – some extended language; T ext : L ext  R ext – it’s extended translator.  L T ext (  ) = T(  ) We want compatibility!

Disclaimer Consider syntax analyzers only. Fixed algorithm of syntax analysis. Extension due to modification of the grammar only.

What is syntax analyzer? SA converts a word to a syntax tree. What is syntax tree? – Root is labeled by the axiom of the grammar – Inner nodes are labeled by non-terminals – Leaf nodes are labeled by terminals or non- terminals or 

Classic parse tree Grammar: A  aBd B  b B  A Word: “aabdd” Derivation: A  aBd  aAd   aaBdd  aabdd A B A B b da da

Classic parse tree Is it good enough? G 1 : A  abc Extended G 2 : A  aBc B  b Word: “abc” A B b ca A bca They are different!

Classic parse tree Is it good enough? G 1 : A  a Extended G 2 : A  aE E   Word: “a” A E  a A a They are different!

Cancelled tree Driven by labeled grammar Cut off some inner nodes from parse tree Cut off  -leafs from parse tree Is much better for parser extension!

Cancelled tree example Labeled grammar: A  a 1 B 0 d 0 B 1 B  b 1 E 0 B  x 1 E   0 Symbols labeled by 0 are not present in tree  -leafs are always labeled by 0 A B b d a  A b a parse tree cancelled tree B x B x E

Intermediate results Labeled grammar Syntax analyzer driven by labeled grammar Canceled tree as a result of syntax analysis So how can we extend grammars without breaking compatibility? Compatibility:  L SA ext (  ) = SA (  )

Extending transformations of grammars f – extending grammar transformation G - grammar L(f(G))  L(G)

Extending transformations ADD-transform adds some fixed new production to a grammar. EXTRACT-transform: before:A   x  y  z after:A   x B 0  z B   y

Extending transformations AE *  all transforms, composed from arbitrary number of ADD- and EXTRACT-transforms.

Results

Compatibility 1.Any transform from AE* doesn’t break compatibility of corresponding syntax analyzers. 2.A grammar extended by AE* transform has the same production labeling as initial grammar has on the common productions. (AE* doesn’t change labeling of unchanged productions)

Independent transforms G G2G2 G1G1 G 12 ???   When it is possible? How can we do it?      AE * G 2  Domain(   ) ? G 1  Domain(   ) ?

Independent transforms What if G 1  Domain(  2 )? Sometimes we can solve this problem! A    1 : A   C  ; C    2 : A   B  ; B    2 /  1 : A   B  ; B   C Conflict! Decision!

 a  fixed transform  “fixed” , which can be applied after  Sometimes we can’t fix   A   A  X  ; X   A  Y; Y  Recursive definition of the function  for  and  from AE *. Sometimes  is not defined.

Independent transforms G G2G2 G1G1 G 12     /     /     /     ?   /     G 21 ?

Independent transforms G G2G2 G1G1 G 12     /     /     /     =   /    

Conclusion Syntax analyzers – driven by labeled grammars – cancelled trees as a result AE* - a way to extend grammars safelly  – a way to compose independent grammar transforms Order of application of the extending transforms doesn’t matter!

Thank you! Questions?