
Syntax Analysis
The recognition problem: given a grammar G and a string w, is w ∈ L(G)?
The parsing problem: if G is a grammar and w ∈ L(G), how can w be derived in G?
For context-free grammars, both of these problems are decidable - that is, there are algorithms which will give a definite (correct) answer for any given instance of the problems.
Parsing is important, because understanding the derivation of a structure helps us to understand the meaning of the structure.
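To make "decidable" concrete, here is a minimal sketch of a recognizer for one small grammar, the expression grammar S -> S + S | S * S | (S) | a. The function name in_language and the dynamic-programming layout are illustrative assumptions, not part of the slides; the point is only that the search always terminates with a yes or no.

```python
from functools import lru_cache


def in_language(w: str) -> bool:
    """Decide whether w is generated by S -> S+S | S*S | (S) | a."""

    @lru_cache(maxsize=None)
    def derives(i: int, j: int) -> bool:
        # Can S derive the substring w[i:j]?
        length = j - i
        if length == 1:
            return w[i] == "a"                      # S -> a
        if length < 3:
            return False                            # every other rule needs >= 3 symbols
        if w[i] == "(" and w[j - 1] == ")" and derives(i + 1, j - 1):
            return True                             # S -> (S)
        # S -> S+S or S -> S*S, trying every operator position k
        return any(
            w[k] in "+*" and derives(i, k) and derives(k + 1, j)
            for k in range(i + 1, j - 1)
        )

    return derives(0, len(w))


print(in_language("a+(a*a)"))  # a string of the language
print(in_language("a+(a"))     # not a string of the language
```

Every recursive call works on a strictly shorter substring, so the procedure halts on every input.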

Derivation Structure
Consider the expression a+(a*a) in the language of G0, where G0 has the productions:
1) S -> S + S
2) S -> S * S
3) S -> (S)
4) S -> a
In order to process this expression, it helps to consider the (a*a) substring as a more significant sub-unit than a+(a, for example.
We can use the derivation of the string:
S => S+S => S+(S) => S+(S*S) => S+(S*a) => S+(a*a) => a+(a*a)
which corresponds to the tree:
        S
      / | \
     S  +  S
     |   / | \
     a  (  S  )
         / | \
        S  *  S
        |     |
        a     a
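The derivation above happens to replace the rightmost S at every step, so it can be replayed mechanically from the list of production numbers used. The sketch below assumes that convention; the names RULES and replay_rightmost are illustrative.

```python
# Productions of G0, numbered as on the slide.
RULES = {1: "S+S", 2: "S*S", 3: "(S)", 4: "a"}


def replay_rightmost(rule_numbers):
    """Apply each numbered production to the rightmost S, recording every sentential form."""
    form = "S"
    forms = [form]
    for n in rule_numbers:
        pos = form.rfind("S")
        if pos == -1:
            raise ValueError("no non-terminal left to replace")
        form = form[:pos] + RULES[n] + form[pos + 1:]
        forms.append(form)
    return forms


print(" => ".join(replay_rightmost([1, 3, 2, 4, 4, 4])))
# S => S+S => S+(S) => S+(S*S) => S+(S*a) => S+(a*a) => a+(a*a)
```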

Derivation Trees
For any derivation, we can construct a derivation tree.
The root of the tree will be a node representing the start symbol.
Every time we apply a production A -> α, we add a subtree below A: A is the root, and there is a branch for every symbol of α, in the same left-to-right order in which they appear in α.
We read the string represented by the derivation tree by reading the "leaf" nodes in left-to-right order.
Note: "left-to-right" order means the "structural" order - the leftmost path, then the same path, but with the next-to-left branch at the last node where there was a choice, etc. - and not any order which may appear in the sketch.
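As a small illustration of reading the leaves in structural left-to-right order, the tree for a+(a*a) can be written as nested tuples and traversed depth-first. The tuple encoding (symbol, children) and the function name leaves are assumptions made for this sketch.

```python
# A node is (symbol, [children]); a leaf is a plain string (a terminal).
TREE = ("S", [
    ("S", ["a"]),
    "+",
    ("S", [
        "(",
        ("S", [("S", ["a"]), "*", ("S", ["a"])]),
        ")",
    ]),
])


def leaves(node):
    """Return the leaf symbols in structural left-to-right order."""
    if isinstance(node, str):
        return [node]
    _symbol, children = node
    result = []
    for child in children:      # children are stored in the order of the production body
        result.extend(leaves(child))
    return result


print("".join(leaves(TREE)))  # a+(a*a)
```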

S => S+S => S+(S) => S+(S*S) => S+(S*a) => S+(a*a) => a+(a*a)
[Figure: the derivation tree for a+(a*a), built up one production at a time alongside the derivation.]

Equivalent Derivations
Two different derivations can have the same derivation tree.
Example: S => S+S => S+a => a+a and S => S+S => a+S => a+a both produce the tree
      S
    / | \
   S  +  S
   |     |
   a     a
In CFGs, the order in which productions are applied does not affect the tree, as long as the same production is applied to the same symbol.
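One way to check this is to build the tree from a derivation given as a list of production numbers, once expanding the leftmost S first and once the rightmost S first, and compare the results. The names RULES and build_tree and the tuple encoding are illustrative assumptions.

```python
# Productions of G0, numbered as on the slides; each body is a list of symbols.
RULES = {1: ["S", "+", "S"], 2: ["S", "*", "S"], 3: ["(", "S", ")"], 4: ["a"]}


def build_tree(rule_numbers, rightmost=False):
    """Build the derivation tree for a leftmost (default) or rightmost derivation."""
    steps = iter(rule_numbers)

    def expand():
        body = RULES[next(steps)]
        children = list(body)
        order = range(len(body))
        if rightmost:
            order = reversed(order)       # visit the rightmost S first
        for i in order:
            if body[i] == "S":
                children[i] = expand()
        return ("S", tuple(children))

    tree = expand()
    if next(steps, None) is not None:
        raise ValueError("derivation has unused productions")
    return tree


# S => S+S => a+S => a+a (leftmost) versus S => S+S => S+a => a+a (rightmost)
print(build_tree([1, 4, 4]) == build_tree([1, 4, 4], rightmost=True))
# a+(a*a): leftmost uses rules 1,4,3,2,4,4; rightmost uses 1,3,2,4,4,4
print(build_tree([1, 4, 3, 2, 4, 4]) == build_tree([1, 3, 2, 4, 4, 4], rightmost=True))
```

Both comparisons succeed because the two derivations apply the same productions to the same occurrences of S, only in a different order.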

Multiple Derivation Trees
Consider the two derivations below:
1. S => S+S => S+S*S => S+S*a => S+a*a => a+a*a
2. S => S*S => S*a => S+S*a => S+a*a => a+a*a
These give essentially different derivation trees for the same final sentence.
Tree 1:
      S
    / | \
   S  +  S
   |   / | \
   a  S  *  S
      |     |
      a     a
Tree 2:
        S
      / | \
     S  *  S
   / | \   |
  S  +  S  a
  |     |
  a     a
This causes problems for our attempt to understand a string by considering its derivation.
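The two trees can also be found by exhaustive search. The sketch below lists every derivation tree of a string under S -> S+S | S*S | (S) | a, writing each tree as a string in which square brackets mark the grouping chosen by the tree. The bracket notation and the name all_trees are assumptions of this sketch.

```python
from functools import lru_cache


def all_trees(w: str):
    """List every derivation tree of w, shown as a square-bracketed grouping."""

    @lru_cache(maxsize=None)
    def trees(i: int, j: int):
        found = []
        if j - i == 1 and w[i] == "a":
            found.append("a")                                   # S -> a
        if j - i >= 3:
            if w[i] == "(" and w[j - 1] == ")":
                found += [f"({t})" for t in trees(i + 1, j - 1)]  # S -> (S)
            for k in range(i + 1, j - 1):                        # S -> S+S, S -> S*S
                if w[k] in "+*":
                    found += [
                        f"[{left}{w[k]}{right}]"
                        for left in trees(i, k)
                        for right in trees(k + 1, j)
                    ]
        return tuple(found)

    return list(trees(0, len(w)))


print(all_trees("a+a*a"))    # ['[a+[a*a]]', '[[a+a]*a]']
print(all_trees("a+(a*a)"))  # ['[a+([a*a])]']
```

The first string has two groupings, matching trees 1 and 2; the parenthesized string has only one.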

Ambiguous Grammars
A derivation in which at each step the rightmost non-terminal is replaced is a right derivation (also called a rightmost derivation).
In a right derivation, the order of symbols to be replaced is fixed.
A string has two different right derivations iff it has two different derivation trees.
A CFG G is ambiguous if there is at least one string in L(G) having two or more different right derivations (or, equivalently, two or more different derivation trees).
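For a single string, this definition can be tested by enumerating right derivations directly. The sketch below does so for the grammar S -> S+S | S*S | (S) | a, pruning sentential forms that are already too long or whose terminal suffix cannot match; the function name and the pruning rules are assumptions of this sketch, and they rely on the grammar having no empty productions.

```python
BODIES = ["S+S", "S*S", "(S)", "a"]   # production bodies for S


def right_derivations(target: str):
    """Return every right derivation of target as a list of sentential forms."""
    results = []

    def step(form, path):
        pos = form.rfind("S")
        if pos == -1:                          # no non-terminal left
            if form == target:
                results.append(path)
            return
        if len(form) > len(target):            # forms never shrink, so give up
            return
        if not target.endswith(form[pos + 1:]):
            return                             # terminals right of S are final
        for body in BODIES:
            new_form = form[:pos] + body + form[pos + 1:]
            step(new_form, path + [new_form])

    step("S", ["S"])
    return results


for derivation in right_derivations("a+a*a"):
    print(" => ".join(derivation))
# S => S+S => S+S*S => S+S*a => S+a*a => a+a*a
# S => S*S => S*a => S+S*a => S+a*a => a+a*a
```

Two distinct right derivations of one string are enough to show that this grammar is ambiguous.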

The Problem With Ambiguity
By the previous example, the grammar of algebraic expressions, G0, is ambiguous.
Why is this a problem? Suppose we are attempting to analyse strings in the language of G0 in order to perform simple arithmetic - the structure of the derivation will tell us which operation to apply when.
Problem: 2+2*2 = ?
Under derivation 1, we get 2 + (2*2) = 6.
Under derivation 2, we get (2+2)*2 = 8.
Which do we select?
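The arithmetic can be checked by evaluating the two tree shapes directly. In this sketch a tree is a nested tuple (operator, left, right) with integers at the leaves; that encoding and the name evaluate are illustrative assumptions.

```python
def evaluate(tree):
    """Evaluate a tree given as an int leaf or an (operator, left, right) tuple."""
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    a, b = evaluate(left), evaluate(right)
    return a + b if op == "+" else a * b


# The two derivation trees of 2+2*2
tree_1 = ("+", 2, ("*", 2, 2))   # the product is a sub-unit of the sum
tree_2 = ("*", ("+", 2, 2), 2)   # the sum is a sub-unit of the product

print(evaluate(tree_1))  # 6
print(evaluate(tree_2))  # 8
```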

Unambiguous Expressions
We are aiming to produce an unambiguous version of G0.
Essentially, we want to assign priorities to the operators, and reflect this in the grammar.
Also, although it makes no difference to the evaluated expression, we want a+a+a to be (a+a)+a.
We will do this by introducing new symbols - a term, T, will represent a product; a factor, F, will represent things that can be multiplied; and the start symbol, S, will represent an expression (a sum).
An expression can be a sum of an expression and a term, or simply a term.
A term can be a product of a term and a factor, or simply a factor.
A factor can be an expression (in parentheses), or simply a symbol.
Example: Grammar G1.
S -> S + T | T
T -> T * F | F
F -> (S) | a
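The grouping that G1 imposes can be seen with a small hand-written parser. G1 is left-recursive, which a naive recursive-descent parser cannot follow directly, so this sketch swaps in the usual loop form: S is parsed as T followed by any number of "+ T", and T as F followed by any number of "* F", folding to the left. The function name group and the square-bracket output are assumptions of this sketch.

```python
def group(w: str) -> str:
    """Parse w according to G1 and return it with square brackets showing the tree structure."""
    pos = 0

    def peek():
        return w[pos] if pos < len(w) else None

    def expect(ch):
        nonlocal pos
        if peek() != ch:
            raise SyntaxError(f"expected {ch!r} at position {pos}")
        pos += 1

    def expression():                 # S -> S + T | T, read as T (+ T)*
        left = term()
        while peek() == "+":
            expect("+")
            left = f"[{left}+{term()}]"   # fold to the left: (a+a)+a
        return left

    def term():                       # T -> T * F | F, read as F (* F)*
        left = factor()
        while peek() == "*":
            expect("*")
            left = f"[{left}*{factor()}]"
        return left

    def factor():                     # F -> (S) | a
        if peek() == "(":
            expect("(")
            inner = expression()
            expect(")")
            return f"({inner})"
        expect("a")
        return "a"

    result = expression()
    if pos != len(w):
        raise SyntaxError(f"unexpected {w[pos]!r} at position {pos}")
    return result


print(group("a+a*a"))    # [a+[a*a]]
print(group("a+a+a"))    # [[a+a]+a]
print(group("(a+a)*a"))  # [([a+a])*a]
```

Multiplication binds tighter because a product can only appear inside a term, and repeated additions group to the left because the recursion in S -> S + T is on the left.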

Ambiguity and Decidability
The ambiguity we have seen so far has always been a property of the grammar, and not of the language.
However, there exist languages for which every grammar defining them is ambiguous.
Example: {a^i b^j c^k : i = j or j = k}
A language for which every defining grammar is ambiguous is inherently ambiguous.
More importantly, there is no algorithm which will determine whether or not a given context-free grammar is ambiguous.
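Undecidability does not rule out searching for a witness. The sketch below counts derivation trees for every string up to a length bound and reports the first string with more than one tree. It assumes single-character symbols and grammars without empty productions, and all names are illustrative. A result of None says only that no witness exists up to the bound; it is not a proof that the grammar is unambiguous.

```python
from functools import lru_cache
from itertools import product

G0 = {"S": ["S+S", "S*S", "(S)", "a"]}
G1 = {"S": ["S+T", "T"], "T": ["T*F", "F"], "F": ["(S)", "a"]}


def count_trees(grammar, start, w):
    """Count derivation trees of w (assumes no empty productions)."""

    @lru_cache(maxsize=None)
    def sym(x, i, j):
        if x not in grammar:                      # terminal symbol
            return 1 if w[i:j] == x else 0
        return sum(seq(body, i, j) for body in grammar[x])

    @lru_cache(maxsize=None)
    def seq(body, i, j):
        if len(body) == 1:
            return sym(body, i, j)
        # The first symbol covers w[i:k]; each remaining symbol needs >= 1 character.
        return sum(
            sym(body[0], i, k) * seq(body[1:], k, j)
            for k in range(i + 1, j - len(body) + 2)
        )

    return sym(start, 0, len(w))


def find_ambiguous_string(grammar, start, alphabet, max_len):
    """Return the first string (shortest first) with two or more trees, or None."""
    for n in range(1, max_len + 1):
        for chars in product(alphabet, repeat=n):
            w = "".join(chars)
            if count_trees(grammar, start, w) > 1:
                return w
    return None


print(find_ambiguous_string(G0, "S", "a+*()", 5))  # a+a+a
print(find_ambiguous_string(G1, "S", "a+*()", 5))  # None
```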