Published by Corinne Dexter. Modified over 2 years ago.

1
Grammars, Languages and Parse Trees

2
Language
Let V be an alphabet or vocabulary
V* is the set of all strings over V
A language L is a subset of V*, i.e., L ⊆ V*
L may be finite or infinite
Programming language
– Set of all possible programs (each valid program is one, possibly very long, string)
– Programs with syntax errors are not in the set
– Infinite number of programs
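A small sketch of these definitions in Python. The two-letter alphabet and the sample language (equal numbers of a and b) are assumptions chosen for illustration; they do not come from the slides. Since V* is infinite, only the strings up to a chosen length are enumerated.

```python
from itertools import product

def strings_up_to(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len (a finite slice of V*)."""
    for n in range(max_len + 1):
        for symbols in product(sorted(alphabet), repeat=n):
            yield "".join(symbols)

V = {"a", "b"}                                   # illustrative alphabet
slice_of_v_star = list(strings_up_to(V, 2))

# A language is any subset of V*; this one keeps strings with as many a's as b's.
L = {s for s in slice_of_v_star if s.count("a") == s.count("b")}

print(slice_of_v_star)
print(sorted(L))
```

The empty string belongs to V*, which is why it appears first in the enumeration.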

3
Language Representation
Finite language
– Enumerate all sentences
Infinite language
– Cannot be specified by enumeration
– Use a generative device, i.e., a grammar
  Specifies the set of all legal sentences
  Defined recursively (or inductively)

4
Sample Grammar
Simple arithmetic expressions (E)
Basis Rules:
– A Variable is an E
– An Integer is an E
Inductive Rules:
– If E1 and E2 are Es, so is (E1 + E2)
– If E1 and E2 are Es, so is (E1 * E2)
Examples: x, y, 3, 12, (x + y), (z * (x + y)), ((z * (x + y)) + 12)
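The basis and inductive rules can be read as a recipe for building the set of expressions level by level. The sketch below is one possible reading, assuming a tiny pool of variables and integers; the function name `expressions` and its parameters are hypothetical.

```python
def expressions(variables, integers, depth):
    """Return all Es obtainable with at most `depth` rounds of the inductive rules."""
    level = set(variables) | set(integers)        # basis rules
    for _ in range(depth):
        bigger = set(level)
        for e1 in level:
            for e2 in level:
                bigger.add(f"({e1} + {e2})")      # inductive rule for +
                bigger.add(f"({e1} * {e2})")      # inductive rule for *
        level = bigger
    return level

one_round = expressions({"x", "y"}, {"3"}, 1)
two_rounds = expressions({"x", "y"}, {"3"}, 2)
print(len(one_round), "(x + y)" in one_round, "x + y" in one_round)
```

Note that `x + y` without parentheses is never produced: the inductive rules always add the enclosing parentheses.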

5
Production Rules
Use symbols (aka syntactic categories) and meta-symbols to define basis and inductive rules
For our example:
E → V          (basis rule)
E → I          (basis rule)
E → (E + E)    (inductive rule)
E → (E * E)    (inductive rule)

6
Formal Definition of a Grammar
G = (V_N, V_T, S, Φ), where
– V_N, V_T: sets of non-terminal and terminal symbols
– S ∈ V_N: a start symbol
– Φ: a finite set of relations from (V_T ∪ V_N)+ to (V_T ∪ V_N)*
An element (α, β) of Φ is written α → β and is called a production rule or a rewrite rule

7
Sample Grammar Revisited
1. E → V | I | (E + E) | (E * E)
2. V → L | VL | VD
3. I → D | ID
4. D → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
5. L → x | y | z
V_N: E, V, I, D, L
V_T: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, x, y, z, (, ), +, *
S = E
Φ: rules 1-5

8
Another Simple Grammar
Symbols:
S: sentence    V: verb    O: object    A: article    N: noun
SP: subject phrase    VP: verb phrase    NP: noun phrase
Rules:
S → SP VP
SP → A N
A → a | the
N → monkey | banana | tree
VP → V O
V → ate | climbs
O → NP
NP → A N
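Because this grammar has no recursive rule, its language is finite and can be enumerated outright. The following is a minimal sketch; the dictionary layout `RULES` and the helper `expand` are illustrative choices, not part of the slides.

```python
from itertools import product

# Each non-terminal maps to its alternatives; each alternative is a list of symbols.
RULES = {
    "S": [["SP", "VP"]],
    "SP": [["A", "N"]],
    "A": [["a"], ["the"]],
    "N": [["monkey"], ["banana"], ["tree"]],
    "VP": [["V", "O"]],
    "V": [["ate"], ["climbs"]],
    "O": [["NP"]],
    "NP": [["A", "N"]],
}

def expand(symbol):
    """Return every terminal word sequence derivable from `symbol`."""
    if symbol not in RULES:                      # terminal: stands for itself
        return [[symbol]]
    results = []
    for rhs in RULES[symbol]:
        for parts in product(*(expand(s) for s in rhs)):
            results.append([word for part in parts for word in part])
    return results

sentences = {" ".join(words) for words in expand("S")}
print(len(sentences))
```

Both "a monkey ate a banana" and "a banana climbs the tree" are in the generated set: the grammar checks form only, not meaning.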

9
Context-Free Grammar
A context-free grammar is a grammar with the following restriction:
– Φ is a finite set of relations from V_N to (V_T ∪ V_N)+
  The left-hand side of a production is a single non-terminal
  The right-hand side of any production cannot be empty
Context-free grammars generate context-free languages.
With slight variations, essentially all programming languages are context-free languages.
We will focus on context-free grammars

10
More Grammars
G1 = (V_N, V_T, S, Φ), where:
  V_N = {S, B}
  V_T = {a, b, c}
  S = S
  Φ = { S → aBSc, S → abc, Ba → aB, Bb → bb }
G2 = (V_N, V_T, S, Φ), where:
  V_N = {I, L, D}
  V_T = {a, b, …, z, 0, 1, …, 9}
  S = I
  Φ = { I → L | ID | IL, L → a | b | … | z, D → 0 | 1 | … | 9 }
G3 = (V_N, V_T, S, Φ), where:
  V_N = {S, A, B}
  V_T = {a, b}
  S = S
  Φ = { S → aA, A → aA | bB, B → bB | ε }
Which are context-free?
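The restriction from the previous slide can be checked mechanically. The sketch below encodes only the first two grammars, treats every character as one symbol, and uses a hypothetical helper `is_context_free`; it tests the slide's definition (single non-terminal on the left, non-empty right side), not any broader one.

```python
import string

def is_context_free(productions, nonterminals):
    """True if every left side is a single non-terminal and every right side is non-empty."""
    return all(
        len(lhs) == 1 and lhs in nonterminals and len(rhs) > 0
        for lhs, rhs in productions
    )

# First grammar: pairs of (left side, right side).
g1 = [("S", "aBSc"), ("S", "abc"), ("Ba", "aB"), ("Bb", "bb")]

# Second grammar, with the "…" ranges written out in full.
g2 = (
    [("I", "L"), ("I", "ID"), ("I", "IL")]
    + [("L", letter) for letter in string.ascii_lowercase]
    + [("D", digit) for digit in string.digits]
)

print(is_context_free(g1, {"S", "B"}))       # False: "Ba" and "Bb" are not single non-terminals
print(is_context_free(g2, {"I", "L", "D"}))  # True
```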

11
Direct Derivative
Let G = (V_N, V_T, S, Φ) be a grammar
Let α, β ∈ (V_N ∪ V_T)*
β is said to be a direct derivative of α, written α ⇒ β, if there are strings φ1 and φ2 such that:
α = φ1 L φ2, β = φ1 λ φ2, L ∈ V_N and L → λ is a production of G
We go from α to β using a single rule

12
Examples of Direct Derivatives
G = (V_N, V_T, S, Φ), where:
  V_N = {I, L, D}
  V_T = {a, b, …, z, 0, 1, …, 9}
  S = I
  Φ = { I → L | ID | IL, L → a | b | … | z, D → 0 | 1 | … | 9 }

α      β      Rule Used    φ1       φ2
I      L      I → L        (empty)  (empty)
Ib     Lb     I → L        (empty)  b
Lb     ab     L → a        (empty)  b
IDD    I0D    D → 0        I        D
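A direct derivative is the result of exactly one rule application, so all of them can be listed by trying every rule at every position. The sketch below assumes one character per symbol and uses a hypothetical function name, `direct_derivatives`.

```python
import string

PRODUCTIONS = {
    "I": ["L", "ID", "IL"],
    "L": list(string.ascii_lowercase),
    "D": list(string.digits),
}

def direct_derivatives(alpha):
    """Return every string reachable from `alpha` by rewriting one non-terminal once."""
    results = set()
    for i, symbol in enumerate(alpha):
        for rhs in PRODUCTIONS.get(symbol, []):
            # Unchanged prefix + right-hand side + unchanged suffix.
            results.add(alpha[:i] + rhs + alpha[i + 1:])
    return results

print(sorted(direct_derivatives("I")))
print("I0D" in direct_derivatives("IDD"))
```

A string made only of terminals, such as `ab`, has no direct derivatives because no rule applies to it.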

13
Derivation
Let G = (V_N, V_T, S, Φ) be a grammar
A string α produces ω, or ω reduces to α, or ω is a derivation of α, written α ⇒+ ω, if there are strings φ1, …, φn (n ≥ 1) such that:
α ⇒ φ1 ⇒ φ2 ⇒ … ⇒ φn-1 ⇒ φn = ω
We go from α to ω using one or more rule applications

14
Example of Derivation
1. E → V | I | (E + E) | (E * E)
2. V → L | VL | VD
3. I → D | ID
4. D → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
5. L → x | y | z
Can we derive ( ( z * ( x + y ) ) + 12 ) ?
E ⇒ ( E + E )
  ⇒ ( ( E * E ) + E )
  ⇒ ( ( E * ( E + E ) ) + E )
  ⇒+ ( ( V * ( V + V ) ) + I )
  ⇒+ ( ( L * ( L + L ) ) + ID )
  ⇒+ ( ( z * ( x + y ) ) + DD )
  ⇒+ ( ( z * ( x + y ) ) + 12 )
(⇒+ marks a line that combines several single-rule steps.)
How about:
( x + 2 )
( 21 * ( x4 + 7 ) )
3 * z
2y
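One way to settle the "How about" cases is a small recognizer for this grammar. The sketch below is an assumption-laden illustration, not the course's parsing algorithm: it ignores spaces, reads a V as a letter followed by letters or digits, reads an I as a run of digits, and uses the hypothetical name `is_expression`.

```python
LETTERS = "xyz"
DIGITS = "0123456789"

def is_expression(text):
    """Return True if `text` (spaces ignored) can be derived from E."""
    s = text.replace(" ", "")

    def parse_e(i):
        """Return the index just past one E starting at i, or None if there is none."""
        if i >= len(s):
            return None
        if s[i] == "(":                              # (E + E) or (E * E)
            j = parse_e(i + 1)
            if j is None or j >= len(s) or s[j] not in "+*":
                return None
            k = parse_e(j + 1)
            if k is None or k >= len(s) or s[k] != ")":
                return None
            return k + 1
        if s[i] in LETTERS:                          # V: letter, then letters or digits
            i += 1
            while i < len(s) and s[i] in LETTERS + DIGITS:
                i += 1
            return i
        if s[i] in DIGITS:                           # I: one or more digits
            while i < len(s) and s[i] in DIGITS:
                i += 1
            return i
        return None

    return parse_e(0) == len(s)

for candidate in ["( x + 2 )", "( 21 * ( x4 + 7 ) )", "3 * z", "2y"]:
    print(candidate, is_expression(candidate))
```

The last two candidates fail for different reasons: `3 * z` lacks the parentheses that rule 1 requires, and `2y` starts with a digit, so it is neither a V nor an I.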

15
Grammar-generated Language
If G is a grammar with start symbol S, a sentential form is any derivative of S
A language L generated by a grammar G is the set of all sentential forms whose symbols are all terminals:
L(G) = { σ | S ⇒+ σ and σ ∈ V_T* }

16
Example of Language
Let G = (V_N, V_T, S, Φ), where:
  V_N = {I, L, D}
  V_T = {a, b, …, z, 0, 1, …, 9}
  S = I
  Φ = { I → L | ID | IL, L → a | b | … | z, D → 0 | 1 | … | 9 }
L(G) = {abc12, x, m934897773645, a1b2c3, …}
I ⇒ ID ⇒ IDD ⇒ ILDD ⇒ ILLDD ⇒ LLLDD ⇒ aLLDD ⇒ abLDD ⇒ abcDD ⇒ abc1D ⇒ abc12
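The derivation chain for abc12 can be checked step by step, and the short members of L(G) can be enumerated by a breadth-first search over sentential forms. This is a sketch under stated assumptions: one character per symbol, hypothetical helper names `step` and `sentences`, and a length cap that is safe only because no rule in this grammar shortens a string.

```python
from collections import deque
import string

PRODUCTIONS = {
    "I": ["L", "ID", "IL"],
    "L": list(string.ascii_lowercase),
    "D": list(string.digits),
}

def step(form):
    """All sentential forms reachable from `form` by one rule application."""
    return {
        form[:i] + rhs + form[i + 1:]
        for i, symbol in enumerate(form)
        for rhs in PRODUCTIONS.get(symbol, [])
    }

def sentences(start, max_len):
    """All terminal-only strings of length <= max_len derivable from `start`."""
    seen, queue, found = {start}, deque([start]), set()
    while queue:
        form = queue.popleft()
        if not any(symbol in PRODUCTIONS for symbol in form):
            found.add(form)                      # only terminals left: a sentence
            continue
        for nxt in step(form):
            if len(nxt) <= max_len and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return found

chain = ["I", "ID", "IDD", "ILDD", "ILLDD", "LLLDD",
         "aLLDD", "abLDD", "abcDD", "abc1D", "abc12"]
chain_ok = all(b in step(a) for a, b in zip(chain, chain[1:]))
short = sentences("I", 2)
print(chain_ok, len(short))
```

Every sentence found starts with a letter, which matches the usual rule for identifiers: letters and digits, beginning with a letter.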

17
Syntax Analysis: Parsing
The parse of a sentence is the construction of a derivation for that sentence
The parsing of a sentence results in
– acceptance or rejection
– and, if acceptance, then also a parse tree
We are looking for an algorithm to parse a sentence (i.e., to parse a program) and produce a parse tree

18
Parse Trees
A parse tree is composed of
– interior nodes representing elements of V_N
– leaf nodes representing elements of V_T
For each interior node N, the transition from N to its children represents the application of one production rule

19
Parse Tree Construction
Top-down
– Start with the root (start symbol)
– Proceed downward to leaves using productions
Bottom-up
– Start from leaves
– Proceed upward to the root
Although these seem like reasonable approaches to develop a parsing algorithm, we'll see later that neither is ideal. We'll find a better way!

20
Top down
1. A → V | I | (A + A) | (A * A)
2. V → L | VL | VD
3. I → D | ID
4. D → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
5. L → x | y | z
Sentence: ( ( z * ( x + y ) ) + 12 )
Levels of the tree, from the root down to the leaves:
A
( A + A )
( ( A * A ) + A )
( ( A * ( A + A ) ) + I )
( ( V * ( V + V ) ) + I D )
( ( L * ( L + L ) ) + D D )
( ( z * ( x + y ) ) + 1 2 )
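A recursive-descent routine is one simple way to build such a tree from the root downward. The sketch below is an illustration under assumptions, not the parsing algorithm the course goes on to develop: trees are nested tuples whose first element is the non-terminal, leaves are plain one-character strings, and the names `parse` and `leaves` are hypothetical.

```python
LETTERS = set("xyz")
DIGITS = set("0123456789")

def parse(text):
    """Top-down parse of `text` (spaces ignored); returns a tree or raises SyntaxError."""
    s = text.replace(" ", "")
    pos = 0

    def peek():
        return s[pos] if pos < len(s) else None

    def take():
        nonlocal pos
        ch = s[pos]
        pos += 1
        return ch

    def parse_a():
        if peek() == "(":                            # A -> (A + A) | (A * A)
            children = [take(), parse_a()]
            if peek() not in ("+", "*"):
                raise SyntaxError(f"expected + or * at position {pos}")
            children += [take(), parse_a()]
            if peek() != ")":
                raise SyntaxError(f"expected ) at position {pos}")
            children.append(take())
            return ("A", *children)
        if peek() in LETTERS:                        # A -> V, V -> L | VL | VD
            node = ("V", ("L", take()))
            while peek() in LETTERS or peek() in DIGITS:
                kind = "L" if peek() in LETTERS else "D"
                node = ("V", node, (kind, take()))
            return ("A", node)
        if peek() in DIGITS:                         # A -> I, I -> D | ID
            node = ("I", ("D", take()))
            while peek() in DIGITS:
                node = ("I", node, ("D", take()))
            return ("A", node)
        raise SyntaxError(f"expected an expression at position {pos}")

    tree = parse_a()
    if pos != len(s):
        raise SyntaxError(f"unexpected input at position {pos}")
    return tree

def leaves(tree):
    """Concatenate the leaf symbols of a tree from left to right."""
    if isinstance(tree, str):
        return tree
    return "".join(leaves(child) for child in tree[1:])

tree = parse("( ( z * ( x + y ) ) + 12 )")
print(leaves(tree))
```

Reading the leaves from left to right gives back the sentence, and each interior node together with its children corresponds to one production.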

21
Bottom up
1. A → V | I | (A + A) | (A * A)
2. V → L | VL | VD
3. I → D | ID
4. D → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
5. L → x | y | z
Sentence: ( ( z * ( x + y ) ) + 12 )
Levels of the tree, from the leaves up to the root:
( ( z * ( x + y ) ) + 1 2 )
( ( L * ( L + L ) ) + D D )
( ( V * ( V + V ) ) + I D )
( ( A * ( A + A ) ) + I )
( ( A * A ) + A )
( A + A )
A

22
Lexical Analyzer and Parser
Lexical analyzers
– Input: symbols of length 1
– Output: classified tokens
Parsers
– Input: classified tokens
– Output: parse tree (i.e., syntactically correct program)
A syntactically correct program will run. Will it do what you want?
Both of these sentences are syntactically correct, but only the first makes sense:
– a monkey ate a banana
– a banana climbs the tree
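A lexical analyzer for the arithmetic-expression grammar could be sketched with regular expressions as follows. The token class names (`INT`, `VAR`, `LPAREN`, and so on) and the function `tokenize` are assumptions made for this example; the slides do not name any token classes.

```python
import re

TOKEN_SPEC = [
    ("INT", r"[0-9]+"),
    ("VAR", r"[xyz][xyz0-9]*"),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("PLUS", r"\+"),
    ("TIMES", r"\*"),
    ("SKIP", r"\s+"),
]
PATTERN = re.compile("|".join(f"(?P<{name}>{regex})" for name, regex in TOKEN_SPEC))

def tokenize(text):
    """Turn a character string into a list of (token class, lexeme) pairs."""
    tokens, pos = [], 0
    while pos < len(text):
        match = PATTERN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":            # whitespace separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

print(tokenize("(x4 + 12)"))
print(tokenize("2y"))
```

The input `2y` tokenizes without complaint as an integer followed by a variable. Rejecting it is the parser's job, since no production allows those two tokens side by side.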

23
Backus-Naur Form (BNF)
A traditional meta-language to represent grammars for programming languages
– Every non-terminal is enclosed in < and >
– Instead of the symbol →, we use ::=
Example
I → L | ID | IL
L → a | b | … | z
D → 0 | 1 | … | 9
becomes
<I> ::= <L> | <I><D> | <I><L>
<L> ::= a | b | … | z
<D> ::= 0 | 1 | … | 9
WHY?
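The rewriting into BNF is mechanical, which a short sketch can show. The function name `to_bnf` is hypothetical, each character is treated as one symbol, and the letter and digit ranges are cut down to a few members to keep the output short.

```python
def to_bnf(productions, nonterminals):
    """Render productions as BNF: non-terminals in angle brackets, ::= for the arrow."""
    lines = []
    for lhs, alternatives in productions.items():
        rendered = [
            "".join(f"<{sym}>" if sym in nonterminals else sym for sym in alt)
            for alt in alternatives
        ]
        lines.append(f"<{lhs}> ::= " + " | ".join(rendered))
    return "\n".join(lines)

# Shortened version of the identifier grammar (only a few letters and digits).
grammar = {"I": ["L", "ID", "IL"], "L": ["a", "b", "c"], "D": ["0", "1"]}
bnf = to_bnf(grammar, {"I", "L", "D"})
print(bnf)
```

One benefit of the brackets is visible here: once non-terminals are marked, a terminal letter can no longer be confused with a non-terminal that happens to use the same character.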
