Languages: a language L over a finite alphabet Σ is a set of strings of characters from the alphabet.


1 Languages
A language L over a finite alphabet Σ is a set of strings of characters from the alphabet. Examples are:
1. L = { s | s is a grammatical English sentence } and Σ = { w | w is a word in the dictionary } (over a hundred thousand words, each treated as an atomic symbol).
2. L = { s | s is a string of digits } and Σ = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }.
3. L = { P | P is a string of ASCII characters forming a Java program } and Σ = the printable ASCII character set.

2 Grammars
A context-free grammar G is a 4-tuple G = ( V, Σ, P, S ) where:
1. V is a set of nonterminals (or string variables), each representing a sublanguage from which the variable takes its values. Examples are a variable for noun phrases, which can take on values such as "the big box", and T, which can take on string values used to represent products in an algebraic expression.
2. Σ is a finite alphabet containing the symbols from which language strings are formed. Examples are the English vocabulary (over a hundred thousand words, each treated as an atomic symbol), the printable ASCII character set, and the binary alphabet {0,1}.
3. P is a finite set of productions, or rules, used to define the sublanguages represented by the nonterminals. In a context-free grammar, a rule has the format A → X where A ∈ V and X ∈ ( V ∪ Σ )*. The interpretation is that the strings in the sublanguage represented by A can be constructed according to the format indicated by X: each terminal character in X appears as itself in the A string, and each variable in X is replaced by a string from that variable's sublanguage. An example is T → a * T.
4. S is a designated variable (referred to as the start symbol or the head of the language). It represents the language being defined by the grammar G.

3 Grammars and Derivations
Derivations: If u, v are strings in ( V ∪ Σ )*, A is in V, and A → X is in P, then uAv ⇒ uXv, read as uAv "derives" uXv by application of the rule A → X. For zero or more rule applications in sequence, the symbol ⇒* is used.
Language Definition: The language L(G) defined by G is { x | x ∈ Σ*, S ⇒* x }.
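The derivation relation can be made concrete with a small sketch. The grammar below (S → aSb | λ) is a hypothetical example chosen for brevity, not one from the slides, and the names `RULES`, `derive_step`, and `derives` are illustrative. The search is bounded, so it only approximates ⇒* up to a fixed number of steps.

```python
# Hypothetical grammar: S -> aSb | λ (λ is written as the empty string).
# Keys of RULES are the nonterminals; every other character is a terminal.
RULES = {"S": ["aSb", ""]}


def derive_step(form, index, rhs):
    """One derivation step uAv => uXv: replace the nonterminal at `index` by `rhs`."""
    nonterminal = form[index]
    if rhs not in RULES.get(nonterminal, []):
        raise ValueError(f"{nonterminal} -> {rhs!r} is not a rule")
    return form[:index] + rhs + form[index + 1:]


def derives(target, max_steps=10):
    """Bounded check of S =>* target: explore every form reachable in <= max_steps steps."""
    frontier = {"S"}
    for _ in range(max_steps + 1):
        if target in frontier:
            return True
        # Apply every applicable rule at every nonterminal position.
        frontier = {
            derive_step(form, i, rhs)
            for form in frontier
            for i, ch in enumerate(form) if ch in RULES
            for rhs in RULES[ch]
        }
    return False


print(derive_step("aSb", 1, "aSb"))  # aaSbb
print(derives("aabb"))               # True
print(derives("aab"))                # False
```

A breadth-first search like this is only practical for tiny grammars; the parsing slides that follow describe how to pick rules deterministically instead.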

4 Parsing
Given a grammar G with distinguished nonterminal S and a string X over the alphabet, does S ⇒* X? Parsing attempts to find a sequence of rules by which S ⇒* X.

5 Grammar for Decimal Numbers
I → d I
I → d
I → . D
D → d D
D → d
Parse tree for d d . d d d (figure not reproduced).
A parse tree has an intermediate node for each nonterminal, a child node for each RHS character in the production used to replace that nonterminal, and a leaf node for each character in the language string produced by the derivation. The language is the set of strings for which parse trees exist.
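As a rough illustration, the decimal-number grammar can be turned into a recursive recognizer with one function per nonterminal. This sketch assumes the third rule reads I → . D (the dot appears to have been lost from the slide text) and that d stands for any decimal digit. The function names are hypothetical.

```python
DIGITS = "0123456789"


def parse_D(s, pos):
    """D -> d D | d. Return the position after the digits, or None if none match."""
    if pos < len(s) and s[pos] in DIGITS:
        rest = parse_D(s, pos + 1)                 # try D -> d D
        return rest if rest is not None else pos + 1  # fall back to D -> d
    return None


def parse_I(s, pos):
    """I -> d I | d | . D. Return the position after the match, or None."""
    if pos < len(s) and s[pos] in DIGITS:
        rest = parse_I(s, pos + 1)                 # try I -> d I
        return rest if rest is not None else pos + 1  # fall back to I -> d
    if pos < len(s) and s[pos] == ".":
        return parse_D(s, pos + 1)                 # I -> . D (assumed rule)
    return None


def is_decimal(s):
    """A string is in the language when I consumes all of it."""
    return parse_I(s, 0) == len(s)


print(is_decimal("12.345"))  # True
print(is_decimal("1.2.3"))   # False
```

Each successful call corresponds to one node of the parse tree: the nesting of `parse_I` and `parse_D` calls mirrors the chain of I and D nodes on the slide.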

6 A Grammar for Sentences
S → NvP#
N → dAn
A → aA
A → λ
P → pN
Example derivation: S ⇒ NvP# ⇒ dAnvP# ⇒ daAnvP# ⇒ danvP# ⇒ danvpN# ⇒ danvpdAn# ⇒ danvpdn#
Sentence: The young woman went to the market.

7 A Grammar for Sentences
S → NVNP
S → NVa
N → dAn
N → dn
A → aA
A → a
V → mv
V → v
P → pN
Example derivation: S ⇒ NVa ⇒ dnVa ⇒ dnva
Sentence: The car is speeding, tagged as the (d), car (n), is (v), speeding (a).
Alphabet (or vocabulary): the terminal symbols d, a, n, v, m, p.

8 Top down Left to Right Parse
Repeat: select a rule to replace the leftmost nonterminal, choosing one whose right-hand side will ultimately generate a prefix of the remaining source.

9 Top down Left to Right Parse
The leftmost character of the sentential form is a nonterminal. Select the rule that rewrites it to [the] and click to "expand".

10 Top down Left to Right Parse

11 Lexemes and Tokens
A lexeme is a string of terminal characters belonging to some lexical class such as adjective, determiner, or noun. Examples are:
"young" - adjective - a
"the" - determiner - d
"woman" - noun - n
A token is a lexeme together with a syntactic or lexical code, such as a for the adjective "young".
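A minimal sketch of a lexical analyzer that maps lexemes to token codes follows. The `LEXICON` dictionary is a hypothetical stand-in for a real vocabulary. Its entries use the lexical classes that the slides attach to these words (d, a, n, v, p), and the function names are illustrative.

```python
# Hypothetical lexicon: lexeme -> lexical code
# (d = determiner, a = adjective, n = noun, v = verb, p = preposition).
LEXICON = {
    "the": "d", "young": "a", "woman": "n", "went": "v", "to": "p",
    "market": "n", "dog": "n", "bit": "v", "boy": "n", "in": "p", "leg": "n",
}


def tokenize(sentence):
    """Split a sentence into (lexeme, code) tokens; unknown words raise an error."""
    tokens = []
    for word in sentence.lower().rstrip(".").split():
        code = LEXICON.get(word)
        if code is None:
            raise ValueError(f"unknown lexeme: {word!r}")
        tokens.append((word, code))
    return tokens


def token_codes(sentence):
    """Return only the codes, i.e. the string the parser sees."""
    return "".join(code for _, code in tokenize(sentence))


print(token_codes("The young woman went to the market."))  # danvpdn
```

The parser then works on the code string alone, which is why the grammars on the earlier slides are written over d, a, n, v, and p rather than over English words.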

12 Finite state automata and language recognition
The finite state automaton has Σ = {d, ·}, start state S, and legal final states I and D. The transition function is represented by the state diagram (not reproduced) or by the table below:

State   d   ·
S       I   F
I       I   D
F       D   -
D       D   -

Accepts: ddd, d.dd, .ddd
Rejects: d.dd.d
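The automaton can be simulated with a dictionary-based transition table. This is a sketch under the assumption that the flattened table on the slide reads S: d→I, ·→F; I: d→I, ·→D; F: d→D; D: d→D, with every missing entry meaning rejection. That reading is consistent with the accepted and rejected strings listed on the slide. The names are illustrative.

```python
# Transition table as read from the slide (an assumed reconstruction).
TRANSITIONS = {
    ("S", "d"): "I", ("S", "."): "F",
    ("I", "d"): "I", ("I", "."): "D",
    ("F", "d"): "D",
    ("D", "d"): "D",
}
FINAL_STATES = {"I", "D"}


def fsa_accepts(s):
    """Run the automaton from start state S; accept if it ends in a final state."""
    state = "S"
    for ch in s:
        symbol = "d" if ch in "0123456789" else ch  # any digit counts as d
        state = TRANSITIONS.get((state, symbol))
        if state is None:                            # no transition: reject
            return False
    return state in FINAL_STATES


for text in ["ddd", "d.dd", ".ddd", "d.dd.d"]:
    print(text, fsa_accepts(text))
```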

13 Top down Left to Right Parse
LL(1) Parsing:
Start with the nonterminal representing the language as the unmatched sentential form.
Repeat until the source string has been generated or until failure:
  Let X be the leftmost character of the unmatched sentential form.
  If X is a terminal, it must match the first character of the remaining source (otherwise failure).
  If X is a nonterminal, then the rules for X must not overlap in the first character they can generate. Select the rule which generates (in one or more steps) a first character matching the next source character, and apply this rule.

14 Example Parse
Grammar:
1. S → NvNP
2. P → λ
3. P → pN
4. N → dAn
5. A → aA
6. A → λ
LL(1) parse table (not reproduced)

15 LL(1) Parsing
First: Define First(X) as the set of characters which can begin a string derived from grammar symbol X.
Follow: Define Follow(X) as the set of characters which can follow grammar symbol X in a string derived from the start symbol S.
Rules for First:
1. If X is a terminal, then First(X) = {X}.
2. If X is a nonterminal and X → λ, then add λ to First(X).
3. If X → X1 X2 .. Xk Xk+1 .. Xn with λ in First(Xi) for 1 <= i <= k (k may be 0), then add First(Xk+1) – {λ} to First(X); if λ is in First(Xi) for all 1 <= i <= n, add λ to First(X).
Rules for Follow:
1. $ is in Follow(S).
2. If A → αBβ with β <> λ, then add First(β) – {λ} to Follow(B).
3. If A → αB, or A → αBβ with λ in First(β), then add Follow(A) to Follow(B).
LL(1) parse table: Let T be a table with rows for nonterminals and columns for terminals.
1. If rule Ri is A → α and t is in First(α), then enter i in T(A,t).
2. If rule Ri is A → α, λ is in First(α), and t is in Follow(A), then enter i in T(A,t).

16 LL(1) Parsing – Computation of First & Follow
First:
Initialize First(A) to ∅ for each A ∈ N
Repeat
  Change = False
  For each rule
    If A → uXv, u ⇒* λ, with X ∈ N then Update( First(A), First(X) – {λ}, Change )
    Else If X ∈ T then Update( First(A), First(X), Change )
    Else If A → u, u ⇒* λ then Update( First(A), {λ}, Change )
Until Change = False
Follow:
Initialize Follow(A) to ∅ for each A ∈ N, Follow(S) = { # }
Repeat
  Change = False
  For each rule
    If A → uXYv with X ∈ N then
      If Y ∈ N then Update( Follow(X), First(Y) – {λ}, Change )
      Else If Y ∈ T then Update( Follow(X), {Y}, Change )
    And If A → uXv with v ⇒* λ then Update( Follow(X), Follow(A), Change )
Until Change = False
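The fixed-point computation can be sketched in Python for the six-rule grammar of slide 14. Several things here are assumptions of this sketch: the two rules with empty right-hand sides are read as λ-rules, "$" is used as the end marker (slide 15 uses $, slide 16 uses #), and the helper `first_of_string` handles a whole right-hand side at once rather than following the slide's pseudocode line by line. All names are illustrative.

```python
# Grammar of slide 14; λ is written as the empty string.
GRAMMAR = [("S", "NvNP"), ("P", ""), ("P", "pN"),
           ("N", "dAn"), ("A", "aA"), ("A", "")]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}
LAMBDA, END = "λ", "$"


def first_of_string(symbols, first):
    """First set of a string of grammar symbols, given the current First sets."""
    result = set()
    for x in symbols:
        fx = first[x] if x in NONTERMINALS else {x}
        result |= fx - {LAMBDA}
        if LAMBDA not in fx:       # x cannot vanish, so later symbols are hidden
            return result
    result.add(LAMBDA)             # every symbol (or no symbol) can derive λ
    return result


def compute_first():
    first = {a: set() for a in NONTERMINALS}
    changed = True
    while changed:                 # repeat until no set grows
        changed = False
        for lhs, rhs in GRAMMAR:
            new = first_of_string(rhs, first)
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


def compute_follow(first, start="S"):
    follow = {a: set() for a in NONTERMINALS}
    follow[start].add(END)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            for i, x in enumerate(rhs):
                if x not in NONTERMINALS:
                    continue
                tail = first_of_string(rhs[i + 1:], first)
                new = tail - {LAMBDA}
                if LAMBDA in tail:         # the rest can vanish: inherit Follow(lhs)
                    new |= follow[lhs]
                if not new <= follow[x]:
                    follow[x] |= new
                    changed = True
    return follow


def build_table(first, follow):
    """Map (nonterminal, lookahead) to a rule number, as in the slide-15 table rules."""
    table = {}
    for number, (lhs, rhs) in enumerate(GRAMMAR, start=1):
        f = first_of_string(rhs, first)
        lookahead = (f - {LAMBDA}) | (follow[lhs] if LAMBDA in f else set())
        for t in lookahead:
            if (lhs, t) in table:
                raise ValueError(f"grammar is not LL(1) at {(lhs, t)}")
            table[(lhs, t)] = number
    return table


FIRST = compute_first()
FOLLOW = compute_follow(FIRST)
TABLE = build_table(FIRST, FOLLOW)
print(sorted(TABLE.items()))
```

For this grammar the sketch yields First(P) = {p, λ}, First(A) = {a, λ}, Follow(N) = {v, p, $}, and Follow(A) = {n}, so each table cell holds at most one rule.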

17 Example of First & Follow LL(1) parse table

18 LL(1) parse – Example 2: the dog bit the young boy in the leg → dnvdanpdn (tokens generated by lexical analyzer)
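A table-driven LL(1) parse of this token string can be sketched as follows, using the grammar of slide 14. The parse table is hand-derived for this sketch from the First and Follow rules of slide 15, since the slide's own table is not available. The two empty right-hand sides are assumed to be λ-rules, and "$" is an assumed end marker. The names are illustrative.

```python
# Rules of slide 14 (λ written as the empty string).
RULES = {1: ("S", "NvNP"), 2: ("P", ""), 3: ("P", "pN"),
         4: ("N", "dAn"), 5: ("A", "aA"), 6: ("A", "")}
NONTERMINALS = {"S", "P", "N", "A"}

# Hand-derived LL(1) table: (nonterminal, lookahead) -> rule number.
TABLE = {("S", "d"): 1, ("P", "p"): 3, ("P", "$"): 2,
         ("N", "d"): 4, ("A", "a"): 5, ("A", "n"): 6}


def ll1_parse(tokens):
    """Return the rule numbers of the leftmost derivation, or None on failure."""
    source = tokens + "$"
    stack = ["$", "S"]            # unmatched sentential form, leftmost symbol on top
    pos = 0
    used = []
    while stack:
        top = stack.pop()
        look = source[pos]
        if top in NONTERMINALS:
            rule = TABLE.get((top, look))
            if rule is None:      # no rule can start with the next source character
                return None
            used.append(rule)
            stack.extend(reversed(RULES[rule][1]))
        elif top == look:         # terminal matches the next source character
            pos += 1
        else:
            return None
    return used if pos == len(source) else None


print(ll1_parse("dnvdanpdn"))  # [1, 4, 6, 4, 5, 6, 3, 4, 6]
```

The returned list names the rule applied at each expansion, in order. Replaying those rules on S, always at the leftmost nonterminal, reproduces the token string.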

