Languages
A language L over a finite alphabet Σ is a set of strings of characters from the alphabet. Examples are:
1. L = { s | s is a grammatical English sentence } and Σ = { w | w is a word in the dictionary } (consisting of over a hundred thousand words, each treated as an atomic symbol).
2. L = { s | s is a string of digits } and Σ = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }.
3. L = { P | P is a string of ASCII characters forming a Java program } and Σ = the printable ASCII character set.

Grammars
A context free grammar G is a 4-tuple G = ( V, Σ, P, S ) where:
1. V is a set of nonterminals (or string variables), each representing a sublanguage from which the variable takes its values. Examples are a variable for noun phrases, which can take on values such as “the big box”, and T, which can take on string values used to represent products in an algebraic expression.
2. Σ is a finite alphabet containing the symbols from which language strings are formed. Examples are the English vocabulary (consisting of over a hundred thousand words, each treated as an atomic symbol), the printable ASCII character set, and the binary alphabet {0, 1}.
3. P is a finite set of productions, or rules, used to define the sublanguages represented by the nonterminals. In a context free grammar, a rule has the format A → X where A ∈ V and X ∈ ( V ∪ Σ )*. The interpretation is that the strings in the sublanguage represented by A can be constructed according to the format indicated by X: each terminal character in X is used as is in the A string, and for each variable in X a string from that variable's sublanguage is substituted. An example is T → a * T.
4. S is a designated variable (referred to as the start symbol or the head of the language). It represents the language being defined by the grammar G.

Grammars and Derivations
Derivations: if u, v are strings in ( V ∪ Σ )*, A is in V, and A → X is in P, then uAv ⇒ uXv, read as uAv “derives” uXv by application of the rule A → X. For repeated application of zero or more rules, the symbol ⇒* is used.
Language definition: the language L(G) defined by G is { x | x ∈ Σ*, S ⇒* x }.
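The derivation step uAv ⇒ uXv can be sketched in a few lines of Python. This is only an illustration under stated assumptions: nonterminals are single characters, the rule T → a*T is the example from the grammar definition, and the rule T → a is a hypothetical addition so that the derivation can terminate. The names `RULES`, `derive_step` and `derive` are made up for this sketch.

```python
# Hypothetical two-rule grammar: T -> a*T comes from the text,
# T -> a is an assumed extra rule so that derivations can end.
RULES = {("T", "a*T"), ("T", "a")}


def derive_step(form, lhs, rhs):
    """Rewrite uAv as uXv using the rule A -> X on the leftmost A."""
    if (lhs, rhs) not in RULES:
        raise ValueError(f"{lhs} -> {rhs} is not a rule of the grammar")
    i = form.find(lhs)
    if i < 0:
        raise ValueError(f"{lhs!r} does not occur in {form!r}")
    return form[:i] + rhs + form[i + len(lhs):]


def derive(start, steps):
    """Apply a sequence of rules and return every sentential form."""
    forms = [start]
    for lhs, rhs in steps:
        forms.append(derive_step(forms[-1], lhs, rhs))
    return forms


forms = derive("T", [("T", "a*T"), ("T", "a*T"), ("T", "a")])
print(" => ".join(forms))  # prints: T => a*T => a*a*T => a*a*a
```

The last form contains no nonterminal, so under these assumptions a*a*a is a string of L(G) with T as the start symbol.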

Parsing
Given a grammar G with distinguished nonterminal S and a string X over the alphabet, does S ⇒* X? Parsing attempts to find a sequence of rules by which S ⇒* X.

Grammar for Decimal Numbers
I → d I
I → d
I → . D
D → d D
D → d
[Figure: parse tree for d d . d d d]
A parse tree has an intermediate node for each nonterminal, a child node for each RHS character in the production used to replace that nonterminal, and a leaf node for each character in the language string produced by the derivation. The language is the set of strings for which there exist parse trees.

A Grammar for Sentences
S → NvP#
N → dAn
A → aA
A → λ
P → pN
Example derivation: S ⇒ NvP# ⇒ dAnvP# ⇒ daAnvP# ⇒ danvP# ⇒ danvpN# ⇒ danvpdAn# ⇒ danvpdn#
Example sentence: The young woman went to the market.

A Grammar for Sentences
S → NVNP
S → NVa
N → dAn
N → dn
A → aA
A → a
V → mv
V → v
P → pN
Example derivation: S ⇒ NVa ⇒ dnVa ⇒ dnva
Example sentence: The car is speeding, tokenized as the/d car/n is/v speeding/a.
Alphabet or vocabulary: the token codes d, n, v, a, m, p.

Top down Left to Right Parse
Repeat: select a rule to replace the leftmost nonterminal, choosing a rule whose right hand side will ultimately generate a prefix of the remaining source.

Top down Left to Right Parse
Example step: the leftmost character of the sentential form is a nonterminal. Select the rule whose right hand side is [the] to expand it.

Lexemes and Tokens
A lexeme is a string of terminal characters belonging to some lexical class such as adjective, determiner, noun, etc. Examples are:
“young”: adjective (a)
“the”: determiner (d)
“woman”: noun (n)
A token is a lexeme paired with a syntactic or lexical code.
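A lexical analyzer of this kind can be sketched as a dictionary lookup. The lexicon below is a hypothetical one assembled for illustration from the words used in this presentation's example sentences; the single-letter codes (d, a, n, v, p) follow the sentence grammars, and the names `LEXICON`, `tokenize` and `token_string` are assumptions of the sketch.

```python
# Hypothetical lexicon: lexeme -> lexical code used by the sentence grammars.
LEXICON = {
    "the": "d",
    "young": "a", "speeding": "a",
    "woman": "n", "market": "n", "car": "n",
    "dog": "n", "boy": "n", "leg": "n",
    "went": "v", "is": "v", "bit": "v",
    "to": "p", "in": "p",
}


def tokenize(sentence):
    """Return a list of (lexeme, code) tokens for a sentence."""
    tokens = []
    for word in sentence.lower().split():
        lexeme = word.strip(".,;!?")  # drop trailing punctuation
        if lexeme not in LEXICON:
            raise KeyError(f"unknown lexeme: {lexeme!r}")
        tokens.append((lexeme, LEXICON[lexeme]))
    return tokens


def token_string(sentence):
    """Concatenate the token codes, the form the parser consumes."""
    return "".join(code for _, code in tokenize(sentence))


print(token_string("The young woman went to the market."))  # prints: danvpdn
```

A real lexical analyzer would also have to handle words that belong to more than one lexical class; this sketch assumes each lexeme has exactly one code.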

Finite state automata and language recognition
[Figure: state diagram with states S, I, F, D and transitions on d and .]
The finite state automaton has Σ = { d, . }, start state S, and legal final states I and D. The transition function is represented by the diagram above or the table below:
State   d   .
S       I   F
I       I   D
F       D   -
D       D   -
Accepts: ddd, d.dd, .ddd
Rejects: d.dd.d
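The transition table can be run directly as a dictionary-driven recognizer. The following is a minimal sketch, assuming the table as read from the slide (missing entries mean the automaton rejects); the names `TRANSITIONS`, `FINAL_STATES` and `accepts` are chosen for this example.

```python
# Transition function as read from the table: (state, character) -> next state.
# Pairs that are absent have no transition, so the input is rejected.
TRANSITIONS = {
    ("S", "d"): "I", ("S", "."): "F",
    ("I", "d"): "I", ("I", "."): "D",
    ("F", "d"): "D",
    ("D", "d"): "D",
}
FINAL_STATES = {"I", "D"}


def accepts(string, start="S"):
    """Return True if the automaton ends in a final state on this input."""
    state = start
    for ch in string:
        state = TRANSITIONS.get((state, ch))
        if state is None:  # no transition defined: reject
            return False
    return state in FINAL_STATES


for s in ["ddd", "d.dd", ".ddd", "d.dd.d"]:
    print(s, accepts(s))
```

Running it reports True for ddd, d.dd and .ddd, and False for d.dd.d, matching the accept and reject lists on the slide.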

Top down Left to Right Parse
LL(1) Parsing:
Start with the nonterminal representing the language as the unmatched sentential form.
Repeat until the source string has been generated or until failure:
  Let X be the leftmost character of the unmatched sentential form.
  If X is a terminal, it must match the first character of the remaining source (otherwise failure).
  If X is a nonterminal, then the rules for X must not overlap in the first character they can generate. Select the rule which generates (in one or more steps) a first character matching the next source character, and apply this rule.

Example Parse
Grammar:
1. S → NvNP
2. P → λ
3. P → pN
4. N → dAn
5. A → aA
6. A → λ
[Figure: LL(1) parse table]

LL(1) Parsing
Definitions:
First(X) is the set of characters which can begin a string derived from grammar symbol X.
Follow(X) is the set of characters which can follow grammar symbol X in a string derived from the start symbol S.
First:
1. If X is a terminal, then First(X) = {X}.
2. If X is a nonterminal and X → λ, then add λ to First(X).
3. If X → X1 X2 ... Xk Xk+1 ... Xn with λ in First(Xi) for 1 <= i <= k (including k = 0, which covers X1), then add First(Xk+1) - {λ} to First(X); if λ is in First(Xi) for all 1 <= i <= n, add λ to First(X).
Follow:
1. $ is in Follow(S).
2. If A → αBβ with β <> λ, then add First(β) - {λ} to Follow(B).
3. If A → αB, or A → αBβ with λ in First(β), then add Follow(A) to Follow(B).
LL(1) parse table:
Let T be a table with rows for nonterminals and columns for terminals (and the end marker $).
1. If rule Ri is A → α and t is in First(α), then enter i in T(A,t).
2. If rule Ri is A → α, λ is in First(α), and t is in Follow(A), then enter i in T(A,t).
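The two table-filling rules can be sketched as follows for the six-rule example grammar. Several things here are assumptions rather than content of the slides: rules 2 and 6 are read as λ rules, the First and Follow sets are worked out by hand from the definitions above, the empty string stands for λ, and the names `RULES`, `FIRST`, `FOLLOW`, `first_of_string` and `build_table` are invented for the sketch.

```python
LAMBDA = ""  # the empty string stands for lambda

# Example grammar, rules numbered from 1 (rules 2 and 6 assumed to be lambda rules).
RULES = [("S", "NvNP"), ("P", ""), ("P", "pN"),
         ("N", "dAn"), ("A", "aA"), ("A", "")]

# Hand-computed sets for this grammar (an assumption of this sketch).
FIRST = {"S": {"d"}, "N": {"d"}, "P": {"p", LAMBDA}, "A": {"a", LAMBDA}}
FOLLOW = {"S": {"$"}, "N": {"v", "p", "$"}, "P": {"$"}, "A": {"n"}}


def first_of_string(alpha):
    """First set of a sequence of grammar symbols."""
    result = set()
    for symbol in alpha:
        symbol_first = FIRST.get(symbol, {symbol})  # terminal t: First(t) = {t}
        result |= symbol_first - {LAMBDA}
        if LAMBDA not in symbol_first:
            return result
    result.add(LAMBDA)  # every symbol (or none at all) can derive lambda
    return result


def build_table():
    """Fill T(A, t) with rule numbers; raise if the grammar is not LL(1)."""
    table = {}
    for number, (lhs, rhs) in enumerate(RULES, start=1):
        first = first_of_string(rhs)
        lookaheads = first - {LAMBDA}
        if LAMBDA in first:
            lookaheads |= FOLLOW[lhs]
        for t in lookaheads:
            if (lhs, t) in table:
                raise ValueError(f"conflict at T({lhs},{t}): not LL(1)")
            table[(lhs, t)] = number
    return table


for (nonterminal, terminal), rule in sorted(build_table().items()):
    print(f"T({nonterminal},{terminal}) = {rule}")
```

Under these assumptions the table has six entries: T(S,d) = 1, T(P,$) = 2, T(P,p) = 3, T(N,d) = 4, T(A,a) = 5 and T(A,n) = 6. No cell receives two rules, which is the condition for the grammar to be LL(1).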

LL(1) Parsing – Computation of First & Follow
Here N is the set of nonterminals, T is the set of terminals, and Update( S1, S2, Change ) adds the elements of S2 to S1 and sets Change to True if S1 grows.
First:
Initialize First( A ) to ∅ for each A ∈ N
Repeat
  Change = False
  For each rule A → w
    If w = uXv with u ⇒* λ and X ∈ N then Update( First( A ), First( X ) - {λ}, Change )
    Else If w = uXv with u ⇒* λ and X ∈ T then Update( First( A ), First( X ), Change )
    If w ⇒* λ then Update( First( A ), {λ}, Change )
Until Change = False
Follow:
Initialize Follow( A ) to ∅ for each A ∈ N, then set Follow( S ) = { # }
Repeat
  Change = False
  For each rule A → uXv with X ∈ N
    If v = Yw with Y ∈ N then Update( Follow( X ), First( Y ) - {λ}, Change )
    Else If v = Yw with Y ∈ T then Update( Follow( X ), {Y}, Change )
    If v ⇒* λ then Update( Follow( X ), Follow( A ), Change )
Until Change = False
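The two fixed-point loops can be written out in Python roughly as below. This is a sketch under assumptions, not the presentation's own code: grammar symbols are single characters, the empty string stands for λ, # is the end marker, and the Follow loop scans the whole remainder of the right hand side so that a nullable symbol after X is skipped over. The function names and the test grammar (the six-rule example, with its two empty rules read as λ rules) are choices made for this example.

```python
LAMBDA = ""  # the empty string stands for lambda
END = "#"    # end marker placed in Follow of the start symbol


def compute_first(rules, nonterminals):
    """Iterate until no First set grows (the Repeat ... Until loop)."""
    first = {a: set() for a in nonterminals}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            before = len(first[lhs])
            all_nullable = True
            for x in rhs:
                x_first = first[x] if x in nonterminals else {x}
                first[lhs] |= x_first - {LAMBDA}
                if LAMBDA not in x_first:
                    all_nullable = False
                    break
            if all_nullable:  # the whole right hand side derives lambda
                first[lhs].add(LAMBDA)
            changed |= len(first[lhs]) != before
    return first


def compute_follow(rules, nonterminals, start, first):
    """Iterate until no Follow set grows."""
    follow = {a: set() for a in nonterminals}
    follow[start].add(END)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            for i, x in enumerate(rhs):
                if x not in nonterminals:
                    continue
                before = len(follow[x])
                rest_nullable = True
                for y in rhs[i + 1:]:
                    y_first = first[y] if y in nonterminals else {y}
                    follow[x] |= y_first - {LAMBDA}
                    if LAMBDA not in y_first:
                        rest_nullable = False
                        break
                if rest_nullable:  # everything after x can vanish
                    follow[x] |= follow[lhs]
                changed |= len(follow[x]) != before
    return follow


RULES = [("S", "NvNP"), ("P", ""), ("P", "pN"),
         ("N", "dAn"), ("A", "aA"), ("A", "")]
NONTERMINALS = {"S", "P", "N", "A"}
FIRST = compute_first(RULES, NONTERMINALS)
FOLLOW = compute_follow(RULES, NONTERMINALS, "S", FIRST)
print(FIRST)
print(FOLLOW)
```

For this grammar the loops settle on First(S) = First(N) = {d}, First(P) = {p, λ}, First(A) = {a, λ}, and Follow(S) = Follow(P) = {#}, Follow(N) = {v, p, #}, Follow(A) = {n}.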

Example of First & Follow LL(1) parse table

LL(1) parse – Example 2
the dog bit the young boy in the leg → dnvdanpdn (tokens generated by the lexical analyzer)
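A table-driven LL(1) parse of this token string might look like the sketch below. The parse table is an assumption: it was filled in by hand for the six-rule example grammar (reading its two empty rules as λ rules), since the table figure itself is not part of this transcript. The end marker $ and the names `RULES`, `TABLE` and `ll1_parse` are also choices made for this example.

```python
END = "$"  # end marker appended to the token string

# Example grammar, keyed by rule number; "" stands for lambda.
RULES = {1: ("S", "NvNP"), 2: ("P", ""), 3: ("P", "pN"),
         4: ("N", "dAn"), 5: ("A", "aA"), 6: ("A", "")}
NONTERMINALS = {"S", "P", "N", "A"}

# Hand-derived parse table (an assumption): (nonterminal, lookahead) -> rule.
TABLE = {("S", "d"): 1, ("P", END): 2, ("P", "p"): 3,
         ("N", "d"): 4, ("A", "a"): 5, ("A", "n"): 6}


def ll1_parse(tokens, start="S"):
    """Return the rule numbers of a leftmost derivation of tokens."""
    source = tokens + END
    pos = 0
    stack = [start]  # unmatched sentential form, leftmost symbol on top
    applied = []
    while stack:
        x = stack.pop()
        lookahead = source[pos]
        if x in NONTERMINALS:
            rule = TABLE.get((x, lookahead))
            if rule is None:
                raise SyntaxError(f"no rule for {x} on {lookahead!r} at {pos}")
            applied.append(rule)
            stack.extend(reversed(RULES[rule][1]))  # leftmost symbol ends on top
        elif x == lookahead:
            pos += 1  # terminal matches the next source character
        else:
            raise SyntaxError(f"expected {x!r}, found {lookahead!r} at {pos}")
    if source[pos] != END:
        raise SyntaxError(f"unconsumed input starting at {pos}")
    return applied


print(ll1_parse("dnvdanpdn"))  # prints: [1, 4, 6, 4, 5, 6, 3, 4, 6]
```

Read left to right, the rule numbers give the leftmost derivation S ⇒ NvNP ⇒ dAnvNP ⇒ dnvNP ⇒ dnvdAnP ⇒ dnvdaAnP ⇒ dnvdanP ⇒ dnvdanpN ⇒ dnvdanpdAn ⇒ dnvdanpdn.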
