
1 Chapter 2 Language & Syntax Description, Section 1 Alphabet & String
1. Alphabet: a non-empty set of symbols, usually written as Σ, V, or another uppercase Greek letter.
2. Symbol (character): an element of the alphabet; the smallest element of a language.
3. String: a finite sequence of symbols from the alphabet.
Notes: The null string is the string without any symbol, written as ε.

2 4. Sentence: a string built from the symbols of the alphabet according to certain construction rules.
5. Language: a set of sentences over the alphabet.
Notes: By convention, a symbol is written as a, b, c, …; a string is written as α, β, γ, …; a set of strings is written as A, B, C, ….

3 6. Operations on sets of strings
1) Concatenation (product)
Let the string sets be A = {α1, α2, …} and B = {β1, β2, …}. The (Cartesian) product AB is defined as AB = {αβ | α ∈ A and β ∈ B}.
Notes:
1) The product of a string set with itself is called a power of the string set.
2) A^0 = {ε}.
3) The n-th power of an alphabet A is the set of all strings of length n.
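The product and power operations can be tried out directly on small sets. The following is a minimal sketch, assuming strings are ordinary Python strings and the empty string '' stands for ε; the function names concat and power are illustrative only.

```python
from itertools import product

def concat(A, B):
    """Product AB = {αβ | α ∈ A and β ∈ B}."""
    return {a + b for a, b in product(A, B)}

def power(A, n):
    """A^n, built by repeated product starting from A^0 = {ε}."""
    result = {""}              # '' plays the role of ε
    for _ in range(n):
        result = concat(result, A)
    return result

print(sorted(concat({"a", "ab"}, {"b", "bb"})))  # ['ab', 'abb', 'abbb']
print(sorted(power({"a", "b"}, 2)))              # ['aa', 'ab', 'ba', 'bb']
```

Note that the product is a set, so "a"+"bb" and "ab"+"b" contribute the single string "abb".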

4 6. Operations on sets of strings
2) Closure and positive closure
a) Closure: A* = A^0 ∪ A^1 ∪ A^2 ∪ …, the set of all strings over the alphabet A, including the null string ε.
b) Positive closure: A+ = A^1 ∪ A^2 ∪ … = A* − {ε}.
Notes: A language is a subset of the closure A* of the alphabet (a subset of the positive closure A+ if the null string is excluded).
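Because A* is infinite, it can only be enumerated up to a bound. The sketch below, whose function name and max_power parameter are assumptions made for illustration, collects A^0 ∪ A^1 ∪ … ∪ A^k, or the same union without A^0 for the positive closure.

```python
def closure_up_to(A, max_power, positive=False):
    """Union of A^k for k = 0..max_power (k = 1..max_power if positive)."""
    result = set()
    current = {""}             # A^0 = {ε}, with '' standing for ε
    if not positive:
        result |= current
    for _ in range(max_power):
        current = {x + y for x in current for y in A}   # next power of A
        result |= current
    return result

print(sorted(closure_up_to({"0", "1"}, 2)))
# ['', '0', '00', '01', '1', '10', '11']
```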

5 Chapter 2 Language & Syntax Description, Section 2 Grammar & Language
1. Basic concepts
a. Grammar: the formal production rules that describe how syntax elements are constructed.
Notes:
1) Syntax elements include sentences and the words in sentences; a language is composed of sentences.
2) A production rule has the form left-side → right-side, read as "left-side is defined as right-side", "left-side derives right-side", or "left-side produces right-side". It expresses the relation between the two sides.

6 b. Non-terminal symbol
– A symbol that appears on the left side of a rule, is enclosed in <>, and expresses a syntax concept.
– The set of non-terminal symbols is written V_N.
c. Terminal symbol
– A string of the language that cannot be decomposed further (including single-character strings). The set of terminal symbols is written V_T.
Notes: Terminal symbols are the basic elements of a sentence.

7 d. Start symbol
– A special non-terminal symbol that is the core of the syntax being defined.
Notes: The start symbol is also called the "identified symbol".
e. Production
– A rule that defines a relation between strings. Form: A → α (A produces α).

8 f. Derivation
– The process that starts from the start symbol and derives a sentence by repeatedly replacing the left side of a production rule with its right side.
– Leftmost (rightmost) derivation: at every step apply exactly one production rule, replacing the leftmost (rightmost) non-terminal symbol with the right side of the rule.
Notes: The rightmost derivation is conventionally called the canonical derivation.

9 g. Reduction
– Reduction is the inverse of derivation: starting from a given sentence of the language, repeatedly replace the right side of a production rule with its left side until the start symbol is reached.
– Leftmost (rightmost) reduction is the inverse of rightmost (leftmost) derivation.
Notes: The leftmost reduction is conventionally called the canonical reduction.

10 h. Sentential form, sentence & language
Sentential form
– A string α produced from the start symbol by zero or more derivation steps, written S ⇒* α, α ∈ (V_N ∪ V_T)*.
Sentence
– A sentential form that contains only terminal symbols.
Language
– The set of sentences (strings) produced from S by one or more derivation steps, written L(G): L(G) = {α | S ⇒+ α, and α ∈ V_T*}.

11 i. Recursive definition of grammar rules
– A rule is recursive when the non-terminal symbol being defined also appears in its own definition.
Notes: Take care when defining a grammar recursively. You must provide an exit case (a non-recursive alternative) for the recursion; otherwise no sentence can ever be derived.

12 j. Extended notations of grammar rules
Use extended BNF (Backus-Naur Form) notations:
– () : factor out a common part. E.g. U → ax|ay|az is rewritten as U → a(x|y|z).
– {} : repetition, with the bounds on the repeat count attached to the braces. E.g. <…> → <…>{<…>|<…>} with upper bound 5 and lower bound 0.
– [] : optional symbol. E.g. <…> → [+|-]<…>{<…>}

13 k. Meta-language symbol
The symbols used to describe the relations between grammar symbols, e.g. "→" and "|", are called meta-language symbols.

14 2. Formal definition
a. Grammar definition: a grammar G is defined as a quadruple (V_N, V_T, P, S), where V_N is the set of non-terminal symbols, V_T the set of terminal symbols, P the set of productions, and S the start symbol.
b. Classification of grammars: according to the restrictions placed on the production rules, grammars are classified into four types: 0-type, 1-type, 2-type and 3-type grammars.

15 b. Classification of grammars
(1) 0-type grammar (phrase-structure grammar, or unrestricted grammar)
– For every production α → β in P, α ∈ V+ and β ∈ V*, and α contains at least one non-terminal symbol. Here V = V_N ∪ V_T.
Notes:
• The automaton that recognizes a 0-type language is the Turing machine.
• The 0-type grammar places the fewest restrictions on its productions.
• The other grammar types are obtained by restricting the form of the productions of a 0-type grammar.

16 (2) 1-type grammar (context-sensitive grammar, or length-increasing grammar)
– Every production α → β in P satisfies |β| >= |α|, except for S → ε. If S → ε is in P, S must not appear on the right side of any production.
– Equivalently, every production in P has the form αAβ → αγβ (where α, β ∈ V*, A ∈ V_N, γ ∈ V+), except for S → ε.
Notes:
• The automaton that recognizes a 1-type language is the linear bounded automaton (LBA).
• In a 1-type grammar, the context of a non-terminal symbol must be considered when the symbol is replaced. A non-terminal symbol cannot be replaced by ε, except that the start symbol may produce ε.

17 (3) 2-type grammar (context-free grammar)
– Every production in P has the form A → β, where A ∈ V_N, β ∈ V*.
Notes:
• The left side of each production must be a single non-terminal symbol; the right side may be any string of symbols from V_N and V_T, or ε.
• The automaton that recognizes a 2-type language is the pushdown automaton (PDA).

18 (4) 3-type grammar (regular grammar: right-linear grammar or left-linear grammar)
– Every production in P has the form A → αB, A → α (right-linear), or A → Bα, A → α (left-linear), where A, B ∈ V_N, α ∈ V_T*.
Notes:
• The productions of a 3-type grammar are either all right-linear or all left-linear; the two kinds cannot be mixed in one grammar. If all the productions are left-linear, the grammar is called a left-linear grammar. If all the productions are right-linear, it is called a right-linear grammar.

19 (4) 3-type grammar (continued)
Notes:
• The automaton that recognizes a 3-type language is the finite state automaton.
• 2-type grammar = self-embedding grammar (with productions of the form S → aSb) + regular grammar; that is, any 2-type grammar without the self-embedding property is equivalent to a regular grammar.

20 b. Classification of grammars (summary)
Hierarchy | Alias | Production form | Automaton name
0-type | Grammar without limitation | α → β, α ∈ V+ | Turing machine
1-type | Context-sensitive grammar | αAβ → αγβ, A ∈ V_N | Linear bounded automaton
2-type | Context-free grammar | A → β, A ∈ V_N | Pushdown automaton
3-type | Regular grammar | A → αB, A → α, A, B ∈ V_N, α ∈ V_T* | Finite automaton
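The restrictions in the hierarchy can be checked mechanically. The following is a rough sketch, not a complete validator: it assumes every symbol is a single character, that uppercase letters are non-terminals, and that a production is a (left side, right side) pair of strings with '' for ε. The name grammar_type is hypothetical, and the check that a 0-type left side contains a non-terminal is omitted.

```python
def is_nonterminal(symbol):
    # Assumption: uppercase letters are non-terminals, everything else is terminal.
    return symbol.isupper()

def grammar_type(productions, start="S"):
    """Return the largest i in {0, 1, 2, 3} whose restrictions all productions meet."""
    def right_linear(rhs):   # α or αB with α ∈ V_T*
        return not any(is_nonterminal(c) for c in rhs[:-1])

    def left_linear(rhs):    # α or Bα with α ∈ V_T*
        return not any(is_nonterminal(c) for c in rhs[1:])

    rights = [r for _, r in productions]
    # 2-type and 3-type: every left side is a single non-terminal.
    if all(len(l) == 1 and is_nonterminal(l) for l, _ in productions):
        if all(map(right_linear, rights)) or all(map(left_linear, rights)):
            return 3
        return 2

    start_on_right = any(start in r for r in rights)

    def non_contracting(lhs, rhs):
        if lhs == start and rhs == "":
            return not start_on_right   # S → ε only if S never occurs on a right side
        return len(rhs) >= len(lhs)

    if all(non_contracting(l, r) for l, r in productions):
        return 1
    return 0

print(grammar_type([("S", "aS"), ("S", "a"), ("S", "b")]))   # 3
print(grammar_type([("S", "aSb"), ("S", "ab")]))             # 2
print(grammar_type([("S", "aA"), ("A", "Sb"), ("A", "b")]))  # 2
```

The third call mixes a right-linear production with a left-linear one, so the grammar is reported as 2-type rather than 3-type.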

21 c. i-type language
– The language produced by an i-type grammar G, written L(G): L(G) = {α | α ∈ V_T*, and S ⇒+ α}.

22 Example: Let G1 = ({S}, {a, b}, P, S), where P includes:
(0) S → aS
(1) S → a
(2) S → b
L(G1) = {a^i (a|b) | i >= 0}
Example: Let G2 = ({S}, {a, b}, P, S), where P includes:
(0) S → aSb
(1) S → ab
L(G2) = {a^n b^n | n >= 1}
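The two languages can be checked by enumerating short sentences. This is a minimal sketch under stated assumptions: symbols are single characters, non-terminals are the keys of the productions dictionary, and no production shortens a sentential form (true for G1 and G2), so forms longer than the bound can be discarded. The function name sentences is illustrative.

```python
from collections import deque

def sentences(productions, start, max_len):
    """Enumerate sentences of length <= max_len using leftmost derivations."""
    found, seen = set(), {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Position of the leftmost non-terminal in the sentential form, if any.
        pos = next((i for i, c in enumerate(form) if c in productions), None)
        if pos is None:
            found.add(form)            # only terminals left: a sentence
            continue
        for rhs in productions[form[pos]]:
            new = form[:pos] + rhs + form[pos + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return found

G1 = {"S": ["aS", "a", "b"]}
G2 = {"S": ["aSb", "ab"]}
print(sorted(sentences(G1, "S", 3)))  # ['a', 'aa', 'aaa', 'aab', 'ab', 'b']
print(sorted(sentences(G2, "S", 6)))  # ['aaabbb', 'aabb', 'ab']
```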

23 Notes: The grammars used in lexical analysis and syntax analysis place the following restrictions on productions:
– There is no production of the form P → P; such a production is useless and only introduces ambiguity.
– Every non-terminal symbol P must be reachable and must derive a terminal string. That is, starting from the start symbol S there is a derivation S ⇒* αPβ, and P ⇒+ γ with γ ∈ V_T*.

24 Chapter 2 Language & Syntax Description, Section 3 Grammar construction and simplification
1. Constructing a grammar from a language
Example 1: Let L1 = {a^(2n) b^n | n >= 1 and a, b ∈ V_T}. Construct the grammar G1 for L1.
n = 1: aab
n = 2: aaaabb
n = 3: aaaaaabbb
……
So we have:
S → aaSb
S → aab

25 Example 2: Let L2 = {a^i b^j c^k | i, j, k >= 1 and a, b, c ∈ V_T}. Construct the grammar G2 for L2.
S → aS
S → aB
B → bB
B → bC
C → cC | c

26 Example 3: Let L3 = {ω | ω ∈ {a, b}* and ω contains as many a's as b's}. Construct the grammar G3 for L3.
One solution:
S → ε
S → bB, S → aA
A → bS | b, A → aAA
B → aS | a | bBB
Another solution:
(0) S → ε
S → aSbS
S → bSaS

27 Example 4: Let L4 = {ω | ω ∈ {0, 1}* and the number of 1's in ω is even}. Construct the grammar G4 for L4.
S → ε
S → 0S, S → 1A
A → 0A, A → 1S
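Because G4 is right-linear, it can be read as a finite automaton: each non-terminal becomes a state, each production X → cY becomes a transition, and S is accepting because of S → ε. The sketch below follows that reading; the function name and the dictionary encoding are assumptions made for illustration.

```python
def accepts_even_ones(word):
    """Simulate G4 as a finite automaton with states S and A."""
    transitions = {
        ("S", "0"): "S",   # S → 0S
        ("S", "1"): "A",   # S → 1A
        ("A", "0"): "A",   # A → 0A
        ("A", "1"): "S",   # A → 1S
    }
    state = "S"
    for ch in word:
        if (state, ch) not in transitions:
            return False   # symbol outside the alphabet {0, 1}
        state = transitions[(state, ch)]
    return state == "S"    # only S has an ε-production

print(accepts_even_ones("0110"))   # True
print(accepts_even_ones("10101"))  # False
```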

28 2. Grammar simplification
a. Because a language can be described by different grammars, we should select the grammar that has the fewest productions and best suits the properties of the language.
b. A grammar may contain redundant productions that are useless for derivation. These should be deleted:
– a production of the form P → P;
– a production that can never derive a terminal string;
– a production whose left-side non-terminal symbol is not the start symbol and does not appear on the right side of any production.

29 c. Steps of simplification:
– Find the productions of the form P → P and delete them;
– If a production can never be used in a derivation from the start symbol, delete it;
– If a production cannot derive a terminal string, delete it;
– Tidy up and renumber the remaining productions.

30 Example: Simplify the following grammar
(0) S → Be
(1) S → Ec
(2) A → Ae
(3) A → e
(4) A → A
(5) B → Ce
(6) B → Af
(7) C → Cf
(8) D → f
Result:
(0) S → Be
(1) A → Ae
(2) A → e
(3) B → Af
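The simplification can be sketched in code as follows. This is one possible implementation under stated assumptions: symbols are single characters, uppercase letters are non-terminals, and productions are (left, right) pairs. It removes productions that cannot derive a terminal string before removing unreachable ones, an ordering chosen here so that a single pass of each step suffices.

```python
def simplify(productions, start="S"):
    """Remove P → P, non-terminating, and unreachable productions."""
    # Step 1: drop productions of the form P → P.
    prods = [(l, r) for l, r in productions if l != r]

    def usable(rhs, generating):
        return all(c in generating or not c.isupper() for c in rhs)

    # Step 2: find the non-terminals that can derive a terminal string.
    generating, changed = set(), True
    while changed:
        changed = False
        for l, r in prods:
            if l not in generating and usable(r, generating):
                generating.add(l)
                changed = True
    prods = [(l, r) for l, r in prods if l in generating and usable(r, generating)]

    # Step 3: keep only productions reachable from the start symbol.
    reachable, stack = {start}, [start]
    while stack:
        current = stack.pop()
        for l, r in prods:
            if l == current:
                for c in r:
                    if c.isupper() and c not in reachable:
                        reachable.add(c)
                        stack.append(c)
    return [(l, r) for l, r in prods if l in reachable]

grammar = [("S", "Be"), ("S", "Ec"), ("A", "Ae"), ("A", "e"), ("A", "A"),
           ("B", "Ce"), ("B", "Af"), ("C", "Cf"), ("D", "f")]
print(simplify(grammar))
# [('S', 'Be'), ('A', 'Ae'), ('A', 'e'), ('B', 'Af')]
```

The output matches the result on the slide: E and C never derive a terminal string, A → A is of the form P → P, and D is unreachable.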

31 3. Constructing a context-free grammar without ε-productions
a. A context-free grammar without ε-productions satisfies the following conditions:
– If P contains the production S → ε, where S is the start symbol of the grammar, then S does not appear on the right side of any production;
– P contains no other ε-productions.
b. The algorithm for constructing a context-free grammar without ε-productions, transforming G = (V_N, V_T, P, S) into G' = (V'_N, V'_T, P', S'):
(1) Find all non-terminal symbols that can derive ε in some number of steps, and put them into the set V_0.

32 b. The algorithm for constructing a context-free grammar without ε-productions (continued):
(2) Construct the production set P' of G' as follows:
(A) If a symbol in V_0 appears on the right side of a production, split the production into two: one in which the symbol is replaced by ε and one in which the symbol is kept. Put the new productions (other than ε-productions) into P'.
(B) Otherwise, put the production into P' unchanged, unless it is an ε-production.
(C) If P contains the production S → ε, add the productions S' → ε | S to P', let S' be the start symbol of G', and let V'_N = V_N ∪ {S'}.

33 Example: Let G1 = ({S}, {a, b}, P, S), where P:
(0) S → ε
(1) S → aSbS
(2) S → bSaS
(1) V_0 = {S}
(2) P':
(1) S → abS | aSbS | aSb | ab
(2) S → baS | bSaS | bSa | ba
(0) S' → ε | S
So G1' = ({S', S}, {a, b}, P', S'), where P':
(0) S' → ε | S
(1) S → abS | aSbS | aSb | ab
(2) S → baS | bSaS | bSa | ba
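The construction can be sketched as below. The encoding is an assumption: productions are (left, right) pairs of strings with '' for ε, right-side symbols are single characters, and the new start symbol is named "S'" and only ever appears on a left side. Step (A) is applied to every nullable occurrence at once by taking all combinations of "keep" and "drop"; step (C) is triggered whenever S is in V_0, which coincides with the slide's condition in this example.

```python
from itertools import product

def eliminate_epsilon(productions, start="S", new_start="S'"):
    """Return (P', start symbol of G') for a grammar without ε-productions."""
    # (1) V_0: non-terminals that can derive ε in some number of steps.
    nullable, changed = set(), True
    while changed:
        changed = False
        for l, r in productions:
            if l not in nullable and all(c in nullable for c in r):
                nullable.add(l)
                changed = True

    # (2)(A)/(B): each nullable symbol on a right side is either kept or dropped.
    new_prods = []
    for l, r in productions:
        options = [(c, "") if c in nullable else (c,) for c in r]
        for choice in product(*options):
            rhs = "".join(choice)
            if rhs and (l, rhs) not in new_prods:   # skip ε-productions and duplicates
                new_prods.append((l, rhs))

    # (2)(C): a fresh start symbol keeps ε in the language.
    if start in nullable:
        return [(new_start, ""), (new_start, start)] + new_prods, new_start
    return new_prods, start

P = [("S", ""), ("S", "aSbS"), ("S", "bSaS")]
new_P, new_S = eliminate_epsilon(P)
print(new_S)   # S'
for lhs, rhs in new_P:
    print(lhs, "→", rhs or "ε")
```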

34 Chapter 2 Language & Syntax Description, Section 4 Syntax tree and ambiguity of a grammar
1. Syntax tree
a. Definition
– A tree that expresses the structure of a sentence in a language.
b. Function
– Presents the syntax analysis process visually and directly.
– Helps to examine whether a grammar is ambiguous.

35 [Figure: an example of a syntax tree]

36 c. Basic terms in a syntax tree
(1) Sub-tree: a tree composed of a node (other than a leaf) of the syntax tree together with all its descendant nodes.
(2) Pruning a sub-tree: removing all the children of the root of a sub-tree.
(3) Sentential form: the sequence of all leaves, read from left to right, in a snapshot of the growing syntax tree.

37 (4) Phrase: the sequence of leaf symbols of a sub-tree, read from left to right, is called a phrase relative to the root of the sub-tree.
– Simple phrase (direct phrase): if a phrase is derived from the root of a sub-tree in one step, it is called a simple phrase relative to the root of the sub-tree.
– Phrase of a sentential form: a phrase of a sub-tree of the syntax tree for that sentential form.

38 (5) Handle: the leftmost simple phrase of a sentential form.
Notes: In leftmost reduction, the core task is finding the handle.

39 [Figure: handles of a syntax tree, with the handles numbered 1 to 6]
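Phrases, simple phrases and the handle can all be read off a syntax tree mechanically. The sketch below uses a hypothetical tree encoding, since the slide's own tree is not recoverable from the transcript: an inner node is a (label, children) pair and a leaf is a one-character string. The example tree is an assumption, built for the sentence i+i*i with productions E → E+E | E*E | i.

```python
def analyze(tree):
    """Return (sentential_form, phrases, simple_phrases, handle).

    Each phrase is a (start_index, text, root_label) triple.
    """
    phrases, simple = [], []

    def walk(node, pos):
        if isinstance(node, str):          # a leaf contributes its own symbol
            return node
        label, children = node
        text = ""
        for child in children:
            text += walk(child, pos + len(text))
        phrases.append((pos, text, label))             # leaves of this sub-tree
        if all(isinstance(c, str) for c in children):
            simple.append((pos, text, label))          # derived from the root in one step
        return text

    form = walk(tree, 0)
    handle = min(simple) if simple else None           # leftmost simple phrase
    return form, phrases, simple, handle

tree = ("E", [("E", ["i"]), "+", ("E", [("E", ["i"]), "*", ("E", ["i"])])])
form, phrases, simple, handle = analyze(tree)
print(form)     # i+i*i
print(handle)   # (0, 'i', 'E')
```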

40 2. Ambiguity of a grammar
a. Ambiguity of a sentence: if a sentence of a grammar has two or more syntax trees, the sentence is ambiguous.
b. Ambiguity of a grammar: if the language of a grammar contains an ambiguous sentence, the grammar is ambiguous.

41 Example: G = ({E}, {+, *, (, ), i}, P, E), where P: E → E+E | E*E | (E) | i
The sentence (i*i+i) has two leftmost derivations, and thus two syntax trees:
(1) E ⇒ (E) ⇒ (E+E) ⇒ (E*E+E) ⇒ (i*E+E) ⇒ (i*i+E) ⇒ (i*i+i)
(2) E ⇒ (E) ⇒ (E*E) ⇒ (i*E) ⇒ (i*E+E) ⇒ (i*i+E) ⇒ (i*i+i)
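The number of syntax trees of a sentence can be counted directly for this particular grammar. The sketch below is specific to E → E+E | E*E | (E) | i and is not a general parser; the function name count_trees is illustrative. It counts, for each substring, the ways each production can derive it.

```python
from functools import lru_cache

def count_trees(s):
    """Count the syntax trees of s under E → E+E | E*E | (E) | i."""
    @lru_cache(maxsize=None)
    def count(i, j):                      # trees deriving s[i:j] from E
        n = 0
        if j - i == 1 and s[i] == "i":
            n += 1                                    # E → i
        if j - i >= 3 and s[i] == "(" and s[j - 1] == ")":
            n += count(i + 1, j - 1)                  # E → (E)
        for k in range(i + 1, j - 1):
            if s[k] in "+*":
                n += count(i, k) * count(k + 1, j)    # E → E+E | E*E, operator at k
        return n
    return count(0, len(s))

print(count_trees("(i*i+i)"))  # 2
print(count_trees("(i)"))      # 1
```

A count greater than 1 for any sentence shows that the grammar is ambiguous.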

42 [Figure: the two syntax trees of (i*i+i); in one the inner E expands to E+E, in the other to E*E]

43 Notes:
(1) Ambiguity makes syntax analysis nondeterministic.
(2) Ambiguity of a grammar is undecidable; that is, no algorithm can determine in a finite number of steps whether an arbitrary grammar is ambiguous.
(3) To prove that a grammar is ambiguous, it is enough to exhibit one sentence that has two syntax trees.
(4) If the ambiguity of a grammar can be controlled by additional conditions, its existence is not so harmful.

