1 slide1 Chapter 4 Lexical and Syntax Analysis

2 slide2 Outline
This chapter discusses the following major topics:
Introduction to lexical analysis, including a simple example.
The parsing problem.
The primary approaches to parsing.
The recursive-descent implementation technique for LL parsers.
Bottom-up parsing and the LR parsing algorithm.
Lexical and Syntax Analysis: Chapter 4

3 Introduction
There are three different approaches to implementing programming languages:
1. Compilation.
2. Pure interpretation.
3. Hybrid implementation.

4 slide4 Compilation Approach
Uses a program called a compiler, which translates programs written in a high-level programming language into machine code.

5 slide5 Interpretation Approach
Performs no translation.
Programs are interpreted in their original form by a software interpreter.
Used for smaller systems in which execution efficiency is not critical.

6 slide6 Hybrid Approach
Translates programs written in a high-level language into an intermediate form, which is then interpreted.
Execution has traditionally been much slower than in compiler systems.

7 slide7 The compiler's task of analyzing syntax is separated into two parts:
1. Lexical analysis: deals with small-scale constructs (names, numeric literals).
2. Syntax analysis: deals with large-scale constructs (expressions, statements, program units).

8 slide8 Reasons for separating lexical analysis from syntax analysis:
1. Simplicity
2. Efficiency
3. Portability

9 slide9 Lexical Analysis
Lexical analysis is a part of syntax analysis.
The lexical analyzer collects characters into logical groupings (lexemes) and assigns internal codes (tokens) to them.

10 slide10 Consider the following example of an assignment statement:
Result = oldsum - value / 100;

Token        Lexeme
IDENT        Result
ASSIGN_OP    =
IDENT        oldsum
SUBTRACT_OP  -
IDENT        value
DIVISION_OP  /
INT_LIT      100
SEMICOLON    ;

11 slide11 The lexical analysis process:
1. Skips comments and blanks outside lexemes.
2. Inserts lexemes for user-defined names into the symbol table.
3. Detects syntactic errors in tokens.
4. Reports such errors to the user.

12 slide12 Approaches to building a lexical analyzer:
1. Write a formal description of the token patterns using a descriptive language related to regular expressions (RE).
2. Design a state transition diagram that describes the token patterns of the language, and write a program that implements the diagram.
3. Design a state transition diagram that describes the token patterns of the language, and hand-construct a table-driven implementation of the state diagram.

13 slide13 A state diagram could include states and transitions for each and every token pattern, but the result would be a very large and complex diagram.
To keep the diagram manageable, the example lexical analyzer needs to recognize only:
Names
Reserved words
Integer literals

14 slide14 The lexical analyzer uses a table of reserved words to determine which names are reserved words.
This makes it possible to build a much more compact state diagram.
Use a single transition for any character in a class (any letter, or any digit) instead of one transition per character.
We define some subprograms for the common tasks (getchar, addchar, and lookup in the code that follows).
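
The reserved-word table can be sketched in C roughly as follows. This is only an illustration: the token codes, the particular reserved words in the table, and the exact signature of lookup are assumptions, since the slides name the subprogram but do not define it.

```c
#include <string.h>

/* Token codes: hypothetical values chosen for this sketch. */
enum { IDENT = 10, FOR_CODE = 30, IF_CODE = 31, ELSE_CODE = 32, WHILE_CODE = 33 };

/* One entry of the reserved-word table: spelling and token code. */
struct reserved {
    const char *word;
    int token;
};

/* The table contents are an assumption; a real language has its own list. */
static const struct reserved reserved_words[] = {
    { "for", FOR_CODE },
    { "if", IF_CODE },
    { "else", ELSE_CODE },
    { "while", WHILE_CODE },
};

/* Return the reserved word's token code if the lexeme is in the table;
   otherwise the lexeme is an ordinary name, so return IDENT. */
int lookup(const char *lexeme) {
    size_t n = sizeof reserved_words / sizeof reserved_words[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(lexeme, reserved_words[i].word) == 0)
            return reserved_words[i].token;
    }
    return IDENT;
}
```

Because reserved words are told apart from names only after the whole lexeme has been collected, the state diagram needs no separate path for each reserved word.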

15 slide15
/* lex - a simple lexical analyzer */
int lex() {
  getchar();
  switch (charclass) {
    /* parse identifiers and reserved words */
    case LETTER:
      addchar();
      getchar();
      while (charclass == LETTER || charclass == DIGIT) {
        addchar();
        getchar();
      }
      return lookup(lexeme);
      break;

16 slide16
    /* parse integer literals */
    case DIGIT:
      addchar();
      getchar();
      while (charclass == DIGIT) {
        addchar();
        getchar();
      }
      return INT_LIT;
      break;
  } /* end of switch */
} /* end of function lex */
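
A self-contained variant of the lex function might look like the sketch below. Several things here are assumptions rather than part of the slides: the helpers are named get_char and add_char so they do not clash with the standard library's getchar, the input comes from a string passed to a hypothetical lex_start function, the token code values are invented, and the handling of the single-character tokens is added so the assignment statement from slide 10 can be scanned. Unlike the slide's version, the sketch keeps one character of lookahead between calls, so the character that ends one lexeme is not lost, and it returns IDENT for every name instead of calling lookup.

```c
#include <ctype.h>
#include <string.h>   /* size_t */

enum { LETTER, DIGIT, OTHER, END_OF_INPUT };   /* character classes */
enum { IDENT = 1, INT_LIT, ASSIGN_OP, SUBTRACT_OP, DIVISION_OP,
       SEMICOLON, END_TOKEN, UNKNOWN_TOKEN };  /* hypothetical token codes */

static const char *src;     /* unread part of the input string */
static char next_char;      /* one character of lookahead */
static int charclass;       /* class of next_char */
char lexeme[64];            /* text of the most recent lexeme */
static size_t lexlen;

/* Load the next input character into next_char and classify it. */
static void get_char(void) {
    next_char = *src;
    if (next_char == '\0') { charclass = END_OF_INPUT; return; }
    src++;
    if (isalpha((unsigned char)next_char)) charclass = LETTER;
    else if (isdigit((unsigned char)next_char)) charclass = DIGIT;
    else charclass = OTHER;
}

/* Append next_char to the lexeme being built (silently truncating). */
static void add_char(void) {
    if (lexlen + 1 < sizeof lexeme) {
        lexeme[lexlen++] = next_char;
        lexeme[lexlen] = '\0';
    }
}

/* Start scanning a new source string and prime the lookahead. */
void lex_start(const char *text) { src = text; get_char(); }

/* Return the token code of the next lexeme; its text is left in lexeme. */
int lex(void) {
    lexlen = 0;
    lexeme[0] = '\0';
    while (charclass == OTHER && isspace((unsigned char)next_char))
        get_char();                        /* skip blanks outside lexemes */
    switch (charclass) {
    case LETTER:                           /* names */
        do { add_char(); get_char(); }
        while (charclass == LETTER || charclass == DIGIT);
        return IDENT;
    case DIGIT:                            /* integer literals */
        do { add_char(); get_char(); } while (charclass == DIGIT);
        return INT_LIT;
    case END_OF_INPUT:
        return END_TOKEN;
    default: {                             /* single-character tokens */
        char c = next_char;
        add_char();
        get_char();
        switch (c) {
        case '=': return ASSIGN_OP;
        case '-': return SUBTRACT_OP;
        case '/': return DIVISION_OP;
        case ';': return SEMICOLON;
        default:  return UNKNOWN_TOKEN;
        }
    }
    }
}
```

A caller would invoke lex_start once on the source text and then call lex repeatedly until it returns END_TOKEN, reading lexeme after each call.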

17 slide17 A state diagram to recognize names, reserved words, and integer literals:
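
Since the diagram itself is not part of this transcript, the following table-driven sketch shows one plausible shape for it. The state names, the dead state, and the run_dfa function are assumptions made for illustration. Reserved words follow the same path as names and would be separated afterwards with the reserved-word table.

```c
#include <ctype.h>

enum { S_START, S_NAME, S_INT, S_DEAD, NUM_STATES };   /* states */
enum { C_LETTER, C_DIGIT, C_OTHER, NUM_CLASSES };      /* character classes */

/* transition[state][class] gives the next state. */
static const int transition[NUM_STATES][NUM_CLASSES] = {
    /*              letter  digit   other  */
    /* S_START */ { S_NAME, S_INT,  S_DEAD },
    /* S_NAME  */ { S_NAME, S_NAME, S_DEAD },
    /* S_INT   */ { S_DEAD, S_INT,  S_DEAD },
    /* S_DEAD  */ { S_DEAD, S_DEAD, S_DEAD },
};

/* Map a character to its class, so one table column serves a whole class. */
static int char_class(char c) {
    if (isalpha((unsigned char)c)) return C_LETTER;
    if (isdigit((unsigned char)c)) return C_DIGIT;
    return C_OTHER;
}

/* Run the automaton over the whole string and return the final state:
   S_NAME for a name, S_INT for an integer literal, S_DEAD otherwise. */
int run_dfa(const char *s) {
    int state = S_START;
    for (; *s != '\0'; s++)
        state = transition[state][char_class(*s)];
    return state;
}
```

This sketch classifies a complete string. A real lexical analyzer would instead stop at the first character that has no transition and leave it for the next lexeme.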

18 slide18 The Parsing Problem
Syntax analysis is often called parsing.
Two goals of syntax analysis:
1. Check the input program to determine whether it is syntactically correct.
2. Produce either a complete parse tree or at least trace the structure of the complete parse tree.
Parsers for programming languages construct parse trees for given programs.

19 slide19 Syntax analyzers (parsers) are based on a formal description of the syntax of programs, called a context-free grammar, or BNF.

20 slide20 Advantages of using BNF:
1. BNF descriptions are clear and concise.
2. They can be used as the direct basis for the syntax analyzer.
3. Implementations based on BNF are easy to maintain.

21 slide21 Notational conventions for grammar symbols:
1. Terminal symbols: lowercase letters at the beginning of the alphabet (a, b, c, ...)
2. Nonterminal symbols: uppercase letters at the beginning of the alphabet (A, B, C, ...)
3. Terminals or nonterminals: uppercase letters at the end of the alphabet (W, X, Y, Z)
4. Strings of terminals: lowercase letters at the end of the alphabet (w, x, y, z)
5. Mixed strings (terminals and/or nonterminals): lowercase Greek letters (α, β, γ, δ)

22 slide22 Categories of Parsing Algorithms:
1. Top-down: the tree is built from the root downward to the leaves.
2. Bottom-up: the tree is built from the leaves upward to the root.

23 slide23 Complexity of Parsing
Parsing algorithms that work for any unambiguous grammar are complex and inefficient.
The complexity of such algorithms is O(n^3): the time they take is on the order of the cube of the length of the string to be parsed.
The reason is that these algorithms must frequently back up and reparse part of the sentence being analyzed.
Reparsing is required when the parser has made a mistake in the parsing process.
Backing up requires that part of the parse tree being constructed (or its trace) be dismantled and rebuilt.

24 slide24 Top-Down Parsers
Top-down parsers check whether a string can be generated by a grammar by creating a parse tree starting from the start symbol and working down.
The parser uses one symbol of lookahead to choose the next rule, so this approach works without backtracking.
LL parsers are examples of top-down parsers.

25 slide25 Recursive-Descent Parsing
A recursive-descent parser consists of a collection of subprograms, many of which are recursive, and produces a parse tree in top-down (descending) order.
The syntax of nested language structures is naturally described with recursive grammar rules.
EBNF is well suited to recursive-descent parsers.

26 slide26 Consider the following EBNF description of arithmetic expressions:
<expr> → <term> { (+ | -) <term> }
<term> → <factor> { (* | /) <factor> }
<factor> → id | ( <expr> )
An EBNF grammar like this one does not force any associativity rule, so the code generation process (which is driven by syntax analysis) must ensure that the code it produces adheres to the associativity rules of the language.
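
A recursive-descent recognizer for this grammar can be sketched with one C function per nonterminal, as below. The sketch makes several assumptions of its own: the input is a plain string, an id is a letter followed by letters or digits, blanks are skipped inline instead of by a separate lexical analyzer, and the function only reports whether the input is a correct expression instead of building a parse tree. The names parse_expr, skip_blanks, and ok are hypothetical.

```c
#include <ctype.h>

static const char *p;      /* current position in the input */
static int ok;             /* cleared when a syntax error is found */

static void skip_blanks(void) { while (*p == ' ') p++; }

static void expr(void);

/* <factor> -> id | ( <expr> ) */
static void factor(void) {
    skip_blanks();
    if (isalpha((unsigned char)*p)) {
        while (isalnum((unsigned char)*p)) p++;   /* consume the id */
    } else if (*p == '(') {
        p++;
        expr();
        skip_blanks();
        if (*p == ')') p++; else ok = 0;          /* missing ')' */
    } else {
        ok = 0;                                   /* neither id nor '(' */
    }
}

/* <term> -> <factor> { (* | /) <factor> } */
static void term(void) {
    factor();
    skip_blanks();
    while (ok && (*p == '*' || *p == '/')) {
        p++;
        factor();
        skip_blanks();
    }
}

/* <expr> -> <term> { (+ | -) <term> } */
static void expr(void) {
    term();
    skip_blanks();
    while (ok && (*p == '+' || *p == '-')) {
        p++;
        term();
        skip_blanks();
    }
}

/* Return 1 if text is a complete, syntactically correct expression. */
int parse_expr(const char *text) {
    p = text;
    ok = 1;
    expr();
    skip_blanks();
    return ok && *p == '\0';
}
```

Each EBNF repetition in braces becomes a while loop, and each alternative becomes a branch chosen by looking at the next input character, which is why EBNF maps so directly onto this style of parser.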

27 slide27 The LL Grammar Class
An LL parser is a top-down parser for a subset of the context-free grammars. It parses the input from Left to right and constructs a Leftmost derivation of the sentence.

28 slide28 Left Recursion
In recursive-descent parsing, a left-recursive grammar rule leads to a subprogram that calls itself before consuming any input, which causes an infinite loop.

29 slide29 Left Recursion
Consider the following rules:
A → A + b {direct}
A → B a A {indirect}
B → A b
The algorithms that remove direct and indirect left recursion are not covered here.

30 slide30 Pairwise Disjointness Test
The test requires the ability to compute a set, called FIRST, for each RHS of a given nonterminal symbol in a grammar.
If a nonterminal has more than one RHS, the first terminal symbol that each RHS can generate must be unique to that RHS; that is, the FIRST sets must be pairwise disjoint.

31 slide31 Consider the following rules:
A → aB | bAb | c
The FIRST sets for the RHSs are {a}, {b}, {c}, which are clearly disjoint, so these rules pass the pairwise disjointness test.
A → aB | alb
The FIRST sets for the RHSs are {a}, {a}, which are clearly not disjoint, so these rules fail the test.
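
For the simple case shown above, where every RHS begins with a terminal, the test can be sketched in a few lines of C. The function name pairwise_disjoint and the encoding of each RHS as a string (lowercase letters as terminals, uppercase letters as nonterminals) are assumptions for this illustration. An RHS that begins with a nonterminal needs a full FIRST-set computation, which the sketch does not attempt.

```c
#include <ctype.h>

/* Check the RHSs of one nonterminal.
   Returns 1 if they pass the pairwise disjointness test,
   0 if two RHSs begin with the same terminal, and
   -1 if some RHS does not begin with a terminal (not handled here). */
int pairwise_disjoint(const char *const rhs[], int n) {
    int seen[256] = {0};              /* seen[c] is set once c starts an RHS */
    for (int i = 0; i < n; i++) {
        unsigned char first = (unsigned char)rhs[i][0];
        if (!islower(first)) return -1;
        if (seen[first]) return 0;    /* FIRST sets overlap */
        seen[first] = 1;
    }
    return 1;
}
```

With the two rule sets from the slide, the first call would report a pass and the second a failure.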

32 slide32 Left Factoring
Left factoring is a process that can fix rules that do not pass the pairwise disjointness test because two RHSs begin with the same terminal. It is a separate transformation from left recursion removal.
We will take a look at left factoring.

33 slide33 Consider the rules:
<variable> → ident | ident [ <expression> ]
These rules do not pass the pairwise disjointness test, because both RHSs begin with ident.
We use left factoring to solve this problem. The two rules can be replaced by:
<variable> → ident <new>
<new> → ε | [ <expression> ]
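
After left factoring, a recursive-descent parser can pick the right alternative for the new nonterminal by looking at a single input character. The C sketch below illustrates this under simplifying assumptions: the input is a string without blanks, an ident is a letter followed by letters or digits, and the expression inside the brackets is reduced to a single ident. The names parse_variable and new_part are hypothetical.

```c
#include <ctype.h>

static const char *p;   /* current input position */

/* Consume one identifier: a letter followed by letters or digits. */
static int ident(void) {
    if (!isalpha((unsigned char)*p)) return 0;
    while (isalnum((unsigned char)*p)) p++;
    return 1;
}

/* <expression>: reduced to a single identifier to keep the sketch short. */
static int expression(void) { return ident(); }

/* <new> -> ε | [ <expression> ] */
static int new_part(void) {
    if (*p != '[') return 1;          /* choose ε: consume nothing */
    p++;
    if (!expression()) return 0;
    if (*p != ']') return 0;
    p++;
    return 1;
}

/* <variable> -> ident <new>; returns 1 if text is a complete variable. */
int parse_variable(const char *text) {
    p = text;
    return ident() && new_part() && *p == '\0';
}
```

The common prefix ident is parsed exactly once, and the decision between the two alternatives is postponed until the parser sees whether a bracket follows.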

34 slide34 Bottom-Up Parsing
Bottom-up parsers check whether a string can be generated from a grammar by creating a parse tree from the leaves and working up.
Left recursion is acceptable to bottom-up parsers.
Grammars for bottom-up parsers do not normally include metasymbols.
The bottom-up parsing process produces the reverse of a rightmost derivation.
LR parsers are examples of bottom-up parsers.

35 slide35 Consider the following grammar:
E → E + T | T
T → T * F | F
F → ( E ) | id

36 slide36 A rightmost derivation of id + id * id:
E => E + T
  => E + T * F
  => E + T * id
  => E + F * id
  => E + id * id
  => T + id * id
  => F + id * id
  => id + id * id

37 slide37 Shift-Reduce Algorithms
Shift-reduce algorithm is another name for a bottom-up parser; the name comes from the two most common parser actions:
1. Shift: moves the next input token onto the parser's stack.
2. Reduce: replaces an RHS (the handle) on top of the parser's stack with its corresponding LHS.

38 slide38 LR Parsers
LR parsing is the most commonly used bottom-up parsing approach for programming languages.
An LR parser uses a parse stack, which contains grammar symbols and state symbols.

39 slide39 Advantages of LR parsers:
1. They work for nearly all grammars that describe programming languages.
2. They work on a larger class of grammars than other bottom-up algorithms, and they are as efficient as any other bottom-up parser.
3. They can detect syntax errors as soon as it is possible in a left-to-right scan.
4. The LR class of grammars is a superset of the class parsable by LL parsers.

40 slide40 Consider the following grammar:
1. E → E + T
2. E → T
3. T → T * F
4. T → F
5. F → ( E )
6. F → id

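41 slide41 LR parsing table for the grammar (Sn = shift and go to state n, Rn = reduce by rule n, blank = error)
        Action                                Goto
State   id    +     *     (     )     $       E    T    F
0       S5                S4                  1    2    3
1             S6                      accept
2             R2    S7          R2    R2
3             R4    R4          R4    R4
4       S5                S4                  8    2    3
5             R6    R6          R6    R6
6       S5                S4                       9    3
7       S5                S4                            10
8             S6                S11
9             R1    S7          R1    R1
10            R3    R3          R3    R3
11            R5    R5          R5    R5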

42 slide42 Trace of the parse of id + id * id
Stack                   Input           Action
0                       id + id * id $  Shift 5
0 id 5                  + id * id $     Reduce 6 (use GOTO[0, F])
0 F 3                   + id * id $     Reduce 4 (use GOTO[0, T])
0 T 2                   + id * id $     Reduce 2 (use GOTO[0, E])
0 E 1                   + id * id $     Shift 6
0 E 1 + 6               id * id $       Shift 5
0 E 1 + 6 id 5          * id $          Reduce 6 (use GOTO[6, F])
0 E 1 + 6 F 3           * id $          Reduce 4 (use GOTO[6, T])
0 E 1 + 6 T 9           * id $          Shift 7
0 E 1 + 6 T 9 * 7       id $            Shift 5
0 E 1 + 6 T 9 * 7 id 5  $               Reduce 6 (use GOTO[7, F])
0 E 1 + 6 T 9 * 7 F 10  $               Reduce 3 (use GOTO[6, T])
0 E 1 + 6 T 9           $               Reduce 1 (use GOTO[0, E])
0 E 1                   $               Accept
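
The trace can be reproduced with a small table-driven driver, sketched here in C. The encoding is an assumption made for this illustration: in the action table a positive entry n means shift to state n, a negative entry -n means reduce by rule n, ACC means accept, and 0 means error. Only states are kept on the stack, since the grammar symbols are implied by them. The names lr_parse, go_to, rule_lhs, and rule_len are hypothetical.

```c
enum { T_ID, T_PLUS, T_STAR, T_LPAREN, T_RPAREN, T_END, NUM_TOKENS };
enum { N_E, N_T, N_F };          /* column index of each nonterminal */
#define ACC 99                   /* accept marker, an arbitrary choice */
#define STACK_MAX 128

static const int action[12][NUM_TOKENS] = {
    /*          id   +   *   (   )    $  */
    /*  0 */ {   5,  0,  0,  4,  0,   0 },
    /*  1 */ {   0,  6,  0,  0,  0, ACC },
    /*  2 */ {   0, -2,  7,  0, -2,  -2 },
    /*  3 */ {   0, -4, -4,  0, -4,  -4 },
    /*  4 */ {   5,  0,  0,  4,  0,   0 },
    /*  5 */ {   0, -6, -6,  0, -6,  -6 },
    /*  6 */ {   5,  0,  0,  4,  0,   0 },
    /*  7 */ {   5,  0,  0,  4,  0,   0 },
    /*  8 */ {   0,  6,  0,  0, 11,   0 },
    /*  9 */ {   0, -1,  7,  0, -1,  -1 },
    /* 10 */ {   0, -3, -3,  0, -3,  -3 },
    /* 11 */ {   0, -5, -5,  0, -5,  -5 },
};

/* go_to[state][nonterminal]; 0 marks entries that are never used. */
static const int go_to[12][3] = {
    { 1, 2, 3 }, { 0, 0, 0 }, { 0, 0, 0 }, { 0, 0, 0 },
    { 8, 2, 3 }, { 0, 0, 0 }, { 0, 9, 3 }, { 0, 0, 10 },
    { 0, 0, 0 }, { 0, 0, 0 }, { 0, 0, 0 }, { 0, 0, 0 },
};

/* For rules 1..6: the LHS nonterminal and the number of RHS symbols. */
static const int rule_lhs[7] = { 0, N_E, N_E, N_T, N_T, N_F, N_F };
static const int rule_len[7] = { 0, 3, 1, 3, 1, 3, 1 };

/* Parse a token sequence that ends with T_END.
   Returns 1 on accept and 0 on a syntax error. The rule numbers used
   are stored in reductions[] (at most cap of them), in the order applied. */
int lr_parse(const int *tokens, int *reductions, int cap, int *count) {
    int stack[STACK_MAX];
    int top = 0;
    stack[0] = 0;                          /* start in state 0 */
    *count = 0;
    for (;;) {
        int act = action[stack[top]][*tokens];
        if (act == ACC) return 1;
        if (act == 0) return 0;            /* blank table entry: error */
        if (act > 0) {                     /* shift */
            if (top + 1 >= STACK_MAX) return 0;
            stack[++top] = act;
            tokens++;
        } else {                           /* reduce by rule -act */
            int rule = -act;
            top -= rule_len[rule];         /* pop the handle */
            stack[top + 1] = go_to[stack[top]][rule_lhs[rule]];
            top++;
            if (*count < cap) reductions[(*count)++] = rule;
        }
    }
}
```

For the token sequence id + id * id, the recorded reductions should match the Reduce steps of the trace, read from top to bottom.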

