1 Languages and Compilers (SProg og Oversættere) Parsing.

1 Languages and Compilers (SProg og Oversættere) Parsing

2 Goals
– Describe the purpose of the parser
– Discuss top-down vs. bottom-up parsing
– Explain necessary conditions for construction of recursive descent parsers
– Discuss the construction of an RD parser from a grammar

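3 Top-down parsing
[Figure: parse tree for the sentence "The cat sees a rat." grown from the root downwards. Sentence is expanded to Subject Verb Object ".", then Subject to "The" Noun (cat), Verb to "sees", and Object to "a" Noun (rat). A second input, "The cat sees rat.", is also shown.]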

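4 Bottom-up parsing
[Figure: the same sentence, "The cat sees a rat.", parsed from the leaves upwards. The words are grouped first ("cat" into Noun, "The" Noun into Subject, "sees" into Verb, "rat" into Noun, "a" Noun into Object), and the Sentence node is built last.]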

5 Top-Down vs Bottom-Up parsing
– LL analysis (top-down): look-ahead, derivation
– LR analysis (bottom-up): look-ahead, reduction

6 Development of Recursive Descent Parser
(1) Express grammar in EBNF
(2) Grammar transformations: left factorization and left recursion elimination
(3) Create a parser class with
    – private variable currentToken
    – methods to call the scanner: accept and acceptIt
(4) Implement private parsing methods:
    – add a private parseN method for each non-terminal N
    – add a public parse method that gets the first token from the scanner and calls parseS (S is the start symbol of the grammar)

7 Recursive Descent Parsing

    Sentence ::= Subject Verb Object .
    Subject  ::= I | a Noun | the Noun
    Object   ::= me | a Noun | the Noun
    Noun     ::= cat | mat | rat
    Verb     ::= like | is | see | sees

Define a procedure parseN for each non-terminal N:

    private void parseSentence();
    private void parseSubject();
    private void parseObject();
    private void parseNoun();
    private void parseVerb();

8 Recursive Descent Parsing

    public class MicroEnglishParser {
        private TerminalSymbol currentTerminal;

        // Auxiliary methods will go here...

        // Parsing methods will go here...
    }

9 Recursive Descent Parsing: Auxiliary Methods

    public class MicroEnglishParser {
        private TerminalSymbol currentTerminal;

        // Consume the current terminal if it is the expected one.
        private void accept(TerminalSymbol expected) {
            if (currentTerminal matches expected)
                currentTerminal = next input terminal;
            else
                report a syntax error
        }
        ...
    }

10 Recursive Descent Parsing: Parsing Methods

    Sentence ::= Subject Verb Object .

    private void parseSentence() {
        parseSubject();
        parseVerb();
        parseObject();
        accept('.');
    }

11 Recursive Descent Parsing: Parsing Methods

    Subject ::= I | a Noun | the Noun

    private void parseSubject() {
        if (currentTerminal matches 'I')
            accept('I');
        else if (currentTerminal matches 'a') {
            accept('a');
            parseNoun();
        }
        else if (currentTerminal matches 'the') {
            accept('the');
            parseNoun();
        }
        else
            report a syntax error
    }

12 Recursive Descent Parsing: Parsing Methods

    Noun ::= cat | mat | rat

    private void parseNoun() {
        if (currentTerminal matches 'cat')
            accept('cat');
        else if (currentTerminal matches 'mat')
            accept('mat');
        else if (currentTerminal matches 'rat')
            accept('rat');
        else
            report a syntax error
    }
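
The parsing methods on the preceding slides can be assembled into one runnable class. The following is a minimal sketch, not the course's reference implementation: it assumes tokens are plain strings rather than TerminalSymbol objects, uses a pre-split word list in place of a scanner, and reports syntax errors by throwing IllegalStateException. The EOT marker and the parses and syntaxError helpers are hypothetical names introduced only for this illustration.

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the MicroEnglishParser from the slides. Assumptions: tokens are
// plain strings and the "scanner" is a list of already separated words.
final class MicroEnglishParser {
    private static final String EOT = "<eot>"; // hypothetical end-of-text marker

    private final List<String> input;
    private int position = 0;
    private String currentTerminal;

    MicroEnglishParser(List<String> input) {
        this.input = input;
    }

    // Fetch the first token, parse the start symbol, require end of input.
    void parse() {
        currentTerminal = nextTerminal();
        parseSentence();
        if (!currentTerminal.equals(EOT)) throw syntaxError("end of input");
    }

    // Convenience wrapper (not in the slides): true if the words form a Sentence.
    static boolean parses(String... words) {
        try {
            new MicroEnglishParser(Arrays.asList(words)).parse();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    private String nextTerminal() {
        return position < input.size() ? input.get(position++) : EOT;
    }

    private IllegalStateException syntaxError(String expected) {
        return new IllegalStateException(
                "expected " + expected + " but found '" + currentTerminal + "'");
    }

    // Consume the current terminal only if it is the expected one.
    private void accept(String expected) {
        if (!currentTerminal.equals(expected)) throw syntaxError("'" + expected + "'");
        currentTerminal = nextTerminal();
    }

    // Consume the current terminal unconditionally (the caller has checked it).
    private void acceptIt() {
        currentTerminal = nextTerminal();
    }

    // Sentence ::= Subject Verb Object .
    private void parseSentence() {
        parseSubject();
        parseVerb();
        parseObject();
        accept(".");
    }

    // Subject ::= I | a Noun | the Noun
    private void parseSubject() {
        switch (currentTerminal) {
            case "I": acceptIt(); break;
            case "a": case "the": acceptIt(); parseNoun(); break;
            default: throw syntaxError("a subject");
        }
    }

    // Object ::= me | a Noun | the Noun
    private void parseObject() {
        switch (currentTerminal) {
            case "me": acceptIt(); break;
            case "a": case "the": acceptIt(); parseNoun(); break;
            default: throw syntaxError("an object");
        }
    }

    // Noun ::= cat | mat | rat
    private void parseNoun() {
        switch (currentTerminal) {
            case "cat": case "mat": case "rat": acceptIt(); break;
            default: throw syntaxError("a noun");
        }
    }

    // Verb ::= like | is | see | sees
    private void parseVerb() {
        switch (currentTerminal) {
            case "like": case "is": case "see": case "sees": acceptIt(); break;
            default: throw syntaxError("a verb");
        }
    }
}
```

With this sketch, MicroEnglishParser.parses("the", "cat", "sees", "a", "rat", ".") returns true, while the same call without "a" returns false, because Object must start with me, a, or the.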

13 LL(1) Grammars
The presented algorithm to convert EBNF into a parser does not work for all possible grammars. It only works for so-called “LL(1)” grammars. Basically, an LL(1) grammar is a grammar that can be parsed by a top-down parser with a lookahead (in the input stream of tokens) of one token. What grammars are LL(1)? How can we recognize that a grammar is (or is not) LL(1)?
– We can deduce the necessary conditions from the parser generation algorithm.
– We can use a formal definition.

14 LL(1) Grammars

parse X*:

    while (currentToken.kind is in starters[X]) {
        parse X
    }

Condition: starters[X] must be disjoint from the set of tokens that can immediately follow X*.

parse X|Y:

    switch (currentToken.kind) {
        cases in starters[X]:
            parse X
            break;
        cases in starters[Y]:
            parse Y
            break;
        default:
            report syntax error
    }

Condition: starters[X] and starters[Y] must be disjoint sets.
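
The two parsing patterns on this slide can be made concrete with a small Java sketch. The grammar below (List ::= Item* END, Item ::= NUM | ID), the Kind enum, and the countItems helper are all hypothetical and chosen only for illustration. The starter sets are written out by hand, and the grammar meets both conditions: starters[NUM] and starters[ID] are disjoint, and starters[Item] does not contain END, the only token that can follow Item*.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical grammar, used only to illustrate the two patterns:
//   List ::= Item* END
//   Item ::= NUM | ID
final class StarterDrivenParser {
    enum Kind { NUM, ID, END, EOT }

    // Starter sets written out by hand for this tiny grammar.
    private static final Set<Kind> STARTERS_NUM = EnumSet.of(Kind.NUM);
    private static final Set<Kind> STARTERS_ID = EnumSet.of(Kind.ID);
    private static final Set<Kind> STARTERS_ITEM = EnumSet.of(Kind.NUM, Kind.ID);

    private final List<Kind> tokens;
    private int position = 0;
    private Kind currentKind;

    StarterDrivenParser(List<Kind> tokens) {
        this.tokens = tokens;
        this.currentKind = next();
    }

    // Convenience wrapper: parse the given token kinds, return the item count.
    static int countItems(Kind... kinds) {
        return new StarterDrivenParser(Arrays.asList(kinds)).parseList();
    }

    private Kind next() {
        return position < tokens.size() ? tokens.get(position++) : Kind.EOT;
    }

    private void accept(Kind expected) {
        if (currentKind != expected) {
            throw new IllegalStateException(
                    "expected " + expected + " but found " + currentKind);
        }
        currentKind = next();
    }

    // List ::= Item* END   (the "parse X*" pattern: loop while a starter is seen)
    int parseList() {
        int items = 0;
        while (STARTERS_ITEM.contains(currentKind)) {
            parseItem();
            items++;
        }
        accept(Kind.END);
        return items;
    }

    // Item ::= NUM | ID    (the "parse X|Y" pattern: branch on the starter set)
    private void parseItem() {
        if (STARTERS_NUM.contains(currentKind)) {
            accept(Kind.NUM);
        } else if (STARTERS_ID.contains(currentKind)) {
            accept(Kind.ID);
        } else {
            throw new IllegalStateException("item expected but found " + currentKind);
        }
    }
}
```

If END were also a starter of Item, the loop in parseList could not tell whether to parse another Item or stop, which is exactly what the condition on X* rules out.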

15 Formal definition of LL(1)
A grammar G is LL(1) iff for each set of productions M ::= X_1 | X_2 | … | X_n:
1. starters[X_1], starters[X_2], …, starters[X_n] are all pairwise disjoint
2. If X_i =>* ε then starters[X_j] ∩ follow[M] = Ø, for 1 ≤ j ≤ n, i ≠ j
If G is ε-free then condition 1 is sufficient.
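
Condition 1 can be checked mechanically. The sketch below is one possible way to do it, under two stated assumptions: the grammar is ε-free (so condition 1 alone suffices) and has no left recursion (otherwise the recursive starters computation would not terminate). The class name Ll1Condition1, the map-based grammar representation, and the helper methods are hypothetical choices made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Checks condition 1 (pairwise disjoint starter sets) for a grammar that is
// assumed to be epsilon-free and free of left recursion.
final class Ll1Condition1 {

    // A terminal starts with itself; a non-terminal starts with whatever the
    // first symbols of its alternatives start with.
    static Set<String> starters(Map<String, List<List<String>>> grammar, String symbol) {
        if (!grammar.containsKey(symbol)) {
            return Collections.singleton(symbol); // terminal symbol
        }
        Set<String> result = new TreeSet<>();
        for (List<String> alternative : grammar.get(symbol)) {
            result.addAll(starters(grammar, alternative.get(0)));
        }
        return result;
    }

    // True if, for every non-terminal, no terminal starts two alternatives.
    static boolean holds(Map<String, List<List<String>>> grammar) {
        for (List<List<String>> alternatives : grammar.values()) {
            Set<String> seen = new HashSet<>();
            for (List<String> alternative : alternatives) {
                for (String terminal : starters(grammar, alternative.get(0))) {
                    if (!seen.add(terminal)) {
                        return false; // two alternatives share a starter
                    }
                }
            }
        }
        return true;
    }

    // Helper: each string is one alternative, symbols separated by spaces.
    static List<List<String>> alternatives(String... alts) {
        List<List<String>> result = new ArrayList<>();
        for (String alt : alts) {
            result.add(Arrays.asList(alt.split(" ")));
        }
        return result;
    }

    // The micro-English grammar from the earlier slides.
    static Map<String, List<List<String>>> microEnglish() {
        Map<String, List<List<String>>> g = new LinkedHashMap<>();
        g.put("Sentence", alternatives("Subject Verb Object ."));
        g.put("Subject", alternatives("I", "a Noun", "the Noun"));
        g.put("Object", alternatives("me", "a Noun", "the Noun"));
        g.put("Noun", alternatives("cat", "mat", "rat"));
        g.put("Verb", alternatives("like", "is", "see", "sees"));
        return g;
    }
}
```

For the micro-English grammar, holds returns true. A grammar such as S ::= a b | a c fails the check, because both alternatives start with a; left factorization to S ::= a (b | c) would repair it.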

16 Converting EBNF into RD parsers
The conversion of an EBNF specification into a Java implementation of a recursive descent parser is so “mechanical” that it can easily be automated!
=> JavaCC, the “Java Compiler Compiler”

17 JavaCC and JJTree

18 LR parsing
– The algorithm makes use of a stack.
– The first item on the stack is the initial state of a DFA.
– A state of the automaton is a set of LR(0)/LR(1) items.
– The initial state is constructed from productions of the form S ::= • α [, $] (where S is the start symbol of the CFG).
– The stack contains, in alternating order:
    – a DFA state
    – a terminal symbol or part (subtree) of the parse tree being constructed
– The items on the stack are related by transitions of the DFA.
– There are two basic actions in the algorithm:
    – shift: get the next input token
    – reduce: build a new node (remove children from the stack)
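
The two basic actions can be illustrated with a deliberately simplified Java sketch. It is not an LR parser in the sense of this slide: the stack holds only grammar symbols, and hand-written checks on the top of the stack stand in for the DFA states and parse table. The grammar (E ::= E + n | n), the class name ShiftReduceSketch, and the method names are hypothetical, chosen because the rule is left-recursive and therefore unsuitable for recursive descent but unproblematic bottom-up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified shift-reduce recognizer for the hypothetical grammar
//   E ::= E + n | n
// The stack holds symbols only; the reduce decisions are hard-coded instead
// of being driven by DFA states as in a real LR parser.
final class ShiftReduceSketch {

    static boolean recognizes(String... tokens) {
        // Only the terminals "n" and "+" are valid input.
        for (String token : tokens) {
            if (!token.equals("n") && !token.equals("+")) {
                return false;
            }
        }
        List<String> stack = new ArrayList<>();
        int next = 0;
        while (true) {
            if (reduce(stack)) {
                continue;                 // reduce as long as a rule applies
            }
            if (next == tokens.length) {
                break;                    // nothing left to shift
            }
            stack.add(tokens[next++]);    // shift: push the next input token
        }
        // Accept if everything was reduced to the start symbol.
        return stack.size() == 1 && stack.get(0).equals("E");
    }

    // Reduce: replace a right-hand side on top of the stack by E.
    private static boolean reduce(List<String> stack) {
        int n = stack.size();
        if (n >= 3 && stack.subList(n - 3, n).equals(Arrays.asList("E", "+", "n"))) {
            stack.subList(n - 3, n).clear(); // E ::= E + n
            stack.add("E");
            return true;
        }
        if (n == 1 && stack.get(0).equals("n")) {
            stack.set(0, "E");               // E ::= n (only at the stack bottom)
            return true;
        }
        return false;
    }
}
```

The check n == 1 in the second rule plays the role that DFA states play in a real LR parser: it records that an n may be reduced to E only at the start of the input, while any later n must be consumed as part of E + n.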

19 JavaCUP: A LALR generator for Java
Syntactic analyzer, built from two generated parts:
– Definition of tokens (regular expressions) -> JFlex -> Java file: scanner class (recognizes tokens)
– Grammar (BNF-like specification) -> JavaCUP -> Java file: parser class (uses the scanner to get tokens, parses the stream of tokens)

20 Steps to build a compiler with SableCC
1. Create a SableCC specification file
2. Call SableCC
3. Create one or more working classes, possibly inherited from classes generated by SableCC
4. Create a Main class activating lexer, parser and working classes
5. Compile with javac

21 Hierarchy

