1 Syntax Specification (Sections 2.1-2.2.1) CSCI 431 Programming Languages Fall 2003 A modification of slides developed by Felix Hernandez-Campos at UNC.

Presentation transcript:

1 Syntax Specification (Sections 2.1-2.2.1)
CSCI 431 Programming Languages, Fall 2003
A modification of slides developed by Felix Hernandez-Campos at UNC Chapel Hill

2 Phases of Compilation

3 Syntax Analysis
Syntax:
– Webster’s definition: 1 a : the way in which linguistic elements (as words) are put together to form constituents (as phrases or clauses)
The syntax of a programming language
– Describes its form
» i.e. the organization of tokens
– Formal notation
» Context-Free Grammars (CFGs)

4 Review: Formal definition of tokens
A set of tokens is a set of strings over an alphabet
– {read, write, +, -, *, /, :=, 1, 2, …, 10, …, 3.45e-3, …}
A set of tokens is a regular set that can be defined by comprehension using a regular expression
For every regular set, there is a deterministic finite automaton (DFA) that can recognize it
– i.e. determine whether a string belongs to the set or not
– Scanners extract tokens from source code in the same way DFAs determine membership
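To make the DFA idea concrete, here is a minimal sketch of a table-driven recognizer. The token set (strings of one or more decimal digits), the state names, and the helper names `classify` and `accepts` are all assumptions chosen for illustration; they do not come from the slides.

```python
# Hypothetical DFA accepting strings of one or more decimal digits.
# States: "start" (nothing read yet) and "digits" (at least one digit read).
TRANSITIONS = {
    ("start", "digit"): "digits",
    ("digits", "digit"): "digits",
}
ACCEPTING = {"digits"}


def classify(ch):
    """Map one input character to the symbol class used in the table."""
    return "digit" if ch in "0123456789" else "other"


def accepts(text):
    """Run the DFA over text; True if it ends in an accepting state."""
    state = "start"
    for ch in text:
        state = TRANSITIONS.get((state, classify(ch)))
        if state is None:  # no transition defined for this input: reject
            return False
    return state in ACCEPTING
```

A real scanner does more than test membership (it finds the longest matching prefix and then continues with the next token), but the state-transition loop is the same.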

5 Review: Regular Expressions
A regular expression (RE) is:
– A single character
– The empty string, ε
– The concatenation of two regular expressions
» Notation: RE1 RE2 (i.e. RE1 followed by RE2)
– The union of two regular expressions
» Notation: RE1 | RE2
– The closure of a regular expression
» Notation: RE*
» * is known as the Kleene star
» RE* represents the concatenation of 0 or more strings, each generated by RE

6 Review: Token Definition Example
Numeric literals in Pascal
– Definition of the token unsigned_number
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
unsigned_integer → digit digit*
unsigned_number → unsigned_integer ( ( . unsigned_integer ) | ε ) ( ( e ( + | – | ε ) unsigned_integer ) | ε )
Recursion is not allowed!
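The token definition above can be transcribed almost symbol for symbol into Python's `re` syntax. This is a sketch, not a Pascal scanner: the constant names and `is_unsigned_number` are made up for illustration, the empty alternative ε is written as an empty branch, and the minus sign is assumed to be the ASCII hyphen.

```python
import re

# Each definition is built only from earlier ones (no recursion).
DIGIT = r"[0-9]"
UNSIGNED_INTEGER = rf"{DIGIT}{DIGIT}*"
UNSIGNED_NUMBER = (
    rf"{UNSIGNED_INTEGER}"
    rf"(\.{UNSIGNED_INTEGER}|)"        # optional fraction: ( . unsigned_integer ) | ε
    rf"(e(\+|-|){UNSIGNED_INTEGER}|)"  # optional exponent: e ( + | - | ε ) unsigned_integer
)


def is_unsigned_number(text):
    """True if the whole string matches the unsigned_number definition."""
    return re.fullmatch(UNSIGNED_NUMBER, text) is not None
```

For example, `is_unsigned_number("3.45e-3")` holds, while `"1."` and `"-3"` are rejected because a fraction needs at least one digit and the definition has no leading sign.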

7 Exercise
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
unsigned_integer → digit digit*
unsigned_number → unsigned_integer ( ( . unsigned_integer ) | ε ) ( ( e ( + | – | ε ) unsigned_integer ) | ε )
Regular expression for
– Decimal numbers
number → …

8 Exercise
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
unsigned_integer → digit digit*
unsigned_number → unsigned_integer ( ( . unsigned_integer ) | ε ) ( ( e ( + | – | ε ) unsigned_integer ) | ε )
Regular expression for
– Decimal numbers
number → ( + | – | ε ) unsigned_integer ( ( . unsigned_integer ) | ε )

9 Exercise
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
unsigned_integer → digit digit*
unsigned_number → unsigned_integer ( ( . unsigned_integer ) | ε ) ( ( e ( + | – | ε ) unsigned_integer ) | ε )
Regular expression for
– Identifiers
identifier → …

10 Exercise
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
unsigned_integer → digit digit*
unsigned_number → unsigned_integer ( ( . unsigned_integer ) | ε ) ( ( e ( + | – | ε ) unsigned_integer ) | ε )
Regular expression for
– Identifiers
identifier → letter ( letter | digit | ε )*
letter → a | b | c | … | z
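The identifier answer can be checked the same way with Python's `re` module. In this sketch the names `LETTER`, `DIGIT`, `IDENTIFIER` and `is_identifier` are illustrative assumptions; any empty alternative inside the starred group would add no new strings, so it is left out.

```python
import re

LETTER = r"[a-z]"  # the slide lists only lowercase letters a..z
DIGIT = r"[0-9]"
IDENTIFIER = rf"{LETTER}({LETTER}|{DIGIT})*"  # a letter, then letters or digits


def is_identifier(text):
    """True if the whole string matches the identifier definition."""
    return re.fullmatch(IDENTIFIER, text) is not None
```

With this definition `slope` and `x1` are identifiers, while `1x` is not, because the first symbol must be a letter.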

11 Context-Free Grammars
CFGs
– Add recursion to regular expressions
» Nested constructions
– Notation
expression → identifier | number | - expression | ( expression ) | expression operator expression
operator → + | - | * | /
» Terminal symbols
» Non-terminal symbols
» Production rule (i.e. substitution rule): non-terminal symbol → terminal and non-terminal symbols

12 Backus-Naur Form
Backus-Naur Form (BNF)
– Equivalent to CFGs in power
– CFG
expression → identifier | number | - expression | ( expression ) | expression operator expression
operator → + | - | * | /
– BNF (really EBNF)
<expression> → <identifier> | <number> | - <expression> | ( <expression> ) | <expression> <operator> <expression>
<operator> → + | - | * | /

13 Extended Backus-Naur Form
Extended Backus-Naur Form (EBNF)
– Adds some convenient symbols
» Union: |
» Kleene star: *
» Meta-level parentheses: ( )
– It has the same expressive power

14 Extended Backus-Naur Form
Extended Backus-Naur Form (EBNF)
– It has the same expressive power
BNF
<digit> → 0
<digit> → 1
…
<digit> → 9
<unsigned_integer> → <digit>
<unsigned_integer> → <digit> <unsigned_integer>
EBNF
<digit> → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
<unsigned_integer> → <digit> <digit>*
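One way to see that the two notations describe the same strings is to write a recognizer for each form of unsigned_integer: the BNF rules become recursion, the EBNF star becomes iteration. The function names below are hypothetical and the code is only a sketch of that correspondence.

```python
DIGITS = "0123456789"


def matches_bnf(text):
    """<unsigned_integer> -> <digit> | <digit> <unsigned_integer> (recursive form)."""
    if not text or text[0] not in DIGITS:
        return False
    rest = text[1:]
    # Either the single-digit rule applies, or the rest is itself an unsigned_integer.
    return rest == "" or matches_bnf(rest)


def matches_ebnf(text):
    """<unsigned_integer> -> <digit> <digit>* (iterative form)."""
    if not text or text[0] not in DIGITS:
        return False
    # The Kleene star becomes a loop over the remaining characters.
    return all(ch in DIGITS for ch in text[1:])
```

Both functions accept `"2003"` and reject `""` and `"12a"`.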

15 Derivations
A derivation shows how to generate a syntactically valid string
– Given a CFG
– Example:
» CFG
expression → identifier | number | - expression | ( expression ) | expression operator expression
operator → + | - | * | /
» Derivation of slope * x + intercept

16 Derivation Example
Derivation of slope * x + intercept
expression ⇒ expression operator expression
⇒ expression operator intercept
⇒ expression + intercept
⇒ expression operator expression + intercept
⇒ expression operator x + intercept
⇒ expression * x + intercept
⇒ slope * x + intercept
expression ⇒* slope * x + intercept
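The derivation can be replayed mechanically: each step replaces the rightmost non-terminal in the current sentential form with the right-hand side of one production. The sketch below assumes that reading; `STEPS`, `rewrite_rightmost` and `derive` are illustrative names, and, as on the slide, an identifier is substituted for `expression` in a single step.

```python
NONTERMINALS = {"expression", "operator"}

# Right-hand sides applied at each step, in order.
STEPS = [
    ["expression", "operator", "expression"],  # expression -> expression operator expression
    ["intercept"],                             # expression -> identifier (intercept)
    ["+"],                                     # operator -> +
    ["expression", "operator", "expression"],  # expression -> expression operator expression
    ["x"],                                     # expression -> identifier (x)
    ["*"],                                     # operator -> *
    ["slope"],                                 # expression -> identifier (slope)
]


def rewrite_rightmost(form, replacement):
    """Replace the rightmost non-terminal in form with replacement."""
    for i in range(len(form) - 1, -1, -1):
        if form[i] in NONTERMINALS:
            return form[:i] + replacement + form[i + 1:]
    raise ValueError("no non-terminal left to rewrite")


def derive(start, steps):
    """Return every sentential form, from the start symbol to the final string."""
    forms = [[start]]
    for replacement in steps:
        forms.append(rewrite_rightmost(forms[-1], replacement))
    return forms


for form in derive("expression", STEPS):
    print(" ".join(form))
```

The last line printed is `slope * x + intercept`, which contains no non-terminals, so the derivation is complete.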

17 Parse Trees
A parse tree is a graphical representation of a derivation
Example

18 Ambiguous Grammars
Alternative parse tree
– same expression
– same grammar
This grammar is ambiguous
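Ambiguity can be checked by counting parse trees. The sketch below assumes the expression grammar used in the earlier slides (expression → identifier | number | - expression | ( expression ) | expression operator expression) and counts how many distinct trees derive a token list; `count_parse_trees` is a hypothetical helper, not something from the course material.

```python
from functools import lru_cache

OPERATORS = {"+", "-", "*", "/"}


def count_parse_trees(tokens):
    """Count the parse trees that derive tokens from `expression`."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        """Number of ways tokens[i:j] can be derived from expression."""
        if j - i == 1:
            # expression -> identifier | number
            return 0 if tokens[i] in OPERATORS | {"(", ")"} else 1
        total = 0
        if j - i >= 2 and tokens[i] == "-":
            total += count(i + 1, j)      # expression -> - expression
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":
            total += count(i + 1, j - 1)  # expression -> ( expression )
        for k in range(i + 1, j - 1):     # expression -> expression operator expression
            if tokens[k] in OPERATORS:
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees(["slope", "*", "x", "+", "intercept"]))  # prints 2
```

A count greater than 1 for any string is enough to show that the grammar is ambiguous; here the two trees correspond to grouping the string as (slope * x) + intercept or as slope * (x + intercept).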

19 Designing unambiguous grammars
Specify more grammatical structure
– In our example, left associativity and operator precedence
» 10 – 4 – 3 means (10 – 4) – 3
» 3 + 4 * 5 means 3 + (4 * 5)
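A common way to encode precedence and left associativity is to split the grammar into layers, for example expr → term (( + | - ) term)*, term → factor (( * | / ) factor)*, factor → number | - factor | ( expr ). That layered grammar is an assumption here, since the transcript does not show the one used on the slide. The sketch below is a small recursive-descent evaluator for it; it handles integer literals only and expects the ASCII hyphen as the minus sign.

```python
import re


def tokenize(text):
    """Split text into integer literals, operators and parentheses.
    Characters outside this small alphabet are silently skipped."""
    return re.findall(r"\d+|[-+*/()]", text)


def evaluate(text):
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        return token

    def expr():
        # expr -> term (( + | - ) term)*  : the loop makes + and - left associative
        value = term()
        while peek() in ("+", "-"):
            op = advance()
            right = term()
            value = value + right if op == "+" else value - right
        return value

    def term():
        # term -> factor (( * | / ) factor)*  : binds tighter than + and -
        value = factor()
        while peek() in ("*", "/"):
            op = advance()
            right = factor()
            value = value * right if op == "*" else value / right
        return value

    def factor():
        # factor -> number | - factor | ( expr )
        token = advance()
        if token == "-":
            return -factor()
        if token == "(":
            value = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if not token.isdigit():
            raise SyntaxError(f"unexpected token {token!r}")
        return int(token)

    result = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return result
```

With this structure `evaluate("10 - 4 - 3")` gives 3 and `evaluate("3 + 4 * 5")` gives 23, matching the groupings listed above.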

20 Example
Parse tree for 3 + 4 * 5
Exercise: parse tree for
- 10 / 5 * 8 - 4 - 5

21 Java Language Specification
Available on-line
– http://java.sun.com/docs/books/jls/second_edition/html/j.title.doc.html
Examples
– Comments: http://java.sun.com/docs/books/jls/second_edition/html/lexical.doc.html#48125
– Multiplicative Operators: http://java.sun.com/docs/books/jls/second_edition/html/expressions.doc.html#239829
– Unary Operators: http://java.sun.com/docs/books/jls/second_edition/html/expressions.doc.html#4990

