Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 COMP 144 Programming Language Concepts Felix Hernandez-Campos Lecture 4: Syntax Specification COMP 144 Programming Language Concepts Spring 2002 Felix.

Similar presentations


Presentation on theme: "1 COMP 144 Programming Language Concepts Felix Hernandez-Campos Lecture 4: Syntax Specification COMP 144 Programming Language Concepts Spring 2002 Felix."— Presentation transcript:

1 1 COMP 144 Programming Language Concepts Felix Hernandez-Campos Lecture 4: Syntax Specification COMP 144 Programming Language Concepts Spring 2002 Felix Hernandez-Campos Jan 16 The University of North Carolina at Chapel Hill

2 2 COMP 144 Programming Language Concepts Felix Hernandez-Campos Phases of Compilation

3 3 COMP 144 Programming Language Concepts Felix Hernandez-Campos Syntax Analysis Syntax:Syntax: –Webster’s definition: 1 a : the way in which linguistic elements (as words) are put together to form constituents (as phrases or clauses) The syntax of a programming languageThe syntax of a programming language –Describes its form »i.e. Organization of tokens (elements) –Formal notation »Context Free Grammars (CFGs)

4 4 COMP 144 Programming Language Concepts Felix Hernandez-Campos Review: Formal definition of tokens A set of tokens is a set of strings over an alphabetA set of tokens is a set of strings over an alphabet –{read, write, +, -, *, /, :=, 1, 2, …, 10, …, 3.45e-3, …} A set of tokens is a regular set that can be defined by comprehension using a regular expressionA set of tokens is a regular set that can be defined by comprehension using a regular expression For every regular set, there is a deterministic finite automaton (DFA) that can recognize itFor every regular set, there is a deterministic finite automaton (DFA) that can recognize it –i.e. determine whether a string belongs to the set or not –Scanners extract tokens from source code in the same way DFAs determine membership

5 5 COMP 144 Programming Language Concepts Felix Hernandez-Campos Review: Regular Expressions A regular expression (RE) is:A regular expression (RE) is: –A single character –The empty string,  –The concatenation of two regular expressions »Notation: RE 1 RE 2 (i.e. RE 1 followed by RE 2 ) –The union of two regular expressions »Notation: RE 1 | RE 2 –The closure of a regular expression »Notation: RE* »* is known as the Kleene star »* represents the concatenation of 0 or more strings

6 6 COMP 144 Programming Language Concepts Felix Hernandez-Campos Review: Token Definition Example Numeric literals in PascalNumeric literals in Pascal –Definition of the token unsigned_number digit  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 unsigned_integer  digit digit* unsigned_number  unsigned_integer ( (. unsigned_integer ) |  ) ( ( e ( + | – |  ) unsigned_integer ) |  ) Recursion is not allowed!Recursion is not allowed!

7 7 COMP 144 Programming Language Concepts Felix Hernandez-Campos Exercise digit  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 unsigned_integer  digit digit* unsigned_number  unsigned_integer ( (. unsigned_integer ) |  ) ( ( e ( + | – |  ) unsigned_integer ) |  ) Regular expression forRegular expression for –Decimal numbers number  …

8 8 COMP 144 Programming Language Concepts Felix Hernandez-Campos Exercise digit  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 unsigned_integer  digit digit* unsigned_number  unsigned_integer ( (. unsigned_integer ) |  ) ( ( e ( + | – |  ) unsigned_integer ) |  ) Regular expression forRegular expression for –Decimal numbers number  ( + | – |  ) unsigned_integer ( (  unsigned_integer ) |  )

9 9 COMP 144 Programming Language Concepts Felix Hernandez-Campos Exercise digit  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 unsigned_integer  digit digit* unsigned_number  unsigned_integer ( (. unsigned_integer ) |  ) ( ( e ( + | – |  ) unsigned_integer ) |  ) Regular expression forRegular expression for –Identifiers identifier  …

10 10 COMP 144 Programming Language Concepts Felix Hernandez-Campos Exercise digit  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 unsigned_integer  digit digit* unsigned_number  unsigned_integer ( (. unsigned_integer ) |  ) ( ( e ( + | – |  ) unsigned_integer ) |  ) Regular expression forRegular expression for –Identifiers identifier  letter ( letter | digit |  )* letter  a | b | c | … | z

11 11 COMP 144 Programming Language Concepts Felix Hernandez-Campos Context Free Grammars CFGsCFGs –Add recursion to regular expressions »Nested constructions –Notation expression  identifier | number | - expression | ( expression ) | ( expression ) | expression operator expression | expression operator expression operator  + | - | * | / »Terminal symbols »Non-terminal symbols »Production rule (i.e. substitution rule) terminal symbol  terminal and non-terminal symbols

12 12 COMP 144 Programming Language Concepts Felix Hernandez-Campos Backus-Naur Form Backus-Naur Form (BNF)Backus-Naur Form (BNF) –Equivalent to CFGs in power –CFG expression  identifier | number | - expression | ( expression ) | ( expression ) | expression operator expression | expression operator expression operator  + | - | * | / –BNF  expression    identifier  |  number  | -  expression  | (  expression  ) | (  expression  ) |  expression   operator   expression  |  expression   operator   expression   operator   + | - | * | /

13 13 COMP 144 Programming Language Concepts Felix Hernandez-Campos Extended Backus-Naur Form Extended Backus-Naur Form (EBNF)Extended Backus-Naur Form (EBNF) –Adds some convenient symbols »Union| »Kleene star* »Meta-level parentheses( ) –It has the same expressive power

14 14 COMP 144 Programming Language Concepts Felix Hernandez-Campos Extended Backus-Naur Form Extended Backus-Naur Form (EBNF)Extended Backus-Naur Form (EBNF) –It has the same expressive power BNF  digit   0  digit   1 …  digit   9  unsigned_integer    digit   unsigned_integer    digit   unsigned_integer  EBNF  digit   0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9  unsigned_integer    digit   digit  *

15 15 COMP 144 Programming Language Concepts Felix Hernandez-Campos Derivations A derivation shows how to generate a syntactically valid stringA derivation shows how to generate a syntactically valid string –Given a CFG –Example: »CFG expression  identifier | number | - expression | ( expression ) | ( expression ) | expression operator expression | expression operator expression operator  + | - | * | / »Derivation of slope * x + intercept

16 16 COMP 144 Programming Language Concepts Felix Hernandez-Campos Derivation Example Derivation of slope * x + interceptDerivation of slope * x + intercept expression  expression operator expression  expression operator intercept  expression operator intercept  expression + intercept  expression + intercept  expression operator expression + intercept  expression operator expression + intercept  expression operator x + intercept  expression operator x + intercept  expression * x + intercept  expression * x + intercept  slope * x + intercept  slope * x + intercept expression  * slope * x + intercept »Identifiers were not derived for simplicity

17 17 COMP 144 Programming Language Concepts Felix Hernandez-Campos Parse Trees A parse is graphical representation of a derivationA parse is graphical representation of a derivation ExampleExample

18 18 COMP 144 Programming Language Concepts Felix Hernandez-Campos Ambiguous Grammars Alternative parse treeAlternative parse tree –same expression –same grammar This grammar is ambiguousThis grammar is ambiguous

19 19 COMP 144 Programming Language Concepts Felix Hernandez-Campos Designing unambiguous grammars Specify more grammatical structureSpecify more grammatical structure –In our example, left associativity and operator precedence »10 – 4 – 3 means (10 – 4) – 3 »3 + 4 * 5 means 3 + (4 * 5)

20 20 COMP 144 Programming Language Concepts Felix Hernandez-Campos Example Parse tree for 3 + 4 * 5Parse tree for 3 + 4 * 5 Exercise: parse tree forExercise: parse tree for - 10 / 5 * 8 – 4 - 5 - 10 / 5 * 8 – 4 - 5

21 21 COMP 144 Programming Language Concepts Felix Hernandez-Campos Java Language Specification Available on-lineAvailable on-line –http://java.sun.com/docs/books/jls/second_edition/html/j.tit le.doc.html http://java.sun.com/docs/books/jls/second_edition/html/j.tit le.doc.htmlhttp://java.sun.com/docs/books/jls/second_edition/html/j.tit le.doc.html ExamplesExamples –Comments: http://java.sun.com/docs/books/jls/second_edition/html/lex ical.doc.html#48125 http://java.sun.com/docs/books/jls/second_edition/html/lex ical.doc.html#48125 http://java.sun.com/docs/books/jls/second_edition/html/lex ical.doc.html#48125 –Multiplicative Operators: http://java.sun.com/docs/books/jls/second_edition/html/ex pressions.doc.html#239829 http://java.sun.com/docs/books/jls/second_edition/html/ex pressions.doc.html#239829 http://java.sun.com/docs/books/jls/second_edition/html/ex pressions.doc.html#239829 –Unary Operators: http://java.sun.com/docs/books/jls/second_edition/html/ex pressions.doc.html#4990 http://java.sun.com/docs/books/jls/second_edition/html/ex pressions.doc.html#4990 http://java.sun.com/docs/books/jls/second_edition/html/ex pressions.doc.html#4990

22 22 COMP 144 Programming Language Concepts Felix Hernandez-Campos Reading Assignment Scott’s Chapter 2Scott’s Chapter 2 –Section 2.1.2 –Section 2.1.3 Java language specificationJava language specification –Chapter 2 (Grammars) –Glance at chapter 3 –Glance at sections 15.17, 15.18 and 15.15


Download ppt "1 COMP 144 Programming Language Concepts Felix Hernandez-Campos Lecture 4: Syntax Specification COMP 144 Programming Language Concepts Spring 2002 Felix."

Similar presentations


Ads by Google