Chapter 3 – Describing Syntax


1 Chapter 3 – Describing Syntax
CSCE 343

2 Syntax vs. Semantics
Syntax: the form or structure of the expressions, statements, and program units.
Semantics: the meaning of the expressions, statements, and program units.
Syntax + semantics = language definition
A language definition is used by other language designers, implementers, and programmers.

3 Terminology
Sentence: a string of characters over some alphabet
Language: a set of sentences
Lexeme: the lowest-level syntactic unit (e.g. +, (, {, while)
Token: a category of lexemes (e.g. identifier, open brace, int literal)
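
The lexeme/token distinction can be sketched with a small scanner. This is a minimal illustration, not a full lexer: the token names and the regular expressions in TOKEN_SPEC are assumptions chosen for this example, and a real language defines many more categories.

```python
import re

# Hypothetical token categories for illustration only.
# Order matters: the keyword pattern must be tried before the identifier pattern.
TOKEN_SPEC = [
    ("INT_LITERAL", r"\d+"),
    ("KEYWORD",     r"\bwhile\b"),
    ("IDENTIFIER",  r"[A-Za-z_]\w*"),
    ("PLUS",        r"\+"),
    ("OPEN_PAREN",  r"\("),
    ("CLOSE_PAREN", r"\)"),
    ("OPEN_BRACE",  r"\{"),
    ("CLOSE_BRACE", r"\}"),
    ("SKIP",        r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (token, lexeme) pairs; raise on an unknown character."""
    pairs = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":          # whitespace separates lexemes
            pairs.append((match.lastgroup, match.group()))
        pos = match.end()
    return pairs


for token, lexeme in tokenize("while (count) { total + 1 }"):
    print(f"{token:12} {lexeme}")
```

Here `count` and `total` are two different lexemes that belong to the same token, IDENTIFIER.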

4 Chomsky’s Classes of Grammars
Type-0: Unrestricted Grammars
Type-1: Context-Sensitive Grammars
Type-2: Context-Free Grammars
Type-3: Regular Grammars
Types 2 and 3 are the most useful for describing programming languages.

5 Formal Methods For Describing Syntax
Backus-Naur Form (BNF) and Context-Free Grammars
  Most widely known method for describing programming language syntax
Extended BNF
  Improves readability and writability of BNF
Grammars and Recognizers

6 Backus-Naur Form (BNF)
Invented by John Backus to describe Algol 58
BNF is equivalent to context-free grammars
BNF is a metalanguage (a language used to describe other languages)
Abstractions (called non-terminals) represent classes of syntactic structures; they act like variables

7 BNF Fundamentals
Non-terminals: BNF abstractions
Terminals: lexemes and tokens
Grammar: a collection of rules
Example rule:
  <while_stmt> → while ( <logic_expr> ) <block>

8 BNF Rules
A rule has a left-hand side (LHS) and a right-hand side (RHS)
The LHS is a single non-terminal
The RHS consists of terminals and non-terminals
A grammar is a set of rules
A non-terminal can have more than one RHS; the alternatives are separated by |
Lists are described with recursion:
  <ident_list> → identifier | identifier , <ident_list>
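
The recursive list rule maps directly onto a recursive function. The sketch below is a hypothetical recognizer, not part of the original slides; it assumes the input is already split into a list of strings and approximates "identifier" with Python's `str.isidentifier()`.

```python
def is_ident_list(tokens):
    """Recognize <ident_list> -> identifier | identifier , <ident_list>.

    `tokens` is a list of strings. An identifier is approximated by
    str.isidentifier(), which is an assumption made for this example.
    """
    if not tokens or not tokens[0].isidentifier():
        return False
    if len(tokens) == 1:
        return True                      # first alternative: identifier
    # second alternative: identifier , <ident_list>
    return tokens[1] == "," and is_ident_list(tokens[2:])


print(is_ident_list(["a", ",", "b", ",", "c"]))
print(is_ident_list(["a", ","]))
```

Each recursive call consumes one identifier and one comma, mirroring one application of the second alternative.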

9 Derivations
For a sentence to be in the language, it must be derivable from the grammar.
Derivation: repeated application of rules, starting with the start symbol (a non-terminal) and ending with a sentence (all terminal symbols)
Leftmost vs. rightmost derivations

10 Example Grammar
Grammar:
  <program> → <stmts>
  <stmts> → <stmt> | <stmt> ; <stmts>
  <stmt> → <var> = <expr>
  <var> → a | b | c | d
  <expr> → <var> + <var> | <var> - <var>
Derivation:
  <program> => <stmts>
            => <stmt>
            => <var> = <expr>
            => a = <expr>
            => a = <var> + <var>
            => a = b + <var>
            => a = b + c
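
The derivation above can be replayed mechanically. The following is a minimal sketch under stated assumptions: the dictionary GRAMMAR encodes the slide's rules, while the function name `leftmost_derivation` and the idea of passing a list of alternative indices are inventions for this example.

```python
# The example grammar; each non-terminal maps to its list of alternatives.
GRAMMAR = {
    "<program>": [["<stmts>"]],
    "<stmts>":   [["<stmt>"], ["<stmt>", ";", "<stmts>"]],
    "<stmt>":    [["<var>", "=", "<expr>"]],
    "<var>":     [["a"], ["b"], ["c"], ["d"]],
    "<expr>":    [["<var>", "+", "<var>"], ["<var>", "-", "<var>"]],
}


def leftmost_derivation(choices, start="<program>"):
    """Apply one rule per step to the leftmost non-terminal.

    `choices` gives, for each step, the index of the alternative to use.
    Returns every sentential form, beginning with the start symbol.
    """
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # Locate the leftmost non-terminal in the current sentential form.
        i = next(k for k, symbol in enumerate(form) if symbol in GRAMMAR)
        form = form[:i] + GRAMMAR[form[i]][choice] + form[i + 1:]
        steps.append(" ".join(form))
    return steps


# Alternative indices that reproduce the derivation of "a = b + c".
for sentential_form in leftmost_derivation([0, 0, 0, 0, 0, 1, 2]):
    print("=>", sentential_form)
```

The last form printed contains only terminals, so it is a sentence of the language.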

11 Practice Write a BNF grammar for Java Boolean expressions.

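12 Derivation
Each step in a derivation is a sentential form
Sentence: a sentential form that has only terminals
Leftmost derivation: the leftmost non-terminal is expanded at each step
Rightmost derivation: the rightmost non-terminal is expanded at each step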

13 Practice Use your grammar to show a leftmost and a rightmost derivation of the expression: a < b && c == d

14 Parse Trees
A hierarchical representation of a derivation
  <program> => <stmts>
            => <stmt>
            => <var> = <expr>
            => a = <expr>
            => a = <var> + <var>
            => a = b + <var>
            => a = b + c
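
As a rough illustration, the parse tree for this derivation can be written as nested tuples and printed with indentation. The tuple encoding and the helper names `leaves` and `render` are assumptions made for this sketch.

```python
# Parse tree for "a = b + c": each interior node is (label, child, ...);
# leaves are plain strings (terminals).
TREE = ("<program>",
        ("<stmts>",
         ("<stmt>",
          ("<var>", "a"),
          "=",
          ("<expr>", ("<var>", "b"), "+", ("<var>", "c")))))


def leaves(node):
    """Read the terminals left to right; this recovers the sentence."""
    if isinstance(node, str):
        return [node]
    result = []
    for child in node[1:]:
        result.extend(leaves(child))
    return result


def render(node, depth=0):
    """Return the tree as indented lines, one node per line."""
    label = node if isinstance(node, str) else node[0]
    lines = ["  " * depth + label]
    if not isinstance(node, str):
        for child in node[1:]:
            lines.extend(render(child, depth + 1))
    return lines


print("\n".join(render(TREE)))
print(" ".join(leaves(TREE)))
```

Every interior node corresponds to one rule application in the derivation, and the leaves spell out the derived sentence.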

15 Practice Use your grammar to show a parse tree of the expression:
!a || c <= d

16 Parse Trees and Semantics
Compilers generate code by traversing parse trees.
Semantics are derived from the “shape” of the tree.
Example: math expressions
  Operations lower in the tree occur first.
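
The effect of tree shape on meaning can be sketched with a post-order evaluator. This example uses a simplified operator tree rather than a full parse tree, and the tuple encoding `(operator, left, right)` is an assumption for illustration.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}


def evaluate(node):
    """Post-order evaluation: both children are evaluated before their parent."""
    if isinstance(node, (int, float)):
        return node
    op, left, right = node
    return OPS[op](evaluate(left), evaluate(right))


# Two trees over the same operands 2, 3, 4 and the same operators + and *.
tree_a = ("+", 2, ("*", 3, 4))   # * is lower in the tree, so it is applied first
tree_b = ("*", ("+", 2, 3), 4)   # + is lower in the tree, so it is applied first

print(evaluate(tree_a), evaluate(tree_b))
```

The two trees contain the same symbols but produce different values, which is why the shape the grammar assigns to an expression matters.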

17 Ambiguous Grammars
A grammar is ambiguous if some sentence has two or more distinct parse trees.
  <expr> → <expr> <op> <expr> | <id>
  <op> → * | +
  <id> → a | b | c | d
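
One way to see the ambiguity is to enumerate every parse of a sentence under the rule <expr> → <expr> <op> <expr> | <id>. The sketch below is a brute-force illustration, not an efficient parser; the function name `parse_trees` and the parenthesized-string output are assumptions for this example.

```python
IDS = {"a", "b", "c", "d"}
OPS = {"*", "+"}


def parse_trees(tokens):
    """Return every parse of `tokens` as a fully parenthesized string.

    Each operator position is tried as the root <op> of
    <expr> -> <expr> <op> <expr>; a single token must be an <id>.
    """
    if len(tokens) == 1:
        return [tokens[0]] if tokens[0] in IDS else []
    trees = []
    for i in range(1, len(tokens) - 1):
        if tokens[i] in OPS:
            for left in parse_trees(tokens[:i]):
                for right in parse_trees(tokens[i + 1:]):
                    trees.append(f"({left} {tokens[i]} {right})")
    return trees


for tree in parse_trees("a + b * c".split()):
    print(tree)
```

For `a + b * c` the function finds two structures, one grouping `b * c` first and one grouping `a + b` first, so the grammar does not determine which operation is performed first.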

18 Ambiguous Grammars
Remove the double recursion (the same non-terminal on both sides of an operator) to create an unambiguous grammar:
  <expr> → <expr> + <term> | <term>
  <term> → <term> / <id> | <id>
  <id> → a | b | c | d

19 Operator Associativity
Associativity is indicated by the position of the recursion:
  <expr> → <expr> + <term>   (left recursive: left associative)
  <expr> → <term> + <expr>   (right recursive: right associative)
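
Because addition gives the same value under either grouping, the sketch below swaps in subtraction (an assumption made only so the difference is observable) and evaluates the same token list under a left-recursive and a right-recursive rule. The function names are hypothetical, and operands are restricted to integer literals.

```python
def eval_left(tokens):
    """<expr> -> <expr> - <term> | <term>: recursion on the left operand."""
    if len(tokens) == 1:
        return int(tokens[0])
    *rest, op, last = tokens          # split at the LAST operator
    if op != "-":
        raise ValueError(f"expected '-', got {op!r}")
    return eval_left(rest) - int(last)


def eval_right(tokens):
    """<expr> -> <term> - <expr> | <term>: recursion on the right operand."""
    if len(tokens) == 1:
        return int(tokens[0])
    first, op, *rest = tokens         # split at the FIRST operator
    if op != "-":
        raise ValueError(f"expected '-', got {op!r}")
    return int(first) - eval_right(rest)


tokens = "8 - 3 - 2".split()
print(eval_left(tokens))    # groups as (8 - 3) - 2
print(eval_right(tokens))   # groups as 8 - (3 - 2)
```

The left-recursive rule groups operands from the left, and the right-recursive rule groups them from the right.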

20 Extended BNF
Optional parts are placed in brackets []
  <proc_call> → <ident> [ ( <expr_list> ) ]
Alternatives are placed in parentheses and separated by |
  <term> → <term> (+|-) <const>
Repetitions (0 or more) are placed in braces {}
  <ident> → letter { letter | digit }

21 BNF and EBNF
BNF
  <expr> → <expr> + <term> | <expr> - <term> | <term>
  <term> → <term> * <factor> | <term> / <factor> | <factor>
EBNF
  <expr> → <term> { ( + | - ) <term> }
  <term> → <factor> { ( * | / ) <factor> }
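
The EBNF form translates almost line for line into a recursive-descent evaluator, with each { ... } repetition becoming a loop. This is a minimal sketch: the slides do not define <factor>, so it is assumed here to be an unsigned integer literal, and the input is assumed to be whitespace-separated.

```python
def parse_expr(tokens, pos=0):
    """<expr> -> <term> { ( + | - ) <term> }"""
    value, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        rhs, pos = parse_term(tokens, pos + 1)
        value = value + rhs if op == "+" else value - rhs
    return value, pos


def parse_term(tokens, pos):
    """<term> -> <factor> { ( * | / ) <factor> }"""
    value, pos = parse_factor(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("*", "/"):
        op = tokens[pos]
        rhs, pos = parse_factor(tokens, pos + 1)
        value = value * rhs if op == "*" else value / rhs
    return value, pos


def parse_factor(tokens, pos):
    """Assumption: <factor> is an unsigned integer literal."""
    if pos >= len(tokens) or not tokens[pos].isdigit():
        raise ValueError(f"expected an integer at token {pos}")
    return int(tokens[pos]), pos + 1


def evaluate(text):
    tokens = text.split()
    value, pos = parse_expr(tokens)
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return value


print(evaluate("2 + 3 * 4"))
print(evaluate("8 - 3 - 2"))
```

The loops accumulate from left to right, so the operators stay left associative, and <term> sits below <expr>, so * and / bind more tightly than + and -. The left-recursive BNF rules cannot be coded as direct recursive calls without looping forever, which is one practical reason to prefer the EBNF form here.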

