1
Chapter 3 – Describing Syntax
CSCE 343
2
Syntax vs. Semantics
Syntax: the form or structure of the expressions, statements, and program units.
Semantics: the meaning of the expressions, statements, and program units.
Syntax + semantics = language definition
Used by: other language designers, implementers, and programmers
3
Terminology
Sentence: a string of characters over some alphabet
Language: a set of sentences
Lexeme: the lowest-level syntactic unit (e.g. +, (, {, while)
Token: a category of lexemes (e.g. identifier, open brace, int literal)
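The lexeme/token distinction can be illustrated with a small scanner. This is only a sketch: the token category names and the regular expressions below are assumptions chosen for illustration, not part of the slides.

```python
import re

# Hypothetical token categories; names and patterns are illustrative only.
TOKEN_SPEC = [
    ("WHILE",       r"while\b"),
    ("IDENTIFIER",  r"[A-Za-z_][A-Za-z0-9_]*"),
    ("INT_LITERAL", r"\d+"),
    ("PLUS",        r"\+"),
    ("OPEN_PAREN",  r"\("),
    ("CLOSE_PAREN", r"\)"),
    ("OPEN_BRACE",  r"\{"),
    ("CLOSE_BRACE", r"\}"),
    ("SKIP",        r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return (token, lexeme) pairs; raise ValueError on an unknown character."""
    pairs = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # whitespace separates lexemes but is not a token
            pairs.append((match.lastgroup, match.group()))
        pos = match.end()
    return pairs
```

Calling tokenize("while (count + 1) {") pairs each lexeme with its category, so the lexemes "count" and "total" would both map to the single token IDENTIFIER.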
4
Chomsky’s Classes of Grammars
Type-0: Unrestricted Grammars
Type-1: Context-Sensitive Grammars
Type-2: Context-Free Grammars
Type-3: Regular Grammars
Types 2 and 3 are the most useful for describing programming languages
5
Formal Methods For Describing Syntax
Backus-Naur Form (BNF) and Context-Free Grammars: the most widely known method for describing programming language syntax
Extended BNF: improves the readability and writability of BNF
Grammars and Recognizers
6
Backus-Naur Form (BNF)
Invented by John Backus to describe Algol 58
BNF is equivalent to context-free grammars
BNF is a metalanguage (a language used to describe other languages)
Abstractions (called nonterminals) represent classes of syntactic structures and act like variables
7
BNF Fundamentals
Non-terminals: BNF abstractions
Terminals: lexemes and tokens
Grammar: a collection of rules
Example rule: <while_stmt> → while ( <logic_expr> ) <block>
8
BNF Rules
A rule has a left-hand side (LHS) and a right-hand side (RHS)
The LHS is a single non-terminal
The RHS consists of terminals and non-terminals
A grammar is a set of rules
A non-terminal can have more than one RHS, separated by |
Recursion describes lists: <ident_list> → identifier | identifier , <ident_list>
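The recursive list rule maps directly onto a recursive function. The following is a minimal sketch, assuming the input is already split into tokens and that an identifier looks like a typical programming-language name; the IDENT pattern and the function name are assumptions, not part of the slides.

```python
import re

# Assumed shape of an identifier; the slides do not define it.
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


def is_ident_list(tokens):
    """Recognize <ident_list> -> identifier | identifier , <ident_list>."""
    if not tokens or not IDENT.match(tokens[0]):
        return False
    if len(tokens) == 1:
        return True  # first alternative: a single identifier
    # second alternative: identifier , <ident_list>
    return tokens[1] == "," and is_ident_list(tokens[2:])
```

Each recursive call consumes one identifier and one comma, mirroring how the rule's second alternative refers back to <ident_list>.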
9
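Derivations
For a sentence to belong to a language, it must be derivable from the language's grammar.
Derivation: repeated application of rules, starting with the start symbol (a non-terminal) and ending with a sentence (all terminal symbols)
Leftmost vs. rightmost: a leftmost derivation always expands the leftmost non-terminal of the current form; a rightmost derivation always expands the rightmost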
10
Example Grammar
Grammar:
<program> → <stmts>
<stmts> → <stmt> | <stmt> ; <stmts>
<stmt> → <var> = <expr>
<var> → a | b | c | d
<expr> → <var> + <var> | <var> - <var>
Derivation:
<program> => <stmts>
          => <stmt>
          => <var> = <expr>
          => a = <expr>
          => a = <var> + <var>
          => a = b + <var>
          => a = b + c
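The derivation above can be reproduced mechanically. The sketch below encodes the example grammar as a Python dictionary and replaces the leftmost non-terminal at each step; the function name and the idea of driving it with a list of alternative indexes are assumptions made for illustration.

```python
# The example grammar: each non-terminal maps to its list of alternatives (RHSs).
GRAMMAR = {
    "<program>": [["<stmts>"]],
    "<stmts>":   [["<stmt>"], ["<stmt>", ";", "<stmts>"]],
    "<stmt>":    [["<var>", "=", "<expr>"]],
    "<var>":     [["a"], ["b"], ["c"], ["d"]],
    "<expr>":    [["<var>", "+", "<var>"], ["<var>", "-", "<var>"]],
}


def leftmost_derivation(choices, start="<program>"):
    """Apply one rule per entry in `choices`, always to the leftmost non-terminal.

    Each entry is the index of the alternative to use for that non-terminal.
    Returns the list of sentential forms, starting with the start symbol.
    """
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # Locate the leftmost symbol that is still a non-terminal.
        index = next(i for i, symbol in enumerate(form) if symbol in GRAMMAR)
        form = form[:index] + GRAMMAR[form[index]][choice] + form[index + 1:]
        steps.append(" ".join(form))
    return steps


for step in leftmost_derivation([0, 0, 0, 0, 0, 1, 2]):
    print(step)
```

With the choices [0, 0, 0, 0, 0, 1, 2] the printed forms match the derivation of a = b + c step for step.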
11
Practice Write a BNF grammar for Java Boolean expressions.
12
Derivation
Each step in the derivation is a sentential form
Sentence: a sentential form that has only terminals
Leftmost vs. rightmost
13
Practice Use your grammar to show a leftmost and a rightmost derivation of the expression: a < b && c == d
14
Parse Trees
A hierarchical representation of a derivation
<program> => <stmts>
          => <stmt>
          => <var> = <expr>
          => a = <expr>
          => a = <var> + <var>
          => a = b + <var>
          => a = b + c
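One way to picture the parse tree for this derivation is as nested data. The sketch below hand-builds the tree for a = b + c as (label, children) pairs; this representation and the helper names are assumptions for illustration, not something the slides prescribe.

```python
# Parse tree for "a = b + c": every node is (label, children); leaves have no children.
TREE = ("<program>", [
    ("<stmts>", [
        ("<stmt>", [
            ("<var>", [("a", [])]),
            ("=", []),
            ("<expr>", [
                ("<var>", [("b", [])]),
                ("+", []),
                ("<var>", [("c", [])]),
            ]),
        ]),
    ]),
])


def leaves(node):
    """Collect the leaf labels from left to right."""
    label, children = node
    if not children:
        return [label]
    result = []
    for child in children:
        result.extend(leaves(child))
    return result


def render(node, depth=0):
    """Return one indented line per node, parents before children."""
    label, children = node
    lines = ["  " * depth + label]
    for child in children:
        lines.extend(render(child, depth + 1))
    return lines


print("\n".join(render(TREE)))
print(" ".join(leaves(TREE)))
```

Interior nodes are the non-terminals expanded during the derivation, and reading the leaves from left to right gives back the sentence.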
15
Practice Use your grammar to show a parse tree of the expression:
!a || c <= d
16
Parse Trees and Semantics
Compilers generate code by traversing parse trees.
Semantics are derived from the “shape” of the trees.
Example: math expressions. Operations lower in the tree occur first.
17
Ambiguous Grammars
A grammar is ambiguous if some sentence has two or more distinct parse trees. Example:
<expr> → <expr> <op> <expr> | <id>
<op> → * | +
<id> → a | b | c | d
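The ambiguity of this grammar can be checked by counting how many distinct parse trees a sentence has. The sketch below does that for the rule with <expr> on both sides of <op>, trying every operator as the root of the tree; the function name and the token-list input format are assumptions.

```python
from functools import lru_cache

IDS = {"a", "b", "c", "d"}   # <id> -> a | b | c | d
OPS = {"*", "+"}             # <op> -> * | +


def count_parse_trees(tokens):
    """Count parse trees for tokens under <expr> -> <expr> <op> <expr> | <id>."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of ways <expr> can derive tokens[lo:hi].
        if hi - lo == 1:
            return 1 if tokens[lo] in IDS else 0
        total = 0
        # Try each operator position as the <op> at the root of this subtree.
        for mid in range(lo + 1, hi - 1):
            if tokens[mid] in OPS:
                total += count(lo, mid) * count(mid + 1, hi)
        return total

    return count(0, len(tokens))


print(count_parse_trees("a + b * c".split()))
```

A result greater than 1 means the sentence has more than one parse tree; for a + b * c either operator can sit at the root, so the two trees group the operands differently.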
18
Ambiguous Grammars
Remove the multiple recursion (<expr> on both sides of the operator) to create an unambiguous grammar:
<expr> → <expr> + <term> | <term>
<term> → <term> / <id> | <id>
<id> → a | b | c | d
19
Operator Associativity
Associativity is indicated by the direction of recursion:
<expr> → <expr> + <term> (left recursive: left associative)
<expr> → <term> + <expr> (right recursive: right associative)
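The effect of the recursion direction shows up in the shape of the resulting tree. The sketch below builds nested tuples for both rules; it is a simplification that assumes a well-formed token list in which every <term> is a single name, and it handles the left-recursive rule with a loop because a direct recursive call would never terminate.

```python
def parse_left(tokens):
    """<expr> -> <expr> + <term> | <term>: operands group from the left."""
    tree = tokens[0]
    for i in range(1, len(tokens), 2):
        # Everything parsed so far becomes the left operand of the next operator.
        tree = (tree, tokens[i], tokens[i + 1])
    return tree


def parse_right(tokens):
    """<expr> -> <term> + <expr> | <term>: operands group from the right."""
    if len(tokens) == 1:
        return tokens[0]
    # The rest of the input becomes the right operand.
    return (tokens[0], tokens[1], parse_right(tokens[2:]))


print(parse_left("a + b + c".split()))   # (('a', '+', 'b'), '+', 'c')
print(parse_right("a + b + c".split()))  # ('a', '+', ('b', '+', 'c'))
```

In the left-recursive version the first + ends up lower in the tree and is therefore applied first; in the right-recursive version the last + is.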
20
Extended BNF
Optional parts are placed in brackets []:
<proc_call> → <ident> [ ( <expr_list> ) ]
Alternatives are placed in parentheses:
<term> → <term> (+|-) <const>
Repetitions (0 or more) are placed in braces {}:
<ident> → letter { letter | digit }
21
BNF and EBNF
BNF:
<expr> → <expr> + <term> | <expr> - <term> | <term>
<term> → <term> * <factor> | <term> / <factor> | <factor>
EBNF:
<expr> → <term> { ( + | - ) <term> }
<term> → <factor> { ( * | / ) <factor> }
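The EBNF form translates almost line for line into a recursive-descent evaluator: each rule becomes a function and each { ... } repetition becomes a while loop. The sketch below assumes that <factor> is an unsigned integer literal (the slides leave <factor> undefined) and that the input is already split into tokens.

```python
def evaluate(tokens):
    """Evaluate a token list using the two EBNF rules for <expr> and <term>."""
    pos = 0

    def factor():
        # Assumption: <factor> is an unsigned integer literal.
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError(f"expected a number at token {pos}")
        value = int(tokens[pos])
        pos += 1
        return value

    def term():
        # <term> -> <factor> { ( * | / ) <factor> }
        nonlocal pos
        value = factor()
        while pos < len(tokens) and tokens[pos] in ("*", "/"):
            op = tokens[pos]
            pos += 1
            right = factor()
            value = value * right if op == "*" else value / right
        return value

    def expr():
        # <expr> -> <term> { ( + | - ) <term> }
        nonlocal pos
        value = term()
        while pos < len(tokens) and tokens[pos] in ("+", "-"):
            op = tokens[pos]
            pos += 1
            right = term()
            value = value + right if op == "+" else value - right
        return value

    result = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return result


print(evaluate("2 + 3 * 4".split()))   # 14
print(evaluate("10 - 4 - 3".split()))  # 3
```

Because each loop folds its operands from the left, the operators come out left associative, and because <term> is parsed inside <expr>, * and / bind more tightly than + and -.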