Chapter 2: Syntax

"A language that is simple to parse for the compiler is also simple to parse for the human programmer." N. Wirth



2.1 Grammars
    Backus-Naur Form
    Derivations
    Parse Trees
    Associativity and Precedence
    Ambiguous Grammars
2.2 Extended BNF
2.3 Syntax of a Small Language: Clite
    Lexical Syntax
    Concrete Syntax
2.4 Compilers and Interpreters
2.5 Linking Syntax and Semantics
    Abstract Syntax
    Abstract Syntax Trees
    Abstract Syntax of Clite

Summary review:
• Programming language principles
  ◦ Grammars, syntax, semantics
• What makes a language successful?
  ◦ Reliable, readable, writable
  ◦ Supported by simplicity, orthogonality, efficiency, …
• Paradigms
  ◦ Imperative, object-oriented, functional, logic
• Language implementation
  ◦ Compilers, interpreters, combinations

• Syntax has three levels:
  ◦ Lexical syntax: describes the basic language symbols (names, operators, etc.)
  ◦ Concrete syntax: the rules for writing expressions, statements, and programs; describes the external representation of a program
  ◦ Abstract syntax: describes an internal representation of the program; emphasizes content over form; derived during parsing
• Clite, a mini-language, is used as a teaching tool in the study of syntax and semantics.

• A metalanguage is a language used to define other languages.
• A grammar is a set of rules, written in a metalanguage, and used to define the concrete syntax of a language.
• Programming languages are defined by a context-free grammar (more about this later).
• Syntax can be defined in ways other than by formal grammars, e.g., in a natural language or with syntax diagrams.

• Backus-Naur Form (BNF) is a notation for describing a context-free grammar.
  ◦ Sometimes called Backus Normal Form
• First used to define the syntax of Algol 60
• Now used to define the (concrete) syntax of most major languages

• A grammar is a set of:
  ◦ productions: P (productions = rules)
  ◦ terminal symbols: T
  ◦ nonterminal symbols: N
  ◦ start symbol: S ∈ N
• A production has the form A → ω, where A is a nonterminal symbol and ω is a string of symbols drawn from N and T.

• Consider the grammar:
    Integer → Digit | Integer Digit
    Digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
• The productions are the rules for Integer and Digit.
• Nonterminals: Integer, Digit
• Start symbol: Integer
• Terminals: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
• Metasymbols: →, |

• Grammars such as the Integer grammar can be used in several ways:
  1. Theoretically, to produce (derive) all legal strings, starting with the start symbol. Practically, to show how to write syntactically correct "sentences" in the language described by the grammar. (Programmers use grammars this way.)
  2. To show that a particular string is or is not correctly formed. (Language translators, i.e., compilers and interpreters, use grammars this way.)
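The second use, deciding whether a string is correctly formed, can be sketched as a tiny recognizer whose two functions mirror the two rules of the Integer grammar. The function names `is_digit` and `is_integer` are made up for this illustration, and real translators use more efficient techniques.

```python
DIGITS = "0123456789"

def is_digit(s):
    """Digit -> 0 | 1 | ... | 9"""
    return len(s) == 1 and s in DIGITS

def is_integer(s):
    """Integer -> Digit | Integer Digit"""
    if len(s) == 1:
        return is_digit(s)                      # first alternative
    # Second alternative: everything but the last character must be an
    # Integer, and the last character must be a Digit.
    return len(s) > 1 and is_integer(s[:-1]) and is_digit(s[-1])

print(is_integer("352"))  # True
print(is_integer("3a2"))  # False
```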

• Deriving 352 as an Integer is a 6-step process that begins with the start symbol.
• Rule 1: Integer → Digit | Integer Digit
• At each step, replace a nonterminal by the right-hand side (RHS) of one of its rules:
    Step 1: Integer ⇒ Integer Digit
    Step 2:         ⇒ Integer Digit Digit
    Step 3:         ⇒ Digit Digit Digit
    Step 4:         ⇒ 3 Digit Digit
    Step 5:         ⇒ 3 5 Digit
    Step 6:         ⇒ 3 5 2
• The derivation is finished when only terminals remain on the RHS.
• This is a leftmost derivation: at each step the leftmost nonterminal is replaced.

    Integer ⇒ Integer Digit
            ⇒ Integer 2
            ⇒ Integer Digit 2
            ⇒ Integer 5 2
            ⇒ Digit 5 2
            ⇒ 3 5 2
• This is called a rightmost derivation, since at each step the rightmost nonterminal is replaced.
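The two derivations of 352 can be replayed mechanically. The following is a minimal sketch, assuming the grammar is stored as a dictionary; the helper name `derive` and its parameters are hypothetical. The caller supplies the sequence of right-hand sides, and the function only chooses which nonterminal to replace.

```python
GRAMMAR = {
    "Integer": [["Digit"], ["Integer", "Digit"]],
    "Digit": [[d] for d in "0123456789"],
}

def derive(start, rhs_choices, leftmost=True):
    """Replay a derivation, replacing the leftmost (or rightmost)
    nonterminal with each chosen right-hand side in turn."""
    form = [start]
    steps = [" ".join(form)]
    for rhs in rhs_choices:
        positions = [i for i, sym in enumerate(form) if sym in GRAMMAR]
        if not positions:
            raise ValueError("no nonterminal left to replace")
        i = positions[0] if leftmost else positions[-1]
        if rhs not in GRAMMAR[form[i]]:
            raise ValueError(f"{form[i]} -> {' '.join(rhs)} is not a production")
        form[i:i + 1] = rhs                     # substitute the RHS in place
        steps.append(" ".join(form))
    return steps

leftmost_steps = derive(
    "Integer",
    [["Integer", "Digit"], ["Integer", "Digit"], ["Digit"], ["3"], ["5"], ["2"]],
)
rightmost_steps = derive(
    "Integer",
    [["Integer", "Digit"], ["2"], ["Integer", "Digit"], ["5"], ["Digit"], ["3"]],
    leftmost=False,
)

print(" => ".join(leftmost_steps))
print(" => ".join(rightmost_steps))
```

Both runs end in the same terminal string, 3 5 2; only the order in which nonterminals are replaced differs.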

• The language L defined by a BNF grammar G is the set of all terminal strings that can be derived from the start symbol.
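Because this set is usually infinite, it can only be listed up to some bound. The sketch below enumerates every terminal string of length at most `max_len` by exploring sentential forms breadth-first. The function name `language` is hypothetical, and the pruning step assumes a grammar without ε-productions whose terminals are single characters, which holds for the Integer grammar.

```python
from collections import deque

GRAMMAR = {
    "Integer": [["Digit"], ["Integer", "Digit"]],
    "Digit": [[d] for d in "0123456789"],
}

def language(grammar, start, max_len):
    """Return all terminal strings of length <= max_len derivable from start."""
    seen = set()
    result = set()
    queue = deque([(start,)])
    while queue:
        form = queue.popleft()
        # Find the leftmost nonterminal, if any.
        idx = next((i for i, sym in enumerate(form) if sym in grammar), None)
        if idx is None:
            result.add("".join(form))           # only terminals remain
            continue
        for rhs in grammar[form[idx]]:
            new_form = form[:idx] + tuple(rhs) + form[idx + 1:]
            # Without epsilon-productions a form never gets shorter,
            # so forms longer than max_len can be discarded.
            if len(new_form) <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return result

strings = language(GRAMMAR, "Integer", 2)
print(len(strings))  # 110: ten 1-digit strings plus one hundred 2-digit strings
```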

• A parse tree is a graphical representation of a derivation.
  ◦ The root node of the tree is the start symbol.
  ◦ Each internal node of the tree corresponds to a nonterminal.
  ◦ The child(ren) of a node represent a right-hand side of a production for which the node is the left-hand side.
  ◦ Each leaf node represents a terminal symbol of the derived string, reading from left to right.
  ◦ Leaves must match the original string.

Figure 2.1: Parse Tree for 352 as an Integer
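A parse tree like the one for 352 can be modelled as nested tuples, with the node label first and the children after it. This representation and the names `parse_integer` and `leaves` are assumptions made for illustration only.

```python
DIGITS = "0123456789"

def parse_integer(s):
    """Build a parse tree for s using Integer -> Digit | Integer Digit."""
    if not s or any(ch not in DIGITS for ch in s):
        raise ValueError(f"{s!r} is not derivable from Integer")
    digit = ("Digit", s[-1])
    if len(s) == 1:
        return ("Integer", digit)                       # Integer -> Digit
    return ("Integer", parse_integer(s[:-1]), digit)    # Integer -> Integer Digit

def leaves(tree):
    """Read the terminal symbols at the leaves from left to right."""
    if isinstance(tree, str):
        return tree
    return "".join(leaves(child) for child in tree[1:])

tree = parse_integer("352")
print(tree)
print(leaves(tree))  # 352
```

The root is labelled with the start symbol, every tuple is an internal node for a nonterminal, and reading the leaves left to right reproduces the original string.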

• The following grammar defines the language of arithmetic expressions with 1-digit integers, addition, and subtraction.
    Expr → Expr + Term | Expr - Term | Term
    Term → 0 | ... | 9 | ( Expr )
• Nonterminals: Expr, Term
• Terminals: +, -, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, (, )

Figure 2.2: Parse tree of a string in this expression grammar

• Grammars define associativity and precedence among the operators in an expression.
  ◦ Precedence: which operator is evaluated first, e.g., in the expression "a + b / c"
  ◦ Associativity: evaluation order for adjacent operators that have equal precedence, e.g., in the expression "a - b + c"

• Consider the more interesting grammar G1:
    Expr → Expr + Term | Expr - Term | Term
    Term → Term * Factor | Term / Factor | Term % Factor | Factor
    Factor → Primary ** Factor | Primary
    Primary → 0 | ... | 9 | ( Expr )

Figure 2.3: Parse of 4**2**3+5*6+7 for Grammar G1

Table 2.1: Associativity and Precedence for Grammar G1

    Precedence   Associativity   Operators
    3            right           **
    2            left            * / %
    1            left            + -

• The grammar rules define precedence and associativity.
• The parse tree shows something about the relations: operators lower in the tree are generally evaluated first, because an operation can't be performed until its operands are evaluated.

• An operator's precedence is determined by the length of the shortest derivation from the start symbol to the operator (see Figure 2.3).
• Left- or right-associativity is determined by left- or right-recursion.
  ◦ Compare the operators ** and + in Figure 2.3.
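To see how the shape of G1 fixes precedence and associativity, here is a small recursive-descent evaluator with one method per nonterminal. It is a sketch under stated assumptions: the left-recursive rules for Expr and Term are replaced by loops (a standard substitution that keeps left associativity), `/` is taken to mean integer division, and the names `tokenize`, `Parser`, and `evaluate` are invented for this example.

```python
def tokenize(text):
    """Split the input into single-character tokens plus the two-character '**'."""
    tokens, i = [], 0
    while i < len(text):
        if text[i].isspace():
            i += 1
        elif text.startswith("**", i):
            tokens.append("**")
            i += 2
        elif text[i] in "+-*/%()0123456789":
            tokens.append(text[i])
            i += 1
        else:
            raise ValueError(f"unexpected character {text[i]!r}")
    return tokens

class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        # Expr -> Expr + Term | Expr - Term | Term  (loop = left-associative)
        value = self.term()
        while self.peek() in ("+", "-"):
            op, rhs = self.take(), self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        # Term -> Term * Factor | Term / Factor | Term % Factor | Factor
        value = self.factor()
        while self.peek() in ("*", "/", "%"):
            op, rhs = self.take(), self.factor()
            if op == "*":
                value = value * rhs
            elif op == "/":
                value = value // rhs        # assumption: integer division
            else:
                value = value % rhs
        return value

    def factor(self):
        # Factor -> Primary ** Factor | Primary  (right recursion = right-associative)
        base = self.primary()
        if self.peek() == "**":
            self.take()
            return base ** self.factor()
        return base

    def primary(self):
        # Primary -> 0 | ... | 9 | ( Expr )
        tok = self.take()
        if tok == "(":
            value = self.expr()
            if self.take() != ")":
                raise ValueError("expected ')'")
            return value
        if tok is not None and tok in "0123456789":
            return int(tok)
        raise ValueError(f"unexpected token {tok!r}")

def evaluate(text):
    parser = Parser(text)
    value = parser.expr()
    if parser.peek() is not None:
        raise ValueError(f"unexpected token {parser.peek()!r}")
    return value

print(evaluate("4**2**3+5*6+7"))  # 65573, i.e. (4**(2**3)) + (5*6) + 7
print(evaluate("5-4+3"))          # 4, i.e. (5-4)+3
```

Operators handled deeper in the call chain (`factor` is reached only through `term`, which is reached only through `expr`) bind more tightly, which matches the precedence levels in Table 2.1.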

• A grammar is ambiguous if one of its strings has two or more different parse trees.
• C, C++, and Java have a large number of operators and precedence levels.
• Instead of using a large grammar, we can:
  ◦ Write a smaller ambiguous grammar, and
  ◦ Specify precedence and associativity rules separately, e.g., Table 2.1

• Consider the grammar G2:
    Expr → Expr Op Expr | ( Expr ) | Integer
    Op → + | - | * | / | % | **
• G2 is equivalent to G1, i.e., its language is the same, but:
  ◦ G2 has fewer productions and nonterminals than G1.
  ◦ G2 is ambiguous.
  ◦ All operators have the same precedence, and associativity isn't specified.

Figure 2.4: Ambiguous parse of a string using grammar G2
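The ambiguity of G2 can be demonstrated by enumerating every parse tree of one string. The sketch below uses 5 - 4 + 3 as an illustrative input chosen for this example, ignores parentheses for brevity, and represents a tree as a `(left, operator, right)` tuple; the names `parses` and `value` are hypothetical.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def parses(tokens):
    """All parse trees of a parenthesis-free token list under
    Expr -> Expr Op Expr | Integer."""
    if len(tokens) == 1:
        return [tokens[0]]
    trees = []
    # Operators sit at the odd positions; any of them can be the root.
    for i in range(1, len(tokens), 2):
        for left in parses(tokens[:i]):
            for right in parses(tokens[i + 1:]):
                trees.append((left, tokens[i], right))
    return trees

def value(tree):
    """Evaluate a tree bottom-up."""
    if isinstance(tree, int):
        return tree
    left, op, right = tree
    return OPS[op](value(left), value(right))

trees = parses([5, "-", 4, "+", 3])
values = sorted(value(t) for t in trees)
for t in trees:
    print(t, "=", value(t))
```

Two distinct trees come out, and they evaluate to different results (-2 and 4), which is why an ambiguous grammar needs separate precedence and associativity rules.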

    IfStatement → if ( Expression ) Statement
                | if ( Expression ) Statement else Statement
where
    Statement → Assignment | IfStatement | Block
What if one of the Statements is itself an IfStatement?

With which 'if' does the following 'else' associate?
    if (x < 0)
    if (y < 0) y = y - 1;
    else y = 0;
Answer: according to the grammar, either one!

Figure 2.5: The Dangling Else Ambiguity
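The dangling else can be reproduced by listing every way the ambiguous IfStatement rule parses the example. In this sketch each `if` token carries its condition, assignments are treated as atomic tokens, and Block is left out; these simplifications and the name `parse_statement` are assumptions of the example.

```python
def parse_statement(tokens, pos=0):
    """Yield (tree, next_pos) for every way to parse one Statement at pos."""
    if pos >= len(tokens):
        return
    tok = tokens[pos]
    if tok.startswith("if"):
        for then_part, after_then in parse_statement(tokens, pos + 1):
            # Alternative 1: if ( Expression ) Statement
            yield (tok, then_part), after_then
            # Alternative 2: if ( Expression ) Statement else Statement
            if after_then < len(tokens) and tokens[after_then] == "else":
                for else_part, after_else in parse_statement(tokens, after_then + 1):
                    yield (tok, then_part, "else", else_part), after_else
    elif tok != "else":
        yield tok, pos + 1                      # an assignment

tokens = ["if (x < 0)", "if (y < 0)", "y = y - 1;", "else", "y = 0;"]

# Keep only the parses that consume the whole input.
complete = [tree for tree, end in parse_statement(tokens) if end == len(tokens)]
for tree in complete:
    print(tree)
```

Exactly two complete trees result: one attaches the else to the outer if, the other to the inner if.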

• Algol 68, Modula, Ada: use an explicit delimiter to end every conditional, for example:

    else belongs to the inner if:     else belongs to the outer if:

    if (x < 0)                        if (x < 0)
      if (y < 0)                        if (y < 0)
        y = y - 1;                        y = y - 1;
      else                              fi;
        y = x / y;                    else
      fi;                               y = x / y;
    fi;                               fi;

• Java: rewrite the grammar to define two different kinds of if statements:
    IfThenStatement → if ( Expression ) Statement
    IfThenElseStatement → if ( Expression ) StatementNoShortIf else Statement
  The category StatementNoShortIf includes all statement types except IfThenStatement.

• C, C++, Pascal, other languages:
  ◦ Leave the grammar ambiguous
  ◦ Have a separate rule outside the grammar to explain the usage
• Tradeoff: a large grammar with no ambiguity versus a smaller grammar with extra rules

• BNF: recursion represents iteration
• EBNF: additional metacharacters represent iteration
  ◦ { } braces: show a series of zero or more occurrences
  ◦ ( ) parens: pick exactly one from the enclosed list
  ◦ [ ] brackets: pick zero or one from the enclosed list
• Metacharacters are distinguished from terminal symbols by a different typeface.

BNF:
    Expr → Term | Expr + Term | Expr - Term
    IfStatement → if ( Expr ) Statement
                | if ( Expr ) Statement else Statement
EBNF:
    Expr → Term { ( + | - ) Term }
    IfStatement → if ( Expr ) Statement [ else Statement ]

C-style EBNF lists alternatives on separate lines and uses the subscript "opt" to signify optional parts, e.g.:
    IfStatement:
        if ( Expression ) Statement ElsePart opt
    ElsePart:
        else Statement

We can always rewrite an EBNF grammar as a BNF grammar. For example,
    A → x { y } z
can be rewritten as:
    A → x A' z
    A' → ε | y A'
(Rewriting EBNF rules with ( ) and [ ] is left as an exercise.)
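A quick sanity check of this rewriting is to generate the strings of both forms up to a bound and compare them. The sketch below does this for the rule A → x { y } z; the function names and the bound of five repetitions are arbitrary choices for the illustration, and a bounded comparison is evidence rather than a proof of equivalence.

```python
def ebnf_strings(max_repeats):
    """Strings of A -> x { y } z with at most max_repeats copies of y."""
    return {"x" + "y" * k + "z" for k in range(max_repeats + 1)}

def bnf_a_prime(depth):
    """Strings of A' -> epsilon | y A' using the recursive alternative
    at most `depth` times."""
    if depth == 0:
        return {""}                              # A' -> epsilon
    return {""} | {"y" + rest for rest in bnf_a_prime(depth - 1)}

def bnf_strings(max_repeats):
    """Strings of A -> x A' z under the same bound."""
    return {"x" + middle + "z" for middle in bnf_a_prime(max_repeats)}

print(sorted(ebnf_strings(2), key=len))   # ['xz', 'xyz', 'xyyz']
print(ebnf_strings(5) == bnf_strings(5))  # True
```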

• Syntax diagrams are another way to describe grammar rules.
• They were popularized when they were used to describe the Pascal grammar.

• BNF is considered equivalent to context-free grammars because it can express any rule of a context-free grammar.
• EBNF is no more (or less) powerful or expressive than BNF. Its virtue is compactness.
• Syntax diagrams are equally expressive.

Summary: Grammars
  ◦ BNF notation
  ◦ Grammars and parse trees
  ◦ Grammars, parse trees, associativity and precedence
  ◦ Ambiguity in grammars
Next up:
  ◦ Clite syntax
  ◦ Lexical and concrete syntax
  ◦ More about compilers and interpreters
  ◦ Abstract syntax