1 CSE305 Programming Languages Syntax What is it? How is it specified? Who uses it? Why is it needed?

2 Note These notes are based on the Sebesta text. The tree diagrams in these slides are from the lecture slides provided in the instructor resources for the text, and were made by David Garrett.

3 Introduction: syntax and semantics Syntax: a formal description of the structure of programs in a given language. Semantics: a formal description of the meaning of programs in a given language. Together the syntax and semantics define a language.

4 Who uses a language definition? –Those who design a language –Those who implement a language (e.g. write compilers for it) –Those who use the language (i.e. software developers) –Those who make tools for developers (e.g. JDT in Eclipse)

5 Language & grammar A given language can have more than one grammar which describes it. The grammar presented to a user is not necessarily the same as the grammar used in an implementation. –implementation requires a very detailed grammar –user needs a human-readable grammar

6 Syntax and semantics of programming languages I have cautioned against getting too hung up on the syntax of a programming language. But you still need to learn the syntax of any language you work with so that you can read and write programs in the language. To understand the meaning of programs expressed in a language you also have to know the semantics of the language.

7 General background Chomsky hierarchy Context-free grammars Backus-Naur form

8 Chomsky hierarchy Noam Chomsky defined a hierarchy of grammars and languages known as the Chomsky hierarchy: –regular languages (most restrictive) –context-free languages –context-sensitive languages –unrestricted languages (least restrictive)

9 Chomsky hierarchy regular languages ⊂ context-free languages ⊂ context-sensitive languages ⊂ unrestricted languages

10 Context-free (CF) grammar A CF grammar is formally presented as a 4-tuple G = (T, NT, P, S), where: –T is a set of terminal symbols (the alphabet) –NT is a set of non-terminal symbols –P is a set of productions (or rules), where P ⊆ NT × (T ∪ NT)* –S ∈ NT is the start symbol

11 Example 1 A small formal language L1 = { 0, 00, 1, 11 } G1 = ( {0, 1}, {S}, { S → 0, S → 00, S → 1, S → 11 }, S )

12 Example 2 A small fragment of English L2 = { the dog chased the dog, the dog chased a dog, a dog chased the dog, a dog chased a dog, the dog chased the cat, … } G2 = ( {a, the, dog, cat, chased}, {S, NP, VP, Det, N, V}, { S → NP VP, NP → Det N, Det → a | the, N → dog | cat, VP → V | V NP, V → chased }, S ) Notes: S = Sentence, NP = Noun Phrase, N = Noun, VP = Verb Phrase, V = Verb, Det = Determiner
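A grammar such as G2 can be written down as plain data and expanded mechanically. The sketch below is only an illustration: the dictionary layout and the helper name `expand` are assumptions, not part of the slides. It works here because G2 has no recursive rules, so the expansion terminates.

```python
from itertools import product

# G2 from Example 2: each nonterminal maps to its alternative right-hand sides.
G2 = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "Det": [["a"], ["the"]],
    "N": [["dog"], ["cat"]],
    "VP": [["V"], ["V", "NP"]],
    "V": [["chased"]],
}

def expand(symbol, grammar):
    """Return every terminal string (as a tuple of words) derivable from symbol."""
    if symbol not in grammar:              # a terminal stands for itself
        return [(symbol,)]
    results = []
    for rhs in grammar[symbol]:
        parts = [expand(s, grammar) for s in rhs]
        for combo in product(*parts):      # one choice per symbol of the RHS
            results.append(tuple(w for part in combo for w in part))
    return results

sentences = {" ".join(words) for words in expand("S", G2)}
```

With four possible noun phrases and five possible verb phrases, `sentences` holds 20 strings, including "the dog chased a cat" and the object-less "a cat chased".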

13 Language terminology (from Sebesta, p. 125) A language is a set of strings of symbols, drawn from some finite set of symbols (called the alphabet of the language). “The strings of a language are called sentences.” “Formal descriptions of the syntax […] do not include descriptions of the lowest-level syntactic units […] called lexemes.” “A token of a language is a category of its lexemes.” The syntax of a programming language is often presented in two parts: –regular grammar for token structure (e.g. structure of identifiers) –context-free grammar for sentence structure

14 Examples of lexemes and tokens
Lexeme    Token
foo       identifier
i         identifier
sum       identifier
-3        integer_literal
10        integer_literal
1         integer_literal
;         statement_separator
=         assignment_operator
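The lexeme/token distinction can be sketched with a small regular-expression scanner. This is an illustration under assumptions: the patterns and the function name `tokenize` are invented here, and a leading minus sign is treated as part of an integer literal only because the slide lists -3 as a single lexeme.

```python
import re

# Token categories from the slide, each with an assumed regular expression.
TOKEN_SPEC = [
    ("integer_literal", r"-?\d+"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("assignment_operator", r"="),
    ("statement_separator", r";"),
    ("skip", r"\s+"),                      # whitespace is not a token
]
PATTERN = re.compile("|".join(f"(?P<{name}>{regex})" for name, regex in TOKEN_SPEC))

def tokenize(text):
    """Return a list of (lexeme, token) pairs for the input text."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = PATTERN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "skip":
            tokens.append((m.group(), m.lastgroup))
        pos = m.end()
    return tokens

pairs = tokenize("sum = -3; i = 10;")
```

Each pair couples a concrete lexeme such as `sum` with its category, here `identifier`.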

15 Backus-Naur Form (BNF) Backus-Naur Form (1959) –Invented by John Backus to describe ALGOL 58, modified by Peter Naur for ALGOL 60 –BNF is equivalent in expressive power to context-free grammars –BNF is a metalanguage used to describe another language, the object language –Extended BNF: adds syntactic sugar to produce more readable descriptions

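16 BNF Fundamentals Sample rules [p. 128]
<assign> → <var> = <expression>
<if_stmt> → if <logic_expr> then <stmt>
<if_stmt> → if <logic_expr> then <stmt> else <stmt>
–non-terminals/tokens are surrounded by < and >
–lexemes are not surrounded by < and >
–keywords in the language are in bold
–→ separates LHS from RHS
–| expresses alternative expansions for LHS: <if_stmt> → if <logic_expr> then <stmt> | if <logic_expr> then <stmt> else <stmt>
–= is in this example a lexeme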

17 BNF Rules A rule has a left-hand side (LHS) and a right-hand side (RHS), and consists of terminal and nonterminal symbols. A grammar is often given simply as a set of rules (the terminal and non-terminal sets are implicit in the rules, as is the start symbol).

18 Describing Lists There are many situations in which a programming language allows a list of items (e.g. parameter list, argument list). Such a list can typically be empty or consist of a single item. Such lists are typically not bounded in length. How is their structure described?

19 Describing lists They are described using recursive rules. Here is a pair of rules describing a list of identifiers, whose minimum length is one:
<ident_list> -> ident | ident, <ident_list>
Notice that ‘,’ is part of the object language (the language being described by the grammar).
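A recursive rule maps directly onto a recursive function. The sketch below checks a token list against the identifier-list rule, calling the list nonterminal `ident_list` (a name assumed here) and using an assumed regular expression for what counts as an `ident`.

```python
import re

IDENT = re.compile(r"[A-Za-z_]\w*")       # assumed shape of an identifier

def is_ident_list(tokens):
    """ident_list -> ident | ident , ident_list"""
    if not tokens or not IDENT.fullmatch(tokens[0]):
        return False
    if len(tokens) == 1:
        return True                        # first alternative: a single ident
    # second alternative: ident, then ',', then a shorter ident_list
    return tokens[1] == "," and is_ident_list(tokens[2:])
```

The recursion bottoms out at a single identifier, which is why the minimum list length is one and a trailing comma is rejected.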

20 Derivation of sentences from a grammar A derivation is a repeated application of rules, starting with the start symbol and ending with a sentence (all terminal symbols)

21 Recall example 2 G2 = ( {a, the, dog, cat, chased}, {S, NP, VP, Det, N, V}, { S → NP VP, NP → Det N, Det → a | the, N → dog | cat, VP → V | V NP, V → chased }, S )

22 Example: derivation from G2 Example: derivation of the dog chased a cat
S => NP VP => Det N VP => the N VP => the dog VP => the dog V NP => the dog chased NP => the dog chased Det N => the dog chased a N => the dog chased a cat
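The derivation of "the dog chased a cat" can be replayed step by step. The following sketch is illustrative only: the function name `leftmost_derivation` and the list-of-steps encoding are assumptions. It rewrites the leftmost nonterminal at each step and records every sentential form, using the rule VP → V NP as the derivation does.

```python
NONTERMINALS = {"S", "NP", "VP", "Det", "N", "V"}

def leftmost_derivation(start, steps):
    """Apply each (lhs, rhs) step to the leftmost nonterminal; return all sentential forms."""
    form = [start]
    forms = [" ".join(form)]
    for lhs, rhs in steps:
        # locate the leftmost nonterminal in the current sentential form
        i = next(k for k, sym in enumerate(form) if sym in NONTERMINALS)
        if form[i] != lhs:
            raise ValueError(f"leftmost nonterminal is {form[i]}, not {lhs}")
        form = form[:i] + rhs + form[i + 1:]
        forms.append(" ".join(form))
    return forms

steps = [
    ("S", ["NP", "VP"]),
    ("NP", ["Det", "N"]),
    ("Det", ["the"]),
    ("N", ["dog"]),
    ("VP", ["V", "NP"]),
    ("V", ["chased"]),
    ("NP", ["Det", "N"]),
    ("Det", ["a"]),
    ("N", ["cat"]),
]
forms = leftmost_derivation("S", steps)
```

Nine rule applications give ten sentential forms, and only the last one consists entirely of terminals.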

23 Example 3 L3 = { 0, 1, 00, 11, 000, 111, 0000, 1111, … } G3 = ( {0, 1}, {S, ZeroList, OneList}, { S → ZeroList | OneList, ZeroList → 0 | 0 ZeroList, OneList → 1 | 1 OneList }, S )

24 Example: derivations from G3
Example: derivation of 0000
S => ZeroList => 0 ZeroList => 0 0 ZeroList => 0 0 0 ZeroList => 0 0 0 0
Example: derivation of 111
S => OneList => 1 OneList => 1 1 OneList => 1 1 1

25 Observations about derivations Every string of symbols in the derivation is a sentential form. A sentence is a sentential form that has only terminal symbols. A leftmost derivation is one in which the leftmost nonterminal in each sentential form is the one that is expanded. A derivation can be leftmost, rightmost, or neither.

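26 An example programming language grammar fragment
<stmts> -> <stmt> | <stmt> ; <stmts>
<stmt> -> <var> = <expr>
<var> -> a | b | c | d
<expr> -> <term> + <term> | <term> - <term>
<term> -> <var> | const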

27 A leftmost derivation of a = b + const
<stmts> => <stmt>
=> <var> = <expr>
=> a = <expr>
=> a = <term> + <term>
=> a = <var> + <term>
=> a = b + <term>
=> a = b + const

28 Parse tree A parse tree is a hierarchical representation of a derivation. [Figure: parse tree for a = b + const]

29 Parse trees and compilation A compiler builds a parse tree for a program (or for different parts of a program). If the compiler cannot build a well-formed parse tree from a given input, it reports a compilation error. The parse tree serves as the basis for semantic interpretation/translation of the program.
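A minimal sketch of this idea: a hand-written parser for a single assignment statement that either returns a parse tree or reports an error. It assumes the usual Sebesta-style rules stmt -> var = expr, expr -> term + term | term - term, term -> var | const, var -> a | b | c | d. The nested-tuple tree shape and the name `ParseError` are illustrative choices, not how any particular compiler does it.

```python
VARS = {"a", "b", "c", "d"}

class ParseError(Exception):
    """Raised when no well-formed parse tree can be built."""

def parse_var(tokens, i):
    if i < len(tokens) and tokens[i] in VARS:
        return ("var", tokens[i]), i + 1
    raise ParseError(f"expected a variable at position {i}")

def parse_term(tokens, i):
    # term -> var | const
    if i < len(tokens) and tokens[i] == "const":
        return ("term", "const"), i + 1
    var, i = parse_var(tokens, i)
    return ("term", var), i

def parse_expr(tokens, i):
    # expr -> term + term | term - term
    left, i = parse_term(tokens, i)
    if i >= len(tokens) or tokens[i] not in ("+", "-"):
        raise ParseError(f"expected + or - at position {i}")
    op = tokens[i]
    right, i = parse_term(tokens, i + 1)
    return ("expr", left, op, right), i

def parse_stmt(tokens):
    # stmt -> var = expr
    var, i = parse_var(tokens, 0)
    if i >= len(tokens) or tokens[i] != "=":
        raise ParseError(f"expected = at position {i}")
    expr, i = parse_expr(tokens, i + 1)
    if i != len(tokens):
        raise ParseError(f"unexpected token at position {i}")
    return ("stmt", var, "=", expr)

tree = parse_stmt("a = b + const".split())
```

Calling `parse_stmt("a = b".split())` raises `ParseError`, which plays the role of the compilation error: under these rules an expression needs an operator and two terms.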

30 Extended BNF
Optional parts are placed in brackets [ ]
<proc_call> -> ident [(<expr_list>)]
Alternative parts of RHSs are placed inside parentheses and separated via vertical bars
<term> -> <term> (+|-) const
Repetitions (0 or more) are placed inside braces { }
<ident> -> letter {letter|digit}

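31 Comparison of BNF and EBNF
Sample grammar fragment expressed in BNF:
<expr> -> <expr> + <term> | <expr> - <term> | <term>
<term> -> <term> * <factor> | <term> / <factor> | <factor>
Same grammar fragment expressed in EBNF:
<expr> -> <term> {(+ | -) <term>}
<term> -> <factor> {(* | /) <factor>}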
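One reason the EBNF form is convenient is that each `{ ... }` repetition corresponds to a loop in a hand-written parser. The sketch below evaluates a token list using expr -> term {(+|-) term} and term -> factor {(*|/) factor}. It assumes, purely for illustration, that a factor is a single number; the function name `evaluate` is also an assumption.

```python
def evaluate(tokens):
    """Evaluate tokens with expr -> term {(+|-) term} and term -> factor {(*|/) factor}."""
    pos = 0

    def factor():
        nonlocal pos
        value = float(tokens[pos])         # assumption: a factor is just a number
        pos += 1
        return value

    def term():
        nonlocal pos
        value = factor()
        while pos < len(tokens) and tokens[pos] in ("*", "/"):   # the { ... } part
            op = tokens[pos]
            pos += 1
            rhs = factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def expr():
        nonlocal pos
        value = term()
        while pos < len(tokens) and tokens[pos] in ("+", "-"):   # the { ... } part
            op = tokens[pos]
            pos += 1
            rhs = term()
            value = value + rhs if op == "+" else value - rhs
        return value

    result = expr()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result
```

Because each loop folds its operands from the left, `evaluate("10 - 8 - 6".split())` gives -4.0, and because `term` is called from inside `expr`, multiplication binds tighter: `evaluate("2 + 5 * 3".split())` gives 17.0.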

32 Ambiguity in grammars A grammar is ambiguous if and only if it generates a sentential form that has two or more distinct parse trees. Operator precedence and operator associativity are two properties that a grammar can encode so that each expression has a single, unambiguous interpretation.

33 Operator precedence ambiguity The following grammar is ambiguous:
<expr> -> <expr> <op> <expr> | const
<op> -> / | -
The grammar treats the '/' and '-' operators equivalently.

34 An ambiguous grammar for arithmetic expressions
<expr> -> <expr> <op> <expr> | const
<op> -> / | -
[Figure: two distinct parse trees for const - const / const]
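The ambiguity can be checked by brute force. The sketch below enumerates every parse tree the grammar expr -> expr op expr | const, op -> / | - allows for a token list by trying each operator as the root; the function name `parse_trees` and the tuple representation are assumptions made for illustration.

```python
def parse_trees(tokens):
    """All parse trees for expr -> expr op expr | const, with op -> / | -."""
    if tokens == ["const"]:
        return ["const"]
    trees = []
    for i, tok in enumerate(tokens):
        if tok in ("/", "-"):              # try this operator as the root of the tree
            for left in parse_trees(tokens[:i]):
                for right in parse_trees(tokens[i + 1:]):
                    trees.append((left, tok, right))
    return trees

trees = parse_trees("const - const / const".split())
```

Here `trees` contains two entries, one with `-` at the root and one with `/` at the root, which is exactly what makes the grammar ambiguous.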

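35 Disambiguating the grammar If we use the parse tree to indicate precedence levels of the operators, we can remove the ambiguity. The following rules give / a higher precedence than -
<expr> -> <expr> - <term> | <term>
<term> -> <term> / const | const
[Figure: parse tree for const - const / const, with / lower in the tree than -]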
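With precedence built into the rules, each input has exactly one tree. The sketch below parses with expr -> expr - term | term and term -> term / const | const. Note the swap in technique: a recursive-descent parser cannot call itself on a left-recursive rule, so the left recursion is replaced by loops that build the same left-leaning trees. The names are illustrative.

```python
def parse(tokens):
    """Build the single parse tree for expr -> expr - term | term, term -> term / const | const."""
    pos = 0

    def const():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != "const":
            raise ValueError(f"expected const at position {pos}")
        pos += 1
        return "const"

    def term():
        nonlocal pos
        tree = const()
        while pos < len(tokens) and tokens[pos] == "/":
            pos += 1
            tree = (tree, "/", const())    # grow the tree to the left
        return tree

    def expr():
        nonlocal pos
        tree = term()
        while pos < len(tokens) and tokens[pos] == "-":
            pos += 1
            tree = (tree, "-", term())     # '/' subtrees sit below '-'
        return tree

    tree = expr()
    if pos != len(tokens):
        raise ValueError(f"unexpected token at position {pos}")
    return tree
```

For `const - const / const` the result is `("const", "-", ("const", "/", "const"))`: the division sits lower in the tree, so it is performed first.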

36 Links to BNF-style grammars for actual programming languages Below are some links to grammars for real programming languages. Look at how the grammars are expressed. In the ones listed below, find the parts of the grammar that deal with operator precedence. –Jacques.Levy/poly/mainB/node23.html

37 Derivation of 2+5*3 using C grammar [Figure: derivation tree for 2+5*3]

38 Recursion and parentheses To generate 2+3*4 or 3*4+2, the parse tree is built so that + is higher in the tree than *. To force an addition to be done prior to a multiplication we must use parentheses, as in (2+3)*4. The grammar captures this in the recursive case of an expression, as in the following grammar fragment:
<expr> → <expr> + <term> | <term>
<term> → <term> * <factor> | <factor>
<factor> → <variable> | <constant> | “(” <expr> “)”

39 Associativity of operators When multiple operators appear in an expression, we need to know how to interpret the expression. Some operators (e.g. +) are associative, meaning that the meaning of an expression with multiple instances of the operator is the same no matter how it is interpreted: (a+b)+c = a+(b+c). Some operators (e.g. -) are not associative: (a-b)-c ≠ a-(b-c), e.g. try a=10, b=8, c=6: (10-8)-6 = -4 but 10-(8-6) = 8. - and / are both left-associative, meaning a-b-c is interpreted as (a-b)-c. Exponentiation (**) is right-associative. This means that 2**3**2 is interpreted as 2**(3**2) (i.e. 2**9) rather than (2**3)**2 (i.e. 8**2 or 2**6).
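These groupings can be checked directly in a language whose grammar follows the same conventions. Python is used here only as a convenient example: its `-` is left-associative and its `**` is right-associative.

```python
# Subtraction is not associative, so the grouping matters.
left_grouping = (10 - 8) - 6       # -4
right_grouping = 10 - (8 - 6)      # 8

# An unparenthesized chain of '-' groups to the left.
chained_minus = 10 - 8 - 6

# An unparenthesized chain of '**' groups to the right.
chained_power = 2 ** 3 ** 2        # same as 2 ** (3 ** 2) = 2 ** 9
forced_left_power = (2 ** 3) ** 2  # 8 ** 2
```

Here `chained_minus` equals `left_grouping`, while `chained_power` is 512 and `forced_left_power` is 64.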

40 Associativity of Operators Operator associativity can be encoded by a grammar. The following grammar fragment does not do this: the left and right operands of '-' are treated symmetrically.
<expr> -> <expr> - <expr> | <term>
<term> -> <variable> | <constant> | “(” <expr> “)”

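41 Associativity of Operators However, the following rules ensure that '-' is left-associative, because they prevent direct recursion with '-' in the right-hand operand.
<expr> -> <expr> - <term> | <term>
<term> -> <variable> | <constant> | “(” <expr> “)”
[Figure: parse tree with two - operators, grouped to the left]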

42 Decision timing: Design time vs. Implementation time (to come) Java and precedence/associativity/left-to-right evaluation vs. C++ (?)

43 Theory vs. Reality (to come) Java/C# vs. C/C++ (size of representation – but this is not the slide to address this on: see next point). Also, effect of fixed size of representations on associativity: –mathematically, (x+y)+z = x+(y+z) –in practice (+ is not always associative): (large+small)+small = large, but large+(small+small) > large. Dealing with fixed-size numeric representations.
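The effect of fixed-size representations can be seen with ordinary double-precision floats. The values below are chosen for illustration: near 1e16 adjacent doubles are 2 apart, so adding 1.0 on its own is lost to rounding.

```python
large = 1e16
small = 1.0

left = (large + small) + small     # each small addend is rounded away in turn
right = large + (small + small)    # the small values combine to 2.0 first, which survives
```

With these values `left == large`, while `right` exceeds `large` by exactly 2.0, so floating-point `+` is not associative here.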