G22.2110-0011 Programming Languages G22.2110-001 Walter Williams.

1 Programming Languages, G22.2110-001, Walter Williams

2 Administrative Stuff
  - Homework, Exams, etc.
    - Weekly assignments
    - Programming projects
    - Mid-Term & Final Exams
    - No cheating
  - Join the mailing list: http://www.cs.nyu.edu/mailman/listinfo/g22_2110_001_su04
  - Recitation

3 What's covered in Lectures & Texts
  - Purpose of course is for you to understand:
    - The issues involved in programming language design
    - The various strategies for programming and how languages support those strategies
    - Type systems, OO support, abstraction, concurrent & generic programming
  - Not just learning to program in different languages
  - Textbooks:
    - Scott: covers both compilers and programming languages. You can skip the compiler stuff.
    - Barnes: Ada language, used by Defense Dept and other critical applications.
    - Paulson: ML language, widely used in AI, language theory, etc.
  - Others:
    - Stanley Lippman, Inside the C++ Object Model
    - Bjarne Stroustrup, The Design and Evolution of C++
    - The Little Schemer
    - Java Language Specification

4 Language & Communication
  - Human (Natural) Language
  - Problem Domain Language
  - Algorithmic Language
  - Documentation Language
  - Programming Language
  - Language as a tool for thought

5 Programming Language Stakeholders
  - Software Developers
    - Specification & Design
    - Coding
  - Compiler Writers
  - Maintenance Programmers
  - Quality Control & Support
  - Management

6 Language Attributes
  - Expressiveness
    - APL for arrays, Lisp for lists, etc.
    - All major languages are Turing complete
  - Efficiency
    - Of coding, compilation or execution
  - Readability
    - By programming experts, domain experts and non-experts
  - Scalability
    - Communicating parallel programmers
    - Modules, separate compilation and information hiding
  - Safety and Security
  - Market Attributes
    - Popularity => availability of programmers, tools, libraries, etc.

7 Models (Styles) of Computation
  - Imperative (Procedural)
    - Mutable storage, modified by assignment
    - Fortran, Algol, C++, Java
  - Functional (Applicative)
    - Pure mathematical functions, no side effects
    - ML, Haskell, Smalltalk
  - Declarative
    - Programs are sets of (logical) assertions
    - Prolog, SQL
  - Object Oriented
    - Orthogonal to the three models above
    - Inheritance, Polymorphism, Encapsulation

8 Compilers & Interpreters
  - Compiling vs. Interpreting
    - Compilers translate at compile time, once
    - Interpreters translate at runtime, every time
  - Front End
    - Syntactic Analysis: Lexical Analysis & Parsing
    - Semantic Analysis & Error Checking
    - Generates Intermediate Code
  - Back End
    - Most optimizations
    - Turns Intermediate Code into Executable
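
The "once" versus "every time" distinction can be seen in miniature with Python's built-in compile and eval. This is only an illustrative sketch, not part of the lecture: the function names and the variable name x are assumptions made for the example.

```python
def run_compiled(source, values):
    """Translate `source` once, then execute the translated form for each value of x."""
    code = compile(source, "<expr>", "eval")      # translation happens here, once
    return [eval(code, {"x": value}) for value in values]


def run_interpreted(source, values):
    """Hand the source text to eval each time, so it is re-translated on every call."""
    return [eval(source, {"x": value}) for value in values]


if __name__ == "__main__":
    print(run_compiled("x * x + 1", range(3)))
    print(run_interpreted("x * x + 1", range(3)))
```

Both functions compute the same results; they differ only in how often the translation step is repeated.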

9 Programming Environments
  - Development Environment
    - Interactive Development Environments
      - Smalltalk browser environment
      - Microsoft IDE
    - Development Frameworks
      - Swing, MFC
    - Language aware Editors
  - Libraries
    - Java Swing classes
    - C++ Standard Template Library (STL)
    - Libraries change much more quickly than the language
    - Libraries usually very different for different languages

10 Lexical Issues
  - Lexical elements are tokens
    - Keywords, operators, punctuation, names, numbers, etc.
  - Tokens are described by regular expressions (Type 3 grammars)
  - Examples
    - Identifiers: letter (letter or digit)*
    - Integer: digit digit*
  - Terminal symbols of the lexical grammar are usually characters
    - ASCII, Unicode, etc.
    - Escape sequences and trigraphs
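
The two token definitions on this slide translate almost directly into regular expressions. The sketch below uses Python's re module; the token names, the operator set, and the whitespace rule are assumptions added for the example, not something the slide specifies.

```python
import re

# IDENT and INT follow the slide: letter (letter | digit)* and digit digit*.
# OP and SKIP are assumptions so that a small input can be scanned end to end.
TOKEN_SPEC = [
    ("IDENT", r"[A-Za-z][A-Za-z0-9]*"),
    ("INT", r"[0-9][0-9]*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (kind, lexeme) pairs, raising ValueError on bad input."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":        # whitespace separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    print(tokenize("x1 = 42 + y"))
```

The terminal symbols of this lexical grammar are individual characters, which matches the last bullet group on the slide.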

11 Syntax & Semantics
  - Syntax
    - Deals with form
    - Gives structure to a stream of lexical elements
  - Semantics
    - Deals with meaning
    - Meaning often depends on context
  - Both syntax and semantics can be represented by grammars; attribute grammars are used for semantics.
  - Distinction is somewhat artificial
    - Syntax is that which can be conveniently expressed using a context free grammar
    - Semantics is everything else

12 Language and Grammar
  - An alphabet Σ is a finite set of lexical symbols
    - Formal languages use letters of the alphabet as lexical symbols
    - Programming languages use tokens
    - L systems use lines to draw realistic images of trees and flowers
  - A language L is a subset of the strings in Σ*
  - A grammar G defines the subset of Σ* that belongs to L, and excludes the subset that does not belong to L.
  - A grammar can be used to generate new strings in L, or to accept strings that are in L and reject strings that are not.
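
To make "L is a subset of Σ*" concrete, the sketch below enumerates Σ* up to a length bound and keeps only the strings an acceptor admits. The alphabet {a, b} and the sample language (n a's followed by n b's) are assumptions chosen for illustration; neither appears on the slide.

```python
from itertools import product


def sigma_star(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len, shortest first."""
    for length in range(max_len + 1):
        for symbols in product(sorted(alphabet), repeat=length):
            yield "".join(symbols)


def is_anbn(string):
    """Acceptor for the hypothetical language { a^n b^n : n >= 0 }."""
    half = len(string) // 2
    return len(string) % 2 == 0 and string == "a" * half + "b" * half


if __name__ == "__main__":
    members = [s for s in sigma_star({"a", "b"}, 4) if is_anbn(s)]
    print(members)
```

Here the acceptor plays the role of the grammar: it splits the 31 strings of length at most 4 into the few that belong to L and the rest that do not.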

13 CFG Example
    Block:
        { BlockStatements opt }
    BlockStatements:
        BlockStatement
        BlockStatements BlockStatement
    BlockStatement:
        LocalVariableDeclarationStatement
        Statement
    LocalVariableDeclarationStatement:
        LocalVariableDeclaration ;
    LocalVariableDeclaration:
        TypeName VariableDeclaratorId
    Statement:
        while ( expr ) BlockStatement ;

14 Context Free Grammars
  - Substitution rules of the form: A ::= ω
    - where A is a non-terminal symbol and ω is a string of terminal and non-terminal symbols
  - A simple CFG for a language E
      1. S ::= EXPR
      2. S ::= EXPR S
      3. EXPR ::= EXPR '+' EXPR
      4. EXPR ::= EXPR '-' EXPR
      5. EXPR ::= '(' EXPR ')'
      6. EXPR ::= digit
  - At least one rule must have only terminal symbols on RHS
  - Every rule must have exactly one non-terminal on LHS
  - Terminal symbols: digit + - ( )
  - Non-terminal symbols: EXPR S
  - Examples of statements in E:
      1
      1+1
      (1+1) - 1
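
A grammar used as an acceptor can be sketched as a small recursive-descent recognizer for E. Rules 3 and 4 are left recursive, so this sketch parses an equivalent rewritten form (an operand followed by any number of '+' or '-' operands); that rewrite, the helper names, and the choice to ignore spaces are assumptions of the example, not part of the slide.

```python
def accepts(text):
    """Return True if `text` (spaces ignored) is a statement in E."""
    tokens = [ch for ch in text if not ch.isspace()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def operand():
        # OPERAND ::= digit | '(' EXPR ')'     (rules 6 and 5)
        nonlocal pos
        if peek() is not None and peek() in "0123456789":
            pos += 1
            return True
        if peek() == "(":
            pos += 1
            if expr() and peek() == ")":
                pos += 1
                return True
        return False

    def expr():
        # EXPR ::= OPERAND (('+' | '-') OPERAND)*     (rules 3 and 4, rewritten)
        nonlocal pos
        if not operand():
            return False
        while peek() in ("+", "-"):
            pos += 1
            if not operand():
                return False
        return True

    # S ::= EXPR | EXPR S     (rules 1 and 2: one or more expressions)
    if not expr():
        return False
    while pos < len(tokens):
        if not expr():
            return False
    return True


if __name__ == "__main__":
    for sample in ["1", "1+1", "(1+1) - 1", "1+", "(1"]:
        print(repr(sample), accepts(sample))
```

The three example statements from the slide are accepted, while incomplete inputs such as "1+" are rejected.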

15 Formal CFG
  - A CFG, G, is a 4-tuple G = (Σ, N, S, δ)
    - Σ is an alphabet of terminal symbols
    - N is a set of non-terminal symbols
    - S is a distinguished element of N, called the start symbol, which represents all strings in the language.
    - δ is a set of rules of the form A ::= ω, where A ∈ N and ω ∈ (Σ ∪ N)+
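
The 4-tuple maps naturally onto a small data structure. The sketch below stores (Σ, N, S, δ) and checks the constraints stated on the slide; the class and function names are hypothetical, and the extra check that Σ and N do not overlap is a common convention added here as an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    terminals: frozenset       # Σ
    nonterminals: frozenset    # N
    start: str                 # S
    rules: tuple               # δ, as (lhs, rhs) pairs; rhs is a tuple of symbols


def check(grammar):
    """Return a list of problems found; an empty list means the tuple is well formed."""
    problems = []
    if grammar.terminals & grammar.nonterminals:
        problems.append("Σ and N overlap")        # assumption: not stated on the slide
    if grammar.start not in grammar.nonterminals:
        problems.append("start symbol is not in N")
    symbols = grammar.terminals | grammar.nonterminals
    for lhs, rhs in grammar.rules:
        if lhs not in grammar.nonterminals:
            problems.append(f"left-hand side {lhs!r} is not in N")
        if len(rhs) == 0:
            problems.append(f"rule for {lhs!r} has an empty right-hand side")
        for symbol in rhs:
            if symbol not in symbols:
                problems.append(f"unknown symbol {symbol!r} in rule for {lhs!r}")
    return problems


# The grammar for E from the previous slide, written as a 4-tuple.
E = Grammar(
    terminals=frozenset({"digit", "+", "-", "(", ")"}),
    nonterminals=frozenset({"S", "EXPR"}),
    start="S",
    rules=(
        ("S", ("EXPR",)),
        ("S", ("EXPR", "S")),
        ("EXPR", ("EXPR", "+", "EXPR")),
        ("EXPR", ("EXPR", "-", "EXPR")),
        ("EXPR", ("(", "EXPR", ")")),
        ("EXPR", ("digit",)),
    ),
)

if __name__ == "__main__":
    print(check(E))
```

Note that the definition on the slide requires a non-empty right-hand side, so a grammar that uses a null production would be flagged by this check.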

16 CFG Idioms
  - L ::= a L | a      makes a list of one or more 'a's
  - L ::= a , L | a    makes a comma separated list of 'a's
  - L ::= a L | λ      makes a list of zero or more 'a's
    - λ is the null symbol (the empty string)
  - L ::= L L | a | λ  another way to make a list
  - P ::= ( P )        makes P's within nested parentheses of arbitrary depth.
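
One way to check what an idiom generates is to expand it mechanically and list the short strings it derives. The sketch below does a breadth-first leftmost derivation with a length bound; the function name and the dictionary encoding of rules are assumptions. It is meant for the right-recursive idioms only: it assumes every expansion either adds a terminal or removes a non-terminal, which does not hold for L ::= L L | a | λ.

```python
from collections import deque


def language(rules, start, max_len):
    """List every terminal string with at most max_len symbols derivable from `start`.

    `rules` maps a non-terminal to a list of right-hand sides (tuples of symbols);
    an empty tuple stands for λ.
    """
    results = set()
    seen = {(start,)}
    queue = deque([(start,)])
    while queue:
        form = queue.popleft()
        # Locate the leftmost non-terminal in the sentential form, if any.
        index = next((i for i, s in enumerate(form) if s in rules), None)
        if index is None:
            results.add("".join(form))
            continue
        for rhs in rules[form[index]]:
            new_form = form[:index] + tuple(rhs) + form[index + 1:]
            # Terminals never disappear, so a form with too many is a dead end.
            terminal_count = sum(1 for s in new_form if s not in rules)
            if terminal_count <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return sorted(results, key=lambda s: (len(s), s))


ONE_OR_MORE = {"L": [("a", "L"), ("a",)]}
COMMA_LIST = {"L": [("a", ",", "L"), ("a",)]}
ZERO_OR_MORE = {"L": [("a", "L"), ()]}

if __name__ == "__main__":
    print(language(ONE_OR_MORE, "L", 3))
    print(language(COMMA_LIST, "L", 5))
    print(language(ZERO_OR_MORE, "L", 3))
```

The only difference between the first and third idioms is the base case, and the output shows it: the λ version also derives the empty string.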

17 Backus-Naur Form (BNF)
  - Non-terminal symbols are identified by angle brackets
  - Terminal symbols are token names or literal symbols
  - "::=" is definitional equivalence
  - '|' indicates "or"
  - Many variations
    - [ ] for optional elements
    - Parentheses for grouping
    - + and * (Kleene star)
    - Superscripts for n occurrences
    - Subscripts, e.g. opt in Java
    - Italics or lowercase for non-terminal symbols
  - Example
      ::= while ( )
        | if ( ) [else ]
        | id = EXP | | ;
      ::= ::= | ;
      ::= ::= | ID | NUMBER;
      ::= + | - | * | / ;
  - Most language specifications use some variation of BNF

18 Derivation & Parse Tree
  - Parse tree represents structure of parse
    - Leaf nodes are terminal symbols
    - Intermediate nodes are non-terminal symbols
    - Root node is start symbol of grammar
  - Derivation tree also records which rules were used to build tree
    - Each node represents a specific production
  - Example
    - (1 + 2 + 3) - 2
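
The example expression can be turned into a parse tree with a few lines of code. This sketch reuses the EXPR rules of the grammar for E from the Context Free Grammars slide and represents each interior node as a tuple whose first element is the non-terminal. Two things are assumptions of the sketch: the tuple encoding, and grouping '+' and '-' to the left (the grammar as written is ambiguous and does not fix a grouping).

```python
def parse(text):
    """Parse one EXPR and return its parse tree as nested tuples."""
    tokens = [ch for ch in text if not ch.isspace()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        token = peek()
        if token is None or (expected is not None and token != expected):
            raise SyntaxError(f"unexpected {token!r} at token {pos}")
        pos += 1
        return token

    def operand():
        if peek() == "(":
            take("(")
            inner = expr()
            take(")")
            return ("EXPR", "(", inner, ")")       # rule 5
        token = take()
        if token not in "0123456789":
            raise SyntaxError(f"expected a digit, got {token!r}")
        return ("EXPR", token)                      # rule 6

    def expr():
        tree = operand()
        while peek() in ("+", "-"):
            op = take()
            tree = ("EXPR", tree, op, operand())    # rules 3 and 4, grouped to the left
        return tree

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected {peek()!r} at token {pos}")
    return tree


def leaves(tree):
    """Read the terminal symbols back off the tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree[1:] for leaf in leaves(child)]


if __name__ == "__main__":
    tree = parse("(1 + 2 + 3) - 2")
    print(tree)
    print("".join(leaves(tree)))
```

Reading the leaves from left to right gives back the input, which is the property the slide describes: leaf nodes are terminals and every interior node is a non-terminal.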

19 Grammars: Chomsky Hierarchy
  - Type 0: Unrestricted
    - Can express anything that can be computed
    - Impossible to parse in general (membership is undecidable)
  - Type 1: Context Sensitive
    - Difficult to parse
    - Attribute Grammars used for programming language semantics
  - Type 2: Context Free
    - CFGs used for describing programming language syntax
  - Type 3: Regular
    - Used to describe lexical elements of programming languages

20 Grammatical Problems
  - Programming languages use restricted grammars, such as LL or LR, which are not as powerful as general CFGs
  - Dangling else: not LR (shift-reduce conflict)
    - S ::= if E then S
    - S ::= if E then S else S
    - Solutions:
      - Always choose shift
      - Specify an end marker, e.g., endif
  - Left recursion: not LL
  - Ambiguity
    - Foo(A) (in C): declaration or use of function Foo?
    - Requires lookahead in parser or more complex grammar
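
The "always choose shift" resolution can be illustrated without building an LR parser: a recursive-descent parser that greedily consumes an else when it sees one gives the same result, attaching each else to the nearest unmatched if. In this sketch the condition E is a single token and any non-keyword token counts as a simple statement; both are assumptions made to keep the example short.

```python
KEYWORDS = {"if", "then", "else"}


def parse_statement(tokens, pos=0):
    """Parse one S starting at tokens[pos]; return (tree, next_pos)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    if tokens[pos] != "if":
        if tokens[pos] in KEYWORDS:
            raise SyntaxError(f"unexpected {tokens[pos]!r}")
        return tokens[pos], pos + 1                  # a simple statement
    if pos + 2 >= len(tokens) or tokens[pos + 2] != "then":
        raise SyntaxError("expected 'if E then'")
    condition = tokens[pos + 1]
    then_branch, pos = parse_statement(tokens, pos + 3)
    # "Always choose shift": an else that follows belongs to this, the innermost, if.
    if pos < len(tokens) and tokens[pos] == "else":
        else_branch, pos = parse_statement(tokens, pos + 1)
        return ("if", condition, then_branch, else_branch), pos
    return ("if", condition, then_branch), pos


if __name__ == "__main__":
    tree, _ = parse_statement("if a then if b then x else y".split())
    print(tree)
```

In the resulting tree the else branch y sits inside the inner if on b, and the outer if on a has no else branch at all.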

21 Programming Language History

