Published by Rose Gilmore. Modified over 9 years ago.
1
ICE1341 Programming Languages, Spring 2005
Lecture #4
In-Young Ko (iko.AT.icu.ac.kr)
Information and Communications University (ICU)
2
Spring 2005 ICE 1341 – Programming Languages © In-Young Ko, Information and Communications University
Announcements
- Send the language-survey information to the TA
- Form your project teams by this Thursday, March 10th
  - Include 4-5 students in each team
  - Mix skill levels
  - Mix genders (if possible)
3
Last Lecture
- Language evaluation criteria
  - Readability
  - Writability
  - Reliability
4
This Lecture
- Language Syntax and Semantics
- Formal Ways to Define Languages
  - Chomsky Hierarchy
  - Backus-Naur Form (BNF)
5
What are Syntax and Semantics?
- Syntax: the form of expressions, statements, and program units
  - e.g., while (<boolean_expr>) <statement>
- Semantics: the meaning of those expressions, statements, and program units
  - e.g., “When the current value of the Boolean expression is true, the embedded statement is executed.”
6
Describing Syntax
- A sentence (statement) is a string of characters over some alphabet
- A language is a set of sentences
- A lexeme is the lowest-level syntactic unit of a language (e.g., *, sum, begin)
- A token is a category of lexemes (e.g., identifier)

e.g., index = 2 * count + 17;

Lexemes   Tokens
index     identifier
=         equal_sign
2         int_literal
*         mult_op
count     identifier
+         plus_op
17        int_literal
;         semicolon

* AW Lecture Notes
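The lexeme/token table can be reproduced with a small scanner. The following is a minimal sketch, not a production lexer: the token names come from the slide, while the regular expressions, the `TOKEN_SPEC` table, and the `tokenize` function are assumptions made for illustration.

```python
import re

# Token categories follow the slide's table; the regular expressions
# themselves are assumptions made for this sketch.
TOKEN_SPEC = [
    ("int_literal", r"\d+"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("equal_sign", r"="),
    ("mult_op", r"\*"),
    ("plus_op", r"\+"),
    ("semicolon", r";"),
    ("skip", r"\s+"),  # whitespace separates lexemes but is not a token
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (lexeme, token) pairs for the given source text."""
    pairs = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "skip":
            pairs.append((match.group(), match.lastgroup))
        pos = match.end()
    return pairs


if __name__ == "__main__":
    for lexeme, token in tokenize("index = 2 * count + 17;"):
        print(f"{lexeme:<6} {token}")
```

Each named group in the combined pattern is one token category, so the name of the group that matched is the token and the matched text is the lexeme.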
7
Formal Ways to Define Languages
- Language Recognizers
  - A device that determines whether a given program is in a language
  - e.g., a syntax analyzer of a compiler, finite automata
- Language Generators
  - A device that can be used to generate the sentences of a language
  - e.g., regular expressions, context-free grammars
- Example regular expression: ((00)*1(11)*)+0
- [Figure: the transition diagram of a finite automaton M = (Q, Σ, δ, q0, F), with states q0, q1, q2, q3]
- Example inputs, labelled Accepted / Not accepted in the figure: 001110, 111110, 000110 (the first two match the regular expression above; the third does not)
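A recognizer for the regular expression ((00)*1(11)*)+0 can be sketched as a table-driven deterministic finite automaton. The state names and the transition table below are assumptions worked out for this illustration; they are not read off the slide's diagram, whose edges did not survive in the transcript.

```python
# Transition table for a DFA recognizing ((00)*1(11)*)+0.
# State names are hypothetical; they are not taken from the slide's diagram.
TRANSITIONS = {
    "even_zeros": {"0": "odd_zeros", "1": "after_one"},
    "odd_zeros": {"0": "even_zeros"},  # a 1 after an odd run of 0s is rejected
    "after_one": {"0": "one_zero", "1": "after_one"},
    "one_zero": {"0": "even_zeros"},  # accepting: exactly one 0 after a 1
}
START = "even_zeros"
ACCEPTING = {"one_zero"}


def accepts(text):
    """Return True if the DFA ends in an accepting state after reading text."""
    state = START
    for symbol in text:
        state = TRANSITIONS.get(state, {}).get(symbol)
        if state is None:  # no transition defined: the input is rejected
            return False
    return state in ACCEPTING


if __name__ == "__main__":
    for sample in ["001110", "111110", "000110"]:
        print(sample, "accepted" if accepts(sample) else "not accepted")
```

A missing entry in the table plays the role of a dead state, which keeps the table as small as the four live states.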
8
Regular Expressions
- Define patterns of strings (languages)
- Widely used for text-search applications
  - e.g., the UNIX grep command, string matching in Perl
- Used as the input to lexical analyzer generators, such as Lex or Flex
- e.g., Handel, Händel, and Haendel are described by the pattern “H(a|ä|ae)ndel”
http://en.wikipedia.org/wiki/Regular_expression
9
Regular Expression Syntax
- Alternation: |
  - e.g., “gray|grey”
- Quantification: ?, +, *
  - ?: the preceding pattern is present zero times or once (e.g., “colou?r”)
  - +: the preceding pattern is present one or more times (e.g., “goo+gle”)
  - *: the preceding pattern is present zero or more times (e.g., “0*42”)
- Grouping: ( )
  - e.g., “gr(a|e)y”, “(grand)?father”
http://en.wikipedia.org/wiki/Regular_expression
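The operators above can be tried out with Python's `re` module. This is a small sketch: the patterns are the ones on the slide, while the test strings and the `check` helper are assumptions chosen for illustration.

```python
import re

# pattern -> (strings that should match, strings that should not)
EXAMPLES = {
    r"gray|grey": (["gray", "grey"], ["graey"]),
    r"colou?r": (["color", "colour"], ["colouur"]),
    r"goo+gle": (["google", "gooogle"], ["gogle"]),
    r"0*42": (["42", "042", "00042"], ["402"]),
    r"gr(a|e)y": (["gray", "grey"], ["gry"]),
    r"(grand)?father": (["father", "grandfather"], ["grandgrandfather"]),
}


def check(pattern, text):
    """Return True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None


if __name__ == "__main__":
    for pattern, (good, bad) in EXAMPLES.items():
        print(pattern, [check(pattern, s) for s in good], [check(pattern, s) for s in bad])
```

`re.fullmatch` is used rather than `re.search` so that the pattern has to describe the entire string, which is the sense in which a regular expression defines a language.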
10
Regular Expression Examples
- (0|10)*1*
  - ε, 0, 1, 0001, 1010101, 01111111, …
- 1?(00*1)*0*
  - ε, 0, 1, 001, 0010, 00010, 10010010, …
- Patterns for the strings aaa, aabb, abba, aabb, abbbb
  - (aaa|aabb|abba|aabb|abbbb)
  - (a|aa)(bb)*a?
  - a+b+a? (as written, this does not match aaa, because b+ requires at least one b)
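The example strings can be checked mechanically against each pattern. The sketch below assumes Python's `re` module; the helper names `matches` and `unmatched` are hypothetical, and the empty string stands in for ε.

```python
import re

# Strings the slide lists as members of each language ("" stands for ε).
SLIDE_EXAMPLES = {
    r"(0|10)*1*": ["", "0", "1", "0001", "1010101", "01111111"],
    r"1?(00*1)*0*": ["", "0", "1", "001", "0010", "00010", "10010010"],
}

# The five target strings and the three candidate patterns for them.
TARGETS = ["aaa", "aabb", "abba", "aabb", "abbbb"]
CANDIDATES = [r"(aaa|aabb|abba|aabb|abbbb)", r"(a|aa)(bb)*a?", r"a+b+a?"]


def matches(pattern, text):
    """Return True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None


def unmatched(pattern, strings):
    """Return the strings that the pattern fails to match."""
    return [s for s in strings if not matches(pattern, s)]


if __name__ == "__main__":
    for pattern, strings in SLIDE_EXAMPLES.items():
        print(pattern, "misses", unmatched(pattern, strings))
    for pattern in CANDIDATES:
        print(pattern, "misses", unmatched(pattern, TARGETS))
```

Listing the misses, instead of returning a single yes or no, makes it easy to see which target string a candidate pattern leaves out.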
11
Formal Methods of Describing Syntax – Context-free Grammars
- Developed by Noam Chomsky in the mid-1950s
- Language generators, meant to describe the syntax of natural languages
- Represented by variables (non-terminals) that are described recursively in terms of each other and primitive symbols called terminals
- The rules relating the variables are called productions
  - e.g., ⟨…⟩ → boy, ⟨…⟩ → little
- Context-free languages are the theoretical basis for the syntax of most programming languages
* Hopcroft & Ullman Chap 4
12
Chomsky Hierarchy
Four classes (models) of generative devices (grammars) that define four classes of languages
- Regular Grammars (Type 3)
  - A → wB (or A → Bw) or A → w, where A and B are variables, and w is a string of terminals (or empty); a grammar uses either the right-linear form or the left-linear form throughout, not a mixture
  - Regular languages can be recognized by finite automata
  - e.g., 0(10)*: S → 0A, A → 10A | ε; or S → S10 | 0
  - [Figure: a finite automaton: a finite control reading an input tape]
- Context-free Grammars (Type 2)
  - A → α, where A is a variable and α is a string of variables and terminals
  - Context-free languages can be recognized by push-down automata
  - e.g., S → 0S0 | 1S1 | c
  - [Figure: a push-down automaton: a finite control with an input tape and a stack]
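The role of the stack in recognizing a context-free language can be illustrated with the grammar S → 0S0 | 1S1 | c. The sketch below is a deterministic simplification of a push-down automaton, written as an assumption for illustration: the function name `accepts` and the flag `seen_center` (standing in for the two states of the finite control) are hypothetical.

```python
def accepts(text):
    """Recognize the language of S -> 0S0 | 1S1 | c with an explicit stack."""
    stack = []
    seen_center = False  # False: still pushing; True: popping and comparing
    for symbol in text:
        if not seen_center:
            if symbol == "c":
                seen_center = True
            elif symbol in ("0", "1"):
                stack.append(symbol)  # remember the left half
            else:
                return False
        else:
            # The right half must mirror the left half, symbol by symbol.
            if not stack or stack.pop() != symbol:
                return False
    return seen_center and not stack


if __name__ == "__main__":
    for sample in ["c", "0c0", "011c110", "01c01"]:
        print(sample, accepts(sample))
```

A finite automaton cannot do this job, because the left half can be arbitrarily long and every symbol of it has to be remembered until the matching symbol arrives.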
13
Chomsky Hierarchy (cont’d)
- Context-sensitive Grammars (Type 1)
  - αAβ → αγβ, where A is a variable, and α, β, and γ are strings of variables and terminals (α and β may be empty, γ ≠ ε)
  - “Permit replacement of variable A by string γ in the context of α and β”
  - Context-sensitive languages can be recognized by linear-bounded automata (non-deterministic Turing machines whose tape is limited to the cells holding the input)
- Unrestricted Grammars (Type 0)
  - α → β, where α and β are strings of variables and terminals (α ≠ ε)
  - Unrestricted (recursively enumerable) languages can be recognized by Turing machines

Turing Machines
- A simple mathematical model of a computer
- The input tape is infinite to the right
  - The n leftmost cells hold the input
  - Each of the remaining infinitely many cells holds the blank symbol
- In one move, the machine:
  1. Changes state
  2. Prints a symbol on the scanned tape cell, replacing what was written there
  3. Moves the head left or right one cell
[Figure: a Turing machine: a finite control with a head on an input tape]
* Hopcroft & Ullman Chap 4
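The three-part move of a Turing machine can be sketched as a short simulator. Everything specific here is an assumption made for illustration: the transition table `DELTA` describes a hypothetical machine for the language 0^n 1^n (n ≥ 1), `B` is used as the blank symbol, and `max_steps` is a safety bound that a real Turing machine does not have.

```python
BLANK = "B"

# Hypothetical transition table for a machine accepting 0^n 1^n (n >= 1):
# (state, scanned symbol) -> (next state, symbol to print, head move).
DELTA = {
    ("q0", "0"): ("q1", "X", "R"),  # mark a 0, then look for its 1
    ("q0", "Y"): ("q3", "Y", "R"),  # no 0s left: verify only Ys remain
    ("q1", "0"): ("q1", "0", "R"),
    ("q1", "Y"): ("q1", "Y", "R"),
    ("q1", "1"): ("q2", "Y", "L"),  # mark the matching 1, then go back
    ("q2", "0"): ("q2", "0", "L"),
    ("q2", "Y"): ("q2", "Y", "L"),
    ("q2", "X"): ("q0", "X", "R"),
    ("q3", "Y"): ("q3", "Y", "R"),
    ("q3", BLANK): ("q4", BLANK, "R"),
}
ACCEPTING = {"q4"}


def run(text, max_steps=10_000):
    """Simulate the machine on text; return True if it reaches an accepting state."""
    if any(symbol not in "01" for symbol in text):
        return False  # only the input alphabet {0, 1} is allowed
    tape = list(text) or [BLANK]
    state, head = "q0", 0
    for _ in range(max_steps):
        if state in ACCEPTING:
            return True
        move = DELTA.get((state, tape[head]))
        if move is None:
            return False  # no applicable move: halt without accepting
        next_state, printed, direction = move
        state = next_state          # 1. change state
        tape[head] = printed        # 2. print a symbol on the scanned cell
        head += 1 if direction == "R" else -1  # 3. move the head one cell
        if head < 0:
            return False  # the tape is infinite to the right only
        if head == len(tape):
            tape.append(BLANK)  # cells beyond the input hold the blank
    return state in ACCEPTING


if __name__ == "__main__":
    for sample in ["01", "0011", "001", "10"]:
        print(sample, run(sample))
```

The tape is stored as a list that grows on demand, which models "infinite to the right" without allocating anything in advance.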
14
Backus-Naur Form (BNF)
- Invented by John Backus to describe Algol 58 (1959)
- The most widely used method for describing programming language syntax
- Equivalent to context-free grammars
- A meta-language used to describe other languages

e.g., a grammar for a small language (Example 3.1)
  <program> → begin <stmt_list> end
  <stmt_list> → <stmt> | <stmt> ; <stmt_list>
  <stmt> → <var> = <expression>
  <var> → A | B | C
  <expression> → <var> + <var> | <var> - <var> | <var>

- Production (rule): a left-hand side, an arrow, and a right-hand side
- LHS (Left-hand side): Abstraction or Non-terminal or Variable
- RHS (Right-hand side): Lexemes and tokens (terminals), and references to other abstractions
15
Derivations
Repeated application of productions, starting with the start symbol and ending with a sentence (all terminal symbols)

  <program> => begin <stmt_list> end
            => begin <stmt> ; <stmt_list> end
            => begin <var> = <expression> ; <stmt_list> end
            => begin A = <expression> ; <stmt_list> end
            => begin A = <var> + <var> ; <stmt_list> end
            => begin A = B + <var> ; <stmt_list> end
            => begin A = B + C ; <stmt_list> end
            => begin A = B + C ; <stmt> end
            => begin A = B + C ; <var> = <expression> end
            => begin A = B + C ; B = <expression> end
            => begin A = B + C ; B = <var> end
            => begin A = B + C ; B = C end

- Start symbol: <program>
- Sentential form: each string of symbols in the derivation
- Sentence: a sentential form that contains only terminal symbols
- Leftmost derivation: at each step, the leftmost non-terminal of the sentential form is the one replaced
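A leftmost derivation is mechanical enough to script. The sketch below assumes the small-language grammar with non-terminals named `<program>`, `<stmt_list>`, `<stmt>`, `<var>`, and `<expression>`; the dictionary layout, the function `leftmost_derivation`, and the `CHOICES` list (which rule to apply at each step) are hypothetical constructs for this illustration.

```python
# Each non-terminal maps to its alternative right-hand sides.
# The non-terminal names are assumed for this sketch.
GRAMMAR = {
    "<program>": [["begin", "<stmt_list>", "end"]],
    "<stmt_list>": [["<stmt>"], ["<stmt>", ";", "<stmt_list>"]],
    "<stmt>": [["<var>", "=", "<expression>"]],
    "<var>": [["A"], ["B"], ["C"]],
    "<expression>": [["<var>", "+", "<var>"], ["<var>", "-", "<var>"], ["<var>"]],
}

# Index of the alternative to use at each step of the derivation of
# "begin A = B + C ; B = C end".
CHOICES = [0, 1, 0, 0, 0, 1, 2, 0, 0, 1, 2, 2]


def leftmost_derivation(start, choices):
    """Yield every sentential form, always expanding the leftmost non-terminal."""
    form = [start]
    yield " ".join(form)
    for choice in choices:
        index = next((i for i, symbol in enumerate(form) if symbol in GRAMMAR), None)
        if index is None:
            raise ValueError("no non-terminal left to expand")
        form[index:index + 1] = GRAMMAR[form[index]][choice]
        yield " ".join(form)


if __name__ == "__main__":
    for step in leftmost_derivation("<program>", CHOICES):
        print("=>", step)
```

Expanding a different non-terminal at each step (for example the rightmost one) would produce different sentential forms along the way but the same final sentence for the same rule choices.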
16
Parse Trees
- A parse tree: a hierarchical representation of a derivation
- Internal nodes of a parse tree: non-terminal symbols
- Leaf nodes of a parse tree: terminal symbols
- Each subtree of a parse tree: an instance of an abstraction
[Figure: an example parse tree whose leaf nodes include a, =, b, +, and const]
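One way to see how internal nodes and leaves relate is to build a parse tree with a small recursive-descent parser. The sketch below is an illustration under assumptions: it uses a small-language grammar (`<program> → begin <stmt_list> end`, `<stmt_list> → <stmt> | <stmt> ; <stmt_list>`, `<stmt> → <var> = <expression>`, `<var> → A | B | C`, `<expression> → <var> + <var> | <var> - <var> | <var>`) with assumed non-terminal names, represents each internal node as a tuple whose first element is the non-terminal, and uses the hypothetical helpers `parse_program` and `leaves`.

```python
def parse_program(tokens):
    """Parse a list of tokens and return a nested-tuple parse tree."""
    pos = 0

    def expect(terminal):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != terminal:
            raise SyntaxError(f"expected {terminal!r} at token {pos}")
        pos += 1
        return terminal

    def var():
        nonlocal pos
        if pos < len(tokens) and tokens[pos] in ("A", "B", "C"):
            pos += 1
            return ("<var>", tokens[pos - 1])
        raise SyntaxError(f"expected a variable at token {pos}")

    def expression():
        left = var()
        if pos < len(tokens) and tokens[pos] in ("+", "-"):
            operator = expect(tokens[pos])
            return ("<expression>", left, operator, var())
        return ("<expression>", left)

    def stmt():
        target = var()
        return ("<stmt>", target, expect("="), expression())

    def stmt_list():
        first = stmt()
        if pos < len(tokens) and tokens[pos] == ";":
            return ("<stmt_list>", first, expect(";"), stmt_list())
        return ("<stmt_list>", first)

    tree = ("<program>", expect("begin"), stmt_list(), expect("end"))
    if pos != len(tokens):
        raise SyntaxError("unexpected tokens after 'end'")
    return tree


def leaves(tree):
    """Collect the terminal symbols at the leaves, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree[1:] for leaf in leaves(child)]


if __name__ == "__main__":
    tokens = "begin A = B + C ; B = C end".split()
    tree = parse_program(tokens)
    print(tree)
    print(leaves(tree) == tokens)
```

Reading the leaves from left to right gives back the original sentence, and each tuple is a subtree that corresponds to one application of a production.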
17
Reference
- Reference for computation theory
  - Introduction to Automata Theory, Languages, and Computation by John E. Hopcroft, Rajeev Motwani, and Jeffrey D. Ullman, 2nd Ed., Addison Wesley, 2003