Modeling Arithmetic, Computation, and Languages. Mathematical Structures for Computer Science, Chapter 8. Copyright © 2006 W.H. Freeman & Co. MSCS Slides: Algebraic Structures



Section 8.4 Formal Languages: Natural Language
Consider syntax and semantics in the English sentence “The walrus talks loudly.” The meaning, or semantics, of the sentence is a bit surprising. Its form, or syntax, is acceptable, i.e., valid in the language: the various parts of speech (noun, verb, etc.) are strung together in a reasonable way. In contrast, we reject “Loudly walrus the talks” as an illegal combination of parts of speech, that is, as syntactically incorrect and not part of the language.

Formal Language
DEFINITIONS: ALPHABET, VOCABULARY, WORD, LANGUAGE
An alphabet or vocabulary $V$ is a finite, nonempty set of symbols. A word over $V$ is a finite-length string of symbols from $V$. The set $V^*$ is the set of all words over $V$. (See Example 34 in Chapter 2 for a recursive definition of $V^*$.) A language over $V$ is any subset of $V^*$.
A grammar for the language can be described by defining its generative process.
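The definitions above can be made concrete with a small sketch. It lists the words over a two-symbol vocabulary up to a chosen length and then picks out one language as a subset. The function name `words_up_to`, the vocabulary {a, b}, and the sample language are illustrative assumptions, and the empty word is assumed to belong to V* (as the usual recursive definition has it).

```python
from itertools import product

def words_up_to(vocabulary, max_len):
    """Yield every word over `vocabulary` of length 0..max_len, shortest first."""
    symbols = sorted(vocabulary)
    for length in range(max_len + 1):
        for letters in product(symbols, repeat=length):
            yield "".join(letters)

V = {"a", "b"}                       # hypothetical vocabulary
some_words = list(words_up_to(V, 2)) # a finite slice of the infinite set V*

# A language over V is any subset of V*; here: equally many a's and b's.
balanced = [w for w in some_words if w.count("a") == w.count("b")]
print(some_words)
print(balanced)
```

V* itself is infinite whenever V is nonempty, so any program can only ever list a finite slice of it.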

Formal Language
A legitimate form for a sentence is a noun-phrase followed by a verb-phrase. Symbolically:
sentence $\to$ noun-phrase verb-phrase
A legitimate form of noun-phrase is an article followed by a noun:
noun-phrase $\to$ article noun
A legitimate form of verb-phrase is a verb followed by an adverb:
verb-phrase $\to$ verb adverb
The following substitutions seem logical for the sentence:
article $\to$ the
noun $\to$ walrus
verb $\to$ talks
adverb $\to$ loudly

Formal Language
Thus, one can generate the sentence “The walrus talks loudly” by making successive substitutions:
sentence $\Rightarrow$ noun-phrase verb-phrase
$\Rightarrow$ article noun verb-phrase
$\Rightarrow$ the noun verb-phrase
$\Rightarrow$ the walrus verb-phrase
$\Rightarrow$ the walrus verb adverb
$\Rightarrow$ the walrus talks adverb
$\Rightarrow$ the walrus talks loudly
The boldface terms (sentence, noun-phrase, verb-phrase, article, noun, verb, adverb) are those for which further substitutions can be made. The non-boldface terms (the, walrus, talks, loudly) stop or terminate the substitution process.
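The successive substitutions above can be replayed mechanically. The following is a minimal sketch, assuming each replaceable term has exactly one substitution (true for this example only) and always rewriting the leftmost replaceable term; the names `PRODUCTIONS` and `leftmost_derivation` are made up for illustration.

```python
# One substitution per replaceable term, taken from the walrus example.
PRODUCTIONS = {
    "sentence": ["noun-phrase", "verb-phrase"],
    "noun-phrase": ["article", "noun"],
    "verb-phrase": ["verb", "adverb"],
    "article": ["the"],
    "noun": ["walrus"],
    "verb": ["talks"],
    "adverb": ["loudly"],
}

def leftmost_derivation(start):
    """Return every intermediate form, rewriting the leftmost replaceable term each time."""
    form = [start]
    steps = [form]
    while any(sym in PRODUCTIONS for sym in form):
        i = next(k for k, sym in enumerate(form) if sym in PRODUCTIONS)
        form = form[:i] + PRODUCTIONS[form[i]] + form[i + 1:]
        steps.append(form)
    return steps

steps = leftmost_derivation("sentence")
for form in steps:
    print(" ".join(form))
```

The loop stops as soon as only terminating words remain, which mirrors the remark that those terms end the substitution process.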

Grammar for Formal Language
DEFINITION: PHRASE-STRUCTURE (TYPE 0) GRAMMAR
A phrase-structure grammar (type 0 grammar) $G$ is a 4-tuple, $G = (V, V_T, S, P)$, where
$V$ = vocabulary
$V_T$ = nonempty subset of $V$ called the set of terminals
$S$ = element of $V - V_T$ called the start symbol
$P$ = finite set of productions of the form $\alpha \to \beta$, where $\alpha$ is a word over $V$ containing at least one nonterminal symbol and $\beta$ is a word over $V$
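As a rough illustration of the 4-tuple, the sketch below stores the four components in a small record and checks the conditions the definition states. The class name `Grammar`, its field names, and the `check` method are assumptions made for this example; symbols are represented as Python strings and each side of a production as a tuple of symbols.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Grammar:
    vocabulary: frozenset   # V
    terminals: frozenset    # V_T
    start: str              # S
    productions: tuple      # P, as (alpha, beta) pairs of symbol tuples

    def nonterminals(self):
        return self.vocabulary - self.terminals

    def check(self):
        """Raise ValueError if the 4-tuple violates the type 0 definition."""
        if not self.terminals or not self.terminals <= self.vocabulary:
            raise ValueError("V_T must be a nonempty subset of V")
        if self.start not in self.nonterminals():
            raise ValueError("S must be an element of V - V_T")
        for alpha, beta in self.productions:
            if not set(alpha) | set(beta) <= self.vocabulary:
                raise ValueError("productions may only use symbols of V")
            if not any(sym in self.nonterminals() for sym in alpha):
                raise ValueError("left side needs at least one nonterminal")

# The walrus grammar written out as a 4-tuple.
walrus = Grammar(
    vocabulary=frozenset({"sentence", "noun-phrase", "verb-phrase", "article",
                          "noun", "verb", "adverb",
                          "the", "walrus", "talks", "loudly"}),
    terminals=frozenset({"the", "walrus", "talks", "loudly"}),
    start="sentence",
    productions=(
        (("sentence",), ("noun-phrase", "verb-phrase")),
        (("noun-phrase",), ("article", "noun")),
        (("verb-phrase",), ("verb", "adverb")),
        (("article",), ("the",)),
        (("noun",), ("walrus",)),
        (("verb",), ("talks",)),
        (("adverb",), ("loudly",)),
    ),
)
walrus.check()  # passes silently when the definition is satisfied
```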

Generations: Formal Language
DEFINITION: GENERATIONS (DERIVATIONS) IN A LANGUAGE
Let $G$ be a grammar, $G = (V, V_T, S, P)$, and let $w_1$ and $w_2$ be words over $V$. Then $w_1$ directly generates (directly derives) $w_2$, written $w_1 \Rightarrow w_2$, if $\alpha \to \beta$ is a production of $G$, $w_1$ contains an instance of $\alpha$, and $w_2$ is obtained from $w_1$ by replacing that instance of $\alpha$ with $\beta$.
If $w_1, w_2, \ldots, w_n$ are words over $V$ and $w_1 \Rightarrow w_2$, $w_2 \Rightarrow w_3$, ..., $w_{n-1} \Rightarrow w_n$, then $w_1$ generates (derives) $w_n$, written $w_1 \overset{*}{\Rightarrow} w_n$. (By convention, $w_1 \overset{*}{\Rightarrow} w_1$.)
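Both relations in this definition translate fairly directly into string rewriting. The sketch below assumes every symbol is a single character, so a word is an ordinary Python string. `direct_successors`, `generates`, and the toy productions S → aSb and S → ab are hypothetical, and the `max_len` bound is an added restriction to keep the search finite, not part of the definition.

```python
def direct_successors(word, productions):
    """Return every w2 such that word directly generates w2."""
    results = set()
    for alpha, beta in productions:
        start = word.find(alpha)
        while start != -1:  # try each instance of alpha in turn
            results.add(word[:start] + beta + word[start + len(alpha):])
            start = word.find(alpha, start + 1)
    return results

def generates(w1, wn, productions, max_len):
    """Decide whether w1 generates wn, exploring only words of length <= max_len."""
    seen, frontier = {w1}, [w1]
    while frontier:
        word = frontier.pop()
        if word == wn:          # zero steps also count, matching the convention
            return True
        for nxt in direct_successors(word, productions):
            if len(nxt) <= max_len and nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return False

TOY_PRODUCTIONS = [("S", "aSb"), ("S", "ab")]  # hypothetical example grammar
print(direct_successors("aSb", TOY_PRODUCTIONS))
print(generates("S", "aaabbb", TOY_PRODUCTIONS, max_len=6))
```

Because no production of the toy grammar shortens a word, bounding the length by the length of the target loses no derivations here; for grammars with shrinking productions the bound would make the search incomplete.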

Formal Language
DEFINITION: LANGUAGE GENERATED BY A GRAMMAR
Given a grammar $G$, the language $L$ generated by $G$, sometimes denoted $L(G)$, is the set
$L = \{ w \in V_T^* \mid S \overset{*}{\Rightarrow} w \}$
In other words, $L$ is the set of all strings of terminals generated from the start symbol.
Note: Once a string $w$ of terminals has been obtained, no productions can be applied to $w$, and $w$ cannot generate any other words.

Example of a derivation
Let $L = \{ a^n b^n c^n \mid n \ge 1 \}$. A grammar generating $L$ is $G = (V, V_T, S, P)$ where $V = \{a, b, c, S, B, C\}$, $V_T = \{a, b, c\}$, and $P$ consists of the following productions:
1. $S \to aSBC$
2. $S \to aBC$
3. $CB \to BC$
4. $aB \to ab$
5. $bB \to bb$
6. $bC \to bc$
7. $cC \to cc$
It is fairly easy to see how to generate any particular member of $L$ using these productions. Thus, a derivation of the string $a^2 b^2 c^2$ is
$S \Rightarrow aSBC \Rightarrow aaBCBC \Rightarrow aaBBCC \Rightarrow aabBCC \Rightarrow aabbCC \Rightarrow aabbcC \Rightarrow aabbcc$
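One way to check this derivation is to ask, for each consecutive pair of words, which numbered production turns the first into the second. The sketch below does that for the seven productions listed above; the helper name `production_used` is an assumption for illustration.

```python
# The seven productions of the example, in the order they are numbered.
PRODUCTIONS = [
    ("S", "aSBC"), ("S", "aBC"), ("CB", "BC"),
    ("aB", "ab"), ("bB", "bb"), ("bC", "bc"), ("cC", "cc"),
]

def production_used(w1, w2):
    """Return the 1-based number of a production taking w1 to w2 in one step, or None."""
    for number, (alpha, beta) in enumerate(PRODUCTIONS, start=1):
        for i in range(len(w1) - len(alpha) + 1):
            if w1.startswith(alpha, i) and w1[:i] + beta + w1[i + len(alpha):] == w2:
                return number
    return None

derivation = ["S", "aSBC", "aaBCBC", "aaBBCC",
              "aabBCC", "aabbCC", "aabbcC", "aabbcc"]
used = [production_used(a, b) for a, b in zip(derivation, derivation[1:])]
print(used)  # [1, 2, 3, 4, 5, 6, 7]
```

Each production is used exactly once, in numerical order, which is a coincidence of the case n = 2; longer members of L repeat productions 1, 3, 5, and 7.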

Classes of Grammars
DEFINITIONS: CONTEXT-SENSITIVE, CONTEXT-FREE, AND REGULAR GRAMMARS; CHOMSKY HIERARCHY
A grammar $G$ is context-sensitive (type 1) if it obeys the erasing convention and if, for every production $\alpha \to \beta$ (except $S \to \lambda$), the word $\beta$ is at least as long as the word $\alpha$.
A grammar $G$ is context-free (type 2) if it obeys the erasing convention and, for every production $\alpha \to \beta$, $\alpha$ is a single nonterminal.
A grammar $G$ is regular (type 3) if it obeys the erasing convention and, for every production $\alpha \to \beta$ (except $S \to \lambda$), $\alpha$ is a single nonterminal and $\beta$ is of the form $t$ or $tW$, where $t$ is a terminal symbol and $W$ is a nonterminal symbol.
This hierarchy of grammars, from type 0 to type 3, is called the Chomsky hierarchy.
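The shape conditions in these definitions can be tested production by production. The sketch below is a simplified classifier under stated assumptions: symbols are single characters, the empty word is the empty string, every symbol not in `terminals` is a nonterminal, and the erasing convention is not checked separately (any shrinking production other than S to the empty word simply yields type 0). The function name `grammar_type` is made up for this example.

```python
def grammar_type(productions, terminals, start="S"):
    """Return the highest type (0 to 3) whose production-shape conditions all hold."""
    def exempt(alpha, beta):            # the permitted production S -> empty word
        return alpha == start and beta == ""

    def single_nonterminal(alpha):
        return len(alpha) == 1 and alpha not in terminals

    def regular_right_side(beta):       # a terminal t, or t followed by a nonterminal W
        return (len(beta) in (1, 2) and beta[0] in terminals
                and all(sym not in terminals for sym in beta[1:]))

    if not all(exempt(a, b) or len(b) >= len(a) for a, b in productions):
        return 0
    if not all(single_nonterminal(a) for a, _ in productions):
        return 1
    if not all(exempt(a, b) or regular_right_side(b) for a, b in productions):
        return 2
    return 3

# Productions of the a^n b^n c^n example grammar.
ANBNCN = [("S", "aSBC"), ("S", "aBC"), ("CB", "BC"),
          ("aB", "ab"), ("bB", "bb"), ("bC", "bc"), ("cC", "cc")]
print(grammar_type(ANBNCN, terminals={"a", "b", "c"}))    # 1
print(grammar_type([("S", "aS"), ("S", "b")], {"a", "b"}))  # 3
```

The result describes the grammar as written, not the language: an equivalent grammar of a higher type may exist for the same language.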

Classes of Grammar
In a context-free grammar, a single nonterminal symbol on the left of a production can be replaced wherever it appears by the right side of the production. In a context-sensitive grammar, a given nonterminal symbol can perhaps be replaced only if it is part of a particular string, or context; hence the names context-free and context-sensitive.
Any regular grammar is also context-free, and any context-free grammar is also context-sensitive.

Grammars and Languages
DEFINITION: LANGUAGE TYPES
A language is type 0 (context-sensitive, context-free, or regular) if it can be generated by a type 0 (context-sensitive, context-free, or regular) grammar.
Languages can be classified based on the relationships among the four grammar types, as shown in the figure here. Thus, any regular language is also context-free because any regular grammar is also a context-free grammar, and so on.
DEFINITION: EQUIVALENT GRAMMARS
Two grammars are equivalent if they generate the same language.

Computational Devices
The most general computational device is the Turing machine, and the most general language is a type 0 language. The sets recognized by Turing machines correspond to type 0 languages.
There are computational devices with capabilities midway between those of finite-state machines and those of Turing machines. These devices recognize exactly the context-free languages and the context-sensitive languages, respectively.
The type of device that recognizes the context-free languages is called a pushdown automaton, or pda. A pda consists of a finite-state unit that reads input from a tape and controls activity in a stack. Symbols from some alphabet can be pushed onto or popped off the top of the stack.

Computational Devices
The finite-state unit in a pda, as a function of the input symbol read, the present state, and the top symbol on the stack, has a finite number of possible next moves. A pda has a choice of next moves, and it recognizes the set of all inputs for which some sequence of moves exists that causes it to empty its stack. It can be shown that any set recognized by a pda is a context-free language, and conversely.
The type of device that recognizes the context-sensitive languages is called a linear bounded automaton, or lba. An lba is a Turing machine whose read-write head is restricted to that portion of the tape containing the original input; in addition, at each step it has a choice of possible next moves. An lba recognizes the set of all inputs for which some sequence of moves exists that causes it to halt in a final state. Any set recognized by an lba can be shown to be a context-sensitive language, and conversely.
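The idea that a pda accepts when some sequence of moves empties its stack can be sketched by tracking every configuration the machine could be in. The example below is an illustration under assumptions not found in the slides: every move reads one input symbol (no moves without input), the stack is a string whose first character is the top, and the sample machine, with made-up state names "push" and "pop" and bottom marker "Z", recognizes nonempty even-length palindromes over {a, b}.

```python
def pda_accepts(word, transitions, start_state, start_stack):
    """Accept if some sequence of moves consumes `word` and leaves the stack empty.

    `transitions` maps (state, input_symbol, stack_top) to a set of
    (next_state, replacement) pairs; `replacement` is the string that
    replaces the popped top symbol.
    """
    configs = {(start_state, start_stack)}
    for symbol in word:
        next_configs = set()
        for state, stack in configs:
            if not stack:
                continue  # no move is possible once the stack is empty
            for next_state, replacement in transitions.get((state, symbol, stack[0]), ()):
                next_configs.add((next_state, replacement + stack[1:]))
        configs = next_configs
    return any(stack == "" for _, stack in configs)

# Hypothetical machine for nonempty even-length palindromes over {a, b}.
EVEN_PALINDROMES = {}
for x in "ab":
    EVEN_PALINDROMES[("push", x, "Z")] = {("push", x)}   # first symbol replaces the marker
    for top in "ab":
        moves = {("push", x + top)}                      # keep stacking the first half
        if top == x:
            moves.add(("pop", ""))                       # or guess the middle was just passed
        EVEN_PALINDROMES[("push", x, top)] = moves
    EVEN_PALINDROMES[("pop", x, x)] = {("pop", "")}      # match the second half

print(pda_accepts("abba", EVEN_PALINDROMES, "push", "Z"))
print(pda_accepts("abab", EVEN_PALINDROMES, "push", "Z"))
```

The machine cannot know where the middle of the input is, so it keeps both options open whenever the input symbol equals the stack top; the input is accepted if at least one of those guesses ends with an empty stack.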

Computational Devices
The figure below shows the relationship between the hierarchy of languages and the hierarchy of computational devices.

Context-Free Grammar
Context-free grammars are important for the following three reasons:
Context-free grammars seem to be the easiest to work with because they allow replacing only one symbol at a time.
Furthermore, many programming languages are defined such that sections of syntax, if not the whole language, can be described by context-free grammars.
Finally, a derivation in a context-free grammar has a nice graphical representation called a parse tree.

Example
A formal context-free grammar to generate identifiers in some programming language could be presented as follows:
identifier $\to$ letter
identifier $\to$ identifier letter
identifier $\to$ identifier digit
letter $\to$ a, letter $\to$ b, ..., letter $\to$ z
digit $\to$ 0, digit $\to$ 1, ..., digit $\to$ 9
Here, the set of terminals is {a, b, ..., z, 0, 1, ..., 9} and identifier is the start symbol.
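Membership in the language of this grammar can be tested by undoing the productions from the right-hand end of a word. The following sketch assumes lowercase ASCII letters and the digits 0 to 9 as the terminals; the function name `is_identifier` is chosen for illustration.

```python
import string

LETTERS = set(string.ascii_lowercase)   # terminals a..z
DIGITS = set(string.digits)             # terminals 0..9

def is_identifier(word):
    """Return True if `word` can be derived from the start symbol identifier."""
    if len(word) == 1:
        return word in LETTERS                    # identifier -> letter
    if len(word) > 1 and (word[-1] in LETTERS or word[-1] in DIGITS):
        return is_identifier(word[:-1])           # identifier -> identifier letter | identifier digit
    return False

for candidate in ["d2q", "x", "2dq", ""]:
    print(repr(candidate), is_identifier(candidate))
```

The recursion peels off the last symbol because the grammar grows identifiers on the right; the only way to stop is the production that yields a single letter, which is why an identifier cannot begin with a digit.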

Example
The word d2q can be derived as follows:
identifier $\Rightarrow$ identifier letter $\Rightarrow$ identifier digit letter $\Rightarrow$ letter digit letter $\Rightarrow$ d digit letter $\Rightarrow$ d2 letter $\Rightarrow$ d2q
We can represent this derivation as a tree with the start symbol for the root as seen in the figure below. When a production is applied to a node, that node is replaced at the next lower level of the tree by the symbols in the right-hand side of the production used.
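A parse tree for d2q can be written down as nested data, with each node holding a symbol and the symbols that replaced it. The sketch below uses a made-up representation, (symbol, children) pairs with helper names `leaf`, `frontier`, and `show`, and builds the tree by hand from the derivation rather than by parsing.

```python
def leaf(symbol):
    """A node with no children, i.e. a terminal symbol."""
    return (symbol, [])

# Root is the start symbol; each node's children are the right-hand side
# of the production applied to it.
tree = ("identifier", [
    ("identifier", [
        ("identifier", [("letter", [leaf("d")])]),
        ("digit", [leaf("2")]),
    ]),
    ("letter", [leaf("q")]),
])

def frontier(node):
    """Concatenate the leaves from left to right."""
    symbol, children = node
    if not children:
        return symbol
    return "".join(frontier(child) for child in children)

def show(node, depth=0):
    """Print the tree with one level of indentation per tree level."""
    symbol, children = node
    print("  " * depth + symbol)
    for child in children:
        show(child, depth + 1)

show(tree)
print(frontier(tree))
```

Reading the leaves from left to right recovers the derived word, while the order in which productions were applied is no longer visible in the tree.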