Instructor: Nick Cercone - 3050 CSEB - 1 NL Grammar Hierarchies Regular Expressions, Finite State Automata, Markov Algorithms.

Regular Expressions
Regular expressions consist of constants and operators that denote sets of strings and operations over these sets, respectively. The following definition is standard and appears in most textbooks on formal language theory. Given a finite alphabet Σ, the following constants are defined:
(empty set) ∅, denoting the set ∅.
(empty string) ε, denoting the set {ε}, which contains only the "empty" string with no characters at all.
(literal character) a in Σ, denoting the set {a}, which contains only the character a.

Regular Expressions
The following operations are defined:
(concatenation) RS, denoting the set { xy | x in R and y in S }. For example, {"ab", "c"}{"d", "ef"} = {"abd", "abef", "cd", "cef"}.
(alternation) R | S, denoting the set union of R and S. For example, {"ab", "c"}|{"ab", "d", "ef"} = {"ab", "c", "d", "ef"}.
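The two operations can be tried directly on finite sets of strings. The following is a minimal sketch using Python sets; the function names concat and alternate are illustrative choices, not part of the slides.

```python
def concat(r, s):
    """Concatenation RS: each string of r followed by each string of s."""
    return {x + y for x in r for y in s}


def alternate(r, s):
    """Alternation R | S: the set union of r and s."""
    return r | s


print(sorted(concat({"ab", "c"}, {"d", "ef"})))          # ['abd', 'abef', 'cd', 'cef']
print(sorted(alternate({"ab", "c"}, {"ab", "d", "ef"})))  # ['ab', 'c', 'd', 'ef']
```

Both calls reproduce the example sets given on the slide.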

Regular Expressions
(Kleene star) R*, denoting the smallest superset of R that contains ε and is closed under string concatenation. This is the set of all strings that can be made by concatenating zero or more strings in R. For example, {"ab", "c"}* = {ε, "ab", "c", "abab", "abc", "cab", "cc", "ababab", "abcab", ...}.
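Since R* is usually infinite, a program can only enumerate part of it. The sketch below computes the strings of R* up to a chosen length by closing the set under concatenation with R; the function name kleene_star and the max_len cut-off are assumptions made for illustration.

```python
def kleene_star(r, max_len):
    """Return the strings of R* whose length is at most max_len.

    Starts from the empty string and keeps appending strings of r
    until no new string within the length bound appears.
    """
    result = {""}
    frontier = {""}
    while frontier:
        frontier = {w + x for w in frontier for x in r
                    if len(w + x) <= max_len} - result
        result |= frontier
    return result


print(sorted(kleene_star({"ab", "c"}, 3), key=lambda w: (len(w), w)))
# ['', 'c', 'ab', 'cc', 'abc', 'cab', 'ccc']
```

Raising max_len to 6 brings in longer members such as "ababab" and "abcab".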

Regular Expressions
Regular expressions over Σ are defined recursively as follows:
∅ is a regular expression;
ε is a regular expression;
if a ∈ Σ is a letter, then a is a regular expression;
if r1 and r2 are regular expressions, then so are (r1 + r2) and (r1 · r2);
if r is a regular expression, then so is (r)∗;
nothing else is a regular expression over Σ.
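A recursive definition of this kind translates naturally into a recursive membership test. The sketch below is one possible encoding, not the slides' own: expressions are nested tuples tagged "empty", "eps", "sym", "+", "." and "*", and matches decides whether a string belongs to the denoted language. It is written for clarity and takes exponential time on long inputs.

```python
def matches(r, w):
    """Decide whether string w is in the language of expression r."""
    kind = r[0]
    if kind == "empty":          # the empty set contains nothing
        return False
    if kind == "eps":            # only the empty string
        return w == ""
    if kind == "sym":            # a single letter
        return w == r[1]
    if kind == "+":              # (r1 + r2): either side may match
        return matches(r[1], w) or matches(r[2], w)
    if kind == ".":              # (r1 . r2): try every split point
        return any(matches(r[1], w[:i]) and matches(r[2], w[i:])
                   for i in range(len(w) + 1))
    if kind == "*":              # (r)*: empty, or non-empty prefix then (r)*
        return w == "" or any(matches(r[1], w[:i]) and matches(r, w[i:])
                              for i in range(1, len(w) + 1))
    raise ValueError(f"not a regular expression: {r!r}")


# ((a . b) + c)*
AB_OR_C_STAR = ("*", ("+", (".", ("sym", "a"), ("sym", "b")), ("sym", "c")))
print(matches(AB_OR_C_STAR, "abcab"))  # True
print(matches(AB_OR_C_STAR, "ba"))     # False
```

Requiring a non-empty prefix in the star case is what guarantees that the recursion terminates.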

Finite State Automata
Automata are models of computation: they compute languages. A finite-state automaton is a five-tuple ⟨Q, q0, Σ, δ, F⟩, where Σ is a finite set of alphabet symbols, Q is a finite set of states, q0 ∈ Q is the initial state, F ⊆ Q is a set of final (accepting) states, and δ ⊆ Q × Σ × Q is a relation from states and alphabet symbols to states.

Finite State Automata
Example: Finite-state automaton
Q = {q0, q1, q2, q3}
Σ = {c, a, t, r}
F = {q3}
δ = {four transition triples, not preserved in this transcript}
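To make the five-tuple concrete, here is a small sketch of acceptance. The transition triples of the example are not available in this transcript, so the relation DELTA below is an assumption chosen to fit the given Q, Σ and F (it accepts "cat" and "car"); the function name accepts is likewise illustrative.

```python
Q = {"q0", "q1", "q2", "q3"}
SIGMA = {"c", "a", "t", "r"}
Q0 = "q0"
F = {"q3"}
# Assumed transition relation (not taken from the slide).
DELTA = {("q0", "c", "q1"), ("q1", "a", "q2"),
         ("q2", "t", "q3"), ("q2", "r", "q3")}


def accepts(word, delta=DELTA, start=Q0, final=F):
    """Follow delta symbol by symbol; delta is a relation, so track a set of states."""
    current = {start}
    for symbol in word:
        current = {q2 for (q1, a, q2) in delta if q1 in current and a == symbol}
    return bool(current & final)


print(accepts("cat"), accepts("car"), accepts("ca"))  # True True False
```

Tracking a set of current states rather than a single state keeps the sketch valid when δ is not a function.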

Finite State Automata
The reflexive transitive extension of the transition relation δ is a new relation, δ̂, defined as follows:
–for every state q ∈ Q, (q, ε, q) ∈ δ̂;
–for every string w ∈ Σ∗ and letter a ∈ Σ, if (q, w, q′) ∈ δ̂ and (q′, a, q′′) ∈ δ then (q, w · a, q′′) ∈ δ̂.
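The two clauses of this definition can be run as a loop: start with the reflexive triples and repeatedly extend known paths by one transition. The sketch below does this up to a length bound. The automaton used (accepting "cat" and "car") is an assumed example, and delta_hat and max_len are illustrative names.

```python
def delta_hat(states, delta, max_len):
    """Triples (q, w, q') of the extended relation with len(w) <= max_len."""
    # First clause: (q, "", q) for every state; "" stands for epsilon.
    result = {(q, "", q) for q in states}
    frontier = set(result)
    for _ in range(max_len):
        # Second clause: extend each known path by one transition of delta.
        frontier = {(q, w + a, q2)
                    for (q, w, q1) in frontier
                    for (p, a, q2) in delta if p == q1}
        result |= frontier
    return result


# Assumed automaton, for illustration only.
STATES = {"q0", "q1", "q2", "q3"}
DELTA = {("q0", "c", "q1"), ("q1", "a", "q2"),
         ("q2", "t", "q3"), ("q2", "r", "q3")}

paths = delta_hat(STATES, DELTA, 3)
print(("q0", "cat", "q3") in paths)  # True
```

A word w is accepted exactly when some triple (q0, w, f) with f a final state belongs to the extended relation.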

Finite State Automata
Example: Paths
For the finite-state automaton of the preceding example, δ̂ is a set of triples of the form (q, w, q′); the triples listed on the slide are not preserved in this transcript.

Finite State Automata
An extension: ε-moves. The transition relation δ is extended to: δ ⊆ Q × (Σ ∪ {ε}) × Q.
Example: Automata with ε-moves. An automaton accepting the language {do, undo, done, undone}: [state diagram not preserved]
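One way to run an automaton with ε-moves is to take the ε-closure of the current state set after every step. The state diagram for this example is not available here, so the automaton below is a hypothetical one built to accept {do, undo, done, undone}; its state names, DELTA, eps_closure and accepts are all assumptions.

```python
EPS = ""  # stands for epsilon, the empty string

# Hypothetical automaton: "un" is optional at the front, "ne" at the end.
DELTA = {
    ("q0", "u", "q1"), ("q1", "n", "q2"), ("q0", EPS, "q2"),
    ("q2", "d", "q3"), ("q3", "o", "q4"),
    ("q4", "n", "q5"), ("q5", "e", "q6"), ("q4", EPS, "q6"),
}
START, FINAL = "q0", {"q6"}


def eps_closure(states, delta):
    """All states reachable from `states` using only epsilon-moves."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for (p, a, r) in delta:
            if p == q and a == EPS and r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def accepts(word, delta=DELTA, start=START, final=FINAL):
    """Read the word one symbol at a time, closing under epsilon-moves."""
    current = eps_closure({start}, delta)
    for symbol in word:
        step = {r for (p, a, r) in delta if p in current and a == symbol}
        current = eps_closure(step, delta)
    return bool(current & final)


print([w for w in ["do", "undo", "done", "undone", "un", "don"] if accepts(w)])
# ['do', 'undo', 'done', 'undone']
```

The two ε-moves let the automaton skip the optional prefix and suffix without reading any input.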

Formal language theory – definitions
If L is a language, then the reversal of L, denoted L^R, is the language {w | w^R ∈ L}. If L1 and L2 are languages, then L1 · L2 = {w1 · w2 | w1 ∈ L1 and w2 ∈ L2}.
Example: Language operations
–Let L1 = {i, you, he, she, it, we, they}, L2 = {smile, sleep}.
–Then L1^R = {i, uoy, eh, ehs, ti, ew, yeht} and L1 · L2 = {ismile, yousmile, hesmile, shesmile, itsmile, wesmile, theysmile, isleep, yousleep, hesleep, shesleep, itsleep, wesleep, theysleep}.
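For finite languages both definitions can be checked mechanically. A minimal sketch, with illustrative function names:

```python
def reverse_language(lang):
    """L^R: the reversal of every string in lang."""
    return {w[::-1] for w in lang}


def concat_languages(l1, l2):
    """L1 . L2: every string of l1 followed by every string of l2."""
    return {w1 + w2 for w1 in l1 for w2 in l2}


L1 = {"i", "you", "he", "she", "it", "we", "they"}
L2 = {"smile", "sleep"}

print(sorted(reverse_language(L1)))   # ['eh', 'ehs', 'ew', 'i', 'ti', 'uoy', 'yeht']
print(len(concat_languages(L1, L2)))  # 14
```

Reversing twice returns the original language, which matches the definition {w | w^R ∈ L}.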

Formal language theory – definitions
If L is a language, then L^0 = {ε}. Then, for i > 0, L^i = L · L^(i−1).
Example: Language exponentiation
Let L be the set of words {bau, haus, hof, frau}. Then L^0 = {ε}, L^1 = L and L^2 = {baubau, bauhaus, bauhof, baufrau, hausbau, haushaus, haushof, hausfrau, hofbau, hofhaus, hofhof, hoffrau, fraubau, frauhaus, frauhof, fraufrau}.
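The recursive definition of exponentiation carries over line for line. A short sketch, where the function name power is an illustrative choice:

```python
def power(lang, i):
    """L^i, with L^0 = {""} and L^i = L . L^(i-1) for i > 0."""
    if i == 0:
        return {""}
    return {w1 + w2 for w1 in lang for w2 in power(lang, i - 1)}


L = {"bau", "haus", "hof", "frau"}
print(power(L, 1) == L)       # True
print(len(power(L, 2)))       # 16
print("hausfrau" in power(L, 2))  # True
```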

Formal language theory – definitions
The Kleene closure of L is denoted L∗ and is defined as L∗ = ⋃_{i=0}^{∞} L^i. Similarly, L+ = ⋃_{i=1}^{∞} L^i.
Example: Kleene closure
Let L = {dog, cat}. Observe that L^0 = {ε}, L^1 = {dog, cat}, L^2 = {catcat, catdog, dogcat, dogdog}, etc. Thus L∗ contains, among its infinite set of strings, the strings ε, cat, dog, catcat, catdog, dogcat, dogdog, catcatcat, catdogcat, dogcatcat, dogdogcat, etc.
The notation Σ∗ should now be clear: it is simply a special case of L∗, where L = Σ.
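The difference between L∗ and L+ is only whether the union starts at i = 0 or i = 1. The sketch below builds a finite approximation of either union with itertools.product; the function name closure_up_to and the bound k are assumptions for illustration.

```python
from itertools import product


def closure_up_to(lang, k, positive=False):
    """Union of L^i for i = 0..k, or for i = 1..k when positive is True."""
    start = 1 if positive else 0
    return {"".join(words)
            for i in range(start, k + 1)
            for words in product(sorted(lang), repeat=i)}


L = {"dog", "cat"}
print(len(closure_up_to(L, 2)))                 # 7
print("" in closure_up_to(L, 2, positive=True))  # False
```

With repeat=0, product yields a single empty tuple, which joins to the empty string and so plays the role of L^0 = {ε}.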

Markov Algorithms
A Markov algorithm is a finite sequence P1, P2, ..., Pn of Markov productions to be applied to strings in a given alphabet according to the following rules. Let S be a given string. The sequence is searched to find the first production Pi whose antecedent occurs in S. If no such production exists, the operation of the algorithm halts without change in S. Otherwise, the first such production is applied to S. If it is a conclusive production, the operation of the algorithm halts without further change in S. If it is a simple production, a new search is conducted using the string S' into which S has been transformed. If the operation of the algorithm ultimately ceases with a string S*, we say that S* is the result of applying the algorithm to S.

Markov Algorithms
Example: Take the alphabet to be {a, b, c, d}. The algorithm is given below, where Λ denotes the empty string.
Algorithm M1
M11: [conclusive] ad → dc
M12: [simple] ba → Λ
M13: [simple] a → bc
M14: [simple] bc → bba
M15: [simple] Λ → a
Taking S = "dcb", we apply the algorithm:
by M15, dcb becomes adcb;
by M11, adcb becomes dccb, and the algorithm halts.
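The search-and-apply loop is short enough to sketch in full. The version below encodes each production as an (antecedent, consequent, conclusive) triple and writes the empty string as "". Rewriting the leftmost occurrence of the antecedent is an assumption (the slides only say that the first applicable production is applied), and run_markov and max_steps are illustrative names.

```python
def run_markov(productions, s, max_steps=1000):
    """Apply an ordered list of productions to s until the algorithm halts."""
    for _ in range(max_steps):
        for antecedent, consequent, conclusive in productions:
            pos = s.find(antecedent)  # "" is found at position 0 of any string
            if pos != -1:
                s = s[:pos] + consequent + s[pos + len(antecedent):]
                break
        else:
            return s  # no antecedent occurs in s: halt without change
        if conclusive:
            return s  # a conclusive production was applied: halt
    raise RuntimeError("no halt within max_steps")


M1 = [
    ("ad", "dc", True),    # M11, conclusive
    ("ba", "", False),     # M12
    ("a", "bc", False),    # M13
    ("bc", "bba", False),  # M14
    ("", "a", False),      # M15
]

print(run_markov(M1, "dcb"))  # dccb
```

On "dcb" only M15 applies at first, giving "adcb"; M11 then fires and, being conclusive, ends the run with "dccb".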

Markov Algorithms
Example: Let α be a marker not in the alphabet, and let Λ denote the empty string. If S is a string in the alphabet, the result of applying algorithm M3 to S is the string SA.
Algorithm M3
M31: [interchange] αξ → ξα, ξ a member of the alphabet
M32: [conclusive] α → A
M33: Λ → α
Since S initially does not contain α, the third production applies first and places α at the front of S. The first production is then used to move α past the symbols in S. If S contains n occurrences of symbols, then after n such steps we obtain the string Sα. At this point the first production no longer applies, and the second production produces SA. Since this production is conclusive, the string SA is then the result.

Markov Algorithms
In the preceding example, we have introduced a new notation: in the first production we have used the variable ξ, which ranges over the symbols in the alphabet. Thus the first line is not really a production but a production schema, denoting all the productions that can be obtained by substituting symbols of the alphabet for ξ. Because of the manner in which Markov algorithms are used, the order in which the productions are written is vital. If the first two lines of algorithm M3 were interchanged, the result would be to transform S into AS rather than SA, and the productions represented by αξ → ξα would never be used.
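Both points, schema expansion and sensitivity to order, can be checked with a small sketch. It assumes the marker is written α and the schema variable ξ, takes the alphabet to be {A, B, C, D} for the demonstration, and rewrites the leftmost occurrence of an antecedent; expand and run are illustrative helper names.

```python
MARK = "α"  # marker symbol; assumed not to occur in the alphabet


def expand(lhs, rhs, alphabet, var="ξ"):
    """Instantiate a production schema once per alphabet symbol."""
    return [(lhs.replace(var, x), rhs.replace(var, x), False) for x in alphabet]


def run(productions, s, max_steps=10_000):
    """Apply (antecedent, consequent, conclusive) triples in order."""
    for _ in range(max_steps):
        for lhs, rhs, conclusive in productions:
            pos = s.find(lhs)  # leftmost occurrence (an assumption)
            if pos != -1:
                s = s[:pos] + rhs + s[pos + len(lhs):]
                break
        else:
            return s
        if conclusive:
            return s
    raise RuntimeError("no halt within max_steps")


ALPHABET = "ABCD"
interchange = expand(MARK + "ξ", "ξ" + MARK, ALPHABET)  # the schema
append_a = [(MARK, "A", True)]                          # conclusive
introduce = [("", MARK, False)]                         # empty antecedent

M3 = interchange + append_a + introduce
M3_SWAPPED = append_a + interchange + introduce

print(run(M3, "BCD"))          # BCDA
print(run(M3_SWAPPED, "BCD"))  # ABCD
```

In the swapped order the conclusive production fires as soon as the marker appears, so the interchange productions are never reached.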

Markov Algorithms
Example: Another procedure which is quite common is that of reversing a string of characters. We do this by moving the first character to the end as before, then moving the next character down to the position just preceding the first character, and so on.
Markers: α, β (Λ denotes the empty string)
Algorithm M10
M101: [conclusive] αβ → Λ
M102: αξβ → βξ
M103: αξφ → φαξ
M104: α → β
M105: Λ → α
where ξ, φ are members of the alphabet.

Markov Algorithms
Illustrating this algorithm on the string "ABCD", we have:
by M105 => α A B C D
by M103 => B α A C D
by M103 => B C α A D
by M103 => B C D α A
by M104 => B C D β A
by M105 => α B C D β A
by M103 => C α B D β A
by M103 => C D α B β A
by M102 => C D β B A
by M105 => α C D β B A
by M103 => D α C β B A
by M102 => D β C B A
by M105 => α D β C B A
by M102 => β D C B A
by M105 => α β D C B A
by M101 => D C B A
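The trace can be regenerated by a program that records which production fires at each step. The sketch below rests on several assumptions: the markers are written α and β, M101 is conclusive, the alphabet is {A, B, C, D}, the schemata are expanded into concrete productions, and the leftmost occurrence of an antecedent is rewritten. The names RULES and trace are illustrative.

```python
from itertools import product

ALPHABET = "ABCD"
ALPHA, BETA = "α", "β"  # markers, assumed not to occur in the alphabet

# (label, antecedent, consequent, conclusive); schemata are expanded per symbol.
RULES = (
    [("M101", ALPHA + BETA, "", True)]
    + [("M102", ALPHA + x + BETA, BETA + x, False) for x in ALPHABET]
    + [("M103", ALPHA + x + y, y + ALPHA + x, False)
       for x, y in product(ALPHABET, repeat=2)]
    + [("M104", ALPHA, BETA, False), ("M105", "", ALPHA, False)]
)


def trace(rules, s, max_steps=10_000):
    """Yield (label, string) after each applied production until the halt."""
    for _ in range(max_steps):
        for label, lhs, rhs, conclusive in rules:
            pos = s.find(lhs)
            if pos != -1:
                s = s[:pos] + rhs + s[pos + len(lhs):]
                yield label, s
                break
        else:
            return  # nothing applies: halt
        if conclusive:
            return  # conclusive production applied: halt


for label, string in trace(RULES, "ABCD"):
    print(f"by {label} => {' '.join(string)}")
```

Under these assumptions the run takes sixteen steps and ends with D C B A.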

Other Concluding Remarks
A PSYCHOLOGICAL TIP
Whenever you're called on to make up your mind,
and you're hampered by not having any,
the best way to solve the dilemma, you'll find,
is simply by spinning a penny.
No -- not so that chance shall decide the affair
while you're passively standing there moping;
but the moment the penny is up in the air,
you suddenly know what you're hoping.