LING 438/538 Computational Linguistics Sandiway Fong Lecture 8: 9/29.



Administrivia reminder –homework 2 due tonight

Last Time
regular grammars
  – aka Chomsky hierarchy type-3 grammars
  – are formal grammars with severe restrictions on what can appear on the RHS
  – are limited in generative capacity or power
  – in Prolog DCG notation:
      x --> y, [t].   x --> [t].   (left recursive variant)
    or
      x --> [t], y.   x --> [t].   (right recursive variant)
  – can’t have both left and right recursive rules in the same grammar
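A minimal sketch of the restriction above, written in Python rather than Prolog. The rule encoding (a left-hand side paired with a list of right-hand-side symbols, terminals written in brackets as in DCG notation) and the function names are assumptions made for illustration only.

```python
def is_terminal(symbol):
    # Terminals are written DCG-style in brackets, e.g. "[t]".
    return symbol.startswith("[") and symbol.endswith("]")


def rule_shape(rhs):
    # Classify one right-hand side by the shapes allowed in a regular grammar.
    if len(rhs) == 1 and is_terminal(rhs[0]):
        return "terminal"                      # x --> [t].
    if len(rhs) == 2 and is_terminal(rhs[0]) and not is_terminal(rhs[1]):
        return "right"                         # x --> [t], y.
    if len(rhs) == 2 and not is_terminal(rhs[0]) and is_terminal(rhs[1]):
        return "left"                          # x --> y, [t].
    return "other"


def is_regular_grammar(rules):
    # All rules must be right recursive (plus terminal rules),
    # or all left recursive (plus terminal rules); never a mixture.
    shapes = {rule_shape(rhs) for _, rhs in rules}
    return shapes <= {"terminal", "right"} or shapes <= {"terminal", "left"}


right_variant = [("x", ["[t]", "y"]), ("x", ["[t]"])]
mixed = [("x", ["[t]", "y"]), ("y", ["x", "[t]"]), ("x", ["[t]"])]
print(is_regular_grammar(right_variant), is_regular_grammar(mixed))
```

The check is purely syntactic: it looks at the shape of each rule, not at the language the grammar generates.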

Last Time
regular grammars
  – examples of regular languages:
      “one or more a’s followed by one or more b’s”
      sheeptalk {ba!, baa!, baaa!, ...}, i.e. ba+!
  – can be encoded by a regular grammar
beyond regular grammars
  – examples:
      a^n b^n = {ab, aabb, aaabbb, ...}
      ww^R, where w ∈ {a,b}+, i.e. w is any non-empty sequence of a’s and b’s
  – informal idea about the crucial difference: “needing to keep track of history”
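One way to make "needing to keep track of history" concrete is the Python sketch below. The function names are hypothetical. The regular expression a+b+ only checks the shape of the string; the extra length comparison (for a^n b^n) and the reversal check (for ww^R) are the parts that need memory of what was seen earlier.

```python
import re


def is_anbn(s):
    # Shape check is regular: one or more a's followed by one or more b's.
    m = re.fullmatch(r"(a+)(b+)", s)
    # History check is not: the number of b's must equal the number of a's.
    return m is not None and len(m.group(1)) == len(m.group(2))


def is_wwr(s):
    # ww^R: a non-empty string w over {a, b} followed by its reverse.
    half, odd = divmod(len(s), 2)
    if half == 0 or odd or set(s) - {"a", "b"}:
        return False
    return s[half:] == s[:half][::-1]


print(is_anbn("aabb"), is_anbn("aab"), is_wwr("abba"), is_wwr("abab"))
```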

Today’s Topic
Finite State Automata
  – plus more on what it means to be a regular language
Merge Point
  – Textbook – Chapter 2: Regular Expressions and Automata

Today’s Topic
Finite State Automata
  – plus more on what it means to be a regular language
formally equivalent in terms of generative capacity or power:
  – Regular Grammars
  – FSA
  – Regular Expressions
[diagram: Regular Grammars, FSA and Regular Expressions, each describing the Regular Languages; annotation: “+ left & right recursive rules”]

Some Regular Expression Notation
... some notation first (more on regexps next time)
Regular Expressions (regexp)
  – shorthand for describing sets of strings
Operators:
  – string+ : set of one or more occurrences of string
      a+ = {a, aa, aaa, aaaa, aaaaa, …}
      (abc)+ = {abc, abcabc, abcabcabc, …}
    Note: parentheses are used to delimit the scope of the operator
  – string* : set of zero or more occurrences of string
      a* = {ε, a, aa, aaa, aaaa, …}
      (abc)* = {ε, abc, abcabc, …}
    Note: ε is the zero-length string
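The + and * operators can be tried out with Python's standard re module, which uses the same symbols. This is a small sketch; the helper name matches is an assumption, and re.fullmatch is used so that the whole string has to belong to the set.

```python
import re


def matches(pattern, string):
    # True if the whole string is in the set described by the pattern.
    return re.fullmatch(pattern, string) is not None


print(matches("a+", "aaa"))         # a+ needs at least one a
print(matches("a+", ""))            # the zero-length string is not in a+
print(matches("a*", ""))            # but it is in a*
print(matches("(abc)+", "abcabc"))  # parentheses delimit the operator's scope
print(matches("abc+", "abcabc"))    # without them, + applies to c only
```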

Some Regular Expression Notation
... some notation first
Relation between * and +
  – a a* = a+
  – “a concatenated with a*”
  – a {ε, a, aa, aaa, aaaa, …} = {a, aa, aaa, aaaa, aaaaa, …}
Operators:
  – string^n : exactly n occurrences of string
      a^4 b^3 = {aaaabbb}
Language = a set of strings
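Since a language is a set of strings, the identity a a* = a+ can be checked on a finite slice of each set. The sketch below enumerates every string up to a length bound and compares the two sets; the helper name language and the bound are assumptions, and agreement up to a bound illustrates the identity but does not prove it. Python writes a^4 b^3 as a{4}b{3}.

```python
import re
from itertools import product


def language(pattern, alphabet, max_len):
    # All strings over the alphabet, up to max_len symbols, matched by pattern.
    return {
        "".join(chars)
        for n in range(max_len + 1)
        for chars in product(alphabet, repeat=n)
        if re.fullmatch(pattern, "".join(chars))
    }


print(language("aa*", "ab", 5) == language("a+", "ab", 5))
print(language("a{4}b{3}", "ab", 7))
```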

Regular Expressions
regular expressions
  – formally equivalent to regular grammars and finite state automata
  – How to show this? Proof by construction…
beyond regular expressions
  – examples
      {a^n b^n | n > 0} is not regular
      {ww^R | w ∈ {a,b}+} is not regular, where w^R is w reversed, e.g. (abc)^R = cba
  – How to show this? Proof by Pumping Lemma
[diagram: Regular Grammars, FSA, Regular Expressions]
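The Pumping Lemma argument for a^n b^n can be illustrated by brute force. The lemma says that if L is regular, there is a length p such that every string s in L with at least p symbols splits as s = xyz with |xy| ≤ p and |y| ≥ 1, where x y^i z stays in L for every i ≥ 0. The Python sketch below (function names are hypothetical) takes s = a^p b^p and tries every allowed split; it only tests i = 0, 1, 2 and small p, so it is an illustration of the argument, not a proof.

```python
def in_anbn(s):
    # Membership test for {a^n b^n | n > 0}.
    n = len(s) // 2
    return n > 0 and s == "a" * n + "b" * n


def pumpable_split_exists(p):
    # Try every split s = xyz of a^p b^p with |xy| <= p and |y| >= 1.
    s = "a" * p + "b" * p
    for xy_len in range(1, p + 1):
        for x_len in range(xy_len):
            x, y, z = s[:x_len], s[x_len:xy_len], s[xy_len:]
            # A valid split must survive pumping y zero, one and two times.
            if all(in_anbn(x + y * i + z) for i in (0, 1, 2)):
                return True
    return False


print(any(pumpable_split_exists(p) for p in range(1, 8)))
```

Because |xy| ≤ p, the pumped part y consists only of a's, so pumping changes the number of a's without changing the number of b's. No split works, which is what the lemma rules out for a regular language.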

Regular Expressions
Example:
  – Language: L = {a+b+}, “one or more a’s followed by one or more b’s”
  – L is a regular language: it is described by a regular expression
Note:
  – infinite set of strings belonging to language L
      e.g. abbb, aaaab, aabb, *abab, *ε
Notation:
  – ε is the empty string (or string with zero length)
  – * means the string is not in the language
regular grammar:
  s --> [a],b.
  b --> [a],b.
  b --> [b],c.
  b --> [b].
  c --> [b],c.
  c --> [b].
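The regular grammar above can be run as a recognizer. The sketch below is in Python rather than Prolog; the table RULES and the function derives are assumed names. Each entry pairs the terminal a rule consumes with the nonterminal that follows it, or None for a rule that ends the derivation.

```python
# One entry per grammar rule: nonterminal -> [(terminal, next nonterminal or None)]
RULES = {
    "s": [("a", "b")],                          # s --> [a],b.
    "b": [("a", "b"), ("b", "c"), ("b", None)], # b --> [a],b.  b --> [b],c.  b --> [b].
    "c": [("b", "c"), ("b", None)],             # c --> [b],c.  c --> [b].
}


def derives(nonterminal, string):
    # True if the nonterminal can generate exactly this string.
    if not string:
        return False                 # every rule consumes one symbol
    first, rest = string[0], string[1:]
    for terminal, nxt in RULES[nonterminal]:
        if terminal != first:
            continue
        if nxt is None:
            if rest == "":
                return True          # terminal rule must finish the string
        elif derives(nxt, rest):
            return True
    return False


for word in ["abbb", "aaaab", "aabb", "abab", ""]:
    print(repr(word), derives("s", word))
```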

Finite State Automata (FSA)
L = {a+b+} = {aa*bb*}
[state diagram: states s, x, y; s -a-> x, x -a-> x, x -b-> y, y -b-> y]
deterministic FSA (DFSA)
  – no ambiguity about where to go at any given state
non-deterministic FSA (NDFSA)
  – no restriction on ambiguity (surprisingly, no increase in power)
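A hint at why non-determinism adds no power: an NDFSA can be simulated by tracking the set of all states it could be in. The machine below is a hypothetical NDFSA for a+b+ (states q0, q1, q2 are not from the slides); on an a in q0 it may stay or move on, and likewise on a b in q1. The Python names are assumptions.

```python
# Hypothetical NDFSA for a+b+: each (state, symbol) maps to a set of next states.
NFA = {
    ("q0", "a"): {"q0", "q1"},   # ambiguous: keep reading a's, or move on
    ("q1", "b"): {"q1", "q2"},   # ambiguous: keep reading b's, or finish
}


def nfa_accepts(string, start="q0", finals=frozenset({"q2"})):
    # Follow every possible choice at once by keeping a set of current states.
    current = {start}
    for ch in string:
        current = {t for q in current for t in NFA.get((q, ch), set())}
    return bool(current & finals)


for word in ["ab", "aabbb", "a", "abab"]:
    print(word, nfa_accepts(word))
```

The sets of states visited here are finite in number, so they can themselves serve as the states of a deterministic machine.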

Finite State Automata (FSA)
more formally: (Q, s, f, Σ, δ)
  1. set of states (Q): {s, x, y} (must be a finite set)
  2. start state (s): s
  3. end state(s) (f): y
  4. alphabet (Σ): {a, b}
  5. transition function δ
       signature: character × state → state
       δ(a,s) = x
       δ(a,x) = x
       δ(b,x) = y
       δ(b,y) = y
[state diagram: s -a-> x, x -a-> x, x -b-> y, y -b-> y]
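The five-part definition translates almost directly into a table lookup. The sketch below is in Python; the names DELTA and accepts are assumptions, while the states, alphabet and transitions are the ones listed above. A missing table entry is treated as rejection.

```python
# Transition function, keyed as (character, state) to match the signature above.
DELTA = {
    ("a", "s"): "x",
    ("a", "x"): "x",
    ("b", "x"): "y",
    ("b", "y"): "y",
}


def accepts(string, start="s", finals=("y",)):
    # Run the DFSA: exactly one move per input character.
    state = start
    for ch in string:
        state = DELTA.get((ch, state))
        if state is None:
            return False         # no transition defined: reject
    return state in finals       # accept only if we stop in an end state


for word in ["ab", "aaabb", "a", "abab", ""]:
    print(repr(word), accepts(word))
```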

Finite State Automata (FSA)
practical applications
can be encoded and run efficiently on a computer
widely used
  – encode regular expressions
  – compress large dictionaries
  – morphological analyzers
      different word forms, e.g. want, wanted, unwanted (suffixation/prefixation)
      see chapter 3 of textbook
  – speech recognizers
      Markov models = FSA + probabilities
and many more …

Finite State Automata (FSA)
T9 text entry (tegic.com)
  – built in to your cellphone
  – predictive text entry for mobile messaging/data entry
  – reduces the number of keystrokes for inputting words on a telephone keypad (8 keys)
examples (keystrokes with T9 vs. with multi-tap entry)
  – how: 3 vs. 6 keystrokes
  – michael: 7 vs. 15 keystrokes
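The keystroke counts can be reproduced with a short Python sketch. It assumes the standard letter layout on keys 2 to 9 and assumes the larger number on the slide is the multi-tap count (press a key once per position of the letter on that key). All names are hypothetical, and the T9 count ignores any extra presses needed to choose between words that share a digit sequence.

```python
# Standard telephone keypad letters (8 keys carry letters).
KEYPAD = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
          "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}
LETTER_TO_KEY = {ch: key for key, letters in KEYPAD.items() for ch in letters}


def t9_digits(word):
    # With T9, each letter costs one press of the key that carries it.
    return "".join(LETTER_TO_KEY[ch] for ch in word)


def t9_keystrokes(word):
    return len(t9_digits(word))


def multitap_keystrokes(word):
    # Multi-tap: press the key once for its 1st letter, twice for its 2nd, ...
    return sum(KEYPAD[LETTER_TO_KEY[ch]].index(ch) + 1 for ch in word)


for word in ["how", "michael"]:
    print(word, t9_digits(word), t9_keystrokes(word), multitap_keystrokes(word))
```

Mapping a digit sequence back to dictionary words is the step where a finite state encoding of the dictionary would come in; the slides do not say how the actual product stores it.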

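RegExp → FSA
From Regular Expression to FSA
Operators
  – a : single symbol a
  – a^n : n occurrences of a
[state diagrams: a as a single a-transition; a^n as a chain of n a-transitions, e.g. a^3 as three a-transitions in sequence]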

RegExp → FSA
Operators
  – a* : zero or more occurrences of a
  – a+ : one or more occurrences of a
      a+ = aa*
[state diagrams for a* and a+, with arcs labelled a]
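The operator-to-FSA constructions can be sketched as small transition tables. The slide's diagrams are not preserved in this transcript, so the machines below are the usual textbook constructions and may differ in detail from the ones shown in lecture; all Python names are assumptions. An FSA is represented as (start state, set of end states, transition table).

```python
def fsa_power(symbol, n):
    # a^n: a chain of states 0..n joined by n transitions on the symbol.
    delta = {(i, symbol): i + 1 for i in range(n)}
    return 0, {n}, delta


def fsa_star(symbol):
    # a*: one state that is both start and end, with a loop on the symbol.
    return 0, {0}, {(0, symbol): 0}


def fsa_plus(symbol):
    # a+ = aa*: one transition on the symbol, then a loop on the end state.
    return 0, {1}, {(0, symbol): 1, (1, symbol): 1}


def run(fsa, string):
    state, finals, delta = fsa
    for ch in string:
        if (state, ch) not in delta:
            return False
        state = delta[(state, ch)]
    return state in finals


print(run(fsa_power("a", 3), "aaa"), run(fsa_star("a"), ""), run(fsa_plus("a"), ""))
```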

Regular Grammar → FSA
examples
  – s --> [a], t.
      [diagram: s -a-> t]
  – x --> [a], x.
      [diagram: x -a-> x]
  – x --> [a].
      [diagram: x -a-> y, where y is a final state]
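The rule-by-rule translation can be sketched in Python for right recursive grammars. Each rule becomes one transition: x --> [t], z gives x -t-> z, and x --> [t] gives x -t-> y with y a final state. The triple encoding of rules and the function names are assumptions, and the final state name y must not also be used as a nonterminal. The result can be non-deterministic (x has two a-transitions in the example), so the run keeps a set of states.

```python
def grammar_to_fsa(rules, final="y"):
    # rules: (lhs, terminal, next nonterminal or None for a terminal-only rule)
    transitions = {}
    for lhs, terminal, nxt in rules:
        target = final if nxt is None else nxt
        transitions.setdefault((lhs, terminal), set()).add(target)
    return transitions


def accepts(transitions, start, string, final="y"):
    # Set-of-states simulation, since the machine may be non-deterministic.
    current = {start}
    for ch in string:
        current = {t for q in current for t in transitions.get((q, ch), set())}
    return final in current


# x --> [a], x.   x --> [a].
fsa = grammar_to_fsa([("x", "a", "x"), ("x", "a", None)])
print(fsa)
print(accepts(fsa, "x", "aaa"), accepts(fsa, "x", ""))
```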

Next Time Prolog and FSA