LING/C SC/PSYC 438/538 Computational Linguistics Sandiway Fong Lecture 12: 10/4.


1 LING/C SC/PSYC 438/538 Computational Linguistics Sandiway Fong Lecture 12: 10/4

2 Administrivia
Upcoming midterm
– next Thursday, 11th October
– format
– 5 questions in class
– topic: Perl regexps, DCG rules and FSA
– to be done on the computer
– email answers at the end of the class (one file)
– answer 4 correctly for an A
– answer all 5 correctly for an A* (plus raises HWs #1 and #2 to As if not already there)

3 Administrivia
Upcoming guest lecture
– October 30th (Tuesday)
– David Pinkus from Google (Phoenix)
– Engineering Manager: Top Secret Projects
– Title (TBA): something like “Google and Natural Language: the next 10 years”

4 Prolog FSA: Recap
deterministic FSA (DFSA)
– no ambiguity about where to go at any given state
non-deterministic FSA (NDFSA)
– no restriction on ambiguity (surprisingly, no increase in formal power)
textbook
– D-RECOGNIZE (FIGURE 2.13)
– ND-RECOGNIZE (FIGURE 2.21)
Easy to implement in Prolog
– non-determinism handled by the same backtracking mechanism as with the DCG rule system
– two methods

5 Prolog FSA: Recap
Method 1: Transition Table Lookup
Prolog code (for any FSA)
    fsa(S,L) :- L = [C|M], transition(S,C,T), fsa(T,M).
    fsa(E,[]) :- end_state(E).
Facts (FSA-particular)
    end_state(y).
    transition(s,a,x).
    transition(x,a,x).
    transition(x,b,y).
    transition(y,b,y).
Query
    ?- fsa(s,List).
transition function δ:
– δ(a,s)=x
– δ(a,x)=x
– δ(b,x)=y
– δ(b,y)=y
[figure: FSA with start state s and states x and y; arcs labeled a, a, b, b]

6 FSA and Prolog: Recap
– state s: (start state)
    s([a|L]) :- x(L).
  match input string beginning with a and call state x with remainder of input
– state x:
    x([a|L]) :- x(L).
    x([b|L]) :- y(L).
– state y: (end state)
    y([]).
    y([b|L]) :- y(L).
– query (call start state directly)
    ?- s(L).
[figure: FSA with start state s and states x and y; arcs labeled a, a, b, b]
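The two encodings recapped above should accept exactly the same strings. A minimal sketch of how one might check that, assuming SWI-Prolog (for `between/3` and `forall/2`); the helper names `word/2` and `agree/1` are hypothetical and not from the lecture, and `fsa/2` is written with head unification instead of `L = [C|M]`, which is equivalent:

```prolog
% Method 1 (transition table lookup): generic recognizer plus facts.
fsa(S, [C|M]) :- transition(S, C, T), fsa(T, M).
fsa(E, []) :- end_state(E).

end_state(y).
transition(s, a, x).
transition(x, a, x).
transition(x, b, y).
transition(y, b, y).

% Method 2: one predicate per state.
s([a|L]) :- x(L).
x([a|L]) :- x(L).
x([b|L]) :- y(L).
y([]).
y([b|L]) :- y(L).

% word(+N, -W): W is a list of length N over the alphabet {a, b}.
% (hypothetical helper, used only to generate test strings)
word(0, []).
word(N, [C|Cs]) :- N > 0, member(C, [a, b]), N1 is N - 1, word(N1, Cs).

% agree(+Max): both encodings give the same verdict on every word
% of length 0..Max.
agree(Max) :-
    forall(( between(0, Max, N), word(N, W) ),
           ( fsa(s, W) -> s(W) ; \+ s(W) )).
```

For example, `?- agree(5).` should succeed. One practical note: with an unbound list, as in `?- s(L).`, Prolog's depth-first search keeps following the a-loop at state x before it ever tries the b-arc, so it is safer to bound the length first, e.g. `?- length(L, 3), s(L).`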

7 Today’s Topics
NDFSA and (D)FSA are equivalent in expressive power
– any NDFSA can be converted into a deterministic FSA
FSA with empty transitions are equivalent to FSA without empty transitions
– conversion is just a special case of the NDFSA conversion
Non-regular languages
– Pumping lemma

8 NDFSA
[discussed at the end of section 2.2 in the textbook]
construct a new machine
– each state of the new machine represents the set of possible states of the original machine when stepping through the input
Note:
– new machine is equivalent to the old one (but may have more states)
example
– (set of states)
[figure: an NDFSA with states 1, 2, 3 and arcs labeled a, a, b, next to the equivalent machine whose states are sets of the original states, e.g. {2,3}]

9 NDFSA
example
– (set of states)
trick:
– construct new machine with states = set of possible states of the old machine
[figure: step-by-step construction from the NDFSA with states 1, 2, 3: start at {1}, an a-arc leads to {2,3}, a b-arc leads to {3}]
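The set-of-states trick can be sketched directly in Prolog by carrying a sorted list of states instead of a single state. The NDFSA below is only a guess at the machine in the flattened figure (arcs 1 -a-> 2, 1 -a-> 3, 2 -b-> 3), and the choice of 3 as end state is an assumption; the predicate names `nd_transition/3`, `step/3` and `dfsa/2` are hypothetical:

```prolog
% Assumed NDFSA (a reading of the figure, not confirmed by the source).
nd_transition(1, a, 2).
nd_transition(1, a, 3).
nd_transition(2, b, 3).
nd_end_state(3).          % assumption: 3 is the end state

% step(+StateSet, +Symbol, -NextSet): one move of the new machine.
% NextSet is the sorted, duplicate-free set of all states that some
% member of StateSet can reach on Symbol.
step(Set, C, Next) :-
    findall(T, ( member(S, Set), nd_transition(S, C, T) ), Ts),
    sort(Ts, Next).

% dfsa(+StateSet, +Input): run the set-of-states machine.
% A set is accepting if it contains at least one original end state;
% an empty set means the original machine has no way to continue.
dfsa(Set, []) :- member(S, Set), nd_end_state(S), !.
dfsa(Set, [C|Cs]) :- step(Set, C, Next), Next \== [], dfsa(Next, Cs).
```

With these facts, `?- step([1], a, N).` gives `N = [2,3]` and `?- step([2,3], b, N).` gives `N = [3]`, matching the states {1}, {2,3}, {3} named on the slide. Each call to `step/3` has exactly one answer, which is what makes the new machine deterministic.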

10 NDFSA → FSA
Another example
1. why is this an NDFSA?
2. transform NDFSA into a (deterministic) FSA
[figure: the example NDFSA; only the state label q4 survives in this transcript]

11 NDFSA → (D)FSA
Another example
– transform NDFSA into a (deterministic) FSA
[figure: machine with states s, x, y, z and arcs labeled a and b]

12 ε-transitions
jump from one state to another state with the empty character
– ε-transition (textbook) or λ-transition
– no increase in expressive power
examples
[figure: example machines with arcs labeled a, ε and b]
what’s the equivalent without the ε-transition?
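In the transition-table encoding, an ε-transition only needs one extra clause that changes state without consuming input. The machine below is a made-up example (not one of the slide's figures), and `epsilon/2` is a hypothetical predicate name:

```prolog
% Hypothetical machine: 1 -a-> 2, 2 -ε-> 3, 3 -b-> 3, end state 3.
% It accepts a followed by zero or more b's.
transition(1, a, 2).
transition(3, b, 3).
epsilon(2, 3).
end_state(3).

fsa(S, []) :- end_state(S).
fsa(S, [C|M]) :- transition(S, C, T), fsa(T, M).
fsa(S, L) :- epsilon(S, T), fsa(T, L).   % jump without consuming input

% An equivalent machine without the ε-transition merges states 2 and 3:
%   transition(1, a, 3).  transition(3, b, 3).  end_state(3).
```

This sketch assumes there are no cycles made up only of ε-transitions; with such a cycle the third clause could loop forever.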

13 FSA Properties
FSAs (and thus regular languages) are preserved, i.e. maintain their FSA nature, under...
– concatenation
– union
– intersection
– complementation
– and other operations...
– [see section 2.3 of textbook]

14 concatenation
concatenate two FSAs, result is an FSA
– trick: use ε-transitions to link the automata
example
– [figure 2.24]

15 union
disjunction (union) of two FSAs, result is an FSA
– trick: use ε-transitions to link the automata
example
– [figure 2.26]
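Both linking tricks (for concatenation and for union) can be tried out on two small machines. The component machines, the state names and the extra first argument that selects the operation are all assumptions made for this sketch; they are not the textbook's figures 2.24 and 2.26:

```prolog
% Two hypothetical component machines sharing one transition table.
transition(p0, a, p1).      % machine 1: one or more a's,
transition(p1, a, p1).      %   start p0, end p1
transition(q0, b, q1).      % machine 2: one or more b's,
transition(q1, b, q1).      %   start q0, end q1

% epsilon(+Op, ?From, ?To): the linking ε-transitions for each operation.
epsilon(concat, p1, q0).    % machine 1's end state -> machine 2's start
epsilon(union, u0, p0).     % new start state u0 -> both old starts
epsilon(union, u0, q0).

% end_state(+Op, ?State): which states accept under each operation.
end_state(concat, q1).      % only machine 2's end state
end_state(union, p1).       % both old end states
end_state(union, q1).

% fsa(+Op, +State, +Input)
fsa(Op, S, []) :- end_state(Op, S).
fsa(Op, S, [C|M]) :- transition(S, C, T), fsa(Op, T, M).
fsa(Op, S, L) :- epsilon(Op, S, T), fsa(Op, T, L).
```

For instance, `?- fsa(concat, p0, [a,a,b]).` succeeds while `?- fsa(concat, p0, [a]).` fails; `?- fsa(union, u0, [b,b]).` succeeds while `?- fsa(union, u0, [a,b]).` fails.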

16 intersection (conjunction)
intersect two FSAs, result is an FSA
– trick: use (modified) set-of-states construction
example
[figure: machine 1 with states s1, x, y; machine 2 with states s2, z; combined machine with states {s1,s2}, {x,s2}, {y,z} and arcs labeled a, a, b, b]
look familiar? that’s because a⁺b* ∩ a*b⁺ = a⁺b⁺
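One way to see the modified set-of-states construction at work is to run both machines in lockstep, so that the pair of current states plays the role of one combined state such as {x,s2}. The transition tables below are a reading of the flattened figure for a⁺b* and a*b⁺ and should be treated as an assumption; the predicate names `t1/3`, `t2/3`, `end1/1`, `end2/1` and `both/3` are hypothetical:

```prolog
% Machine 1 (assumed): a+b*, start s1, end states x and y.
t1(s1, a, x).
t1(x, a, x).
t1(x, b, y).
t1(y, b, y).
end1(x).
end1(y).

% Machine 2 (assumed): a*b+, start s2, end state z.
t2(s2, a, s2).
t2(s2, b, z).
t2(z, b, z).
end2(z).

% both(+S1, +S2, +Input): advance both machines on the same symbol;
% accept only if both are in an end state when the input runs out.
both(S1, S2, []) :- end1(S1), end2(S2).
both(S1, S2, [C|M]) :- t1(S1, C, T1), t2(S2, C, T2), both(T1, T2, M).
```

The query `?- both(s1, s2, [a,a,b]).` succeeds, while `[a]` (rejected by machine 2) and `[b]` (rejected by machine 1) both fail, which is consistent with the intersection being a⁺b⁺.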

17 complementation
the negation or opposite FSA
– with respect to Σ*, the set of all possible strings from the alphabet
– i.e. accepts everything the original FSA rejects
– and rejects everything the original FSA accepts
– result is still an FSA
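The behaviour of the complement machine can be imitated in Prolog with negation as failure. Note that this is a shortcut, not the automaton construction itself: building the complement FSA would mean making the deterministic machine total (adding a dead state) and swapping end and non-end states. The sketch reuses the a⁺b⁺ machine from the recap; `symbol/1`, `over_alphabet/1` and `complement/1` are hypothetical names, and the input list must be fully instantiated:

```prolog
% The a+b+ machine from the recap slides.
fsa(S, [C|M]) :- transition(S, C, T), fsa(T, M).
fsa(E, []) :- end_state(E).

end_state(y).
transition(s, a, x).
transition(x, a, x).
transition(x, b, y).
transition(y, b, y).

% The alphabet Σ = {a, b}.
symbol(a).
symbol(b).

% over_alphabet(+L): L is a string in Σ*.
over_alphabet([]).
over_alphabet([C|Cs]) :- symbol(C), over_alphabet(Cs).

% complement(+L): L is in Σ* and the original machine rejects it.
complement(L) :- over_alphabet(L), \+ fsa(s, L).
```

So `?- complement([b,a]).` and `?- complement([]).` succeed, `?- complement([a,b]).` fails, and `?- complement([c]).` also fails because c is outside Σ.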

18 Limits of Finite State Technology
Language = set of strings
case 1
– suppose set is finite
– e.g. L = {ba, abc, ccb, dd}
easy to encode as an FSA (by closure under union)
case 2
– set is infinite
– ...
[figure: one linear FSA per string in L, joined by ε-transitions from a new start state s0]

19 Limits of Finite State Technology
Language = set of strings
case 2
– set is infinite
– e.g. L = a⁺b⁺ = { ab, aab, abb, aabb, aaab, abbb, … }
  “one or more a’s followed by one or more b’s”
  we know this set is regular
– however, consider L = {aⁿbⁿ | n ≥ 1} = { ab, aabb, aaabbb, … }
  “same number of b’s as a’s …”
  this set is not regular. Why?
[figure: FSA for a⁺b⁺ with states s, x, y and arcs labeled a, a, b, b]

20 The Limits of Finite State Technology
[Formally, we can use the Pumping Lemma to prove this particular case. See later slides...]
informally,
– we can build FSA for …
– ab
– aabb
– aaabbb
– …
[figure: separate FSAs for ab, aabb and aaabbb, with end states marked]

21 The Limits of Finite State Technology
we can merge the individual FSA for …
– ab
– aabb
– aaabbb
[figure: merged FSA for ab, aabb and aaabbb]
such direct encoding would require an infinite number of states
– and we’re using Finite State Automata
quite different from the infinity obtained by looping
– freely iterate (no counting)

22 The Limits of Finite State Technology
example
– L = a⁺b⁺ = { ab, abb, aab, aabb, aaab, abbb, … }
– “one or more a’s followed by one or more b’s”
Note:
– can be divided into two independent halves
– each half can be replaced by iteration
[figure: one linear FSA for each of the listed strings]

23 The Limits of Finite State Technology
example
– L = a⁺b⁺ = { ab, abb, aab, aabb, aaab, abbb, … }
– “one or more a’s followed by one or more b’s”
Note:
– can be divided into two independent halves
– each half can be replaced by iteration
[figure: the linear FSAs from the previous slide, then combined via ε-transitions from a new start state s0 and merged step by step]

24 A Formal Tool: The Pumping Lemma
A tool used to prove languages are NOT regular
– based on the intuition that if a language is regular, it “pumps” (iterates)
– Recall: for any given # states, if the (accepted) input string is long enough, you’ll need to loop
Note:
– can’t be used to prove a language is regular
– to show something is regular: proof by exhibition
– describe an FSA (or regular expression or regular grammar)

25 A Formal Tool: The Pumping Lemma
If a language L is regular, then there exists a number p > 0 (a pumping length, sometimes called a magic number) such that every string w in L with |w| ≥ p can be written in the form w = xyz, with strings x, y and z such that
– |xy| ≤ p,
– |y| > 0, and
– xyⁱz is in L for every integer i ≥ 0.

26 A Formal Tool: The Pumping Lemma
Example:
– show that aⁿbⁿ is not regular
Proof (by contradiction):
– pick a sufficiently long string in the language
– e.g. a..aab..bb (#a’s = #b’s)
– Partition it according to w = xyz
– then show that, for some i, xyⁱz is not in L
– i.e. string does not pump

27 A Formal Tool: The Pumping Lemma
w = aaaa..aabbbb..bb
Case 1: w = xyz, y straddles the ab boundary
– what happens when we pump y?
Case 2: w = xyz, y is wholly within the a’s
– what happens when we pump y?
Case 3: w = xyz, y is wholly within the b’s
– what happens when we pump y?
[figure: the substring y marked in each of the three positions]
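The three cases can be explored concretely for a small string such as w = aaabbb. The sketch below is only an illustration of what pumping does, not a proof: it uses a DCG as a membership test for aⁿbⁿ and builds xyⁱz for a chosen split. The predicate names `in_language/1`, `repeat_list/3` and `pump/5` are hypothetical:

```prolog
% Membership test for {a^n b^n | n >= 1}, written as a DCG.
anbn --> [a], [b].
anbn --> [a], anbn, [b].

in_language(W) :- phrase(anbn, W).

% repeat_list(+I, +Y, -Ys): Ys is I copies of the list Y, concatenated.
repeat_list(0, _, []).
repeat_list(I, Y, Ys) :-
    I > 0,
    I1 is I - 1,
    repeat_list(I1, Y, Rest),
    append(Y, Rest, Ys).

% pump(+X, +Y, +Z, +I, -W): W is the string x y^I z.
pump(X, Y, Z, I, W) :-
    repeat_list(I, Y, Ys),
    append(X, Ys, XYs),
    append(XYs, Z, W).
```

With i = 2, each case leaves the language:

    ?- pump([a,a], [a,b], [b,b], 2, W), \+ in_language(W).      % case 1: a's and b's get interleaved
    ?- pump([a], [a], [a,b,b,b], 2, W), \+ in_language(W).      % case 2: too many a's
    ?- pump([a,a,a,b], [b], [b], 2, W), \+ in_language(W).      % case 3: too many b's

All three queries succeed, whereas i = 1 just gives back aaabbb, which is in the language.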

28 Midterm Practice Question
not graded (answer next time)
– convert the NDFSA into a deterministic FSA
– implement both the NDFSA and the equivalent FSA in Prolog using the “one predicate per state” encoding
– run both machines on the strings abab and abaaba
– how many steps (transitions + final stop) does each machine make?
[figure: the NDFSA, from figure 2.27 in the textbook]

