# Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising.

## Presentation on theme: "Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising."— Presentation transcript:

Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising infinite languages? i.e. given a language description and a string, is there an algorithm which will answer yes or no correctly? We will define an abstract machine which takes a candidate string and produces the answer yes or no. The abstract machine will be the specification of the language.

Finite State Automata A finite state automaton is an abstract model of a simple machine (or computer). The machine can be in a finite number of states. It receives symbols as input, and the result of receiving a particular input in a particular state moves the machine to a specified new state. Certain states are finishing states, and if the machine is in one of those states when the input ends, it has ended successfully (or has accepted the input). Example: A 1 1 2 3 4 a b b b a a a,b

FSA: Formal Definition A Finite State Automaton (FSA) is a 5-tuple (Q, I, F, T, E) where: Q = states = a finite set; I = initial states = a subset of Q; F = final states = a subset of Q; T = an alphabet; E = edges = a subset of Q (T + ) Q. FSA = labelled, directed graph =set of nodes (some final/initial) + each arc may have a label from an alphabet + directed arcs (arrows) between nodes. Example: formal definition of A 1 Q = {1, 2, 3, 4} I = {1} F = {4} T = {a, b} E = { (1,a,2), (1,b,4), (2,a,3), (2,b,4), (3,a,3), (3,b,3), (4,a,2), (4,b,4) } A1A1 1 2 3 4 a b b b a a a,b

Try going backwards Given: Q = {1, 2, 3, 4, 5} I = {1} F = {3} T = {a, c, h, m, t} E = { (1,c,2), (1,h,2), (1,m,2), (1,a,3), (1,h,5), (2,a,3), (3,t,4), (5,a,3) } Whats the diagram?

Definitions If (x,a,y) is an edge, x is its start state and y is its end state. A path is a sequence of edges such that the end state of one is the start state of the next. path p 1 = (2,b,4), (4,a,2), (2,a,3) A path is successful if the start state of the first edge is an initial state, and the end state of the last is a final state. path p 2 = (1,b,4),(4,a,2),(2,b,4),(4,b,4) The label of a path is the sequence of edge labels. label(p 1 ) = baa.

Definitions (cont.) A string is accepted by a FSA if it is the label of a successful path. Let A be a FSA. The language accepted by A is the set of strings accepted by A, denoted L(A). babb = label(p 2 ) is accepted by A 1. The language accepted by A 1 is the set of strings of a's and b's which end in a b, and in which no two a's are adjacent. 1 2 3 4 a b b b a a a,b

Non-Determinism We would like to use FSAs as a mechanism to recognise languages, simply following the edges as we input symbols from a string. But problem – non-deterministic. That is, we may have a choice. 1.From one state there could be a number of edges with the same label. have to remember possible branching points as we trace out a path, and investigate all branches. 2.Some of the edges could be labelled with, the empty string how does this affect our algorithm? 3.May be more than one initial state where do we start? 1 2 4 a a a b b Not Algorithmic

Deterministic FSA A FSA is deterministic if: (i) there are no -labelled edges; (ii) for any pair of state and symbol (q,t), there is at most one edge (q,t, p); and (iii) there is only one initial state. Notation: DFSA and NDFSA stand for deterministic and non-deterministic FSA respectively. A DFSA is a specification of an algorithm for determining whether or not a string is a member of a given language. All three conditions must hold

Transition Functions The transition function of A is the function : (x, t) {y : edge (x, t, y) in A} If A = (Q,I,F,T,E), then the transition matrix for A is a matrix with one row for each state in Q, one column for each symbol in T, s.t. the entry in row x and column t is (x,t). Example: ab 1{2}{4} 2{3}{4} 3{3}{3} 4{2}{4} where x denotes an initial state and x denotes a finish state.

Recognition Algorithm Problem: Given a DFSA, A = (Q,I,F,T,E), and a string w, determine whether w L(A). Note: denote the current state by q, and the current input symbol t. Since A is deterministic, (q,t) will always be a singleton set or will be undefined. If it is undefined, denote it by ( Q). Algorithm: Add symbol # to end of w. q := initial state t := first symbol of w#. while (t # and q ) begin q := (q,t) t := next symbol of w# end return ((t == #) & (q F))

Equivalence of DFSA's and NDFSA's Theorem: DFSA = NDFSA Let L be a language. L is accepted by a NDFSA iff L is accepted by a DFSA Algorithm: NDFSA -> DFSA create unique initial state remove -edges remove edge choices There is a simple algorithm to create a DFSA equivalent (i.e. accepts same language) to a given NDFSA. The reverse direction is trivial.

Algorithms Algorithm: create unique initial state (informal) Input: (Q, I, F, T, E) Q := Q {i} (where i Q) for each q I, E:= E {(i,,q)} I := {i} a b 12 3 6 54 a b a b 7 8

Algorithms Algorithm: remove -edges (informal) remove all single edges of form (q,,q) from E while there are cycles (paths starting and ending at the same state) of -edges in E begin select a cycle merge all states in path into a single state, keeping all edges into and out of path end while there are -edges in E begin select an -edge (p,,q) for each edge (q,t,r) add edge (p,t,r) if q F, add p to F remove (p,,q) from E end

Removing Edge Choices Algorithm: remove edge choices (subset construction) Input: (Q,I,F,T,E) I' := {I}F' := {}E' := {}S := {I}Q' : = {I} while S is not empty begin select X S for each t T begin S' = {q Q:(p,t,q) E, for some p X} if S' F {}, then F' := F' {S'} if S' {}, then E' := E' {(X,t,S')} S := S {S'}\Q' Q' := Q' {S'} end S := S\{X} end return (Q',I',F',T,E') 1 2 3 a,b ab (Q = {1,2,3}, I = {1}, F={3}, T={a,b}, E = {(1,a,1),(1,a,2),(1,b,1), (2,b,3),(3,a,3),(3,b,3)}) REMEMBER S is a set of sets!

Removing Edge Choices I' F' E' S Q' X t S' {1} ({1},a,{1,2}) ({1},b,{1}) ({1,2},a,{1,2}) ({1,2},b,{1,3}) ({1,3),a,{1,2,3}) ({1,3},b,{1,3}) ({1,2,3},a,{1,2,3}) ({1,2,3},b,{1,3}) {1} {1,2} -{1} {1,3} -{1,2} {1,2,3} -{1,3} -{1,2,3} {1} {1,2} {1,3} {1,2,3} {1} {1,2} {1,3} {1,2,3} abababababababab {1,2} {1} {1,2} {1,3} {1,2,3} {1,3} {1,2,3} {1,3} {1,2,3} a 1 1,2 1,2,3 b ab a 1,3 b b a REMEMBER S is a set of sets!

Minimum Size of FSA's Let A = (Q,I,F,T,E). For any two strings, x, y in T*, x and y are distinguishable w.r.t. A if there is a string z T* s.t. exactly one of xz and yz are in L(A). z distinguishes x and y w.r.t. A. This means that with x and y as input, A must end in different states - A has to distinguish x and y in order to give the right results for xz and yz. This is used to prove the next result: Theorem: L T*. If, for some integer n, there are n elements of T* s.t. any two are distinguishable w.r.t. A, then any FSA that recognises L must have at least n states.

Unique Minimal DFSA's Theorem: Minimal DFSA For a given language L, there exists a minimal DFSA accepting L, and it is unique. Algorithm: DFSA => minimal DFSA Input: (Q,I,F,T,E) R := { } for all edges (p,t,q) E do add (q,t,p) to R remove edge choices from (Q,I,F,T,R) to get (Q',I',F',T',E') Z := equivalent_states(Q, Q') (Q'',I'',F'',T,E'') := merge(Z,Q,I,F,T,E) return (Q'',I'',F'',T,E'') 1 2 4 5 a b a b 3 a,b b a b b 1 2 5 a b a 3,4 4 a,b b

Equivalent States Algorithm: equivalent_states Input: Q, Q' set all cells of M to t for all p in Q do for all sets S in Q' do if p is in S then for each q in Q do if q not in S then M[p][q] := f else for each q in Q do if q in S then M[p][q] := f S := { } for all p in Q do Z := { } if M[p][ ] = t then for each q in Q do if M[p][q] = t then add q to Z M[q] [ ] := f add Z to S return S Note: M is a 2d array, indexed by Q and { } Q

merging equivalent states Algorithm: merge Input: Z,(Q,I,F,T,E) for all S in Z do select p in S for all q in S s.t. q p do delete q from Q for each edge (q,t,w) in E do delete (q,t,w) from E if (p,t,w) not in E then add (p,t,w) to E for each edge (w,t,q) in E do delete (w,t,q) from E if (w,t,p) not in E then add (w,t,p) to E if q in I then delete q from I if p not in I then add p to I if q in F then delete q from F if p not in F then add p to F return (Q,I,F,T,E)

Example Minimisation 1 2 4 5 a b a b 3 a,b b a b b reversed the set Q from remove edge choices is: { {3,5}, {2,3,5}, {2,3,4,5}, {1,2,3,4,5} } 1 2 4 5 a b a b 3 a,b b a b b 1 2 3 4 5 1234512345 tttttttttt tttttttttt tttttttttt tttttttttt tttttttttt tttttttttt

Download ppt "Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising."

Similar presentations