Inferring Finite Automata from queries and counter-examples
Eggert Jón Magnússon

Learning a language
Inferring a finite automaton is analogous to learning a language. In fact, two automata that recognize the same language cannot be told apart without examining their state structure. We focus on finding the minimal equivalent automaton.

Requirements for learning
Gold (1967) showed that a class of languages containing every finite language and at least one infinite language cannot be learned from positive data only. The idea is proof by contradiction. Assume that we have a guessing algorithm that builds an automaton recognizing the finite language L from the strings w_1 ... w_n, all members of L. Build an infinite language L’ that contains the strings w_1 ... w_n, plus at least one rule or string that is not a member of L. Both L and L’ are consistent with the strings seen so far, so the infinite language can always fool any guessing algorithm.

Teacher
Angluin introduced the concept of a minimally adequate teacher, which can answer two kinds of question:
◦ Membership: “Is the string s a member of L?” The answer is yes or no.
◦ Equivalence: “Is the given DFA, D, the answer?” The answer is yes, or a string from the symmetric difference of L_D and L (either a string that is in L and not in L_D, or a string that is in L_D and not in L).
Given such a teacher, an algorithm exists that learns any regular set and runs in polynomial time.
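The two kinds of query can be sketched in Python as follows. This is a minimal sketch, not Angluin's formulation: the class names `DFA` and `Teacher`, the three-state target machine, and the two-state guess are all assumptions made for illustration. The equivalence query is answered by a breadth-first search over the product of the two automata, which is one common way to find a shortest string in the symmetric difference.

```python
from collections import deque


class DFA:
    """A complete DFA; delta maps (state, symbol) -> state."""

    def __init__(self, start, accepting, delta):
        self.start = start
        self.accepting = set(accepting)
        self.delta = dict(delta)

    def accepts(self, word):
        state = self.start
        for symbol in word:
            state = self.delta[(state, symbol)]
        return state in self.accepting


class Teacher:
    """Hypothetical minimally adequate teacher for a known target DFA."""

    def __init__(self, target, alphabet):
        self.target = target
        self.alphabet = alphabet

    def member(self, word):
        # Membership query: is the word in L?
        return self.target.accepts(word)

    def equivalent(self, hypothesis):
        # Equivalence query: None means "yes"; otherwise a shortest string
        # on which target and hypothesis disagree.
        start = (self.target.start, hypothesis.start)
        seen, queue = {start}, deque([(start, "")])
        while queue:
            (p, q), word = queue.popleft()
            if (p in self.target.accepting) != (q in hypothesis.accepting):
                return word
            for a in self.alphabet:
                nxt = (self.target.delta[(p, a)], hypothesis.delta[(q, a)])
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, word + a))
        return None


# Assumed target: three states, start state 1, accepting states 2 and 3.
target = DFA(1, {2, 3}, {(1, "a"): 2, (1, "b"): 3, (2, "a"): 1,
                         (2, "b"): 2, (3, "a"): 2, (3, "b"): 1})
# Assumed two-state guess, deliberately too small.
guess = DFA(0, {1}, {(0, "a"): 1, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1})

teacher = Teacher(target, "ab")
print(teacher.member("ab"))       # True
print(teacher.equivalent(guess))  # ba
```

A real teacher may return any string from the symmetric difference; returning a shortest one is a design choice of this sketch.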

Angluin’s Algorithm
Iteratively, the algorithm builds a DFA using membership queries, then presents the DFA to the teacher as a solution. If the DFA is accepted, the algorithm is finished. Otherwise, the teacher responds with a counter-example: a string that the presented DFA accepts or rejects incorrectly. The algorithm uses the counter-example to refine the DFA and goes back to the first step.
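The loop described above can be sketched compactly in Python. Everything named here (`member`, `counterexample`, `learn`, and the three-state target machine) is an assumption for illustration, and the sketch makes arbitrary choices wherever several strings qualify, so its intermediate tables need not match any particular hand-worked run.

```python
from collections import deque

ALPHABET = "ab"
# Assumed target DFA: start state 1, accepting states 2 and 3.
DELTA = {(1, "a"): 2, (1, "b"): 3, (2, "a"): 1,
         (2, "b"): 2, (3, "a"): 2, (3, "b"): 1}
START, ACCEPT = 1, {2, 3}


def member(word):
    """Membership query against the assumed target."""
    state = START
    for ch in word:
        state = DELTA[(state, ch)]
    return state in ACCEPT


def counterexample(hyp):
    """Equivalence query: a shortest disagreeing string, or None."""
    h_start, h_accept, h_delta = hyp
    start = (START, h_start)
    seen, queue = {start}, deque([(start, "")])
    while queue:
        (p, q), word = queue.popleft()
        if (p in ACCEPT) != (q in h_accept):
            return word
        for a in ALPHABET:
            nxt = (DELTA[(p, a)], h_delta[(q, a)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, word + a))
    return None


def learn():
    S, E, T = [""], [""], {}
    guesses = 0

    def row(s):
        return tuple(T[s + e] for e in E)

    while True:
        # Fill every missing cell with a membership query.
        for s in S + [s + a for s in S for a in ALPHABET]:
            for e in E:
                if s + e not in T:
                    T[s + e] = member(s + e)
        # Closedness: every row of S.A must also be a row of S.
        top = {row(s) for s in S}
        unclosed = [s + a for s in S for a in ALPHABET if row(s + a) not in top]
        if unclosed:
            S.append(unclosed[0])
            continue
        # Consistency: equal rows must stay equal after appending any symbol.
        clash = [a + e for s1 in S for s2 in S
                 if s1 < s2 and row(s1) == row(s2)
                 for a in ALPHABET for e in E
                 if T[s1 + a + e] != T[s2 + a + e]]
        if clash:
            E.append(clash[0])
            continue
        # Guess: distinct rows of S are the states of the hypothesis.
        hyp = (row(""),
               {row(s) for s in S if T[s]},
               {(row(s), a): row(s + a) for s in S for a in ALPHABET})
        guesses += 1
        cex = counterexample(hyp)
        if cex is None:
            return hyp, guesses
        # Refine: add the counter-example and all its prefixes to S.
        for i in range(1, len(cex) + 1):
            if cex[:i] not in S:
                S.append(cex[:i])


hypothesis, guesses = learn()
n_states = len({state for (state, _) in hypothesis[2]})
print(guesses, "guesses,", n_states, "states")  # 2 guesses, 3 states
```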

Angluin’s Algorithm, details
The algorithm uses two sets of strings, S for states and E for experiments, and one observation table, T. Elements of (S ∪ S·A) form the rows and elements of E form the columns; the value of each cell is the outcome of a membership query for the concatenation of the row and column strings. The set S is prefix-closed and the set E is suffix-closed. Before making a guess, the observation table is required to be closed and consistent.
◦ Closed means that there are no unique rows in the bottom part of the observation table (the rows for elements of S·A): every such row also appears as the row of some element of S.
◦ If the observation table isn’t closed, we find a unique row in the bottom part of the observation table and move its corresponding element from S·A into S.
◦ Consistent means that if the rows for two elements s_1, s_2 of S are the same, then for all a in A the rows for s_1 a and s_2 a are also the same.
◦ If the table isn’t consistent, we find a symbol a and a column e where this doesn’t hold, and add the suffix a·e to E.
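The two checks can be written directly against a table stored as a dictionary from strings to membership answers. The function names below are hypothetical, and the sample values are copied from the worked example in these slides (the single-column table after the counter-example “ba” has been added). Note that more than one suffix can repair an inconsistency; the choice among them is free.

```python
ALPHABET = "ab"


def row(T, E, s):
    """The row of string s: one table value per experiment in E."""
    return tuple(T[s + e] for e in E)


def unclosed_rows(S, E, T, alphabet=ALPHABET):
    """Strings of S.A whose row does not appear among the rows of S."""
    top = {row(T, E, s) for s in S}
    return [s + a for s in S for a in alphabet if row(T, E, s + a) not in top]


def distinguishing_suffixes(S, E, T, alphabet=ALPHABET):
    """Suffixes a+e that separate two strings of S with identical rows."""
    found = set()
    for i, s1 in enumerate(S):
        for s2 in S[i + 1:]:
            if row(T, E, s1) != row(T, E, s2):
                continue
            for a in alphabet:
                for e in E:
                    if T[s1 + a + e] != T[s2 + a + e]:
                        found.add(a + e)
    return found


# Table values from the example: "" stands for the empty string.
T = {"": 0, "a": 1, "b": 1, "ba": 1,
     "aa": 0, "bb": 0, "ab": 1, "baa": 0, "bab": 1}
S = ["", "a", "b", "ba"]
E = [""]

print(unclosed_rows(S, E, T))                    # []
print(sorted(distinguishing_suffixes(S, E, T)))  # ['a', 'b']
```

An empty list from `unclosed_rows` means the table is closed; an empty set from `distinguishing_suffixes` means it is consistent.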

Example Run
Let’s use an example DFA from Sipser (Example 1.68, p. 76 in the International edition). The alphabet is A = {a, b}.

Example, continued
S = E = {ε}
T is initialized with (rows above the line are in S, rows below are in S·A):

      ε
ε     0
-------
a     1
b     1

T is not closed: row(a) ≠ row(ε). Add “a” to S and extend T:

      ε
ε     0
a     1
-------
b     1
aa    0
ab    1

T is now both closed and consistent.

First guess
The teacher rejects the first guess and gives the counterexample “ba”, which the first guess does not accept. We add “ba” and all its prefixes (“b”) to S. S is now {ε, “a”, “b”, “ba”}:

      ε
ε     0
a     1
b     1
ba    1
-------
aa    0
bb    0
ab    1
baa   0
bab   1

Now the table is no longer consistent: row(b) = row(ba), but row(bb) ≠ row(bab). We add “b” to E.
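The bookkeeping for a counter-example is small enough to sketch. The helper names `prefixes`, `add_counterexample` and `boundary` are hypothetical; the sketch only shows which strings end up in S and which rows of S·A then need membership queries.

```python
ALPHABET = "ab"


def prefixes(word):
    """All non-empty prefixes of word, shortest first."""
    return [word[:i] for i in range(1, len(word) + 1)]


def add_counterexample(S, counterexample):
    """Return a copy of S that also holds the counter-example and its prefixes."""
    extended = list(S)
    for p in prefixes(counterexample):
        if p not in extended:
            extended.append(p)
    return extended


def boundary(S, alphabet=ALPHABET):
    """Strings of S.A that are not already in S (the bottom part of the table)."""
    seen, out = set(S), []
    for s in S:
        for a in alphabet:
            if s + a not in seen:
                seen.add(s + a)
                out.append(s + a)
    return out


# "" stands for the empty string.
S = add_counterexample(["", "a"], "ba")
print(S)            # ['', 'a', 'b', 'ba']
print(boundary(S))  # ['aa', 'ab', 'bb', 'baa', 'bab']
```

Adding every prefix keeps S prefix-closed, which the construction of the guess relies on.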

Second guess
The table is now consistent and closed, so we make a guess. Note that the unique row “bitmask values” translate directly to states.

T     ε  b
ε     0  1
a     1  1
b     1  0
ba    1  1
-------
aa    0  1
bb    0  1
ab    1  1
baa   0  1
bab   1  1
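How the rows become states can be sketched as follows, using the row values from the table in this slide. The function names are hypothetical. The start state is the row of the empty string, a state is accepting when its ε column is 1, and reading symbol a from the row of s leads to the row of s followed by a.

```python
ALPHABET = "ab"
S = ["", "a", "b", "ba"]
E = ["", "b"]
# Rows copied from the table: (value in column ε, value in column b).
ROWS = {
    "": (0, 1), "a": (1, 1), "b": (1, 0), "ba": (1, 1),
    "aa": (0, 1), "bb": (0, 1), "ab": (1, 1), "baa": (0, 1), "bab": (1, 1),
}


def build_hypothesis(S, rows, alphabet=ALPHABET):
    """Turn a closed, consistent table into (start, accepting, delta)."""
    start = rows[""]
    accepting = {rows[s] for s in S if rows[s][0] == 1}
    delta = {(rows[s], a): rows[s + a] for s in S for a in alphabet}
    return start, accepting, delta


def accepts(hyp, word):
    state, accepting, delta = hyp
    for ch in word:
        state = delta[(state, ch)]
    return state in accepting


hyp = build_hypothesis(S, ROWS)
states = {state for (state, _) in hyp[2]}
print(sorted(states))       # [(0, 1), (1, 0), (1, 1)]
print(accepts(hyp, "ba"))   # True

# The guess reproduces every cell of the table it was built from.
print(all(accepts(hyp, s + e) == bool(ROWS[s][i])
          for s in ROWS for i, e in enumerate(E)))  # True
```

The three distinct rows 01, 11 and 10 give a three-state machine, and it now accepts the earlier counter-example “ba”.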

Running time
The equivalence test uses EQ_DFA. Since each equivalence test adds at least one state to the guessed state machine, in the worst case we make one guess for each state in the target machine. In general, before each guess, we add only one string to either S or E. The running time is O(m²n² + mn³), where m is the length of the longest counterexample produced and n is the number of states in the target machine.

Further work
The requirement of a teacher is considered unfair by many, because it requires too much knowledge of the automaton. The estimation/exploration algorithm (EEA) is a genetic algorithm.
◦ Creates many random state machines and many random test strings.
◦ Compares the output of the random state machines with the output of the target machine.
◦ Iteratively refines, alternately, the random state machines and the test strings, either until convergence or until some desirable behaviour is displayed.
◦ Verification is done with a new set of test strings.

References
Angluin, D., 1987. Learning Regular Sets from Queries and Counter-examples.
Gold, E. Mark, 1967. Language Identification in the Limit.
Bongard, J., Lipson, H., 2005. Active Coevolutionary Learning of Deterministic Finite Automata.
