Download presentation

Presentation is loading. Please wait.

Published byLola Florence Modified over 2 years ago

1
Sandro Spina, John Abela Department of CS & AI, University of Malta. Mutually compatible and incompatible merges for the search of the smallest consistent DFA Francois Coste INRIA/IRISA, Campus de Beaulieu, 35042 Rennes Cedex, France

2
Mutually compatible and incompatible merges for the search of the smallest consistent DFA – ICGI 2004 2 The motivation behind our work was that of improving the (greedy) heuristic used by EDSM. Work was also carried out on diversification of the search strategy. EDSM [Price98] is very effective at inferring regular languages, except when training data is sparse. According to [Price98], Abbadingo style problems can be solved with high confidence (0.93) when the number of matched state labels is greater than 10. EDSM determines its merge sequence (greedily) by using a heuristic which compares language suffixes between two states in a DFA. Three Complementary Tracks Improve on Heuristic Score. Improve on Search Strategy. Combine these two. Evidence Driven State Merging

3
Mutually compatible and incompatible merges for the search of the smallest consistent DFA – ICGI 2004 3 Q. Whenever EDSM does not correctly infer the target language, can we (using a greedy depth-first search) improve the learner’s merge path by gathering and combining information (state label matches) from multiple valid merges? Does the combination of their evidence scores result in valuable information? Can this information be used to guide the search? We think so !!! Some of the initial results are encouraging. Target Size Convergence Improves Drastically Classification Rate Does Not Improve Consistently EDSM score→ focuses on single merge analysis S-EDSM score → score is a combination of single merge analysis Sharing Evidence

4
Mutually compatible and incompatible merges for the search of the smallest consistent DFA – ICGI 2004 4 Let M be the set of all possible merges. A merge M, is said to be valid if all the states in the subtree of q1 are state compatible with the corresponding states in the subtree of q2. Let M1, M2 2 M be two valid merges We define the relation ↑ µ M X M as follows M1↑M2 if M2 remains a valid merge in the hypothesis obtained by applying M1 If M1↑M2, we say that M1 is pairwise compatible to M2 Pairwise Compatible Merges

5
Mutually compatible and incompatible merges for the search of the smallest consistent DFA – ICGI 2004 5 Pairwise Compatible Merges (Simple) Example

6
Mutually compatible and incompatible merges for the search of the smallest consistent DFA – ICGI 2004 6 Suppose that M1, M2 and M3 2 M where M1 ↑ M2 and M2 ↑ M3 This does not necessarily imply that M1↑M3. This is because some states in M2 can be labelled differently by M1 and M3. Therefore ↑ is not transitive. In order to make ↑ a transitive relation ( denoted as ↕ ), M1 ↑ M3 needs to be checked as well to create the set { M1, M2, M3 } Set cardinality of mutual compatible merges can direct S-EDSM ‘s heuristic score. This is currently not implemented. Mutual Compatible Merges

7
Mutually compatible and incompatible merges for the search of the smallest consistent DFA – ICGI 2004 7 S-EDSM Algorithm

8
Mutually compatible and incompatible merges for the search of the smallest consistent DFA – ICGI 2004 8 Shared Evidence Driven State Merging (S-EDSM) implements only pairwise compatibility by creating classes of M1 ↑ { M2 … Mn } for the top 30% valid merges. Scores are recalculated and the best merge is determined and executed. Various strategies can be implemented. In terms of classification rate we are still not consistently performing better than classic EDSM. S-EDSM approximates better the target size of the target automaton. However this improvement does NOT help on its own. It’s only (possibly) an indication of a direction to follow. Initial Results

9
Mutually compatible and incompatible merges for the search of the smallest consistent DFA – ICGI 2004 9 Results II ( 400 State Target Size Convergence ) This graph documents 10 consecutive problems downloaded from Gowachin. Training set consisted of 20,000 strings.

10
Mutually compatible and incompatible merges for the search of the smallest consistent DFA – ICGI 2004 10 Results III ( 256 State Target Size Classification )

11
Mutually compatible and incompatible merges for the search of the smallest consistent DFA – ICGI 2004 11 Pairwise Incompatible Merges for Search m1 M m2 Mm3 M m5 Mm4 M =≠ Classical Search Tree: ……… …… = = = = ≠ ≠ ≠

12
Mutually compatible and incompatible merges for the search of the smallest consistent DFA – ICGI 2004 12 Pairwise Incompatible Merges for Search m1 M m2 M m’3 M I(m1) m’5 M I(m2) m4 M =≠ Candidates limitation after backtrack: … … … … … = = = = ≠ ≠ ≠ ≠

13
Mutually compatible and incompatible merges for the search of the smallest consistent DFA – ICGI 2004 13 Rationale: –A merge m’ I(m) may be tried after m. –Introduces diversity in the search Edsm: I(m) may be computed [Coste & Fredouille,ICGI’00] S-Edsm: I(m) is available « for free » Significant improvement when applied to the 3 first choices. Best application of scheme after the choice m’3 M I(m1) ? –After merging m’3 (=) –After not merging m’3 (≠) Pairwise Incompatible Merges for Search

14
Mutually compatible and incompatible merges for the search of the smallest consistent DFA – ICGI 2004 14 Develop a calculus to describe merge interactions. Implement all the relations and functions ( mutual compatibility, dominance, etc. ) of the calculus. Analyse the results achieved from these different implementations. Combine heuristic with better search strategies and study the best combination of heuristic and search strategy. Introduction of diversity in the exploration of the search space by limiting choice of candidate merges after a backtrack. Noisy Data !! Can S-EDSM perform better by combining information between different merges. Maybe with information gathered from merge interactions S-EDSM can ‘discover’ noise in the training set. Ultimately we want to see how far we can push, in terms of data sparseness, DFA learning. Thank you. Future Directions

Similar presentations

OK

May 2006CLINT-LN Parsing1 Computational Linguistics Introduction Approaches to Parsing.

May 2006CLINT-LN Parsing1 Computational Linguistics Introduction Approaches to Parsing.

© 2017 SlidePlayer.com Inc.

All rights reserved.

Ads by Google