Iowa State University Department of Computer Science, Iowa State University Artificial Intelligence Research Laboratory Center for Computational Intelligence,

1 Iowa State University Department of Computer Science, Iowa State University Artificial Intelligence Research Laboratory Center for Computational Intelligence, Learning, and Discovery CCILD Talk presented at ICGI 2008, St Malo, France, September 2008. Kewei Tu and Vasant Honavar Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering Kewei Tu and Vasant Honavar Artificial Intelligence Research Laboratory Department of Computer Science Iowa State University www.cs.iastate.edu/~honavar/aigroup.html www.cild.iastate.edu

2 Unsupervised Learning of Probabilistic Context-Free Grammar
- Greedy search to maximize the posterior of the grammar given the corpus
- Iterative (distributional) biclustering
- Competitive experimental results

3 Outline
- Introduction
- Probabilistic Context-Free Grammars (PCFG)
- The Algorithm based on Iterative Biclustering (PCFG-BCL)
- Experimental results

4 Outline
- Introduction
- Probabilistic Context-Free Grammars (PCFG)
- The Algorithm based on Iterative Biclustering (PCFG-BCL)
- Experimental results

5 Motivation
- Probabilistic Context-Free Grammars (PCFGs) find applications in many areas, including:
  - Natural Language Processing
  - Bioinformatics
- It is important to learn PCFGs from data (a training corpus)
- A labeled corpus is not always available
- Hence the need for unsupervised learning

6 Task
- Unsupervised learning of a PCFG from a positive corpus

a square is above the triangle
the square rolls
a triangle rolls
the square rolls
a triangle is above the square
a circle touches a square
the triangle covers the circle
……

S → NP VP
NP → Det N
VP → Vt NP (0.3) | Vi PP (0.2) | rolls (0.2) | bounces (0.1)
……

7 Outline
- Introduction
- Probabilistic Context-Free Grammars (PCFG)
- The Algorithm based on Iterative Biclustering (PCFG-BCL)
- Experimental results

8 PCFG
- Context-free Grammar (CFG): G = (N, Σ, R, S)
  - N: non-terminals
  - Σ: terminals
  - R: rules
  - S ∈ N: the start symbol
- Probabilistic CFG: probabilities on grammar rules
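
The definition above can be written down as a small data structure. The following is only a sketch: the names `Rule` and `is_proper` are illustrative and do not come from the talk, and the probabilities in the toy grammar are made up. It checks the usual requirement that the probabilities of all rules sharing a left-hand side sum to 1.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Rule:
    lhs: str               # a non-terminal in N
    rhs: Tuple[str, ...]   # a sequence of symbols from N or Sigma
    prob: float            # probability of choosing this rule for lhs


def is_proper(rules, tol=1e-9):
    """True if, for every left-hand side, the rule probabilities sum to 1."""
    totals = defaultdict(float)
    for r in rules:
        totals[r.lhs] += r.prob
    return all(abs(t - 1.0) <= tol for t in totals.values())


# Toy grammar; the probabilities are made up for illustration.
toy_rules = [
    Rule("S", ("NP", "VP"), 1.0),
    Rule("NP", ("Det", "N"), 1.0),
    Rule("Det", ("a",), 0.5),
    Rule("Det", ("the",), 0.5),
    Rule("N", ("square",), 0.6),
    Rule("N", ("triangle",), 0.4),
    Rule("VP", ("rolls",), 0.7),
    Rule("VP", ("bounces",), 0.3),
]
```

Calling `is_proper(toy_rules)` confirms that the toy grammar defines a valid distribution over the expansions of each non-terminal.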

9 P-CNF
- Probabilistic Chomsky normal form (P-CNF)
- Two types of rules:
  - A → BC
  - A → a

10 The AND-OR form
- P-CNF in the AND-OR form
- Two types of non-terminals: AND, OR
  - AND → OR1 OR2
  - OR → A1 | A2 | a1 | a2 | ……, with probabilities

11 The AND-OR form
- P-CNF in the AND-OR form

12 The AND-OR form
- P-CNF in the AND-OR form can be divided into two parts:
  - Start rules: S → …
  - A set of AND-OR groups
- Each group: AND → OR1 OR2
- Bijection between ANDs and groups
- An OR may appear in multiple groups
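
One way to picture an AND-OR group is as a pair of probability tables, one per OR symbol. The sketch below is an illustration under assumptions: the class name `AndOrGroup`, the symbol `AND_NP`, and all probabilities are hypothetical. It shows that the probability of the AND producing a pair (x, y) is the product of the two OR probabilities, which is where the multiplicative structure used later in the talk comes from.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class AndOrGroup:
    and_symbol: str
    or1: Dict[str, float]  # OR1 -> symbol, with its probability
    or2: Dict[str, float]  # OR2 -> symbol, with its probability

    def pair_probabilities(self):
        """P(AND => x y) = P(OR1 -> x) * P(OR2 -> y) for every pair (x, y)."""
        return {(x, y): p * q
                for x, p in self.or1.items()
                for y, q in self.or2.items()}


# Hypothetical group for a noun phrase; the probabilities are made up.
np_group = AndOrGroup(
    and_symbol="AND_NP",
    or1={"a": 0.5, "the": 0.5},
    or2={"square": 0.5, "triangle": 0.25, "circle": 0.25},
)
probs = np_group.pair_probabilities()
```

Because each OR table sums to 1, the pair probabilities in `probs` also sum to 1.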

13 The AND-OR form
- P-CNF in the AND-OR form can be divided into two parts

14 Outline
- Introduction
- Probabilistic Context-Free Grammars (PCFG)
- The Algorithm based on Iterative Biclustering (PCFG-BCL)
- Experimental results

15 PCFG-BCL: Outline
- Start with only the terminals
- Repeat the two steps:
  - Learn a new AND-OR group by biclustering
  - Attach the new AND to existing ORs
- Post-processing: add start rules
- In principle, these steps are sufficient for learning any CNF grammar

16 PCFG-BCL: Outline
- Find new rules that yield the greatest increase in the posterior of the grammar given the corpus
- Local search, with the posterior as the objective function
- Use a prior that favors simpler grammars to avoid overfitting

17 PCFG-BCL
- Repeat the two steps:
  - Learn a new AND-OR group by biclustering
  - Attach the new AND to existing ORs
- Post-processing: add start rules

18 Intuition
- Construct a table T
- Index the rows and columns by symbols appearing in the corpus
- The cell at row x and column y records the number of times the pair xy appears in the corpus
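
The table T described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `pair_table` is hypothetical, the table is stored sparsely as a counter keyed by (row symbol, column symbol), and the example sentences are taken from the Task slide.

```python
from collections import Counter


def pair_table(corpus):
    """Count how often each adjacent symbol pair (x, y) appears in the corpus."""
    table = Counter()
    for sentence in corpus:
        for x, y in zip(sentence, sentence[1:]):
            table[(x, y)] += 1
    return table


corpus = [s.split() for s in [
    "the square rolls",
    "a triangle rolls",
    "the square rolls",
    "a circle touches a square",
]]
T = pair_table(corpus)
```

Looking up `T[("the", "square")]` gives the cell at row "the" and column "square"; pairs that never occur read as zero.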

19 An AND-OR group corresponds to a bicluster

20 The bicluster is multiplicatively coherent for any two rows i, j and two columns k, l
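
The formula on this slide did not survive in the transcript. The sketch below assumes the usual ratio definition of multiplicative coherence, a_ik / a_jk = a_il / a_jl, written in cross-multiplied form so that zero cells cause no division errors. The function name and the tolerance parameter are illustrative choices, not part of the talk.

```python
from itertools import combinations


def is_multiplicatively_coherent(matrix, tol=1e-9):
    """Check a[i][k] * a[j][l] == a[i][l] * a[j][k] for all rows i, j and columns k, l."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    for i, j in combinations(range(n_rows), 2):
        for k, l in combinations(range(n_cols), 2):
            lhs = matrix[i][k] * matrix[j][l]
            rhs = matrix[i][l] * matrix[j][k]
            # Relative tolerance, so large counts are compared fairly.
            if abs(lhs - rhs) > tol * max(1.0, abs(lhs), abs(rhs)):
                return False
    return True
```

Under this definition a matrix whose rows are all multiples of one another, such as `[[2, 4, 6], [1, 2, 3]]`, is coherent. Real corpus counts are noisy, so an exact check like this one is only a starting point.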

21 Expression-context matrix of a bicluster
- Each row: a symbol pair contained in the bicluster
- Each column: a context in which the symbol pairs appear in the corpus
- It is also multiplicatively coherent
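
The slide does not say what counts as a context. The sketch below assumes, purely for illustration, that a context is the pair of symbols immediately to the left and right of the symbol pair, with hypothetical boundary markers `<s>` and `</s>` at the sentence edges. The function name is also an assumption.

```python
from collections import Counter


def expression_context_matrix(corpus, pairs):
    """Rows: symbol pairs in the bicluster. Columns: (left, right) contexts."""
    counts = Counter()
    for sentence in corpus:
        # Assumed boundary markers so that every pair has two neighbours.
        padded = ["<s>"] + list(sentence) + ["</s>"]
        for i in range(1, len(padded) - 2):
            pair = (padded[i], padded[i + 1])
            if pair in pairs:
                context = (padded[i - 1], padded[i + 2])
                counts[(pair, context)] += 1
    return counts


corpus = [s.split() for s in [
    "the square rolls",
    "a triangle rolls",
    "the square rolls",
]]
pairs = {("the", "square"), ("a", "triangle")}
ec = expression_context_matrix(corpus, pairs)
```

Each key of `ec` is a (row, column) coordinate of the matrix, and absent cells read as zero.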

22 Intuition
- If there is a bicluster that is multiplicatively coherent and has a multiplicatively coherent expression-context matrix,
- then an AND-OR group can be learned from the bicluster

23 Probabilistic Justification
- Change in likelihood as a result of adding an AND-OR group to a PCFG
- Bicluster multiplicative coherence
- Expression-context matrix multiplicative coherence

24 Prior
- To prevent overfitting, use a prior that favors simpler grammars
- P(G) ∝ 2^(-DL(G))
- DL(G) is the description length of the grammar
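
To see how this prior enters the search objective, it may help to write the posterior gain of a grammar update in log form. The notation here is an assumption made for illustration: D is the corpus, G the current grammar, and G' the grammar after adding new rules. Applying Bayes' rule, P(G | D) ∝ P(D | G) P(G), together with the prior above gives:

```latex
\[
\log_2 \frac{P(G' \mid D)}{P(G \mid D)}
  = \log_2 \frac{P(D \mid G')}{P(D \mid G)}
  - \bigl( DL(G') - DL(G) \bigr)
\]
```

The first term is the likelihood gain, and the second term charges every extra bit of description length against it, so a rule is worth adding only if it improves the fit by more than it costs to encode.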

25 Learning a new AND-OR group by biclustering
- Find in the table T a bicluster that leads to the maximal posterior gain
- Create a new AND-OR group from the bicluster
- Reduce the corpus using the new rules
  - E.g., “the circle” is rewritten to the new AND symbol
- Update T
  - A new row and column are added for the new AND symbol
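
The corpus-reduction step can be sketched as follows. This is an illustration under assumptions: the function name `reduce_corpus`, the symbol name `AND_1`, and the greedy left-to-right scan are not taken from the talk. The bicluster is given by its row symbols and column symbols, and every adjacent pair drawn from them is rewritten to the new AND symbol.

```python
def reduce_corpus(corpus, rows, cols, and_symbol):
    """Rewrite every adjacent pair (x, y) with x in rows and y in cols as and_symbol."""
    reduced = []
    for sentence in corpus:
        out, i = [], 0
        while i < len(sentence):
            if (i + 1 < len(sentence)
                    and sentence[i] in rows and sentence[i + 1] in cols):
                out.append(and_symbol)   # replace the pair by the new AND
                i += 2
            else:
                out.append(sentence[i])  # copy the symbol unchanged
                i += 1
        reduced.append(out)
    return reduced


reduced = reduce_corpus(
    [["the", "circle", "touches", "a", "square"]],
    rows={"a", "the"},
    cols={"square", "circle", "triangle"},
    and_symbol="AND_1",
)
```

After the reduction, the table T would be recomputed on the shortened sentences, so that the new AND symbol gets its own row and column.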

26 PCFG-BCL
- Repeat the two steps:
  - Learn a new AND-OR group by biclustering
  - Attach the new AND to existing ORs
- Post-processing: add start rules

27 Attaching the new AND under existing ORs
- For the new AND symbol N …
- There may exist OR symbols in the learned grammar such that O → N is in the target grammar
- Such rules can't be learned in the biclustering step
  - When learning O, N doesn't exist
  - When learning N, only N → AB is learned
- We need an additional step to find such rules
- Recursion is learned in this step

28 Intuition
- Adding rule O → N = adding a new row/column to the bicluster
- If O → N is true, then
  - the expanded bicluster is multiplicatively coherent
  - the expanded expression-context matrix is multiplicatively coherent
- If we find an OR symbol such that the expanded bicluster has this property,
- then a new rule O → N can be added to the grammar

29 Probabilistic Justification
- Likelihood gain is an approximation of the expanded bicluster
- To prevent overfitting, the prior is also considered

30 Attaching the new AND under existing ORs
- Try to find OR symbols that lead to large posterior gain
- When found:
  - add the new rule O → N to the grammar
  - do a maximal reduction of the corpus
  - update the table T

31 PCFG-BCL
- Repeat the two steps:
  - Learn a new AND-OR group by biclustering
  - Attach the new AND to existing ORs
- Post-processing: add start rules

32 Postprocessing
- For each sentence in the corpus:
  - If it is fully reduced to a single symbol x, then add S → x
  - If not, a few options…
- Return the grammar
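
The first branch of the postprocessing step can be sketched as below. The slide does not say how the start rules are weighted, so the relative-frequency probabilities here are an assumption, as are the function name and the symbol names. Sentences that are not fully reduced are skipped, since the slide leaves those options open.

```python
from collections import Counter


def add_start_rules(reduced_corpus, start="S"):
    """Add S -> x for each sentence fully reduced to one symbol x.

    Probabilities are relative frequencies (an assumption, not from the talk).
    """
    counts = Counter(s[0] for s in reduced_corpus if len(s) == 1)
    total = sum(counts.values())
    return {(start, x): n / total for x, n in counts.items()}


start_rules = add_start_rules([
    ["AND_3"],
    ["AND_3"],
    ["AND_5"],
    ["AND_1", "rolls"],  # not fully reduced, so no start rule is added
])
```

If no sentence is fully reduced, the function returns an empty rule set rather than dividing by zero.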

33 Outline
- Introduction
- Probabilistic Context-Free Grammars (PCFG)
- The Algorithm based on Iterative Biclustering (PCFG-BCL)
- Experimental results

34 Experiments
- Measurements
  - weak generative capacity
  - precision, recall, F-score
- Test data
  - artificial, English-like CFGs
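
The slide does not spell out how these measurements are computed. A common setup for weak generative capacity, assumed here, is that precision is the fraction of sentences sampled from the learned grammar that the target grammar accepts, and recall is the fraction of sentences sampled from the target grammar that the learned grammar accepts. The helper names and the `accepts` callback are hypothetical.

```python
def fraction_accepted(sentences, accepts):
    """Fraction of the given sentences for which accepts(sentence) is true."""
    sentences = list(sentences)
    return sum(1 for s in sentences if accepts(s)) / len(sentences)


def f_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

With these helpers, precision would be `fraction_accepted(samples_from_learned, target_accepts)` and recall `fraction_accepted(samples_from_target, learned_accepts)`, where the sample lists and acceptance tests come from whatever parser is available.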

35 Experiment results
- P = Precision, R = Recall, F = F-score
- Number in the parentheses: standard deviation
- PCFG-BCL outperforms EMILE [Adriaans, et al., 2000] and ADIOS [Solan, et al., 2005] with lower standard deviations

36 Summary
- An unsupervised PCFG-learning algorithm
- It acquires new grammar rules by iterative biclustering on a table of symbol pairs
- In each step it tries to maximize the increase of the posterior of the grammar
- Competitive experimental results

37 Work in progress
- Alternative strategies for optimizing the objective function
- Evaluation on and adaptation to real-world applications (e.g., natural language), with respect to both weak and strong generative capacity

38 Thank you

39 Backup…

40 Step 1
- Bicluster multiplicative coherence
- E-C matrix multiplicative coherence
- Prior gain (bias towards large BC)
- Likelihood gain
- Posterior gain:

41 Step 2
- Intuition
- Remember O is learned by extracting a bicluster
- Adding rule O → N = adding a new row/column to the bicluster

42 Expanding the bicluster
- The expanded bicluster should still be multiplicatively coherent

43 Step 2
- Intuition
- Expression-context matrix
- Adding rule O → N = adding a set of new rows to the E-C matrix

44 Expanding the expression-context matrix
- The expanded expression-context matrix should still be multiplicatively coherent

45 Step 2
- Likelihood gain: [formula lost in extraction]
- [symbol lost in extraction]: the expected numbers of appearances of the symbol pairs when applying the current grammar to expand the current partially reduced corpus

46 Grammar selection/averaging
- Run the algorithm multiple times to get multiple grammars
- Use the posterior of the grammars to do model selection/averaging
- Experimental results:
  - Improved the performance
  - Decreased the standard deviations

47 Time Complexity
- N: # of ANDs
- k: average # of rules headed by an OR
- c: average # of columns of the expression-context matrix
- h: average # of ORs that produce an AND or terminal
- d: a recursion depth limit
- ω: # of sentences in the corpus
- m: average sentence length

48 Biclustering vs. distributional clustering
- V1 → makes | likes
- V2 → likes | is
- Figure from [Adriaans, et al., 2000]

49 Biclustering vs. substitutability heuristic
- N1 → tea | coffee
- N2 → eating
- Figure from [Adriaans, et al., 2000]

51 A set of multiplicatively coherent biclusters, which represent a set of AND-OR groups in the grammar.

52 Related work
- Unsupervised CFG learning
  - EMILE [Adriaans et al., 2000]
  - ABL [Zaanen, 2000]
  - [Clark, 2001; 2007]
  - ADIOS [Solan et al., 2005]
- Main difference
  - Distributional biclustering
  - A unified method for different types of rules

53 Related work
- Unsupervised PCFG learning
  - Inside-outside
  - [Stolcke & Omohundro, 1994]
  - [Chen, 1995]
  - [Kurihara & Sato, 2004; 2006]
  - [Liang et al., 2007]
- Main difference
  - Different prior
  - Structure search method

54 Related work
- Unsupervised parsing (not CFG)
  - [Klein & Manning, 2002; 2004]
  - U-DOP [Bod, 2006]

