Presentation on theme: "Prototype-Driven Grammar Induction Aria Haghighi and Dan Klein Computer Science Division University of California Berkeley."— Presentation transcript:

1 Prototype-Driven Grammar Induction Aria Haghighi and Dan Klein Computer Science Division University of California Berkeley

2 Grammar Induction [Pipeline diagram: a grammar engineer supplies knowledge; EM combines it with data; the output is an induced grammar.]

3 Central Questions: How do we specify what we want to learn? How do we fix observed errors? [Dialogue: "NPs are things like DT NN." "That's not quite it!"]

4 Experimental Set-up. Binary grammar {X1, X2, …, Xn} plus POS tags [binary rule schema X_i → X_j X_k]. Data: WSJ-10 [7k sentences]. Evaluate on labeled F1. Grammar upper bound: 86.1.

5 Unconstrained PCFG Induction. Learn a PCFG with EM via the Inside-Outside algorithm [Lari & Young '90]. [Chart diagram: a span (i, j) in a sentence of length n, with inside probability α and outside probability β.] Results:
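The inside pass of Inside-Outside can be sketched as follows. This is a minimal illustration on a toy grammar, not the paper's implementation: it computes only the inside chart (the E-step's α values); the outside pass and EM re-estimation are omitted.

```python
from collections import defaultdict

def inside(words, grammar, lexicon, nonterminals):
    """Inside probabilities: alpha[X, i, j] = P(X derives words[i:j])
    for a binary PCFG. grammar maps X -> {(Y, Z): prob};
    lexicon maps X -> {word: prob}."""
    n = len(words)
    alpha = defaultdict(float)
    # Base case: length-1 spans covered by lexical rules.
    for i, w in enumerate(words):
        for X in nonterminals:
            alpha[X, i, i + 1] = lexicon.get(X, {}).get(w, 0.0)
    # Recursive case: combine two adjacent sub-spans with a binary rule.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for X in nonterminals:
                total = 0.0
                for (Y, Z), p in grammar.get(X, {}).items():
                    for k in range(i + 1, j):  # split point
                        total += p * alpha[Y, i, k] * alpha[Z, k, j]
                alpha[X, i, j] = total
    return alpha

# Toy grammar: S -> A B, with one lexical ambiguity in B.
grammar = {"S": {("A", "B"): 1.0}}
lexicon = {"A": {"a": 1.0}, "B": {"b": 0.5}}
alpha = inside(["a", "b"], grammar, lexicon, ["S", "A", "B"])
print(alpha["S", 0, 2])  # 0.5
```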

6 Encoding Knowledge: What's an NP? One answer: semi-supervised learning.

7 Encoding Knowledge: What's an NP? For instance, DT NN, JJ NNS, NNP. Our answer: prototype learning.

8 Prototype List

9 How to use prototypes? [Example parse of "The hungry koala sat in the tree" (DT JJ NN VBD IN DT NN), with NP, VP, and PP spans marked.] Prototype table — Phrase: Prototype. NP: DT NN. PP: IN DT NN. VP: VBD IN DT NN.

10 Distributional Similarity [Clark '01; Klein & Manning '01]. Tag sequences are grouped by the distribution of their surrounding contexts and labeled with the nearest prototype. proto=NP: (DT NN), (JJ NNS), (NNP NNP). proto=VP: (VBD DT NN), (MD VB NN), (VBN IN NN). proto=PP: (IN NN), (IN PRP), (TO CD CD). For example, (DT JJ NN) has context signature { ¦ __ VBD : 0.3, VBD __ ¦ : 0.2, IN __ VBD : 0.1, … }, which is closest to the NP prototypes, so it receives proto=NP.
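The context signatures above can be sketched as below. This is an illustrative reconstruction, not the paper's code: it compares signatures with cosine similarity (the actual work uses a divergence-based measure), and it uses `<s>`/`</s>` markers where the slide writes `¦` for the sentence boundary.

```python
import math
from collections import Counter

def context_signature(sequence, tagged_sents):
    """Distribution over (left-tag, right-tag) contexts in which
    `sequence` (a tuple of POS tags) occurs. '<s>'/'</s>' mark
    sentence boundaries (the slide's '¦')."""
    counts = Counter()
    k = len(sequence)
    for sent in tagged_sents:
        padded = ["<s>"] + sent + ["</s>"]
        for i in range(1, len(padded) - k):
            if tuple(padded[i:i + k]) == sequence:
                counts[(padded[i - 1], padded[i + k])] += 1
    total = sum(counts.values())
    return {c: v / total for c, v in counts.items()} if total else {}

def cosine(p, q):
    """Cosine similarity between two sparse distributions."""
    dot = sum(p[c] * q.get(c, 0.0) for c in p)
    norm = (math.sqrt(sum(v * v for v in p.values()))
            * math.sqrt(sum(v * v for v in q.values())))
    return dot / norm if norm else 0.0

# Two toy tag sequences: "The koala sat in the tree" / "Hungry koalas sat in the tree".
sents = [["DT", "NN", "VBD", "IN", "DT", "NN"],
         ["JJ", "NNS", "VBD", "IN", "DT", "NN"]]
sig_dtnn = context_signature(("DT", "NN"), sents)
sig_jjnns = context_signature(("JJ", "NNS"), sents)
print(cosine(sig_dtnn, sig_jjnns) > 0)  # True: they share the <s> __ VBD context
```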

11 Distributional Similarity (continued). Same clusters: proto=NP: (DT NN), (JJ NNS), (NNP NNP); proto=VP: (VBD DT NN), (MD VB NN), (VBN IN NN); proto=PP: (IN NN), (IN PRP), (TO CD CD). A sequence such as (IN DT) that is not distributionally close to any prototype receives proto=NONE.

12 Prototype CFG+ Model [Example tree for "The hungry koala sat in the tree": prototypical spans carry features such as proto=NP, while other spans carry proto=NONE.]

13 Prototype CFG+ Model. Each rewrite contributes a CFG-rule factor and a prototype-feature factor, e.g. P(DT NP | NP) (CFG rule) and P(proto=NP | NP) (proto feature). [Example tree for "The hungry koala sat in the tree" with S, NP, VP, PP labels.]
P_CFG+(T, F) = ∏ over rewrites X(i,j) → γ in T of P(γ | X) · P(p_{i,j} | X)
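The CFG+ product above can be sketched as a recursive tree score. All parameter values here are hypothetical toy numbers for illustration; the model shown is just the factorization from the slide, not the paper's trained model.

```python
def score_tree(node, rule_probs, proto_probs):
    """P_CFG+(T, F): each rewrite X -> gamma over a span (i, j)
    contributes P(gamma | X) * P(proto_ij | X).
    node = (label, proto_feature, children); leaves are POS tag strings."""
    label, proto, children = node
    # The rewrite gamma is the sequence of child labels (tags for leaves).
    gamma = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = rule_probs[label][gamma] * proto_probs[label][proto]
    for child in children:
        if not isinstance(child, str):
            p *= score_tree(child, rule_probs, proto_probs)
    return p

# Hypothetical toy parameters (illustrative values only).
rule_probs = {"S": {("NP", "VP"): 1.0},
              "NP": {("DT", "NN"): 0.7},
              "VP": {("VBD", "NP"): 0.5}}
proto_probs = {"S": {"NONE": 1.0},
               "NP": {"NP": 0.8},
               "VP": {"VP": 0.6}}
tree = ("S", "NONE",
        [("NP", "NP", ["DT", "NN"]),
         ("VP", "VP", ["VBD", ("NP", "NP", ["DT", "NN"])])])
print(score_tree(tree, rule_probs, proto_probs))  # 0.09408
```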

14 Prototype CFG+ Induction. Experimental set-up: add prototypes; learn from the BLLIP corpus. Results: unlabeled F1 66.9.

15 Constituent-Context Model [Klein & Manning '02]. Every span is classified as a constituent (+) or distituent (−), and each class generates the span's yield and its context. Examples from "The hungry koala sat in the tree": yield P(VBD IN DT NN | +) and context P(NN __ ¦ | +); yield P(NN VBD | −) and context P(JJ __ IN | −).

16 Constituent-Context Model. With bracketing matrix B over all spans:
P_CCM(B, F') = ∏ over spans (i, j) of P(B_{i,j}) · P(y_{i,j} | B_{i,j}) · P(c_{i,j} | B_{i,j})
where y_{i,j} is the span's yield and c_{i,j} its context. [Example sentence "The hungry koala sat in the tree" with its yield and context features.]
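The CCM product over spans can be sketched as below. This is a toy illustration of the factorization only: the parameters are hypothetical, unseen yields/contexts back off to a small constant as a crude stand-in for real smoothing, and `|` stands in for the slide's `¦` boundary marker.

```python
def ccm_score(tags, brackets, p_b, p_yield, p_context):
    """P_CCM(B, F') = product over all spans (i, j) of
    P(B_ij) * P(y_ij | B_ij) * P(c_ij | B_ij).
    brackets: set of spans (i, j) marked constituent '+';
    every other span is a distituent '-'."""
    padded = ["|"] + tags + ["|"]  # '|' = sentence boundary
    n = len(tags)
    score = 1.0
    for i in range(n):
        for j in range(i + 1, n + 1):
            b = "+" if (i, j) in brackets else "-"
            y = tuple(tags[i:j])            # the span's yield
            c = (padded[i], padded[j + 1])  # its left/right context
            score *= (p_b[b]
                      * p_yield[b].get(y, 1e-6)
                      * p_context[b].get(c, 1e-6))
    return score

# Hypothetical toy parameters: only (DT, NN) is a known constituent yield.
p_b = {"+": 0.5, "-": 0.5}
p_yield = {"+": {("DT", "NN"): 0.5}, "-": {}}
p_context = {"+": {}, "-": {}}
tags = ["DT", "NN", "VBD"]
good = ccm_score(tags, {(0, 2)}, p_b, p_yield, p_context)  # bracket "DT NN"
bad = ccm_score(tags, {(1, 3)}, p_b, p_yield, p_context)   # bracket "NN VBD"
print(good > bad)  # True
```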

17 Intersected Model. The CFG+ and CCM components capture different aspects of syntax, so we train them jointly with intersected EM [Klein 2005; Liang et al. '06]:
P(T, F, F') = P_CCM(B(T), F') · P_CFG+(T, F)

18 Grammar Induction Experiments. Intersect CFG+ and CCM: add CCM brackets. Results:

19 Reacting to Errors: Possessive NPs. [Comparison: our tree vs. target tree.]

20 Reacting to Errors. Add category NP-POS with prototype NN POS. [New analysis.]

21 Error Analysis: Modal VPs. [Comparison: our tree vs. target tree.]

22 Reacting to Errors. Add category VP-INF with prototype VB NN. [New analysis.]

23 Fixing Errors. Supplement the prototypes with NP-POS and VP-INF. Results: unlabeled F1 78.2.

24 Results Summary. Labeled F1 (0–100 scale): PCFG 26.3, CCM 56.8, PROTO 62.2, Best 65.1; upper bound 86.1. Roughly a 50% error reduction from PCFG to Best.

25 Conclusion. Prototype-driven learning is a flexible, weakly supervised framework that merges distributional clustering techniques with structured models.

26 Thank You! http://www.cs.berkeley.edu/~aria42

27 Lots of numbers

28 More numbers

29 Yet more numbers

