Published by Tomas Sorrells. Modified over 2 years ago.

1
Learning and Inference for Hierarchically Split PCFGs
Slav Petrov and Dan Klein

2
The Game of Designing a Grammar
Annotation refines base treebank symbols to improve the statistical fit of the grammar:
- Parent annotation [Johnson ’98]

3
The Game of Designing a Grammar
Annotation refines base treebank symbols to improve the statistical fit of the grammar:
- Parent annotation [Johnson ’98]
- Head lexicalization [Collins ’99, Charniak ’00]

4
The Game of Designing a Grammar
Annotation refines base treebank symbols to improve the statistical fit of the grammar:
- Parent annotation [Johnson ’98]
- Head lexicalization [Collins ’99, Charniak ’00]
- Automatic clustering?

5
Learning Latent Annotations
EM algorithm:
- Brackets are known
- Base categories are known
- Only induce subcategories
Just like Forward-Backward for HMMs. [Matsuzaki et al. ’05]
[Figure: parse tree of "He was right." with latent-annotated nodes X1 to X7; arrows labeled Forward and Backward]
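Because the brackets and base categories are fixed, the E-step only has to sum over subcategory assignments at each tree node. The following is a minimal sketch of the inside pass (the tree analogue of the HMM backward pass) on one binarized tree. The rule tables, the toy probabilities, and the function name `inside` are all hypothetical and are not taken from the slides.

```python
from itertools import product

def inside(node, k, binary, lexical):
    """Return [P(words under node | label_x) for x in range(k)].

    node is (tag, word) for a preterminal or (label, left, right) otherwise.
    binary[(A, B, C)][x][y][z] = P(A_x -> B_y C_z)   (hypothetical table)
    lexical[(tag, word)][x]    = P(tag_x -> word)    (hypothetical table)
    """
    if len(node) == 2:                      # preterminal: look up emission scores
        tag, word = node
        return list(lexical[(tag, word)])
    label, left, right = node
    in_left = inside(left, k, binary, lexical)
    in_right = inside(right, k, binary, lexical)
    rule = binary[(label, left[0], right[0])]
    # Sum over all subcategory pairs (y, z) of the two children.
    return [
        sum(rule[x][y][z] * in_left[y] * in_right[z]
            for y, z in product(range(k), repeat=2))
        for x in range(k)
    ]

# Toy example with k = 2 subcategories per symbol (made-up numbers).
lexical = {("PRP", "He"): [0.5, 0.1], ("VBD", "was"): [0.2, 0.4]}
binary = {
    ("S", "PRP", "VBD"): [
        [[0.25, 0.25], [0.25, 0.25]],   # S_0 spreads mass over all (y, z)
        [[1.0, 0.0], [0.0, 0.0]],       # S_1 only rewrites to PRP_0 VBD_0
    ]
}
tree = ("S", ("PRP", "He"), ("VBD", "was"))
scores = inside(tree, 2, binary, lexical)
```

A matching outside pass would combine with these scores to give the expected rule counts that the M-step renormalizes.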

6
Overview
Limit of computational resources
- Hierarchical Training
- Adaptive Splitting
- Parameter Smoothing

7
Refinement of the DT tag
[Figure: DT split directly into DT-1, DT-2, DT-3, DT-4]

8
Refinement of the DT tag
[Figure: DT]

9
Hierarchical refinement of the DT tag
[Figure: DT]
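One way to picture a single round of hierarchical refinement is to copy every existing subcategory into two children and perturb the copies slightly so that EM can pull them apart. The sketch below does this for emission distributions only. The 1% noise level, the seed, and the toy DT distribution are assumptions made for illustration.

```python
import random

def split_in_two(emissions, noise=0.01, seed=0):
    """Split each subcategory into two slightly perturbed copies.

    emissions: list of dicts mapping word -> P(word | tag_x).
    Returns a list twice as long, each entry renormalized to sum to 1.
    """
    rng = random.Random(seed)
    result = []
    for dist in emissions:
        for _ in range(2):
            # Small multiplicative noise breaks the symmetry between the copies.
            perturbed = {w: p * (1.0 + noise * rng.uniform(-1.0, 1.0))
                         for w, p in dist.items()}
            total = sum(perturbed.values())
            result.append({w: p / total for w, p in perturbed.items()})
    return result

# Hypothetical starting point: one unsplit DT category.
dt = [{"the": 0.6, "a": 0.3, "this": 0.1}]
dt2 = split_in_two(dt)    # DT -> 2 subcategories
dt4 = split_in_two(dt2)   # 2 -> 4 subcategories
```

In a full training loop, each call would be followed by several EM iterations before the next split.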

10
Hierarchical Estimation Results
Model                  F1
Baseline               87.3
Hierarchical Training  88.4

11
Refinement of the "," tag
Splitting all categories the same amount is wasteful:
[Figure: hierarchical refinement of the "," tag]

12
Adaptive Splitting
- Want to split complex categories more
- Idea: split everything, roll back splits which were least useful
[Figure: likelihood with split vs. likelihood with split reversed]
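The roll-back step can be sketched as ranking splits by how much likelihood is lost when each one is reversed, then undoing the cheapest fraction. The log-likelihood values below are invented for illustration, and `splits_to_roll_back` is a hypothetical helper, not the authors' implementation.

```python
def splits_to_roll_back(log_lik_split, log_lik_reversed, fraction=0.5):
    """Return the splits whose reversal costs the least likelihood.

    log_lik_split:    training log-likelihood with all splits in place.
    log_lik_reversed: dict mapping split name -> log-likelihood with only
                      that split reversed.
    """
    loss = {s: log_lik_split - ll for s, ll in log_lik_reversed.items()}
    ranked = sorted(loss, key=loss.get)          # least useful split first
    return ranked[:int(len(ranked) * fraction)]

# Hypothetical numbers: reversing the "," split barely hurts, NP hurts a lot.
with_split = -1000.0
reversed_ll = {
    "DT-1/DT-2": -1040.0,
    ",-1/,-2": -1000.5,
    "NP-1/NP-2": -1100.0,
    ".-1/.-2": -1001.0,
}
rolled_back = splits_to_roll_back(with_split, reversed_ll)
```

With `fraction=0.5`, the two punctuation splits are undone while the DT and NP splits are kept.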

14
Adaptive Splitting Results
Model             F1
Previous          88.4
With 50% Merging  89.5

15
Number of Phrasal Subcategories

16
Number of Phrasal Subcategories
[Figure: subcategory counts per phrasal category; NP, VP, and PP highlighted]

17
Number of Phrasal Subcategories
[Figure: subcategory counts per phrasal category; NAC and X highlighted]

18
Number of Lexical Subcategories
[Figure: subcategory counts per lexical category; TO, ",", and POS highlighted]

19
[Figure: subcategory counts per lexical category; NNP, JJ, NNS, and NN highlighted]

20
Smoothing
- Heavy splitting can lead to overfitting
- Idea: Smoothing allows us to pool statistics
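One simple way to pool statistics is to shrink each subcategory's estimate toward the average over its sibling subcategories. The sketch below assumes linear interpolation with weight `alpha`; both the interpolation form and the values used are assumptions for illustration.

```python
def smooth(probs, alpha=0.01):
    """Interpolate each subcategory's probability with the sibling mean.

    probs[x] is the probability of the same rule under subcategory x
    (for example P(A_x -> B C) for each x), so the entries need not sum to 1.
    """
    mean = sum(probs) / len(probs)
    return [(1.0 - alpha) * p + alpha * mean for p in probs]

# Hypothetical rule probabilities under four subcategories of one symbol.
raw = [0.9, 0.1, 0.0, 0.2]
smoothed = smooth(raw, alpha=0.5)   # large alpha only to make the effect visible
```

The interpolation leaves the sibling mean unchanged, so it moves mass between subcategories without changing the overall estimate for the unsplit symbol.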

21
Result Overview
Model           F1
Previous        89.5
With Smoothing  90.7

22
Linguistic Candy
Proper Nouns (NNP):
NNP-14  Oct.   Nov.       Sept.
NNP-12  John   Robert     James
NNP-2   J.     E.         L.
NNP-1   Bush   Noriega    Peters
NNP-15  New    San        Wall
NNP-3   York   Francisco  Street
Personal pronouns (PRP):
PRP-0   It     He         I
PRP-1   it     he         they
PRP-2   it     them       him

23
Comparative adverbs (RBR):
RBR-0   further  lower    higher
RBR-1   more     less     More
RBR-2   earlier  Earlier  later
Cardinal Numbers (CD):
CD-7    one      two      Three
CD-4    1989     1990     1988
CD-11   million  billion  trillion
CD-0    1        50       100
CD-3    1        30       31
CD-9    78       58       34

24
Inference
Example sentence: "She heard the noise."
Exhaustive parsing: 1 min per sentence

25
Coarse-to-Fine Parsing [Goodman ’97, Charniak & Johnson ’05]
[Figure: Treebank → coarse grammar (NP … VP) → Parse → Prune → refined grammar (NP-17, NP-12, NP-1, VP-6, VP-31, …) → Parse]

26
Hierarchical Pruning
Consider again the span 5 to 12:
coarse:          … QP NP VP …
split in two:    … QP1 QP2 NP1 NP2 VP1 VP2 …
split in four:   … QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …
split in eight:  … … … … …
[Figure: items below the threshold (< t) are pruned at each level]
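For a single span, one level of hierarchical pruning can be sketched as: drop every symbol whose score falls below the threshold t, then allow only the refinements of the survivors at the next level. The scores, the threshold value, and the helper names below are hypothetical.

```python
def prune(scores, t):
    """Keep the symbols whose score for this span is at least t."""
    return {sym for sym, p in scores.items() if p >= t}

def allowed_refinements(survivors, children):
    """Expand each surviving symbol into its subcategories at the next level."""
    return {child for sym in survivors for child in children[sym]}

# Hypothetical scores for the span 5 to 12 under the coarse grammar.
coarse_scores = {"QP": 1e-6, "NP": 0.7, "VP": 0.2}
children = {
    "QP": ["QP1", "QP2"],
    "NP": ["NP1", "NP2"],
    "VP": ["VP1", "VP2"],
}
survivors = prune(coarse_scores, t=1e-4)
next_level = allowed_refinements(survivors, children)
```

Here QP is pruned at the coarse level, so QP1 and QP2 are never scored when parsing with the grammar that is split in two.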

27
Intermediate Grammars
Learning: X-Bar = G0, G1, G2, G3, G4, G5, G6 = G
[Figure: DT hierarchy across grammars: DT; DT1, DT2; DT1 to DT4; DT1 to DT8]

28
Projected Grammars
Learning: X-Bar = G0, G1, G2, G3, G4, G5, G6 = G
Projection πi: π0(G), π1(G), π2(G), π3(G), π4(G), π5(G), G
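A projection collapses refined subcategories back onto a coarser symbol set. As a rough sketch, the emission probabilities of a coarse symbol can be computed as a weighted average over its subcategories. The weights here stand in for how often each subcategory is expected to occur; how those weights are obtained is not shown, and all names and numbers are assumptions.

```python
from collections import defaultdict

def project_emissions(emissions, weights, parent):
    """Collapse subcategory emissions onto their coarse parents.

    emissions[sub][word] = P(word | sub)
    weights[sub]         = relative frequency of sub (hypothetical input)
    parent[sub]          = coarse symbol that sub projects to
    """
    totals = defaultdict(float)
    mass = defaultdict(lambda: defaultdict(float))
    for sub, dist in emissions.items():
        coarse = parent[sub]
        totals[coarse] += weights[sub]
        for word, p in dist.items():
            mass[coarse][word] += weights[sub] * p
    # Normalize by the total weight of each coarse symbol.
    return {c: {w: m / totals[c] for w, m in words.items()}
            for c, words in mass.items()}

# Hypothetical two-way split of DT, projected back to a single DT.
emissions = {"DT1": {"the": 0.8, "a": 0.2}, "DT2": {"the": 0.0, "a": 1.0}}
weights = {"DT1": 3.0, "DT2": 1.0}
parent = {"DT1": "DT", "DT2": "DT"}
projected = project_emissions(emissions, weights, parent)
```

The same weighted-average idea would extend to binary rules, with the parent subcategory's weight applied to each refined rule.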

29
Final Results (Efficiency)
Parsing the development set (1600 sentences):
- Berkeley Parser: 10 min (implemented in Java)
- Charniak & Johnson ’05 Parser: 19 min (implemented in C)

30
Final Results (Accuracy)
Language  Parser                               F1 (≤ 40 words)  F1 (all)
ENG       Charniak & Johnson ’05 (generative)  90.1             89.6
ENG       This Work                            90.6             90.1
GER       Dubey ’05                            76.3             -
GER       This Work                            80.8             80.1
CHN       Chiang et al. ’02                    80.0             76.6
CHN       This Work                            86.3             83.4

31
Extensions
- Acoustic modeling [Petrov, Pauls & Klein ’07]
- Infinite Grammars, Nonparametric Bayesian Learning [Liang, Petrov, Jordan & Klein ’07]

32
Conclusions
- Split & Merge Learning
  - Hierarchical Training
  - Adaptive Splitting
  - Parameter Smoothing
- Hierarchical Coarse-to-Fine Inference
  - Projections
  - Marginalization
- Multi-lingual Unlexicalized Parsing

33
Thank You!
http://nlp.cs.berkeley.edu
