Published by Tomas Sorrells. Modified over 2 years ago.

1
Learning and Inference for Hierarchically Split PCFGs
Slav Petrov and Dan Klein

4
The Game of Designing a Grammar
Annotation refines base treebank symbols to improve statistical fit of the grammar:
- Parent annotation [Johnson ’98]
- Head lexicalization [Collins ’99, Charniak ’00]
- Automatic clustering?

5
Learning Latent Annotations [Matsuzaki et al. ’05]
EM algorithm:
- Brackets are known
- Base categories are known
- Only induce subcategories
Just like Forward-Backward for HMMs.
[Figure: parse tree over "He was right." with latent annotation variables X1 to X7 at its nodes, labeled Forward and Backward]
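
The upward half of this computation can be sketched on a toy example. The block below is a minimal illustration, not the authors' implementation: the tree labels (S, NP, VP, VBD, ADJP), the uniform probabilities, and the names `inside`, `uniform_rule`, and `uniform_word` are all assumptions made for the example. Because the brackets and base categories are fixed, the only thing summed over is the latent subcategory at each node.

```python
# Toy sketch: inside scores over a FIXED tree whose nodes carry K latent subcategories.
# Tree labels and probabilities are illustrative assumptions, not values from the talk.

def inside(node, rule_prob, word_prob, k):
    """Return k inside scores for `node`, one per latent subcategory."""
    label = node[0]
    if isinstance(node[1], str):
        # Preterminal (tag, word): probability of the word under each subcategory.
        return [word_prob(label, x, node[1]) for x in range(k)]
    left, right = node[1], node[2]
    in_left = inside(left, rule_prob, word_prob, k)
    in_right = inside(right, rule_prob, word_prob, k)
    scores = []
    for x in range(k):
        total = 0.0
        # Sum over the subcategories y and z of the two children.
        for y in range(k):
            for z in range(k):
                p = rule_prob(label, x, left[0], y, right[0], z)
                total += p * in_left[y] * in_right[z]
        scores.append(total)
    return scores

K = 2
tree = ("S", ("NP", "He"), ("VP", ("VBD", "was"), ("ADJP", "right")))
uniform_rule = lambda a, x, b, y, c, z: 1.0 / (K * K)  # uniform over (y, z)
uniform_word = lambda tag, x, word: 0.5                # hypothetical lexicon
root_scores = inside(tree, uniform_rule, uniform_word, K)
```

A matching downward (outside) pass would combine with these scores to give posterior counts for each subcategory, in the same way the forward and backward passes combine in an HMM.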

6
Overview
Limit of computational resources
- Hierarchical Training
- Adaptive Splitting
- Parameter Smoothing

7
Refinement of the DT tag
[Figure: DT refined into DT-1, DT-2, DT-3, DT-4]

8
Refinement of the DT tag
[Figure: DT]

9
Hierarchical refinement of the DT tag
[Figure: DT]

10
Hierarchical Estimation Results
Model                  F1
Baseline               87.3
Hierarchical Training  88.4
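
One way to picture a single step of hierarchical training is to split every existing subcategory in two and let EM refine the result. The sketch below covers only the splitting step. The function name `split_in_two`, the 1% noise level, and the example DT word distribution are assumptions for illustration, not details taken from the slides.

```python
import random

def split_in_two(emissions, noise=0.01, seed=0):
    """Split each subcategory into two slightly perturbed, renormalized copies.

    `emissions` is a list of {word: probability} dicts, one per subcategory.
    The small random perturbation breaks the symmetry between the two copies.
    """
    rng = random.Random(seed)
    refined = []
    for dist in emissions:
        for _ in range(2):
            noisy = {w: p * (1.0 + noise * rng.uniform(-1.0, 1.0))
                     for w, p in dist.items()}
            total = sum(noisy.values())
            refined.append({w: p / total for w, p in noisy.items()})
    return refined

# Hypothetical starting distribution for the unsplit DT tag.
dt = [{"the": 0.6, "a": 0.3, "this": 0.1}]
dt2 = split_in_two(dt)            # DT split in two
dt4 = split_in_two(dt2, seed=1)   # each half split again: four subcategories
```

Repeating the split after each round of EM would give 2, 4, 8, ... subcategories per tag, each round starting from the previous round's parameters.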

11
Refinement of the "," tag
Splitting all categories the same amount is wasteful.

12
Adaptive Splitting
- Want to split complex categories more
- Idea: split everything, roll back splits which were least useful
- Ratio used to judge a split: (likelihood with split reversed) / (likelihood with split)
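
The roll-back decision can be sketched as a ranking problem. The block below is a minimal illustration under stated assumptions: the function name `splits_to_roll_back` and the likelihood values are made up, and a real system would likely work with log-likelihoods to avoid underflow. A ratio close to 1 means that reversing the split costs little likelihood, so that split is a candidate for merging.

```python
def splits_to_roll_back(likelihoods, fraction=0.5):
    """Pick the least useful splits to reverse.

    `likelihoods` maps a split name to a pair
    (likelihood with split, likelihood with split reversed).
    """
    # Ratio near 1.0: the split barely helps. Ratio near 0.0: the split helps a lot.
    ratio = {name: reversed_lik / split_lik
             for name, (split_lik, reversed_lik) in likelihoods.items()}
    ranked = sorted(ratio, key=ratio.get, reverse=True)
    return ranked[:int(len(ranked) * fraction)]

# Hypothetical likelihoods for four candidate splits.
example = {
    "DT": (1e-10, 2e-11),
    ",":  (1e-10, 9.9e-11),
    "NP": (1e-10, 1e-12),
    "X":  (1e-10, 9.5e-11),
}
rolled_back = splits_to_roll_back(example, fraction=0.5)
```

With `fraction=0.5`, half of the splits are reversed, so categories such as "," end up with fewer subcategories than categories such as NP.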

14
Adaptive Splitting Results
Model             F1
Previous          88.4
With 50% Merging  89.5

15
Number of Phrasal Subcategories

16
Number of Phrasal Subcategories
[Figure: highlighted categories NP, VP, PP]

17
Number of Phrasal Subcategories
[Figure: highlighted categories X, NAC]

18
Number of Lexical Subcategories
[Figure: highlighted tags TO, ",", POS]

19
[Figure: highlighted tags NN, NNS, NNP, JJ]

20
Smoothing
- Heavy splitting can lead to overfitting
- Idea: Smoothing allows us to pool statistics
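
The slide does not say how statistics are pooled. One plausible reading, sketched below purely as an assumption, is linear interpolation: each subcategory's probability for a given rule is pulled toward the average over all subcategories of the same base category. The function name `smooth` and the weight `alpha` are hypothetical.

```python
def smooth(probs, alpha=0.01):
    """Shrink each subcategory's probability toward the mean over its siblings.

    `probs` holds the probability of the same rule under each subcategory
    of one base category; `alpha` controls how strongly statistics are pooled.
    """
    mean = sum(probs) / len(probs)
    return [(1.0 - alpha) * p + alpha * mean for p in probs]

# Hypothetical probabilities of one rule under three subcategories.
original = [0.9, 0.1, 0.2]
smoothed = smooth(original, alpha=0.5)
```

The interpolation leaves the average unchanged and narrows the spread, so a subcategory with few observations borrows strength from its siblings.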

21
Result Overview
Model           F1
Previous        89.5
With Smoothing  90.7

22
Linguistic Candy
Proper Nouns (NNP):
NNP-14  Oct.   Nov.       Sept.
NNP-12  John   Robert     James
NNP-2   J.     E.         L.
NNP-1   Bush   Noriega    Peters
NNP-15  New    San        Wall
NNP-3   York   Francisco  Street
Personal pronouns (PRP):
PRP-0   It     He         I
PRP-1   it     he         they
PRP-2   it     them       him

23
Relative adverbs (RBR):
RBR-0  further  lower    higher
RBR-1  more     less     More
RBR-2  earlier  Earlier  later
Cardinal Numbers (CD):
CD-7   one      two      Three
CD-4   1989     1990     1988
CD-11  million  billion  trillion
CD-0   1        50       100
CD-3   1        30       31
CD-9   78       58       34

24
Inference
Example sentence: She heard the noise.
Exhaustive parsing: 1 min per sentence

25
Coarse-to-Fine Parsing [Goodman ’97, Charniak & Johnson ’05]
[Figure: the treebank yields a coarse grammar (NP … VP) and a refined grammar (NP-17, NP-12, NP-1, VP-6, VP-31, …); parse with the coarse grammar, prune, then parse with the refined grammar]

26
Hierarchical Pruning
Consider again the span 5 to 12:
coarse:          … QP NP VP …
split in two:    … QP1 QP2 NP1 NP2 VP1 VP2 …
split in four:   … QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …
split in eight:  … … … … … … … … … … … … … … …
Pruning criterion shown in the figure: < t
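
One pruning step for a single span can be sketched as follows. This is an illustration under assumptions, not the parser's actual code: the posterior values, the threshold, and the function names `prune` and `children` are made up, and subcategories are indexed from 0 here rather than from 1 as on the slide.

```python
def prune(posteriors, threshold):
    """Keep the symbols whose posterior for this span is at least `threshold`."""
    return {sym for sym, p in posteriors.items() if p >= threshold}

def children(symbols):
    """Each surviving subcategory (label, i) is refined into (label, 2i) and (label, 2i+1)."""
    return {(label, 2 * i + b) for (label, i) in symbols for b in (0, 1)}

# Hypothetical posteriors for one span under the coarse grammar.
coarse_posteriors = {("QP", 0): 0.30, ("NP", 0): 0.69, ("VP", 0): 1e-6}
t = 1e-4  # hypothetical pruning threshold

survivors = prune(coarse_posteriors, t)
next_level_candidates = children(survivors)
```

Only the refinements of surviving symbols are scored at the next level, so a symbol pruned under the coarse grammar is never expanded into its two, four, or eight subcategories.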

27
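Intermediate Grammars
[Figure: learning produces a sequence of grammars, starting from X-Bar = G0 and continuing through G1, G2, G3, G4, G5, G6, with G the final grammar; alongside, the DT tag is refined from DT to DT1, DT2, then DT1 to DT4, then DT1 to DT8]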

28
Projected Grammars
[Figure: learning produces G1, G2, G3, G4, G5, G6 starting from X-Bar = G0, with G the final grammar; projection i maps G back to coarser grammars 0(G), 1(G), 2(G), 3(G), 4(G), 5(G)]
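
The bookkeeping behind a projection can be sketched with integer indices. The block below assumes a perfectly regular binary split hierarchy in which subcategory i at one level has children 2i and 2i+1 at the next; adaptive merging would break that regularity. The function names `ancestor` and `project_mass` are hypothetical. The sketch only sums probability mass over descendants. Estimating the rule probabilities of a projected grammar takes more than this and is not shown.

```python
def ancestor(index, fine_level, coarse_level):
    """Index of the level-`coarse_level` ancestor of subcategory `index` at `fine_level`."""
    return index >> (fine_level - coarse_level)

def project_mass(mass, fine_level, coarse_level):
    """Sum the mass of all fine subcategories that share a coarse ancestor."""
    projected = [0.0] * (2 ** coarse_level)
    for index, m in enumerate(mass):
        projected[ancestor(index, fine_level, coarse_level)] += m
    return projected

# Hypothetical mass over the 8 subcategories of DT at level 3.
dt_level3 = [0.1, 0.2, 0.05, 0.05, 0.3, 0.1, 0.1, 0.1]
dt_level1 = project_mass(dt_level3, fine_level=3, coarse_level=1)
dt_level0 = project_mass(dt_level3, fine_level=3, coarse_level=0)
```

Projecting to level 0 collapses everything back to the single unsplit symbol, which corresponds to the X-Bar end of the hierarchy.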

29
Final Results (Efficiency)
Parsing the development set (1600 sentences):
- Berkeley Parser: 10 min, implemented in Java
- Charniak & Johnson ’05 Parser: 19 min, implemented in C

30
Final Results (Accuracy)
Language  Parser                               ≤ 40 words F1  all F1
ENG       Charniak & Johnson ’05 (generative)  90.1           89.6
ENG       This Work                            90.6           90.1
GER       Dubey ’05                            76.3           -
GER       This Work                            80.8           80.1
CHN       Chiang et al. ’02                    80.0           76.6
CHN       This Work                            86.3           83.4

31
Extensions
- Acoustic modeling [Petrov, Pauls & Klein ’07]
- Infinite Grammars: Nonparametric Bayesian Learning [Liang, Petrov, Jordan & Klein ’07]

32
Conclusions
- Split & Merge Learning
  - Hierarchical Training
  - Adaptive Splitting
  - Parameter Smoothing
- Hierarchical Coarse-to-Fine Inference
  - Projections
  - Marginalization
- Multi-lingual Unlexicalized Parsing

33
Thank You!
http://nlp.cs.berkeley.edu
