Published by Paul Hodde. Modified over 2 years ago.

1
Incrementally Learning Parameters of Stochastic CFG using Summary Stats
Written by: Brent Heeringa, Tim Oates

2
Goal: to learn the syntax of utterances
Approach: SCFG (Stochastic Context-Free Grammar) M = <V, E, R, S>
–V: finite set of non-terminals
–E: finite set of terminals
–R: finite set of rules; each rule r has a probability p(r), and the p(r) of all rules with the same left-hand side sum to 1
–S: start symbol
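
A minimal sketch of how such a grammar could be held in memory, and how the sum-to-1 constraint on each left-hand side can be checked. The toy rules and the names `RULES`, `lhs_totals`, and `is_proper` are illustrative assumptions, not part of the presentation.

```python
from collections import defaultdict

# Hypothetical toy grammar: (left-hand side, right-hand side tuple) -> p(r).
RULES = {
    ("S", ("A", "B")): 1.0,
    ("A", ("a",)): 0.7,
    ("A", ("A", "A")): 0.3,
    ("B", ("b",)): 1.0,
}


def lhs_totals(rules):
    """Sum p(r) over all rules that share a left-hand side."""
    totals = defaultdict(float)
    for (lhs, _rhs), p in rules.items():
        totals[lhs] += p
    return dict(totals)


def is_proper(rules, tol=1e-9):
    """True if the probabilities for every left-hand side sum to 1."""
    return all(abs(total - 1.0) <= tol for total in lhs_totals(rules).values())
```

Calling `is_proper(RULES)` on the toy grammar returns `True`; lowering one of the `A` probabilities without adjusting the other makes it return `False`.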

3
Problems with most SCFG Learning Algorithms
1) Expensive storage: a corpus of complete sentences must be stored
2) Time-consuming: the algorithms need repeated passes over all the data

4
Learning SCFG
–Inducing context-free structure from a corpus of sentences
–Learning the production (rule) probabilities

5
Learning SCFG – Cont
General method: Inside/Outside algorithm
–Expectation-Maximization (EM)
–Find the expected number of times each rule is used
–Maximize the likelihood given both the expectations and the corpus
Disadvantages of the Inside/Outside algorithm
–The entire sentence corpus must be stored using some representation (e.g. a chart parse)
–Expensive storage (unrealistic for a human agent!)
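
The maximization step above can be sketched as a normalization of expected rule counts per left-hand side. This is only the standard M-step; the E-step (the inside and outside computations over the stored corpus) is omitted, and the counts below are made-up numbers for illustration. The function name `m_step` is an assumption.

```python
from collections import defaultdict


def m_step(expected_counts):
    """Re-estimate rule probabilities from expected rule counts."""
    totals = defaultdict(float)
    for (lhs, _rhs), count in expected_counts.items():
        totals[lhs] += count
    # Each rule's new probability is its share of its left-hand side's total.
    return {
        rule: (count / totals[rule[0]] if totals[rule[0]] > 0 else 0.0)
        for rule, count in expected_counts.items()
    }


# Hypothetical expected counts, standing in for the output of an E-step.
counts = {("A", ("a",)): 6.0, ("A", ("A", "A")): 2.0, ("B", ("b",)): 4.0}
probs = m_step(counts)
```

Because the expected counts depend on every sentence, each EM iteration needs access to the whole corpus, which is the storage cost the slide points out.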

6
Proposed Algorithm
Use Unique Normal Form (UNF)
–Replace each terminal production A -> z with 2 new rules:
 A -> D, with p[A -> D] = p[A -> z]
 D -> z, with p[D -> z] = 1
–No two productions have the same right-hand side
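
A rough sketch of the terminal-rewriting step described above, assuming rules are stored as a dictionary from (left-hand side, right-hand side tuple) to probability. The fresh non-terminal names `D1`, `D2`, ... and the helper names are hypothetical. The uniqueness condition is checked separately, since this sketch does not assume the rewrite alone guarantees it.

```python
def to_unf_step(rules, nonterminals):
    """Rewrite each terminal production A -> z as A -> D and D -> z."""
    new_rules = {}
    fresh = 0
    for (lhs, rhs), p in rules.items():
        is_terminal_rule = len(rhs) == 1 and rhs[0] not in nonterminals
        if is_terminal_rule:
            fresh += 1
            d = f"D{fresh}"  # hypothetical name; assumed not already in use
            new_rules[(lhs, (d,))] = p   # p[A -> D] = p[A -> z]
            new_rules[(d, rhs)] = 1.0    # p[D -> z] = 1
        else:
            new_rules[(lhs, rhs)] = p
    return new_rules


def has_unique_rhs(rules):
    """True if no two productions share a right-hand side."""
    rhs_list = [rhs for (_lhs, rhs) in rules]
    return len(rhs_list) == len(set(rhs_list))


# Hypothetical toy grammar used only to exercise the rewrite.
toy = {
    ("S", ("A", "B")): 1.0,
    ("A", ("a",)): 0.7,
    ("A", ("A", "A")): 0.3,
    ("B", ("b",)): 1.0,
}
unf = to_unf_step(toy, {"S", "A", "B"})
```

After the rewrite, terminals only appear in rules with probability 1, so all the learnable probability mass sits on rules whose right-hand sides are non-terminals.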

7
Learning SCFG – Proposed Algorithm – cont
Use Histograms
–Each rule r has 2 histograms (H^O_r, H^L_r)

8
Proposed Algorithm – cont
–H^O_r: constructed when parsing the sentences in O
–H^L_r: continues to be updated throughout the learning process
–H^L_r is rescaled to a fixed size h
 –Why?! Recently used rules have more impact on the histogram
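
One way to see why rescaling to a fixed size favors recent use is the sketch below. It assumes, purely for illustration, that a histogram bin is the number of times a rule was used in one parse and that rescaling multiplies every bin by the same factor so the total mass equals h. Neither detail is stated on the slide, and `update_histogram` is a hypothetical name.

```python
def update_histogram(hist, uses, h):
    """Record one observation, then rescale so the total mass is at most h."""
    hist = dict(hist)  # work on a copy so the caller's histogram is untouched
    hist[uses] = hist.get(uses, 0.0) + 1.0
    total = sum(hist.values())
    if total > h:
        scale = h / total
        hist = {k: v * scale for k, v in hist.items()}
    return hist


# Two old observations of "used 0 times", then two recent ones of "used 3 times".
hist = {}
for uses in (0, 0, 3, 3):
    hist = update_histogram(hist, uses, h=2.0)
```

Both values were observed twice, yet the bin for the more recent value ends up heavier, because every rescale shrinks the older observations again.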

9
Comparing H^L_r and H^O_r
Relative entropy T
–T decreases: increase the probability of the rules used
 (if s is large, increase the probability of the rules used when parsing the last sentence)
–T increases: decrease the probability of the rules used (e.g. p_{t+1}(r) = 0.01 * p_t(r))
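
The relative entropy between two histograms can be sketched as the Kullback-Leibler divergence of their normalized bins. The direction of the divergence, the small smoothing constant `eps` for empty bins, and the way per-rule values would be combined into T are all assumptions here, not details taken from the slide.

```python
import math


def relative_entropy(hist_p, hist_q, eps=1e-9):
    """KL divergence D(P || Q) between two histograms after normalizing them."""
    bins = set(hist_p) | set(hist_q)
    # eps keeps empty bins from producing log(0) or division by zero.
    total_p = sum(hist_p.values()) + eps * len(bins)
    total_q = sum(hist_q.values()) + eps * len(bins)
    divergence = 0.0
    for b in bins:
        p = (hist_p.get(b, 0.0) + eps) / total_p
        q = (hist_q.get(b, 0.0) + eps) / total_q
        divergence += p * math.log(p / q)
    return divergence


same = relative_entropy({0: 1.0, 1: 1.0}, {0: 1.0, 1: 1.0})
different = relative_entropy({0: 3.0, 1: 1.0}, {0: 1.0, 1: 3.0})
```

Identical histograms give a divergence of zero, and the value grows as the learner's histogram drifts away from the observed one, which is the signal the update rule above reacts to.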

10
Comparing the Inside/Outside Algorithm with the Proposed Algorithm
Inside/Outside
–O(n^3)
–Good: 3-5 iterations
–Bad: needs to store the complete sentence corpus
Proposed Algorithm
–O(n^3)
–Bad: 500-1000 iterations
–Good: memory requirement is constant!
