Published by Elaine Wyatt. Modified over 2 years ago.

1
Bayesian Learning of Non-Compositional Phrases with Synchronous Parsing
Hao Zhang; Chris Quirk; Robert C. Moore; Daniel Gildea
Zhonghua Li
Mentor: Jun Lang
2011-10-21
I2R SMT-Reading Group

2
Paper info
Bayesian Learning of Non-Compositional Phrases with Synchronous Parsing
ACL-08 long paper
Cited: 37
Authors: Hao Zhang, Chris Quirk, Robert C. Moore, Daniel Gildea

3
Core Ideas
Variational Bayes
Tic-tac-toe pruning
Word-to-phrase bootstrapping

4
Outline
Paper presentation
– Pipeline
– Model
– Training
– Parsing (Pruning)
– Results
Shortcomings
Discussion

5
Summary of the Pipeline
Run IBM Model 1 on sentence-aligned data
Use tic-tac-toe pruning to prune the bitext space
Train a word-based ITG with Variational Bayes and extract the Viterbi alignment
Apply non-compositional constraints to restrict the space of phrase pairs
Train a phrasal ITG with VB, then run a Viterbi pass to get the phrasal alignment

6
Phrasal Inversion Transduction Grammar
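A hedged sketch of the grammar this slide refers to, assuming the usual phrasal ITG with one nonterminal X that has three right-hand sides (matching the remark on the M-step slide that s = 3) and a pre-terminal C that emits phrase pairs. The exact rule set is an assumption here, not a copy of the slide:

```latex
\begin{align*}
S &\rightarrow X \\
X &\rightarrow [X\ X] \;\mid\; \langle X\ X \rangle \;\mid\; C \\
C &\rightarrow e/f
\end{align*}
```

Here [X X] keeps the two children in the same order in both languages, ⟨X X⟩ reverses their order in one language, and e/f stands for a phrase pair.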

7
Dirichlet Prior for Phrasal ITG

8
Review: Inside-Outside Algorithm
[Figure: chain of states X_1 … X_{n-1}, Z_n, X_{n+1} … X_N; parse tree whose root spans 0/0 to T/V, with an inner node i spanning s/u to t/v]
Forward-backward algorithm: used not only for HMMs but for any state space model
The forward-backward algorithm is the special case of the inside-outside algorithm for linear-chain models.
Source: Shujie Liu

9
VB Algorithm for Training SITGs - E1
Inside probabilities: initialization, recursion
[Figure: node i spanning s/u-t/v, split at S/U into children j (s/u-S/U) and k (S/U-t/v)]
Copied from Liu
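The inside pass can be illustrated in code. This is a minimal sketch, assuming a toy word-level bracketing ITG with one nonterminal and one-to-one terminal pairs only; the function name, the three rule probabilities, and the `lex` table are hypothetical, and null alignments and multi-word phrase pairs are left out.

```python
from functools import lru_cache


def inside_probability(src, tgt, p_straight, p_inverted, p_terminal, lex):
    """Inside probability of a sentence pair under a toy bracketing ITG.

    Assumed rules (illustrative only):
      X -> [X X]  with probability p_straight
      X -> <X X>  with probability p_inverted
      X -> e/f    with probability p_terminal * lex[(e, f)], one word each
    """
    T, V = len(src), len(tgt)

    @lru_cache(maxsize=None)
    def beta(s, t, u, v):
        # Initialization: a 1x1 cell is generated by a terminal rule.
        if t - s == 1 and v - u == 1:
            return p_terminal * lex.get((src[s], tgt[u]), 0.0)
        total = 0.0
        # Recursion: sum over every split point S/U strictly inside the bispan.
        for S in range(s + 1, t):
            for U in range(u + 1, v):
                # Straight: children cover s/u-S/U and S/U-t/v.
                total += p_straight * beta(s, S, u, U) * beta(S, t, U, v)
                # Inverted: the left source half pairs with the right target half.
                total += p_inverted * beta(s, S, U, v) * beta(S, t, u, U)
        return total

    return beta(0, T, 0, V)
```

The memoized recursion visits O(T²V²) bispans and O(TV) split points per bispan, which is why the pruning steps discussed later matter in practice.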

10
VB Algorithm for Training SITGs - E2
Outside probabilities: initialization, recursion
[Figure: bispan diagram with corner points s/u, S/U, t/v and nodes j (s/u-t/v), k (S/U-s/u), i (s/u-t/v)]
Copied from Liu

14
VB Algorithm for Training SITGs - E2
Outside probabilities: initialization, recursion
[Figure, case 1: corner points s/u, S/U, t/v; nodes j (s/u-t/v), k (S/U-s/u), i (s/u-t/v)]
[Figure, case 2: corner points S/U, s/u, t/v; nodes j (s/u-t/v), i (S/U-s/u), k (s/u-t/v)]
Copied from Liu

16
VB Algorithm for Training SITGs - M
s = 3 is the number of right-hand sides for X
m is the number of observed phrase pairs
ψ is the digamma function
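The role of the digamma function can be shown with a small sketch. It assumes the standard mean-field VB update for a multinomial with a symmetric Dirichlet prior, where each rule weight is exp(ψ(count + α)) divided by exp(ψ(total count + K·α)); K would be s = 3 for the X rules and m for the phrase-pair rules. The function names and the pure-Python `digamma` approximation are illustrative choices, not taken from the slides.

```python
import math


def digamma(x):
    """Digamma psi(x) for x > 0, via recurrence plus an asymptotic series."""
    result = 0.0
    # Shift x upward with psi(x) = psi(x + 1) - 1/x until the series is accurate.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # psi(x) ~ ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += math.log(x) - 0.5 * inv
    result -= inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result


def vb_m_step(expected_counts, alpha):
    """Map expected rule counts to VB rule weights (assumed symmetric prior)."""
    k = len(expected_counts)  # number of right-hand sides sharing one parent
    log_norm = digamma(sum(expected_counts.values()) + k * alpha)
    return {rule: math.exp(digamma(count + alpha) - log_norm)
            for rule, count in expected_counts.items()}
```

Unlike the EM update, the resulting weights sum to less than one, and rules with small expected counts are penalized disproportionately, which is what pushes the model toward a sparse set of phrase pairs when α is tiny.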

17
Pruning
Tic-tac-toe pruning (Hao Zhang 2005)
Fast tic-tac-toe pruning (Hao Zhang 2008)
High-precision alignment pruning (Haghighi ACL 2009)
– Prune all bitext cells that would invalidate more than 8 of the high-precision alignments
1-1 alignment posterior pruning (Haghighi ACL 2009)
– Prune all 1-1 bitext cells that have a posterior below 10^-4 in both HMM models

18
Tic-tac-toe pruning (Hao Zhang 2005)

19
Non-compositional Phrases Constraint
e(i,j): number of links emitted from the substring e_i … e_j
f(l,m): number of links emitted from the substring f_l … f_m
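The two counts can be turned into a validity test for a bispan. The sketch below shows one plausible reading, assumed here rather than taken from the slide: a bispan is kept only if every link emitted from e_i … e_j lands inside f_l … f_m and vice versa, so both counts equal the number of links inside the box. Function names are hypothetical, spans are inclusive, and how empty bispans are treated is left open.

```python
def link_counts(links, i, j, l, m):
    """Return (e(i,j), f(l,m), links inside the bispan) for inclusive spans.

    links is a set of (source_index, target_index) pairs taken from the
    Viterbi word alignment.
    """
    e_ij = sum(1 for (a, b) in links if i <= a <= j)
    f_lm = sum(1 for (a, b) in links if l <= b <= m)
    inside = sum(1 for (a, b) in links if i <= a <= j and l <= b <= m)
    return e_ij, f_lm, inside


def is_allowed_bispan(links, i, j, l, m):
    """Assumed constraint: no alignment link crosses the bispan boundary."""
    e_ij, f_lm, inside = link_counts(links, i, j, l, m)
    # Note: a bispan with no links at all also passes this check.
    return e_ij == f_lm == inside
```

The linear scans keep the sketch short; with prefix sums over the alignment matrix each count could be read off in constant time.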

20
Word Alignment Evaluation
EM and VB are each trained for 10 iterations.
EM: the lowest AER, 0.40, is reached after the second iteration; by iteration 10 the AER rises to 0.42.
VB: with α_C = 1e-9, the AER is close to 0.35 at iteration 10.
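For reference, AER can be computed as below. The sketch assumes the usual definition with sure links S and possible links P, AER = 1 - (|A∩S| + |A∩P|) / (|A| + |S|); the function name and the link representation as (source, target) index pairs are illustrative.

```python
def alignment_error_rate(hypothesis, sure, possible):
    """Alignment error rate; lower is better.

    hypothesis, sure, possible are iterables of (source, target) links.
    Sure links are always counted as possible as well.
    """
    a = set(hypothesis)
    s = set(sure)
    p = set(possible) | s
    # Undefined (division by zero) when both hypothesis and sure sets are empty.
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))
```

For example, a hypothesis that recovers every sure link and adds only links marked possible scores an AER of 0.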

21
End-to-end Evaluation
NIST Chinese-English training data
NIST 2002 evaluation datasets for tuning and evaluation
The 10-reference development set was used for MERT
The 4-reference test set was used for evaluation

22
Shortcomings
Grammar is not perfect
ITG ordering is context-independent
Phrase pairs are sparse

23
Grammar is not perfect
Over-counting problem: alternative ITG parse trees can yield the same word alignment, so one alignment is counted once per tree.
[Figure: several trees in the ITG parse tree space map to one point in the word alignment space; example sentence: "I am rich !"]

24
A better-constrained grammar
A series of nested constituents with the same orientation always has a left-heavy derivation, so the second parse tree of the previous example is not generated.
C -> 1/3   C -> 2/4   C -> 3/2   C -> 4/1
A -> [C C]
B -> ?
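The effect of the left-heavy constraint can be checked by counting derivations. The sketch below assumes the labels used on the slide (C for a terminal pair, A for a straight node, B for an inverted node) and assumes the constraint takes the form "the right child of a node may not have the same orientation as its parent"; the full rule set is not shown on the slide, so this is an illustration rather than the presenter's grammar. The alignment is given as a permutation, where source word i aligns to target position perm[i] (0-based).

```python
from functools import lru_cache


def count_derivations(perm, canonical):
    """Count ITG derivations that yield the alignment source i -> perm[i]."""
    n = len(perm)

    def target_range(i, j):
        block = perm[i:j]
        lo, hi = min(block), max(block)
        # A source span is a constituent only if its targets are contiguous.
        return (lo, hi) if hi - lo == j - i - 1 else None

    @lru_cache(maxsize=None)
    def count(i, j, label):
        if j - i == 1:
            return 1 if label == "C" else 0
        if label == "C":
            return 0
        total = 0
        for k in range(i + 1, j):
            left, right = target_range(i, k), target_range(k, j)
            if left is None or right is None:
                continue
            straight = left[1] + 1 == right[0]
            inverted = right[1] + 1 == left[0]
            if (label == "A" and not straight) or (label == "B" and not inverted):
                continue
            for left_label in "ABC":
                for right_label in "ABC":
                    # Left-heavy constraint (assumed form): the right child
                    # may not repeat the orientation of its parent.
                    if canonical and right_label == label:
                        continue
                    total += count(i, k, left_label) * count(k, j, right_label)
        return total

    return sum(count(0, n, label) for label in "ABC")
```

With the slide's example, `count_derivations([2, 3, 1, 0], canonical=False)` counts the derivations of the unconstrained grammar and `canonical=True` counts those that survive the constraint, so the two calls make the over-counting directly visible.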

25
Thanks
Q&A
