1 Stochastic Inversion Transduction Grammars Dekai Wu 11-734 Advanced Machine Translation Seminar Presented by: Sanjika Hewavitharana 04/13/2006

2 Overview
- Simple Transduction Grammars
- Inversion Transduction Grammars (ITGs)
- Stochastic ITGs
- Parsing with SITGs
- Applications of SITGs
- Main reading: Stochastic Inversion Transduction Grammars and Bilingual Parsing of Parallel Corpora (Wu, 1997)

3 Introduction
- Mathematical models of translation:
  - IBM Models (Brown et al.): string generates string
  - Syntax-based (Yamada & Knight): tree generates string
  - ITG (Wu): two trees are generated simultaneously
- ITGs:
  - a formalism for modeling bilingual sentence pairs
  - not intended as full translation models, but as a tool for parallel corpus analysis
  - extract useful structures from the input data
  - take a generative view rather than a translation view: two output trees are generated simultaneously, one for each language

4 Transduction Grammars
- A simple transduction grammar is a CFG whose terminals are pairs of symbols (or singletons)
- Can be used to model the generation of bilingual sentence pairs
- Example (English side): "The Financial Secretary and I will be accountable." (the Chinese counterpart appears as a figure in the original slide)

5 Transduction Grammar Rules (example rules appear as figures in the original slide)
- Simple rules keep the same constituent order in both languages, e.g. A → [B C]
- The inversion rule reverses the constituent order in one language, e.g. A → ⟨B C⟩

6 Transduction Grammars
- A simple transduction grammar is a CFG whose terminals are pairs of symbols (or singletons)
- Can be used to model the generation of bilingual sentence pairs
- (The bilingual parse of the example sentence pair appears as a figure in the original slide.)

7 Transduction Grammars
- In general, simple transduction grammars are not very useful:
  - the two languages must share exactly the same grammatical structure
  - so some sentence pairs cannot be generated
- ITGs remove the rigid parallel-ordering constraint:
  - constituent order in one language may be the inverse of the other language
  - order is the same for both (square brackets): A → [B C]
  - order is inverted for one language (angle brackets): A → ⟨B C⟩
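The straight/inverted distinction can be illustrated with a small sketch (not from the paper; the tree encoding and the word pairs are invented for illustration):

```python
def emit(node):
    """Return (lang1_words, lang2_words) for a tiny ITG derivation tree.

    A node is either a terminal couple ("english", "chinese"), or a triple
    (orientation, left_child, right_child) with orientation "[]" or "<>".
    """
    if len(node) == 2 and all(isinstance(x, str) for x in node):
        e, c = node
        return ([e] if e else []), ([c] if c else [])
    orient, left, right = node
    e1, c1 = emit(left)
    e2, c2 = emit(right)
    if orient == "[]":            # straight: same order in both languages
        return e1 + e2, c1 + c2
    else:                         # inverted: language-2 order is reversed
        return e1 + e2, c2 + c1

# A straight root whose right child is an inverted constituent:
tree = ("[]", ("I", "我"), ("<>", ("went", "去了"), ("yesterday", "昨天")))
print(emit(tree))  # (['I', 'went', 'yesterday'], ['我', '昨天', '去了'])
```

The inverted node leaves the English order unchanged but flips the Chinese order, which is exactly the freedom the angle-bracket rules add.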

8 ITGs
- e.g. the inversion rule VP → ⟨VV PP⟩
- With this ITG we can parse the previous sentence pair

9 ITG Parse Tree

10 Expressiveness of ITGs

11
- Not all matchings are possible with an ITG
  - e.g. "inside-out" matchings are not allowed
- This helps reduce the combinatorial growth of matchings with the number of tokens
- The number of matchings eliminated increases rapidly as the number of tokens increases
- The author argues this restriction is a benefit
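Whether a word-for-word matching is ITG-expressible can be checked directly: a matching (written as a permutation) is derivable iff it can be built bottom-up from straight and inverted binary combinations. A small CYK-style sketch (my own illustration, not code from the paper):

```python
def itg_parsable(perm):
    """perm[i] = target position of source word i.

    A source span is derivable iff its target image is a contiguous interval
    and it can be split into two derivable halves (combined either straight
    or inverted).  O(n^3) dynamic program over spans.
    """
    n = len(perm)

    def contiguous(i, j):
        # Target image of source span [i, j) is an interval of the right size.
        return max(perm[i:j]) - min(perm[i:j]) == j - i - 1

    ok = [[False] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        ok[i][i + 1] = True                     # single words are derivable
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            if contiguous(i, j):
                ok[i][j] = any(ok[i][k] and ok[k][j] for k in range(i + 1, j))
    return ok[0][n]

print(itg_parsable([1, 0, 3, 2]))  # True: two local swaps
print(itg_parsable([1, 3, 0, 2]))  # False: an "inside-out" matching
```

The two four-word inside-out patterns (2-4-1-3 and 3-1-4-2) are exactly the smallest matchings this check rejects.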

12 Expressiveness of ITGs

13 Normal Form of ITG
- For any ITG there exists an equivalent grammar in normal form
- The right-hand side of every rule is one of:
  - a terminal couple
  - a terminal singleton
  - a pair of non-terminals with straight orientation
  - a pair of non-terminals with inverted orientation

14 Stochastic ITGs
- A probability is assigned to each rewrite rule
- The probabilities of all rules with a given left-hand side must sum to 1
- An SITG yields the most probable (maximum-likelihood) parse for a sentence pair
- Parsing is similar to Viterbi or CYK (chart) parsing
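The sum-to-1 constraint on each left-hand side is easy to verify mechanically; a small sketch (the rule triples below are invented for illustration):

```python
from collections import defaultdict

def check_normalized(rules, tol=1e-9):
    """rules: iterable of (lhs, rhs, prob) triples.

    In an SITG, the probabilities of all rules sharing a left-hand side
    must sum to 1.  Returns {lhs: True/False}.
    """
    totals = defaultdict(float)
    for lhs, _rhs, prob in rules:
        totals[lhs] += prob
    return {lhs: abs(total - 1.0) <= tol for lhs, total in totals.items()}

rules = [
    ("A", "[A A]", 0.4),   # straight structural rule
    ("A", "<A A>", 0.4),   # inverted structural rule
    ("A", "x/y",   0.2),   # terminal couple
    ("B", "b/b",   0.5),   # deliberately under-normalized
]
print(check_normalized(rules))  # {'A': True, 'B': False}
```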

15 Parsing with SITGs
- Every node q in the parse tree is identified by 5 elements:
  - begin and end indices of the language-1 substring (s, t)
  - begin and end indices of the language-2 substring (u, v)
  - the non-terminal category i
- Each chart cell stores the probability of the most likely parse covering the corresponding pair of substrings, rooted in the given category

16 Parsing with SITGs - Algorithm
- Initialize the cells corresponding to terminals using a translation lexicon
- For the other cells, recursively find the most probable way of deriving that non-terminal category:
  - compute the probability by multiplying the rule probability by the probabilities of the two constituents
  - store that probability plus the orientation of the rule
- Complexity: O(n³m³) for sentences of lengths n and m
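The algorithm can be sketched for the simplest case, a one-nonterminal bracketing grammar with no singleton rules (a simplified illustration, not the paper's implementation; `lex` is a hypothetical word-pair translation lexicon and `p_struct` a hypothetical shared structural-rule probability):

```python
import math

def viterbi_btg(e, c, lex, p_struct=0.5):
    """Best log-probability of jointly deriving word lists e and c under
    A -> [A A] | <A A> | e_word/c_word.

    delta[s][t][u][v] = best log-prob covering e[s:t] with c[u:v];
    the loops mirror the slide's recursion and cost O(n^3 m^3).
    """
    n, m = len(e), len(c)
    NEG = float("-inf")
    delta = [[[[NEG] * (m + 1) for _ in range(m + 1)]
              for _ in range(n + 1)] for _ in range(n + 1)]
    # Initialization: terminal couples from the translation lexicon.
    for s in range(n):
        for u in range(m):
            p = lex.get((e[s], c[u]), 0.0)
            if p > 0:
                delta[s][s + 1][u][u + 1] = math.log(p)
    # Recursion: combine two smaller constituents, straight or inverted.
    for span_e in range(2, n + 1):
        for span_c in range(2, m + 1):
            for s in range(n - span_e + 1):
                t = s + span_e
                for u in range(m - span_c + 1):
                    v = u + span_c
                    best = delta[s][t][u][v]
                    for S in range(s + 1, t):
                        for U in range(u + 1, v):
                            straight = delta[s][S][u][U] + delta[S][t][U][v]
                            inverted = delta[s][S][U][v] + delta[S][t][u][U]
                            cand = math.log(p_struct) + max(straight, inverted)
                            best = max(best, cand)
                    delta[s][t][u][v] = best
    return delta[0][n][0][m]
```

A full implementation would also record the chosen split point and orientation in each cell to recover the parse tree, exactly as the slide says.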

17 Applications of SITGs
- Segmentation
- Bracketing
- Alignment
- Bilingual constraint transfer
- Mining parallel sentences from comparable corpora [Wu & Fung 2005]

18 Applications of SITGs - Segmentation
- Word boundaries are not marked in Chinese text, so no word chunks are available for matching
- One option: do word segmentation as preprocessing
  - but this might produce chunks that do not agree bilingually
- Solution: extend the parsing algorithm to accommodate segmentation
  - allow the initialization step to match strings of any length against the translation lexicon
  - the recursive step stores the most probable way of creating a constituent, whether it came from the lexicon or from rules
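The extended initialization step can be sketched as follows (my own illustration: `delta` is the 4-dimensional log-probability chart from the parsing algorithm, `lex` a hypothetical lexicon keyed on multi-token strings, and the Chinese side is joined without spaces by assumption):

```python
import math

def init_with_segmentation(e, c, lex, delta, max_len=4):
    """Extended initialization: let the lexicon match substrings of any
    length up to max_len, so Chinese word segmentation falls out of the
    bilingual parse instead of being fixed in preprocessing."""
    n, m = len(e), len(c)
    for s in range(n):
        for t in range(s + 1, min(s + max_len, n) + 1):
            for u in range(m):
                for v in range(u + 1, min(u + max_len, m) + 1):
                    key = (" ".join(e[s:t]), "".join(c[u:v]))
                    p = lex.get(key, 0.0)
                    if p > 0:
                        cell = delta[s][t][u][v]
                        delta[s][t][u][v] = max(cell, math.log(p))
```

After this step the recursion proceeds unchanged: each cell simply keeps whichever is more probable, the lexical match or a combination of smaller constituents.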

19 Applications of SITGs – Bracketing
- How do we assign structure to a sentence when no grammar is available?
  - especially problematic for minority languages
- A solution using ITGs:
  - obtain a parallel corpus pairing the language with some other language
  - obtain a reasonable translation dictionary
  - parse the corpus with a bracketing transduction grammar

20 Bracketing Transduction Grammar
- A minimal ITG with only one non-terminal, A
- Production rules (as in Wu 1997): A → [A A], A → ⟨A A⟩, A → x/y, A → x/ε, A → ε/y
- Lexical translation probabilities have prominence
  - small probability values for the two singleton production rules
  - also a very small value for one further rule (shown as a formula in the original slide)

21 Bracketing with Singletons
- Singletons cause bracketing errors
- Some refinements:
  - depending on the language, bias singleton attachment to the left or the right of a constituent
  - apply a series of transformations that push singletons as close as possible to the couples, e.g.
    [ x ⟨A B⟩ ] ⇌ ⟨ x ⟨A B⟩ ⟩ ⇌ ⟨ ⟨x A⟩ B ⟩ ⇌ ⟨ [x A] B ⟩
- (Before/after example trees appear as figures in the original slide.)

22 Bracketing Experiments
- Used 2,000 Chinese-English sentence pairs from the HKUST corpus
- Some filtering:
  - removed sentence pairs not adequately covered by the lexicon (>1 unknown word)
  - removed sentence pairs with too many unmatched words (>2)
- Bracketing precision: 80% for English, 78% for Chinese
- Errors mainly due to lexical imperfections
  - a statistical lexicon (~6.5k English, ~5.5k Chinese words)
  - could be improved with extra information, e.g. POS tags or a grammar-based bracketer
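The precision figures above compare predicted brackets against a gold bracketing; a toy sketch of the measure (my own illustration, not the paper's evaluation code):

```python
def bracket_precision(predicted, gold):
    """Fraction of predicted brackets that also occur in the gold
    bracketing.  Brackets are (start, end) spans over token positions."""
    predicted, gold = set(predicted), set(gold)
    if not predicted:
        return 0.0
    return len(predicted & gold) / len(predicted)

# Two of three predicted brackets match the gold set:
print(bracket_precision([(0, 2), (0, 4), (2, 4)], [(0, 2), (0, 4)]))
```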

23 Applications of SITGs - Alignment
- Alignments (phrasal or word) are a natural byproduct of bilingual parsing
- Unlike 'parse-parse-match' methods, this approach:
  - does not require a robust grammar for both languages
  - guarantees compatibility between the two parses
  - has a principled way of choosing between possible alignments
  - provides a more reasonable 'distortion penalty'
- Recent empirical studies show ITGs produce better alignments in various applications [Wu & Fung 2005]

24 Bilingual Constraint Transfer
- A high-quality parse for one language can be leveraged to obtain structure for the other
- Alter the parsing algorithm to allow only constituents that are consistent with the parse that already exists for the well-studied language
- This works for any sort of constraint supplied for the well-studied language

25 References
- Dekai Wu (1997), Stochastic Inversion Transduction Grammars and Bilingual Parsing of Parallel Corpora, Computational Linguistics, Vol. 23, No. 3, pp. 377-403.
- Dekai Wu (1995), Grammarless Extraction of Phrasal Translation Examples from Parallel Texts, 6th Intl. Conf. on Theoretical and Methodological Issues in Machine Translation (TMI-95), Vol. 2, pp. 354-372, Leuven, Belgium.
- Dekai Wu and Pascale Fung (2005), Inversion Transduction Grammar Constraints for Mining Parallel Sentences from Quasi-Comparable Corpora, 2nd Intl. Joint Conf. on Natural Language Processing (IJCNLP-2005), Jeju, Korea, October.

