
1 Reordering Model Using Syntactic Information of a Source Tree for Statistical Machine Translation
Kei Hashimoto, Hirohumi Yamamoto, Hideo Okuma, Eiichiro Sumita, and Keiichi Tokuda
Nagoya Institute of Technology / National Institute of Information and Communications Technology / Kinki University / ATR Spoken Language Communication Research Labs.

2 Background (1/2)
 Phrase-based statistical machine translation
   Can model local word reordering: short idioms, insertions and deletions of words
   Makes errors in global word reordering
 Word reordering constraint techniques
   Linguistically syntax-based approaches: source tree, target tree, or both tree structures
   Formal constraints on word permutations: IBM distortion model, lexical reordering model, ITG

3 Background (2/2)
 Imposing a source tree on ITG (IST-ITG)
   An extension of the ITG constraints
   Introduces a source-sentence tree structure
   Cannot evaluate the accuracy of the target word order
 Reordering model using syntactic information (this work)
   An extension of the IST-ITG constraints
   Models the rotation of source-side parse-tree nodes
   Can be easily introduced into a phrase-based translation system

4 Outline
 Background
 ITG & IST-ITG constraints
 Proposed reordering model
 Training of the proposed model
 Decoding using the proposed model
 Experiments
 Conclusions and future work

5 Inversion transduction grammar
 ITG constraints
   All possible binary tree structures are generated from the source word sequence
   The target sentence is obtained by rotating any node of the generated binary trees
   Reduces the number of admissible target word orders
   Does not consider any particular tree-structure instance

6 Imposing a source tree on ITG
 Directly introduces a source-sentence tree structure into ITG
 Example: for the source sentence "This is a pen" and its parse tree, the target sentence is obtained by rotating any node of the source-sentence tree structure
 The number of target word orders is reduced to 2^(n-1) for an n-word sentence with a binary parse tree (see the sketch below)
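A minimal sketch (not from the slides) of what IST-ITG permits: a binary source tree is encoded as nested Python tuples, and each internal node may independently keep or swap its two children. The tree shape for "This is a pen" follows the slide; the encoding is illustrative.

```python
from itertools import product

def rotations(node):
    """Yield every target word order reachable by rotating (swapping the
    children of) any subset of the internal nodes of a binary source tree."""
    if isinstance(node, str):              # leaf = one source word
        yield (node,)
        return
    left, right = node                     # binary internal node
    for l, r in product(rotations(left), rotations(right)):
        yield l + r                        # monotone: keep child order
        yield r + l                        # swap: rotate this node

# Parse shape from the slide: S(NP, VP(AUX, NP(DT, NN)))
tree = ("This", ("is", ("a", "pen")))
print(len(set(rotations(tree))))           # 8 == 2**(4-1) for 4 words
```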

7 Non-binary trees
 Parsing sometimes produces non-binary trees
 Any reordering of the child nodes in a non-binary subtree is allowed
 A non-binary subtree with k child nodes therefore yields k! orders; e.g., a subtree with children c, d, e yields the six orders cde, ced, dce, dec, ecd, edc (see the sketch below)
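The same sketch extended to non-binary nodes: instead of the two rotations of a binary node, all k! permutations of a node's k children are enumerated with itertools.permutations. Again an illustration, not the paper's code.

```python
from itertools import permutations, product

def orders_nonbinary(node):
    """Yield every leaf order when any permutation of a node's children
    is allowed: k! local orders per k-child node."""
    if isinstance(node, str):
        yield (node,)
        return
    for perm in permutations(node):                       # k! child orders
        for parts in product(*(orders_nonbinary(c) for c in perm)):
            yield sum(parts, ())                          # concatenate spans

# Hypothetical three-child subtree from the slide's example
print(sorted(set(orders_nonbinary(("c", "d", "e")))))     # 3! = 6 orders
```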

8 Problem of IST-ITG
 Cannot evaluate the accuracy of the target word reordering
 ⇒ Assigns an equal probability to every rotation of the source-sentence parse tree
 ⇒ We propose a reordering model using syntactic information

9 Outline
 Background
 ITG & IST-ITG constraints
 Proposed reordering model
 Training of the proposed model
 Decoding using the proposed model
 Experiments
 Conclusions and future work

10 Overview of the proposed method
 The rotation of each subtree type is modeled
 Example: the source sentence "This is a pen" has a source-side parse tree containing the one-level subtrees S+NP+VP, VP+AUX+NP, and NP+DT+NN
 Each subtree type is assigned a reordering probability over the two rotation positions, monotone and swap (in symbols below)
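In symbols (the notation here is assumed, not taken from the slides): each one-level subtree type s gets a two-way distribution over rotation positions, and a derivation is presumably scored by the product over its subtrees.

```latex
% Reordering model over subtree types, e.g. s = S+NP+VP
P(t \mid s), \quad t \in \{\text{monotone}, \text{swap}\}, \qquad
P(\text{monotone} \mid s) + P(\text{swap} \mid s) = 1
% Score of a derivation with subtrees s_1,\dots,s_m and positions t_1,\dots,t_m
P_{\text{reorder}} = \prod_{i=1}^{m} P(t_i \mid s_i)
```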

11 Related work 1
 Statistical syntax-directed translation with extended domain of locality [Liang Huang et al. 2006]
   Extracts rules for tree-to-string translation
   Considers syntactic information
   Considers multi-level trees on the source side
   Example rule shape: S(x1:NP, VP(x2:VB, x3:NP)) → …

12 Related work 2
 The proposed reordering model
   Is used in phrase-based translation
   Is estimated independently of phrase extraction
   Models child-node reordering within one-level subtrees
   Cannot represent complex reorderings
 Reordering using syntactic information can thus be easily introduced into a phrase-based translation system

13 Training algorithm (1/3)
 Reordering model training
  1. Word alignment
  2. Parsing of the source sentence
 [Figure: source-target word alignment, and the source-side parse tree S(NP, VP(AUX, NP(DT, NN)))]

14 Training algorithm (2/3)
  3. Word alignments and source-side parse trees are combined
  4. The rotation position of each subtree is checked (monotone or swap); for the running example (see the sketch below):
    S+NP+VP ⇒ monotone
    VP+AUX+NP ⇒ swap
    NP+DT+NN ⇒ monotone
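One way step 4 might be implemented, as a sketch only: classify a binary subtree by comparing where its two children land on the target side. Using the mean aligned target position of each child is a hypothetical simplification; the slide does not give the exact criterion.

```python
def rotation_position(left_span, right_span, align):
    """Classify a binary subtree as 'monotone' or 'swap'.  left_span and
    right_span are ranges of source positions for the two children; align
    maps a source position to a set of target positions.  Hypothetical
    heuristic: compare mean aligned target positions of the children."""
    def mean_target(span):
        targets = [t for s in span for t in align.get(s, ())]
        return sum(targets) / len(targets) if targets else None

    lt, rt = mean_target(left_span), mean_target(right_span)
    if lt is None or rt is None:
        return None                      # an unaligned child: skip sample
    return "monotone" if lt <= rt else "swap"

# "This is a pen" aligned to a hypothetical Japanese target (indices 0..3):
align = {0: {0}, 1: {3}, 2: set(), 3: {2}}   # This->0, is->3, pen->2
print(rotation_position(range(0, 1), range(1, 4), align))  # monotone (S+NP+VP)
print(rotation_position(range(1, 2), range(2, 4), align))  # swap (VP+AUX+NP)
```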

15 Training algorithm (3/3)
  5. The reordering probability of each subtree type is estimated by counting the rotation positions (see the sketch below):
    P(t | s) = c(s, t) / Σ_{t'} c(s, t')
    where c(s, t) is the count of rotation position t over all training samples for subtree type s
 Non-binary subtrees
   Any ordering of the child nodes is allowed
   Rotation positions are nevertheless categorized into only two types ⇒ monotone or other (swap)
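A sketch of that relative-frequency estimate, assuming counts are kept per (subtree type, rotation position) pair; the 0.5 back-off for an unseen type is an assumption.

```python
from collections import Counter

counts = Counter()                 # c(s, t) for subtree type s, position t

def add_sample(subtree_type, position):
    counts[(subtree_type, position)] += 1

def reordering_prob(subtree_type, position):
    """Relative frequency: P(t | s) = c(s, t) / sum over t' of c(s, t')."""
    total = sum(counts[(subtree_type, t)] for t in ("monotone", "swap"))
    return counts[(subtree_type, position)] / total if total else 0.5

add_sample("S+NP+VP", "monotone"); add_sample("S+NP+VP", "monotone")
add_sample("S+NP+VP", "swap")
print(reordering_prob("S+NP+VP", "monotone"))   # 2/3
```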

16 Removing subtree samples
 Some target word orders cannot be derived by rotating nodes of the source-side parse tree
   Linguistic reasons: differences between the sentence structures of the two languages
   Non-linguistic reasons: errors in word alignment and syntactic analysis
 Subtrees whose aligned target order can be reproduced by rotation are used as training samples; a subtree whose target order cannot be reproduced by any rotation is removed from the training samples

17 Clustering of subtree types
 The number of possible subtree types is large
   Unseen subtree types
   Subtree types observed only a few times ⇒ cannot be modeled reliably
 Clustering of subtree types (see the sketch below)
   Subtree types whose number of training samples is below a heuristic threshold are clustered
   A clustered model is estimated from the pooled counts of the clustered subtree types
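A minimal sketch of this thresholding: rare types are pooled into a single clustered model, which also backs off unseen types at test time. Pooling everything into one "<clustered>" bucket is an assumption about the clustering scheme; the threshold value 10 comes from the experiment slides.

```python
from collections import Counter

THRESHOLD = 10   # heuristic threshold used in the experiments

def build_models(samples):
    """samples: iterable of (subtree_type, rotation_position) pairs.
    Types with fewer than THRESHOLD samples share one clustered model."""
    per_type = Counter(s for s, _ in samples)
    counts = Counter()
    for s, t in samples:
        key = s if per_type[s] >= THRESHOLD else "<clustered>"
        counts[(key, t)] += 1
    return counts

def lookup(counts, s, t):
    """P(t | s), backing off to the clustered model for rare/unseen types."""
    key = s if (s, "monotone") in counts or (s, "swap") in counts else "<clustered>"
    total = counts[(key, "monotone")] + counts[(key, "swap")]
    return counts[(key, t)] / total if total else 0.5
```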

18 Decoding using the proposed model
 Phrase-based decoder constrained by the IST-ITG constraints
   The target sentence is generated by rotating nodes of the source-side parse tree
   A target word ordering that breaks a source phrase is not allowed
 For each hypothesis, the rotation positions of the subtrees are checked and the reordering probabilities are calculated

19 Decoding using the proposed model
 Reordering probability calculation, first example
 [Figure: source sentence A B C D E; target sentence b a c d e; the rotation positions of the subtrees are monotone, swap, monotone, and the corresponding probabilities are multiplied]

20 Decoding using the proposed model
 Reordering probability calculation, second example
 [Figure: source sentence A B C D E; target sentence c d e a b; the rotation positions of the subtrees are swap, monotone, and the corresponding probabilities are multiplied]

21 Rotation positions inside a phrase
 The rotation position cannot be determined for a subtree that lies entirely within a phrase
   Word alignments inside a phrase are not observed
 ⇒ Assign the higher of the two probabilities, monotone or swap (see the sketch below)
 [Figure: source A B C D E covered by the single phrase "a b c d e"; the subtree's rotation position is unobserved, so the higher probability is used]
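Putting slides 19-21 together, a sketch of how a derivation's reordering score might be accumulated; the None convention for phrase-internal subtrees, the dictionary layout, and the 0.5 back-off are assumptions. Swap probability is taken as 1.0 minus the monotone probability, as on the appendix slide.

```python
import math

def reordering_score(subtrees, probs):
    """Log reordering probability of one derivation.  subtrees is a list
    of (subtree_type, rotation_position) pairs; rotation_position is
    'monotone', 'swap', or None when the subtree lies inside a phrase,
    in which case the higher of the two probabilities is used."""
    score = 0.0
    for s, t in subtrees:
        p_mono = probs.get((s, "monotone"), 0.5)
        if t is None:                          # unobserved, inside a phrase
            p = max(p_mono, 1.0 - p_mono)
        else:
            p = p_mono if t == "monotone" else 1.0 - p_mono
        score += math.log(p)
    return score

probs = {("S+NP+VP", "monotone"): 0.764}       # hypothetical model entry
print(reordering_score([("S+NP+VP", "monotone"), ("NP+DT+NN", None)], probs))
```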

22 Outline
 Background
 ITG & IST-ITG constraints
 Proposed reordering model
 Training of the proposed model
 Decoding using the proposed model
 Experiments
 Conclusions and future work

23 Experimental conditions
 Compared methods
   Baseline: IBM distortion and lexical reordering models
   IST-ITG: baseline + IST-ITG constraints
   Proposed: baseline + proposed reordering model
 Training
   GIZA++ toolkit (word alignment)
   SRI language model toolkit
   Minimum error rate training (BLEU-4)
   Charniak parser (source-side parsing)

24 Experimental conditions (E-J)
 English-to-Japanese translation experiment: JST Japanese-English paper abstract corpus

                     Sentences   English words   Japanese words
  Training data      1.0M        24.6M           28.8M
  Development data   2.0K        50.1K           58.7K
  Test data          2.0K        49.5K           58.0K

 Development and test data: single reference

25 Experimental results (E-J)
 Proposed reordering model statistics
   Subtree samples: 13M; removed samples: 3M (25.38%)
   Subtree types: 54K; threshold: 10
   Number of models: 6K + clustered; coverage: 99.29%
 Test-set results

            Baseline   IST-ITG   Proposed
  BLEU-4    27.87      29.31     29.80

 The proposed model improved BLEU by 0.49 points over IST-ITG

26 Experimental conditions (E-C)
 English-to-Chinese translation experiment: NIST MT08 English-to-Chinese translation track

                     Sentences   English words   Chinese words
  Training data      4.6M        79.6M           73.4M
  Development data   1.6K        46.4K           39.0K
  Test data          1.9K        45.7K           47.0K (avg.)

 Test data: 4 references; development data: single reference

27 Experimental results (E-C)
 Proposed reordering model statistics
   Subtree samples: 50M; removed samples: 10M (20.36%)
   Subtree types: 2M; threshold: 10
   Number of models: 19K + clustered; coverage: 99.45%
 Test-set results

            Baseline   IST-ITG   Proposed
  BLEU-4    17.54      18.60     18.93

 The proposed model improved BLEU by 0.33 points over IST-ITG

28 Conclusions and future work
 Conclusions
   An extension of the IST-ITG constraints
   Reordering using syntactic information can be easily introduced into phrase-based translation
   Improved BLEU by 0.49 points over IST-ITG (E-J task)
 Future work
   Simultaneous training of the translation and reordering models
   Handling the complex reorderings caused by differences between sentence tree structures

29 Thank you very much!

30 Number of target word orders
 Number of target word orders for an n-word sequence (binary tree); formulas below

  # of words   IST-ITG   ITG           No constraint
  1            1         1             1
  2            2         2             2
  4            8         22            24
  8            128       8,558         40,320
  10           512       206,098       3,628,800
  15           16,384    745,387,038   1,307,674,368,000
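The columns follow known closed forms, shown here for reference: the unconstrained count is all permutations, IST-ITG allows one binary choice per internal node of a fixed binary tree, and the ITG column matches the large Schröder numbers, which count the ITG-realizable (separable) permutations.

```latex
% For an n-word sentence:
N_{\text{none}}(n) = n! \qquad
N_{\text{IST-ITG}}(n) = 2^{\,n-1} \qquad
N_{\text{ITG}}(n) = S_{n-1}
% where S_k are the large Schr\"oder numbers: S_3 = 22, S_7 = 8{,}558,
% S_9 = 206{,}098, S_{14} = 745{,}387{,}038, matching the table.
```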

31 Example of the subtree model
 Monotone probabilities of several subtree types

  Subtree type s    Monotone probability
  S+PP+,+NP+VP+.    0.764
  NP+DT+NN+NN       0.816
  VP+AUX+VP         0.664
  VP+VBN+PP         0.864
  NP+NP+PP          0.837
  NP+DT+JJ+NN       0.805

 Swap probability = 1.0 − monotone probability

