
1 INSTITUTE OF COMPUTING TECHNOLOGY: Bagging-based System Combination for Domain Adaptation. Linfeng Song, Haitao Mi, Yajuan Lü and Qun Liu, Institute of Computing Technology, Chinese Academy of Sciences

2 An Example

3 An Example: Initial MT system

4 An Example: The development set is 90% domain A and 10% domain B. Tuning on it turns the initial MT system into a tuned MT system that fits domain A. The translation styles of A and B are quite different.

5 An Example: The development set is 90% A and 10% B, so the tuned MT system fits domain A, but the test set is 10% A and 90% B.

6 An Example: The translation style fits A, but we mainly want to translate B.

7 Traditional Methods: Monolingual data with domain annotation

8 Traditional Methods: Monolingual data with domain annotation → domain recognizer

9 Traditional Methods: Bilingual training data

10 Traditional Methods: Bilingual training data → domain recognizer → training data for domain A; training data for domain B

11 Traditional Methods: Training data for domain A → MT system for domain A; training data for domain B → MT system for domain B

12 Traditional Methods: Test set

13 Traditional Methods: Test set → domain recognizer → test set for domain A; test set for domain B

14 Traditional Methods: Test set for domain A → MT system for domain A → translation result for domain A; test set for domain B → MT system for domain B → translation result for domain B. Together these form the translation result.
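
The traditional pipeline on slides 7 to 14 can be sketched as a routing step followed by domain-specific translation. This is a minimal illustration only: `translate_by_domain`, `toy_recognizer` and `toy_systems` are hypothetical names, and the keyword recognizer and tagging "translators" are toy stand-ins for a trained domain recognizer and real MT systems.

```python
from typing import Callable, Dict, List

Translator = Callable[[str], str]

def translate_by_domain(
    test_set: List[str],
    recognize: Callable[[str], str],
    systems: Dict[str, Translator],
) -> List[str]:
    """Route each sentence to the MT system of its predicted domain."""
    results = []
    for sentence in test_set:
        domain = recognize(sentence)  # a wrong label here is a classification error
        results.append(systems[domain](sentence))
    return results

# Toy stand-ins (assumptions, not from the slides).
def toy_recognizer(sentence: str) -> str:
    return "A" if "patent" in sentence else "B"

toy_systems: Dict[str, Translator] = {
    "A": lambda s: f"[A-style] {s}",
    "B": lambda s: f"[B-style] {s}",
}

routed = translate_by_domain(
    ["a patent claim", "a news story"], toy_recognizer, toy_systems
)
```

Every sentence is translated by exactly one system, so a misrouted sentence gets the wrong translation style.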

15 The merits: Simple and effective. Fits human intuition.

16 The drawbacks: Classification error (CE), especially for unsupervised methods. Supervised methods can keep CE low, but their need for annotated data limits their use.

17 Our motivation: Step away from doing adaptation directly. Statistical methods (such as Bagging) can help.

18 Preliminary: The general framework of Bagging

19 General framework of Bagging: Training set D

20 General framework of Bagging: Training set D is resampled into training sets D1, D2, D3, ……, and classifiers C1, C2, C3, …… are trained on them, one per set.

21 General framework of Bagging: A test sample is given to C1, C2, C3, ……

22 General framework of Bagging: Each of C1, C2, C3, …… produces its own result for the test sample, and the results are combined into a voting result.
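
The general Bagging framework (resample the training set, train one classifier per resample, vote on the test sample) can be sketched as follows. This is a toy illustration, not the authors' code: the nearest-mean classifier, the 1-D data set `D`, and all function names are assumptions chosen to keep the example self-contained.

```python
import random
from collections import Counter

def bootstrap(data, rng):
    """Sample len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]

def train_nearest_mean(sample):
    """Toy classifier: store the mean x per label, predict the nearest mean."""
    sums, counts = {}, {}
    for x, y in sample:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    means = {y: sums[y] / counts[y] for y in sums}
    return lambda x: min(means, key=lambda y: abs(x - means[y]))

def bagging_predict(data, x, n_classifiers=11, seed=0):
    """Train one classifier per bootstrap sample and return the majority vote."""
    rng = random.Random(seed)
    classifiers = [
        train_nearest_mean(bootstrap(data, rng)) for _ in range(n_classifiers)
    ]
    votes = Counter(c(x) for c in classifiers)
    return votes.most_common(1)[0][0]

# Hypothetical training set D: (feature, label) pairs.
D = [(0.0, "neg"), (0.2, "neg"), (0.4, "neg"),
     (1.6, "pos"), (1.8, "pos"), (2.0, "pos")]
```

An odd number of classifiers avoids ties in this two-class toy setting.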

23 Our method

24 Training: Suppose there is a development set. For simplicity, it has only 5 sentences: 3 belong to domain A and 2 belong to domain B (A,A,A,B,B).

25 Training: We bootstrap N new development sets from A,A,A,B,B, for example A,B,B,B,B; A,A,B,B,B; A,A,A,B,B; A,A,A,A,B; ……

26 Training: For each bootstrapped set (A,B,B,B,B; A,A,B,B,B; A,A,A,B,B; A,A,A,A,B; ……), a subsystem is tuned: MT system-1, MT system-2, MT system-3, MT system-4, MT system-5, ……
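
The training step can be sketched as below: draw N development sets with replacement, then tune one subsystem per set. This is a sketch under stated assumptions. `bootstrap_dev_sets` and `tune_subsystem` are hypothetical names, and `tune_subsystem` is only a stand-in that records the domain mix of its development set; the real step would tune the MT system's feature weights on that set.

```python
import random
from typing import Dict, List

def bootstrap_dev_sets(dev_set: List[str], n_sets: int, seed: int = 0) -> List[List[str]]:
    """Draw n_sets new development sets, each sampled with replacement
    and the same size as the original."""
    rng = random.Random(seed)
    return [[rng.choice(dev_set) for _ in dev_set] for _ in range(n_sets)]

def tune_subsystem(dev_set: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for tuning: it only reports the domain mix
    that the subsystem would be fitted to."""
    share_a = dev_set.count("A") / len(dev_set)
    return {"share_A": share_a, "share_B": 1.0 - share_a}

# The 5-sentence development set from the slides, by domain label.
dev_set = ["A", "A", "A", "B", "B"]
dev_sets = bootstrap_dev_sets(dev_set, n_sets=5)
subsystems = [tune_subsystem(d) for d in dev_sets]
```

Because each resampled set has a different mix of A and B, the tuned subsystems lean toward different domains without any explicit domain labels being used at tuning time.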

27 Decoding: For simplicity, suppose only 2 subsystems have been tuned, Subsystem-1 and Subsystem-2, each with its own weight vector W.

28 Decoding: Now a sentence “A B” needs a translation.

29 Decoding: After translation, each subsystem generates its N-best candidates: one list is a b; a c and the other is a b; a d.

30 Decoding: Fuse these N-best lists and eliminate duplicates, giving a b; a c; a d.

31 Decoding: Candidates are identical only if their target strings and feature values are entirely equal.
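
Fusing the N-best lists and removing duplicate candidates can be sketched as below, treating two candidates as identical only when both the target string and the feature values match exactly. The feature values are hypothetical, since the slides do not preserve them, and `fuse_nbest` is an illustrative name.

```python
from typing import List, Tuple

# A candidate is (target string, feature values).
Candidate = Tuple[str, Tuple[float, ...]]

def fuse_nbest(nbest_lists: List[List[Candidate]]) -> List[Candidate]:
    """Merge N-best lists, keeping the first occurrence of each candidate.
    Equality requires the same target string AND the same feature values."""
    seen = set()
    fused = []
    for nbest in nbest_lists:
        for cand in nbest:
            if cand not in seen:
                seen.add(cand)
                fused.append(cand)
    return fused

# Hypothetical feature values for the two N-best lists of sentence "A B".
nbest_1 = [("a b", (0.1, 0.2)), ("a c", (0.3, 0.1))]
nbest_2 = [("a b", (0.1, 0.2)), ("a d", (0.2, 0.4))]
fused = fuse_nbest([nbest_1, nbest_2])

# Same target string but different feature values: both entries are kept.
nbest_3 = [("a b", (0.5, 0.2))]
fused_distinct = fuse_nbest([nbest_1, nbest_3])
```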

32 Decoding: Calculate the voting score of each fused candidate, where S represents the number of subsystems. Values shown beside the candidates: a b, -0.16; a b, +0.04; a c, -0.1; a d, -0.18.

33 Decoding: The one with the highest score wins.

34 Decoding: Since the subsystems are different copies of the same model and share the same training data, calibration is unnecessary.
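
The voting formula itself did not survive in this transcript. A plausible reading, stated here as an assumption, is that a candidate's voting score is the sum over the S subsystems of the dot product between each subsystem's weight vector W and the candidate's feature values. The sketch below implements that assumed form with hypothetical weights and features; `voting_score` and `pick_winner` are illustrative names.

```python
from typing import List, Sequence, Tuple

Candidate = Tuple[str, Tuple[float, ...]]

def voting_score(features: Sequence[float],
                 subsystem_weights: List[Sequence[float]]) -> float:
    """Assumed form: sum over subsystems of the dot product W . features."""
    return sum(
        sum(w * f for w, f in zip(weights, features))
        for weights in subsystem_weights
    )

def pick_winner(fused: List[Candidate],
                subsystem_weights: List[Sequence[float]]) -> Candidate:
    """Return the fused candidate with the highest voting score."""
    return max(fused, key=lambda c: voting_score(c[1], subsystem_weights))

# Hypothetical weights for S = 2 subsystems and a fused candidate list.
weights = [(1.0, -1.0), (0.5, 0.5)]
fused = [("a b", (1.0, 2.0)), ("a c", (2.0, 1.0)), ("a d", (0.0, 1.0))]
winner = pick_winner(fused, weights)
```

The scores from different subsystems are added directly, without rescaling, which matches the slide's point that calibration is unnecessary when all subsystems are copies of one model.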

35 Experiments

36 Basic Setups: Data: NTCIR9 Chinese-English patent corpus; 1k sentence pairs as the development set, another 1k pairs as the test set, and the remainder used for training. System: hierarchical phrase-based model. Alignment: GIZA++ with grow-diag-final.

37 Effectiveness: Show and Prove. Tune 30 subsystems using Bagging. Tune 30 subsystems with random initial weights. Evaluate the fusion results of the first N (N = 5, 10, 15, 20, 30) subsystems of both and compare.

38 Results: 1-best (chart; x-axis: number of subsystems; annotated value: +0.82)

39 Results: 1-best (chart; x-axis: number of subsystems; annotated value: +0.70)

40 Results: Oracle (chart; x-axis: number of subsystems; annotated value: +6.22)

41 Results: Oracle (chart; x-axis: number of subsystems; annotated value: +3.71)

42 Compare with traditional methods: Evaluate a supervised method; to tackle data sparsity, it operates only on the development set and test set. Evaluate an unsupervised method similar to Yamada (2007); to avoid data sparsity, only the LM is domain-specific.

43 Results

44 Conclusions: We propose a bagging-based method to address the multi-domain translation problem. Experiments show that Bagging is effective for the domain adaptation problem, and that our method clearly surpasses the baseline and is even better than some traditional methods.

45 Thank you for listening. Any questions?

