
1 Multi-Layer Filtering Algorithm: Bilingual Chunk Alignment in Statistical Machine Translation. An introduction to the Multi-Layer Filtering (MLF) algorithm. Dawei Hou, LING 575 MT, WIN07

2 What is a "chunk" here? In this paper, a chunk does not rely on any information from tagging, parsing, syntactic analysis, or segmentation. A chunk is simply a contiguous sequence of words.

3 Why do we use chunks in translation? Chunk-based translation can lead to more fluent translations, since it captures local reordering phenomena. It successfully makes long sentences shorter, which benefits the SMT algorithm's performance. It obtains accurate one-to-one alignment of each pair of bilingual chunks. It greatly decreases the search space and time complexity during translation.

4 What about other approaches? What about word-based translation?

5 Some background. Many SMT systems employ word-based alignment models built on the five word-based statistical models proposed by IBM. Problem: they still suffer from poor performance on language pairs with large structural differences, since these models fundamentally rely on word-level translation.

6 Some background. Other alignment algorithms are based on phrases, chunks, or structures, and most of them rely on complex syntactic information. Problems: they have proven to yield poor performance when dealing with long sentences, and they depend heavily on the performance of associated tools such as parsers and POS taggers.

7 How does chunk-based translation improve on these problems?

8 The goal is to discover one-to-one pairs of bilingual chunks in untagged, well-formed bilingual sentence pairs. Multiple layers are used to extract bilingual chunks according to different features of chunks in the bilingual corpus.

9 Summary of procedures: filtering the most frequent chunks; clustering similar words and filtering the most frequent structures; dealing with the remnant fragments; keeping one-to-one alignment.

10 Filtering the most frequent chunks -- Step 1. Assumption: the most frequently co-occurring word sequences are potential chunks. Applying formula-1 and formula-2 (shown on the slide), we filter those word sequences out as initial monolingual chunks.
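The slide's formula-1 and formula-2 are images that are not reproduced in this transcript, so as an illustration only, here is a pointwise-mutual-information style cohesion degree over adjacent word pairs; the PMI choice and the function names are assumptions, not the paper's exact formulas:

```python
import math
from collections import Counter

def cohesion_degrees(sentences):
    """Score every adjacent word pair with a PMI-style cohesion degree.

    Sketch only: pointwise mutual information over adjacent-pair
    co-occurrence counts stands in for the slide's formula-1/formula-2,
    which are not reproduced in the transcript.
    """
    word_counts = Counter()
    pair_counts = Counter()
    total_words = 0
    for sent in sentences:
        word_counts.update(sent)
        total_words += len(sent)
        pair_counts.update(zip(sent, sent[1:]))  # adjacent pairs only
    total_pairs = sum(pair_counts.values())

    def degree(w1, w2):
        if pair_counts[(w1, w2)] == 0:
            return float("-inf")  # never adjacent: no cohesion
        p_pair = pair_counts[(w1, w2)] / total_pairs
        p1 = word_counts[w1] / total_words
        p2 = word_counts[w2] / total_words
        return math.log(p_pair / (p1 * p2))

    return degree
```

On the slide's example sentence, a score computed this way would play the role of the numbers between words: a high value like the 10.07 between "do" and "you" marks a likely chunk, while a low value like the 0.046 between "of" and "room" marks a boundary.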

11 The result of Filtering Step 1. An example, with the cohesion degree between each pair of adjacent words: What |1.36| kind |1.31| of |0.046| room |0.063| do |10.07| you |0.61| want |2.11| to |0.077| reserve; 你 |0.69| 想 |0.17| 预 |1.39| 定 |0.076| 什 |7.80| 么 |0.87| 样 |0.30| 的 |1.27| 房 |4.52| 间

12 Filtering the most frequent chunks -- Step 2. Now we have the cohesion degrees between all pairs of adjacent words in the source and target sentences. Applying formula-3 (shown on the slide), we find the entire set of initial monolingual chunks.

13 The result of Filtering Step 2-1. Cohesion degrees between adjacent words: What |1.36| kind |1.31| of |0.046| room |0.063| do |10.07| you |0.61| want |2.11| to |0.077| reserve; 你 |0.69| 想 |0.17| 预 |1.39| 定 |0.076| 什 |7.80| 么 |0.87| 样 |0.30| 的 |1.27| 房 |4.52| 间. In this case: n = int{10/4} = 2.

14 The result of Filtering Step 2-(1)-EN. Now we get a table of the initial monolingual chunks, scored with formula-4:

Initial Chunks         Dk      Dk*
What kind              1.36    -
What kind of           2.10    5.25
Kind of                1.31    -
Do you                 10.07   -
Do you want            0.31    0.77
Do you want to         0.13    0.90
You want               0.61    -
You want to            0.33    0.82
You want to reserve    0.086   0.60
Want to                2.11    -
Want to reserve        0.056   0.14
To reserve             0.077   -

15 The result of Filtering Step 2-(2)-EN. Setting the threshold Dk* > 1.0, we get:

Initial Chunks         Dk      Dk*
What kind              1.36    -
What kind of           2.10    5.25
Kind of                1.31    -
Do you                 10.07   -
Want to                2.11    -

We still need further steps for maximum matching and overlap discarding.

16 The result of Filtering Step 2-(3)-EN. According to the maximum matching principle, and to prevent overlapping, we apply formula-4 and formula-5 (shown on the slide):

Initial Chunks         Dk      Dk*
What kind              1.36    -
What kind of           2.10    5.25
Kind of                1.31    -
Do you                 10.07   -
Want to                2.11    -

17 The result of Filtering Step 2-(4)-EN. Dealing with the remnant fragments: we simply combine such individual or sequential words into chunks of their own. So we get a much shorter sentence, listed below: What & kind & of || room || do & you || want & to || reserve
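Steps 2-(2) through 2-(4) can be sketched together: threshold, greedy maximum matching with overlap discarding, then remnant combination. Since formula-4 and formula-5 are not reproduced in the transcript, trying longer spans first and keeping the first non-overlapping span is my reading of the maximum matching principle, not the paper's exact procedure:

```python
def chunk_sentence(words, candidates, threshold=1.0):
    """Pick non-overlapping chunks and combine the leftovers.

    candidates maps (start, end) token spans to their scores (Dk, or
    Dk* for spans of three or more words). Longer spans are tried
    first per the maximum matching principle; spans overlapping an
    already accepted chunk are discarded; every remaining word becomes
    a one-word remnant chunk.
    """
    accepted, used = [], set()
    by_length = sorted(candidates.items(),
                       key=lambda kv: (kv[0][1] - kv[0][0], kv[1]),
                       reverse=True)
    for (start, end), score in by_length:
        if score <= threshold:
            continue  # below the Dk* > 1.0 threshold
        if any(i in used for i in range(start, end)):
            continue  # overlaps an accepted chunk: discard
        accepted.append((start, end))
        used.update(range(start, end))
    for i in range(len(words)):  # remnant fragments become chunks
        if i not in used:
            accepted.append((i, i + 1))
    accepted.sort()
    return " || ".join(" & ".join(words[s:e]) for s, e in accepted)
```

Fed the surviving English spans and their slide scores, this reproduces the chunking shown on the slide: "What kind of" beats the overlapping "What kind" and "Kind of", and "room" and "reserve" are kept as remnants.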

18 The result of Filtering Step 2-(1)-CN. Cohesion degrees between adjacent words: What |1.36| kind |1.31| of |0.046| room |0.063| do |10.07| you |0.61| want |2.11| to |0.077| reserve; 你 |0.69| 想 |0.17| 预 |1.39| 定 |0.076| 什 |7.80| 么 |0.87| 样 |0.30| 的 |1.27| 房 |4.52| 间. In this case: n = int{10/4} = 2.

19 The result of Filtering Step 2-(2)-CN. Now we get a table of the initial monolingual chunks, scored with formula-4:

Initial Chunks    Dk     Dk*
你想              0.69   -
预定              2.39   -
什么              7.80   -
什么样            0.44   1.00
什么样的          0.58   2.44
么样              0.87   -
么样的            0.37   0.84
么样的房          0.13   0.55
样的              0.30   -
样的房            0.13   0.30
样的房间          0.21   0.88
的房              1.27   -
的房间            2.45   5.88
房间              4.52   -

20 The result of Filtering Step 2-(3)-CN. Setting the threshold Dk* > 1.0, we get:

Initial Chunks    Dk     Dk*
预定              2.39   -
什么              7.80   -
什么样的          0.58   2.44
的房              1.27   -
的房间            2.45   5.88
房间              4.52   -

We still need further steps for maximum matching and overlap discarding.

21 The result of Filtering Step 2-(4)-CN.

Initial Chunks    Dk     Dk*
预定              2.39   -
什么              7.80   -
什么样的          0.58   2.44
的房              1.27   -
的房间            2.45   5.88
房间              4.52   -

According to the maximum matching principle, the character 的 is claimed by both 什么样的 and 的房间. Applying formula-4: max(D什么样的/D什么样, D的房间/D房间) = max(2.44, 1.30) = 2.44, so 的 is attached to 什么样的.

22 The result of Filtering Step 2-(5)-CN. Dealing with the remnant fragments: we simply combine such individual or sequential words into chunks of their own. So we get a much shorter sentence, listed below: 你 || 想 || 预 & 定 || 什 & 么 & 样 & 的 || 房 & 间

23 Some problems. After the first filtering process, suppose we found an aligned chunk pair: || 在 & 五 & 点 || and || at & five & o'clock ||. But a potentially good chunk like || at & six & o'clock || might have been broken into several fragments, || at || six || o'clock ||, since this structure includes word sequences with a low frequency of occurrence (we suppose here that "six" is less frequent than "five").

24 Clustering the similar words and filtering the most frequent structures. Many frequent chunks have similar structures but differ in detail. We can cluster similar words according to the position vectors of their behavior relative to anchor words. We suppose that all words in the same class form good chunks, then filter the most frequent structures using the method introduced before.

25 Clustering the similar words and filtering the most frequent structures -- Step 1. In the corpus resulting from the first filtering process, find the most frequent words to use as anchor words, for example:

Rank:  1 the, 2 a, 3 to, 4 this, 5 for, 6 in, 7 on, 8 of, 9 at, 10 room

Why use the most frequent words? Since the anchor words are the most common words, a great deal of information can be obtained from them. Words whose position vectors relative to the anchor words are similar can be assumed to belong to similar word classes.

26 Clustering the similar words and filtering the most frequent structures -- Step 2. Build word vectors and define the size of the observation window (in this case, window size = 5). For instance, we build a word vector whose anchor word is "in" and observe how often a candidate word "the" (to be clustered) falls at each position within the window:

Position:  w-2   w-1   w      w+1   w+2
Word:      the   the   (in)   the   the
Value:     16    10    -      41    50

Formula-7 and formula-8 (shown on the slide) define the vector entries.
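A minimal sketch of building such a position vector follows; since formula-7/formula-8 are not reproduced in the transcript, raw occurrence counts per offset are an assumption:

```python
def position_vector(sentences, anchor, candidate, window=5):
    """Count how often `candidate` occurs at each offset around `anchor`.

    With window=5 the offsets are w-2, w-1, w+1, w+2 (position w is the
    anchor itself). Returns the counts in ascending offset order.
    """
    half = window // 2
    counts = {off: 0 for off in range(-half, half + 1) if off != 0}
    for sent in sentences:
        for i, word in enumerate(sent):
            if word != anchor:
                continue
            for off in counts:
                j = i + off
                if 0 <= j < len(sent) and sent[j] == candidate:
                    counts[off] += 1
    return [counts[off] for off in sorted(counts)]
```

Run over a large corpus with anchor "in" and candidate "the", this yields a vector like the slide's (16, 10, 41, 50) row.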

27 Clustering the similar words and filtering the most frequent structures -- Step 3. In order to compare vectors fairly, they must first be normalized by formula-9 (shown on the slide). Example: "in/that" and "in/this".

28 Clustering the similar words and filtering the most frequent structures -- Step 4. Measure the similarities of the vectors with Euclidean distance and cluster the words that have similar distributions relative to the anchor words. Example resulting word classes, with their anchor words: {single, double, twin, standard, suite, different, quiet} for anchors (a, room); {the, my, your, this, that, our} for anchors (in, room); {America, all, fact, Japan, English} for anchor (in, -).
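Steps 3 and 4 can be sketched together as follows. Unit-sum normalization stands in for formula-9 (which is not reproduced in the transcript), and the clustering threshold is illustrative:

```python
def normalize(vec):
    """Scale a position vector to sum to 1 so that raw word frequency
    differences do not dominate the comparison (a stand-in for the
    slide's formula-9, which is not reproduced)."""
    total = sum(vec)
    return [v / total for v in vec] if total else list(vec)

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

def same_class(vec_a, vec_b, max_dist=0.2):
    """Cluster two words together when their normalized position
    vectors relative to the anchor words are close (the threshold
    value is illustrative)."""
    return euclidean(normalize(vec_a), normalize(vec_b)) <= max_dist
```

This is why "single" and "double" end up in one class: their raw counts differ, but after normalization their distributions around the anchors (a, room) are nearly identical.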

29 Clustering the similar words and filtering the most frequent structures -- Step 5. For all words in the same class, replace them with a particular symbol, and then treat this symbol as an ordinary word. Then filter the most frequent structures with the Multi-Layer Filtering algorithm again. For instance, if we have || 在 & 五 & 点 || and || at & five & o'clock ||, and the parallel word classes {one, two, ..., five, ..., twelve} and { 一, 二, ..., 五, ..., 十二 }, we will get: || 在 & 一 & 点 || and || at & one & o'clock ||; || 在 & 两 & 点 || and || at & two & o'clock ||; ...
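The replacement step itself is simple to sketch (the class-symbol name is arbitrary):

```python
def replace_word_class(words, word_class, symbol):
    """Replace every member of a clustered word class with one shared
    symbol, so structurally identical chunks such as "at five o'clock"
    and "at six o'clock" collapse into the same frequent pattern
    before refiltering."""
    return [symbol if w in word_class else w for w in words]
```

After this substitution, the low-frequency "at six o'clock" contributes to the same chunk statistics as the frequent "at five o'clock", which is how the fragmentation problem from the first filtering pass is repaired.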

30 Keeping one-to-one alignment. Next step:

31 Keeping one-to-one alignment. Now we have a pair of new parallel sentences with chunks: 你 || 想 || 预 & 定 || 什 & 么 & 样 & 的 || 房 & 间 and What & kind & of || room || do & you || want & to || reserve. Our purpose is to find a one-to-one chunk alignment, on the assumption that the chunks to be aligned occur almost equally often in the corresponding parallel texts.

32 Keeping one-to-one alignment. By applying formula-11 (shown on the slide), we can get an alignment table:

θ              你      想      预定    什么样的   房间
What kind of   0.025   0.021   0.053   0.889     0.016
Room           0.021   0.029   0.09    0.014     0.888
Do you         0.460   0.014   0.002   0.012     0.020
Want to        0.007   0.069   0.013   0.002     0.023
Reserve        0.002   0.001   0.083   0.034     0.047
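Given such a θ table, one way to enforce one-to-one alignment is greedy competitive linking: repeatedly take the highest remaining score whose row and column are both still unused. Formula-11 itself is not reproduced in the transcript, so the greedy strategy is an assumption about how the table is consumed:

```python
def one_to_one_align(theta):
    """Greedy one-to-one chunk alignment from a score table.

    theta[src][tgt] holds the alignment score for a source/target
    chunk pair. The highest remaining score is linked first, and each
    source and target chunk may be used at most once.
    """
    scored = sorted(
        ((score, src, tgt)
         for src, row in theta.items()
         for tgt, score in row.items()),
        key=lambda t: t[0],  # sort on score only; ties break arbitrarily
        reverse=True,
    )
    used_src, used_tgt, links = set(), set(), {}
    for score, src, tgt in scored:
        if src not in used_src and tgt not in used_tgt:
            links[src] = tgt
            used_src.add(src)
            used_tgt.add(tgt)
    return links
```

On the slide's table, the strong scores (0.889, 0.888, 0.460) lock in the first three links, after which "reserve" takes 预定 (0.083) and "want to" takes 想 (0.069), even though those raw scores are small.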

33 Experiments. Training data: 55,000 pairs of Chinese-English spoken parallel sentences. Test data: 400 pairs of Chinese-English spoken parallel sentences chosen randomly from the same corpus. These 400 sentence pairs were manually partitioned into monolingual chunks, and the corresponding bilingual chunks were then manually aligned, for computing chunking and alignment accuracy.

34 Experiments. Evaluation: comparing the automatically obtained monolingual chunks and aligned bilingual chunks to the chunks discovered manually, we compute precision, recall, and F-measure with the following formulas:
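The evaluation formulas (shown as images on the slide) are the standard precision, recall, and balanced F-measure; a sketch against a manual gold set:

```python
def precision_recall_f(auto_chunks, gold_chunks):
    """Standard precision, recall, and balanced F-measure of the
    automatically found chunks against the manually built gold set."""
    auto, gold = set(auto_chunks), set(gold_chunks)
    correct = len(auto & gold)
    precision = correct / len(auto) if auto else 0.0
    recall = correct / len(gold) if gold else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f
```

As a sanity check, the chunking figures on the next slide are consistent with the harmonic mean: 2 * 0.77 * 0.65 / (0.77 + 0.65) ≈ 0.70.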

35 Experiments. Results:

The accuracy of chunking:  Precision 77%, Recall 65%, F-Measure 0.70
The accuracy of alignment: Precision 89%, Recall 72%, F-Measure 0.80

36 Experiments. Comparison of chunk-based translation to word-based translation:

Systems       BLEU     NIST
Word-based    0.259    2.661
Chunk-based   0.290    2.921
Improvement   +0.031   +0.260

The relative improvement is about 10%.

37 Conclusions. This chunking and alignment algorithm does not rely on information from tagging, parsing, or syntactic analysis, and does not even require sentence segmentation. It obtains accurate one-to-one alignment of chunks. It greatly decreases the search space and time complexity during translation. Its performance is better than the baseline word-alignment system (on some tasks).

38 Problems / Weaknesses. The authors did not discuss any. Perhaps improvements could be made at: the maximum matching step; the step of building position vectors.

