Statistical Machine Translation

1 Statistical Machine Translation

2 General Framework
Given sentences S and T, assume there is a “translator oracle” that can calculate P(T|S), the probability that an “ideal translator” will produce sentence T given sentence S.
Our statistical translator tries to “reverse engineer” the ideal translator. That is, given T, it finds the S with the highest probability P(S|T).
We have: P(T|S).
We want: the S that maximizes P(S|T).
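The formulas on this slide did not survive in the transcript. A plausible reconstruction, assuming the standard noisy-channel argument via Bayes' rule (the symbol \hat{S} for the chosen sentence is an assumption, not taken from the slides):

```latex
\begin{align*}
P(S \mid T) &= \frac{P(S)\,P(T \mid S)}{P(T)} \\
\hat{S} &= \arg\max_{S} P(S \mid T) \;=\; \arg\max_{S} P(S)\,P(T \mid S)
\end{align*}
```

P(T) can be dropped from the maximization because it does not depend on S.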

4 Components: language model, translation model, search method
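In the usual noisy-channel reading (an assumption about the formula missing from this slide), the language model scores P(S), the translation model scores P(T|S), and the search method looks for the S that maximizes their product. A minimal sketch of how the three pieces fit together, using a brute-force search over a tiny candidate list; the function names and all probability values are hypothetical:

```python
def best_source(t, candidates, language_model, translation_model):
    """Search method: return the candidate S maximizing P(S) * P(T|S)."""
    return max(candidates, key=lambda s: language_model(s) * translation_model(t, s))

# Hypothetical toy scores, made up purely for illustration.
LM = {
    "the proposal will not be implemented": 1e-2,
    "proposal the not will implemented be": 1e-4,
}
TM = {
    "the proposal will not be implemented": 0.2,
    "proposal the not will implemented be": 0.3,
}

t = "les propositions ne seront pas mises en application"
best = best_source(
    t,
    list(LM),
    language_model=lambda s: LM[s],          # stands in for P(S)
    translation_model=lambda t_, s: TM[s],   # stands in for P(T|S)
)
print(best)
```

A real system cannot enumerate all candidate sentences, which is why the search method is a component in its own right.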

5 Language model
The language model can use an n-gram model.
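As an illustration of the n-gram idea, here is a minimal bigram (n = 2) sketch with add-alpha smoothing. The smoothing scheme, the boundary markers `<s>` and `</s>`, and the two-sentence corpus are assumptions chosen for brevity; the slides do not specify them.

```python
from collections import Counter

def train_bigram(sentences):
    """Count bigram contexts and bigrams, adding sentence-boundary markers."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for sent in sentences:
        tokens = ["<s>"] + sent.split() + ["</s>"]
        contexts.update(tokens[:-1])                  # every token that has a successor
        bigrams.update(zip(tokens[:-1], tokens[1:]))
        vocab.update(tokens[1:])                      # every token that can be predicted
    return contexts, bigrams, len(vocab)

def sentence_prob(sentence, contexts, bigrams, vocab_size, alpha=1.0):
    """P(S) as a product of add-alpha smoothed bigram probabilities."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    p = 1.0
    for prev, cur in zip(tokens[:-1], tokens[1:]):
        p *= (bigrams[(prev, cur)] + alpha) / (contexts[prev] + alpha * vocab_size)
    return p

# Toy corpus, made up for illustration.
corpus = [
    "the proposal will not now be implemented",
    "the proposal will be implemented now",
]
contexts, bigrams, vocab_size = train_bigram(corpus)

p_fluent = sentence_prob("the proposal will not now be implemented", contexts, bigrams, vocab_size)
p_scrambled = sentence_prob("implemented be now not will proposal the", contexts, bigrams, vocab_size)
print(p_fluent > p_scrambled)
```

The model prefers the word order it has seen, which is the role the language model plays in ranking candidate sentences.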

7 Translation model
Need an alignment model that will allow us to calculate the probabilities of alignments, e.g.,
P [The (1) proposal (2) will (4) not (3, 5) now (9) be implemented (6, 7, 8) | Les propositions ne seront pas mises en application maintenant]
Here the English sentence is the target sentence and the French sentence is the source sentence. The numbers after each target word give the positions of the source words it is aligned with.
Notation for alignment: Les propositions ne seront pas mises en application maintenant | The (1) proposal (2) will (4) not (3, 5) now (9) be implemented (6, 7, 8)

8 Translation model
Alignment model consists of:
– fertility model (fertility = number of source words each target word is mapped to)
– term-translation model
– distortion model

9 Translation model (from Brown et al. paper):
Need to calculate P(alignment), that is:
P [The (1) proposal (2) will (4) not (3, 5) now (9) be implemented (6, 7, 8) | Les propositions ne seront pas mises en application maintenant]
To calculate this, we need:
– Fertility model: P(fertility = n | term) for each n (up to a maximum value) and each target term.
– Term-translation model: P(term S | term T), the probability that term S appears in the source given that term T appears in the target.
– Distortion model: one simple version is to assume that the position of a target term depends only on the position of the source term and the length of the target sentence. This gives P(i | j, L) for each target position i, source position j, and target length L (limited to some maximum value for L).

10 Translation model (from Brown et al. paper):
Need to calculate P(alignment). Example:
P [The (1) proposal (2) will (4) not (3, 5) now (9) be implemented (6, 7, 8) | Les propositions ne seront pas mises en application maintenant] =
P(fertility=1 | the) × P(les | the) × P(1 | 1, 7)
× P(fertility=1 | proposal) × P(propositions | proposal) × P(2 | 2, 7)
× P(fertility=1 | will) × P(seront | will) × P(3 | 4, 7)
× P(fertility=2 | not) × P(ne | not) × P(pas | not) × P(4 | 3, 7) × P(4 | 5, 7)
× etc.
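The product above can be sketched in code as follows. This is a minimal illustration, not the paper's implementation: the function name `alignment_prob`, the table layout, and the constant placeholder probabilities (0.5, 0.1, 0.2) are all assumptions. Treating “be”, which carries no position numbers, as a word of fertility 0 is also an assumed reading of the notation.

```python
import math

def alignment_prob(target, source, alignment, fert, trans, dist):
    """Multiply fertility, term-translation, and distortion factors.

    alignment[i-1] lists the 1-based source positions aligned with target word i.
    fert  maps (n, target_term)           -> P(fertility = n | target_term)
    trans maps (source_term, target_term) -> P(source_term | target_term)
    dist  maps (i, j, L)                  -> P(i | j, L)
    """
    L = len(target)
    p = 1.0
    for i, (t_word, src_positions) in enumerate(zip(target, alignment), start=1):
        p *= fert[(len(src_positions), t_word)]
        for j in src_positions:
            p *= trans[(source[j - 1], t_word)]
            p *= dist[(i, j, L)]
    return p

target = "the proposal will not now be implemented".split()
source = "les propositions ne seront pas mises en application maintenant".split()
alignment = [[1], [2], [4], [3, 5], [9], [], [6, 7, 8]]

# Placeholder tables: every entry gets the same made-up value.
fert = {(len(a), t): 0.5 for t, a in zip(target, alignment)}
trans = {(source[j - 1], t): 0.1 for t, a in zip(target, alignment) for j in a}
dist = {(i, j, len(target)): 0.2 for i, a in enumerate(alignment, start=1) for j in a}

p = alignment_prob(target, source, alignment, fert, trans, dist)
print(p)
```

With these tables the product has 7 fertility factors and 9 translation and distortion factors each, one pair per aligned source word. Real tables would be estimated from paired sentences.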

11 How does the statistical translator learn these various models?
From data, of course! E.g., a massive amount of paired source/target sentences from UN translations.
How does the statistical translator search the database for the highest-probability source sentence?
See paper.

