Stephan Vogel - Machine Translation: Distortion Model (Spring Semester 2011)


Slide 1: Machine Translation: Distortion Model
Stephan Vogel, Spring Semester 2011

Slide 2: Recap: Distortion Models (DM) in Word Alignment Models
- HMM alignment: jump model
- Can be conditioned on word classes
- Balance between data and parameters in the model
- Larger corpora -> richer models
[Figure: alignment grid with axes F and E; jump labels 3, 0, 2]

Slide 3: Distance Model
- The decoder typically generates the target sequence sequentially, while jumping back and forth on the source sentence
- Simplest reordering model
- Cost of a reordering depends only on the distance of the reordering
- Distribution can be estimated from the alignment
- Or just a Gaussian with mean 1
- Or $-\log p(a_j \mid a_{j-1}, I) \propto |a_j - a_{j-1}|$, i.e. reordering cost proportional to distance
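The two closed-form options on this slide can be sketched in a few lines of C++. This is only an illustration: the function names (`linearJumpCost`, `gaussianJumpLogProb`, `totalLinearCost`), the `weight` parameter, and the choice of `sigma` are assumptions of this sketch and do not come from the lecture or from any decoder.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Linear distance cost for one jump from source position prevPos to curPos.
// Hypothetical helper: the cost grows proportionally to the jump distance.
double linearJumpCost(int prevPos, int curPos, double weight = 1.0) {
    return weight * std::abs(curPos - prevPos);
}

// Log-probability of a jump under a Gaussian with mean 1 (the monotone step).
// sigma is a free parameter of this sketch.
double gaussianJumpLogProb(int prevPos, int curPos, double sigma) {
    const double pi = 3.14159265358979323846;
    const double z = ((curPos - prevPos) - 1.0) / sigma;
    return -0.5 * z * z - std::log(sigma * std::sqrt(2.0 * pi));
}

// Sum of linear jump costs along the sequence of visited source positions.
double totalLinearCost(const std::vector<int>& positions) {
    double cost = 0.0;
    for (std::size_t i = 1; i < positions.size(); ++i) {
        cost += linearJumpCost(positions[i - 1], positions[i]);
    }
    return cost;
}
```

With these definitions, visiting source positions 1, 2, 3, 4 in order costs 3, while the reordered visit 1, 4, 2, 3 costs 6, so the model penalizes the reordering only through the length of its jumps.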

Slide 4: Lexicalized Reordering Models
- Instead of conditioning on classes, condition on actual words
- Different possibilities:
- Condition on source words vs. target words
- Condition on words at the start of the jump (out-bound) vs. words at the landing point (in-bound)
[Figure: two diagrams, each with axes E and F]

Slide 5: Block Distortion Model
- Given the current block, look at the links at the corners
- Top: how did I come from the previous phrase?
- Bottom: how do I continue to the next phrase?
[Figure: alignment grid with axes F and E showing the current block with its four corners (Left Top, Right Top, Left Bottom, Right Bottom), the previous block, and the next block]

Slide 6: Block Distortion Model
- Top-left: prev-to-current = monotone
[Figure: axes F and E; previous block linked to the left-top corner of the current block]

Slide 7: Block Distortion Model
- Top-right: prev-to-current = swap
[Figure: axes F and E; previous block linked to the right-top corner of the current block]

Slide 8: Block Distortion Model
- Neither top-left nor top-right: prev-to-current = disjoint
[Figure: axes F and E; current block and previous block]

Slide 9: Block Distortion Model
- Bottom-right: current-to-next = monotone
[Figure: axes F and E; current block and next block]

Slide 10: Block Distortion Model
- Bottom-left: current-to-next = swap
[Figure: axes F and E; current block and next block]

Slide 11: Block Distortion Model
- Neither bottom-left nor bottom-right: current-to-next = disjoint
[Figure: axes F and E; current block and next block]

Slide 12: Moses Code

    // orientation to previous E
    bool connectedLeftTop  = isAligned( sentence, startF-1, startE-1 );
    bool connectedRightTop = isAligned( sentence, endF+1,   startE-1 );
    if ( connectedLeftTop && !connectedRightTop)
      extractFileOrientation << "mono";
    else if (!connectedLeftTop && connectedRightTop)
      extractFileOrientation << "swap";
    else
      extractFileOrientation << "other";

    // orientation to following E
    bool connectedLeftBottom  = isAligned( sentence, startF-1, endE+1 );
    bool connectedRightBottom = isAligned( sentence, endF+1,   endE+1 );
    if ( connectedLeftBottom && !connectedRightBottom)
      extractFileOrientation << " swap";
    else if (!connectedLeftBottom && connectedRightBottom)
      extractFileOrientation << " mono";
    else
      extractFileOrientation << " other";
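To try the corner test outside of Moses, the following self-contained sketch mirrors the logic of the snippet above. The `AlignedSentence` struct, this version of `isAligned`, and the two `orientationTo...` functions are hypothetical stand-ins written for this example; in particular, treating the positions just before the sentence start and just after the sentence end as aligned is an assumption of the sketch, not something stated on the slide.

```cpp
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for an aligned sentence pair:
// lenF/lenE are the sentence lengths, links holds (f, e) alignment points.
struct AlignedSentence {
    int lenF;
    int lenE;
    std::set<std::pair<int, int>> links;
};

// True if source position f is linked to target position e.
// Assumption of this sketch: virtual links exist at (-1, -1) and (lenF, lenE).
bool isAligned(const AlignedSentence& s, int f, int e) {
    if (f == -1 && e == -1) return true;
    if (f == s.lenF && e == s.lenE) return true;
    if (f < 0 || e < 0 || f >= s.lenF || e >= s.lenE) return false;
    return s.links.count(std::make_pair(f, e)) > 0;
}

// Orientation of the block [startF, endF] x [startE, ...] to the previous E.
std::string orientationToPrevious(const AlignedSentence& s,
                                  int startF, int endF, int startE) {
    const bool leftTop = isAligned(s, startF - 1, startE - 1);
    const bool rightTop = isAligned(s, endF + 1, startE - 1);
    if (leftTop && !rightTop) return "mono";
    if (!leftTop && rightTop) return "swap";
    return "other";
}

// Orientation of the block [startF, endF] x [..., endE] to the following E.
std::string orientationToNext(const AlignedSentence& s,
                              int startF, int endF, int endE) {
    const bool leftBottom = isAligned(s, startF - 1, endE + 1);
    const bool rightBottom = isAligned(s, endF + 1, endE + 1);
    if (leftBottom && !rightBottom) return "swap";
    if (!leftBottom && rightBottom) return "mono";
    return "other";
}
```

For a three-word sentence pair aligned along the diagonal, the middle one-word block gets "mono" in both directions. For a two-word pair with crossed links (0,1) and (1,0), the block at F=0, E=1 gets "swap" with respect to the previous E.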

Slide 13: Block Distortion Model
- For each phrase pair, 6 counts: 2 groups of 3
- From previous: monotone, swap, other
- To next: monotone, swap, other
- Normalize within each group
- We do not model p( orientation | phrase_pair_1, phrase_pair_2 )
- Many overlapping and embedded blocks
- Would be too sparse
- We model p( orientation | phrase_pair, entering ) and p( orientation | phrase_pair, leaving )
- I.e. not really looking at the previous block, but only at the alignment link
- For each entry in the phrase table we have an entry in the distortion model

Slide 14: Distortion Model Table

acuerdo con el lugar de ||| according to the place of ||| 0.14286 0.14286 0.71429 0.71429 0.14286 0.14286
acuerdo con nuestra información ||| according to our information ||| 0.14286 0.14286 0.71429 0.71429 0.14286 0.14286
acuerdo de pesca con Marruecos ||| fisheries agreement with Morocco ||| 0.92982 0.01754 0.05263 0.78947 0.01754 0.19298
acuerdo entre Israel y ||| agreement ||| 0.20000 0.20000 0.60000 0.20000 0.20000 0.60000
acuerdo no porque sea bueno, ||| agreement not because it is good, ||| 0.60000 0.20000 0.20000 0.60000 0.20000 0.20000
acuerdo sobre este punto ||| agreed on ||| 0.20000 0.20000 0.60000 0.20000 0.20000 0.60000
acuerdos a largo plazo se iniciaron en ||| long-term arrangements began in ||| 0.60000 0.20000 0.20000 0.60000 0.20000 0.20000
acuerdos globales, especialmente ||| global agreements - primarily ||| 0.20000 0.20000 0.60000 0.60000 0.20000 0.20000

- Many entries are 0.6 0.2 …
- Phrase pair seen only once
- Simple smoothing

Slide 15: Distance-based ITG Reordering Model
- The simple ITG model had a very weak reordering model
- Condition it on the size of the blocks (subtrees)
- Condition on distance (e.g. taken from the HMM alignment)
[Figure: alignment grid with axes F and E]

Slide 16: Summary
- Distortion models in word alignment models
- Decoders work on phrases -> distortion models for phrases
- In Moses: block reordering (also called lexicalized)
- Conditioned on phrase pair
- Monotone, swap, disjoint
- Alternatives
- Based on words at the boundaries
- Inbound/outbound
- Easy to have a lexicalized distortion model for ITG

