Machine Translation: Distortion Model
Stephan Vogel, Spring Semester 2011


Recap: DM in Word Alignment Models
- HMM alignment: jump model
- Can be conditioned on word classes
- Balance between data and parameters in model
- Larger corpora -> richer models
[Figure: alignment matrix, axes F and E]

Distance Model
- The decoder typically generates the target sequence sequentially, while jumping back and forth on the source sentence
- Simplest reordering model
- The cost of a reordering depends only on the distance of the reordering
- The distribution can be estimated from the alignment
- Or just a Gaussian with mean 1
- Or log p( a_j | a_j-1, I ) proportional to -| a_j - a_j-1 |, i.e. reordering cost proportional to distance
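The two closed-form options on this slide can be sketched as follows. This is a minimal illustration, not decoder code: the function names, the `weight` and `sigma` parameters, and their default values are assumptions introduced here, and the Gaussian score is left unnormalized.

```cpp
#include <cassert>
#include <cstdlib>

// Jump width between two consecutively visited source positions
// (hypothetical helper; positions are plain integer indices).
int jump(int prevPos, int curPos) { return curPos - prevPos; }

// Linear penalty: the log-score drops in proportion to the jump distance.
// 'weight' is an assumed scaling factor, not a value from the slides.
double linearLogScore(int prevPos, int curPos, double weight = 1.0) {
    return -weight * std::abs(jump(prevPos, curPos));
}

// Unnormalized Gaussian log-score centered on a jump of +1, i.e. the
// monotone step to the next source word. 'sigma' is an assumed spread.
double gaussianLogScore(int prevPos, int curPos, double sigma = 2.0) {
    assert(sigma > 0.0);
    const double d = jump(prevPos, curPos) - 1.0;
    return -(d * d) / (2.0 * sigma * sigma);
}
```

Under the Gaussian variant a monotone step scores 0 and the score falls off quadratically with the deviation from it, whereas the linear variant penalizes every position moved by the same amount.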

Lexicalized Reordering Models
- Instead of conditioning on classes, condition on actual words
- Different possibilities:
  - Condition on source words vs target words
  - Condition on words at start of jump (out-bound) vs words at landing point (in-bound)
[Figure: two alignment matrices, axes F and E]
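As a rough sketch of the out-bound versus in-bound distinction, the following collects jump counts conditioned on a source word. The function name `countJumps`, the `visitOrder` representation (the source positions in the order they are covered), and the choice to condition on source rather than target words are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Key: (conditioning word, jump width). Value: how often it was observed.
using JumpCounts = std::map<std::pair<std::string, int>, int>;

// visitOrder[j] is the source position covered at step j (assumed input).
// outbound == true  -> condition on the word where the jump starts
// outbound == false -> condition on the word where the jump lands
JumpCounts countJumps(const std::vector<std::string>& source,
                      const std::vector<int>& visitOrder,
                      bool outbound) {
    JumpCounts counts;
    for (std::size_t j = 1; j < visitOrder.size(); ++j) {
        const int from = visitOrder[j - 1];
        const int to = visitOrder[j];
        const std::string& word = outbound ? source[from] : source[to];
        ++counts[std::make_pair(word, to - from)];
    }
    return counts;
}
```

Normalizing these counts per word would give a word-conditioned jump distribution; the same loop with target words as keys would give the target-side variant.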

Block Distortion Model
- Given the current block, look at the links at the corners
- Top: how did I come from the previous phrase?
- Bottom: how do I continue to the next phrase?
[Figure: alignment matrix, axes F and E, showing the previous block, current block and next block, with the current block's corners labeled Left Top, Right Top, Left Bottom and Right Bottom]

Block Distortion Model: previous block to current block
- Top-Left: prev-to-current = monotone
- Top-Right: prev-to-current = swap
- Neither top-left nor top-right: prev-to-current = disjoint
[Figures: alignment matrices, axes F and E, showing the previous block relative to the current block for each case]

Block Distortion Model: current block to next block
- Bottom-Right: current-to-next = monotone
- Bottom-Left: current-to-next = swap
- Neither bottom-left nor bottom-right: current-to-next = disjoint
[Figures: alignment matrices, axes F and E, showing the next block relative to the current block for each case]

Moses Code

    // orientation to previous E
    bool connectedLeftTop  = isAligned( sentence, startF-1, startE-1 );
    bool connectedRightTop = isAligned( sentence, endF+1,   startE-1 );
    if ( connectedLeftTop && !connectedRightTop)
      extractFileOrientation << "mono";
    else if (!connectedLeftTop && connectedRightTop)
      extractFileOrientation << "swap";
    else
      extractFileOrientation << "other";

    // orientation to following E
    bool connectedLeftBottom  = isAligned( sentence, startF-1, endE+1 );
    bool connectedRightBottom = isAligned( sentence, endF+1,   endE+1 );
    if ( connectedLeftBottom && !connectedRightBottom)
      extractFileOrientation << " swap";
    else if (!connectedLeftBottom && connectedRightBottom)
      extractFileOrientation << " mono";
    else
      extractFileOrientation << " other";
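The excerpt relies on a `sentence` object and an `isAligned` helper that the slide does not show. The following self-contained sketch fills them in so the corner test can be tried on a toy alignment. The `AlignedSentence` struct and the boundary handling inside `isAligned` (treating the virtual corners just before the sentence start and just after the sentence end as aligned) are assumptions for illustration, not the actual Moses definitions.

```cpp
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for the sentence pair: sentence lengths plus a set
// of 0-based alignment links (source position f, target position e).
struct AlignedSentence {
    int lenF;
    int lenE;
    std::set<std::pair<int, int>> links;
};

// Assumption: the virtual corner before the first words and the one after
// the last words count as aligned; any other out-of-range cell does not.
bool isAligned(const AlignedSentence& s, int f, int e) {
    if (f == -1 && e == -1) return true;
    if (f == s.lenF && e == s.lenE) return true;
    if (f < 0 || e < 0 || f >= s.lenF || e >= s.lenE) return false;
    return s.links.count(std::make_pair(f, e)) > 0;
}

// Orientation of the block with respect to the previous target word.
std::string orientationToPrevious(const AlignedSentence& s,
                                  int startF, int endF, int startE) {
    const bool leftTop = isAligned(s, startF - 1, startE - 1);
    const bool rightTop = isAligned(s, endF + 1, startE - 1);
    if (leftTop && !rightTop) return "mono";
    if (!leftTop && rightTop) return "swap";
    return "other";
}

// Orientation of the block with respect to the following target word.
std::string orientationToNext(const AlignedSentence& s,
                              int startF, int endF, int endE) {
    const bool leftBottom = isAligned(s, startF - 1, endE + 1);
    const bool rightBottom = isAligned(s, endF + 1, endE + 1);
    if (leftBottom && !rightBottom) return "swap";
    if (!leftBottom && rightBottom) return "mono";
    return "other";
}
```

Returning strings instead of writing to `extractFileOrientation` keeps the sketch independent of any output stream; the decision logic mirrors the excerpt.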

Block Distortion Model
- For each phrase pair, 6 counts: 2 groups of 3
  - From previous: monotone, swap, other
  - To next: monotone, swap, other
- Normalize within each group
- We do not model p( orientation | phrase_pair_1, phrase_pair_2 )
  - Many overlapping and embedded blocks
  - Would be too sparse
- We model p( orientation | phrase_pair, entering ) and p( orientation | phrase_pair, leaving )
  - I.e. not really looking at the previous block, but only at the alignment link
- For each entry in the phrase table we have an entry in the distortion model
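The bookkeeping described above (six counts per phrase pair, normalized within each group of three) might look like the sketch below. The type names, the `observe` and `normalize` functions, and the use of a single string as the phrase-pair key are assumptions made for this illustration.

```cpp
#include <array>
#include <map>
#include <string>

enum Orientation { Mono = 0, Swap = 1, Other = 2 };

// Two groups of three counts for one phrase pair.
struct OrientationCounts {
    std::array<double, 3> fromPrevious{};  // entering the block
    std::array<double, 3> toNext{};        // leaving the block
};

// Keyed by a phrase-pair string such as "source ||| target" (assumed key).
using DistortionCounts = std::map<std::string, OrientationCounts>;

// Record one extracted instance of a phrase pair with its two orientations.
void observe(DistortionCounts& table, const std::string& phrasePair,
             Orientation entering, Orientation leaving) {
    OrientationCounts& c = table[phrasePair];
    c.fromPrevious[entering] += 1.0;
    c.toNext[leaving] += 1.0;
}

// Relative frequencies within one group; all zeros if nothing was counted.
std::array<double, 3> normalize(const std::array<double, 3>& counts) {
    std::array<double, 3> p{};
    const double total = counts[0] + counts[1] + counts[2];
    if (total > 0.0) {
        for (int i = 0; i < 3; ++i) p[i] = counts[i] / total;
    }
    return p;
}
```

Because each group is normalized on its own, the entering and leaving distributions of a phrase pair each sum to one, and neither depends on the identity of the neighbouring block.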

Distortion Model Table

acuerdo con el lugar de ||| according to the place of ||| 0.14286 0.14286 0.71429 0.71429 0.14286 0.14286
acuerdo con nuestra información ||| according to our information ||| 0.14286 0.14286 0.71429 0.71429 0.14286 0.14286
acuerdo de pesca con Marruecos ||| fisheries agreement with Morocco ||| 0.92982 0.01754 0.05263 0.78947 0.01754 0.19298
acuerdo entre Israel y ||| agreement ||| 0.20000 0.20000 0.60000 0.20000 0.20000 0.60000
acuerdo no porque sea bueno, ||| agreement not because it is good, ||| 0.60000 0.20000 0.20000 0.60000 0.20000 0.20000
acuerdo sobre este punto ||| agreed on ||| 0.20000 0.20000 0.60000 0.20000 0.20000 0.60000
acuerdos a largo plazo se iniciaron en ||| long-term arrangements began in ||| 0.60000 0.20000 0.20000 0.60000 0.20000 0.20000
acuerdos globales, especialmente ||| global agreements - primarily ||| 0.20000 0.20000 0.60000 0.60000 0.20000 0.20000

- Many entries 0.6 0.2 …
  - Phrase pair seen only once
  - Simple smoothing
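The slide does not say which smoothing is used. The values in the table are consistent with adding 0.5 to each of the three counts in a group before normalizing: a phrase pair seen once gives (1 + 0.5) / (1 + 1.5) = 0.6 for the observed orientation and 0.5 / 2.5 = 0.2 for the other two. The sketch below implements that additive scheme; the constant 0.5 and the function name are assumptions inferred from the numbers, not something stated in the source.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Additive smoothing over one group of three orientation counts.
// 'alpha' = 0.5 is an assumption that happens to reproduce the table values.
std::array<double, 3> smoothedProbabilities(const std::array<double, 3>& counts,
                                            double alpha = 0.5) {
    assert(alpha > 0.0);
    const double total = counts[0] + counts[1] + counts[2] + 3.0 * alpha;
    std::array<double, 3> p{};
    for (int i = 0; i < 3; ++i) {
        p[i] = (counts[i] + alpha) / total;
    }
    return p;
}
```

With counts of 26, 0 and 1 the same formula gives 26.5 / 28.5, 0.5 / 28.5 and 1.5 / 28.5, which round to the 0.92982, 0.01754 and 0.05263 shown for the "acuerdo de pesca con Marruecos" entry.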

Distance-based ITG Reordering Model
- The simple ITG model had a very weak reordering model
- Condition it on the size of the blocks (subtrees)
- Condition on distance (e.g. taken from the HMM alignment)
[Figure: alignment matrix, axes F and E]

Summary
- Distortion models in word alignment models
- Decoders work on phrases -> distortion models on phrases
- In Moses: block reordering (also called lexicalized)
  - Conditioned on phrase pair
  - Monotone, swap, disjoint
- Alternatives
  - Based on words at the boundaries
  - Inbound/outbound
- Easy to have a lexicalized distortion model for ITG
