Automatic Rule Learning for Resource-Limited Machine Translation Alon Lavie, Katharina Probst, Erik Peterson, Jaime Carbonell, Lori Levin, Ralf Brown Language.

Automatic Rule Learning for Resource-Limited Machine Translation Alon Lavie, Katharina Probst, Erik Peterson, Jaime Carbonell, Lori Levin, Ralf Brown Language Technologies Institute Carnegie Mellon University

October 11, 2002AMTA 20022 Why Machine Translation for Minority and Indigenous Languages? Commercial MT economically feasible for only a handful of major languages with large resources (corpora, human developers) Is there hope for MT for languages with limited resources? Benefits include: –Better government access to indigenous communities (Epidemics, crop failures, etc.) –Better indigenous communities participation in information-rich activities (health care, education, government) without giving up their languages. –Language preservation –Civilian and military applications (disaster relief)

October 11, 2002AMTA 20023 MT for Minority and Indigenous Languages: Challenges Minimal amount of parallel text Possibly competing standards for orthography/spelling Often relatively few trained linguists Access to native informants possible Need to minimize development time and cost

October 11, 2002AMTA 20024 AVENUE Partners LanguageCountryInstitutions Mapudungun (in place) Chile Universidad de la Frontera, Institute for Indigenous Studies, Ministry of Education Quechua (discussion) Peru Ministry of Education Iñupiaq (discussion) US (Alaska) Ilisagvik College, Barrow school district, Alaska Rural Systemic Initiative, Trans-Arctic and Antarctic Institute, Alaska Native Language Center Siona (discussion) Colombia OAS-CICAD, Plante, Department of the Interior

October 11, 2002AMTA 20025 AVENUE: Two Technical Approaches Generalized EBMT Parallel text 50K- 2MB (uncontrolled corpus) Rapid implementation Proven for major L’s with reduced data Transfer-rule learning Elicitation (controlled) corpus to extract grammatical properties Seeded version- space learning

October 11, 2002AMTA 20026 AVENUE Architecture User Learning Module Elicitation Process SVS Learning Process Transfer Rules Run-Time Module SL Input SL Parser Transfer Engine TL Generator EBMT Engine Unifier Module TL Output

October 11, 2002AMTA 20027 Learning Transfer-Rules for Languages with Limited Resources Rationale: –Large bilingual corpora not available –Bilingual native informant(s) can translate and align a small pre-designed elicitation corpus, using elicitation tool –Elicitation corpus designed to be typologically comprehensive and compositional –Transfer-rule engine and new learning approach support acquisition of generalized transfer-rules from the data

October 11, 2002AMTA 20028 Overview of Learning Approach 1.Elicitation Corpus: Bilingual data is acquired from a specifically engineered corpus 2.Feature Detection: Gather information about features and their values in the minority language 3.Rule Learning: Infer syntactic transfer rules by first guessing and then iteratively refining

October 11, 2002AMTA 20029 The Elicitation Corpus Translated, aligned by bilingual informant Corpus consists of linguistically diverse constructions Based on elicitation and documentation work of field linguists (e.g. Comrie 1977, Bouquiaux 1992) Organized compositionally: elicit simple structures first, then use them as building blocks Goal: minimize size, maximize coverage

October 11, 2002AMTA 200210 The Transfer Engine Analysis Source text is parsed into its grammatical structure. Determines transfer application ordering. Example: 他看书。 (he read book) S NP VP N V NP 他看书 Transfer A target language tree is created by reordering, insertion, and deletion. S NP VP N V NP he read DET N a book Article “a” is inserted into object NP. Source words translated with transfer lexicon. Generation Target language constraints are checked and final translation produced. E.g. “reads” is chosen over “read” to agree with “he”. Final translation: “He reads a book”

October 11, 2002AMTA 200211 Transfer Rule Formalism Type information Part-of-speech/constituent information Alignments x-side constraints y-side constraints xy-constraints, e.g. ((Y1 AGR) = (X1 AGR)) ; SL: the man, TL: der Mann NP::NP [DET N] -> [DET N] ( (X1::Y1) (X2::Y2) ((X1 AGR) = *3-SING) ((X1 DEF = *DEF) ((X2 AGR) = *3-SING) ((X2 COUNT) = +) ((Y1 AGR) = *3-SING) ((Y1 DEF) = *DEF) ((Y2 AGR) = *3-SING) ((Y2 GENDER) = (Y1 GENDER)) )

October 11, 2002AMTA 200212 Transfer Rule Formalism (II) Value constraints Agreement constraints ;SL: the man, TL: der Mann NP::NP [DET N] -> [DET N] ( (X1::Y1) (X2::Y2) ((X1 AGR) = *3-SING) ((X1 DEF = *DEF) ((X2 AGR) = *3-SING) ((X2 COUNT) = +) ((Y1 AGR) = *3-SING) ((Y1 DEF) = *DEF) ((Y2 AGR) = *3-SING) ((Y2 GENDER) = (Y1 GENDER)) )

October 11, 2002AMTA 200213 Rule Learning - Overview Goal: Acquire Syntactic Transfer Rules Use available knowledge from the source side (grammatical structure) Three steps: 1.Flat Seed Generation: first guesses at transfer rules; no syntactic structure 2.Compositionality: use previously learned rules to add structure 3.Seeded Version Space Learning: refine rules by generalizing with validation

October 11, 2002AMTA 200214 Flat Seed Generation Create a transfer rule that is specific to the sentence pair, but abstracted to the POS level. No syntactic structure. ElementSource SL POS sequencef-structure TL POS sequenceTL dictionary, aligned SL words Type informationcorpus, same on SL and TL Alignmentsinformant x-side constraintsf-structure y-side constraintsTL dictionary, aligned SL words (list of projecting features)

October 11, 2002AMTA 200215 Flat Seed Generation - Example The highly qualified applicant did not accept the offer. Der äußerst qualifizierte Bewerber nahm das Angebot nicht an. ((1,1),(2,2),(3,3),(4,4),(6,8),(7,5),(7,9),(8,6),(9,7)) S::S [det adv adj n aux neg v det n] -> [det adv adj n v det n neg vpart] (;;alignments: (x1:y1)(x2::y2)(x3::y3)(x4::y4)(x6::y8)(x7::y5)(x7::y9)(x8::y6)(x9::y7)) ;;constraints: ((x1 def) = *+) ((x4 agr) = *3-sing) ((x5 tense) = *past) …. ((y1 def) = *+) ((y3 case) = *nom) ((y4 agr) = *3-sing) …. )

October 11, 2002AMTA 200216 Compositionality - Overview Traverse the c-structure of the English sentence, add compositional structure for translatable chunks Adjust constituent sequences, alignments Remove unnecessary constraints, i.e. those that are contained in the lower-level rule Adjust constraints: use f-structure of correct translation vs. f-structure of incorrect translations to introduce context constraints

October 11, 2002AMTA 200217 Compositionality - Example S::S [det adv adj n aux neg v det n] -> [det adv adj n v det n neg vpart] (;;alignments: (x1:y1)(x2::y2)(x3::y3)(x4::y4)(x6::y8)(x7::y5)(x7::y9)(x8::y6)(x9::y7)) ;;constraints: ((x1 def) = *+) ((x4 agr) = *3-sing) ((x5 tense) = *past) …. ((y1 def) = *+) ((y3 case) = *nom) ((y4 agr) = *3-sing) …. ) S::S [NP aux neg v det n] -> [NP v det n neg vpart] (;;alignments: (x1::y1)(x3::y5)(x4::y2)(x4::y6)(x5::y3)(x6::y4) ;;constraints: ((x2 tense) = *past) …. ((y1 def) = *+) ((y1 case) = *nom) …. ) NP::NP [det AJDP n] -> [det ADJP n] ((x1::y1)… ((y3 agr) = *3-sing) ((x3 agr = *3-sing) ….)

October 11, 2002AMTA 200218 Seeded Version Space Learning: Overview Goal: further generalize the acquired rules Methodology: –Preserve general structural transfer –Consider relaxing specific feature constraints Seed rules are grouped into clusters of similar transfer structure (type, constituent sequences, alignments) Each cluster forms a version space: a partially ordered hypothesis space with a specific and a general boundary The seed rules in a group form the specific boundary of a version space The general boundary is the (implicit) transfer rule with the same type, constituent sequences, and alignments, but no feature constraints

October 11, 2002AMTA 200219 Seeded Version Space Learning NP v det nNP VP … 1.Group seed rules into version spaces as above. 2.Make use of partial order of rules in version space. Partial order is defined via the f-structures satisfying the constraints. 3.Generalize in the space by repeated merging of rules: 1.Deletion of constraint 2.Moving value constraints to agreement constraints, e.g. ((x1 num) = *pl), ((x3 num) = *pl)  ((x1 num) = (x3 num) 4. Check translation power of generalized rules against sentence pairs

October 11, 2002AMTA 200220 Seeded Version Space Learning: Example S::S [NP aux neg v det n] -> [NP v det n neg vpart] (;;alignments: (x1::y1)(x3::y5)(x4::y2)(x4::y6)(x5::y3)(x6::y4) ;;constraints: ((x2 tense) = *past) …. ((y1 def) = *+) ((y1 case) = *nom) ((y1 agr) = *3-sing) … ) ((y3 agr) = *3-sing) ((y4 agr) = *3-sing)… ) S::S [NP aux neg v det n] -> [NP v det n neg vpart] (;;alignments: (x1::y1)(x3::y5)(x4::y2)(x4::y6)(x5::y3)(x6::y4) ;;constraints: ((x2 tense) = *past) … ((y1 def) = *+) ((y1 case) = *nom) ((y1 agr) = *3-plu) … ((y3 agr) = *3-plu) ((y4 agr) = *3-plu)… ) S::S [NP aux neg v det n] -> [NP n det n neg vpart] ( ;;alignments: (x1::y1)(x3::y5) (x4::y2)(x4::y6) (x5::y3)(x6::y4) ;;constraints: ((x2 tense) = *past) … ((y1 def) = *+) ((y1 case) = *nom) ((y4 agr) = (y3 agr)) … )

October 11, 2002AMTA 200221 Preliminary Evaluation English to German Corpus of 141 ADJPs, simple NPs and sentences 10-fold cross-validation experiment Goals: –Do we learn useful transfer rules? –Does Compositionality improve generalization? –Does VS-learning improve generalization?

October 11, 2002AMTA 200222 Summary of Results Average translation accuracy on cross- validation test set was 62% Without VS-learning: 43% Without Compositionality: 57% Average number of VSs: 24 Average number of sents per VS: 3.8 Average number of merges per VS: 1.6 Percent of compositional rules: 34%

October 11, 2002AMTA 200223 Conclusions New paradigm for learning transfer rules from pre-designed elicitation corpus Geared toward languages with very limited resources Preliminary experiments validate approach: compositionality and VS- learning improve generalization

October 11, 2002AMTA 200224 Future Work 1.Larger, more diverse elicitation corpus 2.Additional languages (Mapudungun…) 3.Less information on TL side 4.Reverse translation direction 5.Refine the various algorithms: Operators for VS generalization Generalization VS search Layers for compositionality 6.User interactive verification

October 11, 2002AMTA 200225 Seeded Version Space Learning: Generalization The partial order of the version space: Definition: A transfer rule tr 1 is strictly more general than another transfer rule tr 2 if all f- structures that are satisfied by tr 2 are also satisfied by tr 1. Generalize rules by merging them: –Deletion of constraint –Raising two value constraints to an agreement constraint, e.g. ((x1 num) = *pl), ((x3 num) = *pl)  ((x1 num) = (x3 num))

October 11, 2002AMTA 200226 Seeded Version Space Learning: Merging Two Rules Merging algorithm proceeds in three steps. To merge tr 1 and tr 2 into tr merged : 1.Copy all constraints that are both in tr 1 and tr 2 into tr merged 2.Consider tr 1 and tr 2 separately. For the remaining constraints in tr 1 and tr 2, perform all possible instances of raising value constraints to agreement constraints. 3.Repeat step 1.

October 11, 2002AMTA 200227 Seeded Version Space Learning: The Search The Seeded Version Space algorithm itself is the repeated generalization of rules by merging A merge is successful if the set of sentences that can correctly be translated with the merged rule is a superset of the union of sets that can be translated with the unmerged rules, i.e. check power of rule Merge until no more successful merges

Automatic Rule Learning for Resource-Limited Machine Translation Alon Lavie, Katharina Probst, Erik Peterson, Jaime Carbonell, Lori Levin, Ralf Brown Language.

Similar presentations

Presentation on theme: "Automatic Rule Learning for Resource-Limited Machine Translation Alon Lavie, Katharina Probst, Erik Peterson, Jaime Carbonell, Lori Levin, Ralf Brown Language."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Automatic Rule Learning for Resource-Limited Machine Translation Alon Lavie, Katharina Probst, Erik Peterson, Jaime Carbonell, Lori Levin, Ralf Brown Language.

Similar presentations

Presentation on theme: "Automatic Rule Learning for Resource-Limited Machine Translation Alon Lavie, Katharina Probst, Erik Peterson, Jaime Carbonell, Lori Levin, Ralf Brown Language."— Presentation transcript:

Similar presentations

About project

Feedback