Semi-Automatic Learning of Transfer Rules for Machine Translation of Low-Density Languages Katharina Probst April 5, 2002.

1 Semi-Automatic Learning of Transfer Rules for Machine Translation of Low-Density Languages Katharina Probst April 5, 2002

2 Overview of the talk
- Introduction and Motivation
- Overview of the AVENUE project
- Elicitation of bilingual data
- Rule Learning
  - Seed Generation
  - Seeded Version Space Learning
- Conclusions and Future Work

4 Introduction and Motivation
- Basic idea: opening up machine translation to minority languages
- Resources for minority languages are scarce:
  - Bilingual text
  - Monolingual text
  - Target language grammar
- Because resources are scarce, statistical and example-based methods will likely not perform as well
- Our approach:
  - A system that elicits the necessary information about the target language from a bilingual informant
  - The elicited information is used together with any other available target language information to learn syntactic transfer rules

5 System overview
[Diagram labels: User; Learning Module (Elicitation Process, SVS Learning Process, Transfer Rules); Run-Time Module (SL Input, SL Parser, Transfer Engine, TL Generator, EBMT Engine, Unifier Module, TL Output)]

7 Elicitation
- Elicitation is the process of presenting a bilingual speaker with sets of sentences.
- The user translates the sentences and specifies how the words align.
- The elicitation process serves multiple purposes:
  - Collection of data
  - Feature detection

8 Feature Detection
- Feature detection is the process by which the learning module answers questions such as “Does the target language mark number on nouns?”
- The elicitation corpus is organized in minimal pairs, i.e. pairs of sentences that differ in only one feature. For example:
  1. You (John) are falling. [2nd person m, subj, present tense]
  2. You (Mary) are falling. [2nd person f, subj, present tense]
  3. You (Mary) fell. [2nd person f, subj, past tense]
- Sentences 1 and 2 form a minimal pair, as do sentences 2 and 3.
- By comparing the translations of “you” in sentences 1 and 2, the system gets an indication of whether gender is marked on the second-person pronoun.
- The results of feature detection will be used to guide the system through the elicitation corpus: parts of the corpus can be eliminated based on implicational universals.
- The results will also be used by the rule learning module.
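The minimal-pair comparison on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Elicited` record, its field names, and the target-language tokens ("ta", "ti", "fal", "fel") are invented placeholders, not data or structures from the AVENUE system.

```python
from dataclasses import dataclass

@dataclass
class Elicited:
    """One elicited sentence: hypothetical record layout for illustration."""
    source: str
    features: dict        # feature name -> value, as annotated in the corpus
    target_tokens: tuple  # informant's translation, tokenized
    aligned: dict         # source word -> index into target_tokens

def differing_feature(a, b):
    """Return the single feature in which a and b differ, or None if
    the two sentences do not form a minimal pair."""
    keys = set(a.features) | set(b.features)
    diff = [k for k in keys if a.features.get(k) != b.features.get(k)]
    return diff[0] if len(diff) == 1 else None

def evidence_of_marking(a, b, source_word):
    """For a minimal pair, report (feature, True/False): whether the target
    word aligned to source_word changes when only that feature changes."""
    feature = differing_feature(a, b)
    if feature is None:
        return None
    word_a = a.target_tokens[a.aligned[source_word]]
    word_b = b.target_tokens[b.aligned[source_word]]
    return feature, word_a != word_b

# Invented placeholder translations; no real language is implied.
s1 = Elicited("You (John) are falling.",
              {"person": 2, "gender": "m", "tense": "present"},
              ("ta", "fal"), {"you": 0})
s2 = Elicited("You (Mary) are falling.",
              {"person": 2, "gender": "f", "tense": "present"},
              ("ti", "fal"), {"you": 0})
s3 = Elicited("You (Mary) fell.",
              {"person": 2, "gender": "f", "tense": "past"},
              ("ti", "fel"), {"you": 0})
```

With these placeholder translations, comparing `s1` and `s2` suggests the word for "you" changes with gender, while `s1` and `s3` are rejected because they differ in two features.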

9 More on the elicitation corpus
Eliciting data from bilingual informants entails a number of challenges:
1. The bilingual informant him/herself
2. Morphology and the lexicon
3. Learning grammatical features
4. Compositional elicitation
5. Elicitation of non-compositional data
6. Verb subcategorization
7. Alignment issues
8. Bias towards the source language

11 Rule Learning in the AVENUE project - Introduction
- The goal is to infer syntactic transfer rules semi-automatically, i.e. with the help of the user.
- Rule learning can be divided into two main steps:
  - Seed Generation: the system produces an initial “guess” at a transfer rule based on only one sentence. The resulting rule is quite specific to the input sentence.
  - Version Space Learning: the system takes the seed rules and generalizes them.

12 Transfer rule formalism
A transfer rule (TR) consists of the following components:
1. The source language sentence and target language sentence that the TR was produced from
2. Word alignments
3. Phrase information such as NP, S, …
4. Part-of-speech sequences for the source and target language
5. X-side constraints, i.e. constraints on the source language. These are used for parsing.
6. Y-side constraints, i.e. constraints on the target language. These are used for generation.
7. XY-constraints, i.e. constraints that transfer features from the source to the target language. These are used for transfer.
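As a rough illustration, the seven components could be held in a single record like the one below. The class name, field names, and the choice of plain strings for constraints are assumptions made for this sketch; the `(X… agr)` notation follows the examples used later in the talk, while the `Y1`/`Y2` names and the German example pair are illustrative stand-ins.

```python
from dataclasses import dataclass, field

@dataclass
class TransferRule:
    """Hypothetical container for the seven components of a transfer rule."""
    sl_sentence: str   # 1. source sentence the rule was produced from
    tl_sentence: str   # 1. target sentence the rule was produced from
    alignments: list   # 2. (source position, target position) pairs
    phrase_type: str   # 3. phrase information, e.g. "NP" or "S"
    sl_pos: list       # 4. source-language POS sequence
    tl_pos: list       # 4. target-language POS sequence
    x_constraints: set = field(default_factory=set)   # 5. used for parsing
    y_constraints: set = field(default_factory=set)   # 6. used for generation
    xy_constraints: set = field(default_factory=set)  # 7. used for transfer

# Illustrative only: German stands in for the target language here.
seed = TransferRule(
    sl_sentence="the man",
    tl_sentence="der Mann",
    alignments=[(1, 1), (2, 2)],
    phrase_type="NP",
    sl_pos=["DET", "N"],
    tl_pos=["DET", "N"],
    x_constraints={"((X2 agr) = *3sg)"},
    y_constraints={"((Y1 agr) = (Y2 agr))"},
)
```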

13 Seed Generation
Type of Information    Source of Information
SL, TL sentence        Informant
Alignment              Informant
Phrase information     Elicitation corpus, same as SL on TL
SL POS sequence        English parse (c,f)
TL POS sequence        English parse, TL dictionary
X-side constraints     English parse (f)
Y-side constraints     English parse, list of projecting features, TL dictionary
XY constraints         ---

14 A word on compositionality
- Basic idea: if you produce a transfer rule for a sentence, and transfer rules already exist that can translate parts of the sentence, why not use them?
- Adjust the alignments, part-of-speech sequences, and constraints accordingly.
- The trickiest part is finding new constraints that cannot be in the lower-level rule but are necessary to translate correctly in the context of a sentence.

15 Clustering
- Seed rules are “clustered” into groups that warrant an attempt to merge.
- Clustering criteria: POS sequences, phrase information, alignments
- Main reason for clustering: divide the large version space into a number of smaller version spaces and run the algorithm on each version space separately.
- Possible danger: rules that should be considered together (such as “the man”, “men”) will not be.
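A minimal sketch of clustering by the three criteria above, assuming seed rules are plain dictionaries with hypothetical keys (`sl_pos`, `tl_pos`, `phrase`, `alignments`) and that the target POS sequences shown are made up for illustration. It also reproduces the danger named on the slide: “the man” and “men” have different POS sequences and so end up in different clusters.

```python
from collections import defaultdict

def cluster_key(rule):
    """Rules are clustered together only if POS sequences, phrase
    information, and alignments are all identical."""
    return (tuple(rule["sl_pos"]), tuple(rule["tl_pos"]),
            rule["phrase"], tuple(sorted(rule["alignments"])))

def cluster_seed_rules(rules):
    """Group seed rules; each group defines its own smaller version space."""
    clusters = defaultdict(list)
    for rule in rules:
        clusters[cluster_key(rule)].append(rule)
    return list(clusters.values())

# Hypothetical seed rules; only the fields needed for clustering are shown.
seeds = [
    {"sl": "the man", "sl_pos": ["DET", "N"], "tl_pos": ["DET", "N"],
     "phrase": "NP", "alignments": [(1, 1), (2, 2)]},
    {"sl": "the men", "sl_pos": ["DET", "N"], "tl_pos": ["DET", "N"],
     "phrase": "NP", "alignments": [(1, 1), (2, 2)]},
    {"sl": "men", "sl_pos": ["N"], "tl_pos": ["N"],
     "phrase": "NP", "alignments": [(1, 1)]},
]
clusters = cluster_seed_rules(seeds)
```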

16 The Version Space
- A set of seed rules in a cluster defines a version space as follows:
  - The seed rules form the specific boundary (S).
  - A virtual rule with the same POS sequences, alignments, and phrase information, but no constraints, forms the general boundary (G).
[Diagram labels: G boundary: virtual rule with no constraints; S boundary: seed rules; in between: generalizations of seed rules, less specific than the rule in G]

17 The partial ordering of rules in the version space
- A rule TR2 is said to be strictly more general than another rule TR1 if the set of f-structures that satisfy TR2 is a proper superset of the set of f-structures that satisfy TR1.
- TR2 is said to be equivalent to TR1 if the set of f-structures that satisfy TR1 is the same as the set of f-structures that satisfy TR2.
- We have defined three operations that move a transfer rule to a strictly more general rule.
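The ordering can be written compactly. The notation F(TR) for the set of f-structures that satisfy TR is introduced here only for convenience, and “strictly” is read as requiring a proper superset:

```latex
\[
\mathrm{TR}_2 \succ \mathrm{TR}_1 \iff F(\mathrm{TR}_2) \supsetneq F(\mathrm{TR}_1),
\qquad
\mathrm{TR}_2 \equiv \mathrm{TR}_1 \iff F(\mathrm{TR}_2) = F(\mathrm{TR}_1).
\]
```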

18 Generalization operations
- Operation 1: delete a value constraint, e.g. ((X1 agr) = *3pl) → NULL
- Operation 2: delete an agreement constraint, e.g. ((X1 agr) = (X2 agr)) → NULL
- Operation 3: merge two value constraints into an agreement constraint, e.g. ((X1 agr) = *3pl), ((X2 agr) = *3pl) → ((X1 agr) = (X2 agr))
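The three operations can be sketched over a rule's constraint set. The `Value` and `Agreement` classes and the function names are assumptions of this sketch, not the system's actual representation; a rule is modelled simply as a set of constraints.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Value:
    """Value constraint, e.g. ((X1 agr) = *3pl)."""
    var: str
    feature: str
    value: str

@dataclass(frozen=True)
class Agreement:
    """Agreement constraint, e.g. ((X1 agr) = (X2 agr))."""
    var1: str
    var2: str
    feature: str

def delete_value_constraint(constraints, c):
    """Operation 1: drop one value constraint."""
    if not isinstance(c, Value) or c not in constraints:
        raise ValueError("not a value constraint of this rule")
    return constraints - {c}

def delete_agreement_constraint(constraints, c):
    """Operation 2: drop one agreement constraint."""
    if not isinstance(c, Agreement) or c not in constraints:
        raise ValueError("not an agreement constraint of this rule")
    return constraints - {c}

def merge_to_agreement(constraints, c1, c2):
    """Operation 3: replace two value constraints that assign the same value
    to the same feature of different variables by one agreement constraint."""
    same = c1.feature == c2.feature and c1.value == c2.value
    if not same or c1.var == c2.var or not {c1, c2} <= constraints:
        raise ValueError("constraints cannot be merged")
    return (constraints - {c1, c2}) | {Agreement(c1.var, c2.var, c1.feature)}

# The slide's example for Operation 3.
rule = frozenset({Value("X1", "agr", "3pl"), Value("X2", "agr", "3pl")})
generalized = merge_to_agreement(
    rule, Value("X1", "agr", "3pl"), Value("X2", "agr", "3pl"))
```

Each function returns a new constraint set, so the original seed rule stays available on the S boundary.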

19 Merging two transfer rules
At the heart of the seeded version space learning algorithm is the merging of two transfer rules (TR1 and TR2) into a more general rule (TR3):
1. All constraints that are in both TR1 and TR2 are inserted into TR3 and removed from TR1 and TR2.
2. Perform all instances of Operation 3 on TR1 and TR2 separately.
3. Repeat step 1.
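A sketch of the three merge steps over constraint sets. The `Value` and `Agreement` classes are hypothetical, and the sketch assumes that any constraint still left in TR1 or TR2 after step 3 is simply not carried into TR3, which the slide does not state explicitly.

```python
from dataclasses import dataclass
from itertools import combinations

@dataclass(frozen=True)
class Value:
    """Value constraint, e.g. ((X1 agr) = *3pl)."""
    var: str
    feature: str
    value: str

@dataclass(frozen=True)
class Agreement:
    """Agreement constraint, e.g. ((X1 agr) = (X2 agr))."""
    var1: str
    var2: str
    feature: str

def apply_all_operation3(constraints):
    """Replace every pair of value constraints that share feature and value
    (on different variables) by the corresponding agreement constraint."""
    values = sorted((c for c in constraints if isinstance(c, Value)),
                    key=lambda c: (c.var, c.feature, c.value))
    result = set(constraints)
    for a, b in combinations(values, 2):
        if a.feature == b.feature and a.value == b.value and a.var != b.var:
            result -= {a, b}
            # Variables are sorted, so both rules yield the same constraint.
            result.add(Agreement(a.var, b.var, a.feature))
    return result

def merge_rules(tr1, tr2):
    """Return the constraint set of TR3 for constraint sets tr1 and tr2."""
    tr3 = set(tr1) & set(tr2)                        # step 1
    rest1 = apply_all_operation3(set(tr1) - tr3)     # step 2
    rest2 = apply_all_operation3(set(tr2) - tr3)
    tr3 |= rest1 & rest2                             # step 3
    return tr3

tr1 = {Value("X1", "agr", "3pl"), Value("X2", "agr", "3pl"), Value("X1", "def", "+")}
tr2 = {Value("X1", "agr", "3sg"), Value("X2", "agr", "3sg"), Value("X1", "def", "+")}
tr3 = merge_rules(tr1, tr2)
```

In this example the shared definiteness constraint is copied in step 1, and the differing number values are generalized to a single agreement constraint between X1 and X2.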

20 Seeded Version Space Algorithm
1. Remove duplicate rules from the S boundary.
2. Try to merge each pair of transfer rules.
3. A merge is successful only if the CSet (set of covered sentences, i.e. sentences that are translated correctly) of the merged rule is a superset of the union of the CSets of the two unmerged rules.
4. Pick the successful merge that optimizes an evaluation criterion.
5. Repeat until no more merges are found.
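The loop above can be sketched as a greedy search. Everything concrete here is an assumption for illustration: rules are frozensets of constraint strings, `merge` is plain set intersection standing in for the real merge procedure, the toy `cset` treats a sentence as translated correctly when the rule applies to it and still contains the constraints that sentence requires, and the evaluation criterion is the size of the resulting rule set.

```python
from itertools import combinations

def seeded_version_space(seed_rules, merge, cset):
    """Greedy merging of hashable rules, following the five steps above."""
    rules = list(dict.fromkeys(seed_rules))            # step 1: drop duplicates
    while True:
        best = None
        for r1, r2 in combinations(rules, 2):          # step 2: try each pair
            merged = merge(r1, r2)
            if cset(merged) >= cset(r1) | cset(r2):    # step 3: no coverage loss
                candidate = [r for r in rules
                             if r not in (r1, r2) and r != merged] + [merged]
                # step 4: assumed criterion is the smallest rule set
                if best is None or len(candidate) < len(best):
                    best = candidate
        if best is None:                               # step 5: nothing merged
            return rules
        rules = best

# Toy corpus: sentence id -> (constraints the sentence satisfies,
#                             constraints needed to translate it correctly)
corpus = {
    1: ({"a", "b"}, {"a"}),
    2: ({"a", "c"}, {"a"}),
    3: ({"d"}, {"d"}),
}

def toy_cset(rule):
    """Sentences the rule applies to and translates correctly (toy model)."""
    return {sid for sid, (facts, required) in corpus.items()
            if rule <= facts and required <= rule}

def toy_merge(r1, r2):
    """Stand-in merge: keep only the shared constraints."""
    return r1 & r2

seeds = [frozenset({"a", "b"}), frozenset({"a", "c"}),
         frozenset({"d"}), frozenset({"a", "b"})]
learned = seeded_version_space(seeds, toy_merge, toy_cset)
```

In the toy run the first two seeds merge into the rule `{"a"}`, which covers sentences 1 and 2; merging further with `{"d"}` would lose coverage, so the loop stops with two rules.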

21 Evaluating a set of transfer rules
- Initial thought: evaluate a merge based on the “goodness” of the new rule, i.e. its CSet, and on the size of the rule set
- Goal: maximize coverage and minimize the size of the rule set
- Currently: a merge is successful only if there is no loss in coverage, so the size of the rule set is the only criterion used
- Future (1): coverage should be measured on a test set
- Future (2): relax the constraint that a successful merge cannot result in loss of coverage

23
- Novel approach to data-driven MT: less data, more encoded linguistic knowledge
- Still in the first stages, so the system is under heavy development and subject to major changes
- Current work: compositionality
- Future work includes:
  - Expanding coverage
  - Addressing (much) more complex constructions
  - Eliminating some assumptions

