Motivations for transfer-based translation: lexical ambiguity, structural differences. See further Ingo 91.


1 Motivations for transfer-based translation
lexical ambiguity
structural differences
See further Ingo 91

2 Example 1
Sv. Fyll på olja i växellådan. → En. Fill gearbox with oil. (from the Scania corpus)
fyll på → fill
obj → adv
adv → obj

3 Example 2
Sv. I oljefilterhållaren sitter en överströmningsventil. → En. The oil filter retainer has an overflow valve. (from the Scania corpus)
sitter → has
adv → subj
subj → obj
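
The relation changes in Examples 1 and 2 can be pictured as a small relabelling table per verb pair. The following is a minimal Python sketch, not the Multra formalism: the table layout and the name `remap_relations` are assumptions made for illustration, and the fillers are left untranslated.

```python
# Hypothetical table: (source verb, target verb) -> relation relabelling,
# with the verb forms and relation labels copied from Examples 1 and 2.
RELATION_MAP = {
    ("fyll på", "fill"): {"obj": "adv", "adv": "obj"},
    ("sitter", "has"): {"adv": "subj", "subj": "obj"},
}


def remap_relations(source_verb, target_verb, relations):
    """Relabel the grammatical relations of a source clause for the target verb."""
    mapping = RELATION_MAP[(source_verb, target_verb)]
    # Relations that the table does not mention keep their label.
    return {mapping.get(label, label): filler for label, filler in relations.items()}


example_1 = remap_relations("fyll på", "fill", {"obj": "olja", "adv": "i växellådan"})
example_2 = remap_relations(
    "sitter", "has",
    {"adv": "i oljefilterhållaren", "subj": "en överströmningsventil"},
)
print(example_1)
print(example_2)
```

A direct word-by-word strategy has no place to state such a relabelling, which is one way to read the "structural differences" motivation.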

4 Transfer-based translation
intermediary sentence structure
basic processes
  – analysis
  – transfer
  – generation (synthesis)
language modules
  – dictionary and grammar of SL
  – transfer dictionary and transfer rules
  – dictionary and grammar of TL

5 [Diagram labels: SL, TL, Interlingua, Direct translation, Transfer, Multra, Metal]

6 Levels of intermediary structure
cf. J&M, Chapter 21
word order

7 Metal See H&S

8 MULTRA
Multilingual Support for Translation and Writing
translation engine
transfer-based
  – shake-and-bake
modular
unification-based
preference machinery
traceable

10 Analysis
chart parser (Lisp → C)
  – procedural formalism: unification and other kinds of operations
sentence structure
  – feature structure
  – grammatical relations
  – surface order implicit via grammatical relations
See further Sågvall Hein & Starbäck (99), Weijnitz (02), Dahllöf (89)

11 Transfer
unification-based
declarative formalism
  – Multra transfer formalism (Beskow 93)
lexical and structural rules
rules are partially ordered
a more specific rule takes precedence over a less specific one
  – specificity in terms of number of transfer equations
all applicable rules are applied
written in Prolog
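
The precedence principle on this slide can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the Multra transfer formalism: rules are plain dictionaries, "specificity" is simply the number of source equations, only the single most specific rule is returned, and the default rule `tänka-think` and the target identifier `think.vb.1` are hypothetical. The contextual rule is modelled on the tänka.på-think.about rule shown on slide 23.

```python
RULES = [
    # Hypothetical default rule with one source equation.
    {"label": "tänka-think",
     "source": {"LEX": "tänka.vb.1"},
     "target": "think.vb.1"},
    # Contextual rule modelled on slide 23, with two source equations.
    {"label": "tänka.på-think.about",
     "source": {"LEX": "tänka.vb.1", "PREP": "på.pp.1"},
     "target": "think.vb.1 + about.pp.1"},
]


def applicable(rule, structure):
    """A rule applies when every one of its source equations holds."""
    return all(structure.get(attr) == value
               for attr, value in rule["source"].items())


def select_rule(rules, structure):
    """Return the applicable rule with the most source equations, or None."""
    candidates = [rule for rule in rules if applicable(rule, structure)]
    if not candidates:
        return None
    return max(candidates, key=lambda rule: len(rule["source"]))


with_prep = select_rule(RULES, {"LEX": "tänka.vb.1", "PREP": "på.pp.1"})
without_prep = select_rule(RULES, {"LEX": "tänka.vb.1"})
print(with_prep["label"], without_prep["label"])
```

Counting equations gives only a partial order: two applicable rules with the same count are not ranked against each other, and `max` then keeps whichever comes first in the list.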

12 Generation
syntactic generation
  – Multra syntactic generation formalism (Beskow 97a)
  – PATR-like style
unification
concatenation
typed features
morphological generation (Beskow 97b)
  – lexical insertion rules
  – morphological realisation and phonological finish in Prolog
written in Prolog
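
Unification is the core operation behind both the analysis and the generation formalisms. As a rough illustration only, here is a minimal Python sketch of unifying two feature structures represented as nested dictionaries. It assumes atomic values are strings, ignores re-entrancy (shared substructures) and typed features, and uses the hypothetical value `PLUR` for the failing case.

```python
def unify(a, b):
    """Unify two feature structures given as nested dicts.

    Returns the merged structure, or None when two values clash.
    """
    result = dict(a)
    for feature, value in b.items():
        if feature not in result:
            # Feature only present in b: add it.
            result[feature] = value
        elif isinstance(result[feature], dict) and isinstance(value, dict):
            # Both values are complex: unify them recursively.
            merged = unify(result[feature], value)
            if merged is None:
                return None
            result[feature] = merged
        elif result[feature] != value:
            # Differing atomic values (or atomic vs. complex): failure.
            return None
    return result


noun_phrase = {"PHR.CAT": "NP", "NUMB": "SING", "HEAD": {"WORD.CAT": "NOUN"}}
constraint = {"NUMB": "SING", "DEF": "DEF", "HEAD": {"LEX": "CAB.NN.0"}}
print(unify(noun_phrase, constraint))
print(unify({"NUMB": "SING"}, {"NUMB": "PLUR"}))
```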

13 An example: Tippa hytten.
Tippa hytten. :
(* = (PHR.CAT = CL MODE = IMP SUBJ = 2ND VERB = (WORD.CAT = VERB INFF = IMP DIAT = ACT LEX = TIPPA.VB.1 VSURF = +) OBJ.DIR = (PHR.CAT = NP NUMB = SING GENDER = UTR CASE = BASIC DEF = DEF HEAD = (LEX = HYTT.NN.1 WORD.CAT = NOUN))) REG = (V1.LEM = TIPPA.VB) SEP = (WORD.CAT = SEP LEX = STOP.SR.0)))

14 Transfer structure
[VERB : [WORD.CAT : VERB LEX : TILT.VB.0 DIAT : ACT INFF : IMP]
 OBJ.DIR : [PHR.CAT : NP DEF : DEF NUMB : SING HEAD : [WORD.CAT : NOUN LEX : CAB.NN.0]]
 MODE : IMP
 SUBJ : 2ND
 VSURF : +
 SEP : [WORD.CAT : SEP LEX : STOP.SR.0]
 PHR.CAT : CL]

15 Generation
Tilt the cab.
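
The three steps walked through on slides 13 to 15 can be imitated end to end with a toy pipeline. The sketch below is an assumption-laden Python illustration, not the Multra engine: the "analysis" only handles an imperative verb followed by a definite object, the structures keep just a few of the features from the slides, and the lexicon layouts and function names are hypothetical.

```python
# Toy lexicons; the lexeme identifiers come from the slides,
# the dictionary layout is hypothetical.
ANALYSIS_LEXICON = {"tippa": "TIPPA.VB.1", "hytten": "HYTT.NN.1"}
TRANSFER_LEXICON = {"TIPPA.VB.1": "TILT.VB.0", "HYTT.NN.1": "CAB.NN.0"}
GENERATION_LEXICON = {"TILT.VB.0": "tilt", "CAB.NN.0": "cab"}


def analyse(sentence):
    """Analysis: build a small feature structure for 'verb + definite object'."""
    verb, noun = sentence.rstrip(".").lower().split()
    return {
        "PHR.CAT": "CL",
        "MODE": "IMP",
        "VERB": {"LEX": ANALYSIS_LEXICON[verb]},
        "OBJ.DIR": {"DEF": "DEF", "HEAD": {"LEX": ANALYSIS_LEXICON[noun]}},
    }


def transfer_lexemes(structure):
    """Transfer: copy the structure, replacing every LEX value."""
    result = {}
    for feature, value in structure.items():
        if isinstance(value, dict):
            result[feature] = transfer_lexemes(value)
        elif feature == "LEX":
            result[feature] = TRANSFER_LEXICON[value]
        else:
            result[feature] = value
    return result


def generate(structure):
    """Generation: imperative clause; a definite object gets 'the'."""
    verb = GENERATION_LEXICON[structure["VERB"]["LEX"]]
    obj = structure["OBJ.DIR"]
    noun = GENERATION_LEXICON[obj["HEAD"]["LEX"]]
    article = "the " if obj["DEF"] == "DEF" else ""
    return f"{verb.capitalize()} {article}{noun}."


translation = generate(transfer_lexemes(analyse("Tippa hytten.")))
print(translation)
```

Each stage only sees the output of the previous one, which is what makes the language modules on slide 4 separable.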

16 A grammar rule
defrule legal.obj { = 'np, not = 'gen, not = 'subj }

17 Transfer rules
copy feature
delete feature
transfer feature
assign feature

18 Copy feature
LABEL mode
SOURCE = ?x1
TARGET = ?x2
TRANSFER

19 Delete feature
LABEL REG
SOURCE = ANY
TARGET =
TRANSFER

20 Transfer feature
LABEL OBJ.DIR
SOURCE = ?x1
TARGET = ?x2
TRANSFER ?x1 ?x2

21 Define feature
LABEL trycka.in-press
SOURCE = trycka.vb+in.ab.1 = VERB
TARGET = press.vb.1 = VERB
TRANSFER
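
The four rule types on slides 17 to 21 can be read as four ways of treating a feature during a recursive walk over the source structure. The Python sketch below illustrates that reading on a fragment of the structure from slide 13. It is a hedged simplification: the assignment of features to rule types is hand-written here and partly inferred from which features are missing on slide 14 (GENDER, CASE, REG), and the set and function names are hypothetical.

```python
# Assumed assignment of features to rule types.
COPY = {"PHR.CAT", "WORD.CAT", "MODE", "DEF", "NUMB"}   # value kept as is
DELETE = {"REG", "GENDER", "CASE"}                       # dropped in the target
TRANSFER = {"VERB", "OBJ.DIR", "HEAD"}                   # value transferred recursively
# Lexical rules assign the target LEX value for a given source LEX value.
ASSIGN_LEX = {"TIPPA.VB.1": "TILT.VB.0", "HYTT.NN.1": "CAB.NN.0"}


def transfer(source):
    """Build the target structure feature by feature."""
    target = {}
    for feature, value in source.items():
        if feature in DELETE:
            continue
        elif feature in COPY:
            target[feature] = value
        elif feature in TRANSFER:
            target[feature] = transfer(value)
        elif feature == "LEX":
            target[feature] = ASSIGN_LEX[value]
        else:
            raise KeyError(f"no transfer rule for feature {feature!r}")
    return target


source = {
    "PHR.CAT": "CL",
    "MODE": "IMP",
    "VERB": {"WORD.CAT": "VERB", "LEX": "TIPPA.VB.1"},
    "OBJ.DIR": {
        "PHR.CAT": "NP", "NUMB": "SING", "GENDER": "UTR", "CASE": "BASIC",
        "DEF": "DEF", "HEAD": {"LEX": "HYTT.NN.1", "WORD.CAT": "NOUN"},
    },
    "REG": {"V1.LEM": "TIPPA.VB"},
}
target = transfer(source)
print(target)
```

Raising an error for an uncovered feature is a design choice of this sketch; it makes gaps in the rule set visible instead of silently dropping information.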

22 A generation rule
LABEL CL.IMP
X1 ---> X2 X3 X4 :
= CL = = IMP =

23 A contextual lexical rule
LABEL tänka.på-think.about
SOURCE = tänka.vb.1 = pp = ?prep = på.pp.1 = ?rect1
TARGET = pp = PREP = about.pp.1 = ?rect2
TRANSFER ?rect1 ?rect2

24 A generation trace
1-Applying Rule cl-sep
1-Applying Rule cl.imp
1-Applying Rule subj2nd-verb-obj.dir
1-Applying Rule verb.main.act
1-Applying Rule np.the-df
1-Applying Rule ng.noun-def
1-Success!

25 Language resources in the MATS system
dictionary in a database with different views
analysis grammar
transfer grammar
  – incl. contextually defined lexical rules
generation grammar

26 sv-en_LinkLexicon

27 en_Inflections

28 en_LemmaLexicon

29 en_LexemeLexicon

30 en_Lexicon

31 en_StemLexicon

32 sv_Inflections

33 sv_LemmaLexicon

34 sv_LexemeLexicon

35 sv_Lexicon

36 sv_StemLexicon

37 The MATS system Frozen demo…

38 Assignment 2: Working with MATS
http://stp.ling.uu.se/~evapet/mt04/assignment2.html

39 Lexicalistic translation
Identify (lexical) translation units in the source sentence
Translate each unit separately (considering the context)
Order the result in agreement with a model of the target language
Formulation due to Lars Ahrenberg; see further AH (reading list); see also Beaven, John L., Shake-and-Bake Machine Translation. COLING-92, Nantes, 23-28 August 1992.
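
The three steps can be made concrete with a toy example. The Python sketch below is only an illustration under loud assumptions: the Swedish sentence "Sedan tippar du hytten", the unit dictionary and the set of target bigrams are all invented for the example, context is ignored when translating a unit, and the "model of the target language" is reduced to counting known bigrams over all permutations.

```python
from itertools import permutations

# Hypothetical unit dictionary (source unit -> target unit).
UNIT_DICTIONARY = {"sedan": "then", "tippar": "tilt", "du": "you", "hytten": "the cab"}

# Hypothetical target-language model: the set of bigrams considered well-formed.
TARGET_BIGRAMS = {
    ("<s>", "then"), ("then", "you"), ("you", "tilt"),
    ("tilt", "the cab"), ("the cab", "</s>"),
}


def score(order):
    """Count how many adjacent pairs of the padded sequence are known bigrams."""
    padded = ("<s>",) + tuple(order) + ("</s>",)
    return sum((left, right) in TARGET_BIGRAMS
               for left, right in zip(padded, padded[1:]))


def translate(sentence):
    units = sentence.lower().split()                          # 1. identify units
    translated = [UNIT_DICTIONARY[unit] for unit in units]    # 2. translate each unit
    best_order = max(permutations(translated), key=score)     # 3. order by target model
    return " ".join(best_order)


print(translate("Sedan tippar du hytten"))
```

Trying every permutation is only feasible for very short sentences; it is used here to keep the ordering step explicit.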

40 T4F – a lexicalistic system
processes in T4F
  – tokenisation
  – tagging
  – transfer
  – transposition
  – filtering
See further AH (in the reading list)

41 Interlingua translation See SN

45 Applications of alignment
translation memories
translation dictionaries
lexicalistic translation
statistical machine translation
example-based translation

46 Translation memories
based on sentence links
optionally, sub-sentence links
See further Macklovitch, E. (2000)

47 Translation dictionaries
based on word links
refinement of word links

48 Refinement of word alignment data
neutralise capital letters where appropriate
lemmatise or tag source and target units
identify ambiguities
  – search for criteria to resolve them
identify partial links
  – compounds?
  – remove or complete them
manual revision?
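
Two of these refinement steps, neutralising capital letters and identifying ambiguities, are easy to sketch. The Python fragment below is a hedged illustration: the link data is invented, lowercasing everything is a cruder policy than "where appropriate", and the function name `refine_links` is hypothetical. Lemmatisation, tagging and partial links are not covered.

```python
from collections import Counter, defaultdict

# Hypothetical raw word links (source unit, target unit) from an aligner.
RAW_LINKS = [
    ("Motorn", "the motor"),
    ("motorn", "the engine"),
    ("motorn", "The motor"),
    ("Hytten", "the cab"),
]


def refine_links(links):
    """Lowercase both sides, count link frequencies, flag ambiguous sources."""
    counts = Counter((source.lower(), target.lower()) for source, target in links)
    by_source = defaultdict(dict)
    for (source, target), frequency in counts.items():
        by_source[source][target] = frequency
    # A source unit is ambiguous when it is linked to more than one target.
    ambiguous = {source for source, targets in by_source.items() if len(targets) > 1}
    return dict(by_source), ambiguous


dictionary, ambiguous = refine_links(RAW_LINKS)
print(dictionary)
print(ambiguous)
```

The flagged entries are the ones for which resolution criteria still have to be found, for instance from the preceding context.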

49 Informally about statistical MT
build a translation dictionary based on word alignment
aim for as big fragments as possible
keep information on link frequency
build an n-gram model of the target language
implement a direct translation strategy
  – including alternatives ordered by length and frequency
process the output by the n-gram model, filtering out the best alternatives, and adjust the translation accordingly
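
The recipe above can be imitated in a few lines. The Python sketch below is a toy under stated assumptions, not a real statistical MT system: the link frequencies and bigram counts are invented, segmentation is greedy longest-match, and the score is a plain sum of link frequencies and bigram counts rather than a product of probabilities. The source phrase "bryter strömmen till motorn" is taken from the example on slide 54.

```python
from itertools import product

# Hypothetical link frequencies: source fragment -> {target fragment: count}.
LINKS = {
    ("bryter",): {"breaks": 3, "violates": 1},
    ("strömmen",): {"the current": 2, "the stream": 2},
    ("till",): {"to": 4},
    ("motorn",): {"the motor": 2, "the engine": 2},
    ("till", "motorn"): {"to the motor": 2, "to the engine": 1},
}

# Hypothetical bigram counts over target fragments.
BIGRAMS = {("breaks", "the current"): 4, ("the current", "to the motor"): 2}


def segment(tokens):
    """Greedy longest-match segmentation: prefer as big fragments as possible."""
    fragments, i = [], 0
    while i < len(tokens):
        for length in range(len(tokens) - i, 0, -1):
            candidate = tuple(tokens[i:i + length])
            if candidate in LINKS:
                fragments.append(candidate)
                i += length
                break
        else:
            raise KeyError(f"no link for {tokens[i]!r}")
    return fragments


def score(choice):
    """Sum of link frequencies plus bigram counts of adjacent target fragments."""
    link_score = sum(LINKS[source][target] for source, target in choice)
    targets = [target for _, target in choice]
    lm_score = sum(BIGRAMS.get(pair, 0) for pair in zip(targets, targets[1:]))
    return link_score + lm_score


def translate(sentence):
    fragments = segment(sentence.split())
    alternatives = [[(source, target) for target in LINKS[source]]
                    for source in fragments]
    best = max(product(*alternatives), key=score)
    return " ".join(target for _, target in best)


print(translate("bryter strömmen till motorn"))
```

Here the two-word fragment "till motorn" wins over its single-word parts, and the bigram counts decide between "the current" and "the stream", which have equal link frequency.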

50 Example-based MT
See HS (in the reading list)

51 Some current research topics
intersentential dependencies
hybrid systems: data-driven and rule-driven
improved alignment techniques
improved language modeling in ST
automatic learning from post-editing
translation by structural correspondences
translation of spoken language
improved preference strategies
ambiguity-preserving translation

52 Intersentential dependencies
pronoun resolution
lexical ambiguity resolution, such as
  – (torkar)motorn → the motor
  – (förbrännings)motorn → the engine
fluency
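
The motorn case can be sketched as a lookup in the preceding sentences: a bare "motorn" is translated according to the most recent compound it abbreviates. The Python fragment below is a minimal illustration under assumptions: the two modifier-to-translation pairs come from the slide, while the function name `resolve_motorn`, the punctuation handling and the sentence "Förbränningsmotorn startar." are invented for the example.

```python
# Modifier of the earlier compound -> translation of a later bare "motorn".
COMPOUND_SENSES = {"torkar": "the motor", "förbrännings": "the engine"}


def resolve_motorn(previous_sentences, default=None):
    """Pick a translation for 'motorn' from the most recent matching compound."""
    for sentence in reversed(previous_sentences):
        for word in reversed(sentence.lower().split()):
            word = word.strip(".,;:")
            for modifier, translation in COMPOUND_SENSES.items():
                if word == modifier + "motorn":
                    return translation
    # No antecedent found in the preceding context.
    return default


context = ["Torkarmotorn M2 är sammankopplad med omkopplare S24 och intervallrelä R22."]
print(resolve_motorn(context))
print(resolve_motorn(["Förbränningsmotorn startar."]))
```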

53 Preserving the information structure
information structure is expressed in different ways in the source and the target
syntactic clues are exploited in the analysis to compute the information structure (topic-focus articulation)
information structure is used to guide the generation

54 An example
Sv. Torkarmotorn M2 är sammankopplad med omkopplare S24 och intervallrelä R22. För att inte motorn skall överbelastas, t.ex. om torkarbladen fastnat, finns en inbyggd termovakt som bryter strömmen till motorn när …
En. Wiper motor M2 is connected to switch S24 and intermittent relay R22. To prevent motor overload, e.g. if the wiper blade gets stuck, there is an integral thermal sensor which breaks the current to the motor when …

55 Preferences
syntactic preferences
  – the principle of right association
  – the principle of minimal attachment
  – two-stage processing
semantic preferences
  – lexical selectional restrictions
  – lexical contextual rules
  – conceptual taxonomies
  – likelihood of occurrence
See further Bennett, P. & Paggio, P., 1993, Preference in Eurotra.

56 Preferences in Multra
parsing
  – a formalism for expressing syntactic preferences in the parse (not fully developed)
transfer
  – contextual lexical rules
  – rule specificity
generation
  – rule specificity

57 Hybrid systems
aims
components
problems
architecture
scores

58 Aims of a hybrid system
simple techniques for simple tasks
complex techniques for complex tasks

59 Components of a hybrid system
component strategies
  – translation memory: full sentences, fragments, direct translation
  – statistical translation
  – EBMT

60 Component strategies, cont’d
rule-based translation
  – simplistic analysis (cf. direct translation): word by word (S → sequence of words), phrase by phrase (S → sequence of phrases)
  – partial parsing
  – full parsing

61 Problems of a hybrid system
how does the system know when a simple technique is appropriate?
  – does the source tell?
  – does the target tell?

62 Architecture and scores
simple first?
concerting results?
scoring?
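
One possible answer to "simple first?" is a cascade that tries the cheapest component first and falls back to the next one when it gives up. The Python sketch below illustrates that single design option under assumptions: the two components, their toy resources and all names are hypothetical, and neither the combination of results nor scoring is modelled.

```python
# Hypothetical resources for the two components.
TRANSLATION_MEMORY = {"Tippa hytten.": "Tilt the cab."}
WORD_DICTIONARY = {"tippa": "tilt", "hytten": "the cab"}


def from_memory(sentence):
    """Simplest component: exact match on a stored sentence link."""
    return TRANSLATION_MEMORY.get(sentence)


def word_by_word(sentence):
    """Fallback: translate word by word, or give up on an unknown word."""
    words = sentence.rstrip(".").lower().split()
    if not words or any(word not in WORD_DICTIONARY for word in words):
        return None
    return " ".join(WORD_DICTIONARY[word] for word in words)


# Components ordered from simple to complex.
COMPONENTS = [("translation memory", from_memory), ("word by word", word_by_word)]


def translate(sentence):
    """Return (translation, component name) from the first component that succeeds."""
    for name, component in COMPONENTS:
        result = component(sentence)
        if result is not None:
            return result, name
    return None, None


print(translate("Tippa hytten."))
print(translate("tippa hytten"))
print(translate("Fyll på olja i växellådan."))
```

In this sketch a component signals "not appropriate" only by failing outright, which sidesteps the harder question on slide 61 of how to tell when a simple technique has produced a poor result.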

63 Improved techniques for re-use of translation
combining clues for word alignment (Tiedemann 2003)
interactive word alignment (Ahrenberg et al. 2003)
parallel treebanks

64 Translation by structural correspondences
LFG
HPSG

65 Translation of spoken language
See Krauwer, Steven (ed.), 2000, Machine Translation, Volume 15, Issue 1-2 (June 2000), Special issue on Spoken Language Translation.

