Week 9: resources for globalisation Finish spell checkers Machine Translation (MT) The ‘decoding’ paradigm Ambiguity Translation models Interlingua and.


1 Week 9: resources for globalisation Finish spell checkers Machine Translation (MT) The ‘decoding’ paradigm Ambiguity Translation models Interlingua and First Order Predicate Calculus Human involvement Historical note

2 Spelling dictionaries Implementing spelling identification and correction algorithm

3 Spelling dictionaries Implementing spelling identification and correction algorithm STAGE 1: compare each string in the document with a list of legal strings; if there is no corresponding string in the list, mark it as misspelled. STAGE 2: generate a list of candidates: apply any single transformation to the typo string, then filter the list by checking against a dictionary. STAGE 3: assign a probability value to each candidate in the list. STAGE 4: select the best candidate.
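Stages 1 and 2 can be sketched in Python as follows. This is a minimal sketch, not the lecture's implementation: the slide only says "any single transformation", so reading that as one insertion, deletion, substitution or transposition is an assumption, and the word list, the typo "acress" and all function names are hypothetical.

```python
import string

# Toy word list standing in for a real spelling dictionary (an assumption).
DICTIONARY = {"across", "actress", "acres", "access", "cress", "caress"}


def is_misspelled(word, dictionary=DICTIONARY):
    """STAGE 1: a string is flagged if it is not in the list of legal strings."""
    return word.lower() not in dictionary


def single_edits(typo, alphabet=string.ascii_lowercase):
    """Return every string exactly one transformation away from the typo.

    Assumed transformations: deletion, transposition of adjacent letters,
    substitution and insertion.
    """
    splits = [(typo[:i], typo[i:]) for i in range(len(typo) + 1)]
    deletions = {left + right[1:] for left, right in splits if right}
    transpositions = {
        left + right[1] + right[0] + right[2:]
        for left, right in splits
        if len(right) > 1
    }
    substitutions = {
        left + ch + right[1:] for left, right in splits if right for ch in alphabet
    }
    insertions = {left + ch + right for left, right in splits for ch in alphabet}
    return (deletions | transpositions | substitutions | insertions) - {typo}


def candidates(typo, dictionary=DICTIONARY):
    """STAGE 2: generate all single-transformation strings, keep the legal ones."""
    return sorted(single_edits(typo.lower()) & dictionary)


if __name__ == "__main__":
    print(is_misspelled("acress"))
    print(candidates("acress"))
```

With this toy dictionary, every entry happens to be one transformation away from "acress", so all six words survive the filter.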

4 Spelling dictionaries STAGE 3. Prior probability: given all the words in English, is this candidate more likely to be what the typist meant than that candidate? P(c) = count(c)/N, where count(c) is the number of times candidate c occurs in a corpus and N is the number of words in that corpus. Likelihood: given the possible errors, or transformations, how likely is it that a particular error has operated on candidate c to produce the typo t? P(t|c), calculated using a corpus of errors, or transformations. Bayesian rule: take the product of the prior probability and the likelihood, P(c) × P(t|c).
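The ranking step can be sketched as below. All numbers are invented for illustration: the word counts, the corpus size and the error likelihoods do not come from any real corpus, and the names are hypothetical. The sketch only shows how the prior and the likelihood are multiplied and the highest-scoring candidate selected.

```python
# Hypothetical corpus counts for each candidate word (invented values).
WORD_COUNTS = {"across": 120, "actress": 15, "acres": 30}
CORPUS_SIZE = 1000  # N, the number of words in the hypothetical corpus

# Hypothetical P(t | c): probability that candidate c was mistyped as typo t.
# In practice this would be estimated from a corpus of errors.
LIKELIHOOD = {
    ("acress", "across"): 0.00001,
    ("acress", "actress"): 0.0001,
    ("acress", "acres"): 0.00003,
}


def prior(candidate):
    """P(c) = count(c) / N."""
    return WORD_COUNTS[candidate] / CORPUS_SIZE


def score(typo, candidate):
    """P(c) * P(t | c), the quantity used to rank candidates."""
    return prior(candidate) * LIKELIHOOD[(typo, candidate)]


def rank(typo, candidate_list):
    """Order candidates from most to least probable."""
    return sorted(candidate_list, key=lambda c: score(typo, c), reverse=True)


def best_candidate(typo, candidate_list):
    """Select the highest-scoring candidate."""
    return rank(typo, candidate_list)[0]


if __name__ == "__main__":
    print(rank("acress", ["across", "actress", "acres"]))
```

Note that with these made-up values the most frequent word ("across") does not win: a rarer word with a more plausible error can outrank it, which is the point of combining both factors.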

5 Spelling dictionaries non-word errors Implementing spelling identification and correction algorithm STAGE 1: identify misspelled words STAGE 2: generate list of candidates STAGE 3a: rank candidates by probability STAGE 3b: select best candidate Implement: noisy channel model, Bayesian rule

6 Resources for Globalisation: Machine translation

7 The ‘decoding’ paradigm Assumes one-to-one relation between source symbol and target symbol

8 Resources for Globalisation: Machine translation The ‘decoding’ paradigm Assumes one-to-one relation between source symbol and target symbol one-to-many (homonymy)

9 Resources for Globalisation: Machine translation The ‘decoding’ paradigm Assumes one-to-one relation between source symbol and target symbol one-to-many (homonymy) one-to-many (hypernym → hyponyms):

10 Resources for Globalisation: Machine translation The ‘decoding’ paradigm Assumes one-to-one relation between source symbol and target symbol one-to-many (homonymy) one-to-many (hypernym → hyponyms): many-to-one (hyponyms → hypernym)

11 Machine translation The ‘decoding’ paradigm one-to-many (homonymy) bank → Ufer, Bank (German)

12 Machine translation The ‘decoding’ paradigm one-to-many (homonymy) one-to-many (hypernym → hyponyms): brother → otooto, oniisan (Japanese) blue → синий, голубой (Russian) many-to-one (hyponyms → hypernym)

13 Machine translation The ‘decoding’ paradigm one-to-many (homonymy) one-to-many (hypernym → hyponyms): many-to-one (hyponyms → hypernym) hill, mountain → Berg (German) learn, teach → leren (Dutch)
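Why the one-to-one assumption breaks down can be shown with a small lexicon built from the slide examples. This is an illustrative sketch only: the dictionary layout and the function names are hypothetical, and a real bilingual lexicon would be far larger.

```python
from collections import defaultdict

# English -> German entries taken from the slide examples.
EN_DE = {
    "bank": ["Ufer", "Bank"],  # one-to-many (homonymy)
    "hill": ["Berg"],          # many-to-one together with "mountain"
    "mountain": ["Berg"],
}


def one_to_many(lexicon):
    """Source words with more than one target word."""
    return sorted(src for src, targets in lexicon.items() if len(targets) > 1)


def many_to_one(lexicon):
    """Target words shared by more than one source word."""
    sources_by_target = defaultdict(list)
    for source, targets in lexicon.items():
        for target in targets:
            sources_by_target[target].append(source)
    return {
        target: sorted(sources)
        for target, sources in sources_by_target.items()
        if len(sources) > 1
    }


def decode(word, lexicon):
    """Symbol-for-symbol 'decoding': only works when exactly one target exists."""
    targets = lexicon[word]
    if len(targets) != 1:
        raise LookupError(f"{word!r} has {len(targets)} possible translations")
    return targets[0]


if __name__ == "__main__":
    print(one_to_many(EN_DE))
    print(many_to_one(EN_DE))
    print(decode("hill", EN_DE))
```

Calling decode("bank", EN_DE) raises LookupError, because a pure lookup has no way to choose between the two German words without context.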

14 Machine translation and globalisation Ambiguity ‘I made her duck’ “The possibility of interpreting an expression in two or more distinct ways” Collins English Dictionary

15 Machine translation Ambiguity Challenge of the translation depends on the level of ambiguity that arises. This depends on the closeness of the source and target languages w.r.t. the following: vocabulary (homonyms); grammar (structural ambiguity); conceptual structure (specificity ambiguity, lexical gaps)

16 Machine translation Pragmatic approach

17 Machine translation Pragmatic approach aim for a rough translation, ‘gist’ translation Used for multi-lingual information retrieval

18 Machine translation Pragmatic approach aim for a rough translation, ‘gist’ translation Used for multi-lingual information retrieval involve human translators in the process: computer-aided translation

19 Machine translation Translation models Transfer model English: ‘the dog bit my friend’ Hindi: kutte-ne mere dost-ko kata (word-for-word gloss: dog my friend bit)

20 Machine translation Translation models Transfer model Alter the grammatical structure of the source language to make it adhere to the grammatical structure of the target language. Use transformation rules. Analysis process (source) Transfer process (‘bridge’) Generation process (target) Problem: each source-target pair will need its own unique set of transformation rules
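The three processes of the transfer model can be sketched for the dog example. This is a toy illustration under stated assumptions: the input is already segmented into subject, verb and object (a real analysis stage would parse it), the single transformation rule reorders subject-verb-object into subject-object-verb, and generation emits only an English word-for-word gloss rather than real Hindi. All names are hypothetical.

```python
# Toy gloss lexicon (an assumption): the article is dropped in the gloss.
GLOSS = {"the dog": "dog", "my friend": "my friend", "bit": "bit"}


def analyse(chunks):
    """Analysis (source): label pre-segmented chunks with grammatical roles.

    Assumes the chunks arrive in English subject-verb-object order.
    """
    subject, verb, obj = chunks
    return {"subject": subject, "verb": verb, "object": obj}


def transfer_svo_to_sov(structure):
    """Transfer ('bridge'): one transformation rule, SVO -> SOV."""
    return [structure["subject"], structure["object"], structure["verb"]]


def generate(ordered_chunks, lexicon=GLOSS):
    """Generation (target): look up each chunk and join in the new order."""
    return " ".join(lexicon.get(chunk, chunk) for chunk in ordered_chunks)


def translate(chunks):
    return generate(transfer_svo_to_sov(analyse(chunks)))


if __name__ == "__main__":
    print(translate(["the dog", "bit", "my friend"]))
```

The reordering rule here is specific to this language pair; a different pair would need a different transfer function, which is the scaling problem the slide points out.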

21 Machine translation Translation models Inter-lingua model Extract the meaning from the source string Give it a language independent representation, i.e. an interlingua Translation process takes the interlingua as its input Multiple translation processes take the same input for multiple target language outputs

22 Machine translation Translation models What is the inter-lingua? for words, some sort of semantic analysis, e.g. (GO, BY-FOOT) (GO, BY-TRANSPORT) Russian: идти ехать English: go go
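A word-level interlingua can be sketched as a table from language-independent concepts to words in each target language. The concept tuples and the Russian and English words come from the slide; the table layout, the language codes "ru" and "en", and the function names are assumptions made for illustration.

```python
# Concept -> word, per target language. Concepts are (action, manner) tuples.
GENERATION = {
    "ru": {("GO", "BY-FOOT"): "идти", ("GO", "BY-TRANSPORT"): "ехать"},
    "en": {("GO", "BY-FOOT"): "go", ("GO", "BY-TRANSPORT"): "go"},
}


def analyse(word, language):
    """Extract the possible interlingua concepts for a source word."""
    return sorted(
        concept
        for concept, target_word in GENERATION[language].items()
        if target_word == word
    )


def generate(concept, language):
    """Produce the word for one concept in one target language."""
    return GENERATION[language][concept]


def generate_all(concept):
    """The same interlingua input feeds every target-language generator."""
    return {language: table[concept] for language, table in GENERATION.items()}


if __name__ == "__main__":
    print(analyse("ехать", "ru"))
    print(analyse("go", "en"))
    print(generate_all(("GO", "BY-FOOT")))
```

Analysing Russian ехать yields a single concept, while English "go" yields two, so translating from English into Russian needs extra context to pick the right concept.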

23 Machine translation and globalisation Translation models What is the inter-lingua? for sentences, a logical language e.g. First Order Predicate Calculus

24 Meaning representation Goal: 1. The semantic representation must give you a one-to-one mapping to non-linguistic knowledge of the world 2. The representation must be expressive, i.e. handle different types of data

25 Meaning representation First Order Predicate Calculus computationally tractable objects (terms) properties of objects relations amongst objects Predicate argument structure large composite representations logical connectives

26 Meaning representation First Order Predicate Calculus Object: referred to uniquely by a term constant e.g. SurreyUniversity function e.g. LocationOf(SurreyUniversity) variable

27 Meaning representation First Order Predicate Calculus Relations amongst objects Predicates: “symbols that refer to, or name, the relations that hold among some fixed number of objects” (J & M) Educates(SurreyUniversity, Citizens) two-place predicate

28 Meaning representation First Order Predicate Calculus Relations amongst objects Predicates: Can specify the category of an object University(SurreyUniversity) one-place predicate

29 Meaning representation First Order Predicate Calculus properties / parts of objects functions: LocationOf(SurreyUniversity)

30 Meaning representation First Order Predicate Calculus Composite representations through predicates and functions: Near(LocationOf(SurreyUniversity), LocationOf(Cathedral))

31 Meaning representation First Order Predicate Calculus Logical connectives combine basic representations to form larger more complex representations e.g. ∧ operator = ‘and’

32 Meaning representation First Order Predicate Calculus Logical connectives combine basic representations to form larger more complex representations Educates(SurreyUniversity, Citizens) ∧ ¬ Remunerates(SurreyUniversity, Staff)
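The pieces of First Order Predicate Calculus introduced above (constants, functions, one-place and two-place predicates, connectives) can be sketched by checking formulas against a small hand-built model of the world. The model contents are assumptions chosen only so that the slide formulas can be evaluated: the site names "SiteA" and "SiteB" are hypothetical, and the empty Remunerates relation is not a claim about the real world.

```python
# Predicates: each name maps to the set of argument tuples for which it holds.
MODEL = {
    "University": {("SurreyUniversity",)},             # one-place predicate
    "Educates": {("SurreyUniversity", "Citizens")},    # two-place predicate
    "Remunerates": set(),                              # assumed empty in this toy world
    "Near": {("SiteA", "SiteB")},                      # hypothetical locations
}

# Function: maps an object (constant) to another object.
LOCATION_OF = {"SurreyUniversity": "SiteA", "Cathedral": "SiteB"}


def holds(predicate, *args):
    """True if the predicate holds of the given terms in the model."""
    return tuple(args) in MODEL[predicate]


def location_of(term):
    """Evaluate the function LocationOf on a term."""
    return LOCATION_OF[term]


def composite_formula():
    """Near(LocationOf(SurreyUniversity), LocationOf(Cathedral))"""
    return holds("Near", location_of("SurreyUniversity"), location_of("Cathedral"))


def connective_formula():
    """Educates(SurreyUniversity, Citizens) ∧ ¬ Remunerates(SurreyUniversity, Staff)"""
    return holds("Educates", "SurreyUniversity", "Citizens") and not holds(
        "Remunerates", "SurreyUniversity", "Staff"
    )


if __name__ == "__main__":
    print(holds("University", "SurreyUniversity"))
    print(composite_formula())
    print(connective_formula())
```

Python's `and` and `not` stand in for ∧ and ¬ here; variables and quantifiers are left out of the sketch.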

33 Machine translation and globalisation Machine translation and globalisation: change of priorities 1954: IBM and Georgetown University, first MT demo; goal: ‘perfect’ translation 1966: Automatic Language Processing Advisory Committee (ALPAC) report: damning of that goal Post-ALPAC goal: rough translation, involve human element Current situation: online translation, e.g. Babel Fish, a descendant of SYSTRAN, whose goal was rough translation Journal of Machine Translation

34 Next week Globalisation as an industry SDL and the SDLX-TRADOS globalisation application

