Machine Translation Introduction


1 Machine Translation Introduction
Jan Odijk LOT Winterschool Amsterdam January 2011

2 Overview MT: What is it MT: What is not possible (yet?)
MT: Why is it so difficult? MT: Can we make it possible? MT: Evaluation MT: What is (perhaps) possible Conclusions

3 MT: What is it? Input: text in source language
Output: text in target language that is a translation of the input text

4 MT: What is it? (Diagram: translation pyramid)
Top: Interlingua. Middle: Analyzed input → transfer → Analyzed output. Bottom: Input → direct translation → Output

5 MT: System Types
Direct: earliest systems (1950s); direct word-to-word translation; recent statistical MT systems. Transfer: almost all research and commercial systems up to 1990. Interlingual

6 MT: System Types
Interlingual: a few research systems in the 1980s. Rosetta (Philips): based on Montague Grammar; semantic derivation trees of attuned grammars. Distributed Language Translation (DLT, BSO): (enriched) Esperanto. Sometimes logical representations. Hybrid Interlingual/Transfer: transfer for lexicons; IL for rules

7 Rule-Based Systems Most systems explicit source language grammar
parser yields analysis of source language input transfer component turns it into target language structure no explicit grammar of target language (except morphology)

8 Rule-Based Systems Some systems (Eurotra)
explicit source and target language grammar sometimes reversible parser yields analysis of source language input transfer component turns it into target language structure generation of translation by target language grammar

9 Rule-Based Systems Some systems (Rosetta, DLT)
explicit source and target language grammar in some cases reversible parser yields interlingual representation generation of translation by target language grammar from interlingual representation

10 MT: Is it difficult? FAHQT: Fully Automatic High Quality Translation
Fully Automatic: no human intervention High Quality: close or equal to human translation Even acceptable quality is difficult to achieve

11 MT: Why is it so difficult?
Ambiguity Real Temporary Computational Complexity Complexity of language Divergences Language Competence v. Language Use Require large and rich lexicons

12 MT: Why is it so difficult?
De jongen sloeg het meisje met de gitaar (The boy hit the girl with the guitar). Hij heeft boeken gelezen: He has been reading books; *He has been reading for books. Hij heeft uren gelezen: *He has been reading hours; He has been reading for hours

13 MT: Why is it so difficult?
Not only uren (hours), but also dagen, de hele dag, weken, … (words expressing units of time). But also: de hele vergadering, meeting, bijeenkomst, les, … (words expressing events)

14 MT: Why is it so difficult?
Hij draagt een bruin pak Dragen: wear or carry Pak: suit or package Hij draagt een bruin pak en zwarte schoenen Hij draagt een bruin pak onder zijn arm

15 MT: Why is it so difficult?
Voert uw bedrijf sloten uit? Uitvoeren: execute, or export? Bedrijf: act, or company? Sloten: ditches, or locks?

16 MT: Why is it so difficult?
Temporary Ambiguity Hij heeft boeken gelezen Heeft: main or auxiliary verb? Boeken: noun or verb Voert uw bedrijf sloten uit? Voert: form of voeren or of uitvoeren, Bedrijf: noun or verb form? Sloten uit: noun+particle or PP: out of ditches/locks

17 Why is MT difficult? Ambiguity of natural language Summary
requires modeling of knowledge of the world /situation by rule systems, and/or by statistics

18 MT: Why is it so difficult?
Computational Complexity High demands of processing capacity High demands on memory Complexity of language Many different construction types All interacting with each other

19 Why is MT difficult? Divergences between languages
require deep syntactic analysis or very sophisticated statistical techniques

20 Divergences: Category mismatches
Simple category mismatches woonachtig (zijn) v. reside (Adj – Verb) zich ergeren v. (be) annoyed (Verb-Adj) verliefd v. in love (Adj- Prep+Noun) kunnen v. (be) able kunnen v. (be) possible door- v. continue (to)

21 Divergences: Category mismatches
More complex category mismatches graag vs. like (Adv vs. Verb) hij zwemt graag vs. he likes to swim toevallig vs. happen hij viel toevallig vs. he happened to fall

22 Divergences: Category mismatches
Phrasal category mismatches de zieke vrouw the woman who is ill (* the ill woman) I expect her to leave ik verwacht dat zij vertrekt She is likely to come het is waarschijnlijk dat zij komt

23 Conflational Divergences:
prepositional complements houden van vs. love existential er vs. Ø er passeerde een auto vs. a car passed verbal particles blow (something) up vs. volar

24 Conflational Divergences:
reflexive verbs zich scheren vs. shave composed vs. simple tense forms he will do it vs. lo hará split negatives vs. composed negatives he does not see anyone vs. hij ziet niemand

25 Functional Divergences:
I like these apples me gustan estas manzanas se venden manzanas aqui hier verkoopt men appels er werd door de toeschouwers gejuicht the spectators were cheering

26 Divergences: MWEs
Semi-fixed MWEs: nuclear power plant vs. kerncentrale. Flexible idioms: de plaat poetsen vs. bolt; de pijp uit gaan vs. to kick the bucket

27 Divergences: MWEs
Semi-idioms (collocations): zware shag vs. strong tobacco. Semi-idioms (support verbs): aandacht besteden aan vs. pay attention to

28 MT: Why is it so difficult?
Language Competence v. Language Use Earlier systems implemented idealized reality But not the really occurring language use In some cases focus on theoretically interesting difficult constructions That do occur in reality But other constructions are more important to deal with in practical systems

29 MT: Why is it so difficult?
Large and rich lexicons Existing human-oriented dictionaries are not suited as such All information must be available in a formalized way Much more information is needed than in a traditional dictionary

30 MT: Why is it so difficult?
Multi-word Expressions (MWEs) Are in current dictionaries only in a very informal way No standards on how to represent them lexically Many different types requiring different treatment in the grammar Huge numbers!! Domain and company-specific terminology are often MWEs

31 MT: Can we make it possible?
Probably not, but we can still improve significantly Lexicons Selection restrictions Approximating analyses Statistical MT

32 MT: Can we make it possible?
Large and rich lexicons widely accepted and used (de facto) standards Methods and tools to quickly adapt to domain or company specific vocabulary Better treatment of MWEs and standards for lexical representation of MWEs

33 MT: Can we make it possible?
Selection restrictions with type system to approach modeling of world knowledge. Requires sophisticated syntactic analysis. Boek: info (legible). Uur: time unit → duration. Vergadering: event → duration. Lezen: subject=human; object=info (legible). Durational adjunct must be a duration phrase

34 MT: Can we make it possible?
Selection restrictions. Pak (1) (suit): clothes. Pak (2) (package): entity. Dragen (1) (wear): subj=animate; object=clothes. Dragen (2) (carry): subj=animate; object=entity. Schoen (shoe): clothes. Entity > clothes. Identity preferred over subsumption. Homogeneous object preferred over heterogeneous one

35 MT: Can we make it possible?
Selection restrictions
Hij draagt een bruin pak:
He wears a brown suit (1: clothes=clothes)
He carries a brown package (1: entity=entity)
He carries a brown suit (2: entity > clothes)
*He wears a brown package (clothes ¬> entity)
Hij draagt een bruin pak en zwarte schoenen:
He wears a brown suit and black shoes (1: homogeneous and clothes=clothes)
He carries a brown suit and black shoes (2: homogeneous but entity > clothes)
He carries a brown package and black shoes (2: inhomogeneous but entity=entity)
*He wears a brown package and black shoes (clothes ¬> entity)
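
The ranking of readings on the two selection-restriction slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny lexicon, the two-type hierarchy and the function names are hypothetical, and the scoring simply gives rank 1 to readings where every object type is identical to the type the verb requires (and the objects are homogeneous), rank 2 to readings accepted only through subsumption or with mixed object types, and drops readings that violate the restriction.

```python
from itertools import product

# Hypothetical mini type hierarchy (child -> parent): entity > clothes.
PARENT = {"clothes": "entity", "entity": None}

# Hypothetical lexicon: each Dutch word maps to (English sense, type) pairs.
NOUNS = {
    "pak": [("suit", "clothes"), ("package", "entity")],
    "schoenen": [("shoes", "clothes")],
}
# For verbs the type is the one required of the object.
VERBS = {"dragen": [("wear", "clothes"), ("carry", "entity")]}


def subsumes(general, specific):
    """True if `general` equals `specific` or is one of its ancestors."""
    t = specific
    while t is not None:
        if t == general:
            return True
        t = PARENT[t]
    return False


def readings(verb, objects):
    """Return sorted (rank, verb sense, object senses) for all accepted readings."""
    results = []
    for verb_sense, required in VERBS[verb]:
        for combo in product(*(NOUNS[o] for o in objects)):
            types = [t for _, t in combo]
            if not all(subsumes(required, t) for t in types):
                continue  # selection restriction violated: reading rejected
            identical = all(t == required for t in types)
            homogeneous = len(set(types)) == 1
            rank = 1 if (identical and homogeneous) else 2
            results.append((rank, verb_sense, tuple(s for s, _ in combo)))
    return sorted(results)


single = readings("dragen", ["pak"])
coordinated = readings("dragen", ["pak", "schoenen"])
```

For the coordinated object only "wear a suit and shoes" gets rank 1, which is the disambiguating effect the slide is after.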

36 MT: Can we make it possible?
Approximating analyses Ignore certain ambiguities to begin with Use only limited amount of relevant information Cut off analysis when there are too many alternatives This is currently actually done in all practical systems Need new ways of doing this without affecting quality too seriously

37 MT: Can we make it possible?
Statistical MT: derives an MT system automatically from statistics taken from aligned parallel corpora (→ translation model) and monolingual target language corpora (→ language model). Being worked on since the early 1990s
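
The slide names the two statistical components but not how they are combined. One common way to write the combination, which is an assumption here since the slides do not state it, is the noisy-channel formulation, with f the source sentence and e a candidate target sentence:

```latex
\hat{e} \;=\; \arg\max_{e} P(e \mid f) \;=\; \arg\max_{e} \; P(f \mid e)\, P(e)
```

Here P(f | e) plays the role of the translation model estimated from aligned parallel corpora, and P(e) the language model estimated from monolingual target language corpora.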

38 MT: Can we make it possible?
Plus: No or very limited grammar development Includes language and world knowledge automatically (but implicitly) Based on actually occurring data Currently many experimental and commercial systems Minus: Requires large aligned parallel corpora Unclear how much linguistics will be needed anyway Probably restricted to very limited domains only

39 MT: Can we make it possible?
Google Translate (statistical MT)
Hij draagt een pak. → √ He wears a suit.
Hij draagt schoenen. → √ He wears shoes.
Hij draagt bruine schoenen en een pak. → √ He wears a suit and brown shoes. (!!)
Hij draagt het pakket → √ He carries the package
Hij heeft een pak aan. → *He has a suit.
Voert uw bedrijf sloten uit? → *Does your company locks out?

41 MT: Can we make it possible?
Euromatrix esp. “the Euromatrix”: lists data and tools for European language pairs. Goals: translation systems for all pairs of EU languages; organization, analysis and interpretation of a competitive annual international evaluation of machine translation; the provision of open source machine translation technology including research tools, software and data; a systematically compiled and constantly updated detailed survey of the state of MT technology for all EU language pairs; efficient inclusion of linguistic knowledge into statistical machine translation; the development and testing of hybrid architectures for the integration of rule-based and statistical approaches. Successor project: EuromatrixPlus

42 MT: Can we make it possible?
META-NET (EU-funding) Building a community with shared vision and strategic research agenda Building META-SHARE, an open resource exchange facility Building bridges to neighbouring technology fields Bringing more Semantics into Translation Optimising the Division of Labour in Hybrid MT Exploiting the Context for Translation Empirical Base for Machine Translation

43 MT: Can we make it possible?
PACO-MT Investigates hybrid approach to MT Rule-based and statistical Uses existing parser for source language analysis Uses statistical n-gram language models for generation Uses statistical approach to transfer

44 MT Evaluation Evaluation depends on purpose of MT and how it is used
application, domain, controlled language Many aspects can be evaluated functionality, efficiency, usability, reliability, maintainability, portability translation quality embedding in work flow post-editing options/tools

45 MT Evaluation
Focus here: does the system yield good translations according to human judgement, in the context of developing a system. Again, many aspects: fidelity (how close), correctness, adequacy, informativeness, intelligibility, fluency; and many ways to measure these aspects

46 MT Evaluation Test suite
Reference = list of (carefully selected) sentences with their translations (ordered by score); translations judged correct by a human (usually the developer). Upon every update of the system, the output of the new system is compared to the reference; if different, either the system or the reference has to be adapted. Advantages: focus on specific translation problems possible; excellent for regression testing; manual judgement needed only once for each new output, other comparisons are automatic. Disadvantages: not really independent; particularly suited for pure rule-based systems; human judgement needed if output differs from reference
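
The compare-against-reference loop of a test suite can be sketched as follows. The function name, the toy lookup-table "system" and the reference translations are all hypothetical and only serve the illustration; a real system would be called instead of the dictionary lookup.

```python
def regression_check(translate, reference):
    """Return (source, approved, output) for every item whose output changed."""
    changed = []
    for source, approved in reference.items():
        output = translate(source)
        if output != approved:
            # A human now decides: adapt the system, or adapt the reference.
            changed.append((source, approved, output))
    return changed


# Toy stand-in for an MT system: a lookup table (purely illustrative).
toy_system = {
    "Hij draagt een pak.": "He wears a suit.",
    "Hij heeft een pak aan.": "He has a suit.",
}
# Approved translations, judged correct by a human once.
reference = {
    "Hij draagt een pak.": "He wears a suit.",
    "Hij heeft een pak aan.": "He is wearing a suit.",
}

differences = regression_check(toy_system.get, reference)
```

Only the differing items need human attention, which is why the approach suits regression testing.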

47 MT Evaluation Comparison against a translation corpus
independently created by human translators; possibly multiple equivalently correct translations of a sentence. Advantages: truly independent; also suited for data-driven systems. Disadvantages: requires human judgement (every time there is a system update); high effort by highly skilled people, high costs, requires a lot of time; human judgement is not easy (unless there is a perfect match). Useful for a one-time evaluation of a stable system, not for evaluation during development

48 MT Evaluation Edit-Distance (Word Accuracy)
metric to determine closeness of translations automatically the least number of edit operations to turn the translated sentence into the reference sentence Alshawi et al. 1998

49 MT Evaluation $WA = 1 - \frac{d + s + i}{\max(r, c)}$
d = number of deletions; s = number of substitutions; i = number of insertions; r = reference sentence length; c = candidate sentence length. Easy to calculate using the Levenshtein distance algorithm (dynamic programming). Various extensions have been proposed
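
A minimal sketch of word accuracy, assuming whitespace tokenization and taking d + s + i to be the minimum word-level edit distance computed with the standard Levenshtein dynamic program. The function names are illustrative, not taken from the slides.

```python
def edit_distance(candidate, reference):
    """Minimum number of word deletions, substitutions and insertions
    needed to turn the candidate token list into the reference token list."""
    prev = list(range(len(reference) + 1))
    for i, c_word in enumerate(candidate, start=1):
        curr = [i]
        for j, r_word in enumerate(reference, start=1):
            cost = 0 if c_word == r_word else 1
            curr.append(min(prev[j] + 1,          # delete c_word
                            curr[j - 1] + 1,      # insert r_word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def word_accuracy(candidate, reference):
    """WA = 1 - (d + s + i) / max(r, c) for two whitespace-tokenized sentences."""
    c, r = candidate.split(), reference.split()
    if not c and not r:
        return 1.0
    return 1 - edit_distance(c, r) / max(len(r), len(c))


one_substitution = word_accuracy("the cat sat on the mat", "the cat is on the mat")
swapped_words = word_accuracy("a b", "b a")
```

The second call shows how harshly the metric treats a simple swap: two adjacent words in the wrong order already cost two edits.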

50 MT Evaluation
Advantages: fully automatic given a reference set. Disadvantages: penalizes candidates if a synonym is used; penalizes swaps of words and blocks of words too much

51 MT Evaluation BLEU (BiLingual Evaluation Understudy): method to automate MT evaluation
Central idea: the closer a machine translation is to a professional human translation, the better it is. Required: a corpus of good quality human reference translations; a “closeness” metric

52 MT Evaluation Two candidate translations from Chinese source
C1: It is a guide to action which ensures that the military always obeys the commands of the party C2: It is to insure the troops forever hearing the activity guidebook that party direct Intuitively: C1 is better than C2

53 MT Evaluation Three reference translations
R1: It is a guide to action that ensures that the military will forever heed Party commands R2: It is the guiding principle which guarantees the military forces always being under the command of the Party R3: It is the practical guide for the army always to heed the directions of the party

54 MT Evaluation Basic idea:
a good candidate translation shares many words and phrases with reference translations comparing n-gram matches can be used to rank candidate translations n-gram: a sequence of n word occurrences in BLEU n=1,2,3,4 1-grams give a measure of adequacy longer n-grams give a measure of fluency

55 MT Evaluation For unigrams: count the number of matching unigrams
in all references divide by the total number of unigrams (in the candidate sentence)

56 MT Evaluation Problem:
C1: the the the the the the the (unigram precision = 7/7 = 1). R1: the cat is on the mat. Solution: clip the matching count (7) by the maximum reference count (2) → 2 (Count_clip) → modified unigram precision = 2/7 = 0.29
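
The clipping step can be sketched with Python's `Counter`. The function name and the lowercasing plus whitespace tokenization are assumptions made for the illustration.

```python
from collections import Counter


def modified_unigram_precision(candidate, references):
    """Return (clipped matches, candidate length) for one candidate sentence."""
    cand_counts = Counter(candidate.lower().split())
    # For each word, the largest count observed in any single reference.
    max_ref = Counter()
    for ref in references:
        for word, n in Counter(ref.lower().split()).items():
            max_ref[word] = max(max_ref[word], n)
    # Clip each candidate count by that maximum reference count.
    clipped = sum(min(n, max_ref[word]) for word, n in cand_counts.items())
    return clipped, sum(cand_counts.values())


clipped, total = modified_unigram_precision(
    "the the the the the the the", ["the cat is on the mat"]
)
```

With the slide's example this gives 2 clipped matches out of 7 candidate words.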

57 MT Evaluation Example (unigrams)
C1: It is a guide to action which ensures that the military always obeys the commands of the party (17/18=0.94) R1: It is a guide to action that ensures that the military will forever heed Party commands R2: It is the guiding principle which guarantees the military forces always being under the command of the Party R3: It is the practical guide for the army always to heed the directions of the party

58 MT Evaluation Example (unigrams)
C2: It is to insure the troops forever hearing the activity guidebook that party direct (8/14=0.57) R1: It is a guide to action that ensures that the military will forever heed Party commands R2: It is the guiding principle which guarantees the military forces always being under the command of the Party R3: It is the practical guide for the army always to heed the directions of the party

59 MT Evaluation Example (bigrams)
C1: It is a guide to action which ensures that the military always obeys the commands of the party (10/17=0.59) R1: It is a guide to action that ensures that the military will forever heed Party commands R2: It is the guiding principle which guarantees the military forces always being under the command of the Party R3: It is the practical guide for the army always to heed the directions of the party

60 MT Evaluation Example (bigrams)
C2: It is to insure the troops forever hearing the activity guidebook that party direct (1/13=0.08) R1: It is a guide to action that ensures that the military will forever heed Party commands R2: It is the guiding principle which guarantees the military forces always being under the command of the Party R3: It is the practical guide for the army always to heed the directions of the party

61 MT Evaluation Extend to a full multi-sentence corpus
compute n-gram matches sentence by sentence; sum the clipped n-gram counts for all candidates; divide by the number of candidate n-grams in the test corpus: $p_n = \frac{\sum_{C \in \{Candidates\}} \sum_{n\text{-gram} \in C} Count_{clip}(n\text{-gram})}{\sum_{C' \in \{Candidates\}} \sum_{n\text{-gram}' \in C'} Count(n\text{-gram}')}$
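
A sketch of the corpus-level modified n-gram precision, checked against the candidate and reference sentences from the earlier example slides. Lowercasing, whitespace tokenization and the function names are assumptions of this sketch.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_precision(candidates, references, n):
    """Return (sum of clipped n-gram counts, number of candidate n-grams).

    candidates: list of sentences; references: one list of reference
    sentences per candidate.
    """
    clipped_total, count_total = 0, 0
    for cand, refs in zip(candidates, references):
        cand_counts = ngrams(cand.lower().split(), n)
        max_ref = Counter()
        for ref in refs:
            max_ref |= ngrams(ref.lower().split(), n)  # element-wise maximum
        clipped_total += sum((cand_counts & max_ref).values())  # element-wise minimum
        count_total += sum(cand_counts.values())
    return clipped_total, count_total


REFS = [
    "It is a guide to action that ensures that the military will forever heed Party commands",
    "It is the guiding principle which guarantees the military forces always being under the command of the Party",
    "It is the practical guide for the army always to heed the directions of the party",
]
C1 = "It is a guide to action which ensures that the military always obeys the commands of the party"
C2 = "It is to insure the troops forever hearing the activity guidebook that party direct"

c1_unigrams = corpus_precision([C1], [REFS], 1)
c1_bigrams = corpus_precision([C1], [REFS], 2)
c2_unigrams = corpus_precision([C2], [REFS], 1)
c2_bigrams = corpus_precision([C2], [REFS], 2)
```

Treating C1 and C2 together as a two-sentence corpus simply adds numerators and denominators, which is what the double sum in the formula expresses.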

62 MT Evaluation Combining n-gram precision scores
a weighted linear average works reasonably: $\sum_{n=1}^{N} w_n p_n$; but n-gram precision decays exponentially with n (so take logs to compensate for this): $\exp\left(\sum_{n=1}^{N} w_n \log p_n\right)$; weights in BLEU: $w_n = 1/N$

63 MT Evaluation BLEU is a precision measure
$\#(C \cap R) / \#C$. Recall is difficult to define because of multiple reference translations; e.g. $\#(C \cap R_s) / \#R_s$, where $R_s = \bigcup_i R_i$, will not work

64 MT Evaluation C1: I always invariably perpetually do C2: I always do
R1: I always do R2: I invariably do R3: I perpetually do Recall of C1 over R1-3 is better than C2 but C2 is a better translation
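
The problem with recall over the pooled references can be reproduced in a few lines. The function below is a sketch of the naive definition #(C ∩ Rs) / #Rs over word types, with Rs the union of the references; the name and the whitespace tokenization are assumptions.

```python
def naive_recall(candidate, references):
    """Return (reference word types covered by the candidate, all reference word types)."""
    ref_words = set()
    for ref in references:
        ref_words |= set(ref.split())  # Rs = union of all references
    covered = ref_words & set(candidate.split())
    return len(covered), len(ref_words)


REFS = ["I always do", "I invariably do", "I perpetually do"]
recall_c1 = naive_recall("I always invariably perpetually do", REFS)
recall_c2 = naive_recall("I always do", REFS)
```

C1 covers all five word types and C2 only three, so the naive measure rewards piling up alternative wordings from different references.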

65 MT Evaluation But without Recall: C1: of the
compared with R1-3 as before modified unigram precision = 2/2 modified bigram precision = 1/1 which is the wrong result

66 MT Evaluation Length: n-gram precision penalizes translations longer than the reference but not translations shorter than the reference → add Brevity Penalty (BP)

67 MT Evaluation $b_i$ = best match length = reference sentence length closest to candidate sentence i's length (e.g. r: 12, 15, 17; c: 12 → 12). r = test corpus effective reference length = $\sum_i b_i$. c = total length of candidate translation corpus

68 MT Evaluation
$BP = 1$ if $c > r$; $BP = e^{(1 - r/c)}$ if $c \le r$. BP is computed over the corpus, not sentence by sentence and averaged. $BLEU = BP \cdot \exp\left(\sum_{n=1}^{N} w_n \log p_n\right)$
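
A sketch of the brevity penalty and the final combination, taking sentence lengths and already computed n-gram precisions as input. The function names are illustrative, and two choices are assumptions of this sketch rather than something the slides specify: ties between equally close reference lengths go to the shorter reference, and a zero precision yields a score of 0.

```python
import math


def brevity_penalty(cand_lengths, ref_lengths):
    """cand_lengths[i]: length of candidate i; ref_lengths[i]: lengths of its references."""
    c = sum(cand_lengths)
    if c == 0:
        return 0.0
    # r sums, per sentence, the reference length closest to the candidate length
    # (assumption: on a tie, the shorter reference is taken).
    r = sum(min(refs, key=lambda length: (abs(length - ci), length))
            for ci, refs in zip(cand_lengths, ref_lengths))
    return 1.0 if c > r else math.exp(1 - r / c)


def bleu(precisions, bp):
    """Combine n-gram precisions p_1..p_N with uniform weights w_n = 1/N."""
    if min(precisions) == 0:
        return 0.0  # log(0) is undefined; treated as a zero score here
    w = 1 / len(precisions)
    return bp * math.exp(sum(w * math.log(p) for p in precisions))


# The two-word candidate "of the" against references of 16, 18 and 16 words:
bp_short = brevity_penalty([2], [[16, 18, 16]])
score_short = bleu([2 / 2, 1 / 1], bp_short)  # roughly 0.0009
```

The perfect unigram and bigram precisions of the two-word candidate are thus almost entirely cancelled by the penalty.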

69 MT Evaluation BLEU: claim: BLEU closely matches human judgement
when averaged over a test corpus, not necessarily on individual sentences; shown extensively in Papineni et al. 2001 → multiple reference translations are desirable, to cancel out translation styles of individual translators (e.g. East Asian economy v. economy of East Asia)

70 MT Evaluation Variants on BLEU
NIST: different weights; different BP. ROUGE (Lin and Hovy 2003): Recall-Oriented Understudy for Gisting Evaluation; for text summarization

71 MT Evaluation
Main advantage of BLEU: automatic evaluation; good for use during development; particularly useful for data-based systems. Disadvantages: defined for a whole test corpus, not for individual sentences; just measures difference with reference

72 MT: What is (perhaps) possible
Cross-Language Information Retrieval Low Quality MT for Gist extraction MT and Speech Technology Controlled Language Limited Domain Interaction with author Combinations of the above Computer-aided translation

73 MT: What is (perhaps) possible
Cross-Language Information Retrieval (CLIR) Input query: in own language Input query translated into target languages Search in target language documents Results in target language Translation of individual words only Growing need (growing multilingual Web) No perfect translation required
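
The CLIR pipeline on this slide (translate the query word by word, then search target language documents) can be sketched as below. The bilingual word list, the documents and the function names are hypothetical, and matching is plain word overlap rather than a real retrieval model.

```python
# Hypothetical Dutch-English word list; ambiguous words keep all translations.
LEXICON = {
    "sloten": ["ditches", "locks"],
    "bedrijf": ["company"],
}


def translate_query(query):
    """Translate a query word by word; unknown words are kept as they are."""
    terms = set()
    for word in query.lower().split():
        terms.update(LEXICON.get(word, [word]))
    return terms


def search(query, documents):
    """Rank target language documents by the number of matching query terms."""
    terms = translate_query(query)
    scored = [(len(terms & set(doc.lower().split())), doc) for doc in documents]
    return [doc for score, doc in sorted(scored, key=lambda x: -x[0]) if score > 0]


docs = ["The company exports locks", "Weather report for Amsterdam"]
hits = search("bedrijf sloten", docs)
```

Keeping every translation of an ambiguous word is tolerable here because, as the slide notes, no perfect translation is required for retrieval.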

74 MT: What is (perhaps) possible

75 MT: What is (perhaps) possible
Low quality MT for Gist extraction: low quality but still useful. If the text turns out to be interesting, a high quality human translation can be requested (has to be paid for)

76 MT: What is (perhaps) possible

77 MT: What is (perhaps) possible

78 MT: What is (perhaps) possible
CLIR Fills a growing need in the market Is technically feasible Creates need for translation of found documents Solved partially by low quality MT Potentially creates need for more human translation Stimulates (funds) research into more sophisticated MT

79 MT: What is (perhaps) possible
Combine MT (statistical or rule-based) with OCR technology Make a picture of a text with your phone Text is OCR-ed Text is translated (usually a short and simple text) Linguatec Shoot & Translate Word Lens

80 MT: What is (perhaps) possible
Combine MT (statistical or rule-based) with Speech technology Complicates the problem on the one hand but Speech technology (ASR) is currently limited to very limited domains (makes MT simpler) Many useful applications for speech technology currently in the market Directory assistance Tourist Information Tourist communication Call Centers Navigation Hotel reservations Some will profit from in-built automatic translation

81 MT: What is (perhaps) possible
Large EC FP6 project TC-STAR (2004-). Research into improved speech technology (ASR and TTS); research into statistical MT; research in combining both (speech-to-speech translation); in a few selected limited domains

82 MT: What is (perhaps) possible
Commercial Speech2Speech Translation Jibbigo Speech-to-speech translation (iPhone, Android) Talk to Me (Android phones)

83 MT: What is (perhaps) possible
Controlled Language Authoring System limits vocabulary and syntax of document authors Often desirable in companies to get consistent documentation (e.g. aircraft maintenance manuals) AECMA Simplified English GIFAS Rationalized French Makes MT easier (language well-defined)

84 MT: What is (perhaps) possible
Limited Domain Translation of Weather reports (TAUM-Meteo, Canada) Avalanche warnings (Switzerland) Fast adaptation to domain/company-specific vocabulary and terminology

85 MT: What is (perhaps) possible
Interaction with author: no fully automatic translation. The document author resolves ambiguities unresolved by the system, in a dialogue between the author and the system in the source language. Approach taken in the Rosetta project (Philips). Will only work if the number of unresolved ambiguities is low and the questions to resolve ambiguity are clear

86 MT: What is (perhaps) possible
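Hij droeg een bruin pak (He wore/carried a brown suit/package). Wat bedoelt u met “pak”? (What do you mean by “pak”?) (1) kostuum (suit) (2) pakket (package). Wat bedoelt u met “dragen (droeg)”? (What do you mean by “dragen (droeg)”?) (1) aan of op hebben (kleding) (to have on, of clothing) (2) bij zich hebben (bijv. in de hand) (to have with you, e.g. in the hand)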

87 MT: What is (perhaps) possible
Combinations of the above

88 MT: What is (perhaps) possible
Computer-aided translation For end-users For professional translators/localization industry Limited functionality Specific terminology Bootstrap translation automatically Human revision and correction (Post-edit) Only if MT Quality is such that it reduces effort The system is fully integrated in the workflow system

89 Conclusions FAHQT not possible (yet?) MT is really very difficult!
Several constrained versions do yield usable technology with state-of-the-art MT In some cases: even potentially creates additional needs for MT and human translation

90 Conclusions Statistical MT yields practical systems that are relatively quick to produce (but low-quality). More research and lots of hard work are needed to get better systems. This will probably require hybrid systems (mixed statistically based/knowledge based); the focus of research is here (PACO-MT, META-NET, …). Needs to be financed by niches where current state-of-the-art MT yields usable technology and there is a market.

