Machine translation markers in post-edited machine translation output


1 Machine translation markers in post-edited machine translation output
Translating and the Computer 40 London, UK #TC18

2 Translation vs. Post-Edited MT
- Some authors say people prefer translated texts (Fiederer and O’Brien, 2009; Bowker and Buitrago Ciro, 2015)
- Others say people are not able to tell the difference between human translation (HT) and post-edited machine translation (PEMT) (Daems, De Clercq and Macken, 2017)

3 No difference, really? Given that:
- Post-editors tend to leave acceptable solutions unedited
- Machine translation tends to choose one of the solutions most frequently chosen by translators

4 No difference, really? Then:
- The statistically most frequent solutions in human translation will occur with a higher-than-natural frequency in PEMT (MT markers)
- MT markers may be used to design tests to tell HT and PEMT apart

5 Preliminary experiment
- 51 postgraduate university students
- Extracts from Wikipedia entries on Venice (153 words) and Verona (168 words)
- Half did unaided human translations from English into Italian
- Half did full post-editing of machine translation output
  - Microsoft Translator, both statistical and neural versions

6 Preliminary experiment
- Compare human translations with source text to find turns of phrase and expressions (n-grams) that have been translated in a wide variety of different ways
  - 41 n-grams identified
- Compare this variety to the number of ways in which the same n-grams have been rendered in post-edited MT
  - Excluding translation errors

7 Example (there are)
Columns: N-gram, HT group, PESMT group, PENMT group, Combined PEMT group
ci sono 10 7 11 18
sono 4
ospita 1
vanta
presenta 2
sono presenti
vi sono
si possono trovare
si possono visitare
è possibile visitare
possiamo trovare
offre
è famosa per
ha
TOTAL 26 12 24
È famosa per was rated as a debatable translation solution (see Method above for a definition) and therefore omitted from the calculations ("there are numerous attractions in Venice" vs. "Venice is famous for numerous attractions").
Fisher's exact two-tailed test
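The slide names Fisher's exact two-tailed test without showing the calculation. A minimal sketch of such a test follows, written in plain Python rather than with whatever tool the author used. The function name is illustrative, and the 2x2 table in the usage lines is an assumption: it reads the "ci sono" row as 10 of 26 HT renderings versus 18 of 24 combined PEMT renderings, which may not match the exact counts behind the slide (for instance, it does not remove the omitted è famosa per rendering).

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns the sum of the probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1 = a + b          # size of the first group
    col1 = a + c          # total count of the outcome of interest
    n = a + b + c + d     # grand total
    denom = comb(n, row1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell
        return comb(col1, k) * comb(n - col1, row1 - k) / denom

    p_observed = prob(a)
    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties
    return sum(
        p
        for p in (prob(k) for k in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# ASSUMED reading of the slide: "ci sono" vs. any other rendering
ht_ci_sono, ht_other = 10, 16      # 26 HT renderings in total (assumption)
pemt_ci_sono, pemt_other = 18, 6   # 24 combined PEMT renderings (assumption)
p = fisher_exact_two_tailed(ht_ci_sono, ht_other, pemt_ci_sono, pemt_other)
print(f"two-tailed p = {p:.4f}")
```

The same function can be applied to any other n-gram by replacing the four counts with the top-human-choice and other-rendering counts for each group.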

8 Translation errors
                       HT     PEMT
Debatable choices      18     12
Mistranslation         35     42
Total                  53     54
Errors per translator  2.04   2.25
- Errors were only counted for the n-grams analysed (75/153 words = 49% of text)
- The difference between the two groups is not statistically significant
- The quality is comparable if we evaluate quality purely in terms of translation errors

9 Variety (NTS/S) was:
- Higher in HT group: 22 cases (88%)
- Virtually the same: 1 case (4%)
- Clearly higher in PEMT group
- Difficult to calculate in 1 case (4%): numerous post-editing errors cause highly uneven group sizes
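The slides do not define NTS/S. One plausible reading, and it is only an assumption, is the number of distinct translation solutions divided by the number of subjects in the group, which would explain why uneven group sizes make it hard to calculate. The sketch below computes that ratio and compares two groups; the function names are illustrative and the sample renderings are invented counts, not data from the experiment.

```python
def variety(renderings):
    """Distinct solutions per subject (ASSUMED meaning of NTS/S).

    `renderings` holds one accepted rendering per subject, with
    translation errors already excluded.
    """
    if not renderings:
        raise ValueError("group has no usable renderings")
    return len(set(renderings)) / len(renderings)


def variety_ratio(ht_renderings, pemt_renderings):
    """How many times greater the variety is in HT than in PEMT."""
    return variety(ht_renderings) / variety(pemt_renderings)


# Invented toy groups of six subjects each, for illustration only
ht = ["ci sono", "ci sono", "vi sono", "ospita", "vanta", "offre"]
pemt = ["ci sono"] * 5 + ["vi sono"]

print(f"HT variety:   {variety(ht):.2f}")
print(f"PEMT variety: {variety(pemt):.2f}")
print(f"HT / PEMT:    {variety_ratio(ht, pemt):.1f}")
```

Dividing by group size keeps the measure comparable when the groups differ in size, but it becomes unstable when many renderings are excluded as errors and one group shrinks sharply.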

10 22 cases of greater variety
- 5 x greater variety 1
- 4 x greater variety 2
- 3 x greater variety
- 2 x greater variety 4
- Reverse

11 Conclusion
Much greater variety of translation solutions in the HT group than in the combined PEMT group.

12 Translation in raw MT
- Top human choice (THC): 14 cases (56%)
- Second to top human translation: 3 cases (12%)
- Different inflection of the THC: 1 case (4%)
- Mistranslation: 2 cases (8%)
- Unappealing solution
- Not rated among top human choices: 4 cases (16%)
  - 1 was an unappealing solution (all except one post-editor chose to change it)

13 Conclusion
More often than not, the raw MT output proposes the translation solution most commonly chosen in human translation.

14 THC frequency in PEMT
There were two main cases in which THC frequency in PEMT differed to a statistically significant degree:
- When the raw MT output contained the THC, in which case THC frequency was significantly higher
- When the raw MT output contained the second to top human choice, in which case THC frequency was significantly lower

15 Conclusion
If a post-editor finds a highly appealing translation solution, they tend to leave it and not waste time looking for alternatives.

16 MT markers
Ideal candidate MT markers:
- THC found in MT output
- THC occurs more often in PEMT than in HT, and the difference is very or extremely statistically significant
- There is two or more times greater variety in HT than in PEMT
Four n-grams satisfied these conditions
"There are" was chosen for its ubiquity, which makes it easily repeatable in a relatively short text without it seeming artificial
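The three conditions above can be read as a simple filter over per-n-gram statistics. The sketch below is one hypothetical way to express it. The class, field names and example values are invented for illustration, and the default threshold of p < 0.01 for "very significant" is an assumed convention that the slides do not state.

```python
from dataclasses import dataclass


@dataclass
class NgramStats:
    """Per-n-gram summary; every field name here is hypothetical."""
    source_ngram: str
    thc_in_raw_mt: bool          # raw MT output contains the top human choice
    thc_more_frequent_in_pemt: bool
    p_value: float               # e.g. from Fisher's exact two-tailed test
    variety_ratio: float         # HT variety divided by PEMT variety


def is_candidate_marker(stats, alpha=0.01, min_variety_ratio=2.0):
    """Apply the three slide conditions.

    alpha=0.01 is an ASSUMED cut-off for "very significant".
    """
    return (
        stats.thc_in_raw_mt
        and stats.thc_more_frequent_in_pemt
        and stats.p_value < alpha
        and stats.variety_ratio >= min_variety_ratio
    )


# Invented example values, not figures from the study
example = NgramStats("there are", True, True, 0.0005, 2.5)
print(is_candidate_marker(example))
```

Keeping the thresholds as parameters makes it easy to check how sensitive the list of candidate markers is to the chosen significance level.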

17 There are test
- A text (273 words / 4 paragraphs) containing 5 occurrences of "there are" was:
  - given to three volunteer professional translators for translation
  - Google-translated (neural) and given to another three for full post-editing
- The raw MT output contained the same proposed solution (ci sono) for each of the five occurrences.

18 There are test
Columns: Professional experience (years), Time (minutes), Number of occurrences of ci sono, Number of different solutions chosen, HT/PEMT
SC 8 51 5 HT
LZ 11 32 4
MLD 25 64 3
CP 16 47 1 PEMT
PV 28 45
DG 26 2
Surprise result
- Preliminary experiment says nothing about the variety of solutions adopted by an individual in the same job
  - It measures the variety of solutions chosen by a group or, by extrapolation, the community
- Post-editors in the there are test came up with a comparably wide range of solutions
- A different factor may have come into play
- Small scale may have distorted results
- This deserves further investigation
A different factor may have come into play. Italians are taught that good writers should avoid unnecessary lexical repetition. Five occurrences of the same expression in four paragraphs may have triggered a repetitiveness alarm, turning an otherwise correct solution into an unacceptable one. Alternatively, it may simply be that the second experiment was too small to give reliable results.

19 Discussion
- Variety and inventiveness are not always desirable features
- There are also various kinds of text where lexical uniformity is a negative quality factor
- In these cases, counting errors and measuring fluency and adequacy are not sufficient to judge translation quality

20 Discussion
- Preliminary experiment shows apparent normalization and homogenization of the choices made by post-editors as a whole
- Failure to remedy this normalization and homogenization may eventually lead to lexical impoverishment
- One solution might be to program NMT engines to sometimes randomly pick the second or third best fit translated sentence vectors
- Particularly in cultures where English has become the primary language in which new written material is created
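The idea of occasionally picking the second or third best candidate can be illustrated on a ranked list of hypotheses. The sketch below is purely hypothetical: it works on an already ranked n-best list of strings rather than inside a real NMT engine, and the function name, the `explore_prob` parameter and the sample list are all invented for illustration.

```python
import random


def pick_hypothesis(n_best, explore_prob=0.1, rng=random):
    """Return the top candidate, or sometimes the 2nd or 3rd best.

    n_best: candidate translations ordered best first (non-empty).
    explore_prob: ASSUMED probability of skipping the top candidate.
    rng: object with random() and choice(), e.g. random.Random(seed).
    """
    if not n_best:
        raise ValueError("n_best must contain at least one candidate")
    if len(n_best) > 1 and rng.random() < explore_prob:
        # Choose uniformly between the second and (if present) third best
        return rng.choice(n_best[1:3])
    return n_best[0]


# Invented n-best list for the source phrase "there are"
candidates = ["ci sono", "vi sono", "sono presenti", "si possono trovare"]
rng = random.Random(40)
print([pick_hypothesis(candidates, explore_prob=0.3, rng=rng) for _ in range(5)])
```

A fixed exploration probability is the simplest design; it trades a small risk of a less fluent rendering for greater lexical variety across outputs.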

21 Discussion
- Possible to train post-editors to add originality and inventiveness
  - Defeats the object of post-editing (time and cost saving)
- As MT systems improve, homogenization and normalization will probably be exacerbated

22 Discussion
On account of the findings reported herein, the use of PEMT for texts where variety, originality and inventiveness are quality factors would appear to be inadvisable with the MT technology currently available

23 The End

