Two sides of the same coin: assessing translation quality through adequacy and acceptability error analysis
Joke Daems (firstname.lastname@example.org)
www.lt3.ugent.be/en/projects/robot
Supervised by: Lieve Macken, Sonia Vandepitte, Robert Hartsuiker
What makes error analysis so complicated?
“There are some errors for all types of distinctions, but the most problematic distinctions were for adequacy/fluency and seriousness.” - Stymne & Ahrenberg, 2012
– Does a problem concern adequacy, fluency, both, or neither?
– How do we determine the seriousness of an error?
Two types of quality
“Whereas adherence to source norms determines a translation's adequacy as compared to the source text, subscription to norms originating in the target culture determines its acceptability.” - Toury, 1995
Why mix?
Acceptability: fine-grained
Grammar & Syntax: article, comparative/superlative, singular/plural, verb form, article-noun agreement, noun-adj agreement, subject-verb agreement, reference, missing, superfluous, word order, structure, grammar - other
Lexicon: wrong preposition, wrong collocation, word nonexistent
Spelling & Typos: capitalization, spelling mistake, compound, punctuation, typo
Style & Register: register, untranslated, repetition, disfluent, short sentences, long sentence, text type, style - other
Coherence: conjunction, missing info, logical problem, paragraph, inconsistency, coherence - other
Adequacy: fine-grained
Meaning shift: contradiction, word sense disambiguation, hyponymy, hyperonymy, terminology, quantity, time, meaning shift caused by punctuation, meaning shift caused by misplaced word, deletion, addition, explicitation, coherence, inconsistent terminology, other
How serious is an error?
“Different thresholds exist for major, minor and critical errors. These should be flexible, depending on the content type, end-user profile and perishability of the content.” - TAUS, error typology guidelines, 2013
Give different weights to error categories depending on text type & translation brief
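A minimal sketch of what brief-dependent error weights could look like in practice. The brief names, the weight values, and the function `weighted_error_score` are all hypothetical and chosen for illustration only; the slides do not specify actual weights.

```python
from collections import Counter

# Hypothetical weights per translation brief (illustrative values only).
WEIGHTS_BY_BRIEF = {
    "newspaper article": {"spelling mistake": 2, "wrong collocation": 2},
    "internal memo": {"spelling mistake": 1, "wrong collocation": 1},
}


def weighted_error_score(errors, brief, default_weight=1):
    """Sum the brief-specific weights of all annotated errors in one translation."""
    weights = WEIGHTS_BY_BRIEF[brief]
    counts = Counter(errors)
    # Categories without an explicit weight fall back to default_weight.
    return sum(weights.get(cat, default_weight) * n for cat, n in counts.items())


# The same annotations scored under two different briefs.
errors = ["spelling mistake", "wrong collocation", "spelling mistake"]
score_news = weighted_error_score(errors, "newspaper article")
score_memo = weighted_error_score(errors, "internal memo")
```

Keeping the weights in a separate table means the annotation itself stays the same across text types; only the scoring step changes with the brief.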
Reducing subjectivity
– Flexible error weights
– More than one annotator
– Consolidation phase
Next step: diagnostic & comparative evaluation
– What makes an ST-passage problematic?
– How problematic is this passage really? (i.e., how many translators make errors?)
– Which PE errors are caused by MT?
– Which MT errors are hardest to solve?
Link all errors to the corresponding ST-passage
Source text-related error sets
ST: Changes in the environment that are sweeping the planet...
MT: Veranderingen in de omgeving die het vegen van de planeet tot stand brengen... (wrong word sense)
    "Changes in the environment that bring about the brushing of the planet..."
PE1: Veranderingen in de omgeving die het evenwicht op de planeet verstoren... (other type of meaning shift)
    "Changes in the environment that disturb the balance on the planet..."
PE2: Veranderingen in de omgeving die over de planeet rasen... (wrong collocation + spelling mistake)
    "Changes in the environment that raige over the planet..."
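One possible way to represent such an error set is to store each annotation together with the ST-passage it is linked to, and then ask how many translators made at least one error on that passage. The `Annotation` class and the function `share_with_errors` are hypothetical names introduced for this sketch; the annotations are the ones from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    st_passage: str  # source-text passage the error is linked to
    translator: str  # e.g. "MT", "PE1", "PE2"
    category: str    # error label from the taxonomy


def share_with_errors(annotations, st_passage, translators):
    """Fraction of the given translators with at least one error on st_passage."""
    if not translators:
        raise ValueError("translators must not be empty")
    erring = {a.translator for a in annotations if a.st_passage == st_passage}
    return len(erring & set(translators)) / len(translators)


passage = "sweeping the planet"
annotations = [
    Annotation(passage, "MT", "wrong word sense"),
    Annotation(passage, "PE1", "other meaning shift"),
    Annotation(passage, "PE2", "wrong collocation"),
    Annotation(passage, "PE2", "spelling mistake"),
]

# Both post-editors in the example made an error on this passage.
share = share_with_errors(annotations, passage, ["PE1", "PE2"])
```

A translator with several errors on the same passage is counted once, so the result reflects how many translators struggled with the passage rather than how many errors were made.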
Summary
Improve error analysis by:
– judging acceptability and adequacy separately
– making error weights depend on the translation brief
– having more than one annotator
– introducing a consolidation phase
Improve diagnostic and comparative evaluation by:
– linking errors to ST-passages
– taking the number of translators into account
Open questions
How can we reduce annotation time?
– Ways of automating (part of) the process?
– Limit annotation to a subset of errors?
How to better implement ST-related error sets?
– Ways of automatically aligning ST, MT, and various TTs at word level?
Thank you for listening
For more information, contact: email@example.com
Suggestions? Questions?
Quantification of ST-related error sets
ST
  MT (1)
    MT1 (0.5): wrong word sense (0.5)
    MT2 (0.5)
  PE (1)
    PE1 (0.5): other meaning shift (0.5)
    PE2 (0.5): wrong collocation (0.25), spelling mistake (0.25)
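The numbers on this slide are consistent with a simple splitting rule, sketched below as an assumption rather than as the authors' stated method: each condition (MT, PE) has a total weight of 1, that weight is divided equally among the translations in the condition, and each translation's share is divided equally among its errors. The function name `error_weights` is hypothetical, and MT2 is assumed to contain no error for this passage because the slide lists none.

```python
from fractions import Fraction


def error_weights(condition):
    """Map (translation, error) to its weight within one condition of total weight 1."""
    share = Fraction(1, len(condition))  # equal share per translation
    weights = {}
    for translation, errors in condition.items():
        for error in errors:
            # A translation's share is split equally over its errors.
            weights[(translation, error)] = share / len(errors)
    return weights


# Assumption: MT2 has no error linked to this ST-passage.
mt = {"MT1": ["wrong word sense"], "MT2": []}
pe = {
    "PE1": ["other meaning shift"],
    "PE2": ["wrong collocation", "spelling mistake"],
}

mt_weights = error_weights(mt)
pe_weights = error_weights(pe)

# Total error weight per condition: how problematic the passage was there.
mt_total = sum(mt_weights.values())
pe_total = sum(pe_weights.values())
```

Under this reading, a translation without errors contributes nothing to the total, so the condition total drops below 1 whenever at least one translator handled the passage correctly.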