POST-EDITING – PROFESSIONAL TRANSLATION SERVICE REDEFINED Darja Fišer University of Ljubljana 5 December 2014 Brussels, Belgium
Presentation Outline
1. Basic concepts about post-editing
2. Quality in translation projects
3. Types of post-editing
4. Post-editing guidelines
MOTIVATION Why should I care about post-editing?
Why PEMT?
- Increasing demand for PEMT in the market:
  - increasing volume of short-lived documents
  - different levels of text quality acceptable
- The industry perspective on MT:
  - to lower productivity prices
  - to publish more content
  - to publish in more languages
  - to publish in less time
- The TAUS (2010) survey:
  - 52% of companies in the US, EU & Asia provide PE services regularly
  - 74% of the resources they used are freelance translators
THE BIG PICTURE How does MT affect the translation process and the translator?
Integrating MT in the translation process
- Phase 1: Translation Memories. Source text (0% translated) > Translation Memory (TM) > Hybrid text (x% translated)
- Phase 2: Machine Translation. Untranslated segments > Machine Translation (MT) > Hybrid text (100% translated but with MT errors)
- Phase 3: Post-editing. Hybrid text > Post-editor > Target text (100% translated)
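The three phases above can be sketched as a small pipeline. This is only an illustration under simplifying assumptions: the TM is modelled as an exact-match dictionary (real TMs also return fuzzy matches), and `mt` and `post_edit` are hypothetical callables standing in for an MT system and a human post-editor.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Segment:
    source: str
    target: Optional[str] = None
    origin: str = "untranslated"  # becomes "tm", "mt" or "pe"


def run_pipeline(
    sources: List[str],
    tm: Dict[str, str],                    # assumption: exact-match TM only
    mt: Callable[[str], str],              # hypothetical MT system
    post_edit: Callable[[str, str], str],  # hypothetical human post-editor
) -> List[Segment]:
    segments = [Segment(s) for s in sources]

    # Phase 1: fill in whatever the translation memory already knows.
    for seg in segments:
        if seg.source in tm:
            seg.target = tm[seg.source]
            seg.origin = "tm"

    # Phase 2: machine-translate the segments that are still untranslated.
    for seg in segments:
        if seg.target is None:
            seg.target = mt(seg.source)
            seg.origin = "mt"

    # Phase 3: the post-editor corrects the raw MT output.
    for seg in segments:
        if seg.origin == "mt":
            seg.target = post_edit(seg.source, seg.target)
            seg.origin = "pe"

    return segments
```

After phase 1 the text is x% translated, after phase 2 it is 100% translated but with MT errors, and only phase 3 yields the final target text.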
The role of translators in PEMT
- The role of PEMT experts:
  - edit the output
  - select the adequate corpus
  - clean up the data so the output is more suitable for the customer
  - provide constant feedback to improve the system’s performance
  - the role changes as MT improves
- The nature of PEMT projects:
  - large contents of highly repetitive nature, short-lived, internal use
  - pre-editing (at the SL level before MT to avoid ambiguous input to the MT system)
  - post-editing (at the TL level after MT to correct errors in the MT output)
BASIC CONCEPTS IN PEMT Is post-editing a translation task or a revision task?
Post-editing vs. Translation
- Post-editing: reviewing an MT text against an original text and correcting any errors in order to comply with a set of quality criteria in as few edits as possible
  - the set of quality criteria ≠ a personal idea of translation quality
  - as few edits as possible > to increase productivity
- PE vs. Translation:
  - Translation: 1 source
  - PE: 2 sources (the original & the raw MT output):
    1. reject the MT output & translate from scratch (PE closer to Trans than Rev)
    2. correct many or a few errors (PE closer to Rev than Trans)
    3. accept the proposed translation as is (PE closer to Rev than Trans)
- PE should be done by a translator, not a monolingual reviewer!
Post-editing vs. Revision
- PE:
  - deals with recurring, predictable errors
  - MT texts put a strain on the post-editing expert, so PE is more cognitively demanding
- Revision:
  - checks for random mistranslations or omissions
  - human errors are more difficult to spot but the texts are easier to read
- PE & Revision both require specific skills and should be tackled by translators trained & experienced in the task!
  - > 100,000 words / 1 month of full-time post-editing
THE JOB PROFILE What skills and qualities do I need to be a good post-editing expert?
Skills for post-editing (O’Brien 2002)
- Degree in translation or related subjects
- Expert in the subject area and target language
- Proficient in the source language and contrastive issues
- Experience in technical translation/localization
- Advanced word processing skills, full key proficiency (search & replace)
- Positive, tolerant and open-minded towards MT
- Confidence in abilities and technical expertise
- Recognition of typical and repetitive MT errors
- Ability to use macros and coded dictionaries
- Advanced terminology management skills
- Background MT knowledge, types of PE and levels of expected quality
- Pre-editing skills (controlled language & controlled authoring tools)
- Programming skills for automatically correcting errors
MT QUALITY What can I expect from MT and what can clients expect from PEMT?
Common MT errors
- What MT errors to expect:
  - Depend on the MT system, the content and the language pair used!
  - error analysis is time-consuming but:
    - crucial to improve the MT system
    - crucial to raise awareness about the post-editing task
- Several error classifications exist (Schäffer, 2003):
  1. Lexical errors (general vocabulary, terminology, polysemy, idioms)
  2. Syntactic errors (sentence analysis, word order)
  3. Grammatical mistakes (tense, number, gender, case, punctuation)
  4. Errors due to defective input (mistakes in the source language)
Quality in technical translation and localization
- Functionalist approach to quality:
  - the focus is on the customer’s needs and what they pay for
  - quality is variable and is defined by clients, not society in general
  - Fit-for-purpose! (not what trained translators would consider the best)
- Quality of MT
  - standard MT evaluation measures (BLEU, Meteor, NIST, TER):
    - express how close the MT output is to human quality with a single number
    - not very reliable in most translation projects
  - manual quality assessment needed!
    - crucial for productivity savings & pricing
    - random strings checked for grammar, terminology and format (grades 1-5)
    - very specific client’s quality expectations needed! (rapid/full PE)
- Quality of PE
  - MT is used to save costs, so revision of PE texts is usually not done
  - Crucial to strike a balance between speed and the quality of PE
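The manual assessment mentioned above (random strings graded 1-5 for grammar, terminology and format) could be organised roughly as follows. This is a minimal sketch, not a standard procedure: the function names, the fixed seed and the per-category averaging are assumptions made for illustration.

```python
import random
from typing import Dict, List, Sequence

CATEGORIES = ("grammar", "terminology", "format")


def pick_sample(segments: Sequence[str], size: int, seed: int = 0) -> List[str]:
    """Draw a random sample of MT segments for manual grading."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(list(segments), min(size, len(segments)))


def average_grades(grades: List[Dict[str, int]]) -> Dict[str, float]:
    """Average the 1-5 grades per category over all graded segments."""
    if not grades:
        raise ValueError("no grades given")
    for graded in grades:
        for category in CATEGORIES:
            if not 1 <= graded[category] <= 5:
                raise ValueError("grades must be between 1 and 5")
    return {
        category: sum(graded[category] for graded in grades) / len(grades)
        for category in CATEGORIES
    }
```

The per-category averages can then be compared with the client's stated expectations to decide between rapid and full PE.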
TYPES OF POST-EDITING How much post-editing should I do?
Different levels of post-editing
1. No post-editing
   - directly published on the internet, with disclaimer
2. Rapid post-editing
   - suitable for short-lived documents needed for gisting & internal use
   - min editing, shortest time possible, min no. of changes, to remove blatant & significant errors, no stylistic changes
3. Full post-editing
   - leading to human quality, required for texts for publication
   - max editing, all errors and stylistic changes taken into account (but still in less time than translating from scratch)
Criteria:
- the MT system and language pair used
- the domain and structure of the text
- the use of the final text, the desired quality and the type of readers
- the volume of translation and the time available
POST-EDITING GUIDELINES What exactly should I correct and how much?
General guidelines for PE
- Language- and project-specific guidelines needed for each project!
- as short and precise as possible:
  - a description of the MT system and the source text used
  - a description of the quality of MT output and the expected quality of the finished translation
  - scenarios when to discard a useless segment
  - typical types of errors that need to be corrected
  - changes to be avoided
  - terminology issues
Guidelines for rapid PE
1. Read the source segment first
2. Read the MT suggestion
3. Make the necessary changes:
   - Make sure the content of the sentence is accurate
   - If the terminology is incorrect, don’t spend too much time researching
   - Don’t post-edit word order if the sentence can be understood as is
   - Don’t change style
   - Don’t replace words with a synonym
   - Don’t correct grammar mistakes unless the target sentence doesn’t reflect the meaning of the source sentence
Guidelines for full PE
- Always very project-specific
- Use the MT suggestion if:
  - a large piece of the sentence is correct
  - the raw MT quality is very good with only minor corrections needed
  - the raw MT quality is not so good but it would still be faster to correct than to translate from scratch
  - the MT has the correct meaning and is completely understandable
- Don’t use the MT suggestion if:
  - the raw MT doesn’t make any sense and it would take longer to correct it than to translate from scratch
  - you need more than a few seconds to understand it
  - there are errors that would require rearranging most of the text
Examples from the guidelines at Microsoft
- The 5-10 second evaluation:
  - the maximum time you should spend evaluating the validity of the MT suggestion
  - if it is hard to understand already at the beginning, don’t even read the whole sentence, just proceed to translate from scratch instead
- The High-5 & Low-5 rule: when you detect a long sentence, do the following:
  - Read the first 5 words. If they are good, read on until it’s bad, then stop, copy the correct part, continue to translate and forget about reading on.
  - If the first 5 or 6 words aren’t good, skip to the last 5 or 6 words. If the last part of the sentence is correct, use it; otherwise start the whole thing from scratch.
  - If both the first 5 and the last 5 words are incorrect, do not carry on reading through the middle to try to identify correct MT segments. Just discard the MT suggestion and proceed to translate from scratch.
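The High-5 & Low-5 rule is a decision procedure, so it can be written down as a short sketch. Whether a stretch of MT output is "good" is a human judgement; here it is represented by a hypothetical `looks_good` callback, and the function name, return labels and fixed window of 5 words are assumptions for illustration only.

```python
from typing import Callable, List, Tuple


def high5_low5(
    mt_words: List[str],
    looks_good: Callable[[List[str]], bool],  # stands in for the post-editor's judgement
    window: int = 5,
) -> Tuple[str, List[str]]:
    """Return a decision label and the MT words worth keeping."""
    head = mt_words[:window]
    if head and looks_good(head):
        # Read on until the output turns bad, then keep the good prefix.
        end = len(head)
        while end < len(mt_words) and looks_good(mt_words[: end + 1]):
            end += 1
        return "keep_start", mt_words[:end]

    tail = mt_words[-window:]
    if tail and looks_good(tail):
        # The start is bad but the end is usable.
        return "keep_end", tail

    # Both ends are bad: do not search the middle, discard the suggestion.
    return "from_scratch", []
```

For example, with a toy judgement that rejects any stretch containing the token `BAD`, the sentence `a b c d e f BAD g h` yields `keep_start` with the six words before `BAD`.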
POST-EDITING EFFORT AND PRODUCTIVITY How hard will post-editing be and how much will I gain?
Post-editing effort
Key element to decide if the use of MT is worthwhile or not (Krings, 2001):
- Temporal PE effort
  - Does PEMT save time vs. human translation?
  - Does PEMT save time vs. TM fuzzy matches?
  - Depends on the quality of the raw MT output and the type of errors!
- Cognitive PE effort
  - How complex and cognitively demanding are the corrections?
  - Obvious mistakes (gender) vs. ambiguous complex syntactic structures
- Technical PE effort
  - Does PE require deleting, inserting, reordering or all 3?
- Measuring PE effort:
  - temporal: the easiest to measure
  - cognitive & technical PE: eye-trackers, Translog, Think Aloud Protocols (useful in research, less so in the commercial world)
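As a rough illustration of technical PE effort, the word-level operations between the raw MT output and the post-edited text can be counted with Python's standard `difflib`. This is a simplified sketch, not one of the measurement tools named above: it works on whitespace-separated words and does not detect reordering, which shows up as a deletion plus an insertion.

```python
from difflib import SequenceMatcher
from typing import Dict


def technical_effort(raw_mt: str, post_edited: str) -> Dict[str, int]:
    """Count words inserted, deleted and replaced during post-editing."""
    mt_words = raw_mt.split()
    pe_words = post_edited.split()
    counts = {"insert": 0, "delete": 0, "replace": 0}

    matcher = SequenceMatcher(a=mt_words, b=pe_words, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "insert":
            counts["insert"] += j2 - j1
        elif tag == "delete":
            counts["delete"] += i2 - i1
        elif tag == "replace":
            # Count the longer side so uneven replacements are not undercounted.
            counts["replace"] += max(i2 - i1, j2 - j1)
    return counts
```

Dividing the total number of operations by the length of the post-edited text gives a crude edit rate that can be compared across segments.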
Post-editing productivity
- One of the big unknown factors in PEMT projects
  - new field, so no standard metrics exist
  - productivity in PE estimated at 4,000-10,000 words/day
- many variables to consider:
  - the quality of raw MT output?
  - the productivity of translators in general?
  - the experience of post-editors?
  - the amount of effort to post-edit fuzzy matches?
- inconclusive results:
  - early studies: show productivity gains up to 3 times compared to HT (Vasconcellos and Leon 1985)
  - recent studies: productivity gain not always achieved (O’Brien 2006, Guerberof 2008)
  - commercial users: many claim high productivity gains but don’t make their methodology available
- Test before you commit!
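A pilot test of the kind recommended above boils down to simple arithmetic: compare throughput when translating from scratch (HT) with throughput when post-editing. The sketch below assumes you have timed both on comparable texts; the function names and the figures in the usage note are hypothetical, not results from any study.

```python
def words_per_hour(words: int, hours: float) -> float:
    """Throughput for one timed task."""
    if words <= 0 or hours <= 0:
        raise ValueError("words and hours must be positive")
    return words / hours


def productivity_gain(
    ht_words: int, ht_hours: float, pe_words: int, pe_hours: float
) -> float:
    """Ratio of PE throughput to HT throughput (above 1.0 means PE is faster)."""
    return words_per_hour(pe_words, pe_hours) / words_per_hour(ht_words, ht_hours)
```

For instance, 2,000 words translated from scratch in 8 hours against 5,000 words post-edited in 8 hours gives a gain of 2.5, while a value below 1.0 means post-editing was slower than translating.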