Post-Editing – Professional translation service redefined

Post-Editing – Professional translation service redefined. Darja Fišer, University of Ljubljana. MT@Work, 5 December 2014, Brussels, Belgium

Presentation Outline: Basic concepts about post-editing; Quality in translation projects; Types of post-editing; Post-editing guidelines.

Motivation: Why should I care about post-editing?

Why PEMT (post-editing of machine translation)? Increasing demand for PEMT in the market: an increasing volume of short-lived documents; different levels of text quality are acceptable. The industry perspective on MT: to lower productivity prices; to publish more content; to publish in more languages; to publish in less time. The TAUS (2010) survey: 52% of companies in the US, EU & Asia provide PE services regularly; 74% of the resources they use are freelance translators.

The big picture: How does MT affect the translation process and the translator?

Integrating MT in the translation process. Phase 1: Translation Memories: the source text (0% translated) is run through the Translation Memory (TM), giving a hybrid text (x% translated). Phase 2: Machine Translation (MT): the un-translated segments are machine-translated, giving a hybrid text that is 100% translated but with MT errors. Phase 3: Post-editing: the post-editor produces the target text (100% translated).
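
The three phases can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `translate_pipeline`, the dictionary-based TM, the `mt` callable and the 0.75 fuzzy-match threshold are all hypothetical, and character-level similarity from the standard library stands in for a real CAT tool's fuzzy matching.

```python
from difflib import SequenceMatcher


def translate_pipeline(segments, tm, mt, fuzzy_threshold=0.75):
    """Return (source, draft, origin) tuples that are handed to the post-editor.

    segments: list of source-language strings
    tm: dict mapping stored source segments to their translations (hypothetical TM)
    mt: callable taking a source string and returning raw MT output (hypothetical)
    """
    results = []
    for src in segments:
        # Phase 1: look for the most similar segment in the translation memory.
        best_src, best_score = None, 0.0
        for tm_src in tm:
            score = SequenceMatcher(None, src, tm_src).ratio()
            if score > best_score:
                best_src, best_score = tm_src, score

        if best_score == 1.0:
            results.append((src, tm[best_src], "TM exact"))
        elif best_score >= fuzzy_threshold:
            results.append((src, tm[best_src], "TM fuzzy"))
        else:
            # Phase 2: no usable TM match, so the segment goes to MT.
            results.append((src, mt(src), "MT"))

    # Phase 3 is human work: every draft, whatever its origin, is post-edited.
    return results
```

Keeping the origin label next to each draft lets the post-editor see whether a suggestion came from the TM or from MT.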

The role of translators in PEMT. The role of PEMT experts: edit the output; select an adequate corpus; clean up the data so the output is more suitable for the customer; provide constant feedback to improve the system’s performance; the role changes as MT improves. The nature of PEMT projects: large volumes of highly repetitive, short-lived content for internal use; pre-editing (at the SL level, before MT, to avoid ambiguous input to the MT system); post-editing (at the TL level, after MT, to correct errors in the MT output).

Basic concepts in PEMT: Is post-editing a translation task or a revision task?

Post-editing vs. Translation. Post-editing: reviewing an MT text against an original text and correcting any errors in order to comply with a set of quality criteria in as few edits as possible; the set of quality criteria ≠ a personal idea of translation quality; as few edits as possible > to increase productivity. PE vs. Translation: Translation: 1 source; PE: 2 sources (the original & the raw MT output): reject the MT output & translate from scratch (PE closer to Trans than Rev); correct many or a few errors (PE closer to Rev than Trans); accept the proposed translation as is (PE closer to Rev than Trans). PE should be done by a translator, not a monolingual reviewer!

Post-editing vs. Revision. PE: deals with recurring, predictable errors; MT texts put a strain on the post-editing expert, so PE is more cognitively demanding. Revision: checks for random mistranslations or omissions; human errors are more difficult to spot but the texts are easier to read. PE & Revision both require specific skills and should be tackled by translators trained & experienced in the task! > 100,000 words / 1 month of full-time post-editing.

The job profile: What skills and qualities do I need to be a good post-editing expert?

Skills for post-editing (O’Brien 2002): degree in translation or related subjects; expert in the subject area and target language; proficient in the source language and contrastive issues; experience in technical translation/localization; advanced word processing skills, full key proficiency (search & replace); positive, tolerant and open-minded towards MT; confidence in abilities and technical expertise; recognition of typical and repetitive MT errors; ability to use macros and coded dictionaries; advanced terminology management skills; background MT knowledge, types of PE and levels of expected quality; pre-editing skills (controlled language & controlled authoring tools); programming skills for automatically correcting errors.

MT quality: What can I expect from MT and what can clients expect from PEMT?

Common MT errors. What MT errors to expect: they depend on the MT system, the content and the language pair used! Error analysis is time-consuming but: crucial to improve the MT system; crucial to raise awareness about the post-editing task. Several error classifications exist (Schäffer, 2003): lexical errors (general vocabulary, terminology, polysemy, idioms); syntactic errors (sentence analysis, word order); grammatical mistakes (tense, number, gender, case, punctuation); errors due to defective input (mistakes in the source language).

Quality in technical translation and localization. Functionalist’s approach to quality: the focus is on the customer’s needs and what they pay for; quality is variable and is defined by clients, not society in general; fit-for-purpose! (not what trained translators would consider the best). Quality of MT: standard MT evaluation measures (BLEU, Meteor, NIST, TER) express how close the output is to human quality with a single number; not very reliable in most translation projects; manual quality assessment needed! crucial for productivity savings & pricing; random strings checked for grammar, terminology and format (grades 1-5); very specific client quality expectations needed! (rapid/full PE). Quality of PE: MT is used to save costs, so revision of PE texts is usually not done; crucial to strike a balance between speed and the quality of PE.
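
To make the "single number" idea concrete, here is a minimal sketch of an edit-rate style score. It is a simplification and not the official TER: real TER also counts block shifts and is computed by dedicated tools. The function name `word_edit_rate` and the whitespace tokenisation are assumptions made for illustration.

```python
def word_edit_rate(hypothesis, reference):
    """Word-level Levenshtein distance divided by the reference length.

    0.0 means the MT output equals the human reference; higher is worse.
    Unlike real TER, block shifts are not modelled (illustrative sketch only).
    """
    hyp, ref = hypothesis.split(), reference.split()

    # prev[j] holds the distance between the hypothesis words seen so far
    # (none at the start) and the first j reference words.
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr

    # Guard against an empty reference to avoid division by zero.
    return prev[-1] / max(len(ref), 1)
```

A score like this says how far the raw output is from one reference, which is part of why the slide calls such measures unreliable for real projects: a perfectly acceptable translation that is worded differently from the reference is still penalised.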

Types of post-editing: How much post-editing should I do?

Different levels of post-editing. No post-editing: directly published on the internet, with a disclaimer. Rapid post-editing: suitable for short-lived documents needed for gisting & internal use; min editing, shortest time possible, min no. of changes, to remove blatant & significant errors, no stylistic changes. Full post-editing: leading to human quality, required for texts for publication; max editing, all errors and stylistic changes taken into account (but still in less time than translating from scratch). Criteria: the MT system and language pair used; the domain and structure of the text; the use of the final text, the desired quality and the type of readers; the volume of translation and the time available.

Post-editing guidelines: What exactly should I correct and how much?

General guidelines for PE. Language- and project-specific guidelines needed for each project! As short and precise as possible: a description of the MT system and the source text used; a description of the quality of MT output and the expected quality of the finished translation; scenarios for when to discard a useless segment; typical types of errors that need to be corrected; changes to be avoided; terminology issues.

Guidelines for rapid PE. Read the source segment first. Read the MT suggestion. Make the necessary changes: make sure the content of the sentence is accurate; if the terminology is incorrect, don’t spend too much time researching; don’t post-edit word order if the sentence can be understood as is; don’t change style; don’t replace words with a synonym; don’t correct grammar mistakes unless the target sentence doesn’t reflect the meaning of the source sentence.

Guidelines for full PE. Always very project-specific. Use the MT suggestion if: a large piece of the sentence is correct; the raw MT quality is very good with only minor corrections needed; the raw MT quality is not so good but it would still be faster to correct it than to translate from scratch; the MT has the correct meaning and is completely understandable. Don’t use the MT suggestion if: the raw MT doesn’t make any sense and it would take longer to correct it than to translate from scratch; you need more than a few seconds to understand it; there are errors that would require rearranging most of the text.

Examples from the guidelines at Microsoft. The 5-10 second evaluation: the maximum time you should spend evaluating the validity of the MT suggestion; if it is hard to understand already at the beginning, don’t even read the whole sentence, just proceed to translate from scratch instead. The High-5 & Low-5 rule: when you detect a long sentence, do the following: Read the first 5 words. If it’s good, read on until it’s bad, then stop and copy the correct part and continue to translate and forget about reading on. If the first 5 or 6 words aren’t good, skip to read the last 5 or 6 words. If the last part of the sentence is correct, use it; otherwise just start the whole thing from scratch. If both the first 5 and the last 5 words are incorrect, do not carry on reading through the middle to try to identify correct MT segments. Just discard the MT suggestion and proceed to translate from scratch.
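
The High-5 & Low-5 rule is essentially a small decision procedure, which can be written out as below. This is a sketch, not Microsoft's tooling: the function name `high5_low5` and the input format (one boolean per MT word, standing for the post-editor's own judgement of that word) are hypothetical.

```python
def high5_low5(word_ok, window=5):
    """Apply the High-5 & Low-5 rule to one MT suggestion.

    word_ok: list of booleans, one per MT word, True if the post-editor
             judges the word usable (a human judgement, assumed as input).
    Returns (decision, start, end), where MT words [start:end] are kept.
    """
    n = len(word_ok)

    if n and all(word_ok[:window]):
        # Good start: read on until the first bad word, then stop reading.
        end = min(window, n)
        while end < n and word_ok[end]:
            end += 1
        return ("keep_start", 0, end)

    if n and all(word_ok[-window:]):
        # Bad start but good ending: reuse only the last words.
        return ("keep_end", max(n - window, 0), n)

    # Both ends are bad: do not search the middle, translate from scratch.
    return ("discard", 0, 0)
```

The point of the rule is visible in the code: the middle of the sentence is never inspected on its own, which keeps the evaluation within a few seconds.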

Post-editing effort and productivity: How hard will post-editing be and how much will I gain?

Post-editing effort. Key element to decide if the use of MT is worthwhile or not (Krings, 2001). Temporal PE effort: Does PEMT save time vs. human translation? Does PEMT save time vs. TM fuzzy matches? Depends on the quality of the raw MT output and the type of errors! Cognitive PE effort: How complex and cognitively demanding are the corrections? Obvious mistakes (gender) vs. ambiguous complex syntactic structures. Technical PE effort: Does PE require deleting, inserting, reordering or all 3? Measuring PE effort: temporal: the easiest to measure; cognitive & technical PE: eye-trackers, Translog, Think Aloud Protocols (useful in research, less so in the commercial world).
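
Outside the lab, a rough proxy for technical PE effort can be obtained by comparing the raw MT output with the post-edited text. The sketch below uses Python's standard `difflib`; the function name `technical_effort` and the whitespace tokenisation are assumptions, and the counts describe the final difference between the two texts, not the keystrokes the post-editor actually made.

```python
from difflib import SequenceMatcher


def technical_effort(raw_mt, post_edited):
    """Count word-level deletions, insertions and replacements.

    difflib does not detect moves, so a reordered word shows up as
    one deletion plus one insertion.
    """
    raw, final = raw_mt.split(), post_edited.split()
    counts = {"delete": 0, "insert": 0, "replace": 0}

    matcher = SequenceMatcher(None, raw, final, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "delete":
            counts["delete"] += i2 - i1
        elif tag == "insert":
            counts["insert"] += j2 - j1
        elif tag == "replace":
            # A replaced span may differ in length on the two sides.
            counts["replace"] += max(i2 - i1, j2 - j1)
        # "equal" spans needed no editing and are not counted.

    return counts
```

For example, `technical_effort("he go to school", "he goes to the school")` reports one replacement and one insertion.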

Post-editing productivity. One of the big unknown factors in PEMT projects: new field, so no standard metrics exist; productivity in PE estimated at 4,000-10,000 words/day. Many variables to consider: the quality of raw MT output? the productivity of translators in general? the experience of post-editors? the amount of effort to post-edit fuzzy matches? Inconclusive results: early studies show productivity gains of up to 3 times compared to HT (Vasconcellos and Leon 1985); recent studies: productivity gain not always achieved (O’Brien 2006, Guerberof 2008); commercial users: many claim high productivity gains but don’t make their methodology available. Test before you commit!
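
The width of the 4,000-10,000 words/day estimate matters when planning a project, as a quick calculation shows. The helper name `days_needed` and the 100,000-word project size are only illustrative assumptions.

```python
import math


def days_needed(words, words_per_day):
    """Working days needed for a volume at a given daily throughput, rounded up."""
    return math.ceil(words / words_per_day)


# Hypothetical 100,000-word project at both ends of the estimated range.
slow = days_needed(100_000, 4_000)    # 25 working days
fast = days_needed(100_000, 10_000)   # 10 working days
```

The same job can take 10 or 25 working days depending on where in the range a given post-editor and MT system fall, which is one practical reason to test before committing to a deadline or a price.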

Let’s PEMT! PEMT is here to stay.

Learn & Teach PEMT. Ride the wave!