Stephen Doherty, CNGL/SALIS



1 Stephen Doherty, CNGL/SALIS stephen.doherty2@mail.dcu.ie

2  Past Research  Readability & Comprehensibility  Controlled Language  Research Proposal (Methodology)  Evaluation (Eye Tracking)  Conclusion

3  Translating Versus Post-Editing: A Segmentation Comparison Based on Pauses (B.A. Dissertation)  Think-Aloud Protocols in Translation Studies (Interessen der kognitiv orientierten Translationswissenschaft [Interests of Cognitively Oriented Translation Studies])

4  CNGL Work Package: ILT1.8 Controlled Language  Supervisors: Dr. Sharon O’Brien, Dr. Dorothy Kenny  “adapt the systems developed by other ILT WPs to deal with in-house data which conforms to both source and target controlled language guidelines”

5  What is readability?  (Gray 1935: “In the reader, those features affecting readability are 1. prior knowledge, 2. reading skill, 3. interest, and 4. motivation. In the text, those features are 1. content, 2. style, 3. design, and 4. structure”.)  What is comprehensibility?

6  Metrics: (Reading scores, recall tests...)  E.g. Flesch Reading Ease  Gunning-Fog Index  SMOG (Simple Measure of Gobbledygook) (McLaughlin 1969)
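The Flesch Reading Ease score named on this slide is computed as 206.835 − 1.015 × (words / sentences) − 84.6 × (syllables / words); higher scores indicate easier text. The following is a minimal Python sketch of that formula. The function names are illustrative assumptions, and the syllable counter is a rough vowel-group heuristic rather than a dictionary lookup, so scores will differ slightly from those of established tools.

```python
import re


def count_syllables(word):
    """Rough heuristic (an assumption): one syllable per run of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def text_counts(text):
    """Return (sentences, words, syllables) using naive regex tokenisation."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(text):
    """Flesch Reading Ease: 206.835 - 1.015 * (words/sentences) - 84.6 * (syllables/words)."""
    n_sent, n_words, n_syll = text_counts(text)
    if n_sent == 0 or n_words == 0:
        raise ValueError("text must contain at least one sentence and one word")
    return 206.835 - 1.015 * (n_words / n_sent) - 84.6 * (n_syll / n_words)
```

Calling `flesch_reading_ease("The cat sat on the mat.")` gives a high score because the sentence is short and every word is monosyllabic; a sentence of long, polysyllabic words scores much lower.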

7  What is controlled language? “an explicitly defined restriction of a natural language that specifies constraints on lexicon, grammar, and style” (Huijsen, 1998)

8  Types of CL:  Human-Orientated Controlled Language (HOCL): readability & comprehensibility, e.g. AECMA Simplified English  Machine-Orientated Controlled Language (MOCL): improved translatability, MT system specific (Huijsen, 1998)

9  Examples of CLs: AECMA Simplified English, Sun Microsystems’ Controlled English, IBM Easy English, Caterpillar Technical English, GM...  Usage (mostly English, but…)  Symantec (CNGL Industry Partner)

10  Roturier (2006):  Consistent spelling (54)  Do not use pronouns that have no specific referent (19)  Avoid unusual punctuation (35)  Avoid embedded clauses introduced by commas or dashes (41)  Do not use more than 25 words per sentence (5)  Use a question mark only at the end of a direct question (48)
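Two of the rules listed on this slide are simple enough to check mechanically. The sketch below is a hypothetical Python checker for the 25-word sentence limit and for question marks that appear before the end of a sentence. It is not Roturier's implementation or any commercial CL checker; the function names and the naive sentence splitter are assumptions made for illustration.

```python
import re

MAX_WORDS = 25  # limit taken from the rule "Do not use more than 25 words per sentence"


def split_sentences(text):
    """Naive splitter (an assumption): break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def check_sentence(sentence):
    """Return a list of rule violations found in a single sentence."""
    violations = []
    n_words = len(sentence.split())
    if n_words > MAX_WORDS:
        violations.append(f"sentence has {n_words} words (limit {MAX_WORDS})")
    # A question mark anywhere except the final character breaks the
    # "only at the end of a direct question" rule.
    if "?" in sentence[:-1]:
        violations.append("question mark used before the end of the sentence")
    return violations


def check_text(text):
    """Pair every sentence in the text with its list of violations."""
    return [(s, check_sentence(s)) for s in split_sentences(text)]


if __name__ == "__main__":
    sample = "Click the button. Open the file (why?) and save it."
    for sentence, problems in check_text(sample):
        print(sentence, "->", problems or "OK")
```

Rules such as "do not use pronouns that have no specific referent" need linguistic analysis (for example part-of-speech tagging or coreference resolution) and are deliberately left out of this sketch.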

11  O’Brien (2003), three types of rule categories:  Lexical (e.g. rules that allow or rule out the use of specific acronyms or abbreviations)  Syntactic (e.g. specifying when and where past participles can be used and avoiding the present participle)  Textual:  Text Structure (e.g. specifying admissible sentence length)  Pragmatic (e.g. using certain verb forms for specific text purposes, such as the imperative for instructions)

12 A comparative investigation of the readability and comprehensibility of SMT and RBMT output for controlled and uncontrolled input


16  Both automatic and human evaluation (focus)  Automatic evaluation (BLEU…)  Human evaluation: eye tracking & retrospective protocols (recall tests & interviews)
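Assuming the automatic metric abbreviated on this slide is BLEU, the sketch below shows the core idea in Python: clipped (modified) n-gram precision against a reference translation, combined by geometric mean and scaled by a brevity penalty. It is a simplified single-reference, sentence-level version with no smoothing. The function names and the default of `max_n=2` are assumptions for illustration; standard BLEU uses n-grams up to 4 and is computed over a whole corpus.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Share of candidate n-grams found in the reference, with counts clipped."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return clipped / total


def bleu(candidate, reference, max_n=2):
    """Simplified sentence-level BLEU with one reference and no smoothing."""
    precisions = [modified_precision(candidate, reference, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # any zero precision makes the geometric mean zero
    log_avg = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_avg)


if __name__ == "__main__":
    ref = "the cat sat on the mat".split()
    hyp = "the cat sat on mat".split()
    print(round(bleu(hyp, ref), 3))
```

Scores of this kind measure n-gram overlap with a reference only, which is one reason to pair them with human evaluation of readability and comprehensibility.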

17  Eye Tracking:  What is it exactly? (background)  Successful application in this research area  Tobii Eye Tracker & ClearView software  Additional video recording, keystroke & mouse logging

18 Tobii 1750 Eye Tracker (www.tobii.se)

19  Recall tests (comprehensibility)  Retrospective interviews (generation of additional data & resolving possible issues)


