Carlos S. C. Teixeira Intercultural Studies Group Universitat Rovira i Virgili (Tarragona, Spain) Knowledge of provenance.


1 Carlos S. C. Teixeira Intercultural Studies Group Universitat Rovira i Virgili (Tarragona, Spain) Knowledge of provenance and its effects on translation performance (in an integrated TM/MT environment) NLPCS th International Workshop on Natural Language Processing and Cognitive Science Special Issue: Human-Machine Interaction in Translation August, Copenhagen, Denmark

2 Carlos S. C. Teixeira © 2011 Universitat Rovira i Virgili

3

4
• Speed: Will you translate faster?
• Effort: Will you feel more tired?
• Quality: Will you translate better?
Reason: Does provenance play a role?

5
• Speed: V is faster than B
  H1: The translation speed is higher in V than in B
• Effort: V requires less editing than B
  H2: The amount of editing is smaller in V than in B
• Quality: V and B produce similar quality
  H4: There is no significant difference in quality between V and B

6 [Diagram. Labels: English text, Spanish text, Translation Memory (Alignment); Source text 1, Source text 2; TM 1, TM 2. Segment types: exact matches, 90-99% fuzzy, 80-89% fuzzy, 70-79% fuzzy, no matches (MT)]
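The segment types named on this slide can be read as bands over a TM similarity score. The sketch below is only an illustration: the function name `match_band` is hypothetical, and treating every score below 70 (or a missing score) as a "no match" segment fed by MT is an assumption, since the slide does not state the lower threshold.

```python
from typing import Optional


def match_band(score: Optional[int]) -> str:
    """Map a TM similarity score (0-100) to the bands named on the slide.

    Assumption: a missing score or a score below 70 counts as
    "No match (MT)"; the slide does not state the cut-off.
    """
    if score is None:
        return "No match (MT)"
    if not 0 <= score <= 100:
        raise ValueError("similarity score must be between 0 and 100")
    if score == 100:
        return "Exact match"
    if score >= 90:
        return "90-99% fuzzy"
    if score >= 80:
        return "80-89% fuzzy"
    if score >= 70:
        return "70-79% fuzzy"
    return "No match (MT)"


if __name__ == "__main__":
    for s in (100, 95, 86, 74, None):
        print(s, "->", match_band(s))
```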

7
◦ Same type of text
◦ Same types of matches
◦ Same machine-translation engine
(ecological validity)
So what is different?
• Provenance information

8

9
• BBFlashBack
  ◦ Screen activity
  ◦ Keystrokes
  ◦ Mouse movements and clicks
  ◦ Translator’s face
  ◦ Sound (voices, keyboard, etc.)
• Retrospective interviews
• Quality assessment

10 Data treatment
Columns: segment, match type | 1st RENDERING (start, end, duration) | TYPING | NOTES | 2nd RENDERING (start, end, duration) | TYPING | NOTES
Indented rows are further passes on the same segment.

1 FUZZY 75%  00:00,00 00:40,33 40, :38,44 18:43,89 05,45 0 00,00
2 FUZZY 86%  00:40,56 01:37,78 57, :44,56 18:46,22 01,66 0 00,00
3 NO MATCH  02:30,33 04:31,67 121,34 17 Asks a question to researcher  18:46,89 19:19,33 32, ,00
4 NO MATCH  04:31,67 04:38,22 06,55  19:20,00 19:28,22 08,22 0
    05:05,78 05:09,56 03,78 00,00
    06:28,56 06:51,44 22,88 00,00
    11:51,33 13:06,33 75, ,00
    14:04,33 14:35,11 30,78 34
5 NO MATCH  14:35,67 15:57,22 81, :28,78 19:43,67 14,89 8 00,00
6 NO MATCH  15:57,89 17:17,89 80, :44,33 19:59,89 15,56 0 00,00
7 FUZZY 87%  17:18,56 19:14,44 115, :00,44 20:16,56 16,12 5 00,00
8 EXACT  19:14,44 20:49,11 94, :17,22 20:34,33 17,11 9
    22:47,22 23:03,78 16,56 1 00,00
9 NO MATCH  23:04,33 23:24,44 20,11 -  20:35,00 20:59,67 42, :08,44 26:52,00 43, ,00
    27:53,56 28:15,22 21, ,00
10 FUZZY 95%  28:16,00 30:35,11 139, :00,44 21:11,67 11,23 0
    31:03,56 31:39,78 36,22 0 00,00
    31:51,33 32:48,67 57, ,00
    33:23,67 34:28,00 64, ,00
11 FUZZY 99%  34:28,56 34:51,67 23, :12,22 21:13,33 01,11 0 00,00
12 FUZZY 74%  34:52,33 35:14,56 22,23 0  21:14,11 21:25,11 11,00 0
    35:41,11 37:17,89 96, ,00
    37:46,44 38:04,33 17,89 2 00,00
    43:11,56 43:19,56 08,00 0 00,00
    55:10,67 55:51,89 41,
EXACT  55:52,44 57:12,22 79, :25,78 21:39,56 13,78 0 00,00
14 EXACT  57:12,78 57:23,22 10,44 3 Researcher interrupts subject to tell he has to leave the room for a while  21:40,00 21:44,33 04,33 0
    57:35,89 58:25,22 49, ,00
    59:10,22 59:31,89 21,67 00,00
15 NO MATCH  59:32,56 00:44,89 72, :44,89 21:58,78 13,89 0 00,00
16 EXACT  00:45,56 01:27,44 41,88  21:59,44 22:04,11 04,67 0
    02:28,78 02:48,22 19,44 00,00
    05:22,44 05:33,56 11,12 00,00
    05:37,78 06:25,33 47,55 00,00
    06:54,00 07:17,89 23,89 00,00
    09:11,78 09:31,00 19,22 00,00
    11:05,67 11:24,22 18, ,00
    12:07,89 12:28,11 20,22 00,00
    13:15,67 13:32,11 16,44 00,00
    13:44,22 14:24,11 39, ,00
    258,20 00,00
17 FUZZY 86%  14:24,67 15:01,44 36, :04,67 22:17,11 12,44 0 00,00
18 NO MATCH  15:02,00 15:14,00 12, :17,67 22:18,89 01,22 0 00,00
19 FUZZY 93%  15:14,67 15:35,22 20, :19,44 23:00,44 41,00 23 Check sound here!
    15:56,00 16:24,00 28, ,00
20 FUZZY 72%  16:24,56 17:11,56 47, :01,11 23:04,33 03,22 0 00,00
21 EXACT  17:12,22 17:47,11 34,89 9  23:04,89 23:14,44 09,55 0 00,00
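The timestamps in this table appear to use a minutes:seconds,hundredths format, with each duration being the end stamp minus the start stamp. The following is a minimal sketch of that arithmetic, not the author's actual procedure. The function names are hypothetical, and the hour wrap-around (an end stamp smaller than its start stamp) is an assumption based on the row that runs from 59:32,56 to 00:44,89. The sample data are the first-rendering passes listed for segment 16.

```python
def to_hundredths(stamp: str) -> int:
    """Convert 'MM:SS,cc' to an integer number of hundredths of a second."""
    minutes, rest = stamp.split(":")
    seconds, hundredths = rest.split(",")
    return (int(minutes) * 60 + int(seconds)) * 100 + int(hundredths)


def pass_hundredths(start: str, end: str) -> int:
    """Length of one pass in hundredths of a second.

    Assumption: the clock shows no hours, so a negative difference
    means the display wrapped past 60 minutes.
    """
    delta = to_hundredths(end) - to_hundredths(start)
    if delta < 0:
        delta += 60 * 60 * 100
    return delta


def pass_duration(start: str, end: str) -> float:
    """Length of one pass in seconds."""
    return pass_hundredths(start, end) / 100


def segment_total(passes) -> float:
    """Total time in seconds over all passes on one segment."""
    return sum(pass_hundredths(s, e) for s, e in passes) / 100


# First-rendering passes on segment 16, copied from the table.
SEGMENT_16 = [
    ("00:45,56", "01:27,44"), ("02:28,78", "02:48,22"),
    ("05:22,44", "05:33,56"), ("05:37,78", "06:25,33"),
    ("06:54,00", "07:17,89"), ("09:11,78", "09:31,00"),
    ("11:05,67", "11:24,22"), ("12:07,89", "12:28,11"),
    ("13:15,67", "13:32,11"), ("13:44,22", "14:24,11"),
]

if __name__ == "__main__":
    print(pass_duration("02:30,33", "04:31,67"))  # segment 3, first rendering
    print(segment_total(SEGMENT_16))
```

Working in integer hundredths avoids floating-point drift when many short passes are summed; the segment 16 total agrees with the 258,20 shown in the table.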

11 Data treatment
Columns: SOURCE WORDS | TIME (sec) 1st rendition | SPEED (words/h) 1st rendition | TIME (sec) Proofreading | SPEED (words/h) Combined | TARGET CHARS | TYPED CHARS 1st rendition | AMOUNT OF EDITING 1st rendition | TYPED CHARS 2nd rend | AMOUNT OF EDITING Combined

TRANSLATION BLIND (Text12)

EXACT (100%) MATCHES
SEGMENT #1  30  111, , ,95%  9  84,87%
SEGMENT #2  30  79, , ,09%  0
SEGMENT #3  25  81, , ,33%  0
SEGMENT #4  18  258,22514, ,49%  0
SEGMENT #5  25  34, , ,77%  0
TOTAL  128  565, , ,13%  9  39,27%

90-99% MATCHES
SEGMENT # , ,52%  0
SEGMENT #2  7  23, , ,56%  0
SEGMENT #3  20  48, ,03%  23  82,35%
TOTAL  65  368, , ,66%  23  49,57%

80-89% MATCHES
SEGMENT #1  27  57, , ,11%  0
SEGMENT #2  24  115, , ,13%  5  66,25%
SEGMENT #3  26  36, , ,19%  0
TOTAL  77  209, , ,70%  5  39,90%

70-79% MATCHES
SEGMENT #1  16  40, , ,19%  0
SEGMENT #2  44  186, ,19%  0
SEGMENT # , ,49%  0
TOTAL  77  273, , ,06%  0

NO MATCHES (MT FEEDS)
SEGMENT #1  31  121, , ,76%  19  16,44%
SEGMENT #2  30  138,997778, ,02%  0
SEGMENT #3  26  81, , ,38%  8  19,61%
SEGMENT # , ,52%  0
SEGMENT #5  15  85, , ,11%  26  109,47%
SEGMENT #6  29  72, , ,57%  0
SEGMENT # , ,33%  0
TOTAL  165  591, , ,62%  53  33,58%
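The SPEED (words/h) and AMOUNT OF EDITING columns can be sketched as simple ratios. Speed as source words per hour follows from the column headings. Defining the amount of editing as typed characters divided by target characters is an assumption, since the slide does not give the formula. Function names are hypothetical, and the character counts in the usage lines are made-up illustration values, not data from the study.

```python
def speed_words_per_hour(source_words: int, seconds: float) -> float:
    """Translation speed: source words processed per hour."""
    if seconds <= 0:
        raise ValueError("time must be positive")
    return source_words / seconds * 3600


def amount_of_editing(typed_chars: int, target_chars: int) -> float:
    """Typed characters as a percentage of the final target length.

    Assumption: this is how the AMOUNT OF EDITING column is derived;
    the slide does not state the formula. Values above 100% are possible
    when the translator types more characters than remain in the target.
    """
    if target_chars <= 0:
        raise ValueError("target length must be positive")
    return typed_chars / target_chars * 100


if __name__ == "__main__":
    # 30 source words in 138.99 seconds (a pairing read from the table).
    print(round(speed_words_per_hour(30, 138.99)))
    # Hypothetical counts: 50 typed characters, 200 target characters.
    print(amount_of_editing(50, 200))
```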

12 Preliminary results
Columns: SOURCE WORDS | TIME (sec) 1st rendition | SPEED (words/h) 1st rendition | TIME (sec) 2nd rendition | SPEED (words/h) Combined | TARGET CHARS | TYPED CHARS 1st rendition | AMOUNT OF EDITING 1st rendition | TYPED CHARS 2nd rend | AMOUNT OF EDITING Combined

COPY  ,892337, ,18%
TRANSL W/O CAT  79  380,89  746, ,78%

VISUAL
EXACT (100%) MATCHES
% MATCHES
% MATCHES
% MATCHES
NO MATCHES (MT FEEDS)

BLIND
EXACT (100%) MATCHES  ,13%  9  39,27%
90-99% MATCHES  ,66%  23  49,57%
80-89% MATCHES  ,70%  5  39,90%
70-79% MATCHES  ,06%  0
NO MATCHES (MT FEEDS)  ,62%  53  33,58%
,71%

13 Preliminary results Subject 1: Translation speed (words/hour)

14 Preliminary results Subject 1: Translation speed (words/hour)

15 Preliminary results Subject 1: Translation speed (words/hour)

16 Preliminary results Subject 2: Translation speed (words/hour)

17 Preliminary results Subject 2: Translation speed (words/hour)

18 Preliminary results Subject 2: Translation speed (words/hour)

19 Preliminary results Quality

20 Preliminary results
Conclusions:
• Testing of the first hypothesis (speed) is inconclusive if we take the whole texts as a reference.
• Subject 1 was slightly faster (5.2 percent) in environment V, while Subject 2 was slightly faster (5.6 percent) in environment B.
• Overall speed depends on the distribution of different types of translation suggestions in the texts (besides individual-specific differences).
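The last conclusion can be illustrated with a small calculation: if overall speed is total words over total time, two texts translated at identical per-type speeds still differ overall when their mix of suggestion types differs. The sketch below uses invented numbers (1800 words/h on exact matches, 600 words/h on MT-fed segments) purely for illustration; none of these figures come from the study, and the function names are hypothetical.

```python
def overall_speed(segments) -> float:
    """Words per hour over a list of (source_words, seconds) pairs."""
    total_words = sum(words for words, _ in segments)
    total_seconds = sum(seconds for _, seconds in segments)
    return total_words / total_seconds * 3600


def percent_faster(speed_a: float, speed_b: float) -> float:
    """How much faster speed_a is than speed_b, in percent."""
    return (speed_a - speed_b) / speed_b * 100


# Hypothetical texts: same per-type speeds, different mix of segment types.
# Exact matches at 1800 words/h, MT-fed segments at 600 words/h.
text_a = [(100, 200.0), (50, 300.0)]   # mostly exact matches
text_b = [(50, 100.0), (100, 600.0)]   # mostly MT-fed segments

if __name__ == "__main__":
    a = overall_speed(text_a)
    b = overall_speed(text_b)
    print(f"text A: {a:.0f} words/h, text B: {b:.0f} words/h")
    print(f"A is {percent_faster(a, b):.1f}% faster than B")
```

Under these invented numbers the gap comes entirely from the segment mix, which is why a whole-text comparison between V and B can be inconclusive even when per-type effects exist.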

21
• Small number of subjects
• Small number of segments
• Irregular segments
• Terminology
• Segment identification
• Experience increases over time
• Subject variability
• Quality assessment?

22 Conclusions
?Generalisations?
• Specific type of text
• Particular subject
• Given fuzzy match grid
• A particular MT engine

23
• Quality assessments
• Retrospective interviews
• Statistical analysis
• MT trust scores?
• Eye-tracking?
• Translog?
• Implications/Applications of findings

24
• O’Brien, Sharon. Eye-tracking and translation memory matches. Perspectives: Studies in Translatology 14, n. 3.
• Guerberof, Ana. Productivity and quality in the post-editing of outputs from translation memories and machine translation. Localisation Focus - The International Journal of Localisation 7, n. 1.
• Christensen, Tina Paulsen & Anne Schjoldager. “Translation-Memory (TM) Research: What Do We Know and How Do We Know It?” Hermes – Journal of Language and Communication Studies.

25 Thank you! Carlos S. C. Teixeira Intercultural Studies Group Universitat Rovira i Virgili (Tarragona, Spain)

