1 Lending a Hand: Sign Language Machine Translation
Sara Morrissey, NCLT Seminar Series, 21st June 2006

2 Overview
Introduction - What, why, how…?
Out with the old… - SL Corpora, The System
…in with the new - *new and improved*
Lost in Translation - Evaluation issues
Conclusion

3 Introduction
Q: WHAT?
A: Sign language, a visually articulated language
Linguistic phenomena prevalent in SLs:
~ Classifiers
~ Non-manual features (NMFs)
~ Discourse mapping and use of signing space

4 Introduction (2)
Q: WHY?
A: a) Improve communication; b) Stretching the application of EBMT
Q: HOW?
A: Our approach
~ Annotated SL corpora
~ Example-based MT employing the Marker Hypothesis (Green, 1979)

5 Introduction (3)
Other approaches:
~ Transfer - Grieve-Smith, 1999; Marshall & Sáfár, 2002; Sáfár & Marshall, 2002; Van Zijl & Barker, 2003
~ Interlingua - Veale et al., 1998; Zhao et al., 2000
~ Multi-path - Huenerfauth, 2004, 2005
~ Statistical - Bauer et al., 1999; Bungeroth & Ney, 2004, 2005, 2006

6 Out with the old… Corpora
SL corpora are difficult to find
ECHO project Nederlandse Gebarentaal (NGT) corpora:
~ 40 minutes of video data
~ 5 Aesop's fables by two signers, plus SL poetry
~ Combined corpus of 561 sentences

7 Out with the old… Annotation
Why annotate?
~ No formal written form for SLs
~ Linguistic description including NMFs
~ Can include a translation, making the corpus bi/trilingual
~ Time information is present for chunking and aligning
ELAN annotation toolkit
~ Graphical user interface displaying videos and annotations simultaneously (Fig. 1)
~ Time-aligned and non-time-aligned annotations, including NMF description, repetition notation and notes on indexing and role
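
ELAN stores these tiers in its .eaf XML format. The snippet below is a minimal sketch of pulling one tier's time-aligned annotations out of such a file with Python's standard library; it assumes the usual EAF layout (a TIME_ORDER block of TIME_SLOT elements and TIER/ALIGNABLE_ANNOTATION elements), and the tier name in the usage comment is illustrative rather than taken from the ECHO corpus.

```python
# Sketch: extract time-aligned annotations for one tier of an ELAN .eaf file.
# Assumes the standard EAF XML layout; not tied to the ECHO corpus files.
import xml.etree.ElementTree as ET

def read_tier(eaf_path, tier_id):
    """Return sorted (start_ms, end_ms, value) triples for one tier."""
    root = ET.parse(eaf_path).getroot()
    # Map time-slot IDs to millisecond values.
    times = {slot.get("TIME_SLOT_ID"): int(slot.get("TIME_VALUE", 0))
             for slot in root.find("TIME_ORDER")}
    triples = []
    for tier in root.iter("TIER"):
        if tier.get("TIER_ID") != tier_id:
            continue
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = times[ann.get("TIME_SLOT_REF1")]
            end = times[ann.get("TIME_SLOT_REF2")]
            value = ann.findtext("ANNOTATION_VALUE", default="")
            triples.append((start, end, value))
    return sorted(triples)

# e.g. read_tier("fable1.eaf", "Gloss RH") -> [(1200, 1850, "MOUSE"), ...]
```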

8 Figure 1. ELAN interface

9 Out with the old… The System
Segmentation using the 'Marker Hypothesis' (MH) (Green, 1979)
~ Analogous to the system of Way & Gough (2003) and Gough & Way (2004a/b)
~ Segments spoken language sentences according to a set of closed class words
~ Chunks start with closed class words and usually encapsulate a concept, or an attribute of a concept, forming concept chunks, e.g. "or with tiny curls" (see the sketch below)
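
As a rough illustration of this kind of marker-based chunking, the sketch below starts a new chunk at every closed-class word. The marker list is a small illustrative sample, not the actual marker set used in the system.

```python
# Sketch of Marker-Hypothesis-style chunking: a new chunk begins at every
# closed-class "marker" word. MARKERS is an illustrative sample only.
MARKERS = {"the", "a", "an", "or", "and", "with", "to", "of", "in", "on",
           "i", "you", "he", "she", "it", "that", "this"}

def marker_chunks(sentence):
    chunks, current = [], []
    for word in sentence.lower().split():
        if word in MARKERS and current:   # a marker word opens a new chunk
            chunks.append(current)
            current = []
        current.append(word)
    if current:
        chunks.append(current)
    return [" ".join(c) for c in chunks]

print(marker_chunks("The mouse ran away with tiny curls on its tail"))
# ['the mouse ran away', 'with tiny curls', 'on its tail']
```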

10 Out with the old… The System (2)
MH not suitable for use with the SL side of the corpus, due to the sparseness of closed class item markers
~ NGT gloss tier segmented based on the time spans of its annotations; remaining annotations with the same time span are grouped with the gloss tier segments, forming concept chunks similar to the English marker chunks (see the sketch below)
~ Despite the different methods, both are successful in forming potentially alignable concept chunks
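
A minimal sketch of the time-span grouping just described: each gloss-tier annotation opens a chunk, and annotations from the other tiers whose time spans overlap it are attached. The tier names and the overlap test are illustrative assumptions; the example values come from the chunk shown on the next slide.

```python
# Sketch: group other-tier annotations under the gloss-tier span they overlap,
# producing concept chunks comparable to the English marker chunks.

def group_by_gloss(gloss_spans, other_tiers):
    """gloss_spans: [(start, end, gloss)]; other_tiers: {tier: [(start, end, value)]}."""
    chunks = []
    for g_start, g_end, gloss in gloss_spans:
        chunk = {"gloss": gloss, "features": []}
        for tier, spans in other_tiers.items():
            for start, end, value in spans:
                if start < g_end and end > g_start:   # time spans overlap
                    chunk["features"].append((tier, value))
        chunks.append(chunk)
    return chunks

gloss = [(1200, 1850, "TINY CURLS")]
others = {"Repetition RH": [(1200, 1850, "u")], "Eye Gaze": [(1300, 1700, "l,d")]}
print(group_by_gloss(gloss, others))
# [{'gloss': 'TINY CURLS', 'features': [('Repetition RH', 'u'), ('Eye Gaze', 'l,d')]}]
```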

11 Out with the old… The System (3)
English chunk: "or with tiny curls"
NGT chunk:
(Gloss RH) TINY CURLS
(Gloss LH) TINY CURLS
(Repetition RH) u
(Repetition LH) u
(Eye Gaze) l,d

12 Out with the old… The System (4)
Translation lookup (see the sketch below):
~ Searches for an exact sentence match in the aligned bilingual corpus
~ Uses MH to segment the input and searches for matching or close-matching chunks in the English side of the aligned corpus
~ Looks up individual words in the bilingual lexicon
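
The three lookup levels can be read as a back-off cascade; the sketch below is one way to write that down, reusing marker_chunks from the segmentation sketch above. The dictionary-based memories and the lexicon entries are illustrative assumptions, not the system's actual data structures.

```python
# Sketch of the back-off lookup: exact sentence match, then marker-chunk
# match, then word-by-word lexicon lookup. Relies on marker_chunks() defined
# in the earlier chunking sketch.

def translate(sentence, sentence_memory, chunk_memory, lexicon):
    """Return a list of target-side glosses for an English sentence."""
    # 1. Exact sentence match in the aligned bilingual corpus.
    if sentence in sentence_memory:
        return sentence_memory[sentence]
    glosses = []
    for chunk in marker_chunks(sentence):
        # 2. Marker-Hypothesis chunks matched against the chunk memory.
        if chunk in chunk_memory:
            glosses.extend(chunk_memory[chunk])
        else:
            # 3. Word-by-word back-off to the bilingual lexicon.
            for word in chunk.split():
                if word in lexicon:
                    glosses.append(lexicon[word])
    return glosses

memory = {}
chunks = {"with tiny curls": ["TINY", "CURLS"]}
lexicon = {"mouse": "MOUSE", "help": "HELP"}
print(translate("the mouse or with tiny curls", memory, chunks, lexicon))
# ['MOUSE', 'TINY', 'CURLS']
```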

13 Out with the old… Experiments
English and Dutch to NGT (Morrissey & Way, 2005)
~ 100 sentences
~ Annotations are subjective, so evaluation is difficult, but promising results
NGT to English and Dutch
~ Traditional MT evaluation metrics can be applied (SER, WER, PER, BLEU)
~ Sparse output and low scores due to the lack of closed class lexical items in NGT
~ Common marker word insertion

14 Out with the old… Experiments (2)
SER: 96%   WER: 119%   PER: 78%   BLEU: 0
Example output and reference translation (used in the sketch below):
Output: mouse promised help
Reference: "you see," said the mouse, "I promised to help you"
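
For reference, a minimal sketch of two of the metrics quoted above: WER as word-level edit distance divided by reference length, and one common formulation of PER based on bag-of-words overlap. Note that WER, unlike the other metrics, can exceed 100% when insertion errors are frequent.

```python
# Sketch of WER (word-level Levenshtein / reference length) and a simple
# bag-of-words PER. BLEU and SER are omitted for brevity.
from collections import Counter

def wer(hyp, ref):
    h, r = hyp.split(), ref.split()
    # Dynamic-programming edit distance over words.
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    return d[len(r)][len(h)] / len(r)

def per(hyp, ref):
    h, r = Counter(hyp.split()), Counter(ref.split())
    matches = sum((h & r).values())
    return (max(sum(h.values()), sum(r.values())) - matches) / sum(r.values())

hyp = "mouse promised help"
ref = "you see said the mouse I promised to help you"
print(round(wer(hyp, ref), 2), round(per(hyp, ref), 2))
# 0.7 0.7 for this sentence pair
```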

15 …in with the new
New corpus
- ~1,400 sentences (SunDial and ATIS corpora)
- Flight information queries
- ISL signed video version
Homespun annotation
- With a view to the end product
New system
- OpenLab

16 Lost in Translation: Evaluation issues
Mainstream evaluation techniques
~ Exact text matching
~ No recognition of synonyms, syntactic structure or semantics
~ No gold standard for SLs
Other possible evaluation metrics (see the sketch below)
~ Number of content words / number of words in the reference translation
~ Evaluation of syntactic or semantic relations
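
One possible reading of the content-word idea above is sketched below: the share of the reference translation's content words that the output covers, with closed-class words filtered out by a stop list. The stop list and the exact ratio are illustrative assumptions, not the metric actually proposed in the talk.

```python
# Sketch of a content-word coverage score: proportion of the reference's
# content words that appear in the output. STOP_WORDS is illustrative only.
STOP_WORDS = {"the", "a", "an", "to", "of", "and", "or", "i", "you", "said"}

def content_word_score(hyp, ref):
    ref_content = [w for w in ref.lower().split() if w not in STOP_WORDS]
    hyp_words = set(hyp.lower().split())
    if not ref_content:
        return 0.0
    return sum(w in hyp_words for w in ref_content) / len(ref_content)

print(content_word_score("mouse promised help",
                         "you see said the mouse I promised to help you"))
# 0.75 -- three of the four reference content words are covered
```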

17 Conclusions
- Basic system
- Corpus problems: a larger corpus, such as the ISL one in creation, gives more scope for matches; annotations are subjective
- EBMT caters for some SL linguistic phenomena
- Evaluation metrics unsuitable for oral to non-oral translation

18 Future Work
- Adding in NMF information
- Manual analysis
- Language model to improve output (see the sketch below)
- Suitable evaluation metrics
- Review other writing systems for SLs
- Avatar…
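
As a rough illustration of how a language model could help the output, the sketch below trains a tiny add-one-smoothed bigram model on target-side sentences and uses it to pick the more fluent of two candidate outputs. The training sentences and candidates are invented for illustration, not drawn from the corpus.

```python
# Sketch: score candidate outputs with an add-one-smoothed bigram language
# model and keep the highest-scoring one.
import math
from collections import Counter

def train_bigrams(sentences):
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        words = ["<s>"] + s.split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams

def score(sentence, unigrams, bigrams, vocab_size):
    words = ["<s>"] + sentence.split()
    logp = 0.0
    for prev, word in zip(words, words[1:]):
        # Add-one smoothing keeps unseen bigrams from zeroing the score.
        logp += math.log((bigrams[(prev, word)] + 1) /
                         (unigrams[prev] + vocab_size))
    return logp

corpus = ["the mouse promised to help", "the lion let the mouse go"]
uni, bi = train_bigrams(corpus)
candidates = ["mouse promised help the to", "the mouse promised to help"]
print(max(candidates, key=lambda c: score(c, uni, bi, len(uni))))
# prints: the mouse promised to help
```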

19 Thank You. Questions?

