Nataliya Nadtoka, James Edge, Philip Jackson, Adrian Hilton. CVSSP (Centre for Vision, Speech & Signal Processing), University of Surrey.

1 Nataliya Nadtoka, James Edge, Philip Jackson, Adrian Hilton. CVSSP (Centre for Vision, Speech & Signal Processing), University of Surrey

2 Motivation
- Non-verbal cues convey additional information
- Existing visual-speech-from-audio methods produce plausible animation of neutral speech, but fail to generate realistic expressive content
- The factors that contribute to emotional speech are vastly understudied
Aim
- Learn the emotional characteristics of speech
- Model the emotional characteristics of speech

3 Overview

4 Dataset
- 4D sequences of geometry and texture (60 fps) with synchronized audio (44,100 Hz), recorded with a 3dMD scanner
- Emotions: Anger, Surprise, Fear, Happiness, Disgust, Sadness
- All sentences are repeated in Neutral to facilitate cross-comparison
- 110 sentences with strong expressive content
- Phonetically balanced
[Figure labels: IR projector, IR stereo cameras, colour camera]
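The two capture rates on this slide imply a fixed number of audio samples per geometry frame. The sketch below works that out and maps a frame index to its audio sample range. It assumes both streams start at the same instant and run at exactly the stated rates. The function names are illustrative and do not come from the presentation.

```python
AUDIO_RATE_HZ = 44100  # audio sampling rate stated on the Dataset slide
VIDEO_FPS = 60         # geometry/texture capture rate stated on the Dataset slide


def samples_per_frame(audio_rate_hz=AUDIO_RATE_HZ, video_fps=VIDEO_FPS):
    """Number of audio samples spanned by one captured frame."""
    return audio_rate_hz / video_fps


def frame_to_sample_range(frame_index, audio_rate_hz=AUDIO_RATE_HZ, video_fps=VIDEO_FPS):
    """Half-open audio sample range [start, end) covered by a frame.

    Assumption: audio and video share a common start time and have no drift.
    Integer arithmetic keeps the ranges contiguous for any rate pair.
    """
    start = frame_index * audio_rate_hz // video_fps
    end = (frame_index + 1) * audio_rate_hz // video_fps
    return start, end


if __name__ == "__main__":
    print(samples_per_frame())        # samples per frame at 44,100 Hz and 60 fps
    print(frame_to_sample_range(60))  # sample range of the frame at t = 1 s
```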

5 Post-processing
- Surface registration uses painted visual markers
- The lip contour is tracked using blue lipstick
- Audio is used to phonetically annotate the data
- Differences in duration are further used for emotion analysis
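One possible way to turn phonetic annotations into durational differences is sketched below. It assumes each utterance is annotated as a list of (label, start, end) segments in seconds and that the neutral and emotional readings share the same label sequence. The segment values are made up for illustration and are not taken from the dataset.

```python
def durations(segments):
    """Return (label, duration in seconds) for each (label, start, end) segment."""
    return [(label, end - start) for label, start, end in segments]


def duration_differences(neutral, emotional):
    """Per-segment duration change: emotional duration minus neutral duration.

    Assumption: both annotations contain the same labels in the same order.
    """
    if [s[0] for s in neutral] != [s[0] for s in emotional]:
        raise ValueError("annotations must share the same label sequence")
    return [
        (label, e_dur - n_dur)
        for (label, n_dur), (_, e_dur) in zip(durations(neutral), durations(emotional))
    ]


if __name__ == "__main__":
    # Hypothetical annotations for the first three phones of one sentence.
    neutral = [("d", 0.00, 0.05), ("ow", 0.05, 0.20), ("n", 0.20, 0.28)]
    angry = [("d", 0.00, 0.04), ("ow", 0.04, 0.30), ("n", 0.30, 0.36)]
    for label, diff in duration_differences(neutral, angry):
        print(f"{label}: {diff:+.3f} s")
```

A positive value means the segment is longer in the emotional reading than in the neutral one.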

6 Durational differences
[Figure: timing of the sentence "Don’t ask me to carry an oily rag like that" spoken in Neutral, Anger, Disgust, Fear, Happiness, Sadness and Surprise; horizontal axis: t (sec), 0 to 1.5]

7 Isolated region analysis
[Figure: isolated region analysis of the sentence "Don’t ask me to carry an oily rag like that" for Neutral, Anger, Disgust, Fear, Happiness, Sadness and Surprise]

8 PCA-based analysis
- The first principal component accounts for 55% of total variance
[Figure: PC 1 plotted against t (sec) for Surprise, Neutral and Happiness; the words "Don’t", "an" and "that" are marked along the time axis]
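The "share of total variance" figure quoted for the first principal component can be computed as in the sketch below. It assumes each frame is a small vector of tracked coordinates and uses plain power iteration on the covariance matrix. The presentation does not say how its PCA was computed, so this is only an illustration, and the function name is hypothetical.

```python
def first_pc_variance_ratio(frames, iterations=200):
    """Return (variance ratio, direction, per-frame scores) for the first PC.

    frames: list of equal-length feature vectors, one per captured frame.
    Assumptions: at least two frames, non-zero total variance, and a start
    vector that is not orthogonal to the leading eigenvector.
    """
    n, d = len(frames), len(frames[0])
    means = [sum(f[j] for f in frames) / n for j in range(d)]
    centered = [[f[j] - means[j] for j in range(d)] for f in frames]
    # Sample covariance matrix of the centered features.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    total_variance = sum(cov[i][i] for i in range(d))
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    v = [1.0 / d ** 0.5] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient gives the variance captured along v.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    scores = [sum(r[j] * v[j] for j in range(d)) for r in centered]
    return eigenvalue / total_variance, v, scores


if __name__ == "__main__":
    # Hypothetical 2-D features for four frames.
    ratio, direction, scores = first_pc_variance_ratio(
        [[1.0, 0.0], [-1.0, 0.0], [0.0, 0.5], [0.0, -0.5]])
    print(f"PC 1 explains {ratio:.0%} of total variance")
```

The returned scores form the per-frame PC 1 trajectory, which is the kind of curve one would plot against time for each emotion.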

9 Emotion Transfer
[Diagram labels]
- Neutral Sentence A: phonetic transcription, emphasis
- Sentence A in Emotion: phonetic transcription, emphasis
- DTW
- Δ = Emotion - Neutral
- Model of Emotion
- Audio of Emotion Sentence B: phonetic transcription, emphasis
- Neutral animation
- Animation of Emotion Sentence B
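The DTW and Δ = Emotion - Neutral steps named in the diagram could look roughly like the sketch below. It assumes a one-dimensional alignment feature per frame (for example an audio-derived value) and a one-dimensional animation feature per frame (for example a PC 1 score). It uses textbook dynamic time warping with an absolute-difference cost. The slide does not specify the features, the cost, or how the emotion model is built from Δ, so all of those are assumptions, and the function names are hypothetical.

```python
def dtw_path(a, b):
    """Classic dynamic time warping between two 1-D sequences.

    Returns (total cost, path), where path is a list of (i, j) index pairs
    aligning a[i] with b[j] from the first frames to the last frames.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end, always moving to the cheapest predecessor.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path


def emotion_delta(path, neutral_anim, emotion_anim):
    """Per-neutral-frame delta: mean of (emotion - neutral) over aligned frames."""
    sums = [0.0] * len(neutral_anim)
    counts = [0] * len(neutral_anim)
    for i, j in path:
        sums[i] += emotion_anim[j] - neutral_anim[i]
        counts[i] += 1
    return [s / c for s, c in zip(sums, counts)]


if __name__ == "__main__":
    # Hypothetical alignment features for sentence A (neutral vs. emotional).
    _, path = dtw_path([0, 1, 2, 3], [0, 1, 1, 2, 3])
    # Hypothetical animation features for the same two recordings.
    print(emotion_delta(path, [0.0, 0.0, 0.0, 0.0], [1.0, 2.0, 4.0, 2.0, 1.0]))
```

Applying the resulting deltas to the neutral animation of a different sentence B would additionally need the emotion model shown in the diagram, which this sketch leaves out.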

10 Conclusions
- This work presents an isolated upper face region analysis for selected sentences
- Promising relation between the principal component features and emotion
- The observed dynamics reflect the non-constant nature of emotion within a sentence
- Future work will focus on expressive features with respect to emotion transfer
