Interpreting Ambiguous Emotional Expressions Speech Analysis and Interpretation Laboratory ACII 2009.


1 Interpreting Ambiguous Emotional Expressions Speech Analysis and Interpretation Laboratory ACII 2009

2 Motivation
Emotion expression is challenging:
– Multi-scale dependencies: time, speaker, context, mood, personality, culture
– Intentional obfuscation: frustration may be suppressed
– Inherent multimodality: contentment is expressed using the face and voice
– Colored by mood, culture, personality, and dialog flow
Problem statement:
– How can arbitrary emotional expressions be evaluated?
– How can interaction-level information be used to inform classification?

3 Operating Emotion Definitions
Prototypical emotions:
– Expressions that are consistently recognized by a set of human evaluators (e.g., rage, glee)
Nonprototypical emotions:
– Expressions that are not consistently recognized by a set of human evaluators
– Potential causes: ambiguous class definitions [frustration, anger], emotional subtlety, multimodal expression [sarcasm], the natural emotional flow of a dialog

4 Emotion and its Complexities
Temporal variability:
– Emotion is manifested and perceived across varying time scales
Additional challenges:
– Individual variability: emotion perception varies at the individual level
– Multi-modality: emotion is expressed using speech, the face, body posture, etc.
– Representation: emotion reporting may be influenced by the representation and method of evaluation

5 Temporal Variability: Multi-scale Representation
Emotion is modulated across different time scales, and there is an inherent interdependency between its manifestations over the varying scales:
– Time units: phoneme, syllable, word, phrase, utterance, turn, subdialog, dialog, ...
– The style of emotion expression is non-constant over these time units
– Segments may be highly prototypical or nonprototypical

6 Temporal Variability: Emotional Profile
Create emotional profiles to:
– Estimate the prototypical ebb and flow of an interaction
– Identify "relevance sections"
An emotional profile:
– Describes the confidence of an emotional label assignment
– Is a soft label representative of the classification output
Benefits:
– Retains emotional information that is lost in a single hard emotion assignment
– Locates emotional tenor changes in a dialog
– Emotional profiles can also serve as features
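The profile idea on this slide can be sketched in a few lines. This is a minimal illustration, not the lab's implementation: the four-emotion label set and the normalization of raw classifier scores into a soft label are assumptions made for the example.

```python
from typing import Dict, List

# Illustrative label set; the slides' examples use these four emotions.
EMOTIONS = ["angry", "happy", "sad", "neutral"]

def emotional_profile(scores: Dict[str, float]) -> Dict[str, float]:
    """Turn raw per-emotion classifier scores into a soft label:
    a normalized confidence for every candidate emotion."""
    total = sum(scores.get(e, 0.0) for e in EMOTIONS)
    if total == 0.0:
        # No evidence at all: fall back to a uniform (uninformative) profile.
        return {e: 1.0 / len(EMOTIONS) for e in EMOTIONS}
    return {e: scores.get(e, 0.0) / total for e in EMOTIONS}

def tenor_changes(profiles: List[Dict[str, float]]) -> List[int]:
    """Indices where the dominant emotion of consecutive segments differs:
    a crude locator for changes in the emotional tenor of a dialog."""
    dominant = [max(p, key=p.get) for p in profiles]
    return [i for i in range(1, len(dominant)) if dominant[i] != dominant[i - 1]]
```

A hard label would keep only `max(p, key=p.get)`; the profile keeps the full distribution, which is what makes it usable both for locating tenor changes and as a feature vector.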

7 Temporal Variability: Interaction Modeling
Proposal: use emotional profiles to develop an emotional interaction framework.
High-level example:
– [Figure: emotional profiles over {angry, happy, sad, neutral} for utterances 1-4 of a dialog whose ground truth is angry]
– First-level classification (majority-vote assignment): angry ???? angry
– There is no evidence to suggest that the emotional content of the dialog is not angry, so the emotional tag of the dialog is assigned to "angry"
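The first-level dialog tag described in this example can be sketched as a majority vote over the clearly labeled utterances. Representing the ambiguous utterances as `None` is an assumption made for the illustration; as in the slide, they contribute no counter-evidence.

```python
from collections import Counter
from typing import List, Optional

def dialog_label(utterance_labels: List[Optional[str]]) -> Optional[str]:
    """First-level dialog tag by majority vote over utterance labels.
    Ambiguous utterances are passed as None: they contribute no evidence,
    so they cannot overturn the clearly labeled utterances."""
    votes = Counter(lab for lab in utterance_labels if lab is not None)
    if not votes:
        return None  # no clear utterance at all
    return votes.most_common(1)[0][0]

# Slide example: some utterances clearly angry, the rest ambiguous.
# Nothing contradicts "angry", so the dialog is tagged angry.
dialog_label(["angry", None, None, "angry"])  # -> "angry"
```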

8 Temporal Variability: Interaction Modeling
Dynamic dyadic interaction modeling at the dialog level:
– Captures influences existing between interlocutors: emotion state changes as a function of the interlocutor's state
– Captures individual-specific temporal characteristics of emotion
– Temporal smoothness: an individual's emotion flow is relatively constant between two overlapping windows
– Captures individual evaluation styles

9 Temporal Variability: An Example of Interaction Modeling
[Figure: a first-order Markov chain for the temporal dynamics of emotion state, showing the influence of Speaker A's state on Speaker B within a turn and the mutual influence of emotion states across turns, between the states during turn t-1 and turn t]
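A toy version of such a first-order model might look like the following. The state set and all probability values are illustrative assumptions, as is the naive product-and-renormalize combination rule; a real model's parameters would be learned from data.

```python
from typing import List

# Illustrative emotion state set.
STATES = ["angry", "happy", "sad", "neutral"]

def _table(self_weight: float) -> List[List[float]]:
    """Build a row-stochastic conditional table that puts self_weight on
    staying in the same state and spreads the rest uniformly."""
    base = (1.0 - self_weight) / (len(STATES) - 1)
    return [[self_weight if i == j else base for j in range(len(STATES))]
            for i in range(len(STATES))]

# temporal[i][j] = P(own state j at turn t | own state i at turn t-1)
# cross[i][j]    = P(own state j within a turn | partner in state i that turn)
temporal = _table(0.7)   # temporal smoothness: states tend to persist
cross = _table(0.55)     # mild entrainment toward the interlocutor

def next_state_dist(prev_self: int, cur_partner: int) -> List[float]:
    """Naive first-order update: multiply the two conditionals and
    renormalize, combining one's own previous state with the
    interlocutor's current state."""
    p = [temporal[prev_self][j] * cross[cur_partner][j]
         for j in range(len(STATES))]
    z = sum(p)
    return [x / z for x in p]
```

For example, if both speakers were angry, persistence and entrainment reinforce each other and the updated distribution concentrates on "angry".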

10 Emotion and its Complexities
Temporal variability:
– Emotion is manifested and perceived across varying time scales
Additional challenges:
– Individual variability: emotion perception varies at the individual level
– Multi-modality: emotion is expressed using speech, the face, body posture, etc.
– Representation: emotion reporting may be influenced by the representation and method of evaluation

11 Additional Challenges: Individual Variability (User Perception)
Emotion perception is colored by:
– The emotion content of an utterance
– The semantic content of an utterance
– Context
– The mood of the evaluator
– The personality of the evaluator
– The fatigue of the evaluator
– The attention of the evaluator

12 Additional Challenges: Individual Variability (Explicit User Models)
Capture evaluation style by creating models that define:
– Perception as a function of mood
– Perception as a function of attention
– Perception as a function of alertness
These models can be used to:
– Estimate the state of the user
– Create "active-learning" environments

13 Additional Challenges: Multi-modality of Emotion Expression
There are inherent limits to uni-modal processing: the audio information alone does not fully capture the emotion content.
– "Prototypical" angry example
– Video examples: subtle anger, hot anger, sarcasm, contentment

14 Additional Challenges: The Effect of Representation
Reported emotion perception is dependent on the evaluation structure:
– Evaluation structure for our data: multi-modal (audio and video), clips viewed in order
Reported emotion perception is dependent on the evaluation methodology:
– Categorical
– Dimensional

15 Conclusions
Goal: develop techniques to interpret emotional expressions independent of their prototypical or nonprototypical nature.
Improve dialog-level classification:
– Consider the dynamics of the acoustic features and the dynamics of the underlying classification
– Classify the emotion within the context of a dialog based on emotionally clear data (vs. ambiguous content)
– This will result in enhanced automated emotional comprehension by machines

16 Open Questions
– How can prototypical emotions be used to understand and interpret nonprototypical emotions?
– Is it important to be able to successfully interpret all utterances of an individual? Should a user's emotion state ever be discarded?
– How can we best make use of limited data?
– How can ambiguous emotional content be interpreted and utilized during human-machine interaction?

17 Questions?

18 Prototypical & Nonprototypical
– Prototypical expressions
– Nonprototypical majority-vote expressions
– Nonprototypical non-majority-vote expressions

19 Data overview: IEMOCAP database
Modalities:
– Audio, video, motion capture
Collection style:
– Dyadic interaction (mixed-gender)
– Scripted and improvisational expressions
– "Natural" emotion elicitation
Size:
– Five pairs (five men, five women)
– 12 hours

20 Data overview: IEMOCAP database
Evaluation:
– Twelve evaluators (overlapping subsets)
– Sequential annotation
– Categorical ratings (3+ per utterance): angry, happy, excited, sad, neutral, frustrated, surprised, disgusted, fearful, other (~25%)
– Dimensional ratings (2 per utterance): valence, activation
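With two dimensional ratings per utterance, one simple way to obtain a single (valence, activation) value is to average the raters. The averaging below is an assumption made for illustration, not necessarily the aggregation scheme used with the database.

```python
from typing import List, Tuple

def aggregate_dimensional(ratings: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Average per-evaluator (valence, activation) ratings for one
    utterance; with two raters this is simply their midpoint."""
    n = len(ratings)
    valence = sum(v for v, _ in ratings) / n
    activation = sum(a for _, a in ratings) / n
    return valence, activation

aggregate_dimensional([(4.0, 3.0), (2.0, 5.0)])  # -> (3.0, 4.0)
```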

21 Data overview: IEMOCAP database
Database-specific definitions:
– Prototypical: complete evaluator agreement
– Nonprototypical majority-vote (NP MV): majority-vote agreement
– Nonprototypical non-majority-vote (NP NMV): expressions without a majority consensus

Emotional Category     | Prototypical | NP MV | NP NMV*
Anger                  |          497 |   604 |     802
Happiness/Excitement   |          441 |  1189 |    2095
Neutrality             |          388 |  1296 |    1623
Sadness                |          465 |   618 |     616
Frustration            |          562 |  1280 |    1383

* At least one evaluator tagged the utterance with the given emotion; the sets are not disjoint.
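These agreement categories can be computed directly from the per-utterance evaluator labels. Reading "majority vote" as a strict majority (more than half the evaluators) is an assumption in this sketch.

```python
from collections import Counter
from typing import List

def agreement_category(evaluator_labels: List[str]) -> str:
    """Bucket an utterance by evaluator agreement, following the
    database-specific definitions (strict majority assumed)."""
    counts = Counter(evaluator_labels)
    _, top = counts.most_common(1)[0]
    n = len(evaluator_labels)
    if top == n:
        return "prototypical"                     # complete agreement
    if top > n / 2:
        return "nonprototypical majority-vote"    # NP MV
    return "nonprototypical non-majority-vote"    # NP NMV

agreement_category(["angry", "angry", "angry"])       # -> "prototypical"
agreement_category(["angry", "angry", "frustrated"])  # -> "nonprototypical majority-vote"
agreement_category(["angry", "frustrated", "other"])  # -> "nonprototypical non-majority-vote"
```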

22 Emotional profiling: Sadness

23 Emotional profiling: Anger

24 Emotional profiling: Frustration

