EmoTABOU Corpus
J-C. Martin, L. Devillers, A. Zara – LIMSI-CNRS
V. Maffiolo, G. Le Chenadec – France Télécom R&D
France

2. Research Context and Goals

3. Research context
- Long-term goal: model of human-computer emotional interaction
- Requires knowledge of emotional multimodal behaviors during human-human interaction (e.g. synchronization between modalities)
- Corpus-based approach
- Experimental data and studies:
  - Monomodal
  - Acting of a single emotion
  - Emotion but interaction not videotaped (EmoTV)
  - Or interaction videotaped but not emotion

4. Related work
Discriminative features of movement quality (Wallbott 98)
Emotion                  Movement quality
Hot anger                High movement activity, expansive movements, high movement dynamics
Elated joy               High movement activity, expansive movements, high movement dynamics
Happiness                Low movement dynamics
Disgust                  Inexpansive movements
Contempt                 Low movement activity, low movement dynamics
Sadness                  Low movement activity, low movement dynamics
Despair                  Expansive movements
Terror                   High movement activity
Boredom                  Low movement activity, inexpansive movements, low movement dynamics
Shame, Interest, Pride
How does that apply to data other than instructed in-lab acting of single emotions?

5. Research context
- Cognitive-motivational-emotive systems such as EMA (Gratch and Marsella) are mainly based on:
  - theoretical psycho-cognitive models
  - behavioral models based on acted emotions
- Our aim is to observe and analyze corpora of spontaneous human-human interaction with emotional multimodal behavior, in order to build more realistic and “natural” behavioral models

6. Research questions
- Questions:
  - How do emotion and interaction combine?
  - What is the impact of both on the synchrony between modalities?
    - Gesture stroke phase and lexical affiliate (McNeill 05), max F0
    - Gaze during mental states (Baron-Cohen 97) and turn-taking (Allwood 06)
- How to collect behaviors that:
  - are spontaneous
  - are multimodal
  - are emotional
  - occur during interaction

7. Experimental Protocol

8. Experimental protocol
- Adaptation of the Taboo game
- Taboo game: 1 card with one secret word and 5 forbidden words
- Multimodal elicitation: iconic and deictic gestures

9. Experimental protocol
- Adaptation of the Taboo game
- Emotion elicitation:
  - Uncommon words to elicit surprise or embarrassment
  - One player was a naive subject
  - The other player was instructed to elicit emotions using strategies (e.g. deliberately failing to find the word)

10. Collected data
- 10 pairs of players
- 8 hours of video
- Upper body + face close-up
[Figure: recording setup around tables. C: associate (“comparse”), S: subject, E: experimenter]

11. Sample: “palimpsest”

12. Levels of annotation
- Multimodal behavior (acoustic/gestures/face)
- Linguistic behavior (dialog/communicative acts)
- Emotional/mental state behavior
- Strategic behavior

13. Annotation of emotion & context

14. Previous scheme
- Multi-level scheme for emotion and context representation:
  - Emotion labels (broad sense, including attitude, emotion, mood)
  - Dimensions (valence and intensity)
  - Contextual information (quality, speaker, etc.)
- EmoTV (Devillers, Abrilian, Martin, 2005)
- CEMO (Devillers et al., 2005)

15. EmoTabou scheme
- Adaptation of our previous scheme for emotion annotation in interaction
- We added:
  - A more general set of mental states
  - Dialog acts
  - Communicative acts
  - A contextual information scheme (sub-dialog of the game, role of the subject, card, etc.)
  - Meta-information

16. Emotion labels in EmoTabou
- To obtain this list, the emotion words of the Humaine list (55 terms) were rated for their relevance to the task (majority voting procedure, 5 people).
- To represent complex emotions, we allow the annotation of at most 5 emotions per segment.
- We then combined the annotations of several labelers into a soft vector representation.
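As a rough illustration of the soft vector idea, the sketch below merges the labels given by several annotators for one segment into a single weight per emotion. The slides do not say how the weights were computed, so the weighting rule (each annotator's vote is split equally among the labels they chose), the function name `soft_vector` and the example labels are all assumptions made for illustration.

```python
from collections import defaultdict

MAX_LABELS = 5  # the scheme allows at most 5 emotions per segment


def soft_vector(annotations):
    """Merge per-annotator label lists into {label: weight}.

    Assumption (not from the slides): every annotator has one vote,
    shared equally among the labels they selected, so that the
    resulting weights sum to 1.
    """
    if not annotations:
        raise ValueError("need at least one annotator")
    weights = defaultdict(float)
    for labels in annotations:
        if not 1 <= len(labels) <= MAX_LABELS:
            raise ValueError("each annotator gives 1 to 5 labels")
        share = 1.0 / (len(labels) * len(annotations))
        for label in labels:
            weights[label] += share
    return dict(weights)


# Hypothetical labels from three annotators for one segment
segment = [["embarrassment"], ["embarrassment", "stress"], ["amusement"]]
vector = soft_vector(segment)
```

Other weighting rules (for instance giving more weight to an annotator's first label) would fit the same data structure.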

17. Mental states and emotions
- Other studies: list of emotional labels extended with other “mental states” (Reidsma 06, Le Chenadec et al., 05)
- We added to our list the mental states defined by Baron-Cohen (96) (e.g. “Thinking”, “Unsure”)
- Aims:
  - Study the relation between emotions and mental states between the two players
  - Study how emotions and mental states are expressed through multimodal behaviors in human interactions

18. Mental states (Baron-Cohen, 1996)

19. Dialog acts (DAMSL)
- DAMSL scheme, “Dialog Act Markup in Several Layers”: annotation of interaction
  - 4 levels: Information Level, Communicative Status, Forward-Looking Functions and Backward-Looking Functions
  - More than 75 tags
- Experiments with multi-level annotations (dialogic and emotion tags) carried out within FP5-AMITIES (Devillers 02) showed correlations between emotions and some dialog acts in speech, e.g. anger with repetition.
- We use a reduced set of DAMSL tags adapted to EmoTabou

20. Dialog acts (DAMSL)
- Tags: Give a cue, Suggest a word, Assert other, Ask a question, Understand, Answer Yes, Answer No, Don’t know, Interjection, Unintelligible
- Tag categories: Forward-Looking Functions, Backward-Looking Functions, Communicative Status

21. Communicative functions
- Previous work has already provided lists of communicative functions (Poggi, Pelachaud)
- Here, we defined a list after analysing our corpus:
  - Abandon, Disapprove, Criticize, Self-criticize, Lack-of-confidence, Doubt about other, Rush, Unkind, Irony, Mocking, Joke, Sarcastic
  - Admire, Approve, Congratulate, Encourage, Propose strategy

22. Contextual information and Meta-information
We defined a contextual information scheme:
- Strategies: list of strategies given to the associate or observed in the corpus
- Game phases: give a card, play, give the result
- Card to guess
- Player role: guesser (“devin”) or mime
Meta-information:
- Post-game information
  - Subject personality (Eysenck personality inventory)
  - Questionnaire (emotions felt and elicited)

23. Examples of the different strategies
- Associate (Irritation): the associates are instructed to criticize the subject
- Card (Embarrassment): to embarrass the subject, unusual words were chosen, such as “palimpseste”
- Experimenter (Stress): the subject has 2 minutes to guess a word. After 1 min 30 s, the experimenter announces 30 seconds left, then 15 seconds…

24. Annotation protocol for Emotion and context
- The coding scheme is implemented in Anvil (Kipp 04)
- To annotate the corpus, we proceed in the following way:
  1) Segmentation
  2) Annotation by four annotators
- Iterative definition of the coding scheme:
  - Test with one video
  - Measure agreement (intra- and inter-coder agreement)
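The slides do not state which agreement measure is used. As one common option, the sketch below computes Cohen's kappa between two coders who labeled the same segments; with four annotators it could be applied to each pair, and to two passes by the same coder for intra-coder agreement. The function name and the example labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty segments")
    n = len(coder_a)
    # Proportion of segments on which the two coders agree
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's label frequencies
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels from two coders on four segments
a = ["stress", "stress", "amusement", "amusement"]
b = ["stress", "amusement", "amusement", "amusement"]
kappa = cohen_kappa(a, b)
```

Kappa assumes one label per segment, so soft vectors with several emotions per segment would need a different measure or a reduction to the dominant label first.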

25. Annotation of multimodal behaviors

26. Informal study of the collected behaviors
(a) iconic gesture describing the action "turning a spit"
(b) deictic gesture indicating the scores listed on the blackboard
(c) adaptor gesture done by the naive subject and (d) its imitation by the instructed subject

27. Annotation of multimodal expressive behaviors
- Gaze direction
- Gesture:
  - Phase
  - Function (including manipulators)
  - Expressivity (adapted from Pelachaud 05)
- Facial expressions (subset of AUs)
- Head movement
- Posture

28. Annotation of multimodal expressive behaviors
- Anvil (Kipp 04)
- 1 coder
- Agreement

29. Future Directions: Illustrations of possible measures

30. Descriptive analysis of one clip: distribution of emotions

31. Emotion vs Mental states
- We observed some relations between our set of emotions and more general mental states
- Example: Embarrassment -> Unsure

32. Example: soft-vector representation
- Emotion: (emotion label, weight, intensity, valence)
- Mental state: (mental state, weight)
- Subject: “ok, I propose that we do not even try this one and accept the penalty”

33. Emotions/Communicative acts
Table labeled "Subject / Associate" on the slide:
                          Abandon  Criticize  Lack-of-confidence  Doubt about other
Amusement                 33%      39%        31%                 0%
Embarrassment             33%      61%        28%                 100%
Table labeled "Associate / Subject" on the slide:
                          Criticize  Disapprove  Lack-of-confidence  Doubt about other
Irritation                2%         0           1%                  34%
Embarrassment             18%        0           1%                  0
Exasperation (annoyance)  26%        3%          21%                 0
Stress                    18%        3%          0                   1%
Amusement                 34%        29%         0                   46%

34. Multimodal behaviors: illustration of measures to be done
- Naive player gestures (right hand + symmetrical gestures):
  - 83% of video / 7 gesture units
  - High percentage of manipulators (48%) and holds (60%)
- Gaze direction x gesture type:
  - Adaptors: 59% Interlocutor, 31% Elsewhere
  - Deictics: 63% Interlocutor, 18% Panel, 19% Elsewhere
- Expressivity and gesture type:
  - 59% of deictics expanded, 94% of adaptors contracted
- Expressivity and phases:
  - 44% of smooth gestures occur during stroke
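A measure such as "gaze direction x gesture type" amounts to a cross-tabulation with row percentages. The sketch below assumes each gesture has already been paired with the gaze target observed during it; the pairing step, the function name and the sample data are assumptions for illustration, not the corpus figures.

```python
from collections import Counter, defaultdict


def gaze_by_gesture(pairs):
    """Return {gesture_type: {gaze_target: percentage}} from (gesture, gaze) pairs."""
    counts = defaultdict(Counter)
    for gesture_type, gaze_target in pairs:
        counts[gesture_type][gaze_target] += 1
    table = {}
    for gesture_type, gaze_counts in counts.items():
        total = sum(gaze_counts.values())
        # Percentages are computed per gesture type, so each row sums to 100
        table[gesture_type] = {
            target: 100.0 * count / total for target, count in gaze_counts.items()
        }
    return table


# Hypothetical annotations: (gesture type, gaze target during the gesture)
pairs = [
    ("deictic", "interlocutor"),
    ("deictic", "interlocutor"),
    ("deictic", "panel"),
    ("deictic", "elsewhere"),
    ("adaptor", "interlocutor"),
    ("adaptor", "elsewhere"),
]
table = gaze_by_gesture(pairs)
```

The same function could be reused for other pairs of annotation tracks, for example expressivity by gesture type.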

35. Future directions: synchronization between modalities
- Further annotations:
  - Lexical affiliate
  - Facial expression close-up
- Temporal analysis:
  - Between modalities
  - Between modalities / mental states
- Measures:
  - Gaze / mental state
  - Individual behaviors
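One possible starting point for temporal analysis between modalities is to measure how long labels on two annotation tracks overlap in time. The sketch below assumes each track is a list of (start, end, label) tuples in seconds, similar in spirit to what an annotation tool exports; the track format, the function names and the sample values are assumptions.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Duration (in seconds) shared by two time intervals, or 0.0 if disjoint."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def co_occurrence(track_a, track_b):
    """Total overlap duration for every (label_a, label_b) pair."""
    totals = {}
    for start_a, end_a, label_a in track_a:
        for start_b, end_b, label_b in track_b:
            duration = overlap(start_a, end_a, start_b, end_b)
            if duration > 0:
                key = (label_a, label_b)
                totals[key] = totals.get(key, 0.0) + duration
    return totals


# Hypothetical tracks: gesture phases and gaze targets, times in seconds
gesture_track = [(0.0, 2.0, "stroke"), (2.0, 3.0, "hold")]
gaze_track = [(0.5, 2.5, "interlocutor"), (2.5, 4.0, "elsewhere")]
durations = co_occurrence(gesture_track, gaze_track)
```

The nested loop is quadratic in the number of segments, which is acceptable for a single clip; sorted tracks would allow a linear sweep on longer recordings.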

36. Future directions: synchronization between modalities
- Annotation and measures:
  - Feedback, sequencing and turn management (Allwood 06)
  - Imitation
- Relations between the different levels of annotation in the behavior of the two players
- Comparison with pairs of naive subjects

