JC Martin - LIMSI/CNRS - WP5 WS1 Manual Annotation of Multimodal Behaviors in Emotional TV Interviews J.-C. Martin, S. Abrilian, L. Devillers LIMSI-CNRS, France
JC Martin - LIMSI/CNRS - WP5 WS2 Outline
- Introduction
  - Goals
  - Requirements on annotation
  - Emotional parameters of multimodal behaviors
- Coding scheme
  - 1st coding scheme and annotation
  - 2nd coding scheme and example on 1 video
- Future directions
JC Martin - LIMSI/CNRS - WP5 WS4 Introduction: Goals
- How do modalities correlate in non-acted emotions?
  - Annotations and models: one source of knowledge
  - Coordination between modalities during non-acted emotion
  - Synthesis of non-acted spontaneous multimodal emotions in ECAs
- How to code/represent multimodal emotional behavior?
  - Methodology (which attributes can easily be annotated manually)
  - Trade-off / intermediate level
    - Manual: global, free text, whole video
    - Manual: medium/high-order signs
    - Automatic: low-level signs
- WP5 + WP6 + WP4 + (WP3)
JC Martin - LIMSI/CNRS - WP5 WS5 Introduction: Requirements on coding scheme
- Enable annotation (or computation)
  - Literature: main attributes of emotional behaviors
  - Corpus-based approach: cover behaviors observed in EmoTV
- Multi-level annotation of temporal data
  - Global annotation
    - Manual annotation of multimodal signs for the global sequence
    - Computations from manual annotations in each modality (mono, red, comp)
  - Emotional segment level
    - Computations from manual annotations in each modality (mono, red, comp)
- Provide one source of knowledge for ECA specification
- Enable reliability and readability
- Annotation time
JC Martin - LIMSI/CNRS - WP5 WS6 Introduction: Emotional parameters of multimodal behaviors
- Psychology & behavior
  - Montepare, J., Koff, E., Zaitchik, D. and Albert, M. (1999). "The use of body movements and gestures as cues to emotions in younger and older adults." Journal of Nonverbal Behavior.
  - Wallbott, H. G. (1998). "Bodily expression of emotion." European Journal of Social Psychology.
- Detection of emotions + relevant non-verbal behaviors
- Acted data
- +/- Basic emotions
- Age, gender
- Facial expression masked
- Expressivity in ECAs (Hartman & Pelachaud 2004)
JC Martin - LIMSI/CNRS - WP5 WS7 Introduction: Emotional parameters of multimodal behaviors
(Boone and Cunningham 1996; Boone and Cunningham 1998), acted data:
- Changes in tempo
- Directional changes
- Frequency
- Muscle tension
- Duration
(DeMeijer 1991), acted data:
- Trunk (stretching, bowing)
- Arm (opening, closing)
- Vertical direction (upward, downward)
- Sagittal direction (forward, backward)
- Force (strong, light)
- Velocity (fast, slow)
- Directness
JC Martin - LIMSI/CNRS - WP5 WS8 Introduction Multimodal corpora from TV clips Communicative functions Kipp (2003) MUMIN (Allwood et al. 2004) Musical Score (Magno Caldognetto et al. 2004) Emotions / informal annotation Orage (Atifi and Marcoccia 2001)
JC Martin - LIMSI/CNRS - WP5 WS10 Current status
- 1st: annotation of 35 clips from EmoTV with 2 coders
- 2nd: iterative definition and application to 1 clip of EmoTV using Anvil (SA, JCM)
  - Annotation guide written
  - 1 meeting with Catherine Pelachaud (Paris 8) to investigate use for WP6
JC Martin - LIMSI/CNRS - WP5 WS11 Movement quality: annotated vs. computed
- Quality (annotated)
  - Number of repetitions
  - Fluidity: smooth / normal / jerky
  - Strength: soft / normal / hard
  - Speed: slow / normal / fast
  - Spatial expansion: contracted / normal / expanded
- Computed
  - Start / end / duration
  - Movement direction, type, angle approximation
  - Torso: computed from Pose track
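A minimal sketch of how the annotated and computed movement attributes listed on this slide could be represented. The class and field names are hypothetical (they are not identifiers from Anvil or from the coding scheme); only the value sets come from the slide. Duration is derived from start and end instead of being stored, which mirrors the annotated vs. computed split.

```python
from dataclasses import dataclass
from enum import Enum


class Fluidity(Enum):
    SMOOTH = "smooth"
    NORMAL = "normal"
    JERKY = "jerky"


class Strength(Enum):
    SOFT = "soft"
    NORMAL = "normal"
    HARD = "hard"


class Speed(Enum):
    SLOW = "slow"
    NORMAL = "normal"
    FAST = "fast"


class SpatialExpansion(Enum):
    CONTRACTED = "contracted"
    NORMAL = "normal"
    EXPANDED = "expanded"


@dataclass
class MovementAnnotation:
    """One annotated movement segment (hypothetical record layout)."""
    start: float          # segment start, in seconds
    end: float            # segment end, in seconds
    repetitions: int      # annotated by the coder
    fluidity: Fluidity    # annotated by the coder
    strength: Strength    # annotated by the coder
    speed: Speed          # annotated by the coder
    expansion: SpatialExpansion  # annotated by the coder

    @property
    def duration(self) -> float:
        # Computed attribute: never entered by hand.
        return self.end - self.start


example = MovementAnnotation(
    start=1.2, end=2.0, repetitions=2,
    fluidity=Fluidity.JERKY, strength=Strength.HARD,
    speed=Speed.FAST, expansion=SpatialExpansion.EXPANDED,
)
```

Using enumerations keeps each quality attribute restricted to its three allowed values, so a typo in a label fails early instead of producing a new category.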
JC Martin - LIMSI/CNRS - WP5 WS12 Annotation #1: Multimodal coding scheme
- Speech: transcription including non-verbal events (laughter, cry, …)
- Posture: pose; posture shift including speed and action (4 cues with 3 to 10 attributes per cue, for instance: cue = action, attribute = walk)
- Gestures: phases of gesture (preparation, stroke, retraction), handedness, speed, energy, spatial region, hand shape, direction of gesture, gesture type (beats, adaptors, deictic…)
- Facial expressions: subset of Facial Animation Parameters (FAPs)
JC Martin - LIMSI/CNRS - WP5 WS13 Annotation #1: Statistics
- Most frequently annotated behaviors:
  - facial expressions (78.6% of annotated multimodal behaviors for coder1, 80.4% for coder2)
  - gestures (11.3% for coder1, 11.9% for coder2)
  - posture (10% for coder1, 7.7% for coder2)
- Most frequent attributes:
  - gaze direction (26.8% for coder1, 17% for coder2)
  - head movements (23.5% for coder1, 21% for coder2)
  - blinking (15.8% for coder1, 17.6% for coder2)
  - eyebrow movements (10% for coder1, 9.3% for coder2)
- The coders agreed quantitatively on some attributes (number of annotations of preparation and stroke gesture phases, number of annotations of speed of posture shift).
- Coder1 was more sensitive than coder2 in all the modalities.
- Disagreements occurred on body poses, and on gesture type and energy.
- Coder1 annotated subtle body moves, whereas coder2 annotated only clearly visible movements.
- Coder2 associated a gesture’s energy with its speed, while coder1 differentiated the two attributes, perceiving that a gesture might have high energy and slow motion.
JC Martin - LIMSI/CNRS - WP5 WS14 Annotation #1: Statistics
- Many cues in coder1's annotations are shared by several emotion labels (blinking, head movements…), but there are also typical cues for some emotions, such as lowering hands when despaired or slow body movement for serenity.
- Behaviors linked to strong emotions (anger, exaltation…) differ from those linked to weak ones (irritation, serenity). The discriminating attributes are speed and energy for gestures, and speed for body movement.
- Serenity involves no gestures, whereas exaltation is often accompanied by fast and energetic gestures.
- Anger is correlated with fast and intense gestures, whereas irritation involves slow and low-intensity gestures.
JC Martin - LIMSI/CNRS - WP5 WS15 Annotation #1: Quantitative analysis
- Low intercoder agreement on some attributes
- Reduce the number of values: 7 => 3
- Improve annotation protocol & guide
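To illustrate why reducing the number of values can raise intercoder agreement, here is a small sketch. The slide does not say which agreement measure was used, so Cohen's kappa is an assumption here. The 7-point ordinal scale, the cut points in `collapse`, and the two coders' labels are likewise hypothetical illustration data, not values from EmoTV.

```python
from collections import Counter


def cohen_kappa(a, b):
    """Cohen's kappa for two coders labelling the same items."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from each coder's marginal label frequencies.
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


def collapse(value):
    """Map a 7-point value onto 3 classes (cut points are an assumption)."""
    if value <= 2:
        return "low"
    if value <= 5:
        return "normal"
    return "high"


# Hypothetical labels on a 7-point scale for eight segments.
coder1 = [1, 2, 3, 4, 5, 6, 7, 4]
coder2 = [2, 1, 4, 3, 5, 7, 6, 5]

kappa_7 = cohen_kappa(coder1, coder2)
kappa_3 = cohen_kappa([collapse(v) for v in coder1],
                      [collapse(v) for v in coder2])
```

In this toy data the coders rarely pick the exact same value out of seven but always fall in the same coarse class, so `kappa_3` is higher than `kappa_7`. The gain comes at the cost of a coarser description, which is the trade-off the slide points to.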
JC Martin - LIMSI/CNRS - WP5 WS16 Tracks or group
- Tracks: Torso, Head, Facial expressions, Global body, Shoulders, (Arms), (Gestures)
- Alternation of pose and movements: torso, head, shoulders
- Common values for attributes: asymmetry, other
JC Martin - LIMSI/CNRS - WP5 WS17 Methodology
- Annotation guide
- Track per track
- Annotate emotion vs. communication
  - emotionally rich clips
  - reduced interaction (monologue in interviews)
  - exaggerated mouth / brow movements
JC Martin - LIMSI/CNRS - WP5 WS18 Torso
- Movement direction to be computed from pose
- Poses: 3 dimensions
  - twist, side-side, bend
  - rotational, lateral, sagittal
- Labels + approximation of angles
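A minimal sketch of how torso movement direction could be computed from two successive poses, as this slide suggests. Everything beyond the three pose dimensions is an assumption: the class and function names, the sign conventions (positive twist and side-side meaning "right", positive bend meaning "forward"), and the 5-degree threshold below which a change is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TorsoPose:
    """Approximate torso angles in degrees (sign conventions are assumed)."""
    twist: float  # rotational axis, positive = turned to the right
    side: float   # lateral axis, positive = leaning to the right
    bend: float   # sagittal axis, positive = bent forward


# Direction labels for a negative / positive change on each axis.
AXIS_LABELS = {
    "twist": ("left", "right"),
    "side": ("left", "right"),
    "bend": ("backward", "forward"),
}


def movement_direction(before, after, threshold=5.0):
    """Return the direction of change per axis between two poses.

    Axes whose change is smaller than `threshold` degrees are omitted,
    since annotated angles are only approximations.
    """
    directions = {}
    for axis, (negative, positive) in AXIS_LABELS.items():
        delta = getattr(after, axis) - getattr(before, axis)
        if abs(delta) >= threshold:
            directions[axis] = positive if delta > 0 else negative
    return directions


upright = TorsoPose(twist=0.0, side=0.0, bend=0.0)
leaning = TorsoPose(twist=2.0, side=-10.0, bend=20.0)
result = movement_direction(upright, leaning)
```

Deriving direction this way means coders only label poses, and the movement between two poses is filled in automatically.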
JC Martin - LIMSI/CNRS - WP5 WS20 Torso Pose Side-side / Bend
JC Martin - LIMSI/CNRS - WP5 WS21 Example Torso fast movement
JC Martin - LIMSI/CNRS - WP5 WS22 Head
- Movements: numerous and combined => direction annotated in movement track
- Primary & secondary
- Position
- Movement
- FACS
JC Martin - LIMSI/CNRS - WP5 WS23 Example Head : 2 directions - speed
JC Martin - LIMSI/CNRS - WP5 WS24 Gestures: structural transcription (Kipp 04; Efron 1941; McNeill 92)
- Preparation: Bringing arm and hand into stroke position. Note that changing hand shape before/after moving the arm belongs to the preparation.
- Stroke: The most energetic part of the gesture.
- SequenceOfStroke: A number of successive strokes; all strokes should be covered by this phase.
- Retract: Movement back to rest position; in sitting position this is usually the arm rest, the lap or folded arms.
- Hold: A phase of stillness just before or just after the stroke, usually used to defer the stroke so that it coincides with a certain word.
JC Martin - LIMSI/CNRS - WP5 WS25 Gesture: functional transcription
- Manipulator: Contact with body or object. Movements which serve functions of drive reduction or other non-communicative functions, like scratching oneself.
- Beat: Synchronised with the emphasis of the speech.
- Deictic: Arm/hand is used to point at an existing or imaginary object.
- Representational: Represents attributes, actions, relationships of objects and characters (concrete or abstract).
- Emblem: Movement with a precise, culturally defined meaning, like the eye-wink, gestures signalling the intellectual deficiency of another person, or obscene gestures.
JC Martin - LIMSI/CNRS - WP5 WS26 Example Homogeneous sequence of stroke
JC Martin - LIMSI/CNRS - WP5 WS27 Example Manipulator gesture
JC Martin - LIMSI/CNRS - WP5 WS28 Gesture annotation attributes
- Deictic target: Self / Camera
- Manipulator target: Chest / Hairs / Eyebrows / Nose / Mouth
- Object in hand: if the character is holding an object, enter the name of the object
- Spatial region: Up / Head / Chest / Down / Extreme periphery
- Directness: Linear / Shaped pathway
- Vertical direction: Upward / Downward
- Horizontal direction: Leftward / Rightward
- Sagittal direction: Forward / Backward
- Hands relationship: Independent / Mirror / Asymmetric
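The attribute list on this slide amounts to a controlled vocabulary, which could be checked mechanically. The sketch below is one possible encoding: the dictionary keys, the lowercase spellings, and the `validate_gesture` helper are hypothetical names chosen for illustration, and only the allowed values are taken from the slide. "Object in hand" is treated as free text, as the slide describes.

```python
# Allowed values per attribute, copied from the slide (lowercased).
GESTURE_VOCABULARY = {
    "deictic_target": {"self", "camera"},
    "manipulator_target": {"chest", "hairs", "eyebrows", "nose", "mouth"},
    "spatial_region": {"up", "head", "chest", "down", "extreme periphery"},
    "directness": {"linear", "shaped pathway"},
    "vertical_direction": {"upward", "downward"},
    "horizontal_direction": {"leftward", "rightward"},
    "sagittal_direction": {"forward", "backward"},
    "hands_relationship": {"independent", "mirror", "asymmetric"},
}

# Attributes whose value is a free-text entry rather than a fixed label.
FREE_TEXT_ATTRIBUTES = {"object_in_hand"}


def validate_gesture(annotation):
    """Return a list of problems found in one gesture annotation dict."""
    problems = []
    for attribute, value in annotation.items():
        if attribute in FREE_TEXT_ATTRIBUTES:
            continue  # any object name is accepted
        allowed = GESTURE_VOCABULARY.get(attribute)
        if allowed is None:
            problems.append(f"unknown attribute: {attribute}")
        elif value not in allowed:
            problems.append(f"invalid value for {attribute}: {value}")
    return problems


ok = validate_gesture({
    "manipulator_target": "nose",
    "object_in_hand": "pen",
    "hands_relationship": "mirror",
})
bad = validate_gesture({"directness": "curly", "wrist": "bent"})
```

A check like this catches labelling slips before agreement statistics are computed, which matters when several coders edit the same scheme by hand.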
JC Martin - LIMSI/CNRS - WP5 WS29 Other annotations: limited set of annotations for
- Facial expression
  - Label + Action Unit (combination)
  - Gaze, brows, mouth, chin, nose
- Shoulders
- Arms
- Global pose and movement
JC Martin - LIMSI/CNRS - WP5 WS30 Future directions
- Modifications for potential use as one source of knowledge for WP6 / WP4
  - Adding temporal evolution in segments
  - Wrist position
  - Fluidity only between gestures or repetitions?
  - Integration with other sources of knowledge (temporal)
- Validation of annotation
  - Perceptual tests at the different levels of multimodal annotation
  - Segment of multimodal behavior
  - Annotate common segments + intercoder agreement
- Annotation of several videos
  - Evaluation of annotation time
  - Correlations between emotions and multimodal annotations
JC Martin - LIMSI/CNRS - WP5 WS31 Architectural Principles of a Software Platform for the Management of Multimodal Emotional Corpora
JC Martin - LIMSI/CNRS - WP5 WS32 Goals Guidelines Illustrative combinations of tools
JC Martin - LIMSI/CNRS - WP5 WS33 Surveys of annotation tools for multimodal corpora
- Tools: Anvil, TasX,
- Surveys: ISLE D10, NITE, Harper Eurospeech, NISLab LREC 2004 paper, LREC WS 2002 / 2004
JC Martin - LIMSI/CNRS - WP5 WS37 Platforms examples Wizard of Oz (Buisine et al. 2003)
JC Martin - LIMSI/CNRS - WP5 WS38 Requirements / description
- Requirements of such a platform for emotion
  - Continuous / discrete
  - Replay / validation
- Description
  - Software
  - Data files: media, metadata
  - Annotations: manual, automatic, mixed
  - Coding schemes
  - Documentation files
  - Paper forms
JC Martin - LIMSI/CNRS - WP5 WS39 Architecture
- Tools
- Input / output
- Use during various iterations
- Segmentation
- Agreement / vote / reduce number of classes
- Re-annotation
- Audio only, video only, audio-video
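The "vote" step mentioned on this slide could take the form of a simple majority vote over the labels that several coders gave to the same segment. The slide does not specify the voting rule, so the rule below, including the choice to return `None` on a tie so the segment can be sent back for re-annotation, is an assumption, and the function name is hypothetical.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label, or None when the top labels tie."""
    if not labels:
        raise ValueError("at least one label is required")
    ranked = Counter(labels).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # no majority: flag the segment for re-annotation
    return ranked[0][0]


# Hypothetical labels from three coders for two segments.
segment_a = majority_vote(["anger", "anger", "irritation"])
segment_b = majority_vote(["anger", "irritation", "serenity"])
```

Returning `None` instead of picking arbitrarily keeps ties visible, which fits the slide's separate "re-annotation" step.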
JC Martin - LIMSI/CNRS - WP5 WS41 Introduction: Emotional parameters of multimodal behaviors
(Montepare et al. 1999), acted data:
- Hand positions
- Gait
- Fluidity
- Stiffness
- Strength
- Speed
- Spatial expansion
- Activity
(Wallbott 1998), acted data:
- Upper body
- Shoulders (up, backward, forward)
- Head (downward, backward, turned sideways, bent sideways)
- Arms
- Hands
- Movement quality (activity, spatial expansion, movement dynamics, energy, power)
- Symmetry