
1 eNTERFACE’08 Multimodal Communication with Robots and Virtual Agents mid-term presentation

2 Overview
Context: exploitation of multi-modal signals for the development of an active robot/agent listener.
Storytelling experience:
– Speakers told the story of an animated cartoon they had just seen:
  1. See the cartoon.
  2. Tell the story to a robot or an agent.

3 Overview
Active listening:
– During natural interaction, speakers check whether their statements have been correctly understood (or at least heard).
– Robots/agents should also have active listening skills.
Characterization of multi-modal signals as inputs of the feedback model:
– Speech analysis: prosody, keyword recognition, pauses.
– Partner analysis: face tracking, smile detection.
Robot/agent feedback (outputs):
– Lexical and non-verbal behaviours.
Feedback model:
– Exploitation of both input and output signals.
Evaluation:
– Storytelling experiences are usually evaluated by annotation.

4 Audio-visual recordings of a storytelling session between a speaker and a listener.
22 storytelling sessions telling the “Tweety and Sylvester - Canary Row” cartoon story.
Several speaker/listener conditions: same language or different languages.
Languages: Arabic, French, Turkish and Slovak.
Annotation oriented to interaction analysis:
– Smile, head nod, head shake, eyebrow movement, acoustic prominence.

5

6 Architecture of an interaction feedback model
– Multi-modal feature extraction
– Feedback strategy
– Multi-modal feedback
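As a rough illustration of this pipeline, the sketch below wires the three stages together in Python; every class, function and behaviour name is a hypothetical placeholder, not taken from the project code.

    # Minimal sketch of the three-stage feedback loop (all names are hypothetical).
    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class Features:
        """Multi-modal features extracted for one analysis window."""
        smile: bool = False
        head_nod: bool = False
        keywords: List[str] = field(default_factory=list)
        prominence: bool = False

    def feedback_strategy(feats: Features) -> Optional[str]:
        """Toy rule set deciding which feedback, if any, to trigger."""
        if feats.prominence and feats.smile:
            return "smile_and_nod"
        if feats.prominence or feats.head_nod:
            return "head_nod"
        if feats.keywords:
            return "verbal_backchannel"   # e.g. a short "uh-huh"
        return None

    def render_feedback(behaviour: str, target: str = "GRETA") -> None:
        """Placeholder for sending the behaviour to the ECA (via BML) or to the robot."""
        print(f"[{target}] perform: {behaviour}")

    # One pass through the loop: extract -> decide -> render.
    feats = Features(smile=True, prominence=True)
    behaviour = feedback_strategy(feats)
    if behaviour is not None:
        render_feedback(behaviour)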

7 Multi-modal feature extraction
Key idea: extraction of the features annotated in the STEAD corpus.
– Face processing: head nod, head shake, smile, activity.
– Keyword spotting: keywords have been defined in order to switch the agent's state.
– Speech processing: acoustic prominence detection.

8

9 Multi-modal feature extraction
Keyword spotting: keywords have been defined in order to switch the agent's state.
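As a minimal illustration of keyword-driven state switching, the sketch below maps spotted words to agent states; the keyword list and the state names are invented for the example and are not the project's actual vocabulary.

    # Hypothetical mapping from spotted keywords to agent-state switches.
    KEYWORD_TO_STATE = {
        "cat": "attentive",        # character mentions -> heightened attention
        "canary": "attentive",
        "falls": "surprised",      # salient story events -> emotional reaction
        "catches": "surprised",
    }

    def update_agent_state(recognized_words, current_state="neutral"):
        """Return the new agent state after scanning the recognizer output."""
        for word in recognized_words:
            state = KEYWORD_TO_STATE.get(word.lower())
            if state is not None:
                return state
        return current_state

    # Example: output of the keyword spotter for one utterance.
    print(update_agent_state(["the", "cat", "climbs", "the", "pipe"]))  # -> attentive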

10 Multi-modal feature extraction
Acoustic prominence detection: prosody analysis in real time using Pure Data.
– Development of different Pure Data objects (written in C):
  Voice activity detection
  Pitch and energy extraction
– Detection: statistical model (Gaussian assumption) with a Kullback-Leibler similarity (sketched below).
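The slide only names the method, so the sketch below shows one way a Gaussian model with a Kullback-Leibler similarity could flag prominent stretches of pitch or energy; the closed-form 1-D Gaussian KL divergence is standard, but the windowing and the threshold value are assumptions made for the example.

    import math

    def gaussian_kl(mu1, var1, mu2, var2):
        """KL divergence KL(N(mu1, var1) || N(mu2, var2)) for 1-D Gaussians."""
        return 0.5 * (math.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0)

    def is_prominent(window, background, threshold=1.0):
        """Flag a pitch/energy window whose statistics diverge from the background.

        `window` and `background` are lists of frame values (e.g. F0 or energy);
        the threshold is an illustrative value, not the project's setting.
        """
        def stats(xs):
            mu = sum(xs) / len(xs)
            var = sum((x - mu) ** 2 for x in xs) / len(xs) + 1e-6  # avoid zero variance
            return mu, var

        mu_w, var_w = stats(window)
        mu_b, var_b = stats(background)
        return gaussian_kl(mu_w, var_w, mu_b, var_b) > threshold

    # Example: a local rise in energy compared with the speaker's recent baseline.
    print(is_prominent([0.8, 0.9, 1.0, 0.9], [0.2, 0.25, 0.3, 0.2, 0.22]))  # -> True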

11 Feedback model
Extraction of rules from the annotations (STEAD corpus):
– Rules are defined in the literature.
– Application to our specific task: when is a feedback triggered?
Feedback behaviours:
– ECA: several behaviours are already defined (head movements, facial expressions) for GRETA with BML (Behaviour Markup Language).
– Robot: we defined several basic behaviours for our AIBO robot (inspired by a dog's reactions); mapping from BML to robot movements (sketched below).
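To make the BML-to-robot mapping concrete, here is a hedged sketch; the behaviour names, the AIBO command strings and the mapping itself are illustrative assumptions, not the project's actual tables.

    # Hypothetical mapping from BML-style behaviour names to AIBO motor commands.
    BML_TO_AIBO = {
        "head_nod":   ["tilt_head_down", "tilt_head_up"],
        "head_shake": ["pan_head_left", "pan_head_right", "pan_head_center"],
        "smile":      ["wag_tail"],                # dog-like substitute for a facial expression
        "surprise":   ["raise_ears", "step_back"],
    }

    def perform(behaviour):
        """Translate one ECA behaviour into a sequence of robot movements."""
        for command in BML_TO_AIBO.get(behaviour, []):
            print(f"AIBO <- {command}")            # placeholder for the real motor interface

    perform("head_nod")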

12 Future work
Integration:
– Real-time multi-modal feature extraction: prominence detection object (Pure Data).
– Communication between the modules through PsyClone (already done for video processing).
– Tests of feedback behaviours for AIBO.
– Agent state modifications.
Recordings and annotations of storytelling experiences with both GRETA and AIBO.

13 Thanks for your attention…

