
1 eNTERFACE’08 Multimodal Communication with Robots and Virtual Agents mid-term presentation

2 Overview
Context: exploitation of multi-modal signals for the development of an active robot/agent listener.
Storytelling experience:
– Speakers told the story of an animated cartoon they had just seen:
  1. See the cartoon.
  2. Tell the story to a robot or an agent.

3 Overview
Active listening:
– During natural interaction, speakers check whether their statements have been correctly understood (or at least heard).
– Robots/agents should also have active listening skills.
Characterization of multi-modal signals as inputs of the feedback model:
– Speech analysis: prosody, keyword recognition, pauses.
– Partner analysis: face tracking, smile detection.
Robot/agent feedback (outputs):
– Lexical and non-verbal behaviours.
Feedback model:
– Exploitation of both input and output signals.
Evaluation:
– Storytelling experiences are usually evaluated by annotation.

4 Audio-visual recordings of a storytelling session between a speaker and a listener.
22 storytelling sessions telling the “Tweety and Sylvester - Canary Row” cartoon story.
Several speaker/listener conditions: same language or different languages.
Languages: Arabic, French, Turkish and Slovak.
Annotation oriented to interaction analysis:
– Smile, head nod, head shake, eyebrow movement, acoustic prominence.

5

6 Architecture of an interaction feedback model
– Multi-modal feature extraction
– Feedback strategy
– Multi-modal feedback
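As a rough illustration of this pipeline, the sketch below wires the three stages together in Python; every class, function and behaviour name is a hypothetical placeholder, not taken from the project code.

    # Minimal sketch of the three-stage feedback loop (all names are hypothetical).
    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class Features:
        """Multi-modal features extracted for one analysis window."""
        smile: bool = False
        head_nod: bool = False
        keywords: List[str] = field(default_factory=list)
        prominence: bool = False

    def feedback_strategy(feats: Features) -> Optional[str]:
        """Toy rule set deciding which feedback, if any, to trigger."""
        if feats.prominence and feats.smile:
            return "smile_and_nod"
        if feats.prominence or feats.head_nod:
            return "head_nod"
        if feats.keywords:
            return "verbal_backchannel"   # e.g. a short "uh-huh"
        return None

    def render_feedback(behaviour: str, target: str = "GRETA") -> None:
        """Placeholder for sending the behaviour to the ECA (via BML) or to the robot."""
        print(f"[{target}] perform: {behaviour}")

    # One pass through the loop: extract -> decide -> render.
    feats = Features(smile=True, prominence=True)
    behaviour = feedback_strategy(feats)
    if behaviour is not None:
        render_feedback(behaviour)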

7 Multi-modal feature extraction
Key idea: extraction of the features annotated in the STEAD corpus.
– Face processing: head nod, head shake, smile, activity.
– Keyword spotting: keywords have been defined in order to switch the agent's state.
– Speech processing: acoustic prominence detection.

8

9 Multi-modal feature extraction
Keyword spotting: keywords have been defined in order to switch the agent's state.
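As a minimal illustration of keyword-driven state switching, the sketch below maps spotted words to agent states; the keyword list and the state names are invented for the example and are not the project's actual vocabulary.

    # Hypothetical mapping from spotted keywords to agent-state switches.
    KEYWORD_TO_STATE = {
        "cat": "attentive",        # character mentions -> heightened attention
        "canary": "attentive",
        "falls": "surprised",      # salient story events -> emotional reaction
        "catches": "surprised",
    }

    def update_agent_state(recognized_words, current_state="neutral"):
        """Return the new agent state after scanning the recognizer output."""
        for word in recognized_words:
            state = KEYWORD_TO_STATE.get(word.lower())
            if state is not None:
                return state
        return current_state

    # Example: output of the keyword spotter for one utterance.
    print(update_agent_state(["the", "cat", "climbs", "the", "pipe"]))  # -> attentive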

10 Multi-modal feature extraction
Acoustic prominence detection: prosody analysis in real time using Pure Data.
– Development of different Pure Data objects (written in C):
  Voice activity detection
  Pitch and energy extraction
– Detection: statistical model (Gaussian assumption) with a Kullback-Leibler similarity (sketched below).
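The slide only names the method, so the sketch below shows one way a Gaussian model with a Kullback-Leibler similarity could flag prominent stretches of pitch or energy; the closed-form 1-D Gaussian KL divergence is standard, but the windowing and the threshold value are assumptions made for the example.

    import math

    def gaussian_kl(mu1, var1, mu2, var2):
        """KL divergence KL(N(mu1, var1) || N(mu2, var2)) for 1-D Gaussians."""
        return 0.5 * (math.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0)

    def is_prominent(window, background, threshold=1.0):
        """Flag a pitch/energy window whose statistics diverge from the background.

        `window` and `background` are lists of frame values (e.g. F0 or energy);
        the threshold is an illustrative value, not the project's setting.
        """
        def stats(xs):
            mu = sum(xs) / len(xs)
            var = sum((x - mu) ** 2 for x in xs) / len(xs) + 1e-6  # avoid zero variance
            return mu, var

        mu_w, var_w = stats(window)
        mu_b, var_b = stats(background)
        return gaussian_kl(mu_w, var_w, mu_b, var_b) > threshold

    # Example: a local rise in energy compared with the speaker's recent baseline.
    print(is_prominent([0.8, 0.9, 1.0, 0.9], [0.2, 0.25, 0.3, 0.2, 0.22]))  # -> True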

11 Feedback model
Extraction of rules from the annotations (STEAD corpus):
– Rules are defined in the literature.
– Application to our specific task: when is a feedback triggered?
Feedback behaviours:
– ECA: several behaviours are already defined (head movements, facial expressions) for GRETA with BML (Behaviour Markup Language).
– Robot: we defined several basic behaviours for our AIBO robot (inspired by a dog's reactions); mapping from BML to robot movements (sketched below).
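To make the BML-to-robot mapping concrete, here is a hedged sketch; the behaviour names, the AIBO command strings and the mapping itself are illustrative assumptions, not the project's actual tables.

    # Hypothetical mapping from BML-style behaviour names to AIBO motor commands.
    BML_TO_AIBO = {
        "head_nod":   ["tilt_head_down", "tilt_head_up"],
        "head_shake": ["pan_head_left", "pan_head_right", "pan_head_center"],
        "smile":      ["wag_tail"],                # dog-like substitute for a facial expression
        "surprise":   ["raise_ears", "step_back"],
    }

    def perform(behaviour):
        """Translate one ECA behaviour into a sequence of robot movements."""
        for command in BML_TO_AIBO.get(behaviour, []):
            print(f"AIBO <- {command}")            # placeholder for the real motor interface

    perform("head_nod")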

12 Future work
Integration:
– Real-time multi-modal feature extraction: prominence detection object (Pure Data).
– Communication between the modules through PsyClone (already done for video processing).
– Tests of feedback behaviours for AIBO.
– Agent state modifications.
Recordings and annotations of storytelling experiences with both GRETA and AIBO.

13 Thanks for your attention…

