1 Galatea: Open-Source Software for Developing Anthropomorphic Spoken Dialog Agents S. Kawamoto, et al. October 27, 2004.

2 Agenda
Introduction
Toolkit Design and Outline
  – Speech recognition module
  – Speech synthesis module
  – Facial image synthesis module
  – Agent manager
  – Virtual machine model
  – Task manager
  – Prototyping tools
Prototype Systems
Conclusions

3 Introduction
An anthropomorphic spoken dialog agent (ASDA) is one of the next-generation human-computer interfaces
Many ASDA systems have been developed, but developing a high-quality ASDA system is still challenging
Such systems should support an unlimited number of life-like agent characters with different faces and voices, just as humans have
For this reason, Galatea has been developed to provide a platform for building next-generation ASDA systems

4 Features of the Toolkit
Easy customization
  – Model-based approaches
    Once the model parameters are trained, facial expressions and voice quality can be controlled easily
Key techniques for natural spoken dialog
  – Incremental speech recognition, synchronization between speech and facial animation, etc.
Modularity of functional units
  – Simple architecture to manage each functional unit
    Users can develop, improve, debug, etc.
Open-source free software

5 Toolkit Design and Outline
The agent manager works as an inter-module communication manager
Devices are directly managed by the modules which utilize them
A new function is added by adding a new module for the function and connecting the module to the agent manager

6 Speech Recognition Module (SRM)
Major interfaces of the SRM are as follows:
  – Outputs
    Recognition result (XML format)
    Engine status (“busy”, “waiting”, ...)
  – Control command
    Reload grammar, change the settings of the speech recognition engine
  – Grammar representation
    Transforms the XML grammar into a format that is accepted by the speech recognition engine
[Diagram: Command Interpreter, Grammar Transformer, and Speech Recognition Engine, with speech input, grammar, request, and response flows]
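As a rough illustration of how a client might consume the SRM outputs listed above, here is a minimal Python sketch that parses an XML recognition result together with an engine status. The element and attribute names (`recognition`, `hypothesis`, `word`, `status`) are assumptions made up for this example; the slides only state that the result is in XML format and do not show Galatea's actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical recognition result. The tag and attribute names are
# illustrative assumptions, not the schema Galatea actually emits.
SAMPLE_RESULT = """
<recognition status="waiting">
  <hypothesis rank="1">
    <word>hello</word>
    <word>galatea</word>
  </hypothesis>
</recognition>
"""


def parse_result(xml_text):
    """Return the engine status and the words of the first hypothesis."""
    root = ET.fromstring(xml_text.strip())
    best = root.find("hypothesis")
    words = [w.text for w in best.findall("word")] if best is not None else []
    return {"status": root.get("status"), "words": words}


result = parse_result(SAMPLE_RESULT)
print(result["status"], result["words"])
```

Keeping the status and the word sequence in one returned dictionary mirrors the two output kinds named on the slide; a real client would follow whatever schema the SRM documentation defines.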

7 Speech Synthesis Module (SSM)
Accepts arbitrary Japanese text
Synthesizes speech with a human voice
  – An HMM-based speech synthesis method is employed
Synchronizes lip movement with speech
The SSM can interrupt speech output to cope with interruptions by the user
[Diagram: Command Interpreter, Text Analyzer, Dictionary, Waveform Generation Engine, Acoustic Models, Speech Output]

8 Facial Image Synthesis Module (FSM)
Supports high-quality facial image synthesis, animation control, and precise lip-sync with voice
A GUI is provided to fit a generic face wireframe model onto a full-face snapshot image
Facial action control
  – Mouth shape
  – Facial expression

9 Agent Manager (AM)
Integrator of all the modules of the ASDA system
Plays a central role in communication
Synchronization manager between the SSM and FSM to achieve precise lip-sync
[Diagram: Dispatcher, Macro-command interpreter]
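The dispatcher and macro-command interpreter named on this slide can be pictured with a small Python sketch. This is only a toy under stated assumptions: the class name `AgentManager`, its methods, the macro name `speak`, and the `set LipSync = now` command are all hypothetical and are not taken from Galatea. It shows one plausible way a central manager could route commands to registered modules and expand a macro into per-module commands.

```python
class AgentManager:
    """Toy inter-module communication manager (hypothetical API)."""

    def __init__(self):
        self.modules = {}  # module name -> callable taking a command string
        self.macros = {}   # macro name -> list of (module name, command)

    def register(self, name, handler):
        """Connect a module to the manager under the given name."""
        self.modules[name] = handler

    def define_macro(self, name, steps):
        """Define a macro as an ordered list of (module, command) pairs."""
        self.macros[name] = list(steps)

    def dispatch(self, target, command):
        """Forward one command to one registered module."""
        if target not in self.modules:
            raise KeyError(f"unknown module: {target}")
        return self.modules[target](command)

    def run_macro(self, name):
        """Expand a macro and dispatch each step in order."""
        return [self.dispatch(t, c) for t, c in self.macros[name]]


# Stand-in modules that only record what they receive.
log = []
am = AgentManager()
am.register("SSM", lambda cmd: log.append(("SSM", cmd)))
am.register("FSM", lambda cmd: log.append(("FSM", cmd)))

# "set LipSync = now" is an invented command used only for illustration.
am.define_macro("speak", [("SSM", "set Speak = now"),
                          ("FSM", "set LipSync = now")])
am.run_macro("speak")
print(log)
```

Because modules are only known to the manager by name, adding a new function amounts to registering one more handler, which matches the modularity argument made earlier in the deck.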

10 Virtual Machine Model
The module interface is modeled as a machine with slots
  – Each slot indicates machine status
Slot values are changed by a common command set
  – “set Speak = now” means starting voice synthesis of a given text immediately
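The slot-based interface described above can be sketched in Python as follows. The slide shows only the `set` command and the `Speak` slot; the class name `SlotMachine`, the `Text` slot, and the exact parsing rules are assumptions introduced for this example, so treat it as a sketch of the idea and not as Galatea's command grammar.

```python
import re

# Matches commands of the form "set <Slot> = <value>".
_SET = re.compile(r"^set\s+(\w+)\s*=\s*(.*)$")


class SlotMachine:
    """A module modeled as a set of named slots (hypothetical class)."""

    def __init__(self, slot_names):
        self.slots = {name: None for name in slot_names}

    def execute(self, line):
        """Apply one 'set' command and return the (slot, value) pair."""
        match = _SET.match(line.strip())
        if match is None:
            raise ValueError(f"unsupported command: {line!r}")
        name, value = match.group(1), match.group(2).strip()
        if name not in self.slots:
            raise KeyError(f"unknown slot: {name}")
        self.slots[name] = value
        return name, value


# "Text" is an assumed slot holding the text to be synthesized.
ssm = SlotMachine(["Text", "Speak"])
ssm.execute("set Text = hello")
ssm.execute("set Speak = now")
print(ssm.slots)
```

Since every module exposes the same command shape, a caller does not need module-specific client code; it only needs to know which slots a module offers.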

11 Task Manager (TM)
Defines the dialog as a set of interactions which can be represented by a dialog description language
The goal in developing the TM is that the system can use several types of dialog description languages
  – VoiceXML
    High-level language: task-oriented information and the intentions of the participants
  – PDOC (primitive dialog operation commands)
    Low-level language: device events and sequence control

12 Prototyping Tools
“Galatea Interaction Builder (IB)”
[Diagram: Application Developer, Interaction Builder, Galatea MMI System, XISL File, web site; steps: Design Scenario, Create XISL Document, Check, Download and Execute XISL]

13 Prototype Systems

14 Echo-back task

15 Conclusions
A human-like spoken dialog agent is one of the promising man-machine interfaces for the next generation
Galatea is a software toolkit for developing human-like spoken dialog agents
Because of its high modularity and simple communication architecture, it will speed up research and application development based on ASDAs

