Speech Interface to Virtual Reality Applications. Reporter: Chun-Feng Liao. Authors: Wauchope, K., S. Everett, D. Tate, T. Maney; M. Cernak, A. Sannier.


1 Speech Interface to Virtual Reality Applications  Reporter: Chun-Feng Liao  Authors: Wauchope, K., S. Everett, D. Tate, T. Maney; M. Cernak, A. Sannier

2 References  M. Cernak, A. Sannier, Technical Report, “Command Speech Interface to Virtual Reality Applications,” Virtual Reality Applications Center at Iowa State University of Science and Technology, June 2002.  Wauchope, K., S. Everett, D. Tate, T. Maney, "Speech-Interactive Virtual Environments for Ship Familiarization," 2nd International EuroConference on Computer and IT Applications in the Maritime Industries (COMPIT '03), Hamburg, Germany, May 14-17, 2003, pp. 70-83.  This report discusses two implementations of a speech interface to virtual reality applications.

3

4 Agenda  Introduction  Paper I  Paper II  Conclusion  System design Discussion

5 Introduction  Both papers are recent (2002, 2003).  Both papers address technical details of speech-VR integration.  The second paper takes a more modern approach.  Both use a similar architecture (which is also similar to ours). Example: choosing a VRML + Java Speech API platform, encountering several difficult problems such as Java security constraints, and being forced to use a “browser as an application” instead of a “browser as an applet”.

6 Paper I  M. Cernak, A. Sannier, Technical Report, “Command Speech Interface to Virtual Reality Applications,” Virtual Reality Applications Center at Iowa State University of Science and Technology, June 2002.

7 Purposes of this paper  Describe an approach to controlling VR applications using a multimodal command speech interface (CSI) based on dialog modeling.  Used to improve the usability of VRAC’s C6. VRAC: Virtual Reality Applications Center. C6 is a virtual reality system developed by VRAC.

8 Multimodal Interaction  U: MoleBio  S: Yes  U: (targets atom 512 with the mouse)  U: Go there!  S: OK (go to atom number 512). U: User, S: System. Command addressing is used to trigger the system to start recording the user’s voice for recognition.
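
The exchange on this slide can be read as a two-state dialog: the system ignores speech until it is addressed by name, then accepts one command and combines it with whatever the mouse last targeted. The Java sketch below illustrates that reading only. The class, method names, and reply strings are hypothetical and are not taken from the CSLU-based system described in the paper.

```java
import java.util.Optional;

/** Minimal sketch of command addressing plus a pointing gesture (hypothetical names). */
class MultimodalDialog {
    enum State { IDLE, LISTENING }

    private State state = State.IDLE;
    private Integer targetedAtom = null;  // last object selected with the mouse
    private final String addressWord;     // e.g. "MoleBio" in the slide's example

    MultimodalDialog(String addressWord) {
        this.addressWord = addressWord;
    }

    /** Mouse targeting is tracked independently of speech. */
    void onPointerTarget(int atomId) {
        targetedAtom = atomId;
    }

    /** Returns the system's reply, or empty if the utterance is ignored. */
    Optional<String> onUtterance(String text) {
        String said = text.trim();
        if (state == State.IDLE) {
            if (said.equalsIgnoreCase(addressWord)) {
                state = State.LISTENING;   // addressing opens the recognition window
                return Optional.of("Yes");
            }
            return Optional.empty();       // speech not addressed to the system
        }
        state = State.IDLE;                // one command per addressing
        if (said.equalsIgnoreCase("go there") && targetedAtom != null) {
            return Optional.of("OK (go to atom number " + targetedAtom + ")");
        }
        return Optional.of("Sorry, I did not understand");
    }

    public static void main(String[] args) {
        MultimodalDialog dialog = new MultimodalDialog("MoleBio");
        System.out.println(dialog.onUtterance("MoleBio").orElse("(ignored)"));
        dialog.onPointerTarget(512);
        System.out.println(dialog.onUtterance("Go there").orElse("(ignored)"));
    }
}
```

Resetting to IDLE after every command is one possible design choice; it keeps background conversation from being interpreted as commands, at the cost of repeating the address word.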

9 System Architecture  Dialog Management and Speech Facilities  VR System

10 System Architecture  VR : VRAC’s C6  TTS : Festival  SR : CSLU Toolkit  Platform : Windows OS on PII 400

11 Three Main Components(1)  Speech Synthesis (TTS) : Festival.

12 Three Main Components(2)  CSLU Toolkit: dialog modeling, speech recognition, and natural language processing.  The CSLU Toolkit is implemented in C and Tcl/Tk and was developed by the Center for Spoken Language Understanding (CSLU) at the Oregon Graduate Institute (OGI).

13

14 Three Main Components(3)  Communication bridge to the VR application.  Integrates CSLU (speech) with C6 (VR).

15 How to Integrate CSLU and C6  Initial attempt: CORBA. C6 supports CORBA. Tried to use “Combat” as a Tcl extension acting as the CORBA client, but this failed. Tried to use “Tcl Blend”: Tcl -> Java -> CORBA -> C6 (efficiency problems). Result: use a TCP socket.
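
To make the final choice concrete, here is a minimal Java sketch of a line-based command exchange over a TCP socket. The paper's bridge ran on the Tcl side and its wire protocol is not described on the slide, so the newline-terminated text format, the "OK" acknowledgement, the command string, and all names below are assumptions for illustration. A stub server stands in for the VR system.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class SocketBridge {

    /** Speech side: sends one newline-terminated command and returns the one-line reply. */
    static String sendCommand(String host, int port, String command) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            Writer out = new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8);
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
            out.write(command + "\n");
            out.flush();
            return in.readLine();
        }
    }

    /** Stand-in for the VR side: serves one connection and acknowledges one command. */
    static Thread startStubVrServer(ServerSocket server) {
        Thread worker = new Thread(() -> {
            try (Socket client = server.accept()) {
                BufferedReader in = new BufferedReader(
                        new InputStreamReader(client.getInputStream(), StandardCharsets.UTF_8));
                Writer out = new OutputStreamWriter(client.getOutputStream(), StandardCharsets.UTF_8);
                String command = in.readLine();
                out.write("OK " + command + "\n");   // hypothetical acknowledgement format
                out.flush();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        worker.setDaemon(true);
        worker.start();
        return worker;
    }

    public static void main(String[] args) throws IOException {
        // Port 0 lets the OS pick a free port on the loopback interface.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            startStubVrServer(server);
            String reply = sendCommand(server.getInetAddress().getHostAddress(),
                    server.getLocalPort(), "GOTO ATOM 512");
            System.out.println(reply);
        }
    }
}
```

A plain socket avoids the ORB dependency that caused trouble with CORBA, but both sides then have to agree on the message format by hand.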

16 Natural Language Processing  Instead of using standard JSGF (Java Speech Grammar Format), the authors defined a custom grammar and wrote a dedicated parser to evaluate it.  The custom grammar is very similar to JSGF.  We will not discuss the custom grammar in detail here.
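
For readers unfamiliar with JSGF, a rule body such as `(go | move) (there | back)` lists alternatives with `|` and groups them with parentheses. The sketch below is a deliberately tiny Java matcher for that restricted shape (a sequence of groups, each holding single-word alternatives). It is not the authors' custom grammar or parser, which the slide does not show, and it does not implement full JSGF. The class name and example rule are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TinyCommandGrammar {
    // One entry per parenthesized group; each entry holds that group's alternatives.
    private final List<List<String>> slots = new ArrayList<>();

    /** Parses a restricted JSGF-like rule body such as "(go | move) (there | back)". */
    TinyCommandGrammar(String ruleBody) {
        Matcher groups = Pattern.compile("\\(([^()]*)\\)").matcher(ruleBody);
        while (groups.find()) {
            List<String> alternatives = new ArrayList<>();
            for (String alternative : groups.group(1).split("\\|")) {
                alternatives.add(alternative.trim().toLowerCase(Locale.ROOT));
            }
            slots.add(alternatives);
        }
    }

    /** True if the utterance has exactly one word per group and each word is an allowed alternative. */
    boolean accepts(String utterance) {
        String[] words = utterance.trim().toLowerCase(Locale.ROOT).split("\\s+");
        if (words.length != slots.size()) {
            return false;
        }
        for (int i = 0; i < words.length; i++) {
            if (!slots.get(i).contains(words[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        TinyCommandGrammar grammar = new TinyCommandGrammar("(go | move) (there | back)");
        System.out.println(grammar.accepts("Go there"));
        System.out.println(grammar.accepts("fly there"));
    }
}
```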

17 CSI Test Environment  A RAD (GUI) tool that helps developers quickly build the dialog flow.

18 Paper I Conclusion  The major advantage of this system is quick deployment.  The problematic area is speech recognition accuracy (provided by CSLU), which was poor.  The US Navy has also developed a speech interface to a VR system; the authors plan to improve the interaction with VR by drawing on the Navy’s method.

19 Future Work  Change TTS and SR to IBM ViaVoice. Support JSAPI (Java Speech API). Java makes it easier to communicate with C6 via CORBA.

20

21 Paper II  Wauchope, K., S. Everett, D. Tate, T. Maney, "Speech-Interactive Virtual Environments for Ship Familiarization," 2nd International EuroConference on Computer and IT Applications in the Maritime Industries (COMPIT '03), Hamburg, Germany, May 14-17, 2003, pp. 70-83.

22 Introduction  This paper introduces two systems that help crew members newly aboard US Navy ships become familiar with their environment quickly. User: Tell me where Room 101 is!

23 Motivation  Architects of US Navy ships make heavy use of CAD tools to design ship models.  CAD files can be converted to a 3D model format with little effort.  According to the authors’ previous research, this virtual environment did shorten crews’ learning time.

24 Systems introduced  Two systems: MSFT (Multimodal Ship Familiarization Tool) and ISFS (Interactive Ship Familiarization System).  ISFS is a recent transition of MSFT.

25 System Architecture: MSFT  Components run as different processes

26 MSFT  The VE viewer component and the speech interface run as two separate processes.  Speech interface: an all-IBM solution: ViaVoice, IBM’s SMAPI, and IBM’s SRCL grammar. Platform: PIII 500 MHz

27 ISFS  A recent transition of MSFT.  Uses VRML as the 3D modeling language.  Uses JSAPI as the interface to the speech engine. ViaVoice fully supports JSAPI. VRML supports Java as a scripting language.  The rest of the structure is identical to the MSFT system. Platform: Xeon 2.0 GHz -> needs more computing power!

28 Why Choose to Use a Standalone VRML Browser?  Security limitations (details will be discussed later).  VM limitations (details will be discussed later).  Provides opportunities to customize the interface to the VRML browser. In my personal experience, the system usually becomes unstable when the speech engine works with a VRML plug-in via EAI’s Java interface.

29 Security Limitations  The JRE imposes security limitations on Java applets.  JSAPI was unable to establish a connection with the speech engine unless the security settings were explicitly reconfigured.

30 Limited VM  Most VRML browsers’ EAI implementations are based on ActiveX and thus support only Microsoft’s old VM, which doesn’t support most modern Java features. Example: this may force us to use Java AWT instead of Swing, which provides a better GUI.

31 Providing GUI as VUI Fallback  The GUI provides a fallback in case the speech recognizer is having trouble accurately transcribing the user’s voice.  The GUI is adjusted dynamically to provide a one-to-one correspondence with the VUI.
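
One way to get the one-to-one correspondence described on this slide is to derive both the button labels and the accepted spoken phrases from a single set of active commands, so that changing the set updates both interfaces at once. The Java sketch below illustrates that idea under stated assumptions: the class, its methods, and the example phrases are hypothetical, and no actual GUI toolkit or speech engine is involved.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class CommandSet {
    // Currently active commands, keyed by normalized phrase, in insertion order.
    private final Map<String, Runnable> active = new LinkedHashMap<>();

    private static String normalize(String phrase) {
        return phrase.trim().toLowerCase(Locale.ROOT);
    }

    void enable(String phrase, Runnable action) {
        active.put(normalize(phrase), action);
    }

    void disable(String phrase) {
        active.remove(normalize(phrase));
    }

    /** Labels for GUI buttons: exactly the phrases the speech side would accept. */
    List<String> buttonLabels() {
        return new ArrayList<>(active.keySet());
    }

    /** Shared entry point for a recognized utterance and a button click. */
    boolean invoke(String phrase) {
        Runnable action = active.get(normalize(phrase));
        if (action == null) {
            return false;   // not an active command in the current context
        }
        action.run();
        return true;
    }

    public static void main(String[] args) {
        CommandSet commands = new CommandSet();
        commands.enable("Go to Room 101", () -> System.out.println("navigating"));
        System.out.println(commands.buttonLabels());
        commands.invoke("go to room 101");
    }
}
```

Because a click and an utterance go through the same `invoke` call, a user who is misrecognized can press the matching button and get identical behavior.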

32 Paper II Conclusion  The speech interface is needed because the GUI and the VE viewer both rely on direct manipulation and keep the user’s hands too busy.  As HCI becomes increasingly multimodal, care must be taken to integrate the modalities in a natural manner.

33 Future Work  VRML is closer to an object-oriented, tree-structured representation.  It is hard to represent such models in an RDBMS.  Some way must be found to store model data easily and efficiently. Personal thought: use an XML database.
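
To illustrate why a tree-structured scene sits awkwardly in relational tables, the Java sketch below flattens a small node hierarchy into "id,parentId,name" rows, the common adjacency-list encoding. This is a generic illustration, not something taken from either paper; the node names and row format are made up. Each node becomes one row that points at its parent, so rebuilding a subtree needs repeated or recursive lookups, whereas an XML document keeps the nesting directly.

```java
import java.util.ArrayList;
import java.util.List;

/** A node in a simplified scene tree (hypothetical stand-in for a VRML scene graph). */
class SceneNode {
    final String name;
    final List<SceneNode> children = new ArrayList<>();

    SceneNode(String name) {
        this.name = name;
    }

    /** Adds a child and returns this node so calls can be chained. */
    SceneNode add(SceneNode child) {
        children.add(child);
        return this;
    }
}

class SceneFlattener {
    /** Flattens the tree into "id,parentId,name" rows in depth-first order. */
    static List<String> toRows(SceneNode root) {
        List<String> rows = new ArrayList<>();
        visit(root, "NULL", rows);
        return rows;
    }

    private static void visit(SceneNode node, String parentId, List<String> rows) {
        int id = rows.size() + 1;   // ids are assigned in visiting order
        rows.add(id + "," + parentId + "," + node.name);
        for (SceneNode child : node.children) {
            visit(child, String.valueOf(id), rows);
        }
    }

    public static void main(String[] args) {
        SceneNode ship = new SceneNode("Ship")
                .add(new SceneNode("Deck1").add(new SceneNode("Room101")))
                .add(new SceneNode("Deck2"));
        for (String row : toRows(ship)) {
            System.out.println(row);
        }
    }
}
```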

34 Switchable! Discussions

35 Q & A

