1 Component Description Multimodal Interface Carnegie Mellon University Prepared by: Michael Bett 3/26/99.

Presentation transcript:

1 Component Description Multimodal Interface Carnegie Mellon University Prepared by: Michael Bett 3/26/99

2 1 - Overview
Description of the Multimodal Toolkit (MMI)
What MMI is:
- Integrated speech, handwriting, and gesture recognizer
- Java-based API
- Integrated recording feature
- Plug-n-play recognizer interface: allows recognizers to be replaced
- Internet-enabled interface: recognizers may run remotely over the internet
- Simultaneous multiple-user support
- Supports natural interface development

3 2 - Architecture Overview
[Diagram labels: Multimodal Applet; Multimodal Server; Janus/Speech Recognizer; Handwriting Recognizer; Gesture Recognizer; inputs: Speech, Handwriting, Gestures; speech resources: Vocabulary, Acoustic Model, Language Model; Sample Application Which Uses Multimodal Error Repair]
- MMI is a toolkit that allows multiple modalities to be easily integrated into applications.
- Applications can mix modalities (speech, gesture, and handwriting).
- The Java-based API communicates directly with each recognizer.
- The multimodal applet is the user interface; the applet window presents a view onto a domain-dependent representation of application data and state in the form of objects to be manipulated.

4 3 - Component Description
The modalities have the following levels of support in the multimodal toolkit:

5 4 - External Interfaces
The user defines their grammar using six probabilistically weighted node types:
- A Toplevel represents an entire input model and contains one or more sequences, each of which contains exactly one AFrame.
- An AFrame represents an action frame and contains one or more sequences, each of which consists of one or more PSlots.
- A PSlot represents a parameter slot and contains one or more UnimodalNodes (at most one for each input modality).
- A UnimodalNode specifies a sub-grammar for a single input modality and has the same structure as a NonTerm, with the addition of a label specifying the modality.
- A NonTerm is a non-terminal node consisting of one or more sequences, each of which contains zero or more NonTerms or Literals.
- A Literal is a terminal node containing a text string representing one or more input tokens.
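The six node types can be pictured as a small class hierarchy. The following is only a sketch of the containment rules listed above, not the toolkit's actual classes: every class, field, and method name (GrammarSketch, Symbol, seq, byModality, and so on) is hypothetical, and storing the probabilistic weight directly on each node is an assumption, since the slide does not say where the weights are attached.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of the six grammar node types; all names are illustrative. */
public class GrammarSketch {

    enum Modality { SPEECH, HANDWRITING, GESTURE }

    /** Base type. Assumption: the probabilistic weight lives on the node itself. */
    static abstract class Node {
        final double weight;
        Node(double weight) { this.weight = weight; }
    }

    /** Marker for the two node types allowed inside a NonTerm sequence. */
    interface Symbol { }

    /** Terminal node: a text string representing one or more input tokens. */
    static final class Literal extends Node implements Symbol {
        final String text;
        Literal(double weight, String text) { super(weight); this.text = text; }
    }

    /** Non-terminal: one or more sequences, each of zero or more NonTerms or Literals. */
    static final class NonTerm extends Node implements Symbol {
        final List<List<Symbol>> sequences;
        NonTerm(double weight, List<List<Symbol>> sequences) {
            super(weight);
            this.sequences = requireNonEmpty(sequences, "NonTerm needs a sequence");
        }
    }

    /** Same structure as a NonTerm, plus a label naming the input modality. */
    static final class UnimodalNode extends Node {
        final Modality modality;
        final List<List<Symbol>> sequences;
        UnimodalNode(double weight, Modality modality, List<List<Symbol>> sequences) {
            super(weight);
            this.modality = modality;
            this.sequences = requireNonEmpty(sequences, "UnimodalNode needs a sequence");
        }
    }

    /** Parameter slot: one or more UnimodalNodes, at most one per modality. */
    static final class PSlot extends Node {
        final Map<Modality, UnimodalNode> byModality = new EnumMap<>(Modality.class);
        PSlot(double weight, UnimodalNode... nodes) {
            super(weight);
            if (nodes.length == 0) throw new IllegalArgumentException("PSlot needs a UnimodalNode");
            for (UnimodalNode n : nodes) {
                if (byModality.put(n.modality, n) != null) {
                    throw new IllegalArgumentException("duplicate modality: " + n.modality);
                }
            }
        }
    }

    /** Action frame: one or more sequences, each of one or more PSlots. */
    static final class AFrame extends Node {
        final List<List<PSlot>> sequences;
        AFrame(double weight, List<List<PSlot>> sequences) {
            super(weight);
            this.sequences = requireNonEmpty(sequences, "AFrame needs a sequence");
            for (List<PSlot> s : sequences) requireNonEmpty(s, "AFrame sequence needs a PSlot");
        }
    }

    /** Entire input model. Each sequence holds exactly one AFrame, so a flat list suffices. */
    static final class Toplevel extends Node {
        final List<AFrame> frames;
        Toplevel(double weight, AFrame... frames) {
            super(weight);
            this.frames = requireNonEmpty(Arrays.asList(frames), "Toplevel needs an AFrame");
        }
    }

    static <T> List<T> requireNonEmpty(List<T> list, String message) {
        if (list == null || list.isEmpty()) throw new IllegalArgumentException(message);
        return list;
    }

    /** Convenience helper for building one sequence of symbols. */
    static List<Symbol> seq(Symbol... items) { return Arrays.asList(items); }

    public static void main(String[] args) {
        NonTerm command = new NonTerm(1.0, Collections.singletonList(seq(new Literal(1.0, "delete"))));
        UnimodalNode spoken = new UnimodalNode(0.7, Modality.SPEECH,
                Collections.singletonList(seq(command)));
        UnimodalNode drawn = new UnimodalNode(0.3, Modality.GESTURE,
                Collections.singletonList(seq(new Literal(1.0, "cross-out"))));
        PSlot action = new PSlot(1.0, spoken, drawn);
        AFrame frame = new AFrame(1.0, Collections.singletonList(Collections.singletonList(action)));
        Toplevel model = new Toplevel(1.0, frame);
        System.out.println("frames: " + model.frames.size()
                + ", modalities in slot: " + action.byModality.keySet());
    }
}
```

Keying the PSlot contents by modality in an EnumMap is one way to enforce the "at most one UnimodalNode per input modality" rule at construction time; the example grammar in main (a spoken "delete" or a cross-out gesture) is invented for illustration.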

6 4 - External Interfaces
The Multimodal Server sends a series of points to the pen and gesture recognizers. The audio is sent to the speech recognizer. The pen, gesture, and speech recognizers return their hypotheses to the multimodal toolkit, which is responsible for integrating the results in an optimizing programming search, as shown in the slide's figure. [Minh Tue Vo, Dissertation, CMU, 1998]
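The routing described on this slide can be sketched as follows. This is an illustration under stated assumptions, not the MMI API: the Recognizer interface, the Hypothesis class, the Server class, and the input types (an array of x/y points, a byte array of audio) are all hypothetical. The integration search is left out entirely because the slide gives no detail on it; the sketch only collects the hypotheses that would feed that step.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical routing sketch; none of these names come from the MMI API. */
public class RoutingSketch {

    enum Modality { SPEECH, HANDWRITING, GESTURE }

    /** One scored guess returned by a recognizer. */
    static final class Hypothesis {
        final Modality modality;
        final String text;
        final double score;
        Hypothesis(Modality modality, String text, double score) {
            this.modality = modality;
            this.text = text;
            this.score = score;
        }
    }

    /** Plug-in point: any implementation that maps raw input to a hypothesis can be swapped in. */
    interface Recognizer<I> {
        Hypothesis recognize(I input);
    }

    /** Stand-in for the multimodal server's dispatch role. */
    static final class Server {
        private final Recognizer<int[][]> pen;
        private final Recognizer<int[][]> gesture;
        private final Recognizer<byte[]> speech;

        Server(Recognizer<int[][]> pen, Recognizer<int[][]> gesture, Recognizer<byte[]> speech) {
            this.pen = pen;
            this.gesture = gesture;
            this.speech = speech;
        }

        /** Points go to both the pen and gesture recognizers; audio goes to the speech recognizer. */
        List<Hypothesis> dispatch(int[][] points, byte[] audio) {
            List<Hypothesis> hypotheses = new ArrayList<>();
            if (points != null) {
                hypotheses.add(pen.recognize(points));
                hypotheses.add(gesture.recognize(points));
            }
            if (audio != null) {
                hypotheses.add(speech.recognize(audio));
            }
            return hypotheses; // input to the integration search, which is not modeled here
        }
    }

    public static void main(String[] args) {
        // Stub recognizers with made-up results, standing in for NPen++, the gesture recognizer, and Janus.
        Server server = new Server(
                pts -> new Hypothesis(Modality.HANDWRITING, "x", 0.4),
                pts -> new Hypothesis(Modality.GESTURE, "cross-out", 0.8),
                pcm -> new Hypothesis(Modality.SPEECH, "delete this", 0.9));
        for (Hypothesis h : server.dispatch(new int[][] {{0, 0}, {5, 5}}, new byte[] {1, 2, 3})) {
            System.out.println(h.modality + ": " + h.text + " (" + h.score + ")");
        }
    }
}
```

Because each recognizer sits behind the same small interface, a local stub, a replacement recognizer, or a proxy for one running on a remote machine could be passed to the Server constructor without changing the dispatch code.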

7 5 - Existing Software “Bridges” The multimodal toolkit uses a Java API which allows applets or applications to incorporate multimodal functionality

8 6 - Information Flow
Part 1 - Specify how other CPOF components can send and receive data to your system - Please be explicit
  Components may directly interface with the multimodal server.
Part 2 - What are the inputs to your system - Please specify formats and protocol - provide details
  Multimodal grammar
Part 3 - What are the outputs of your system - Please specify format and protocol - provide details
  Hypothesis according to the multimodal grammar

9 7 - Plug-n-play
Part 1 - We have not currently identified how our components interact with other CPOF components. Please present a diagram that shows this interaction
  TBD
Part 2 - Are there components in your system that are functionally “similar” to another CPOF component?
  TBD
Part 3 - Are any of your components complementing other CPOF components? (e.g., ZUI and Sage/Visage)
  TBD

Operating Environments and COTS

Component Name      Required Hardware   Operating System         Language    Required COTS
Multimodal Server   PC or Sun           Independent              Java        JDK 1.1.*
Janus               Sun - Ultra 60      Solaris                  Tcl/Tk, C   Tcl/Tk
NPen++              Sun or PC           Solaris or Windows NT    C++         None
Gesture Recognizer  Sun or PC           Solaris or Windows NT    C++         None

Hardware Platform Requirement
Specify the hardware required to support your system:
- MMI can run on a PC with a minimum of 32 MB of RAM and a 200 MHz processor.
- The Speech Recognizer requires a dual-processor Sun Ultra 60 with a minimum of 500 MB of RAM. (The recognizer currently under development will require a 500 MHz Pentium III with 128 MB minimum, 256 MB preferred.)
- Video capture cards, Sound Blaster-compatible sound cards, tabletop and lapel microphones, and pan-tilt and stationary cameras are required.