HUMANOID ANIMATION DRIVEN BY HUMAN VOICE Thesis Advisor : Dr. Donald P. Brutzman Second Reader : Dr. Xiaoping Yun A Thesis By Ozan APAYDIN, Turkish Navy.

Slides:



Advertisements
Similar presentations
Question examples. Session 1 Objectives Why certify? Positioning of the non-technical version What is Java? Key advantages of Java Java Applications vs.
Advertisements

TSpaces Services Suite: Automating the Development and Management of Web Services Presenter: Kevin McCurley IBM Almaden Research Center Contact: Marcus.
Team RFEyes Presents:. Project SmartCart Sponsored By Dr. Andrew Szeto & The National Science Foundation Tatsani Inkhamfong Andres Flores Alain Iamburg.
Introduction To Java Objectives For Today â Introduction To Java â The Java Platform & The (JVM) Java Virtual Machine â Core Java (API) Application Programming.
© TMC Computer School HC20203 VRML HIGHER DIPLOMA IN COMPUTING Chapter 1 – Introduction to VRML.
Department of Electrical and Computer Engineering He Zhou Hui Zheng William Mai Xiang Guo Advisor: Professor Patrick Kelly ASLLENGE.
The State of the Art in VoiceXML Chetan Sharma, MS Graduate Student School of CSIS, Pace University.
Presentation Outline  Project Aims  Introduction of Digital Video Library  Introduction of Our Work  Considerations and Approach  Design and Implementation.
Virtual reality interfaces in connection with building process simulations. Prof. Nash Dawood Centre for Construction Innovation Research University of.
Spik v1.0 Voice Commands Execution in a Windows Environment Dekel Abelson Eliran Dahan Instructor: Ari Todtfeld.
©TheMcGraw-Hill Companies, Inc. Permission required for reproduction or display. COMPSCI 125 Introduction to Computer Science I.
Optimize tomorrow today. TM 1 Optimize tomorrow today. Arlene Minkiewicz, Chief Scientist PRICE Systems, LLC Software.
RoboTechTronix Ryan Fonnesbeck (CS and CE) Brian Clay (CE) Justin Hansen (CE)
Multimodal Architecture for Integrating Voice and Ink XML Formats Under the guidance of Dr. Charles Tappert By Darshan Desai, Shobhana Misra, Yani Mulyani,
BLUETOOTH CONTROLLER BLUETOOTH CONTROLLER HARDWARE AND LIBRARY HARDWARE AND LIBRARYPROJECT ComFUTURE TECHNOLOGY.
“ Walk to here ” : A Voice Driven Animation System SCA 2006 Zhijin Wang and Michiel van de Panne.
Energy Smart Room GROUP 9 PRESENTERS DEMO DATE SPECIAL THANKS TO ADVISOR PRESENTERS Thursday April 19, 2007 Department of Electrical and Systems Engineering.
Development of mobile applications using PhoneGap and HTML 5
L EC. 01: J AVA FUNDAMENTALS Fall Java Programming.
Bernd Bruegge & Allen H. Dutoit Object-Oriented Software Engineering: Using UML, Patterns, and Java 1 Requirements Analysis Document Template 1.Introduction.
Messaging Technologies Group: Yuzhou Xia Yi Tan Jianxiao Zhai.
Speech Recognition Calculator ECE L02 - Group 8 Alfredo Herrera John Holmes Josh Liang Alex Kee.
Creation of hybrid portlet application for file download using IBM Worklight and IBM Rational Application Developer v9 Gaurav Bhattacharjee Lakshmi Priya.
Programming Tools and Applications. Programming Tools 3D systems – Maya – Blender – Unity – Ogre3D Libraries – OpenGL – Direct3D.
CS-EE 481 Spring Founders Day, 2005 University of Portland School of Engineering Project Pocket Gopher Conversational Learning Agent Team Josh Jones.
Daniel Arnett · Joseph Vanciel · Brian Krueger Advisor: Dr. Samuel Richie Sponsor: Workforce Central Florida Mentor: Sean Donovan 4 th Annual Senior Design.
Patient Location via Received Signal Strength (RSS) Analysis Dan Albano, Chris Comeau, Jeramie Ianelli, Sean Palastro Project Advisor Taib Znati Tuesday.
11.10 Human Computer Interface www. ICT-Teacher.com.
Group Members: Group Members:.  Introduction  Current Scenario  Proposed Solution  Block Diagram  Technical Implementation  Hardware & Software.
Classification of Computers
Spoken dialog for e-learning supported by domain ontologies Dario Bianchi, Monica Mordonini and Agostino Poggi Dipartimento di Ingegneria dell’Informazione.
CMU Shpinx Speech Recognition Engine Reporter : Chun-Feng Liao NCCU Dept. of Computer Sceince Intelligent Media Lab.
Mobile Navigation With SVG Christian Schmitt SVG Open 2005.
JSF Introduction Copyright © Liferay, Inc. All Rights Reserved. No material may be reproduced electronically or in print without written permission.
Speech Interface to Virtual Reality Applications Reporter Chun-Feng Liao Authors Wauchope, K., S. Everett, D. Tate, T. Maney M.Cernak, A.Sannier.
Advanced Java
Engineering Senior Presentations Spring Founder’s Day, 2006 University of Portland School of Engineering Havoc Command Authors Ray Dehler Brandon.
IBM - CVUT Student Research Projects Google maps with voice Martin Absolon Ivo Čermák
France Télécom R&D Advanced Interactive Content Olivier Avaro Chairman MPEG Systems and AIC Initiative France Telecom R&D.
MULTIMEDIA INPUT / OUTPUT TECHNOLOGIES INTRODUCTION 6/1/ A.Aruna, Assistant Professor, Faculty of Information Technology.
GemIsland Prepared by: Areen Jondi Diala Hamadneh Supervised by: Dr. Raed Alqadi Dr. Luai Malhis.
Fundamentals of Information Systems, Sixth Edition1 Natural Language Processing and Voice Recognition Processing that allows the computer to understand.
Speech Recognition MIT SMA 5508 Spring 2004 Larry Rudolph (MIT)
Voice-based generic UPnP Control Point Andreas BobekUniversity of Rostock Faculty of Computer Science and Electrical Engineering Andreas Bobek, Hendrik.
Marketing Development Block 4 Dr. Uma Kanjilal. Stages of a Multimedia Project  Planning and costing- infrastructure, time, skills etc.  Designing and.
Controlling Computer Using Speech Recognition (CCSR) Creative Masters Group Supervisor : Dr: Mounira Taileb.
1 Location Based File Exchange Controlled By Speech(LBFE-S) IT University Of Copenhagen Thesis Presentation By: Muhammed Marouf
PROPOSAL : The Use of Voice Command in Operating Personal Computer By : COLLEGE OF ART & SCIENCE UNIVERSITI UTARA MALAYSIA STIW5023 ADVANCED PROGRAMMING.
Using Voice to Solve Ergonomic Problems Dr. William Lenharth, CHFP UNH – Project54.
People today are limited to a mouse and keyboard when using a computer There are little to no alternatives out in the market at this moment Natural human.
Basic Components used in Electronics. Used in electronic circuits to charge up to a given potenti al.
Language Technologies Capability Demonstration Alon Lavie, Lori Levin, Alex Waibel Language Technologies Institute Carnegie Mellon University CATANAL Planning.
Accelerometer based motion gestures for mobile devices Presented by – Neel Parikh Advisor Committee members Dr. Chris Pollett Dr. Robert Chun Dr. Mark.
Speech User Interface 10/26/2010. Pervasive Information Access Information & Services I-Land vision by Streitz, et. al.
Distributed Systems Architectures. Topics covered l Client-server architectures l Distributed object architectures l Inter-organisational computing.
Application Sharing Bhavesh Amin Casey Miller Casey Miller Ajay Patel Ajay Patel Bhavesh Thakker Bhavesh Thakker.
How can speech technology be used to help people with disabilities?
Hammoudeh S. Alamri1, Balsam A
Hand Gestures Based Applications
11.10 Human Computer Interface
Computer Parts There are many parts that work together to make a computer work.
Dialog Design 4 Speech & Natural Language
Resources and Schedule
Group 8 Nurul Fathiyah Abdul Muen
Information Technology and AISs
SpeechClipse v 1.0 “An Effective Plug-In for the Eclipse IDE”
Mobile Reference Diagram Template
Calypso Service Architecture
from Lutz Dietrich and Hans Kluge
VoiceXML An investigation Author: Mya Anderson
Presentation transcript:

HUMANOID ANIMATION DRIVEN BY HUMAN VOICE Thesis Advisor : Dr. Donald P. Brutzman Second Reader : Dr. Xiaoping Yun A Thesis By Ozan APAYDIN, Turkish Navy March 2002

GOALS Perform a background search on speech recognition technology to find a suitable component for this project, Develop a VUI (Voice User Interface) that maps between human voice commands and a set of animations of the avatar and provides access to the application, Build a motion library to animate available humanoids, Demonstrate interchangeability of the behaviors and the humanoids, Create humanoid animation driven by a human voice.

INTRODUCTION HUMAN VOICEVOICE RECEIVER MEDIUM (AIR) SPEECH RECOGNITION APPLICATION RULE CHOOSER GEOMETRY Rule A Rule B Rule C. Animation X Animation Y Animation Z. COMPUTER ENVIRONMENT

SPEECH RECOGNITION TECHNOLOGY (SRT) HISTORY – THE FIRST A toy company logged the first success story in the field of speech recognition decades before major research in the area was considered. “Radio Rex” was a celluloid dog that responded to its name. Lacking the computation power that powers recognition devices today, Radio Rex was a simple electromechanical device. The dog was held within its house by an electromagnet. As current flowed through a circuit bridge, the magnet was energized. The bridge was sensitive to 500 cps of acoustic energy. The energy of the vowel sound of the word “Rex” caused the bridge to vibrate, breaking the electrical circuit, and allowing a spring to push Rex out of his house.

SRT - BASIC CONCEPTS Grammar, Training, Speaker Dependence vs. Independence, Natural Language Commands, Accuracy.

SRT – APPLICATION FEATURES Command & Control Dictation Synthesizing

SRT – FACTORS AFFECTING ACCURACY Environment Hardware Speaker/User Vocabulary Size Grammar Training

SRT – LIMITATIONS Free-form Speech Input Mistakes oRejection oMisrecognition oMisfire

SRT POTENTIALS VUIs have their greatest potential in the following cases : o Users with various disabilities that prevent them from using a mouse/or keyboard. o All users, with or without disabilities, who are in an eyes busy, hands-busy situation. o Users who don’t have access to a keyboard and/or a monitor. For example accessing a system through a payphone.

JAVA SPEECH API “The Java Speech API, developed by Sun Microsystems in cooperation with speech technology companies, defines a software interface that allows developers to take advantage of speech technology for personal and enterprise computing.”

JAVA SPEECH API Cross-Platform, Cross-Vendor Support for Speech Synthesizers and for both Command & Control and Dictation Speech Recognizers Integration with Other Capabilities of the Java Platform

IBM VIAVOICE SDK Implementation of Java Speech API Provides an access to IBM ViaVoice engine Requires IBM ViaVoice or ViaVoice Runtimes

H-ANIM WORKING GROUP GOALS Specify a way of defining interchangeable humanoids and animations Allow people to author humanoids and animations independently

H-ANIM WORKING GROUP SPECIFICATIONS H-Anim 1.0 Specification H-Anim 1.1 Specification H-Anim 2001 Specification (Draft)

MODELS ?

INTERCHANGEABLE ACTORS Putting the avatars and their behaviors together in such a way that the final product should be: Efficient, Easy to expand.

Creating behavior prototypes, Converting to X3D native tags, Forming a switchable design for avatars, Employing dynamic routing. INTERCHANGEABLE ACTORS

SYSTEM INFRASTRUCTURE VIAVOICE ENGINE VIAVOICE SDK (JAVA SPEECH API IMPLEMENTATION) RECOGNIZER AND SERVER ORDER EXECUTOR AND CLIENT VRML SCENE INVOKER CLIENT BROWSER

FINAL PRODUCT Hybrid (VUI + GUI), Networked (UDP/IP), User-Independent, Mono-Lingual, Multi-Platform.

FINAL PRODUCT

DEMO

CONCLUSIONS Speech Recognition Technology (SRT) can be integrated into Virtual Environments (VEs). Hybrid (VUI + GUI) applications can be very powerful. Humanoids and animation behaviors can be designed interchangeably.

FUTURE WORK Simulation of a scenario or a game, Improving networking, Expanding motion library, Combination of animation behaviors. For example : Walk & Jump Thesis Follower : Ekrem SERIN