Frontiers in Interaction: The Power of Multimodal Standards. Deborah Dahl, Principal, Conversational Technologies; Chair, W3C Multimodal Interaction Working Group.

Presentation transcript:

Frontiers in Interaction: The Power of Multimodal Standards. Deborah Dahl, Principal, Conversational Technologies; Chair, W3C Multimodal Interaction Working Group. VoiceSearch 2009, March 2-4, San Diego, California. Conversational Technologies.

User Experience vs. Complexity 1: The command prompt – technically simple, but a bad user experience.

User Experience vs. Complexity 2: The iPhone Google Earth application – technically complex, but a good user experience.

As the user experience becomes more natural and intuitive, the supporting technology becomes more complex.

Multimodality offers an opportunity for a more natural user experience, but …

There are a lot of technologies involved in multimodal applications!

Parts of a Multimodal Application
- User experience
- Technology: technologies for managing the interaction, technologies for capturing and interpreting user input, technologies for capturing sensor data (geolocation), technologies for presenting output, communication protocols, devices, servers, and networks

Multimodal applications require coordinating many forms of user and sensor input – speech, audio, stylus, touch, accelerometer, keyboard/keypad, mouse/touchpad, camera, geolocation, handwriting recognition, speaker verification, signature verification, fingerprint identification, and more…

…while presenting the user with a natural and usable interface

A multimodal application can include many kinds of input, used for many purposes

[Diagram: An Interaction Manager coordinates many kinds of input – speech, text, mouse, handwriting, accelerometer, geolocation, vital signs, audio recording, drawing, video, photograph, fingerprint, speaker verification, and face identification – serving different purposes: user intent, sensor data, recording, and identification.]

It can include many technologies. [Diagram: An Interaction Manager connected to speech recognition, handwriting recognition, accelerometer, geolocation, touchscreen, keypad, and fingerprint recognition components.]

Getting everything to work together is complicated.

One way to make things easier is to represent the same information from different modalities in the same format.

We need a common language for representing the same information from different modalities.

EMMA (Extensible MultiModal Annotation): a uniform representation for multimodal information. [Diagram: The same components – speech recognition, handwriting recognition, accelerometer, geolocation, touchscreen, keypad, fingerprint recognition – now communicate with the Interaction Manager using EMMA.]

EMMA can represent many types of information because it includes both general annotations and application-specific semantics.

General Annotations
- Confidence
- Timestamps
- Alternative interpretations
- Language
- Medium (visual, acoustic, tactile)
- Modality (voice, keys, photograph)
- Function (dialog, recording, verification, …)
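To make these concrete, here is a minimal sketch, not taken from the original slides, of an EMMA 1.0 document for a spoken yes/no confirmation carrying confidence, timestamps (emma:start/emma:end in milliseconds since the epoch), alternative interpretations (emma:one-of), language, medium, mode, and function; the id values, timestamps, confidences, token strings, and the <answer> element are invented for illustration:

  <emma:emma version="1.0"
      xmlns:emma="http://www.w3.org/2003/04/emma">
    <!-- two alternative recognition results for one spoken confirmation -->
    <emma:one-of id="alts"
        emma:medium="acoustic" emma:mode="voice"
        emma:function="dialog" emma:verbal="true"
        emma:lang="en-US"
        emma:start="1236027600000" emma:end="1236027601000">
      <emma:interpretation id="int1"
          emma:confidence="0.80" emma:tokens="yes">
        <answer>yes</answer>
      </emma:interpretation>
      <emma:interpretation id="int2"
          emma:confidence="0.20" emma:tokens="no">
        <answer>no</answer>
      </emma:interpretation>
    </emma:one-of>
  </emma:emma>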

Representing user intent

Example: Travel Application “I want to go from Boston to Denver on March 11”

Application Semantics (modality-independent). [The slide shows the EMMA markup for this request, with Boston as the origin and Denver as the destination.] This could have originated from speech, keyboard, pull-down menus, handwriting… But it looks (almost) the same in EMMA because it has the same meaning.
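Because the markup itself did not survive in this transcript, here is a hedged sketch of what such a modality-independent EMMA interpretation could look like; the application-specific element names (origin, destination, date) and the date format are illustrative rather than taken from the slide:

  <emma:emma version="1.0"
      xmlns:emma="http://www.w3.org/2003/04/emma">
    <emma:interpretation id="trip1">
      <!-- application-specific semantics; element names are illustrative -->
      <origin>Boston</origin>
      <destination>Denver</destination>
      <date>2009-03-11</date>
    </emma:interpretation>
  </emma:emma>

Whatever modality produced the request, this payload stays the same; only the EMMA annotations describing the source (such as emma:medium and emma:mode) change.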

The same meaning with speech and mouse input. [The slide shows two EMMA documents side by side – one produced from speech, one from mouse input – both with Boston as the origin and Denver as the destination.]
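A sketch of how the two interpretations could differ only in their modality annotations, shown as fragments; the mode value for mouse input ("gui") and the token string are assumptions, since the slide's exact annotations are not preserved here:

  <!-- produced from speech -->
  <emma:interpretation id="speech1"
      xmlns:emma="http://www.w3.org/2003/04/emma"
      emma:medium="acoustic" emma:mode="voice"
      emma:tokens="from boston to denver">
    <origin>Boston</origin>
    <destination>Denver</destination>
  </emma:interpretation>

  <!-- produced from mouse selections in a form (mode value assumed) -->
  <emma:interpretation id="mouse1"
      xmlns:emma="http://www.w3.org/2003/04/emma"
      emma:medium="tactile" emma:mode="gui">
    <origin>Boston</origin>
    <destination>Denver</destination>
  </emma:interpretation>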

Handwriting, too. [The slide shows the corresponding EMMA document for handwritten input, again with Boston as the origin and Denver as the destination.]

Identifying users – biometrics

The same identification with face recognition or speaker identification:

  <emma:interpretation id="int1" emma:confidence="0.75"
      emma:medium="visual" emma:mode="photograph"
      emma:verbal="false" emma:function="identification">
    Mary Smith
  </emma:interpretation>

  <emma:interpretation id="int1" emma:confidence="0.80"
      emma:medium="acoustic" emma:mode="voice"
      emma:verbal="false" emma:function="identification">
    Mary Smith
  </emma:interpretation>

Other Modalities
- Geolocation – latitude, longitude, speed
- Text – plain old typing
- Accelerometer – position of device
- …
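As a purely illustrative sketch of how a sensor reading might be wrapped (EMMA leaves the payload vocabulary to the application; the mode value, element names, and coordinates below are assumptions, not taken from the slides):

  <emma:emma version="1.0"
      xmlns:emma="http://www.w3.org/2003/04/emma">
    <!-- mode value and payload element names are assumptions -->
    <emma:interpretation id="loc1" emma:mode="gps" emma:verbal="false">
      <latitude>32.7157</latitude>
      <longitude>-117.1611</longitude>
      <speed>0</speed>
    </emma:interpretation>
  </emma:emma>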

More Information
- The specification:
- The Multimodal Interaction Working Group:
- An open source implementation: