Speech User Interfaces


Speech User Interfaces
Includes material from a UPA workshop and a CHI 2000 tutorial by Jennifer Lai.
(Presenter notes: include more speech UI examples & less of the SpeechActs video, showing only the usage parts; show an example of a multimodal UI in action.)

Outline
- Motivation for speech UIs
- Speech recognition
- UI problems with speech UIs
- SpeechActs: guidelines for speech UIs
- Speech UI design tools
- Multimodal UIs

Motivation for Speech UIs: Pervasive Information Access
(Figure: the I-Land vision by Streitz et al., ubiquitous access to information & services)

UIs in the Pervasive Computing Era
Future computing devices won't have the same UI as current PCs:
- wide range of devices, small or embedded in the environment
- often with "alternative" I/O (pens, speech, vision, ...) & without screens
- information appliances
(Figure: the I-Land vision by Streitz et al.)

Information Access via Speech
Example request: "Read my important email." How do we design for this?

Industry Leaders
- Nuance Corporation
- Applications: TellMe, …
- Users: government, …
- Computers: Microsoft, IBM, …

Speech UI Motivation
- Smaller devices -> difficult I/O
- High speed: people can talk at ~90 wpm
- "Virtually unlimited" set of commands
- Frees the hands and eyes: imagine you are working on your car & need to know something from the manual
- Natural: evolution selected for speech; reading, writing, & typing are not (too new)

Why Are Speech UIs Hard to Get Right?
- Speech recognition is far from perfect: imagine inputting commands with the mouse & getting the wrong result 5-20% of the time
- Speech UIs have no visible state: you can't see what you have done before or what effect your commands have had
- Speech UIs are hard to learn: how do you explore the interface? How do you find out what you can say?

Speech UIs Require
- Speech recognition: the computer understanding what the customer is saying
- Speech production (or synthesis): the computer talking to the customer

Speech Recognition
- Continuous vs. non-continuous
- Speaker-independent vs. speaker-dependent
- Speech is often misunderstood even by people, who recover via feedback: speech, facial expressions, & gesture
- Recognizers trained with real samples often show gender-based problems
- Based on probabilities (HMMs, Bayes' rule) over trigrams of sounds or words
- Several popular recognizers: Nuance, SpeechWorks, IBM ViaVoice
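The trigram idea can be made concrete with a tiny sketch. The toy corpus and add-one smoothing below are invented for illustration; a real recognizer trains on far more data and combines this language-model score with an acoustic model inside an HMM decoder.

```python
from collections import defaultdict

# Minimal trigram language model sketch (hypothetical toy corpus).
# A recognizer scores candidate transcriptions roughly as P(words) * P(audio | words);
# this shows only the P(words) part, estimated from trigram counts with add-one smoothing.
corpus = "read my important email please read my new email".split()

trigrams, bigrams = defaultdict(int), defaultdict(int)
for w1, w2, w3 in zip(corpus, corpus[1:], corpus[2:]):
    trigrams[(w1, w2, w3)] += 1
    bigrams[(w1, w2)] += 1
vocab = set(corpus)

def p_next(w1, w2, w3):
    """P(w3 | w1, w2) with add-one smoothing."""
    return (trigrams[(w1, w2, w3)] + 1) / (bigrams[(w1, w2)] + len(vocab))

def sentence_prob(words):
    """Score a candidate transcription under the trigram model."""
    p = 1.0
    for w1, w2, w3 in zip(words, words[1:], words[2:]):
        p *= p_next(w1, w2, w3)
    return p

# The model prefers the word sequence it has seen evidence for.
print(sentence_prob("my new email".split()))  # higher
print(sentence_prob("my new mail".split()))   # lower
```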

Speech Production
- Three frequency regions of great intensity (formants), visible on a spectrogram, come from the larynx, throat, & mouth
- Two formants are enough for recognition, but the result sounds "tinny"
- Synthesizers can add emotional affect to speech
- Demo: anger, disgust, gladness, sadness, fear, & surprise http://cahn.www.media.mit.edu/people/cahn/emot-speech.html
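A rough sense of formant-based production can be sketched in a few lines. The formant frequencies below are approximate textbook values for the vowel /a/, and the output file name is arbitrary; this is only a toy approximation, not how the linked demo was built.

```python
import math
import struct
import wave

# Crude formant-synthesis sketch: three damped sine "formants" re-excited at a
# fixed pitch rate produce a rough, vowel-like buzz. All values are approximate.
SAMPLE_RATE = 16000
PITCH_HZ = 120                                       # glottal pulse rate
FORMANTS = [(700, 1.0), (1200, 0.5), (2600, 0.25)]   # (frequency Hz, relative amplitude)

samples = []
for n in range(SAMPLE_RATE):            # one second of audio
    t = n / SAMPLE_RATE
    phase = (t * PITCH_HZ) % 1.0        # time since the last glottal pulse
    value = sum(amp * math.exp(-30 * phase) * math.sin(2 * math.pi * freq * t)
                for freq, amp in FORMANTS)
    samples.append(int(max(-1.0, min(1.0, value)) * 32767))

with wave.open("vowel_sketch.wav", "wb") as f:
    f.setnchannels(1)
    f.setsampwidth(2)
    f.setframerate(SAMPLE_RATE)
    f.writeframes(struct.pack("<%dh" % len(samples), *samples))
```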

Recognition Problems
- Getting good recognition
  - humans: < 1% error rate on dictation
  - top recognition systems get <1-X% error rates; computers don't use much context
  - the key is to be application-specific for lower error rates
- Background noise: even worse recognition rates (20-40% error)
- Speed: better as hardware gets faster; in 10 years we've gone from requiring 5 high-end workstations to some speech systems running on laptops or even PDAs
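Error rates like the ones above are usually reported as word error rate (WER): the word-level edit distance between the recognized text and a reference transcript, divided by the reference length. A small sketch, with made-up example strings:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + insertions + deletions) / reference length,
    computed with standard Levenshtein dynamic programming over words."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between the first i reference words and first j hypothesis words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution / match
    return d[len(ref)][len(hyp)] / len(ref)

# One substitution in five words -> 20% WER
print(word_error_rate("read my important email please",
                      "read my important mail please"))
```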

More Recognition Problems
- Isolated, short words are difficult, and common words tend to become short
- Segmentation: "silly" versus "sill lea"
- Spelling: "mail" vs. "male" -> need to understand the language
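Distinguishing homophones like "mail"/"male" takes language context rather than acoustics. A minimal sketch of that idea, using hypothetical bigram counts standing in for a trained language model:

```python
# Homophone disambiguation sketch: the acoustics for "mail" and "male" are the
# same, so pick whichever word the surrounding context makes more likely.
# The bigram counts below are hypothetical.
bigram_counts = {
    ("my", "mail"): 40, ("my", "male"): 1,
    ("a", "male"): 25, ("a", "mail"): 2,
}

def choose_homophone(previous_word, candidates):
    return max(candidates, key=lambda w: bigram_counts.get((previous_word, w), 0))

print(choose_homophone("my", ["mail", "male"]))  # -> mail
print(choose_homophone("a", ["mail", "male"]))   # -> male
```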

Speech UI Problems
- Speech UI no-nos
  - modes (with no feedback): certain commands only work in specific states
  - deep hierarchies (aka "voice mail hell")
- Verbose feedback wastes time/patience: only confirm consequential things; use meaningful, short cues
- Interruption: half-duplex communication (i.e., no barge-in support)
- Too much speech on the part of the customer is tiring
- Speech takes up space in working memory, which can cause trouble during problem solving
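The "only confirm consequential things" guideline can be sketched as a simple dialogue policy. The action names and risk set below are invented for illustration:

```python
# Sketch of "only confirm consequential things": explicit confirmation for
# destructive actions, a short cue for everything else. Action names are hypothetical.
CONSEQUENTIAL = {"delete_message", "send_money", "cancel_meeting"}

def respond(action: str, target: str) -> str:
    if action in CONSEQUENTIAL:
        return f"Are you sure you want to {action.replace('_', ' ')} '{target}'?"
    return "OK."  # meaningful, short cue instead of verbose confirmation

print(respond("read_message", "note from Kate"))    # -> OK.
print(respond("delete_message", "note from Kate"))  # -> Are you sure ...
```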

SpeechActs: Guidelines for Speech UIs
- Speech interface to computer tools: email, calendar, weather, stock quotes
- Establish common ground & shared context: make sure people know where they are in the conversation
- Pacing: recognition delays are unnatural, so make it clear when one occurs
- Barge-in lets the user interrupt, as in real conversations
- Tapering of prompts
- Progressive assistance: short error messages at first, longer ones when the user needs more help
- Implicit confirmation: fold the confirmation into the next prompt
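Progressive assistance can be sketched as error prompts that grow with the number of consecutive failures. The prompt wording below is invented and is not SpeechActs' own:

```python
# Progressive assistance sketch: a short error message first, more help on
# repeated failures. Prompt text is invented, not SpeechActs' actual wording.
PROMPTS = [
    "Sorry?",
    "Sorry, I didn't catch that. You can say a command like 'read my mail'.",
    "I still didn't understand. You can say 'read my mail', 'calendar', "
    "'weather', or 'stock quotes'. Say 'help' for the full list.",
]

def error_prompt(consecutive_errors: int) -> str:
    # Cap at the most detailed prompt once the user clearly needs more help.
    return PROMPTS[min(consecutive_errors, len(PROMPTS)) - 1]

for errors in (1, 2, 3, 4):
    print(error_prompt(errors))
```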

One Vision of Future User Interfaces
- Star Trek style UI: verbally ask the computer for information
- May be common in mobile/hands-busy situations
- Problem: hard to design, build, & use! Requires perfect speech recognition & language understanding

Our Vision of Future User Interfaces
- Multimodal, context-aware UIs
  - Multimodal UIs use multiple input modalities (speech & gesture) to disambiguate: the user says "move it to this screen" while pointing
  - Context-aware apps can be aware of location, the user, what they are doing, …
  - If people are talking, don't rely on speech I/O
- Problem: how to prototype & test new ideas? Informal UI design tools! Combine Wizard of Oz & informal storyboarding

Presenter notes: At a recent conference on user interface software and technology, Jim Foley, a pioneer in the HCI field, noted that in the 20 years since "Put That There," there are still no tools that would let application designers build it. Multimodal UIs allow computers to be used in more situations & places and by more diverse people. Ame's boyfriend got a new Audi whose manual is on CD; trying to repair the car from the CD takes 2 people & a laptop. These previous projects have influenced this new work: WoZ from SUEDE, storyboarding from DENIM, etc.
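The "move it to this screen" example can be sketched as a tiny fusion step: the speech channel yields a command with unresolved deictic references, and the pointing channel supplies the targets. The event format, slot names, and time window below are all hypothetical:

```python
# Multimodal fusion sketch: resolve deictic words in a spoken command
# ("move it to this screen") against pointing gestures that happened close in
# time to the utterance. Event format and window size are hypothetical.
FUSION_WINDOW_S = 2.0

def fuse(spoken_command: dict, pointing_events: list) -> dict:
    """Pair deictic slots with nearby pointing events in temporal order."""
    nearby = sorted((e for e in pointing_events
                     if abs(e["time"] - spoken_command["time"]) < FUSION_WINDOW_S),
                    key=lambda e: e["time"])
    resolved = dict(spoken_command)
    deictic_slots = [slot for slot, value in spoken_command.items() if value == "<deictic>"]
    for slot, event in zip(deictic_slots, nearby):
        resolved[slot] = event["target"]
    return resolved

# "Move it to this screen", with a point at a photo and then at a screen.
speech = {"action": "move", "object": "<deictic>", "destination": "<deictic>", "time": 10.0}
pointing = [{"target": "photo_42", "time": 9.2}, {"target": "screen_2", "time": 10.3}]
print(fuse(speech, pointing))
# -> {'action': 'move', 'object': 'photo_42', 'destination': 'screen_2', 'time': 10.0}
```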

Multimodal Error Correction
- A dictation error-correction study found users are better at correcting recognition errors with a different input modality
- If the recognizer got it wrong the first time, it will likely get it wrong the second time; hyperarticulating only aggravates the problem
- Correct dictation errors with vocal spelling, writing, typing, etc.
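The modality-switching idea can be sketched as a correction loop that stops asking the user to re-speak after a failed attempt. The modality names and ordering are invented for illustration:

```python
# Sketch of modality switching for error correction: after a failed re-speak,
# offer a different modality instead of asking the user to repeat (and
# hyperarticulate) yet again. Modality names and ordering are invented.
CORRECTION_MODALITIES = ["respeak", "vocal_spelling", "handwriting", "typing"]

def next_correction_modality(failed_attempts: int) -> str:
    """Pick the correction modality for the next attempt."""
    return CORRECTION_MODALITIES[min(failed_attempts, len(CORRECTION_MODALITIES) - 1)]

for attempt in range(4):
    print(attempt, "->", next_correction_modality(attempt))
# 0 -> respeak, 1 -> vocal_spelling, 2 -> handwriting, 3 -> typing
```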

Summary
- Speech UIs
  - may permit more natural computer access
  - allow us to use computers in more situations
  - are hard to get to work well: lack of visible state, tax on working memory, recognition problems, etc.
- UI tools are needed for speech UI design
- Multimodal UIs address some of the problems with pure speech UIs: they help disambiguate & help with correction