Speech Interfaces User Interfaces Spring 1998 Drew Roselli.

Motivation: Mechanical
Smaller devices => difficult I/O
Speed, > 90 wpm (?)
“Virtually unlimited” set of commands
Freedom for other body parts

Motivation: User
Natural
Easy to remember
Evolutionarily selected for
  –reading and writing are not
  –neither is typing

Speech Background
Speech is faster than vocal apparatus
  »nasals spread
Phonetic rules provide redundancy
  »taboo combinations, SR in Srini
  »contextual pronunciation: /t/ -> aspirated, flap, unreleased

Speech Recognition
Often misunderstood by people
  »continuous feedback
Longer words are easier
Maximally different vowels: a, i, u
Individual training
  »gender-based
  »“meaningless” conversation openers

Speech Production
Three formants visible on oscilloscope
Harmonics from larynx, throat, mouth
Two needed for recognition but “tinny”
1989 demo – ot-speech.html

More Gratuitous Opinions
(I’m really talking out of my butt here.)
Recently a visual culture
TV generation require pictured textbooks
Notes mean “I’ll learn it later”
Oral tradition has strong history
  –Could we go verbal?

Recognition Problems
Poor recognition
  –humans < 1% error rate on dictation
  –Janus 7% error rate (how much context?)
  –Janus 20% in real time
Background noise
Slow
  –(simple matter of hardware)
Homonym-rich languages (Cantonese)
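The slide does not say how these error rates were measured. As a minimal sketch, assuming they are word error rates (edit distance between the reference transcript and the recognizer output, divided by the number of reference words), the figure can be computed as follows. The function name `word_error_rate` is illustrative and not taken from the slides.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word dropped by recognizer
                curr[j - 1] + 1,     # extra word inserted by recognizer
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a 7% error rate would correspond to roughly seven wrong, missing, or extra words per hundred words dictated.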

More Recognition Problems
Isolated, short words difficult
  –common words become short
Segmentation
  –silly versus sill lea
No semantic help
Spelling
  –interface with printer, mail
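The segmentation problem can be made concrete with a toy sketch. Everything here is an assumption for illustration: the three-word lexicon, the ad hoc phone spellings ("sIli", "sIl", "li"), and the simplification that a doubled sound across a word boundary is heard as a single sound. It is not a description of how any real recognizer works.

```python
from itertools import product

# Hypothetical lexicon: word -> made-up phone string.
LEXICON = {"silly": "sIli", "sill": "sIl", "lea": "li"}


def collapse_doubles(phones: str) -> str:
    """Merge adjacent identical phones, as in running speech (assumed)."""
    out = []
    for phone in phones:
        if not out or out[-1] != phone:
            out.append(phone)
    return "".join(out)


def candidate_segmentations(heard: str, lexicon: dict[str, str],
                            max_words: int = 3) -> list[list[str]]:
    """Return every word sequence (up to max_words) that sounds like `heard`."""
    matches = []
    for n in range(1, max_words + 1):
        for words in product(lexicon, repeat=n):
            spoken = collapse_doubles("".join(lexicon[w] for w in words))
            if spoken == heard:
                matches.append(list(words))
    return matches


print(candidate_segmentations("sIli", LEXICON))
# [['silly'], ['sill', 'lea']]
```

Both readings fit the same sound sequence, so with no semantic help the recognizer has nothing to choose between them.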

UI Problems: Navigation
Aural no-nos
  –modes
  –deep hierarchies
Speech analog
Grammar = how to re-structure linear sequence of words
Is there a UI equivalent?

UI Problems: Feedback
Verbose feedback wastes time/patience
  –only confirm consequential things
  –use meaningful, short cues
Interruption
  –half-duplex communication
  –real-time scheduling
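One way to read "only confirm consequential things" is as a small feedback policy: ask for confirmation before actions that are hard to undo, and give a short cue for everything else. The sketch below assumes a hypothetical set of consequential actions and hypothetical prompt wording; neither comes from the slides.

```python
# Hypothetical: actions treated as consequential (hard to undo).
CONSEQUENTIAL_ACTIONS = {"delete", "send"}


def feedback_for(action: str, target: str) -> str:
    """Return the spoken feedback for a recognized command."""
    if action in CONSEQUENTIAL_ACTIONS:
        # Explicit confirmation only where a mistake is costly.
        return f"{action.capitalize()} {target}? Say yes or no."
    # Everything else gets a short cue instead of a full read-back.
    return "OK."


print(feedback_for("delete", "note 3"))  # Delete note 3? Say yes or no.
print(feedback_for("play", "note 3"))    # OK.
```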

UI Problems: Meaning
“Do what I mean not what I say”
Silence means “Do the right thing”

VoiceNotes
Voice-based file system
Replacement for tapes
“Hierarchical” access to voice data
Thorough documentation of problems

SpeechActs
Speech interface to computer tools
  – , calendar, weather, stock quotes
Conversions to canonical form
  –keyword based? confused by negations?
Inconsistent recognition
  –misunderstand system
  –progressive assistance
  –implicit confirmation
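The slide only asks whether the conversion to canonical form is keyword based. As a sketch of why that would matter, here is a deliberately naive keyword matcher. It is not SpeechActs's actual implementation, and the keyword table and command names are invented for illustration.

```python
from typing import Optional

# Hypothetical keyword table: spoken keyword -> canonical command.
KEYWORD_TO_COMMAND = {
    "delete": "DELETE_MESSAGE",
    "read": "READ_MESSAGE",
    "weather": "GET_WEATHER",
}


def canonical_command(utterance: str) -> Optional[str]:
    """Map an utterance to the command of the first keyword it contains."""
    for word in utterance.lower().split():
        word = word.strip(".,!?")
        if word in KEYWORD_TO_COMMAND:
            return KEYWORD_TO_COMMAND[word]
    return None


print(canonical_command("Delete this message"))        # DELETE_MESSAGE
print(canonical_command("Don't delete this message"))  # DELETE_MESSAGE
```

Because the matcher ignores every word except the keyword, the negated request produces the same canonical form as the plain one. This is the kind of confusion the slide is asking about.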

Multimodal Error Correction
Dictation error correction study
Results very unclear
Recognizer got it wrong the first time => will get it wrong the second time
hyperarticulating aggravates
Correct dictation errors with: vocal spelling, writing, typing, etc