10 Innovative Speech Applications
Deborah A. Dahl, Ph.D.
Conversational Technologies
Spring SpeechTEK, San Francisco
February 23, 2007
Innovative Applications of Speech
- Most people think of call center self service or dictation when they think of speech applications
- There are many more ways to apply speech technology!
- The applications we’ll talk about use
  - recognition
  - synthesis
  - other analyses of the speech signal
- These applications range from research to commercial projects
1. MossTalk Words: The Problem
- Over 1 million Americans have difficulty with language because of an injury to the parts of the brain that control language (aphasia)
- Aphasia leads to social isolation and inability to work
- Insurance only pays for a limited amount of speech therapy
MossTalk Words: How Speech Helps
- The user sees a picture and tries to say the word that corresponds to the picture
- Cues are available to help the user remember the word
- As soon as the user says the right word, the speech recognizer says “that’s right”
- Advantages of the computer
  - Low cost
  - Available 24/7
  - Consistent
  - User’s performance is automatically recorded
MossTalk Words: Example
MossTalk Words: More Information
2. English X-Change: The Problem
- Millions of people in China want to learn English
- There are fewer than 100,000 English teachers in China, concentrated in urban areas
- But there are 225,000,000 English learners!
English X-Change: How Speech Helps
- The EX software program is a simulation-based, interactive computer program for learning English
- Many of the lessons require students to speak English
- The EX program uses speech recognition to evaluate students’ pronunciation
- How closely a student’s pronunciation of a word or phrase must approach native spoken English pronunciation can be adjusted
- In one study, students who studied using the EX software program produced substantially (and significantly) higher test scores than did those who experienced traditional classroom instruction with trained native English speakers
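The adjustable strictness described above can be pictured as a threshold applied to a pronunciation score. This is only a minimal sketch under stated assumptions: the 0.0 to 1.0 score range, the function name `is_acceptable`, and the threshold values are hypothetical and are not taken from the EX software.

```python
def is_acceptable(score: float, threshold: float) -> bool:
    """Return True when a pronunciation score meets the strictness threshold.

    Both values are hypothetical numbers in [0.0, 1.0]; a higher threshold
    demands pronunciation closer to the native-speaker target.
    """
    if not (0.0 <= score <= 1.0 and 0.0 <= threshold <= 1.0):
        raise ValueError("score and threshold must be within [0.0, 1.0]")
    return score >= threshold


# The same attempt passes a lenient setting but fails a strict one.
attempt_score = 0.72
print(is_acceptable(attempt_score, threshold=0.60))  # lenient setting
print(is_acceptable(attempt_score, threshold=0.85))  # strict setting
```

Raising the threshold as a learner progresses is one way such a setting might be used.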
English X-Change: Example
English X-Change: More Information
3. Compliance for Life: The Problem
- People often don’t take their medications as directed
- “The misuse or nonuse of prescribed medications is estimated to add nearly $200 billion a year to the cost of medical care.” (Reported by Jane Brody, Just What the Doctor Ordered? Not Exactly, The New York Times, May 9, 2006)
- Non-compliance rates can be very high
  - For example, hypertension non-compliance is estimated at 40% (avg.)
- “Simply forgot” is the most common reason
Compliance for Life: How Speech Helps
- Provides a phone- and web-based automated notification system to create, edit and cancel medication reminders
- The phone interface allows users to manage reminders when the web isn’t available
- Example of creating a reminder
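The create, edit and cancel operations on reminders can be sketched as a small in-memory store. Everything here is a hypothetical illustration: the class and method names (`ReminderBook`, `create`, `edit`, `cancel`, `due`) are assumptions, not the Compliance for Life interface, and a real service would persist reminders and place the notification calls.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import time


@dataclass
class Reminder:
    medication: str
    at: time  # time of day at which the user should be notified


class ReminderBook:
    """Hypothetical in-memory reminder store shared by phone and web front ends."""

    def __init__(self) -> None:
        self._reminders: dict[int, Reminder] = {}
        self._next_id = 1

    def create(self, medication: str, at: time) -> int:
        """Add a reminder and return its id."""
        reminder_id = self._next_id
        self._reminders[reminder_id] = Reminder(medication, at)
        self._next_id += 1
        return reminder_id

    def edit(self, reminder_id: int, at: time) -> None:
        """Move an existing reminder to a new time of day."""
        self._reminders[reminder_id].at = at

    def cancel(self, reminder_id: int) -> None:
        """Remove a reminder."""
        del self._reminders[reminder_id]

    def due(self, now: time) -> list[str]:
        """List the medications whose reminder time equals `now`."""
        return [r.medication for r in self._reminders.values() if r.at == now]


book = ReminderBook()
rid = book.create("blood pressure pill", time(8, 0))
book.edit(rid, time(9, 0))
print(book.due(time(9, 0)))
```

Keeping the store independent of the interface is what would let a phone dialog and a web page manage the same reminders.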
Compliance for Life: More Information
4. Animated Speech: The Problem
- Millions of preschool and elementary school children have language and speech disabilities
- There is a shortage of skilled teachers and professionals to give them the one-on-one attention that they need
Animated Speech: How Speech Helps
- Applies animated agents to produce accurate visible speech and facilitate face-to-face oral communication
- Teaches vocabulary to children with language challenges
- Instruction is always available to the child, 24 hours a day, 365 days a year
- System has extreme patience, doesn’t become angry, tired, or bored
Animated Speech: Example
Animated Speech: More Information
5. Diagnosing Depression: The Problem
- Depression is traditionally diagnosed by self-report
- Some depressed individuals are reluctant to admit that their medication is ineffective
Diagnosing Depression: How Speech Helps
- Depression is correlated with specific vocal characteristics
- These can be observed in speech recorded over the phone
Diagnosing Depression: More Information
6. Guided Speech: The Problem
- Automatic speech recognition is widely used for telephone self-service, but it’s not always accurate
- Human speech recognition is accurate, but humans are expensive and get bored with handling routine calls
Guided Speech: How Speech Helps
- A human guide in the background assists the self-service application to ensure completion
- Use speech recognition with an operator backup
- Agents are able to handle 4 calls silently and simultaneously
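One way to picture recognition with an operator backup is a router that hands an utterance to a background guide when the recognizer is unsure. This is a minimal sketch, not the Guided Speech implementation: the confidence trigger, the `CONFIDENCE_THRESHOLD` value, and the names `Agent` and `route_utterance` are assumptions. Only the limit of 4 simultaneous calls per agent comes from the slide.

```python
from __future__ import annotations

from dataclasses import dataclass

MAX_CALLS_PER_AGENT = 4      # from the slide: an agent handles 4 calls at once
CONFIDENCE_THRESHOLD = 0.8   # hypothetical cutoff, not from the source


@dataclass
class Agent:
    name: str
    active_calls: int = 0

    def has_capacity(self) -> bool:
        return self.active_calls < MAX_CALLS_PER_AGENT


def route_utterance(confidence: float, agents: list[Agent]) -> str:
    """Decide who resolves an utterance: the recognizer or a background guide."""
    if confidence >= CONFIDENCE_THRESHOLD:
        return "recognizer"
    # Low confidence: silently hand the utterance to the first guide with room.
    for agent in agents:
        if agent.has_capacity():
            agent.active_calls += 1
            return agent.name
    # Every guide is already assisting the maximum number of calls.
    return "queue"


guides = [Agent("guide-1", active_calls=3)]
print(route_utterance(0.95, guides))  # confident result stays automated
print(route_utterance(0.40, guides))  # uncertain result goes to a guide
```

The caller never hears the handoff in this design, which is what keeps the call feeling like self-service.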
Guided Speech IVR Call
Guided Speech: More Information
7. Model Talker: The Problem
- People who have limited ability to speak can use TTS to speak their typed utterances
- Concatenative TTS sounds much better than formant-based TTS
- The number of TTS voices is limited, and there may not be an existing voice that a user likes
Model Talker: How Speech Helps
- Model Talker lets users record their own voice and generate a TTS voice from their own recordings
- Example of a field-generated voice (generated by a user who downloaded Model Talker from the internet and recorded their voice on their own computer) and an example of this user’s actual voice
Model Talker: More Information
8. ASL Speech Recognition: The Problem
- American Sign Language is the fourth most-used language in the United States
- Currently, human ASL translators are frequently necessary to facilitate communication between deaf and hearing presenters and their audiences
- Good ASL translators are in high demand and are not always available
ASL Speech Recognition: How Speech Helps
- Combine speech recognition and understanding with automatic ASL generation to translate from speech to sign language
- Example of generated ASL:
ASL Speech Recognition: More Information
9. VoiceBox in Car Navigation: The Problem
- Current in-car navigation systems require multiple button presses to set destinations
- In one test of Neverlost with 25 first-time users
  - the average time to set a destination was 4 minutes and 31 seconds
  - 315 button pushes
  - 5 testers dropped out and said they could not do it
VoiceBox in-Car Navigation: How Speech Helps
- For setting destinations, speech is much faster and less confusing
- For the same task of setting a destination, the VoiceBox time average was 18 seconds for new users
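The size of the difference is easier to see as a ratio. The short calculation below uses only the figures quoted in the slides (4 minutes 31 seconds, 18 seconds, and 5 of 25 testers dropping out); the variable names are illustrative, and the comparison assumes the two tests are comparable, which the slides state only as "the same task".

```python
neverlost_seconds = 4 * 60 + 31   # 4 min 31 s average with the button interface
voicebox_seconds = 18             # average for new users with speech

speedup = neverlost_seconds / voicebox_seconds
dropout_rate = 5 / 25             # testers who gave up on the button interface

print(f"Button interface: {neverlost_seconds} s")      # 271 s
print(f"Speech is about {speedup:.1f}x faster")         # about 15.1x
print(f"Button-interface dropout rate: {dropout_rate:.0%}")  # 20%
```

On these numbers, speech cut the time to set a destination by roughly a factor of 15.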
VoiceBox in-Car Navigation: Example
VoiceBox: More Information
10. Rex the Talking Pill Bottle: The Problem
- Some patients can’t read or understand the instructions on their prescription bottles
  - illiteracy
  - low vision
  - cognitive limitations
Rex Talking Pill Bottle: How Speech Helps
- The pharmacist programs the bottle with the medication instructions
- Or, the user can record their own message
- The user presses a button to hear the instructions
- Example:
Rex Talking Pill Bottle: More Information
Summary
- Speech technologies can be applied in many innovative ways
  - make up for expensive or unavailable human expertise (MossTalk, Timo Stories, English X-Change, Guided Speech)
  - help users who need assistance
    - understanding or producing spoken language (ASL translator, Model Talker)
    - remembering (Compliance for Life)
    - or reading (Rex, Timo Stories)
  - systematically analyze voice quality in ways that most people can’t do (depression diagnosis)