Lti Shaping Spoken Input in User-Initiative Systems Stefanie Tomko and Roni Rosenfeld Language Technologies Institute School of Computer Science Carnegie.

Presentation transcript:

lti Shaping Spoken Input in User-Initiative Systems Stefanie Tomko and Roni Rosenfeld Language Technologies Institute School of Computer Science Carnegie Mellon University Presented by: Thomas Kevin Harris

lti 2 outline
introduction
–related work
method
results
–user perceptions
discussion

lti 3 introduction
(all other things being equal) spoken dialog systems perform best when users speak within the grammar that the system understands
–make grammar more accepting?
–get users to speak within the smaller grammar?
ok, how do we do this?
–this study is a preliminary step

lti 4 related work
shaping is pretty ubiquitous
–users adapt to features of system prompts
    formality (Ringle & Halstead-Nussloch 1989)
    length (Zoltan-Ford 1991)
    vocabulary (Brennan 1996; Gustafson et al 1997)
–users also simplify input under higher WER conditions (Shriberg et al 1992)

lti 5 foundation: speech graffiti
a structured, subset language interaction protocol for interacting with simple machines
user input
–slot+value pairs: theater is the Galleria Six
–what-questions: what are the movies?
system output
–terse restatement of value: Galleria Six
user initiative
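The two user input forms on this slide can be sketched as a small parser. This is only an illustration under stated assumptions: the `SLOTS` tuple and the `parse_utterance` function are hypothetical names, the slot list is taken from the examples in this talk, and the real speech graffiti grammar is not given in these slides.

```python
# Hypothetical slot names, taken only from the examples in this talk.
SLOTS = ("theater", "movie", "airline")


def parse_utterance(text):
    """Classify an utterance as a what-question, a slot+value pair, or unparsed.

    Returns a (kind, slot, value) tuple. This is an illustrative sketch,
    not the grammar used by the actual system.
    """
    words = text.lower().strip(" ?.!").split()
    if not words:
        return ("unparsed", None, None)

    # what-question: "what is/are [the] <slot>"
    if words[0] == "what" and len(words) >= 3 and words[1] in ("is", "are"):
        rest = words[2:]
        if rest[0] == "the":
            rest = rest[1:]
        if rest:
            return ("query", " ".join(rest), None)

    # slot+value pair: "<slot> [is] <value>"
    if words[0] in SLOTS and len(words) >= 2:
        rest = words[1:]
        if rest[0] == "is":
            rest = rest[1:]
        if rest:
            return ("specify", words[0], " ".join(rest))

    return ("unparsed", None, None)
```

For example, `parse_utterance("theater is the Galleria Six")` yields a `"specify"` result for the `theater` slot, while a conversational request such as "could you tell me" falls through to `"unparsed"`.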

lti 6 previous speech graffiti studies
shown to be an effective interaction style
–compared to a natural language interface
    higher user satisfaction
    lower task completion times
    similar task completion rates
but…
users can have difficulty learning & speaking subset language
when users tried NL with speech graffiti system, their utterances were simpler than NL to an NL system

lti 7 initial questions about shaping
how can different instructions influence user input?
how will users shape input in response to
–rejection of conversational, NL speech
–speech graffiti-style, terse, value-only confirmations
how will this work in a user-initiative environment?
→ wizard-of-oz study

lti 8 outline
introduction
–related work
method
results
–user perceptions
discussion

lti 9 wizard-of-oz study
18 participants, mostly CMU students
–most with non-technical backgrounds
–most had used ASR systems before but not regularly
interact with a telephone information system providing movie schedules & airline flight data
10 tasks, e.g.
–a friend told you that Miracle was pretty good. where is this movie playing?
–a friend has told you that she's flying to San Francisco on United flight 500. when will she get there?

lti 10 instructions
3 conditions: short - medium - long
welcome: Welcome to the InfoLine.
instruction-short: The system you are talking to only understands very simple English, so please speak to it as simply as you can.
instruction-medium: The system you are talking to only understands very simple English, so please speak to it as simply as you can. It will understand you best if you tell it only one idea at a time.
instruction-long: The system you are talking to only understands very simple English, so please speak to it as simply as you can. It will understand you best if you tell it only one idea at a time. This system understands only keywords, and not the structure of sentences.
example: For instance, you might say "movie The Lord of the Rings," or "airline is United," or "what are show times?"
prompt: You can now start speaking whenever you're ready.
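The three instruction conditions are cumulative: each longer condition adds one sentence to the previous one. A minimal sketch of how the introduction could be assembled per condition follows. The strings are copied from this slide; the function and constant names are hypothetical, and the playback order (welcome, instruction, example, prompt) is an assumption based on the order in which the slide lists them.

```python
WELCOME = "Welcome to the InfoLine."
SIMPLE = ("The system you are talking to only understands very simple English, "
          "so please speak to it as simply as you can.")
ONE_IDEA = "It will understand you best if you tell it only one idea at a time."
KEYWORDS = "This system understands only keywords, and not the structure of sentences."
EXAMPLE = ('For instance, you might say "movie The Lord of the Rings," '
           'or "airline is United," or "what are show times?"')
PROMPT = "You can now start speaking whenever you're ready."

# Each condition extends the previous one by a single sentence.
INSTRUCTION_SENTENCES = {
    "short": [SIMPLE],
    "medium": [SIMPLE, ONE_IDEA],
    "long": [SIMPLE, ONE_IDEA, KEYWORDS],
}


def build_introduction(condition):
    """Return the full introduction text for one instruction condition.

    Assumed order: welcome, instruction, example, prompt.
    """
    parts = [WELCOME, *INSTRUCTION_SENTENCES[condition], EXAMPLE, PROMPT]
    return " ".join(parts)
```

Calling `build_introduction("medium")` returns the welcome, the first two instruction sentences, the example, and the prompt as one string.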

lti 11 wizard rules
reject:
–non-task conversational words
    could you tell me…
–task-based non-content items
    what movies are showing in West Mifflin?
–task-based OOV words
    films; earliest flight
rejection messages
–excuse me?
–I'm sorry, I didn't understand that.
–[replay instruction & example strings from introduction]
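In the study these rules were applied by a human wizard. Purely as an illustration, they can be approximated by a vocabulary check plus an escalating rejection message. Everything in this sketch is an assumption: the `ACCEPTED_WORDS` set is a hypothetical stand-in for the wizard's judgment, the all-or-nothing acceptance is a simplification, and the escalation order simply follows the order in which the messages are listed on this slide.

```python
# Hypothetical accepted vocabulary; the study used a human wizard, not a word list.
ACCEPTED_WORDS = {
    "miracle", "movie", "movies", "which", "theaters", "theater",
    "arrival", "time", "united", "airline", "flight", "500",
    "what", "are", "is", "the",
}

# Messages from the slide, assumed to escalate in the order listed.
REJECTION_MESSAGES = ["excuse me?", "I'm sorry, I didn't understand that."]
REPLAY = "[replay instruction & example strings from introduction]"


def is_accepted(utterance):
    """Accept only if every word is in the (hypothetical) vocabulary."""
    words = [w.strip("?,.!").lower() for w in utterance.split()]
    words = [w for w in words if w]
    return bool(words) and all(w in ACCEPTED_WORDS for w in words)


def rejection_message(consecutive_rejections):
    """Pick a message for the n-th rejection in a row (n starts at 1)."""
    if consecutive_rejections < 1:
        raise ValueError("consecutive_rejections must be at least 1")
    if consecutive_rejections <= len(REJECTION_MESSAGES):
        return REJECTION_MESSAGES[consecutive_rejections - 1]
    return REPLAY


class Wizard:
    """Tracks consecutive rejections; the count resets on accepted input."""

    def __init__(self):
        self.consecutive = 0

    def respond(self, utterance):
        """Return a rejection message, or None when the input is accepted."""
        if is_accepted(utterance):
            self.consecutive = 0
            return None
        self.consecutive += 1
        return rejection_message(self.consecutive)
```

A real wizard can also accept part of an utterance and ignore the rest, which this all-or-nothing check does not model.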

lti 12 sample user interaction
TASK A:
user: where is Miracle movie playing?
wizard: excuse me?
u: Miracle movie
w: Miracle
u: where is it playing?
w: excuse me?
u: which theaters?
w: 3 theaters: Destinta Theatres Plaza 22, Loews Waterfront Theatre, Northway Mall Cinemas 8
TASK B:
u: United flight 500 to San Francisco, what time does it arrive?
w: United, to San Francisco, flight 500
u: Arrival time
w: arriving at 12:33 pm

lti 13 outline
introduction
–related work
method
results
–user perceptions
discussion

lti 14 results
overall completion rate: 96.1%
users generally used value-only
no effect of instruction length on number of utterances per session
longer instruction length → shorter user utterances
–due to extra content?
instruction condition    mean # words per utterance
short                    4.49
medium                   3.36
long                     2.98

lti 15 rejected input
on average, about 22% of users' utterances were rejected
–no effect of instruction condition
–users only repeated input verbatim in 7 cases
sequential rejection instance    # of occurrences    rejected input shaped after this point
1st                              [missing]           50%
2nd                              52                  75%
3rd                              15                  84%
4th                              3                   85%
rejection messages: excuse me? / I'm sorry… / the system you are talking to…
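The two measures reported in the results (mean words per utterance, share of utterances rejected) are simple to compute from a transcript. The sketch below applies them to the six user utterances of the sample user interaction slide only. It does not reproduce the study's data, and the function names are hypothetical.

```python
# (user utterance, rejected by wizard?) pairs copied from the sample
# interaction slide. These are illustrative, not the study's data set.
SAMPLE = [
    ("where is Miracle movie playing?", True),
    ("Miracle movie", False),
    ("where is it playing?", True),
    ("which theaters?", False),
    ("United flight 500 to San Francisco, what time does it arrive?", False),
    ("Arrival time", False),
]


def mean_words_per_utterance(utterances):
    """Average number of whitespace-separated words per utterance."""
    return sum(len(u.split()) for u in utterances) / len(utterances)


def rejection_rate(rejected_flags):
    """Fraction of utterances that were rejected."""
    return sum(rejected_flags) / len(rejected_flags)


utterances = [text for text, _ in SAMPLE]
flags = [rejected for _, rejected in SAMPLE]

sample_mean = mean_words_per_utterance(utterances)  # 26 words over 6 utterances
sample_rate = rejection_rate(flags)                 # 2 rejections out of 6
```

On this tiny sample the mean is 26/6 (about 4.3 words) and the rejection rate is 2/6 (about 33%); these figures describe only the example dialog.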

lti 16 user perceptions
participants clearly aware of limited style
participants mentioned
–simplification
–minimization
–keywords (key words)
these ideas parallel the instruction conditions
–speak simply
–one idea at a time
–use keywords
but comments did not match condition

lti 17 outline
introduction
–related work
method
results
–user perceptions
discussion

lti 18 discussion (1)
how can different instructions influence user input?
–more explanation of simplification → shorter utterances
how will users shape input in response to
–rejection of conversational, NL speech
    50% of rejected utts shaped after one rejection
    75% shaped after two rejections
–speech graffiti-style, terse, value-only confirmations
    most shaped user input mimicked this style

lti 19 discussion (2)
how will this work in a user-initiative environment?
–96% task completion rate without explicit system prompts
future work
–shape input more precisely
    shape with slot+value confirmation, to avoid ambiguity?
    shape to specific acoustically distinct vocabulary?
–can input be shaped even if some NL is handled?