Non-Native Users in the Let's Go!! Spoken Dialogue System: Dealing with Linguistic Mismatch. Antoine Raux & Maxine Eskenazi, Language Technologies Institute.

Presentation transcript:

Non-Native Users in the Let's Go!! Spoken Dialogue System: Dealing with Linguistic Mismatch
Antoine Raux & Maxine Eskenazi
Language Technologies Institute, Carnegie Mellon University

Background
Speech-enabled systems use models of the user's language.
Such models are tailored for native speech.
This causes a great loss of performance for non-native users, who don't follow typical native patterns.

Previous Work on Non-Native Speech Recognition
Assumes knowledge about, or data from, a specific non-native population
Often based on read speech
Focuses on acoustic mismatch: acoustic adaptation, multilingual acoustic models

Linguistic Particularities of Non-Native Speakers
Non-native speakers might use different lexical and syntactic constructs.
Non-native speakers are in a dynamic process of L2 acquisition.

Outline of the Talk
Baseline system and data collection
Study of the non-native/native mismatch and the effect of additional non-native data
Adaptive lexical entrainment

The CMU Let's Go!! System: Bus Schedule Information for the Pittsburgh Area
ASR: Sphinx II
Parsing: Phoenix
Dialogue Management: RavenClaw
Speech Synthesis: Festival
Hub: Galaxy
NLG: Rosetta
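To make the pipeline concrete, here is a minimal Python sketch of how a single user turn flows through these modules around a hub. Every function body below is a made-up stand-in for illustration only, not the actual Sphinx II, Phoenix, RavenClaw, Rosetta, or Festival interfaces.

```python
# Minimal sketch of one user turn flowing through the hub-and-spoke pipeline.
# All stubs and return values are illustrative placeholders.

def asr(audio):                 # Sphinx II stand-in: audio -> word hypothesis
    return "when is the next 61c"

def parse(hypothesis):          # Phoenix stand-in: robust concept extraction
    return {"query": "next_bus", "route": "61C"} if "61c" in hypothesis else {}

def dialogue_manager(frame):    # RavenClaw stand-in: decide the next system act
    return {"act": "inform", "route": frame.get("route", "unknown"), "time": "5:45 PM"}

def generate(system_act):       # Rosetta stand-in: system act -> text
    return f"The next {system_act['route']} leaves at {system_act['time']}."

def synthesize(text):           # Festival stand-in: text -> speech
    print(f"[TTS] {text}")

def hub_turn(audio):
    """Galaxy-hub-style routing of a single turn through the modules."""
    synthesize(generate(dialogue_manager(parse(asr(audio)))))

hub_turn(b"<audio frame>")      # -> [TTS] The next 61C leaves at 5:45 PM.
```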

Data Collection
Baseline system accessible since February 2003
Experiments with scenarios
Publicized the phone number inside CMU in Fall 2003

Data Collection Web Page

Data
Directed experiments: 134 calls, 17 non-native speakers (5 from India, 7 from Japan, 5 others)
Spontaneous calls: 30
Total: 1768 utterances
Evaluation data: 449 non-native utterances, 452 native utterances

Speech Recognition Baseline
Acoustic models: semi-continuous HMMs (codebook size: 256), 4000 tied states, trained on CMU Communicator data
Language model: class-based backoff 3-gram, trained on 3074 utterances from native calls
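For readers unfamiliar with class-based language models, the toy sketch below shows the scoring idea: the n-gram is estimated over class tokens (for example a [place] class covering street names), and a word is scored as P(class | history classes) times P(word | class). The class inventory and all probabilities here are invented for illustration; they are not the trained Let's Go!! model.

```python
# Toy illustration of class-based n-gram scoring. Classes and probabilities are
# invented; the actual model is a backoff 3-gram trained on the native calls.

word2class = {"forbes": "[place]", "murray": "[place]", "craig": "[place]"}

def p_class_given_history(c, history_classes):
    # Stand-in for the backoff trigram over class tokens.
    toy_trigrams = {(("[place]", "and"), "[place]"): 0.4}
    return toy_trigrams.get((tuple(history_classes), c), 0.01)

def p_word_given_class(word, c):
    # Stand-in for the in-class word distribution.
    toy_members = {("forbes", "[place]"): 0.05, ("murray", "[place]"): 0.05}
    return toy_members.get((word, c), 1e-4)

def p_word(word, history):
    """P(word | history) = P(class | history classes) * P(word | class)."""
    c = word2class.get(word, word)                  # out-of-class words act as their own class
    history_classes = [word2class.get(h, h) for h in history[-2:]]
    in_class = p_word_given_class(word, c) if c != word else 1.0
    return p_class_given_history(c, history_classes) * in_class

print(p_word("murray", ["forbes", "and"]))          # 0.4 * 0.05 = 0.02
```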

Speech Recognition Results
Word Error Rate: Native 20.4%, Non-Native 52.0%
Causes of the discrepancy: acoustic mismatch (accent) and linguistic mismatch (word choice, syntax)
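For reference, the word error rate quoted above is the standard (substitutions + insertions + deletions) / reference-length measure, computed by word-level Levenshtein alignment, as in this small sketch:

```python
# Word Error Rate via word-level edit distance.

def wer(reference, hypothesis):
    ref, hyp = reference.split(), hypothesis.split()
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("i want to go to the airport", "i want to go the airport"))  # 1/7 ~ 0.14
```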

Language Model Performance
Evaluation on transcripts
Initial model: 3074 native utterances

Language Model Performance (continued)
Adding non-native data: 3074 native + 1308 non-native utterances
Models compared: initial (native) model vs. mixed model
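The comparison on this slide is presumably a perplexity evaluation of the two language models on held-out transcripts. The sketch below shows how such a comparison is computed; the ToyLM class, its scoring interface, and its numbers are assumptions standing in for the trained models, not the models from the talk.

```python
# Perplexity comparison of two language models on transcribed test utterances.
# ToyLM is a stand-in: it just assigns a fixed per-word log10 probability.

class ToyLM:
    def __init__(self, bias):            # stand-in for a trained backoff 3-gram
        self.bias = bias
    def score(self, words):              # log10 probability of the word sequence
        return sum(-2.0 + self.bias for _ in words)

def perplexity(lm, sentences):
    log_prob, n_words = 0.0, 0
    for s in sentences:
        words = s.split()
        log_prob += lm.score(words)
        n_words += len(words)
    return 10 ** (-log_prob / n_words)

test = ["when is the next 61c", "i want to go to the airport"]
native_lm, mixed_lm = ToyLM(bias=0.0), ToyLM(bias=0.3)
print(perplexity(native_lm, test), perplexity(mixed_lm, test))  # 100.0  ~50.1
```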

Natural Language Understanding
Grammar written manually and incrementally, as the system was being developed
Initially built with native speakers in mind
Phoenix: robust parser (less sensitive to non-standard expressions)

Grammar Coverage
Initial grammar: manually written for native utterances

Grammar Coverage
Grammar designed to accept some non-native patterns:
"reach" = "arrive"
"What is the next bus?" = "When is the next bus?"
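The snippet below is a toy Python illustration of the robust-parsing idea on these slides: several surface forms, including non-native ones, map to the same concept, and words outside any pattern are simply skipped. The real grammar is written in Phoenix's own formalism, and the concept names and patterns here are invented for the example.

```python
# Toy robust concept spotter: synonymous phrasings map to one concept,
# unmatched filler words are ignored. Not the actual Phoenix grammar.

import re

CONCEPT_PATTERNS = {
    "arrive_query":   [r"\b(arrive|reach|get to)\b"],
    "next_bus_query": [r"\b(when|what) (is|'s) the next bus\b"],
    "place":          [r"\b(forbes and murray|fifth and bigelow|the airport|downtown)\b"],
}

def robust_parse(utterance):
    """Return the concepts found anywhere in the utterance, ignoring the rest."""
    utterance = utterance.lower()
    parse = {}
    for concept, patterns in CONCEPT_PATTERNS.items():
        for p in patterns:
            m = re.search(p, utterance)
            if m:
                parse[concept] = m.group(0)
                break
    return parse

print(robust_parse("uh what is the next bus to reach downtown please"))
# {'arrive_query': 'reach', 'next_bus_query': 'what is the next bus', 'place': 'downtown'}
```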

Relative Improvement due to Additional Data

Effect of Additional Data on Speech Recognition

Adaptive Lexical Entrainment
If you can't adapt the system, adapt the user.
The system should use the same expressions it expects from the user.
But non-native speakers might not master all target expressions.
Use expressions that are close to the non-native speaker's language.
Use prosody to stress incorrect words.

Adaptive Lexical Entrainment: Example
User (ASR hypothesis): "I want to go the airport"
System: "Did you mean: I want to go TO the airport?" (the inserted "TO" is stressed)

Adaptive Lexical Entrainment: Algorithm
Pipeline: ASR Hypothesis → DP-based Alignment against Target Prompts → Prompt Selection → Emphasis → Confirmation Prompt
Example:
ASR hypothesis: "I want to go the airport"
Target prompts: "I'd like to go to the airport", "I want to go to the airport"
Selected confirmation prompt: "Did you mean: I want to go to the airport?" (with the corrected "TO" emphasized)
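A hedged Python sketch of this algorithm: align the ASR hypothesis against each target prompt with dynamic programming, pick the closest prompt, and mark for emphasis the target words that the user substituted or omitted (rendered here in upper case; the real system marks them for prosodic stress). The costs, the prompt inventory, and the exact trigger condition are illustrative rather than the actual Let's Go!! implementation.

```python
# DP-based alignment, prompt selection, and emphasis (illustrative sketch).

def align(hyp, target):
    """Word-level edit-distance alignment.

    Returns (cost, indices of target words that the hypothesis substituted
    or omitted, i.e. the words to emphasize)."""
    h, t = hyp.split(), target.split()
    d = [[None] * (len(t) + 1) for _ in range(len(h) + 1)]
    d[0][0] = (0, [])
    for i in range(1, len(h) + 1):
        d[i][0] = (i, [])                              # extra hypothesis words: nothing to stress
    for j in range(1, len(t) + 1):
        d[0][j] = (j, d[0][j - 1][1] + [j - 1])        # missing target words: stress them
    for i in range(1, len(h) + 1):
        for j in range(1, len(t) + 1):
            if h[i - 1] == t[j - 1]:
                d[i][j] = d[i - 1][j - 1]
            else:
                d[i][j] = min(
                    (d[i - 1][j - 1][0] + 1, d[i - 1][j - 1][1] + [j - 1]),  # substitution
                    (d[i][j - 1][0] + 1, d[i][j - 1][1] + [j - 1]),          # word missing in hyp
                    (d[i - 1][j][0] + 1, d[i - 1][j][1]),                    # extra word in hyp
                    key=lambda x: x[0],
                )
    return d[len(h)][len(t)]

def entrainment_prompt(hypothesis, target_prompts):
    """Pick the target prompt closest to the ASR hypothesis and emphasize the mismatches."""
    best = min(target_prompts, key=lambda t: align(hypothesis, t)[0])
    cost, to_stress = align(hypothesis, best)
    if cost == 0:
        return None                                    # perfect match: no entrainment needed
    emphasized = " ".join(
        w.upper() if i in to_stress else w for i, w in enumerate(best.split())
    )
    return f"Did you mean: {emphasized}?"

targets = ["i'd like to go to the airport", "i want to go to the airport"]
print(entrainment_prompt("i want to go the airport", targets))
# Did you mean: i want to go TO the airport?
```

Returning None when the best prompt matches the hypothesis exactly mirrors the trigger condition described in the study on the next slide: entrainment fires whenever the best target does not perfectly match the ASR output.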

Adaptive Lexical Entrainment: Study
24 dialogues, 12.4 turns/dialogue, 2.5 entrainment prompts/dialogue
Entrainment is triggered whenever the best target does not perfectly match the ASR hypothesis. Too simple?
40.7% false positives
64.4% of prompts are triggered by misrecognitions

Conclusion
The language model and the grammar perform significantly worse on non-native speech.
Including non-native data reduces the gap in linguistic metrics, but not in WER.
Future work: improve speech recognition (more data for the LM, acoustic/lexical adaptation); integrate adaptive lexical entrainment into the dialog management framework.

Example Dialogue
S: Welcome to the Let's Go!! bus information system. What can I do for you?
U: Excuse me. I'm in the place which uh where the Forbes Avenue and Murray Avenue are crossing and I want to go to the place mm Forbes where the Forbes Avenue and Craig Street are crossing.
S: I heard SIX. Is that correct?
U: Which bus uh should I take? Uh sorry uh maybe 61C bus I have… What time the next 61C bus will come?
S: The time is currently 5:37 PM. Where are you leaving from?
…

Example Dialogue
S: Welcome to the Let's Go!! bus information system. What can I do for you?
U: I want to go to downtown now I'm at Fifth and Bigelow.
S: Sorry, did you say I want to go to downtown?
U: Which number bus can I take and when will the next bus come here?
S: ... What can I do for you?
…