SoarTech Proprietary
Automatic Speech Recognition in Training Systems: An Introduction
Presenter: Brian Stensrud, Ph.D.
21 Jan 2016
PAO Approval: 15-ORL110503


The views expressed herein are those of the authors and do not necessarily reflect the official position of the organizations with which they are affiliated.

Introduction: Automatic Speech Recognition (ASR)
ASR: spoken words interpreted by a software system to elicit some activity or response (Rabiner and Juang, 1993).
Implications for ASR in simulated training systems:
- Natural modality for human input
- Interactive, autonomous role players
- Communication-heavy training activities (e.g., crew resource management, CRM)
ASR is nevertheless not widely used in current training simulations.

The Trouble with ASR
- Speech recognition failures are inherently frustrating and off-putting for users.
- ASR can fail in a variety of ways that all produce the same symptom: a failed response.
- ASR involves more than just what is spoken and what is heard.

Phase 1: Recognition
The recognition phase converts the speech audio signal into a textual "recognized utterance."
This step is necessary but NOT sufficient for ASR!
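A minimal sketch of what the recognition phase produces (all names here are hypothetical, and the recognizer is a stub standing in for a real ASR engine): an n-best list of text hypotheses with confidence scores, and nothing more.

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    text: str          # one candidate transcription
    confidence: float  # the recognizer's score for that transcription

def recognize(audio) -> list:
    """Stub recognition phase. A real implementation would wrap an ASR
    engine and return its n-best list; the output is still only text."""
    # Hard-coded stand-in result for illustration:
    return [Hypothesis("joker two one, tower three four", 0.91),
            Hypothesis("joker toad one, tower three four", 0.42)]

best = recognize(None)[0].text
# At this point the system has text, but no idea what (if anything) to DO with it.
```

The point of the sketch is the return type: even a perfect recognizer hands downstream code nothing but strings, which is why recognition alone cannot drive behavior.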

Failures in the Recognition Phase
All speech recognizers are NOT created equal:
- Some are better than others.
- Many are tailored for a specific use case (e.g., dictation).
All speech producers are NOT created equal:
- Some speakers are more easily recognized than others.
- Speakers vary in their use of ASR 'best practices.'
- Speakers may bring unrealistic expectations to the system.

Phase 2: Understanding
Converting spoken words into text does NOT constitute understanding!
- Words, sentence structure, etc. all carry meanings that are (potentially) necessary.
- Speech understanding often depends on domain and context knowledge.
- Understanding does not come free with recognition.

Example Methods and Artifacts of Speech Understanding for ASR
- Semantic parsing
- Part-of-speech tagging
- Named-entity recognition
- Sentiment analysis
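As one illustration of semantic parsing, a rule-based parser can map a recognized radio call onto a domain-level meaning (an intent plus slots). This is a hedged sketch, not SoarTech's implementation: the single regex pattern and the `report_position` intent name are invented here for one call format only; a fielded system would need a parser covering the full domain grammar.

```python
import re

# Hypothetical grammar for ONE radio-call pattern: "<addressee>, <sender>, report <waypoint> ..."
REPORT_CMD = re.compile(r"(?P<addressee>\w+), (?P<sender>\w+), report (?P<waypoint>\w+)")

def understand(utterance: str):
    """Map recognized text to a domain-level meaning (intent + slots),
    or None if the text is recognized but not understood."""
    m = REPORT_CMD.match(utterance)
    if m is None:
        return None
    return {"intent": "report_position", **m.groupdict()}

understand("Joker21, Tower34, report Yankee or next 3 0")
# → {'intent': 'report_position', 'addressee': 'Joker21',
#    'sender': 'Tower34', 'waypoint': 'Yankee'}
```

Note the `None` branch: it is exactly the gap between Phase 1 and Phase 2 — text that the recognizer transcribed perfectly can still fail to produce any meaning.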

Phase 3: Behavior
Behavior: the ability to execute or react to the contents of speech once understood.
- Initiate or update a goal or activity
- Incorporate new situation awareness
- Behaviors are typically separated from speech processing.
- Behavior does not come free with recognition or understanding.

Phases of ASR: Example from Air Traffic Control
What behaviors are directed or implied here?
Tower34 (trainee): "Joker21, Tower34, report Yankee or next 3 0"
Joker21 (CGF): "Tower34, roger, will report at Yankee or next 3 0"
Joker21 (CGF): "Tower34, Joker21 at Yankee, proceeding to Sierra"
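The exchange above can be sketched in code to show the behavior phase: an understood command creates a goal inside the CGF pilot, and the goal later drives a radio call when the simulation reports the waypoint reached. The class and method names here are hypothetical illustrations, not an actual CGF architecture.

```python
class CgfPilot:
    """Toy computer-generated-force (CGF) pilot reacting to understood commands."""

    def __init__(self, callsign: str):
        self.callsign = callsign
        self.goals = []  # open commitments adopted from radio traffic

    def on_command(self, meaning: dict) -> str:
        """Phase 3: turn an understood command into behavior — here,
        acknowledge on the radio and adopt a 'report at waypoint' goal."""
        if meaning["intent"] == "report_position":
            self.goals.append(("report_at", meaning["waypoint"]))
            return f"{meaning['sender']}, roger, will report at {meaning['waypoint']}"
        return ""

    def on_waypoint_reached(self, waypoint: str) -> str:
        """Later, the simulation drives the open goal to completion."""
        if ("report_at", waypoint) in self.goals:
            self.goals.remove(("report_at", waypoint))
            return f"Tower34, {self.callsign} at {waypoint}"
        return ""

joker = CgfPilot("Joker21")
ack = joker.on_command({"intent": "report_position",
                        "sender": "Tower34", "waypoint": "Yankee"})
# ack → "Tower34, roger, will report at Yankee"
report = joker.on_waypoint_reached("Yankee")
# report → "Tower34, Joker21 at Yankee"
```

The key design point is the separation the slide describes: speech processing produces `meaning`, while the goal list and waypoint events belong to the behavior system.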

Phase 4: Dialog
Dialog: multi-turn interactions between the user/trainee and the ASR-supported system (CGF).
Requires:
- Representation of, and the ability to reference, dialog history
- Domain situation awareness
- Maintenance of interactions over time
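The dialog requirements above can be sketched as a minimal state tracker (all names hypothetical): each turn is appended to a history, commands open commitments, and a later turn can close a commitment opened several turns earlier — something no single-utterance pipeline can do.

```python
class DialogState:
    """Toy multi-turn dialog state for the ATC-style exchange."""

    def __init__(self):
        self.history = []   # (speaker, meaning) for every turn, in order
        self.pending = {}   # waypoint -> speaker who requested the report

    def add_turn(self, speaker: str, meaning: dict) -> None:
        self.history.append((speaker, meaning))
        if meaning.get("intent") == "report_position":
            # The command opens a commitment the CGF must later close.
            self.pending[meaning["waypoint"]] = speaker

    def resolve_report(self, waypoint: str):
        """A later 'at <waypoint>' turn closes the earlier commitment;
        returns who originally requested it, or None if nothing matches."""
        return self.pending.pop(waypoint, None)

state = DialogState()
state.add_turn("Tower34", {"intent": "report_position", "waypoint": "Yankee"})
state.add_turn("Joker21", {"intent": "acknowledge"})
state.resolve_report("Yankee")  # → "Tower34"
```

The `history` list supports referencing earlier turns, while `pending` is the maintenance-over-time piece: state that must survive across turns for the dialog to make sense.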

Summary
- ASR has many potential benefits for training simulations but is a complex capability to implement.
- Developing and integrating an ASR capability requires addressing multiple phases.
- Failures can occur at any phase of the ASR process and are often misdiagnosed as simple speech-recognition failures.

Questions?
Q1: Can't you just integrate Siri into the system? (Short answer: no.)
Q2: What is the state of current research in ASR?
Research and development referred to in this presentation has been funded over the past 15 years through multiple sources, including the Office of Naval Research (ONR), the Air Force Research Laboratory (AFRL), the Army Research Laboratory (ARL), and the Defense Advanced Research Projects Agency (DARPA). Ongoing research is currently funded via contracts #N C-0010 and #FA C-6593, as well as internal SoarTech and NAWCTSD research and development efforts.