DIALOG SYSTEMS FOR AUTOMOTIVE ENVIRONMENTS Presenter: Joseph Picone Inst. for Signal and Info. Processing Dept. Electrical and Computer Eng. Mississippi.

Presentation transcript:

DIALOG SYSTEMS FOR AUTOMOTIVE ENVIRONMENTS
Presenter: Joseph Picone, Inst. for Signal and Info. Processing, Dept. Electrical and Computer Eng., Mississippi State University
Co-Authors: Julie Baca, Feng Zheng, Hualin Gao, Center for Advanced Vehicular Systems, Mississippi State University, Mississippi State, Mississippi
URL:
EUROSPEECH

INTRODUCTION: IN-VEHICLE DIALOG SYSTEMS
- In-vehicle dialog systems improve information access.
- Advanced user interfaces enhance workforce training and increase manufacturing efficiency.
- Noise robustness is needed in both environments to improve recognition performance.
- Advanced statistical models and machine learning technology.
- Multidisciplinary team (IE, ECE, CS).

SYSTEM ARCHITECTURE: DARPA COMMUNICATOR FRAMEWORK
[Figure: dialog system architecture]

SYSTEM ARCHITECTURE: PUBLIC DOMAIN ASR
- Uses the publicly available ISIP speech recognition toolkit.
- Implements a standard HMM-based, speaker-independent continuous speech recognition system.
- Complete toolkits available for many popular tasks, including conversational speech.
- On-line educational materials.
- Extensive documentation.

SYSTEM ARCHITECTURE: ASR SYSTEM COMPONENTS
- Transduction: Andrea NC-65 head-mounted microphone
- Feature extraction: standard 39-element MFCCs
- Acoustic modeling: 8-mixture Gaussian HMMs
- Lexicon: 7,100 words (5K WSJ, 2K names)
- Language modeling: interpolated bigram (perplexity: ~70)
- Search: hierarchical Viterbi beam
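As an illustration of the "interpolated bigram" entry above, here is a minimal sketch of one common form of interpolation, a fixed-weight mix of maximum-likelihood bigram and unigram estimates, together with a perplexity computation. The function names, the weight `lam`, and the toy sentences are assumptions made for this example; the slides do not say how the actual language model was estimated or smoothed.

```python
import math
from collections import Counter


def train_counts(sentences):
    """Collect unigram and bigram counts, padding each sentence with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    for words in sentences:
        tokens = ["<s>"] + words + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def interpolated_prob(prev, word, unigrams, bigrams, lam=0.7):
    """P(word | prev) = lam * P_ML(word | prev) + (1 - lam) * P_ML(word).

    lam is an assumed interpolation weight, not a value from the slides.
    """
    total = sum(unigrams.values())
    p_uni = unigrams[word] / total
    p_bi = bigrams[(prev, word)] / unigrams[prev] if unigrams[prev] else 0.0
    return lam * p_bi + (1.0 - lam) * p_uni


def perplexity(sentences, unigrams, bigrams, lam=0.7):
    """Perplexity = exp of the negative average log-probability per predicted token."""
    log_sum, count = 0.0, 0
    for words in sentences:
        tokens = ["<s>"] + words + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            log_sum += math.log(interpolated_prob(prev, word, unigrams, bigrams, lam))
            count += 1
    return math.exp(-log_sum / count)


if __name__ == "__main__":
    # Toy training data, invented for the example.
    corpus = [["drive", "to", "campus"], ["go", "to", "kroger"]]
    uni, bi = train_counts(corpus)
    print(perplexity([["drive", "to", "kroger"]], uni, bi))
```

This sketch assigns zero probability to words never seen in training, so the perplexity call fails on such input; a real system would add smoothing or an unknown-word token.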

SYSTEM ARCHITECTURE: NATURAL LANGUAGE UNDERSTANDING
- Uses the Phoenix semantic case frame parser from Colorado Univ. (CU).
- Employs a semantic grammar consisting of case frames with named slots.

FRAME: Drive
    [route]
    [distance]

[route]
    (*IWANT [go_verb][arrive_loc])
IWANT
    (I want *to)
    (I would *like *to)
    (I will)
    (I need *to)
[go_verb]
    (go)
    (drive)
    (get)
    (reach)
[arrive_loc]
    [*to [placename][cityname]]

SYSTEM ARCHITECTURE: NLU MODULE
- Accepts ungrammatical input, e.g. “I want… I need to drive to the campus post office.”
- Current version of the semantic grammar contains over 500 rules and 2000 words.
- Developed from a pilot test corpus of sentence patterns.

Example parse:
Route
    IWANT: “I need to”
    go_verb: “drive”
    arrive_loc
        placename: “post office”
        cityname: “campus”
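The slot-filling behaviour described above can be sketched in a few lines. This is not the Phoenix parser or its API; it is a simplified matcher written for illustration. The `IWANT` and `[go_verb]` patterns are copied from the grammar fragment, a leading `*` marks an optional word, and the `PLACENAME` and `CITYNAME` lexicons are hypothetical entries chosen only to cover the example sentence. Skipping words that match nothing is how this sketch tolerates the false start in “I want… I need to drive…”.

```python
import re

# Patterns copied from the slide; a leading "*" marks an optional word.
IWANT = ["I want *to", "I would *like *to", "I will", "I need *to"]
GO_VERB = ["go", "drive", "get", "reach"]
# Hypothetical lexicon entries, chosen only to cover the example sentence.
PLACENAME = ["post office"]
CITYNAME = ["campus"]


def match_at(pattern, words, start):
    """Match one pattern at words[start:]; return the end index or None."""
    tokens = pattern.lower().split()

    def step(t, w):
        if t == len(tokens):
            return w
        optional = tokens[t].startswith("*")
        literal = tokens[t].lstrip("*")
        if w < len(words) and words[w] == literal:
            end = step(t + 1, w + 1)
            if end is not None:
                return end
        # An optional word may be skipped; a required word may not.
        return step(t + 1, w) if optional else None

    return step(0, start)


def match_any(patterns, words, start):
    """Return (matched_text, end) for the longest matching pattern, else None."""
    best = None
    for pattern in patterns:
        end = match_at(pattern, words, start)
        if end is not None and (best is None or end > best[1]):
            best = (" ".join(words[start:end]), end)
    return best


def find_name(names, words, start):
    """Return the first lexicon entry found anywhere in words[start:], else None."""
    for i in range(start, len(words)):
        hit = match_any(names, words, i)
        if hit:
            return hit[0]
    return None


def parse_route(utterance):
    """Fill the [route] slots, skipping leading words that match nothing."""
    words = re.findall(r"[a-z']+", utterance.lower())
    for start in range(len(words)):
        want = match_any(IWANT, words, start)  # IWANT is optional in the rule
        verb = match_any(GO_VERB, words, want[1] if want else start)
        if not verb:
            continue
        loc_start = match_at("*to", words, verb[1])  # optional "to"
        return {
            "IWANT": want[0] if want else None,
            "go_verb": verb[0],
            "placename": find_name(PLACENAME, words, loc_start),
            "cityname": find_name(CITYNAME, words, loc_start),
        }
    return None


if __name__ == "__main__":
    print(parse_route("I want… I need to drive to the campus post office."))
```

The matcher lowercases its input, so slot values come back in lower case.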

SYSTEM ARCHITECTURE: DIALOG MANAGER
- Controls interaction between user and system.
- Accepts parsed input from the NLU module.
- Determines the data requested, obtains the data, and controls presentation to the user.
- Handles clarification if necessary:
    User: “How can I get to campus?”
    System: “Are you going to a specific location on campus?”
    User: “I am trying to find engineering.”
    System: “Which department in engineering?”

SYSTEM ARCHITECTURE: DIALOG MANAGER
- Derived from the CU toolkit.
- Bulk of development lies in construction of domain-specific frames, rules, and slots.
- Example frames and associated queries:
    Drive_Direction: “How can I get from Lee Boulevard to Kroger?”
    Drive_Address: “Where is the campus bakery?”
    Drive_Distance: “How far is China Garden?”
    Drive_Quality: “Find me the most scenic route to Scott Field.”
    Drive_Turn: “I am on Nash Street. What’s my next turn?”
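The clarification behaviour can be sketched as frame-based slot filling: while a required slot is empty, the manager asks the prompt attached to that slot, and once all slots are filled it issues the query. The `Frame` class, the slot names (`area`, `location`, `department`), and the first prompt are hypothetical and are not taken from the CU toolkit; the other two prompts are the system turns from the clarification dialog shown earlier.

```python
from dataclasses import dataclass, field

# Prompts keyed by slot name. The "area" prompt is invented for this sketch;
# the other two are the system turns from the clarification dialog.
PROMPTS = {
    "area": "Where would you like to go?",
    "location": "Are you going to a specific location on campus?",
    "department": "Which department in engineering?",
}


@dataclass
class Frame:
    """A dialog frame: a named set of required slots plus the values filled so far."""
    name: str
    required: list
    prompts: dict
    slots: dict = field(default_factory=dict)

    def missing(self):
        """Required slots that have no value yet, in declaration order."""
        return [slot for slot in self.required if slot not in self.slots]


def next_action(frame):
    """Ask about the first missing slot, or issue the query when none are missing."""
    missing = frame.missing()
    if missing:
        return ("clarify", frame.prompts[missing[0]])
    return ("query", dict(frame.slots))


if __name__ == "__main__":
    frame = Frame("Drive_Direction", ["area", "location", "department"], PROMPTS)
    frame.slots["area"] = "campus"           # "How can I get to campus?"
    print(next_action(frame))
    frame.slots["location"] = "engineering"  # "I am trying to find engineering."
    print(next_action(frame))
```

Keeping prompts in data rather than in code fits the note above that most of the development effort goes into domain-specific frames, rules, and slots.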

APPLICATION DEVELOPMENT: GIS BACKEND
- Geographic Information System (GIS) contains map routing data for MSU and the surrounding area.
- Dialog manager (DM) first determines the nature of the query, then:
    - obtains route data from the GIS database
    - handles presentation of the data to the user
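The slides do not describe how the GIS computes routes. As a stand-in, the sketch below runs Dijkstra's shortest-path algorithm over a tiny road graph. The place names are borrowed from the example queries, but the connections and distances (in meters) are invented for illustration and do not reflect the real map data.

```python
import heapq

# Hypothetical road graph: edge weights are made-up distances in meters.
ROADS = {
    "Lee Boulevard": {"Nash Street": 800, "Scott Field": 1900},
    "Nash Street": {"Lee Boulevard": 800, "Kroger": 3200},
    "Scott Field": {"Lee Boulevard": 1900, "Kroger": 1600},
    "Kroger": {"Nash Street": 3200, "Scott Field": 1600},
}


def shortest_route(graph, origin, destination):
    """Dijkstra's algorithm; returns (total_distance, path) or None if unreachable."""
    queue = [(0, origin, [origin])]
    visited = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == destination:
            return dist, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, length in graph[node].items():
            if neighbor not in visited:
                heapq.heappush(queue, (dist + length, neighbor, path + [neighbor]))
    return None


if __name__ == "__main__":
    print(shortest_route(ROADS, "Lee Boulevard", "Kroger"))
```

With a backend of this shape, a Drive_Distance query could report the returned distance and a Drive_Direction query the returned path.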

APPLICATION DEVELOPMENT: PILOT SYSTEM
Obtained domain-specific data by conducting three pilot experiments, each consisting of two phases:
1. Initial data gathering and system testing
    - Tested 276 spontaneously input sentences against the initial grammar
2. Retesting after enhancing the LM and semantic grammar
    - Expanded semantic frames from 2 to 9
Initial efforts focused on reducing OOV utterances and parsing errors for the NLU module.
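For readers unfamiliar with the OOV measure, here is a minimal sketch of one way to compute it. It assumes that an "OOV utterance" is one containing at least one word outside the recognizer lexicon; the slides do not define the term precisely, and the function name and sample data are hypothetical.

```python
def oov_utterance_rate(utterances, lexicon):
    """Percentage of utterances with at least one word outside the lexicon."""
    if not utterances:
        return 0.0
    vocab = {word.lower() for word in lexicon}
    oov_count = sum(
        1 for utterance in utterances
        if any(word not in vocab for word in utterance.lower().split())
    )
    return 100.0 * oov_count / len(utterances)


if __name__ == "__main__":
    # Toy lexicon and test sentences, invented for the example.
    lexicon = ["how", "far", "is", "kroger", "where", "the", "bakery"]
    tests = ["how far is kroger", "where is the bakery", "where is the stadium"]
    print(oov_utterance_rate(tests, lexicon))
```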

APPLICATION DEVELOPMENT: RESULTS

Refinements to NLU System (one Pre/Post pair per test):

Vers      Pre    Post   Pre    Post   Pre    Post
OOV       25%    0%     36%    0%     4%     0%
Parser    80%    3%     60%    5%     46%    11%

Overall System Enhancements:

Test No.   NLU Parser Error Rate   DM Error Rate
1          43%                     49%
2          6%                      3%

SUMMARY AND CONCLUSIONS: WIZARD OF OZ DATA
- Users participate in multiple scenarios in which they use the system to query for information (e.g., xxx yyy zzz qqq lll).
- Tasks vary across scenarios according to the role the user plays:
    - First-time visitors
    - New residents
    - Long-time residents

SUMMARY AND CONCLUSIONS: FURTHER DEVELOPMENT
- Established capability for further research in workforce training and other related domains.
- Confirmed the cost of domain-specific development for dialog systems.

SUMMARY: RELEVANT RESOURCES
- CAVS Dialog System: review our experimental results and download the in-vehicle prototype architecture and associated components.
- Natural Language and Dialog Management Toolkits (CU): explore tools to build NLU and DM components for a specific domain.
- Speech Recognition Toolkit (ISIP): examine a state-of-the-art public domain ASR toolkit for integration in a dialog system.