Improved Name Recognition with Meta-data Dependent Name Networks, by Sameer R. Maskey, Michiel Bacchiani, Brian Roark, and Richard Sproat.

Presentation transcript:

Improved Name Recognition with Meta-data Dependent Name Networks, published by Sameer R. Maskey, Michiel Bacchiani, Brian Roark, and Richard Sproat; presented by Irina Likhtina.

Problem: General Name Transcription Improvement
- Solutions that use no prior knowledge require a large increase in size, complexity, and ambiguity
- Any improvement accomplished previously was possible only through an increase in the lexicon used

Paper’s Solution
Use meta-data at runtime in the form of:
- Caller ID string ("cname")
- Name of mailbox owner ("mname")
This meta-data is reasonably available due to the prevalence of caller identification provided by phone companies.

Database Used
- Scanmail training corpus: 100 hours of voicemail messages from 140 AT&T employees
- Manually transcribed with "cname" and "mname" tags
- Gender balanced
- ~12% non-native speakers
- 238 random messages held out for testing; everything else (~10,000 messages) used for training

Approach
Three steps of the algorithm:
1. Create a class-based language model
2. Create a name network that provides instances for the classes of the model
3. At runtime, replace the class tokens in the language model with the name networks

Class-Based Language Model
- Manual tags of the mailbox name and caller name in each message are replaced with "mname" and "cname" labels
- "mname" and "cname" represent the two class tokens that can later be substituted with any values
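As an illustrative sketch of this substitution step (the tag format below is hypothetical; the actual corpus markup may differ), replacing tagged name spans with their class tokens for language-model training might look like:

```python
import re

def to_class_tokens(transcript: str) -> str:
    """Replace a tagged name span with its class label.

    e.g. "<cname>bob jones</cname>" becomes the single token "cname".
    """
    return re.sub(r"<(cname|mname)>.*?</\1>", r"\1", transcript)

msg = "hi <mname>alice</mname> this is <cname>bob jones</cname> calling"
print(to_class_tokens(msg))
# hi mname this is cname calling
```

The language model is then trained on these class-token transcripts, so it never needs to see every possible name at training time.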

Name Network
- To get the values for "mname" and "cname", an internal AT&T employee directory listing (~40,000 people) was used
- "cname" is created from variations of static titles (Miss, Mr.), full first names and nicknames (Alexander, Alex), and last names (Jones)

Name Network (continued)
- Probability within a class: estimated from the training corpus
- Probability within first names: estimated from the AT&T directory listing
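A minimal sketch of how weighted caller-name variants could be enumerated (the title and nickname probabilities here are invented for illustration; the paper builds these as weighted finite-state networks from the directory, not Python dicts):

```python
from itertools import product

# Assumed priors for illustration only.
titles = {"": 0.6, "mr": 0.2, "miss": 0.2}      # empty string = no title
first_forms = {"alexander": 0.7, "alex": 0.3}   # full first name vs. nickname
last = "jones"

def cname_variants():
    """Enumerate caller-name variants with joint probabilities."""
    for (title, p_t), (first, p_f) in product(titles.items(), first_forms.items()):
        name = " ".join(w for w in (title, first, last) if w)
        yield name, p_t * p_f

for name, p in cname_variants():
    print(f"{p:.2f}  {name}")
```

Since the title and first-name distributions are independent factors here, the variant probabilities sum to one, mirroring how a weighted name network distributes probability mass over its paths.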

Replacement in the ASR Network
State of the art before the proposed algorithm:
- Off-line composition, determinization, and optimization of one existing grammar G and one lexicon L
- The proposed algorithm is impractical with off-line composition because of the many variations needed
Proposed algorithm:
- L-by-G optimization is done for each class, using a G specific to that class
- Small overhead compared to off-line optimization
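As a toy illustration only (the real system splices optimized, class-specific L-by-G sub-networks into the decoding graph; this dict-based expansion merely shows the idea of per-message class replacement, and all names are hypothetical):

```python
# One path through the class-based LM, as a word sequence.
base_lm_paths = [["hi", "mname", "this", "is", "cname", "calling"]]

def expand(paths, class_networks):
    """Replace each class token with its message-specific variants."""
    expanded = []
    for path in paths:
        results = [[]]
        for tok in path:
            variants = class_networks.get(tok, [[tok]])
            results = [r + v for r in results for v in variants]
        expanded.extend(results)
    return expanded

# Message-specific candidates derived from meta-data for this one message.
nets = {"mname": [["alice"]], "cname": [["bob", "jones"], ["robert", "jones"]]}
for p in expand(base_lm_paths, nets):
    print(" ".join(p))
```

The key point the slide makes is that only these small, message-specific class networks are built at runtime; the large language model itself stays fixed.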

Experimental Results
- Word Error Rate (WER) improvement is small: absolute reduction of 0.6%
- Name Error Rate (NER) improvement is significant: absolute reduction of 20%
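WER is the standard ASR metric: the word-level edit distance (substitutions, insertions, deletions) between the reference and the hypothesis, divided by the reference length. A minimal implementation, for readers unfamiliar with the metric:

```python
def wer(ref: str, hyp: str) -> float:
    """Word Error Rate: word-level edit distance / reference length."""
    r, h = ref.split(), hyp.split()
    # Standard dynamic-programming edit distance over words.
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(r)][len(h)] / len(r)

print(wer("call bob jones today", "call rob jones today"))
# 0.25  (one substitution out of four reference words)
```

NER is computed the same way but restricted to the name tokens, which is why a 20% absolute reduction there can coexist with only a 0.6% reduction in overall WER.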

Conclusions
The large reduction in NER is critical:
- Name transcription is the goal
- Scanmail users expressed a strong desire for the system to recognize names correctly
Other improvements:
- Fewer OOV errors
- Better in-vocabulary name recognition
- No significant increase in complexity, with good name coverage
- No need for manual design when the system is moved to a new environment