Latent Topic Modeling of Word Vicinity Information for Speech Recognition
Kuan-Yu Chen, Hsuan-Sheng Chiu, Berlin Chen
ICASSP 2010
Presented by Hao-Chin Chang
Department of Computer Science & Information Engineering, National Taiwan Normal University

Outline
- Introduction
- Related work
- Experiment
- Conclusion

Introduction
A new topic language model, named the word vicinity model (WVM), is proposed to explore the co-occurrence relationships between words, as well as long-span latent topical information, for language model adaptation.
WVM is close in spirit to the word topic model (WTM), but it has a more concise parameterization, which leads to more reliable model estimation, particularly when the available training data is of limited size.
We also present a new modeling approach to WTM, and design an efficient way to simultaneously extract the "word-document" and "word-word" co-occurrence dependencies inherent in the adaptation corpus through the sharing of a common set of latent topics.

DTM
For each decoded word w_i, we can interpret its corresponding search history H_{w_i} as a document (or history) topic model used to predict the occurrence probability of w_i.
The PLSA probability and the background n-gram (e.g., trigram) language model can be combined through simple linear interpolation.
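The slide's equations did not survive transcription; the following LaTeX is a sketch of the standard PLSA-style formulation, with T_k denoting the K shared latent topics and lambda an interpolation weight tuned on held-out data (the notation here is ours, not necessarily the paper's):

```latex
\begin{align}
P_{\mathrm{PLSA}}(w_i \mid H_{w_i}) &= \sum_{k=1}^{K} P(w_i \mid T_k)\, P(T_k \mid H_{w_i}), \\
P(w_i \mid H_{w_i}) &\approx \lambda\, P_{\mathrm{PLSA}}(w_i \mid H_{w_i})
  + (1-\lambda)\, P_{\mathrm{trigram}}(w_i \mid w_{i-2}, w_{i-1}).
\end{align}
```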

WTM
The WTM model of each word w_j, used for predicting the occurrence of a particular word w_i, can be expressed as a topic mixture associated with w_j.
During speech recognition, for the search history H_{w_i} = w_1, w_2, ..., w_{i-1} of a decoded word w_i, we can linearly combine the WTM models associated with the words occurring in H_{w_i} to form a composite WTM model for predicting w_i.
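As a sketch (the slide's formulas were lost in transcription; this follows the common WTM formulation, so treat the details as an assumption):

```latex
\begin{align}
P_{\mathrm{WTM}}(w_i \mid w_j) &= \sum_{k=1}^{K} P(w_i \mid T_k)\, P(T_k \mid w_j), \\
P_{\mathrm{WTM}}(w_i \mid H_{w_i}) &= \sum_{w_j \in H_{w_i}} \alpha_{w_j}\, P_{\mathrm{WTM}}(w_i \mid w_j),
\qquad \sum_{w_j \in H_{w_i}} \alpha_{w_j} = 1,
\end{align}
```

where, in the literature, the weights alpha_{w_j} are often set to decay with the distance of w_j from w_i; treat that choice as an assumption here as well.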

WVM
WVM projects the relationship between words into a low-dimensional probability space characterized by a shared set of topic distributions.
Each history word w_j is used to predict another decoded word w_i.
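A sketch of the vicinity formulation, reconstructed from the description above (the exact equations were lost; we assume the standard form in which WVM replaces WTM's per-word topic mixtures P(T_k | w_j) with a single shared topic prior P(T_k), which is what makes its parameterization more concise):

```latex
\begin{align}
P_{\mathrm{WVM}}(w_i, w_j) &= \sum_{k=1}^{K} P(w_i \mid T_k)\, P(w_j \mid T_k)\, P(T_k), \\
P_{\mathrm{WVM}}(w_i \mid w_j) &= \frac{P_{\mathrm{WVM}}(w_i, w_j)}{\sum_{w} P_{\mathrm{WVM}}(w, w_j)}.
\end{align}
```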

Comparison
1. Relations: DTM models word-document relations, while WTM and WVM model word-word relations.
2. Topic weight: DTM and WTM estimate their topic weights by Expectation Maximization, while WVM uses the basic topic distribution.
3. Parameterization: DTM and WTM are complex, while WVM is concise.

Hybrid
We present a novel approach to simultaneously capturing the "word-document" and "word-word" co-occurrence relationships inherent in the corpus through a common set of latent topics.
During speech recognition, we can linearly combine PLSA and WTM to form a hybrid topic language model.
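A minimal Python sketch of this linear combination (the model interfaces and the weight values are hypothetical illustrations, not the authors' code):

```python
def hybrid_prob(w_i, history, p_plsa, p_wtm, p_trigram, alpha=0.3, beta=0.3):
    """Hybrid topic LM probability via linear interpolation.

    p_plsa, p_wtm, p_trigram: callables returning P(w | history) under the
    PLSA, WTM, and background trigram models respectively (hypothetical
    interfaces). alpha and beta are interpolation weights, alpha + beta <= 1,
    typically tuned on held-out data.
    """
    return (alpha * p_plsa(w_i, history)
            + beta * p_wtm(w_i, history)
            + (1.0 - alpha - beta) * p_trigram(w_i, history))
```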

Experiment
The speech corpus consists of about 200 hours of MATBN Mandarin broadcast news.
A 25-hour subset compiled from November 2001 to December 2002 was used to bootstrap acoustic training with the minimum phone error (MPE) criterion and a training data selection scheme.
Another 3-hour subset collected in 2003 was reserved for development (1.5 hours) and testing (1.5 hours).
The vocabulary size is about 72 thousand words.
The n-gram language models used in this paper are trigram and bigram models, trained with the SRI Language Modeling Toolkit (SRILM) on a background text corpus of 170 million Chinese characters collected from the Central News Agency (CNA) in 2001 and 2002 (the Chinese Gigaword Corpus released by LDC).
The adaptation text corpus used for training PLSA, LDA, WTM, and WVM was collected from MATBN 2001, 2002, and 2003; it consists of one million Chinese characters of orthographic broadcast news transcripts.

Experimental results

Reduction               PLSA     LDA      WTM      WVM
Perplexity              23.1%    21.8%    26.1%    24.2%
Character error rate    4.1%     5.5%              4.6%
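For reference, the perplexity figures above are presumably relative reductions in test-set perplexity, where perplexity is conventionally defined as:

```latex
\mathrm{PP}(w_1^N) = P(w_1, \ldots, w_N)^{-1/N}
  = \exp\!\Big(-\frac{1}{N} \sum_{i=1}^{N} \log P(w_i \mid w_1^{i-1})\Big).
```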

Conclusion
Future work will:
1. Investigate more elaborate window functions for WVM and WTM.
2. Seek discriminative training algorithms for training WVM and WTM.
3. Apply WVM to speech retrieval and summarization tasks.