Speed-up Facilities in s3.3

GMM Computation
- Frame-Level: not implemented
- Senone-Level: not implemented
- Gaussian-Level: SVQ-based GMM Selection (number of sub-vectors constrained to 3)
- Component-Level: SVQ code removed

Search
- Lexicon Structure: Tree (standard)
- Pruning: standard
- Heuristic Search Speed-up: not implemented
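The SVQ-based selection above can be sketched in a few lines. This is an illustrative sketch, not the actual s3.3 code: the function names, the codebook layout, and the precomputed score tables are all assumptions. Each sub-vector of the feature is quantized against its own small codebook, a lookup table gives an approximate per-Gaussian score for that codeword, and only the best-scoring Gaussians are then evaluated exactly:

```python
import numpy as np

def svq_select_gaussians(x, codebooks, score_tables, top_n=2):
    """SVQ-based Gaussian selection sketch (illustrative names).

    x            : 1-D feature vector (e.g. a 39-dim MFCC frame)
    codebooks    : list of (codewords, dims) pairs; `dims` are the
                   feature indices forming one sub-vector (s3.3
                   constrained the number of sub-vectors to 3)
    score_tables : per sub-vector, a (n_codewords, n_gaussians) table
                   of precomputed approximate scores

    Each sub-vector is quantized to its nearest codeword; summing the
    table rows approximates the full log-likelihood, and only the
    top_n Gaussians are returned for exact evaluation.
    """
    approx = None
    for (codewords, dims), table in zip(codebooks, score_tables):
        sub = x[dims]
        # nearest codeword by squared Euclidean distance
        idx = int(((codewords - sub) ** 2).sum(axis=1).argmin())
        contrib = table[idx]  # approximate score of every Gaussian
        approx = contrib if approx is None else approx + contrib
    # indices of the Gaussians worth an exact evaluation
    return list(np.argsort(approx)[::-1][:top_n])
```

The cheap table lookups replace most of the per-Gaussian Mahalanobis computations, which is where the speed-up comes from.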

Summary of Speed-up Facilities in s3.4

GMM Computation
- Frame-Level: (New) Naïve Down-Sampling; (New) Conditional Down-Sampling
- Senone-Level: (New) CI-based GMM Selection; (New) VQ-based GMM Selection
- Gaussian-Level: (New) Unconstrained number of sub-vectors in SVQ-based GMM Selection
- Component-Level: (New) SVQ code enabled

Search
- Lexicon Structure: Tree
- Pruning: (New) Improved Word-end Pruning
- Heuristic Search Speed-up: (New) Phoneme Look-ahead
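Two of the new frame- and senone-level techniques are simple enough to sketch; the names below are illustrative, not the s3.4 API. Naive down-sampling fully evaluates GMMs only on every k-th frame (other frames reuse the previous scores), and CI-based GMM selection evaluates the small context-independent senone set first, fully evaluating only those context-dependent senones whose base CI senone scores within a beam of the best:

```python
def downsampled_frames(n_frames, rate=2):
    """Naive down-sampling sketch: GMMs are fully computed only on
    every `rate`-th frame; intermediate frames reuse the last scores."""
    return [t for t in range(n_frames) if t % rate == 0]

def ci_gmm_select(ci_scores, cd_to_ci, beam=10.0):
    """CI-based GMM selection sketch.

    ci_scores : log-likelihood of each CI senone for the current frame
                (cheap: only a few dozen GMMs)
    cd_to_ci  : maps each CD senone index to its base CI senone index

    Returns the CD senones selected for exact evaluation; the rest
    would back off to their CI senone's score.
    """
    best = max(ci_scores)
    return [cd for cd, ci in enumerate(cd_to_ci)
            if ci_scores[ci] >= best - beam]
```

Conditional down-sampling would refine the first function by forcing a full computation whenever the reused scores become unreliable, rather than on a fixed schedule.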

Near-Term Improvement of Decoder
- Improve LM facilities (available Mar 31)
- Improve speed-up techniques (available Mar 31)
  - Complete phoneme look-ahead research
  - Complete machine optimization on the Intel platform
- Enable speed-up in live-mode recognition (available Mar 31)
- Improve the search structure
  - Modify code to use lexical-tree copies (Apr 15)
  - Modify code to handle cross-word triphones (Apr 30)
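The phoneme look-ahead item above can be sketched as a beam over cheap approximate phone scores; everything here is an illustrative assumption, not the Sphinx-3 implementation. Before the decoder expands into a phone arc of the lexical tree, an approximate score for that phone over the next few frames (e.g. from CI phone models) is compared against the best look-ahead score, and out-of-beam phones are pruned:

```python
def phoneme_lookahead_prune(active_phones, lookahead_scores, beam=5.0):
    """Phoneme look-ahead pruning sketch (illustrative names).

    active_phones    : candidate phone arcs about to be expanded
    lookahead_scores : cheap approximate score per phone over the next
                       few frames, indexable by phone id

    Phones scoring more than `beam` below the best candidate are
    dropped before the expensive tree/cross-word expansion.
    """
    best = max(lookahead_scores[p] for p in active_phones)
    return [p for p in active_phones
            if lookahead_scores[p] >= best - beam]
```

The trade-off is the usual one for look-ahead pruning: a tighter beam saves more search effort but risks pruning the correct phone when the cheap scores are inaccurate.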

Training Plan
- Text processing (available Mar 31)
- First pass of acoustic/language modeling (available Apr 15)
  - With the help of the new 4-CPU machine
  - Training using the standard recipe
  - CD + CI first-pass models
  - Trigram language models
- Second pass of acoustic/language modeling
  - Improved training
- Decide what to do after we get the results
- AM/LM adaptation? (don't know yet)