Published by Francis Bond. Modified over 9 years ago.
1
The HTK Book (for HTK Version 3.2.1) Young et al., 2002
2
Chapter 1: The Fundamentals of HTK. HTK is a toolkit for building hidden Markov models (HMMs). It is primarily used to build automatic speech recognition (ASR) systems, but also other HMM-based systems, e.g. speaker and image recognition and automatic text summarization. HTK has tools (modules) for both training and testing HMM systems.
3
How to Train and Test an ASR? Things needed: A labeled speech corpus and a dictionary (+ grammar). Procedure: 1. Divide corpus into training, development and test sets. 2. Train acoustic models. 3. Test, retrain, test … on the development set. 4. Test on the test data.
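Step 1 of the procedure can be sketched in a few lines of Python. This is only an illustration: the 80/10/10 proportions, the fixed seed and the utterance IDs are assumptions, not values from the slides.

```python
import random

def split_corpus(utterance_ids, train=0.8, dev=0.1, seed=0):
    """Shuffle utterance IDs and split them into train/dev/test lists."""
    ids = list(utterance_ids)
    random.Random(seed).shuffle(ids)  # fixed seed keeps the split reproducible
    n_train = int(len(ids) * train)
    n_dev = int(len(ids) * dev)
    return (ids[:n_train],
            ids[n_train:n_train + n_dev],
            ids[n_train + n_dev:])  # whatever remains is the test set

# Hypothetical utterance IDs, for illustration only.
train_set, dev_set, test_set = split_corpus([f"utt{i:03d}" for i in range(100)])
```

Keeping the test set untouched until step 4 is the point of the three-way split: the development set absorbs all the retrain-and-retest cycles.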
4
How to Build an ASR Using HTK? Goal: A recognizer for voice dialing. Grammar: ( SENT-START ( DIAL <$digit> | (PHONE|CALL) $name) SENT-END )
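To make the task grammar concrete, here is a rough Python sketch that checks whether a word string is accepted by it. It assumes, as in the HTK Book tutorial, that DIAL is followed by one or more digits; the digit and name lists are illustrative placeholders, and the regular expression merely stands in for the word network that HTK would compile from the grammar.

```python
import re

# Placeholder vocabularies (assumptions, not taken from the slides).
DIGITS = ["ONE", "TWO", "THREE", "FOUR", "FIVE", "SIX",
          "SEVEN", "EIGHT", "NINE", "OH", "ZERO"]
NAMES = ["STEVE YOUNG", "PHIL WOODLAND"]

digit = "|".join(DIGITS)
name = "|".join(NAMES)

# SENT-START ( DIAL digit+ | (PHONE|CALL) name ) SENT-END
PATTERN = re.compile(
    rf"SENT-START (DIAL( ({digit}))+|(PHONE|CALL) ({name})) SENT-END"
)

def accepts(sentence):
    """Return True if the whole word string matches the task grammar."""
    return PATTERN.fullmatch(sentence) is not None
```

For example, `accepts("SENT-START DIAL ONE TWO SENT-END")` holds, while a name after DIAL is rejected.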
5
Creating a Dictionary: HDMan builds the pronunciation dictionary and outputs a list of the phones it uses. An HMM will be estimated for each of these phones.
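The relationship between the dictionary and the phone list can be sketched as follows. This is not HDMan itself, only a small Python illustration of collecting the distinct phones from dictionary lines of the form "WORD [OUTSYM] phone phone ..."; the toy entries and their pronunciations are assumptions.

```python
def phone_list(dict_lines):
    """Collect the sorted set of phones used in pronunciation entries."""
    phones = set()
    for line in dict_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        # Drop the word itself and any bracketed output symbol such as [].
        phones.update(f for f in fields[1:] if not f.startswith("["))
    return sorted(phones)

# Toy dictionary entries, for illustration only.
toy_dict = [
    "CALL k ao l sp",
    "DIAL d ay ax l sp",
    "PHONE f ow n sp",
    "SENT-END [] sil",
]
phones = phone_list(toy_dict)
```

Each name in the resulting list corresponds to one HMM to be trained.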
6
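Recording the Data: HSLab (invoked as HSLab noname) records the speech. HSGen generates random prompt sentences from the word network (wdnet) and the dictionary (dict); here they are saved as testprompts.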
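The idea of generating prompts by sampling from the task grammar can be imitated in Python. The sketch below is not HSGen: it hard-codes the voice-dialing grammar, and the vocabularies, the 50/50 branch choice and the maximum of seven digits are all assumptions made for illustration.

```python
import random

# Placeholder vocabularies (assumptions).
DIGITS = ["ONE", "TWO", "THREE", "FOUR", "FIVE", "SIX",
          "SEVEN", "EIGHT", "NINE", "OH", "ZERO"]
NAMES = ["STEVE YOUNG", "PHIL WOODLAND"]

def random_sentence(rng, digits, names, max_digits=7):
    """Sample one sentence from: DIAL digit+ | (PHONE|CALL) name."""
    if rng.random() < 0.5:
        n = rng.randint(1, max_digits)
        words = ["DIAL"] + [rng.choice(digits) for _ in range(n)]
    else:
        words = [rng.choice(["PHONE", "CALL"])] + rng.choice(names).split()
    return " ".join(words)

rng = random.Random(0)
prompts = [random_sentence(rng, DIGITS, NAMES) for _ in range(5)]
```

Reading such prompts aloud yields recordings whose word-level transcriptions are known in advance.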
7
Transcribing the Data: HMM training is supervised learning, so every training utterance needs a transcription.
8
Coding the Data: HTK supports frame-based parameterizations, including FFT-based analysis, LPC, MFCCs, and user-defined features.
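"Frame-based" means the waveform is cut into short overlapping windows, and one feature vector is computed per window. A minimal Python sketch of the framing step is shown below; the 25 ms window and 10 ms shift are commonly used values assumed here for illustration, and the feature computation itself (FFT, LPC, MFCC) is left out.

```python
def frame_signal(samples, sample_rate, window_ms=25.0, shift_ms=10.0):
    """Cut a sample sequence into overlapping fixed-length frames."""
    win = int(round(sample_rate * window_ms / 1000.0))  # samples per frame
    hop = int(round(sample_rate * shift_ms / 1000.0))   # samples between frame starts
    frames = []
    start = 0
    while start + win <= len(samples):  # drop a trailing partial frame
        frames.append(samples[start:start + win])
        start += hop
    return frames

# One second of dummy samples at 16 kHz (the sample index stands in for audio).
frames = frame_signal(list(range(16000)), 16000)
```

With these settings each frame holds 400 samples and consecutive frames start 160 samples apart.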
9
Output Probability Specification: The most common choice is the continuous density HMM (CDHMM). HTK also allows discrete probabilities (for vector-quantized, VQ, data).
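In a continuous density HMM, each state's output distribution is typically a mixture of Gaussians. The sketch below evaluates the log output probability of one observation vector under a diagonal-covariance mixture; the diagonal-covariance choice and all function names are assumptions for illustration, not HTK code.

```python
import math

def log_gaussian_diag(o, mean, var):
    """Log density of a Gaussian with diagonal covariance at observation o."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (x - m) ** 2 / v
        for x, m, v in zip(o, mean, var)
    )

def log_output_prob(o, weights, means, variances):
    """log b_j(o) = log sum_m c_m N(o; mu_m, var_m), via log-sum-exp."""
    logs = [
        math.log(w) + log_gaussian_diag(o, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # subtract the max for numerical stability
    return top + math.log(sum(math.exp(l - top) for l in logs))
```

Working in the log domain avoids underflow when many frames are multiplied together during training and decoding.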
10
Flat Start Training: Build a prototype HMM with reasonable initial guesses of its parameters (HCompV sets them from the global mean and variance of the training data). Specify the topology, usually left-to-right with 3 states and no skips. Create an MMF (master macro file). Then use HRest or HERest for training.
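The two ingredients of a flat start can be sketched in Python: a global mean and variance computed over all training frames, and a left-to-right transition matrix with no skips. HTK models add a non-emitting entry and exit state, so 3 emitting states give a 5x5 matrix. The self-loop probability of 0.6 is an arbitrary initial guess assumed for this sketch.

```python
def global_mean_var(frames):
    """Per-dimension mean and variance over all feature vectors."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    var = [sum((f[d] - mean[d]) ** 2 for f in frames) / n for d in range(dim)]
    return mean, var

def left_to_right_transitions(n_emitting=3, stay=0.6):
    """Transition matrix with non-emitting entry/exit states and no skips."""
    n = n_emitting + 2
    a = [[0.0] * n for _ in range(n)]
    a[0][1] = 1.0                  # entry state always moves to the first emitting state
    for i in range(1, n - 1):
        a[i][i] = stay             # self-loop
        a[i][i + 1] = 1.0 - stay   # advance to the next state
    return a                       # the exit-state row stays all zero

mean, var = global_mean_var([[1.0, 2.0], [3.0, 4.0]])
trans = left_to_right_transitions()
```

In a flat start, every state of every phone model begins with the same mean and variance; re-estimation then pulls the models apart.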
11
Realigning and Creating Triphones. Use pseudo-recognition to force-align the training data when the dictionary has multiple pronunciations for a word.
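Creating triphones amounts to relabeling each phone with its left and right context, written l-p+r in HTK notation. The following Python sketch does this for the phones of a single word (word-internal contexts only); it is an illustration of the notation, not the HTK label editor.

```python
def to_triphones(phones):
    """Rewrite a phone sequence as context-dependent l-p+r labels."""
    out = []
    for i, p in enumerate(phones):
        left = phones[i - 1] + "-" if i > 0 else ""                 # no left context at the start
        right = "+" + phones[i + 1] if i < len(phones) - 1 else ""  # no right context at the end
        out.append(left + p + right)
    return out

triphones = to_triphones(["b", "ih", "t"])
```

Phones at the word boundaries end up as biphones (for example b+ih), and a one-phone word keeps its monophone label.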
12
Evaluation
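Recognizer output is usually scored by aligning each hypothesis with its reference transcription and counting errors. The Python sketch below computes word accuracy as (N - errors) / N with a unit-cost edit distance; it is a simplified illustration under that assumption, not a reimplementation of HTK's scoring tool, which may weight error types differently.

```python
def edit_distance(ref, hyp):
    """Minimum number of substitutions, deletions and insertions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (r != h)))   # match or substitution
        prev = cur
    return prev[-1]

def word_accuracy(ref, hyp):
    """Percent accuracy: 100 * (N - errors) / N, with N reference words."""
    ref_w, hyp_w = ref.split(), hyp.split()
    return 100.0 * (len(ref_w) - edit_distance(ref_w, hyp_w)) / len(ref_w)
```

Because insertions count as errors, accuracy can drop below zero when the hypothesis is much longer than the reference.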
13
Other Issues: HTK supports supervised (HEAdapt) and unsupervised (HVite) speaker adaptation. Language modeling: HTK supports n-gram language models.
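As a reminder of what an n-gram language model estimates, here is a minimal Python sketch of unsmoothed maximum-likelihood bigram probabilities. The sentence-boundary markers and the toy sentences are assumptions; a practical model would also need smoothing for unseen word pairs.

```python
from collections import Counter

def bigram_probs(sentences):
    """Maximum-likelihood P(w2 | w1) from whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        words = ["<s>"] + s.split() + ["</s>"]
        unigrams.update(words[:-1])                  # count every history word
        bigrams.update(zip(words[:-1], words[1:]))   # count adjacent word pairs
    return {bg: c / unigrams[bg[0]] for bg, c in bigrams.items()}

# Toy training sentences, for illustration only.
probs = bigram_probs(["CALL STEVE", "CALL PHIL"])
```

Here "CALL" always starts a sentence, and each of the two names follows it half of the time.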