The HTK Book (for HTK Version 3.2.1) Young et al., 2002.


1 The HTK Book (for HTK Version 3.2.1) Young et al., 2002

2 Chapter 1 The Fundamentals of HTK. HTK is a toolkit for building hidden Markov models (HMMs). It is primarily used to build automatic speech recognizers (ASRs), but it also supports other HMM-based systems such as speaker recognition, image recognition, and automatic text summarization. HTK provides tools (modules) for both training and testing HMM systems.

3 How to Train and Test an ASR? Things needed: a labeled speech corpus and a dictionary (plus a grammar). Procedure: 1. Divide the corpus into training, development and test sets. 2. Train acoustic models on the training set. 3. Test, retrain, test … on the development set. 4. Test on the test set.
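Step 1 of the procedure can be sketched as follows. This is a minimal illustration, not part of HTK: the function name, the 80/10/10 proportions, and the utterance IDs are all assumptions. Real corpora are often split by speaker rather than by utterance.

```python
import random

def split_corpus(utterance_ids, dev_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle utterance IDs and split them into train/dev/test lists."""
    ids = sorted(utterance_ids)           # sort first so the split is reproducible
    random.Random(seed).shuffle(ids)      # fixed seed: same split on every run
    n_dev = int(len(ids) * dev_fraction)
    n_test = int(len(ids) * test_fraction)
    dev = ids[:n_dev]
    test = ids[n_dev:n_dev + n_test]
    train = ids[n_dev + n_test:]          # everything left over is training data
    return train, dev, test

# Hypothetical utterance IDs; a real corpus would list its own files.
train, dev, test = split_corpus([f"utt{i:03d}" for i in range(100)])
```

The test list is set aside and only used once, after all tuning on the development set is finished.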

4 How to Build an ASR Using HTK? Goal: a recognizer for voice dialing. Grammar: ( SENT-START ( DIAL <$digit> | (PHONE|CALL) $name) SENT-END )
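To make the task grammar concrete, the sketch below generates random word sequences that it accepts. It assumes the DIAL branch is followed by one or more digits, as in the HTK Book tutorial grammar. The digit and name vocabularies are hypothetical placeholders, and the function is an illustration, not an HTK tool.

```python
import random

# Hypothetical vocabulary; the slide does not list the digits or names.
DIGITS = ["ONE", "TWO", "THREE"]
NAMES = ["ALICE SMITH", "BOB JONES"]

def random_sentence(rng):
    """Generate one word sequence allowed by the voice-dialing grammar."""
    words = ["SENT-START"]
    if rng.random() < 0.5:
        # DIAL followed by one or more digits (assumed digit loop)
        words.append("DIAL")
        words += [rng.choice(DIGITS) for _ in range(rng.randint(1, 4))]
    else:
        # (PHONE | CALL) followed by a name
        words.append(rng.choice(["PHONE", "CALL"]))
        words += rng.choice(NAMES).split()
    words.append("SENT-END")
    return words

rng = random.Random(0)
for _ in range(3):
    print(" ".join(random_sentence(rng)))
```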

5 Creating a Dictionary. HDMan builds the pronunciation dictionary and outputs a list of the phones it uses. An HMM will be estimated for each of these phones.

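6 Recording the Data. HSLab records the speech (HSLab noname). HSGen generates random prompts from the word network and the dictionary (wdnet, dict); the output is saved as testprompts.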

7 Transcribing the Data. HMM training is supervised learning, so every training utterance needs a transcription.

8 Coding the Data. HTK supports frame-based analysis: FFT-based features, LPCs, MFCCs, user-defined features, etc.
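"Frame-based" means the waveform is cut into short overlapping windows before any spectral analysis. The sketch below shows only that framing step, in plain Python. The 25 ms window, 10 ms shift, and Hamming weighting are assumed typical values, not taken from the slide, and this is not how HTK implements it internally.

```python
import math

def frames(samples, sample_rate, window_ms=25.0, shift_ms=10.0):
    """Cut a signal into overlapping Hamming-windowed frames."""
    win = int(sample_rate * window_ms / 1000)     # samples per window
    shift = int(sample_rate * shift_ms / 1000)    # samples between window starts
    hamming = [0.54 - 0.46 * math.cos(2 * math.pi * n / (win - 1))
               for n in range(win)]
    out = []
    # Only full windows are kept; trailing samples that do not fill one are dropped.
    for start in range(0, len(samples) - win + 1, shift):
        chunk = samples[start:start + win]
        out.append([s * w for s, w in zip(chunk, hamming)])
    return out

# One second of a synthetic 440 Hz tone at 16 kHz stands in for real speech.
rate = 16000
signal = [math.sin(2 * math.pi * 440 * t / rate) for t in range(rate)]
windowed = frames(signal, rate)
```

Each frame would then be turned into a feature vector (for example MFCCs), giving one vector per 10 ms of speech under these assumptions.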

9 Output Probability Specification. The most common choice is the continuous density HMM (CDHMM). HTK also allows discrete probabilities (for vector-quantized, VQ, data).

10 Flat Start Training. Build a prototype HMM with reasonable initial guesses of its parameters (HCompV). Specify the topology, usually left-to-right with 3 states and no skips. Create an MMF (master macro file). Then use HRest or HERest for training.
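The idea behind a flat start can be sketched as follows: every emitting state of the prototype receives the same global mean and variance computed over all training frames, and the transition matrix encodes a left-to-right topology with no skips. This is an illustrative sketch, not HCompV itself. The function names, the dictionary layout, and the 0.6 self-loop probability are assumptions. The extra first and last states mirror HTK's non-emitting entry and exit states.

```python
def global_mean_var(frames):
    """Per-dimension mean and variance over all training frames."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    var = [sum((f[d] - mean[d]) ** 2 for f in frames) / n for d in range(dim)]
    return mean, var

def left_to_right_transitions(n_emitting=3, self_loop=0.6):
    """Transition matrix with non-emitting entry/exit states and no skips."""
    n = n_emitting + 2
    a = [[0.0] * n for _ in range(n)]
    a[0][1] = 1.0                        # entry state always moves to the first emitting state
    for i in range(1, n - 1):
        a[i][i] = self_loop              # stay in the same state
        a[i][i + 1] = 1.0 - self_loop    # advance to the next state
    return a                             # last row stays all-zero (exit state)

def flat_start(frames, n_emitting=3):
    """Prototype HMM whose emitting states all share the global statistics."""
    mean, var = global_mean_var(frames)
    states = [{"mean": list(mean), "var": list(var)} for _ in range(n_emitting)]
    return {"states": states, "transitions": left_to_right_transitions(n_emitting)}

# Toy 2-dimensional feature vectors stand in for real coded speech.
proto = flat_start([[1.0, 2.0], [3.0, 6.0]])
```

Because all states start out identical, the subsequent re-estimation passes are what make the models differ from one another.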

11 Realigning and Creating Triphones. Use pseudo-recognition to force-align the training data when words have multiple pronunciations.

12 Evaluation

13 Other Issues. HTK supports supervised (HEAdapt) and unsupervised (HVite) speaker adaptation. Language model: n-gram language models.
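As a reminder of what an n-gram language model estimates, the sketch below computes unsmoothed maximum-likelihood bigram probabilities from a few word sequences. The function name and the example sentences are hypothetical. Practical language models also apply discounting and back-off, which are omitted here.

```python
from collections import Counter

def bigram_probs(sentences):
    """Maximum-likelihood estimate of P(cur | prev) from word sequences."""
    history = Counter()   # how often each word appears as a bigram history
    bigrams = Counter()   # how often each (prev, cur) pair appears
    for words in sentences:
        for prev, cur in zip(words, words[1:]):
            history[prev] += 1
            bigrams[(prev, cur)] += 1
    return {pair: count / history[pair[0]] for pair, count in bigrams.items()}

# Hypothetical training sentences in the style of the voice-dialing task.
probs = bigram_probs([
    ["SENT-START", "CALL", "BOB", "SENT-END"],
    ["SENT-START", "DIAL", "ONE", "SENT-END"],
])
```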

