Special Project (4): HDecode_live
Prof. Lin-Shan Lee, TA: Yun-Chiao Li



Part 1: Additional Information about Kaldi

Kaldi: some practices (1/2)
- In 03.01:
  - Try changing the total number of Gaussians by modifying "totgauss".
- In 04.01:
  - Try changing the number of leaves of the decision tree by modifying "numleaves".
  - Try changing the total number of Gaussians by modifying "totgauss".
- Run through the scripts and see how the performance and the optimal weight change.

Kaldi: some practices (2/2)
- Some tips:
  - You can increase "numleaves" up to around 4500.
  - Training is more stable when the total number of Gaussians is less than 20 times "numleaves".
- Try modifying other parameters if you have time:
  - numiters: the number of iterations
  - realign_iters: the iterations at which the features are realigned to the states
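The two tips above can be turned into a quick sanity check before a training run. This is a minimal sketch: `check_gmm_size` is a hypothetical helper, not part of the course scripts, and the example values are arbitrary.

```shell
# Hypothetical helper (not part of the course scripts): checks a
# numleaves/totgauss pair against the two rules of thumb above.
check_gmm_size() {
  numleaves=$1
  totgauss=$2
  max_gauss=$(( numleaves * 20 ))   # "less than 20 times numleaves"
  if [ "$numleaves" -gt 4500 ]; then
    echo "warn: numleaves=$numleaves is above about 4500"
  elif [ "$totgauss" -ge "$max_gauss" ]; then
    echo "warn: totgauss=$totgauss is not below 20 x numleaves ($max_gauss)"
  else
    echo "ok"
  fi
}

# Example values chosen only for illustration.
check_gmm_size 2500 40000
```

Both limits are rules of thumb from the slide, so a warning here only suggests a setting worth double-checking.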

Part 2: Simple Live Recognition System (HDecode_live)

Simple Recognition System
- Make sure the microphone works.
- Usage is the same as HDecode (hdecode.sh): HDecode -> HDecode_live.
- Make sure HDecode, record, and HCopy are in the same directory.
- Works on Cygwin.
- Uses a bigram language model.
- Parameters:
  - -a 0.5 (acoustic model weight)
  - -s 8.0 (language model weight)
  - -t 75.0 (beam width)
- You can change these parameters and see what happens.
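The slides do not say how the two weights are combined. As a sketch, assuming HDecode_live follows the usual log-domain formulation of HTK-style decoders, the acoustic model weight a and the language model weight s scale the two log scores, and the decoder searches for the best word sequence. The symbols X (the acoustic feature sequence) and W (a candidate word sequence) are notation introduced here, not taken from the slides.

```latex
\hat{W} = \arg\max_{W} \left[ a \,\log p(X \mid W) + s \,\log P(W) \right]
```

Under the same assumption, the beam width t is a pruning threshold: partial hypotheses whose score falls more than t below the best active one are dropped. A larger t searches more thoroughly but more slowly.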

Setup
- Cygwin
  - Cygwin is used to simulate the Unix operating system on Windows.
- Install Cygwin
  - (x86 only!!)
  - Leave all the options at their defaults and click Next.
- Download /share/HDecode_live/ to C:\cygwin\home\youraccount\HDecode_live

There are two recognition systems:

System    AM / tiedlist                                       LM                    Lexicon
Lecture   am.lecture.speaker-dependent.mmf / tiedlist.news    trained by yourself   lexicon.lecture
News      am.news.mmf / tiedlist.news                         trained by yourself   lexicon.news

- The Lecture AM is trained on Prof. Lee's voice.
- The News AM is trained on the voices of several news reporters.
- The News system gives better performance.

Acoustic Model
- Training an AM with HTK is time-consuming, so we have trained it for you.
- final.mmf is the speaker-dependent AM trained on Prof. Lee's voice, so it is well suited to recognizing the professor's voice.
- It is the same model that we used in Kaldi.

Acoustic Model Example
[Figure: acoustic model excerpt with two callouts, one marking the HMM for each phone and one marking the Gaussian mixture model for each state]

Language model training (1/2)
- Remove the first column of material/train.text and save the result as train.lecture.
  - Hint: vim visual block + "d"
- train.lecture:
    OKAY [A66E] [A655][A6EC] [A6AD]
    [B36F][AAF9][BDD2] [AC4F] [BCC6][A6EC] [BB79][ADB5][B342][B27A] EMPH_A
    [A8BA] [B36F][AC4F] [A8E2] [ADD3] [A5D8][AABA]
- Change the encoding:
    /share/tool/chencoding -f ascii -t utf8 train.lecture > train.lecture.utf8
- Result (train.lecture.utf8):
    OKAY 好 各位 早
    這門課 是 數位 語音處理 EMPH_A
    那 這是 兩 個 目的
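As an alternative to the vim hint, the first column can also be removed from the command line. This is a minimal sketch that assumes the columns in train.text are separated by single spaces; `strip_first_column` is a hypothetical helper name.

```shell
# Assumption: columns in train.text are separated by single spaces.
# Prints every line of the given file without its first column.
strip_first_column() {
  cut -d' ' -f2- "$1"
}

# Usage on the workstation (paths from the slide):
#   strip_first_column material/train.text > train.lecture
```

If the file turns out to be tab-separated, the `-d' '` option would need to change accordingly.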

Language model training (2/2)
- We have also prepared material for another language model: use the news corpus to train it.
- Copy the corpora and lexicons to your folder:
    cp /share/corpus/train.* .
    cp /share/corpus/lexicon.* .
- /share/tool/ngram-count
  - -order 2 (you can change it to any value from 1 to 3!)
  - -kndiscount (modified Kneser-Ney)
  - -text train.lecture (training data; also try train.news!)
  - -vocab lexicon.lecture (lexicon; also try lexicon.news!)
  - -lm languagemodel (output language model name)
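Put together, the options above form a single command. The sketch below wraps it in a small function so that the corpus and the n-gram order are easy to vary. The function name `train_lm` and the `NGRAM_COUNT` variable are hypothetical conveniences; the tool path and all flags are the ones listed on the slide.

```shell
# Path from the slide; override NGRAM_COUNT if the tool lives elsewhere.
NGRAM_COUNT=${NGRAM_COUNT:-/share/tool/ngram-count}

# train_lm CORPUS ORDER OUTPUT
#   CORPUS: lecture or news (selects train.CORPUS and lexicon.CORPUS)
#   ORDER:  n-gram order, 1 to 3
#   OUTPUT: name of the language model file to write
train_lm() {
  "$NGRAM_COUNT" -order "$2" -kndiscount \
    -text "train.$1" -vocab "lexicon.$1" \
    -lm "$3"
}

# Usage, reproducing the settings on the slide:
#   train_lm lecture 2 languagemodel
```

Calling `train_lm news 2 languagemodel.news` would train on the news corpus instead. The output name in that call is only an example.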

Simple Recognition System
- Open the Cygwin Terminal in Windows.
- Edit hdecode.lecture.sh or hdecode.news.sh and change the language model to yours.
- Run "bash hdecode.lecture.sh" or "bash hdecode.news.sh".
- Wait until "Ready…" appears in the terminal.
- Press Enter and say something.
- Press Enter again and wait for the result.
- Type "exit" to quit.

Some hints
- If you have any problems training the LM, the scripts are here: /share/scripts/