Sphinx 3.4 Development Progress Report in February Arthur Chan, Jahanzeb Sherwani Carnegie Mellon University Mar 1, 2004.


1 Sphinx 3.4 Development Progress Report in February Arthur Chan, Jahanzeb Sherwani Carnegie Mellon University Mar 1, 2004

2 This Presentation
  - S3.4 development progress
    - Speed-up
    - Language model facilities
  - CALO and S3.5 development
    - Which features should be there to make CALO better?
    - Schedule for the next three months

3 Review of Last Month's Progress
  - Last month
    - Wrote a speed-up version of s3.
    - Completed some coding of the s3.4 speed-up task.
  - This month
    - Backbone of the s3.4 speed-up functionalities completed and tested.
    - Basic LM facilities completed and smoke-tested.

4 Current Systems Specifications (without Gaussian Selection)
                                      Sphinx 3                                Sphinx 3.3
  Speed on P4-1G (Communicator task)  ERR 17.2%; 11xRT GMM, 3xRT search       ERR 18.6%; 6xRT GMM, 1xRT search
  GMM computation                     Not optimized (few code optimizations)  Sub-VQ-based Gaussian selection can be applied
  Lexicon                             Flat                                    Tree
  Search                              Beam on search, no beam on GMM          Beam on search, beam on GMM

5 Speed-up Facilities in s3.3
  GMM Computation (Frame-Level, Senone-Level, Gaussian-Level, Component-Level)
    - Not implemented
    - SVQ-based GMM selection; sub-vectors constrained to 3
    - SVQ code removed
  Search
    - Lexicon structure: Tree
    - Pruning: Standard
    - Heuristic search speed-up: Not implemented

6 Speed-up Facilities in s3.4
  GMM Computation (Frame-Level, Senone-Level, Gaussian-Level, Component-Level)
    - (New) Naïve down-sampling
    - (New) Conditional down-sampling
    - (New) CI-based GMM selection
    - (New) VQ-based GMM selection
    - (New) Unconstrained number of sub-vectors in SVQ-based GMM selection
    - (New) SVQ code enabled
  Search
    - Lexicon structure: Tree
    - Pruning: (New) Improved word-end pruning
    - Heuristic search speed-up: (New) Phoneme look-ahead
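The CI-based GMM selection listed above can be pictured with a minimal sketch. It assumes one common formulation: score the context-independent (CI) senones first, then fully compute a context-dependent (CD) senone only when its parent CI senone falls within a beam of the best CI score, and otherwise back off to the CI score. The function name, data layout and beam value are hypothetical and are not taken from the Sphinx source.

```python
def ci_based_gmm_selection(ci_scores, cd_parent, compute_cd, beam):
    """Score CD senones for one frame (log domain), skipping unlikely ones.

    ci_scores  : dict mapping CI senone id -> log score (already computed)
    cd_parent  : dict mapping CD senone id -> its parent CI senone id
    compute_cd : callable returning the full (expensive) CD log score
    beam       : non-negative log-domain beam width (hypothetical value)
    """
    threshold = max(ci_scores.values()) - beam
    scores = {}
    computed = 0
    for cd_id, ci_id in cd_parent.items():
        if ci_scores[ci_id] >= threshold:
            # Parent CI senone is competitive: pay for the full computation.
            scores[cd_id] = compute_cd(cd_id)
            computed += 1
        else:
            # Parent is far from the best: reuse the cheap CI score.
            scores[cd_id] = ci_scores[ci_id]
    return scores, computed


# Toy data, invented for illustration only.
ci_scores = {"AH": -10.0, "T": -50.0}
cd_parent = {"AH(k,t)": "AH", "AH(b,t)": "AH", "T(s,r)": "T"}
full_scores = {"AH(k,t)": -9.0, "AH(b,t)": -12.0, "T(s,r)": -48.0}

scores, computed = ci_based_gmm_selection(
    ci_scores, cd_parent, full_scores.__getitem__, beam=20.0
)
print(scores, computed)
```

In this toy frame only the two CD senones under "AH" are fully computed; the one under "T" inherits its CI score, which is where the saving would come from.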

7 S3.4 Speed Performance in Communicator Task
                    Sphinx 3.3                  Sphinx 3.4
  Error rate        18.6%                       18.7%
  Speed (P4-1G)     6xRT GMM, 1xRT search       1.2xRT GMM, 1.5xRT search
  Speed (P4-2G)     1.6xRT GMM, 0.6xRT search   0.4xRT GMM, 0.9xRT search
  Techniques used   -                           CI-based GMM selection, word-end pruning
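The slide does not state combined figures. As a worked example, the sketch below adds the GMM and search real-time factors, assuming these two components dominate total decoding time (an assumption, not something the slide says), and derives the overall speed-up from s3.3 to s3.4.

```python
def total_xrt(gmm_xrt, search_xrt):
    """Total real-time factor, assuming GMM + search dominate decoding time."""
    return gmm_xrt + search_xrt


# Figures copied from the slide.
s33_p4_1g = total_xrt(6.0, 1.0)
s34_p4_1g = total_xrt(1.2, 1.5)
s33_p4_2g = total_xrt(1.6, 0.6)
s34_p4_2g = total_xrt(0.4, 0.9)

speedup_1g = s33_p4_1g / s34_p4_1g
speedup_2g = s33_p4_2g / s34_p4_2g

print(f"P4-1G: {s33_p4_1g:.1f}xRT -> {s34_p4_1g:.1f}xRT ({speedup_1g:.2f}x faster)")
print(f"P4-2G: {s33_p4_2g:.1f}xRT -> {s34_p4_2g:.1f}xRT ({speedup_2g:.2f}x faster)")
```

Under this assumption the P4-1G total drops from 7.0xRT to 2.7xRT and the P4-2G total from 2.2xRT to 1.3xRT, so both configurations would still run slower than real time.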

8 Issues in Speed Optimization
  - Implementation issues:
    - Beams applied on GMM make many techniques hard to implement.
    - Some facilities were hardwired for specific purposes.
  - Performance issues:
    - Each technique reduced computation by 40-50% with <5% degradation. However, the gains didn't add up.
    - Computation cannot be reduced below a certain lower bound (usually a 75%-80% reduction is the maximum).
    - Overhead is huge in some techniques, e.g. VQ-based Gaussian selection takes 0.25xRT.
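One way to see why the individual savings do not add up is a toy cost model. The model below is an assumption made for illustration and is not from the slides: reductions combine multiplicatively, the remaining computation cannot fall below a floor (20% here, matching the 80% maximum reduction quoted above), and each technique may add a fixed overhead. The 6xRT baseline and the 0.25xRT overhead are figures quoted in the deck; the 45% reductions are a midpoint of the quoted 40-50% range.

```python
def projected_xrt(baseline_xrt, reductions, floor_fraction, overhead_xrt):
    """Toy model: multiplicative savings, a floor, and fixed overhead."""
    remaining = 1.0
    for r in reductions:
        remaining *= 1.0 - r
    # The remaining fraction cannot drop below the floor.
    remaining = max(remaining, floor_fraction)
    return baseline_xrt * remaining + overhead_xrt


one_technique = projected_xrt(6.0, [0.45], floor_fraction=0.2, overhead_xrt=0.0)
three_techniques = projected_xrt(
    6.0, [0.45, 0.45, 0.45], floor_fraction=0.2, overhead_xrt=0.25
)

print(f"one technique:    {one_technique:.2f}xRT")
print(f"three techniques: {three_techniques:.2f}xRT")
```

Adding the three 45% figures naively would suggest a 135% reduction, which is impossible; in this model the stack bottoms out at the floor and the overhead is paid on top.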

9 Language Model Facilities
  - S3.3 only accepts a single LM, without classes, in binary format.
  - So far, S3.4 is able to accept multiple class-based LMs in binary format.
    - One major modification of the code, affecting 6-7 files.
  - Caveats:
    - Not a perfect implementation.
    - Text format is not yet supported. Backward compatibility is an issue.
    - Lack of test cases. Only lightly smoke-tested.
    - ~1 more week of work.
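For readers unfamiliar with class-based LMs, the sketch below shows the usual factorization: the n-gram is estimated over class tokens, and a word's probability is the class n-gram probability times the probability of the word within its class. It is a bigram-only illustration with no back-off; the class name `ClassBigramLM`, the `[city]` class and all probabilities are invented, and this is not the s3.4 data structure.

```python
import math


class ClassBigramLM:
    """Toy class-based bigram LM (log domain, no back-off)."""

    def __init__(self, bigram_logprob, word_to_class, member_logprob):
        self.bigram_logprob = bigram_logprob    # (prev_token, token) -> log P
        self.word_to_class = word_to_class      # word -> class token
        self.member_logprob = member_logprob    # (class token, word) -> log P

    def token(self, word):
        # Words outside any class act as their own token.
        return self.word_to_class.get(word, word)

    def logprob(self, prev_word, word):
        tok = self.token(word)
        lp = self.bigram_logprob[(self.token(prev_word), tok)]
        if word in self.word_to_class:
            # log P(word | h) = log P(class | h) + log P(word | class)
            lp += self.member_logprob[(tok, word)]
        return lp


travel_lm = ClassBigramLM(
    bigram_logprob={("to", "[city]"): math.log(0.5)},
    word_to_class={"boston": "[city]", "pittsburgh": "[city]"},
    member_logprob={
        ("[city]", "boston"): math.log(0.25),
        ("[city]", "pittsburgh"): math.log(0.75),
    },
)

# Several LMs can be kept by name and switched between utterances.
lms = {"travel": travel_lm}
lp = lms["travel"].logprob("to", "pittsburgh")
print(math.exp(lp))
```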

10 Problems with s3.4 (valid as of Feb 29th, 2004)
  - Only accepts DMP files.
    - The text-format reader in Sphinx 2 is very complex; a straight conversion is not clean.
  - LMs are all loaded into memory.
    - We can work on this.
  - Lexical trees are all built at the beginning.
    - We tried to avoid the overhead of rebuilding the trees for every utterance.

11 Summary of Sphinx 3.4 Development
  - Derivative of s3.3
    - With speed optimization
    - Better LM facilities
  - Algorithmic optimization is 90% completed.
    - Still need to reduce overhead.
    - Tree-based GMM selection is desirable.
    - Improvements for individual techniques.
  - Got through the major hurdle of multiple LMs and class-based LMs.
    - Need more time to make it more stable.
  - Expected internal release date: March 8, 2004

12 Sphinx 3.4 and CALO
  - Which pieces are missing?
    - Sphinx 3.4's decoding is still not streamlined => continuous listening is not yet enabled.
    - Sphinx's speed may still not be ideal.
    - From s3 to s3.3, ~10% degradation.
    - Sphinx 3.4 doesn't learn from data yet.

13 Sphinx 3.5: What should we do in the next 3 months?
  - Expected release time: May – June
  - Interfaces: Streamlined front-end and decoding (?) PortAudio-based audio routine.
  - Speed/Accuracy: Improved lexical tree search. Machine optimization of Gaussian computation. Combination of multiple recognizers.
  - Learning: Acoustic model adaptation (?) Language model adaptation (In Phoenix) Better semantic parsing
  - Resource acquisition and load balancing

14 Highlight I: Speed/Accuracy
  - Improved lexical tree search
    - The current implementation uses a single lexical tree.
    - It may be desirable to create tree copies.
  - Machine optimization of Gaussian computation
    - SIMD (Single Instruction, Multiple Data)
    - Requires help from assembly language experts (Jason/Thomas).
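To make the SIMD target concrete, the sketch below shows the kind of inner loop that dominates Gaussian computation, assuming diagonal-covariance Gaussians (an assumption; the slide does not specify the model form). The per-dimension multiply-accumulate is the part a SIMD unit could process several dimensions at a time. The function names are hypothetical and this is plain Python, not vectorized code.

```python
import math


def precompute(var):
    """Per-Gaussian constants that can be computed once, offline."""
    log_norm = -0.5 * sum(math.log(2.0 * math.pi * v) for v in var)
    half_precision = [0.5 / v for v in var]
    return log_norm, half_precision


def log_gaussian(x, mean, log_norm, half_precision):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    acc = 0.0
    # This loop is the candidate for SIMD: the same subtract, square,
    # multiply and accumulate is applied independently to each dimension.
    for x_d, mean_d, hp_d in zip(x, mean, half_precision):
        diff = x_d - mean_d
        acc += diff * diff * hp_d
    return log_norm - acc


mean = [0.0, 1.0]
var = [1.0, 1.0]
log_norm, half_precision = precompute(var)
at_mean = log_gaussian([0.0, 1.0], mean, log_norm, half_precision)
off_mean = log_gaussian([1.0, 1.0], mean, log_norm, half_precision)
print(at_mean, off_mean)
```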

15 Highlight II: Multiple Recognizer Combination and Resource Acquisition
  - Research by Rong suggests that combining multiple recognizers can improve accuracy.
  - Speed worsens by 100% if we run two recognizers.
  - An interesting solution: computation can be shared with other machines in the meeting.
    - Inspired by routing implementations.
    - A very natural solution in a meeting scenario, because usually only one person will be speaking.
  - Challenges: bandwidth and load balancing

16 Highlight III: Learning
  - Acoustic Model
    - Maximum Likelihood Linear Regression (MLLR)
    - Jahanzeb will be responsible for this.
  - (?) Language Model
    - How?
    - Cache-based LM?
  - (?) Improved Robust Parsing
    - Better parsing based on previous command history
    - Phoenix's source code is not easy to trace.
    - Thomas Harris's implementation may be a good place to start.
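As background on MLLR, the sketch below shows only the application step of mean adaptation: each Gaussian mean is replaced by an affine transform of itself, mean' = A * mean + b, with A and b shared across many Gaussians. Estimating A and b from adaptation data is the hard part and is omitted. The function name and the toy numbers are hypothetical.

```python
def mllr_adapt_mean(A, b, mean):
    """Apply an MLLR-style affine transform to one Gaussian mean.

    A    : square matrix as a list of rows (shared regression matrix)
    b    : bias vector
    mean : original Gaussian mean vector
    """
    return [
        sum(a_ij * m_j for a_ij, m_j in zip(row, mean)) + b_i
        for row, b_i in zip(A, b)
    ]


# Toy 2-dimensional transform, invented for illustration.
A = [[1.0, 0.0],
     [0.0, 2.0]]
b = [0.5, -1.0]

speaker_independent_means = [[2.0, 3.0], [0.0, 1.0]]
adapted_means = [mllr_adapt_mean(A, b, m) for m in speaker_independent_means]
print(adapted_means)
```

Because one transform is shared by many Gaussians, a small amount of speaker data can shift the whole model, which is the usual motivation for this family of methods.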

17 Arthur and Jahanzeb's Proposed Schedule (columns: Arthur, Jahanzeb)
  Mar 1 – Mar 15: Windows port + streamline S3.4 decoding; Regression test + Adaptation Milestone 1
  Mar 15 – Apr 1: Multiple recognizers experiments
  Apr 1 – Apr 15: Preparation for demo + if (we want) {write up paper for ICSLP}

18 Cont. (columns: Arthur, Jahanzeb)
  Apr 16 – May 7: Search modification: tree copies implementation; Regression test + Adaptation Milestone 2
  May 7 – June 1: Sphinx 3.5 learning code development + s3.5 release (?)

