LABORATOIRE DINFORMATIQUE CERI 339 Chemin des Meinajariès BP 1228 84911 AVIGNON CEDEX 09 Tél. + 33 (0)4 90 84 35 09 Fax. + 33 (0)4 90 84 35 01

1 The SPEERAL Decoder
Pascal NOCERA, Laboratoire d'Informatique d'Avignon, AGROPARC BP 1228, AVIGNON Cedex 9

2 FLAVOR workshop
The SPEERAL System
Stochastic approach: find the best hypothesis among all possible hypotheses with the A* algorithm.

3 The SPEERAL System
Stochastic approach
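
For reference, the stochastic approach is usually written as the standard Bayes decision rule below. This is the textbook formulation, not a formula recovered from the slides; the symbols $X$ (acoustic observations), $W$ (a word sequence) and $\hat{W}$ (the selected hypothesis) are notation introduced here.

```latex
\hat{W} = \arg\max_{W} P(W \mid X)
        = \arg\max_{W} \frac{P(X \mid W)\, P(W)}{P(X)}
        = \arg\max_{W} \underbrace{P(X \mid W)}_{\text{acoustic model}} \; \underbrace{P(W)}_{\text{language model}}
```

$P(X)$ does not depend on $W$, so it can be dropped from the maximization.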

4 Acoustic Models
Hidden Markov Models
Gaussian Mixture Models
Contextual Models (Phonemes)

5 Acoustic Model Toolkit
Parameterization program
Text-to-phone program
Alignment program
HMM learning program
Supervised and unsupervised model adaptation
–MLLR
–MAP
–Structural Model Space Transformation

6 Linguistic Models
Stochastic Language Models
–N-grams
–Class-based language models

7 Linguistic Model Toolkit
Text Normalization Tools
Language Model Training
–CMU toolkit
–SRI toolkit
–AT&T toolkit
Language Model Compilation
Lexicon Compilation

8 Standard A* algorithm (1/2)
« Best-first » search algorithm
–Extend the best path to generate new candidates
–Assign a score F(x) to every explored path, built from g(x) and h(x)
  g(x) combines the language model and acoustic scores
  h(x) estimates the probability of the best extension
–Keep the list of explored paths as a priority queue
–Stop when the best path reaches the end

9 Standard A* algorithm (2/2)
Requires an admissible heuristic function
–h(x) underestimates the true remaining cost of the path (the more accurate, the better)
Sample heuristics
–h(x) = 0: breadth-first search
–h(x) = true remaining cost (i.e. F(x) never changes): deterministic search
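
The two slides above can be sketched as a generic best-first search over costs (for example negative log-probabilities, so that lower is better). This is a minimal illustration, not the SPEERAL implementation: the function `a_star`, the toy graph `GRAPH` and the estimates `H` are hypothetical names and values chosen for the example, and the sketch assumes an acyclic search space, so it keeps no closed list.

```python
import heapq
from typing import Callable, Hashable, Iterable, List, Optional, Tuple

Node = Hashable


def a_star(
    start: Node,
    is_goal: Callable[[Node], bool],
    successors: Callable[[Node], Iterable[Tuple[Node, float]]],
    h: Callable[[Node], float],
) -> Optional[Tuple[float, List[Node]]]:
    """Return (cost, path) of the cheapest path to a goal, or None.

    The result is optimal only if h never overestimates the remaining cost.
    """
    counter = 0  # tie-breaker so the heap never has to compare paths
    # Each queue entry is (F, tie-breaker, g, path); the smallest F pops first.
    queue = [(h(start), counter, 0.0, [start])]
    while queue:
        _f, _tie, g, path = heapq.heappop(queue)
        node = path[-1]
        if is_goal(node):  # the best path reaches the end: stop
            return g, path
        for nxt, step_cost in successors(node):  # extend the best path
            counter += 1
            g_next = g + step_cost
            heapq.heappush(
                queue, (g_next + h(nxt), counter, g_next, path + [nxt])
            )
    return None


# Hypothetical toy graph: edges carry costs and "E" is the goal.
GRAPH = {
    "S": [("A", 1.0), ("B", 4.0)],
    "A": [("B", 1.0), ("E", 5.0)],
    "B": [("E", 1.0)],
    "E": [],
}
# Admissible estimates: each value is at most the true remaining cost.
H = {"S": 2.0, "A": 2.0, "B": 1.0, "E": 0.0}

informed = a_star("S", lambda n: n == "E", lambda n: GRAPH[n], lambda n: H[n])
uninformed = a_star("S", lambda n: n == "E", lambda n: GRAPH[n], lambda n: 0.0)
print(informed)  # (3.0, ['S', 'A', 'B', 'E'])
```

With h(x) = 0 the same path is found, only with less guidance; a tighter admissible h typically lets the search expand fewer candidates before the goal is popped.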

10 The SPEERAL System
Language model
–Stochastic n-gram LM (n=3)
Lexical, phonetic and acoustic knowledge sources
–Acoustic model (HMM, …)
–Decoding vocabulary (lexicon)
–Input signal
Phoneme lattice (p, beg, end, sc) with score $sc = P(X_{beg..end} \mid p)$ + …/…

11 Sounding function h
Remaining path estimation
–Acoustic score only
–Computed with a backward Viterbi pass during phoneme lattice generation
Heuristic admissibility
–Underestimates the remaining cost: no LM information
–Cannot be the true cost (lack of LM information)
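
One way to picture an acoustic-only estimate of the remaining path is a backward dynamic-programming pass over the phoneme lattice. The sketch below is a simplification under stated assumptions, not the decoder's backward Viterbi: arcs follow the (p, beg, end, sc) layout of the lattice, the score is treated as a cost (negative log-likelihood), and `backward_heuristic` and the toy lattice are hypothetical.

```python
from typing import Dict, List, Tuple

# A lattice arc: (phoneme, begin frame, end frame, cost).
# Treating the score as a cost (lower is better) is an assumption of this sketch.
Arc = Tuple[str, int, int, float]


def backward_heuristic(arcs: List[Arc], last_frame: int) -> Dict[int, float]:
    """h[t] = cheapest acoustic-only cost from frame t to last_frame."""
    h: Dict[int, float] = {last_frame: 0.0}
    # Visit arcs by decreasing begin frame: since beg < end for every arc,
    # h[end] is already final when an arc ending there is processed.
    for _p, beg, end, cost in sorted(arcs, key=lambda a: a[1], reverse=True):
        if end in h:  # skip arcs that cannot reach the last frame
            h[beg] = min(h.get(beg, float("inf")), cost + h[end])
    return h


# Hypothetical toy lattice covering frames 0..5.
LATTICE: List[Arc] = [
    ("a", 0, 2, 1.0),
    ("b", 0, 3, 2.5),
    ("c", 2, 5, 2.0),
    ("e", 2, 3, 0.5),
    ("d", 3, 5, 0.25),
]

h = backward_heuristic(LATTICE, last_frame=5)
print(h[0])  # 1.75, reached through a, e, d
```

Because language model costs are non-negative and are left out, such an estimate can only fall below the true remaining cost, which is the admissibility argument made on the slide.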

12 Lexicon
Prefix-tree organization
–Widely applied
–Compact representation
–Most of the search effort occurs at word beginnings
[Figure: example lexicon (W1: p1 p2 p3, plus W2 and W3) and the corresponding prefix tree over phonemes p1, p2, p3]
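
A prefix-tree lexicon can be sketched as follows. The class `LexNode`, the helpers and the four-word lexicon are hypothetical and only illustrate how shared word beginnings collapse into shared nodes; they are not taken from the SPEERAL code or from the slide's example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence


@dataclass
class LexNode:
    """One node of the lexicon tree; edges are labelled with phonemes."""
    children: Dict[str, "LexNode"] = field(default_factory=dict)
    words: List[str] = field(default_factory=list)  # words ending at this node


def build_lexicon_tree(lexicon: Dict[str, Sequence[str]]) -> LexNode:
    root = LexNode()
    for word, phones in lexicon.items():
        node = root
        for phone in phones:
            # Reuse the child if another word already shares this prefix.
            node = node.children.setdefault(phone, LexNode())
        node.words.append(word)
    return root


def count_nodes(node: LexNode) -> int:
    return 1 + sum(count_nodes(child) for child in node.children.values())


# Hypothetical toy lexicon: word -> phoneme sequence.
LEXICON = {
    "cat": ["k", "ae", "t"],
    "can": ["k", "ae", "n"],
    "cap": ["k", "ae", "p"],
    "dog": ["d", "ao", "g"],
}

tree = build_lexicon_tree(LEXICON)
print(count_nodes(tree))  # 9 nodes (root included) for 12 phonemes in the flat list
```

The three words starting with "k ae" share two nodes, so the branching a decoder faces at a word beginning is two phonemes here rather than four words.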

13 Search space
Phoneme lattice
Concatenation of lexical trees
[Figure: starting from the sentence beginning, copies of the lexicon tree (words W1, W2, W3) are concatenated after each word end]

14 LM look-ahead
Word anticipation
–n is a lexicon node
–$w_n$ is any leaf (i.e. word) of the sub-tree starting at n
$P(n \mid \ldots w_{i-2} w_{i-1}) = \mathrm{Part\_LM}(n, w_{i-2} w_{i-1})$
$\mathrm{Part\_LM}(n, w_{i-2} w_{i-1}) = \max_{w_n} P(w_n \mid w_{i-2} w_{i-1})$
Paths leading to improbable words are penalized early
[Figure: lexicon prefix tree over phonemes p1, p2, p3 with leaves W1, W2, W3]
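
The Part_LM definition above can be illustrated with a small sketch. Here a lexicon node n is identified by the phoneme prefix leading to it, which is equivalent to a node of the prefix tree; the function `part_lm`, the toy lexicon and the trigram probabilities are hypothetical values for illustration. A real decoder would more likely store or cache this value per tree node than rescan the lexicon.

```python
from typing import Dict, Sequence, Tuple

Lexicon = Dict[str, Sequence[str]]
# Trigram table: (w_{i-2}, w_{i-1}) -> {word: P(word | w_{i-2} w_{i-1})}
TrigramLM = Dict[Tuple[str, str], Dict[str, float]]


def part_lm(
    prefix: Sequence[str],
    history: Tuple[str, str],
    lexicon: Lexicon,
    lm: TrigramLM,
) -> float:
    """Best LM probability among the words still reachable below a node."""
    n = len(prefix)
    reachable = [
        word for word, phones in lexicon.items()
        if list(phones[:n]) == list(prefix)
    ]
    probs = lm.get(history, {})
    return max((probs.get(word, 0.0) for word in reachable), default=0.0)


# Hypothetical toy data.
LEXICON: Lexicon = {
    "cat": ["k", "ae", "t"],
    "can": ["k", "ae", "n"],
    "cap": ["k", "ae", "p"],
    "dog": ["d", "ao", "g"],
}
LM: TrigramLM = {
    ("the", "black"): {"cat": 0.5, "dog": 0.25, "can": 0.125, "cap": 0.0625},
}

history = ("the", "black")
print(part_lm(["k"], history, LEXICON, LM))             # 0.5 (best of cat, can, cap)
print(part_lm(["k", "ae", "n"], history, LEXICON, LM))  # 0.125 (only "can" remains)
```

The value shrinks as the path commits to a less probable word, so the penalty is applied inside the word rather than only at its end.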

15 Start-synchronous tree
Asynchronous search
–The search processes the same part (lexicon) with different histories.
With start-synchronous capabilities
–The most advanced path can be reused when encountered twice.
For each frame x, the lexicon starting at x is stored.
Only the deepest nodes (or leaves) are stored.

16 Principle (1/5)
[Figure: lexicon trees started at frame 0 and at frame t, with the deepest lexicon nodes at frame 0 and at frame t marked; phonemes p1, p2, p3 and words W1, W2, W3]

17 Principle (2/5)
[Figure: lexicon trees at frame 0 and frame t; words W1 and W3 shown]

18 Principle (3/5)
[Figure: lexicon trees at frame 0 and frame t; words W1, W2 and W3 shown]

19 Principle (4/5)
[Figure: lexicon trees at frame 0 and frame t; words W1, W2 and W3 shown]

20 Principle (5/5)
[Figure: lexicon trees at frame 0, frame t and frame t+n; words W1, W2 and W3 shown]

21 Search space pruning
Optimization
–If two candidates end with the same 3 words, only the best is kept.
Cut
–Short candidates are dropped when their distance to the deepest candidate grows too large.
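
Both pruning rules can be sketched in a few lines. The names `Hyp`, `recombine`, `cut` and the threshold `max_lag` are hypothetical. The sketch also assumes that depth is measured in frames and that candidates are compared on their full A* score F, which already includes the estimate of the remaining path; neither detail is stated on the slide.

```python
from typing import Dict, List, NamedTuple, Tuple


class Hyp(NamedTuple):
    words: Tuple[str, ...]  # word history of the candidate
    frame: int              # last frame covered (its "depth")
    f_cost: float           # A* score F as a cost, lower is better


def recombine(hyps: List[Hyp]) -> List[Hyp]:
    """Keep only the best candidate for each distinct 3-word ending."""
    best: Dict[Tuple[str, ...], Hyp] = {}
    for hyp in hyps:
        key = hyp.words[-3:]
        if key not in best or hyp.f_cost < best[key].f_cost:
            best[key] = hyp
    return list(best.values())


def cut(hyps: List[Hyp], max_lag: int) -> List[Hyp]:
    """Drop candidates lagging more than max_lag frames behind the deepest."""
    if not hyps:
        return []
    deepest = max(hyp.frame for hyp in hyps)
    return [hyp for hyp in hyps if deepest - hyp.frame <= max_lag]


# Hypothetical candidates.
candidates = [
    Hyp(("a", "b", "c", "d"), 120, 10.0),
    Hyp(("x", "b", "c", "d"), 118, 12.0),  # same last 3 words, worse score
    Hyp(("a", "b"), 40, 3.0),              # far behind the deepest candidate
    Hyp(("b", "c", "e"), 110, 11.0),
]

kept = cut(recombine(candidates), max_lag=50)
print([hyp.words for hyp in kept])  # [('a', 'b', 'c', 'd'), ('b', 'c', 'e')]
```

With a trigram language model, two candidates sharing their last three words see the same LM context for every later word, which is why only the better one needs to survive.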

22 ASR
Output
–1-best hypothesis
–N-best hypotheses
–Word graph
Applications
–Transcription
–Question answering
–Named entity extraction
–Information retrieval
–Call-type classification
–…

23 French Broadcast News Campaign ESTER
[Figure: processing of a broadcast news show (1h long): acoustic segmentation, speaker segmentation, speech transcription (acoustic models, language models), information extraction]

24 System Description
Acoustic models: 10k contextual HMMs, 3.6k states, 230k Gaussians
Lexicon: 65k words
Language model combination: (Le Monde 87-02, 0.41), (Le Monde 02-03, 0.24), (ESTER, 0.35)
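
If the three numbers are read as linear interpolation weights (an assumption, though they do sum to 1), the combined probability of a word is the weighted sum of the three component models. The function `interpolate` and the component probabilities below are hypothetical.

```python
from typing import Dict

# Weights as listed on the slide; reading them as linear interpolation
# coefficients is an assumption of this sketch.
WEIGHTS: Dict[str, float] = {
    "Le Monde 87-02": 0.41,
    "Le Monde 02-03": 0.24,
    "ESTER": 0.35,
}


def interpolate(component_probs: Dict[str, float]) -> float:
    """P(w | history) = sum over models of weight * P_model(w | history)."""
    return sum(WEIGHTS[name] * component_probs[name] for name in WEIGHTS)


# Hypothetical probabilities of one word under each component model.
p = interpolate({"Le Monde 87-02": 0.010, "Le Monde 02-03": 0.020, "ESTER": 0.030})
print(round(p, 4))  # 0.0194
```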

25 Results and Demonstration
WER: 25% (10 RT)
Demonstration on TV
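
For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The sketch below shows that standard definition on a hypothetical sentence pair; it is not the ESTER scoring tool, applies no text normalization, and assumes a non-empty reference.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,                          # deletion
                cur[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_word != hyp_word)  # match or substitution
            ))
        prev = cur
    return prev[-1] / len(ref)


# Hypothetical example: one substitution in four reference words.
print(wer("the cat sat down", "the cat sat town"))  # 0.25
```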

