Advances in WP2 Chania Meeting – May 2007 www.loquendo.com.


1 Advances in WP2 Chania Meeting – May 2007 www.loquendo.com

2 Summary: Unsupervised Adaptation; Adaptation on HIWIRE DB

3 Supervised vs Unsupervised Adaptation

4 Supervised Adaptation [Block diagram. Labels: Adaptation set; Speech parameters; transcriptions; ASR Forced segmentation; forced segmentations; Adaptation Module; Gen. models; Adapted models]

5 Unsupervised Adaptation [Block diagram. Labels: Adaptation set; Speech parameters; ASR Recognition; Confidence based selection; transcriptions; ASR Forced segmentation; forced segmentations; ASR segmentations; Adaptation Module; Gen. models; Adapted models]

6 Adaptation on HIWIRE DB

7 Kinds of Adaptation. Two kinds of adaptation were performed. Multi-Condition: the adaptation data of all speakers and all noise conditions are pooled, so the models are adapted to the channel, the noise conditions, and the aspects common to non-native speakers. Speaker-Dependent: adaptation and tests are performed for each speaker separately, and all results are finally averaged; the models are adapted mainly to the speaker's voice, but also to the channel and noise conditions.
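The two data groupings can be illustrated with a minimal sketch. The utterance tuples, speaker IDs, and function names below are hypothetical and only show how the adaptation set would be partitioned in each case; they are not taken from the slides.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical adaptation utterances: (speaker, noise_condition, utterance_id)
utterances = [
    ("spk01", "Clean", "u1"),
    ("spk01", "HN", "u2"),
    ("spk02", "LN", "u3"),
    ("spk02", "MN", "u4"),
]

def multi_condition_sets(utts):
    """Pool every speaker and noise condition into a single adaptation set."""
    return {"all": [utt for _, _, utt in utts]}

def speaker_dependent_sets(utts):
    """Build one adaptation set per speaker, across that speaker's noise conditions."""
    sets = defaultdict(list)
    for speaker, _, utt in utts:
        sets[speaker].append(utt)
    return dict(sets)

def average_over_speakers(per_speaker_scores):
    """Speaker-dependent results are reported as the mean over all speakers."""
    return mean(per_speaker_scores.values())
```

In the multi-condition case one model set is adapted on the pooled data; in the speaker-dependent case one model set is adapted per speaker and the per-speaker scores are averaged afterwards.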

8 Adaptation Types. Two types of adaptation were tested. Supervised: the transcriptions of the sentences available in the HIWIRE database (HDB) are used to perform forced segmentation of the adaptation utterances, which provides the labels needed by the adaptation process, itself intrinsically supervised. Unsupervised: the transcriptions are not used, to simulate adaptation in the field; they are approximated by the ASR outputs. Only the adaptation utterances recognized with a certain degree of confidence are used in the adaptation process, to avoid divergence due to incorrectly labeled data.
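The confidence-based selection step of the unsupervised case can be sketched as a simple filter. Everything here is an assumption made for illustration: the `Hypothesis` record, the confidence scale of 0 to 1, and the threshold of 0.8 do not come from the slides, which only say that utterances recognized "with a certain degree of confidence" are kept.

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    utterance_id: str
    text: str          # ASR output that stands in for the missing transcription
    confidence: float  # assumed to lie in [0, 1]

def select_for_adaptation(hyps, threshold=0.8):
    """Keep only the hypotheses whose confidence reaches the (hypothetical) threshold."""
    return [h for h in hyps if h.confidence >= threshold]

# Placeholder recognition results, not real data
hyps = [
    Hypothesis("u1", "first recognized sentence", 0.93),
    Hypothesis("u2", "second recognized sentence", 0.41),
    Hypothesis("u3", "third recognized sentence", 0.80),
]

selected = select_for_adaptation(hyps)
print([h.utterance_id for h in selected])
```

The selected hypotheses would then be used as transcriptions for forced segmentation, exactly as in the supervised case.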

9 Multi-Condition Adaptation

Method    Type    Denoising  Clean  LN    MN    HN    AVG   E.R. %
No        -       -          90.5   49.1  27.5   5.0  43.0  -
LHN cons  Supv    -          97.5   81.1  59.2  13.4  62.8  34.7
LHN spec  Supv    -          98.2   90.9  79.6  34.8  75.9  57.7
No        -       EM         90.2   71.9  55.0  16.6  58.4  27.0
LHN cons  Supv    EM         90.6   97.1  79.3  31.1  74.5  55.3
LHN spec  Supv    EM         98.0   93.2  83.7  35.5  77.6  60.7
LHN cons  Unsupv  EM         94.3   87.2  76.8  31.5  72.5  51.7
LHN spec  Unsupv  EM         93.7   85.5  73.7  27.1  70.0  47.4

Adaptation is done with all the speakers and noise conditions together. It adapts to channel, noise conditions, and non-native common aspects.
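The slides do not define the AVG and E.R. columns. A plausible reading, stated here as an assumption, is that the per-condition figures are recognition accuracies in percent, AVG is their mean, and E.R. is the relative reduction of the average error with respect to the row with no adaptation and no denoising. Under that assumption, the sketch below reproduces the AVG and E.R. values of the best row on slide 9 (supervised LHN spec with EM denoising).

```python
def average(scores):
    """Mean of the per-condition scores (Clean, LN, MN, HN)."""
    return sum(scores) / len(scores)

def error_reduction(avg, baseline_avg):
    """Relative error reduction in percent, assuming both inputs are accuracy percentages."""
    baseline_error = 100.0 - baseline_avg
    return 100.0 * (avg - baseline_avg) / baseline_error

# Per-condition figures copied from slide 9
baseline = [90.5, 49.1, 27.5, 5.0]   # no adaptation, no denoising
best = [98.0, 93.2, 83.7, 35.5]      # LHN spec, Supv, EM

baseline_avg = round(average(baseline), 1)
best_avg = round(average(best), 1)

print(baseline_avg, best_avg, round(error_reduction(best_avg, baseline_avg), 1))
# prints: 43.0 77.6 60.7
```

The same formula applied to the other rows gives their E.R. entries as well, for example 34.7 for an average of 62.8.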

10 Multi-Condition Adaptation. Adaptation is done with all the speakers and noise conditions together. It adapts to channel, noise conditions, and non-native common aspects.

11 Comments. Supervised multi-condition adaptation gives a good performance improvement. It works well even without denoising, since it incorporates information about channel, noise, and non-native accents into the models. The best average results are obtained with supervised adaptation in conjunction with denoising (60.7% E.R.). As expected, unsupervised adaptation is inferior to supervised adaptation (51.7% vs. 60.7% E.R.), but it proves to be an effective technique for adaptation in real-life applications, where transcriptions of the speech material are not available.

12 Speaker Adaptation

Method    Type    Clean  LN    MN    HN    AVG   E.R. %
No        -       90.2   71.9  55.0  16.6  58.4  27.0
LHN cons  Supv    95.4   90.1  81.2  33.7  75.1  56.3
LHN cons  Unsupv  93.7   85.8  74.8  30.7  71.3  49.6

Adaptation is done speaker by speaker. Starting models: Microphone 16 kHz. Denoising method: SNR-dependent Ephraim-Malah spectral attenuation.

13 Speaker Adaptation. Adaptation is done speaker by speaker. Starting models: Microphone 16 kHz. Denoising method: SNR-dependent Ephraim-Malah spectral attenuation.

14 Comments. Speaker adaptation is very effective on HDB. The error reduction achieved by supervised adaptation plus Ephraim-Malah noise reduction is quite large (56.3% E.R.). The main improvements are in noisy conditions. As expected, unsupervised adaptation is inferior to supervised adaptation, due to the errors introduced by the ASR transcriptions, but it still yields a substantial improvement (49.6% E.R.).

15 Workplan: Selection of suitable benchmark databases (m6); Baseline set-up for the selected databases (m8); LIN adaptation method implemented and tested on the benchmarks (m12); Experimental results on the HIWIRE database with LIN (m18); Innovative NN adaptation methods and algorithms for acoustic modeling, with experimental results (m21); Further advances on new adaptation methods (m24); Unsupervised adaptation: algorithms and experimentation (m33)

