SIPCom 8-4 Speech Processing, MM7 - Speech Synthesis - Speech Recognition (Part 1 of 3) Børge Lindberg


1 SIPCom 8-4 Speech Processing, MM7 - Speech Synthesis - Speech Recognition (Part 1 of 3) Børge Lindberg

2 Text-to-speech synthesis
Text → Text analysis (Lexicon & Rules) → Prosody generation (Pitch & duration, stød) → Sound generation (Diphone database) → Synthetic speech

3 Why is it so difficult?
– Text normalisation: “kl 12-14”, “8-3=5”, “ ”, “mio”, “USA”
– Morphological analysis: “periferien” (the periphery) vs. “skoleferien” (the school holiday), “hul” (hole / hollow)
– Syntactic analysis: “en mand med hul røst dør bag en dør med hul i” (“a man with a hollow voice dies behind a door with a hole in it”)
– Semantic analysis: “The man fed her dog biscuits”
– Sound generation: transitions, time- and pitch scaling
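The text normalisation examples on this slide can be illustrated with a minimal sketch. It is not the lecture's implementation: the function name `normalise`, the abbreviation table and the chosen Danish expansions ("klokken", "millioner", "til", "minus", "er lig med") are assumptions made for illustration, and digits are left as digits rather than spelled out. The point is that the same hyphen must be read differently in a time range and in an equation.

```python
import re

# Hypothetical abbreviation table (illustrative, not the lecture's lexicon).
ABBREVIATIONS = {"kl": "klokken", "mio": "millioner"}


def normalise(text):
    """Expand a few context-dependent tokens into speakable words."""
    # After "kl", a hyphen between numbers is a time range ("til" = "to").
    text = re.sub(r"\bkl\.? (\d+)-(\d+)", r"klokken \1 til \2", text)
    # In an equation the same hyphen is read as "minus".
    text = re.sub(r"(\d+)-(\d+)=(\d+)", r"\1 minus \2 er lig med \3", text)
    # Remaining abbreviations are expanded by table lookup.
    words = [ABBREVIATIONS.get(w, w) for w in text.split()]
    # All-capital acronyms are spelled out letter by letter.
    words = [" ".join(w) if len(w) > 1 and w.isalpha() and w.isupper() else w
             for w in words]
    return " ".join(words)


print(normalise("kl 12-14"))  # klokken 12 til 14
print(normalise("8-3=5"))     # 8 minus 3 er lig med 5
print(normalise("USA"))       # U S A
```

A rule table like this covers only the listed cases; real text needs many more rules and exceptions, which is part of why normalisation is hard.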

4 Concatenative synthesis
“test” = /tEsd/ = /#t/ + /tE/ + /Es/ + /sd/ + /d#/
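The decomposition on this slide can be sketched in a few lines. The function name `to_diphones` is hypothetical; the only thing taken from the slide is the convention that "#" marks the silence at the word boundaries and that each unit spans two adjacent phonemes.

```python
from typing import List


def to_diphones(phonemes: List[str], boundary: str = "#") -> List[str]:
    """Split a phoneme sequence into overlapping two-phoneme units."""
    # Pad with the boundary symbol so the first and last phoneme
    # also get a transition unit.
    padded = [boundary] + phonemes + [boundary]
    # Pair each symbol with its right-hand neighbour.
    return [left + right for left, right in zip(padded, padded[1:])]


units = to_diphones(["t", "E", "s", "d"])
print(" + ".join("/" + u + "/" for u in units))
# /#t/ + /tE/ + /Es/ + /sd/ + /d#/
```

A word of N phonemes yields N + 1 diphones; the synthesiser would then look up each unit in the diphone database and join the stored waveforms.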

5 Di-(tri)phone database
Database of male speaker
Approx subword units (di- & triphones)
Requires pitch-, di- and triphone segmentation

6 Input to the sound generator

7 Effect of scaling
No scaling
Time scaled
+ pitch scaled
+ energy
+ stød

8 More examples
(aalb.wav) Normal
(fast.wav) High speaking rate, normal pitch
(slow.wav) Low speaking rate, normal pitch
(light.wav) Normal speaking rate, high pitch
(dark.wav) Normal speaking rate, low pitch
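The examples above vary speaking rate and pitch independently. A minimal sketch of that idea, at the level of prosody targets rather than waveforms, might look as follows. The `Target` class, the `scale_prosody` function, and all durations and F0 values are hypothetical; the signal processing that actually changes the waveform is outside the scope of this sketch.

```python
from dataclasses import dataclass


@dataclass
class Target:
    phoneme: str
    duration_ms: float  # how long the unit should last
    f0_hz: float        # target fundamental frequency


def scale_prosody(targets, rate=1.0, pitch=1.0):
    """Return new targets; rate > 1 speaks faster, pitch > 1 raises F0."""
    return [
        Target(t.phoneme, t.duration_ms / rate, t.f0_hz * pitch)
        for t in targets
    ]


# Made-up values for two units, for illustration only.
normal = [Target("t", 80.0, 120.0), Target("E", 120.0, 110.0)]

fast = scale_prosody(normal, rate=2.0)    # high rate, normal pitch
light = scale_prosody(normal, pitch=1.5)  # normal rate, high pitch

print(fast[0])   # duration halved, F0 unchanged
print(light[1])  # duration unchanged, F0 raised
```

Keeping the two factors separate mirrors the slide: "fast" and "slow" change only duration, while "light" and "dark" change only pitch.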

9 Evaluation - intelligibility
32 test persons
156 stimuli in carrier sentence: “Det er, de siger” (“It is, they say”)

10 Evaluation - naturalness
32 test persons
155 stimuli

