SIPCom 8-4 Speech Processing, MM7 - Speech Synthesis - Speech Recognition (Part 1 of 3) Børge Lindberg lindberg@kom.aau.dk.

1 SIPCom 8-4 Speech Processing, MM7 - Speech Synthesis - Speech Recognition (Part 1 of 3) Børge Lindberg

2 Text-to-speech synthesis
Processing stages: Text analysis, Prosody generation, Sound generation, Synthetic speech (output)
Resources: Lexicon & rules, Pitch & duration (stød), Diphone database

3 Why is it so difficult?
Text normalisation: “kl 12-14”, “8-3=5”, “ ”, “mio”, “USA”
Morphological analysis: “periferien” vs. “skoleferien”, “hul”
Syntactic analysis: “en mand med hul røst dør bag en dør med hul i” (Danish: "a man with a hollow voice dies behind a door with a hole in it")
Semantic analysis: “The man fed her dog biscuits”
Sound generation: transitions, time and pitch scaling
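The text normalisation examples on this slide can be illustrated with a small sketch. It is not the method used in the course system. The abbreviation table, the function names and the hyphen heuristic are all assumptions made for illustration. It shows why “kl 12-14” and “8-3=5” need different readings of the same hyphen.

```python
import re

# Hypothetical abbreviation table; the entries are illustrative only.
ABBREVIATIONS = {"kl": "klokken", "mio": "millioner"}


def hyphen_reading(text: str) -> str:
    """Guess how a hyphen between two numbers should be read.

    Assumed heuristic: a following "=" signals arithmetic ("minus"),
    otherwise number-hyphen-number is treated as an interval ("range").
    """
    if re.search(r"\d+\s*-\s*\d+\s*=", text):
        return "minus"   # e.g. "8-3=5"
    if re.search(r"\d+\s*-\s*\d+", text):
        return "range"   # e.g. "kl 12-14"
    return "none"


def expand_abbreviations(text: str) -> str:
    """Replace whole tokens found in the table with their full form."""
    words = text.split()
    return " ".join(ABBREVIATIONS.get(w.rstrip(".").lower(), w) for w in words)
```

A real normaliser would also have to spell out the numbers and decide whether “USA” is read letter by letter. Both are left out here.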

4 Concatenative synthesis
test = /tEsd/ = /#t/ + /tE/ + /Es/ + /sd/ + /d#/
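The decomposition above can be sketched in a few lines. This is a minimal illustration, assuming one symbol per phone and "#" as the silence boundary, as on the slide. The function names and the toy database of sample lists are hypothetical.

```python
def to_diphones(phones, boundary="#"):
    """Split a phone sequence into overlapping diphone names."""
    padded = [boundary] + list(phones) + [boundary]
    return [left + right for left, right in zip(padded, padded[1:])]


def concatenate(units, database):
    """Join the stored samples of each unit, in order, with no smoothing."""
    samples = []
    for unit in units:
        samples.extend(database[unit])
    return samples


units = to_diphones(["t", "E", "s", "d"])

# Toy database: two placeholder samples per diphone (not real audio).
toy_database = {unit: [index, index] for index, unit in enumerate(units)}
waveform = concatenate(units, toy_database)
```

Here `units` holds the five diphone names from the slide. A real sound generator would also smooth the transitions at each join instead of butting the segments together.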

5 Di-(tri)phone database
Database of male speaker
Approx subword units (di- & triphones)
Requires pitch-, di- and triphone segmentation

6 Input to the sound generator

7 Effect of scaling
No scaling
Time scaled
+ pitch scaled
+ energy
+ stød
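The slides do not name the scaling algorithm. As a rough sketch of time scaling only, and assuming the speech has already been cut into pitch periods (the database slide mentions pitch segmentation), duration can be changed by repeating or dropping whole periods. The function name and the string placeholders for periods are hypothetical.

```python
def time_scale(periods, factor):
    """Stretch (factor > 1) or compress (factor < 1) a list of pitch periods.

    Each output slot takes the source period at the matching relative
    position, so periods are repeated or skipped but never altered.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if not periods:
        return []
    n_out = max(1, round(len(periods) * factor))
    last = len(periods) - 1
    return [periods[min(last, int(j / factor))] for j in range(n_out)]


periods = ["p0", "p1", "p2", "p3"]   # placeholders for waveform chunks
slower = time_scale(periods, 2.0)    # every period used twice
faster = time_scale(periods, 0.5)    # every second period dropped
```

Pitch scaling is not covered by this sketch. It would require changing the spacing between periods and overlap-adding them, which needs real waveform data.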

8 More examples
Normal (aalb.wav)
High speaking rate, normal pitch (fast.wav)
Low speaking rate, normal pitch (slow.wav)
Normal speaking rate, high pitch (light.wav)
Normal speaking rate, low pitch (dark.wav)

9 Evaluation - intelligibility
32 test persons
156 stimuli in carrier sentence: “Det er <keyword>, de siger” ("It is <keyword>, they say")

10 Evaluation - naturalness
32 test persons
155 stimuli

