Spectrogram & its reading

1 Spectrogram & its reading
by Tae-Yeoub Jang

2 What is a spectrogram?
In use since the 1940s
Another representation of frequency-domain analysis
The most popular way of representing spectral information
3-dimensional representation
  X-axis: Time
  Y-axis: Frequency
  Darkness (or color): Energy
Reviving Sonus
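The three dimensions listed above can be sketched as a small text renderer. This is only an illustration: the function name `render`, the character ramp, and the -60 dB floor are assumptions, not anything from the slides.

```python
import math

SHADES = " .:-=+*#%@"  # light to dark; the ramp itself is an arbitrary choice


def render(frames, floor_db=-60.0):
    """Draw frames[t][f] (energy per time step and frequency bin) as text.

    Time runs left to right, frequency runs bottom to top,
    and a darker character means more energy.
    """
    peak = max(max(frame) for frame in frames)
    n_bins = len(frames[0])
    rows = []
    for f in reversed(range(n_bins)):  # top text row = highest frequency bin
        row = ""
        for frame in frames:
            if frame[f] <= 0 or peak <= 0:
                level = floor_db
            else:
                # energy relative to the loudest cell, in dB, clipped at the floor
                level = max(floor_db, 10 * math.log10(frame[f] / peak))
            # map floor_db..0 dB onto the shade ramp
            idx = int(round((level - floor_db) / -floor_db * (len(SHADES) - 1)))
            row += SHADES[idx]
        rows.append(row)
    return "\n".join(rows)
```

Calling `render` on a list of per-frame energy lists returns one text row per frequency bin, so the output can be printed directly.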

3 Spectrogram example (color version of the word “compute”)

4 Spectrogram example (grayscale version of the word “compute”)

5 Wideband vs. narrowband spectrograms of the question "Is Pat sad, or mad?"
The 5th, 10th and 15th harmonics are marked by white squares in two of the vowels.
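Why harmonics can be marked at all depends on the analysis bandwidth: the k-th harmonic sits at k times the fundamental, so neighbouring harmonics are one F0 apart and show up as separate lines only when the bandwidth is narrower than that spacing. A minimal sketch, assuming a hypothetical F0 of 100 Hz and using the 125 Hz and 40 Hz bandwidths quoted on the "Types of spectrogram" slide:

```python
def harmonic(f0_hz, k):
    """Frequency of the k-th harmonic of a voice with fundamental f0_hz."""
    return k * f0_hz


def resolves_harmonics(bandwidth_hz, f0_hz):
    """True when the analysis bandwidth is narrower than the harmonic spacing (= F0)."""
    return bandwidth_hz < f0_hz


F0 = 100.0  # hypothetical fundamental frequency in Hz, not taken from the slides

marked = [harmonic(F0, k) for k in (5, 10, 15)]
narrowband_ok = resolves_harmonics(40.0, F0)   # 40 Hz bandwidth
wideband_ok = resolves_harmonics(125.0, F0)    # 125 Hz bandwidth
```

With this F0 the narrowband setting separates the harmonics and the wideband setting does not.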

6 Types of spectrogram
Wideband spectrogram
  better time resolution
  e.g. 15 msec window, 1 msec shift, 125 Hz bandwidth
Narrowband spectrogram
  better frequency resolution
  e.g. 50 msec window, 1 msec shift, 40 Hz bandwidth
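The window and shift settings above can be tried out with a small short-time Fourier sketch. Everything here is an assumption for illustration: the function names, the Hamming window, the naive DFT (chosen to avoid external libraries), and the 8000 Hz sample rate. It is not the analysis method used to produce the slides' figures.

```python
import math


def spectrogram(samples, rate, window_s, shift_s):
    """Return one list of magnitudes (bins 0..n/2) per analysis frame."""
    n = int(round(window_s * rate))              # window length in samples
    hop = max(1, int(round(shift_s * rate)))     # frame shift in samples
    window = [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]
    frames = []
    for start in range(0, len(samples) - n + 1, hop):
        seg = [samples[start + i] * window[i] for i in range(n)]
        mags = []
        for k in range(n // 2 + 1):              # naive DFT, one bin at a time
            re = sum(seg[i] * math.cos(2 * math.pi * k * i / n) for i in range(n))
            im = -sum(seg[i] * math.sin(2 * math.pi * k * i / n) for i in range(n))
            mags.append(math.hypot(re, im))
        frames.append(mags)
    return frames


def bin_frequency(k, rate, window_s):
    """Centre frequency in Hz of DFT bin k for the given window length."""
    n = int(round(window_s * rate))
    return k * rate / n


# Hypothetical test signal: 30 msec of a 1000 Hz tone sampled at 8000 Hz.
RATE = 8000
tone = [math.sin(2 * math.pi * 1000 * i / RATE) for i in range(240)]
wideband = spectrogram(tone, RATE, window_s=0.015, shift_s=0.001)
```

Passing `window_s=0.050` instead gives the narrowband setting. The longer window yields more closely spaced frequency bins, but each frame then averages over more of the signal.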

7 Advantages & Disadvantages
Advantages
  Time alignment
Disadvantages
  Less reliable than waveform

8 Vowel Spectrogram
Formant frequencies are critical cues for vowel distinction
F1: Height
  high vowels: low F1
F2: Backness
  back vowels: low F2
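The two rules above (low F1 means a high vowel, low F2 means a back vowel) can be written as a comparison between two measured vowels. A minimal sketch; the function name and the formant pairs in the example are hypothetical.

```python
def compare_vowels(a, b):
    """a and b are (F1, F2) pairs in Hz.

    Returns (higher, further_back), each being "a", "b" or "same".
    """
    def lower_value(x, y):
        if x < y:
            return "a"
        if y < x:
            return "b"
        return "same"

    higher = lower_value(a[0], b[0])        # lower F1 = higher vowel
    further_back = lower_value(a[1], b[1])  # lower F2 = further back
    return higher, further_back


# Hypothetical measurements: vowel a is higher, vowel b is further back.
result = compare_vowels((280, 2250), (710, 1100))
```

The comparison is relative on purpose: the slides give no absolute thresholds, so none are assumed here.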

9 Example formant frequencies (Hz) of English monophthongs
F3: 2900 2550 2490 2640 2380 2300 2500 2390
F2: 2250 1900 1770 1660 1100 1030 870 1500 1190
F1: 280 400 550 690 710 450 310 900 640

10 "heed, hid, head, had, hod, hawed, hood, who'd" (a male speaker, American English)

11 Consonant Spectrogram
General
  Acoustic structure more complicated than vowels
  Adjacent sounds (especially vowels) convey important information → locus
  High frequency characteristics → especially for fricatives and affricates

12 What is a locus?
Information on the formant transition from vowels into obstruents or from obstruents into vowels
The target frequency that each formant transition heads toward as an obstruction is made, or the frequency the transition comes from as the obstruction is released
Characteristic of the consonantal place and manner → roughly the same in different vowel contexts

13 Stops
General
  Fairly distinct locus for each place
  Burst
  Silence during the closure (only at syllable onset position)
  Virtually no difference during the closure

14 Stops (cntd.)
Voicing distinction
  voiced: vertical striations, less abrupt burst, frequently weakened to resemble fricatives or approximants
  voiceless: generally abrupt burst in the higher frequency area

15 Stops (cntd.)
Place distinction
  bilabial
    relatively low F2, F3 locus → rising into and falling out of vowel
    weak and spread vertical lines
  alveolar
    F2 locus about 1800 Hz
    strong vertical lines
  velar
    velar pinch: the vowel's F2 and F3 merging
    often double burst
    long formant transitions
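The locus idea can be made concrete for the alveolar case: with an F2 locus of about 1800 Hz, the direction of the F2 transition after release depends on where the vowel's own F2 lies. A minimal sketch; the function name and the vowel F2 values in the example are hypothetical.

```python
ALVEOLAR_F2_LOCUS_HZ = 1800.0  # "about 1800 Hz", as stated on the slide


def f2_transition(locus_hz, vowel_f2_hz):
    """Direction of F2 movement from the consonant release into the vowel."""
    if vowel_f2_hz > locus_hz:
        return "rising"
    if vowel_f2_hz < locus_hz:
        return "falling"
    return "flat"


# Hypothetical vowels: one with a high F2, one with a low F2.
into_front_vowel = f2_transition(ALVEOLAR_F2_LOCUS_HZ, 2250.0)
into_back_vowel = f2_transition(ALVEOLAR_F2_LOCUS_HZ, 870.0)
```

The same function applied to a lower locus gives a rising transition into most vowels, in line with the bilabial description above.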

16 Stops (cntd.)
Manner distinction: silence duration, VOT, vowel F0
aspirated short long high tense lax med low

17 Examples -- “a bab, a dad, a gag”

18 Place-dependent loci

19 Fricatives
General
  Random noise pattern, especially in high frequency regions
Place distinction
  Labiodental [f, v]: rising locus into the following vowel
  Dental [θ, ð]: major energy above 6000 Hz
  Alveolar [s, z]: major energy above 4000 Hz
  Alveopalatal [š, ž]: major energy above 6000 Hz
  Glottal [h]: the trace of formant frequencies of neighbouring vowels
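A statement such as "major energy above 4000 Hz" can be checked on a single spectral slice by measuring what share of the energy lies above a cutoff. A minimal sketch; the function name and the example spectrum are hypothetical, and energy is taken as squared magnitude.

```python
def energy_fraction_above(freqs_hz, magnitudes, cutoff_hz):
    """Share of total energy (squared magnitude) at or above cutoff_hz."""
    total = sum(m * m for m in magnitudes)
    if total == 0:
        return 0.0
    high = sum(m * m for f, m in zip(freqs_hz, magnitudes) if f >= cutoff_hz)
    return high / total


# Hypothetical four-bin spectrum of a noise segment.
freqs = [1000, 3000, 5000, 7000]
mags = [1.0, 1.0, 2.0, 2.0]
share = energy_fraction_above(freqs, mags, 4000)
```

A share close to 1 means the noise is concentrated above the cutoff. How large the share must be to count as "major" is left open here, since the slides do not say.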

20 Fricatives (cntd.)
Weak vs. strong
  Strong [s, z, š, ž]: darker bands
  Weak [f, v, θ, ð]: spread and fainter
  Voiced [v, ð]: often so weak that they are confused with nasals or approximants
  Cue to tell [θ] from [f]: higher formants of [θ] fall into adjacent vowels

21 Example – “fie, thigh, sigh, shy”

22 Example – “ever, weather, fizzer, pleasure”

23 Nasals
General
  Formants similar to vowels but fainter
  Very low F1 (about 250 Hz), F2 (about 2500 Hz), and F3 (about 3250 Hz)
Place distinction
  bilabial [m]: downward F2, F3 locus
  alveolar [n]: smaller F2 transition
  velar [ŋ]: velar pinch

24 Examples -- “a Pam, a tan, a kang”

25 Liquids & Approximants
General
  Formants similar to vowels but fainter (especially at high frequency regions)
  Approximately F1 (250 Hz), F2 (1200 Hz), F3 (2400 Hz)
  Change in formant structure

26 Liquids & Approximants (cntd.)
Phone specific properties
  Labial glide [w]
    very low F1, F2 ( Hz), which get very close to each other
    relatively low F3
    rapid falloff of spectral amplitude
  Palatal glide [y]
    extremely low F1
    extremely high F2, F3

27 Liquids & Approximants (cntd.)
Phone specific properties (cntd.)
  Flap [ɾ]: soft burst, short duration
  Retroflex [r]: F3 dipping down close to F2; general lowering of F3, F4
  Lateral [l]: low F1, F2 (approx. F1 250 Hz, F2 1200 Hz); usually substantial energy in the high F region

28 Example – “led, red, wed, yell”

29 Final remarks
The spectrogram is not the only cue for acoustic distinction of speech sounds
Very often, the waveform is more reliable

30 References & Links

