EE2F1 Speech & Audio Technology Sept. 26, 2002 SLIDE 1 THE UNIVERSITY OF BIRMINGHAM ELECTRONIC, ELECTRICAL & COMPUTER ENGINEERING Digital Systems & Vision.


1 EE2F1 Speech & Audio Technology Sept. 26, 2002 SLIDE 1 THE UNIVERSITY OF BIRMINGHAM ELECTRONIC, ELECTRICAL & COMPUTER ENGINEERING Digital Systems & Vision Processing EE2F1 Speech & Audio Technology Lecture 4 Martin Russell Electronic, Electrical & Computer Engineering School of Engineering The University of Birmingham

SLIDE 2: The human auditory system
[Figure: the human auditory system, taken from J N Holmes, “Speech Synthesis and Recognition”, Van Nostrand Reinhold (1988)]

SLIDE 3: The cochlea
[Figure: the cochlea. Source: Australian National University – http::/online.anu.edu.au/ITA/ACAT/drw/PPofM/hearing/hearing3.html]

SLIDE 4: Basilar membrane dynamics
[Figure: basilar membrane dynamics. Source: School for Advanced Studies, Trieste, Italy – http::/poirot.sissa.it/multidisc/cochlea/utils/basilar.htm]

SLIDE 5: An experiment
- First play two tones: A and B
- Then play a third and fourth tone: C and D_i
- Vary D_i
- When do you perceive the difference between C and D_i to be the same as between A and B?

SLIDE 6: Experiment
- A B C D_1
- A B C D_2
- A B C D_3
- A B C D_4

SLIDE 7: Answer
- In theory, should have chosen:
  - A (500Hz) B (600Hz) C (1500Hz) D_2 (1680Hz)
- Equal distance between A – B and C – D_2 on the perceptual mel frequency scale
- A B C D_2

SLIDE 8: The mel scale
[Figure: the mel scale, with tones A, B, C and D_2 marked]
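The answer on slide 7 can be checked numerically. The sketch below assumes the widely used analytic approximation mel(f) = 2595 log10(1 + f/700); the slides do not say which mel formula the lecture used, so the formula choice and the function names are assumptions made for illustration.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Hz to mels, using the assumed approximation 2595*log10(1 + f/700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


# Tone frequencies quoted on slide 7.
tones = {"A": 500.0, "B": 600.0, "C": 1500.0, "D_2": 1680.0}

# Perceptual distance of each pair, in mels.
ab = hz_to_mel(tones["B"]) - hz_to_mel(tones["A"])
cd = hz_to_mel(tones["D_2"]) - hz_to_mel(tones["C"])

# Frequency above C that gives exactly the same mel distance as A to B.
d_equal = mel_to_hz(hz_to_mel(tones["C"]) + ab)

print(f"A-B: {ab:.1f} mel, C-D_2: {cd:.1f} mel, exact match for D: {d_equal:.0f} Hz")
```

Under this approximation the two intervals come out at roughly 90 and 89 mels, and the exactly matching fourth tone lands near 1683Hz, which is consistent with the choice of D_2 = 1680Hz.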

SLIDE 9: Masking
- Frequency resolution of the ear
- Loud sounds mask perception of quieter sounds with similar frequency
- Many different psycho-acoustic experiments
- Exploited in MP3 coding

SLIDE 10: Masking experiment
- Low level pure tone (sinusoid) mixed with narrow band of random noise with higher level and same centre frequency
- Perception of tone masked by noise
- Now move centre frequency of noise
- How loud does the noise need to be to mask the tone?
[Figure: frequency axis]

SLIDE 11: Masking experiment
[Figure: psycho-physical tuning curve and auditory filter around 1kHz; axes: frequency, level (dB SPL)]

SLIDE 12: Auditory filterbank
[Figure: auditory filterbank; axis: frequency (Hz)]
- At 1kHz: BW ~ 200Hz
- At 4kHz: BW ~ 1kHz

SLIDE 13: Lessons from psycho-acoustics
- Human speech perception begins with frequency analysis on the basilar membrane
- Individual point on the basilar membrane can be modelled as band-pass filter – critical bandwidths
- Masking effects: loud sounds mask quieter sounds with similar frequency
- Frequency is not perceived on a linear scale – hence use of non-linear perceptual frequency scales: mel scale, bark scale, …
- Loudness perceived on logarithmic scale

SLIDE 14: Introduction to acoustics
- The loudness of a sound, or its intensity, is perceived on an approximately logarithmic scale
- So, we measure it on a log scale, called decibels (after A.G. Bell):
[Equation not preserved in the transcript]
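The decibel equation itself did not survive in this transcript. As a hedged illustration, the sketch below uses the conventional definitions: 10 log10 of an intensity ratio, or equivalently 20 log10 of a pressure ratio. The reference values (1e-12 W/m^2 and 20 micropascals) are the usual conventions for sound in air and are assumptions here, not values taken from the slide; the function names are illustrative.

```python
import math

I_REF = 1e-12  # W/m^2, conventional reference intensity (assumed, not from the slide)
P_REF = 20e-6  # Pa, conventional reference pressure for dB SPL (assumed)


def intensity_level_db(intensity: float, ref: float = I_REF) -> float:
    """Level in dB of an intensity relative to a reference intensity."""
    return 10.0 * math.log10(intensity / ref)


def spl_db(pressure_rms: float, ref: float = P_REF) -> float:
    """Sound pressure level in dB; intensity is proportional to pressure squared."""
    return 20.0 * math.log10(pressure_rms / ref)


# Doubling the intensity adds about 3 dB, whatever the starting level.
print(intensity_level_db(2e-12) - intensity_level_db(1e-12))
# An RMS pressure of 1 Pa expressed in dB SPL.
print(spl_db(1.0))
```

Because the scale is logarithmic, equal ratios of intensity map to equal steps in decibels, which is what makes it a reasonable match to perceived loudness.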

SLIDE 15: What are sound waves?
- Sound waves are small pressure fluctuations
- Propagate at the speed of sound
- In free space, spread according to the inverse-square law
- In a duct, travel as plane waves
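The inverse-square law mentioned above can be sketched as follows, assuming an idealised point source radiating uniformly in all directions with no reflections or absorption. The function names are illustrative and do not come from the lecture.

```python
import math


def intensity_at(power_w: float, r_m: float) -> float:
    """Intensity (W/m^2) at distance r_m from a point source of power_w watts.

    The power is spread over a sphere of area 4*pi*r^2, hence the 1/r^2 fall-off.
    """
    return power_w / (4.0 * math.pi * r_m ** 2)


def level_drop_db(r1_m: float, r2_m: float) -> float:
    """Drop in level (dB) when moving from distance r1_m to r2_m."""
    return 10.0 * math.log10(intensity_at(1.0, r1_m) / intensity_at(1.0, r2_m))


# Doubling the distance quarters the intensity, a drop of about 6 dB.
print(level_drop_db(1.0, 2.0))
```

In a duct the wavefront cannot spread out, so under the same idealisation a plane wave keeps its intensity along the tube.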

SLIDES 16-20: Wave travelling in a tube
[Figure sequence: wave travelling in a tube]

SLIDE 21: Resonances of a closed tube

SLIDE 22: Resonances of an open tube

SLIDE 23: Experiment: tin whistle
- c_0 = 343.4 m/s (speed of sound in air at 20°C)
- l = 0.27m (length of tube)
- Predicted resonances at:

SLIDE 24: Tin whistle

SLIDE 25: Tin whistle experiment
Measured f_1 = 1000/1.55 = 645.16Hz
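The resonance formulas and the predicted values for the tin whistle are not preserved in this transcript. As a hedged sketch, the code below uses the standard results for an ideal uniform tube: f_n = n c / (2 l) for a tube open at both ends, and f_n = (2n - 1) c / (4 l) for a tube closed at one end. Which of these the lecture calls "open" and "closed" is an assumption, end corrections are ignored, and reading 1000/1.55 as a measured period of 1.55ms is also an assumption.

```python
C0 = 343.4          # m/s, speed of sound in air at 20 degrees C (slide 23)
TUBE_LENGTH = 0.27  # m, length of the tin whistle (slide 23)


def open_tube_resonances(c: float, length: float, count: int) -> list[float]:
    """Ideal tube open at both ends: all integer multiples of c / (2 * length)."""
    return [n * c / (2.0 * length) for n in range(1, count + 1)]


def stopped_tube_resonances(c: float, length: float, count: int) -> list[float]:
    """Ideal tube closed at one end: odd multiples of c / (4 * length)."""
    return [(2 * n - 1) * c / (4.0 * length) for n in range(1, count + 1)]


open_modes = open_tube_resonances(C0, TUBE_LENGTH, 3)
stopped_modes = stopped_tube_resonances(C0, TUBE_LENGTH, 3)

# Measured value from slide 25, assumed to be 1000 / (period in ms).
measured_f1 = 1000.0 / 1.55

print("open at both ends:", [round(f, 1) for f in open_modes])
print("closed at one end:", [round(f, 1) for f in stopped_modes])
print("measured f_1:", round(measured_f1, 2))
```

With the slide's values, the open-at-both-ends model gives a lowest resonance of about 636Hz, within roughly 1.5% of the measured 645.16Hz, whereas the closed-at-one-end model gives about 318Hz. That suggests the whistle behaves like a tube open at both ends.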

SLIDE 26: Resonance in cavities
[Figure: cavity resonator; labels: V, L, r, P_V, P_A]

SLIDE 27: Example: wine bottle
- V = 0.75l = 0.00075m^3
- L = 0.007m
- r = 0.0085m
- c = 343.4m/s
[Figure: bottle; labels: V, L, r]

SLIDE 28: Wine bottle experiment

SLIDE 29: Wine bottle experiment
F = 1000/9 = 111Hz
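The cavity formula from slide 26 is not preserved in this transcript. The sketch below assumes the standard Helmholtz resonator result f = (c / 2π) sqrt(A / (V L)), with neck cross-section A = π r^2, taking V as the cavity volume, L as the neck length and r as the neck radius. End corrections to the neck length are ignored, and the function name is illustrative.

```python
import math


def helmholtz_frequency(c: float, neck_radius: float, neck_length: float,
                        volume: float) -> float:
    """Resonant frequency (Hz) of an ideal Helmholtz resonator (assumed formula)."""
    neck_area = math.pi * neck_radius ** 2
    return (c / (2.0 * math.pi)) * math.sqrt(neck_area / (volume * neck_length))


C = 343.4    # m/s (slide 27)
V = 0.00075  # m^3 (slide 27)
R = 0.0085   # m (slide 27)

# Neck length exactly as transcribed on slide 27.
f_as_transcribed = helmholtz_frequency(C, R, 0.007, V)
# Hypothetical neck length ten times longer, for comparison only.
f_longer_neck = helmholtz_frequency(C, R, 0.07, V)

print(round(f_as_transcribed, 1), round(f_longer_neck, 1))
```

With L = 0.007m as transcribed, this formula gives about 359Hz. With L = 0.07m it gives about 114Hz, close to the measured 111Hz. That may point to a misplaced decimal in the transcribed neck length, but the source does not confirm it.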

SLIDE 30: Summary
- Review of human hearing
- Basic acoustics
- Open and closed acoustic tubes
- Cavity resonators

