Meena Ramani 04/10/06 EEL 6586 Automatic Speech Processing.

2 Meena Ramani 04/10/06 EEL 6586 Automatic Speech Processing

3 Topics to be covered Lecture 1: The incredible sense of hearing 1 (Anatomy; Perception of Sound) Lecture 2: The incredible sense of hearing 2 (Psychoacoustics; Hearing aids and cochlear implants)

4 Lecture 1: The incredible sense of hearing "Behind these unprepossessing flaps lie structures of such delicacy that they shame the most skillful craftsman" - Stevens, S.S. [Professor of Psychophysics, Harvard University]

5 Why study hearing? Best example of speech recognition –Mimic human speech processing Hearing aids/ Cochlear implants Speech coding

6 Interesting facts The stapes, or stirrup, is the smallest bone in our body. –It is roughly the size of a grain of rice (~2.5 mm) For minimum audible sounds, the eardrum moves less than the diameter of a hydrogen atom The inner ear reaches its full adult size when the fetus is 20-22 weeks old The ears are responsible for keeping the body in balance Hearing loss is the number one disability in the world. –76.3% of people lose their hearing at age 19 and over

7 Specifications Frequency range: 20Hz-20kHz Dynamic range: 0-130 dB JND frequency: 5 cents JND intensity: ~1dB Size of cochlea: smaller than a dime
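The two relative units on the specifications slide can be turned into absolute figures. The sketch below is illustrative only: the 1000 Hz reference tone is an assumption, not something stated on the slide, and the function names are made up for this example.

```python
def cents_to_ratio(cents: float) -> float:
    """Frequency ratio for an interval in cents (1200 cents = 1 octave)."""
    return 2.0 ** (cents / 1200.0)


def db_to_intensity_ratio(db: float) -> float:
    """Intensity (power) ratio for a level difference in dB."""
    return 10.0 ** (db / 10.0)


if __name__ == "__main__":
    ref_hz = 1000.0  # assumed reference tone, not from the slide
    jnd_hz = ref_hz * (cents_to_ratio(5.0) - 1.0)
    print(f"5 cents at {ref_hz:.0f} Hz is about {jnd_hz:.2f} Hz")
    print(f"130 dB spans an intensity ratio of {db_to_intensity_ratio(130.0):.1e}")
```

Under that assumption, a 5-cent JND is a shift of just under 3 Hz at 1 kHz, and a 130 dB dynamic range is an intensity ratio of 10^13.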

8 ANATOMY

9 Outer ear (Pinna/Auricle, Auditory Canal) Focuses sound waves (variations in pressure) into the ear canal Pinna size: Inverse Square Law; a larger pinna captures more of the wave Elephants: hear low-frequency sound from up to 5 miles away Human pinna structure: pointed forward and has a number of curves; helps in sound localization; more sensitive to sounds in front Dogs/cats: movable pinna => focus on sounds from a particular direction

10 Sound Localization (Outer ear: Pinna/Auricle, Auditory Canal) Horizontal localization: Is sound on your right or left side? Interaural Time Difference (ITD) Interaural Intensity Difference (IID) Vertical localization

11 Interaural differences - The signal needs to travel further to more distant ear - More distant ear partially occluded by the head Two types of interaural difference will emerge - Interaural time difference (ITD) - Interaural intensity difference (IID)

12 Illustration of interaural differences [Figure: left-ear and right-ear signals versus time, with the sound onset marked]

13 Illustration of interaural differences [Figure: left-ear and right-ear signals versus time, highlighting the arrival time difference at onset]

14 Illustration of interaural differences [Figure: left-ear and right-ear signals versus time, highlighting the ongoing time difference]

15 Illustration of interaural differences [Figure: left-ear and right-ear signals versus time, highlighting the intensity difference]

16 Thresholds Interaural time differences (ITDs): threshold ITD ≈ 10-20 µs (~0.7 cm) Interaural intensity differences (IIDs): threshold IID ≈ 1 dB
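The distance quoted next to the ITD threshold can be checked by multiplying the time difference by the speed of sound. This is a minimal sketch; the 343 m/s value (air at roughly 20 °C) is an assumption not given on the slide, and the function name is illustrative.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: air at about 20 degrees C


def itd_to_path_difference_cm(itd_seconds: float,
                              c: float = SPEED_OF_SOUND_M_S) -> float:
    """Extra distance (in cm) the sound travels to the far ear for a given ITD."""
    return itd_seconds * c * 100.0


if __name__ == "__main__":
    for itd_us in (10.0, 20.0):
        cm = itd_to_path_difference_cm(itd_us * 1e-6)
        print(f"ITD {itd_us:.0f} us -> path difference {cm:.2f} cm")
```

With that speed of sound, a 20 µs ITD corresponds to a path difference of about 0.69 cm, which agrees with the slide's figure of roughly 0.7 cm.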

17 DUPLEX THEORY Interaural time differences (ITDs) → Low frequencies Up to around 1500 Hz; sensitivity declines rapidly above 1000 Hz Smallest phase difference corresponds to the true ITD Interaural intensity differences (IIDs) → High frequencies The amount of attenuation varies across frequency: below 500 Hz, IIDs are negligible (due to diffraction); IIDs can reach up to 20 dB at high frequencies

18 Sound Localization (Outer ear: Pinna/Auricle, Auditory Canal) Horizontal localization Vertical localization: Is sound above or below? Pinna Directional Filtering The pinna amplifies sound from above and below differently Curves in its structure selectively amplify certain parts of the sound spectrum

19 Outer ear (Pinna/Auricle, Auditory Canal) Closed tube resonance: ¼ wave resonator Auditory canal length: 2.7 cm Resonance frequency: ~3 kHz Boosts energy between 2-5 kHz by up to 15 dB
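The resonance frequency on this slide follows from the quarter-wave relation f = c / (4L) for a tube closed at one end. The sketch below treats the ear canal as an ideal uniform tube and assumes a speed of sound of 343 m/s; both are simplifications that the slide does not spell out.

```python
def quarter_wave_resonance_hz(length_m: float, c: float = 343.0) -> float:
    """Fundamental resonance of an ideal tube closed at one end: f = c / (4 * L).

    c defaults to 343 m/s (assumed speed of sound in air at about 20 degrees C).
    """
    return c / (4.0 * length_m)


if __name__ == "__main__":
    canal_length_m = 0.027  # 2.7 cm, from the slide
    f0 = quarter_wave_resonance_hz(canal_length_m)
    print(f"Quarter-wave resonance of a {canal_length_m * 100:.1f} cm canal: {f0:.0f} Hz")
```

For a 2.7 cm canal this gives a little under 3.2 kHz, in line with the ~3 kHz quoted and inside the 2-5 kHz boost region.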

20 ANATOMY

21 Middle Ear (Eardrum, Ossicles, Oval window) Pressure variations are converted to mechanical motion: Eardrum → Ossicles → Oval Window Ossicles: Malleus, Incus, Stapes Impedance matching –Acoustic impedance of the fluid is 4000x that of air –Without matching, all but 0.1% would be reflected back Amplification –By lever action: < 3x –Area amplification [55 mm² → 3.2 mm²]: 15x Stapedius reflex –Protection against low-frequency loud sounds –Tenses muscles → stiffens vibration of the ossicles –Reduces sound transmitted (20 dB)
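The 0.1% figure can be reproduced with the standard power transmission coefficient for a plane wave hitting a boundary between two media at normal incidence, T = 4r / (1 + r)^2, where r is the impedance ratio. This is a simplified model chosen for illustration, not a description of the real air-to-cochlea path. The function names are hypothetical.

```python
import math


def power_transmission(impedance_ratio: float) -> float:
    """Fraction of incident power transmitted across a boundary between two
    media whose acoustic impedances differ by impedance_ratio
    (plane wave, normal incidence): T = 4r / (1 + r)^2."""
    r = impedance_ratio
    return 4.0 * r / (1.0 + r) ** 2


def area_ratio(eardrum_mm2: float, oval_window_mm2: float) -> float:
    """Pressure gain from concentrating the same force onto a smaller area."""
    return eardrum_mm2 / oval_window_mm2


if __name__ == "__main__":
    t = power_transmission(4000.0)  # impedance ratio from the slide
    print(f"Transmitted without matching: {100 * t:.2f}% ({10 * math.log10(t):.1f} dB)")
    gain = area_ratio(55.0, 3.2)  # areas from the slide
    print(f"Area ratio: {gain:.1f}x ({20 * math.log10(gain):.1f} dB pressure gain)")
```

With an impedance ratio of 4000 the transmitted fraction is about 0.1%, a loss of roughly 30 dB. The slide's two areas give a ratio of about 17, the same order as the 15x it quotes.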

22 ANATOMY

23 Inner Ear (Semicircular Canals, Cochlea) Semicircular canals: the body's balance organs Accelerometers in 3 perpendicular planes Hair cells detect fluid movements Connected to the auditory nerve

24 Inner Ear (Semicircular Canals, Cochlea) The cochlea is a snail-shell-like structure with 2.5 turns 3 fluid-filled parts: Scala tympani, Scala vestibuli, Cochlear duct (Organ of Corti) [Figure labels: (1) Organ of Corti (2) Scala tympani (3) Scala vestibuli (4) Spiral ganglion (5) Auditory nerve fibres]

25 Inner Ear (Semicircular Canals, Cochlea) Organ of Corti Basilar membrane Inner hair cells and outer hair cells (16,000-20,000) IHC: 100 tiny stereocilia The body's microphone: Vibrations of the oval window cause the cochlear fluid to vibrate Basilar membrane vibration produces a traveling wave Bending of the IHC cilia produces action potentials The outer hair cells amplify vibrations of the basilar membrane

26 The cochlea works as a frequency analyzer It operates on the incoming sound’s frequencies

27 Place Theory Each position along the basilar membrane (BM) has a characteristic frequency for maximum vibration Frequency of vibration depends on the place along the BM At the base, the BM is stiff and thin (more responsive to high frequencies) At the apex, the BM is wide and floppy (more responsive to low frequencies) [Figure labels: 32-35 mm long; 4 mm²; 1 mm²]
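One common way to put numbers on the place-to-frequency map is the Greenwood function. It is not mentioned in the slides and is brought in here purely as an illustration. The constants below are the usual human fit (A = 165.4, a = 2.1, k = 0.88), with position expressed as a fraction of BM length measured from the apex.

```python
def greenwood_hz(x_from_apex: float) -> float:
    """Characteristic frequency (Hz) at fractional BM position x (0 = apex, 1 = base).

    Greenwood function with the commonly used human constants; this model is
    an assumption of this sketch, not something stated in the slides.
    """
    A, a, k = 165.4, 2.1, 0.88
    return A * (10.0 ** (a * x_from_apex) - k)


if __name__ == "__main__":
    for x in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"x = {x:.2f} of BM length from apex -> {greenwood_hz(x):8.0f} Hz")
```

The endpoints come out near 20 Hz at the apex and a little over 20 kHz at the base, matching the low-at-apex, high-at-base arrangement described on the slide.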

28 Tuning curves of auditory nerve fibers Response curve is a BPF with almost constant Q (= f0/BW) To determine the tonotopic map on the cochlea: Apply 50 ms tone bursts every 100 ms Increase sound level until discharge rate increases by 1 spike Repeat for all frequencies
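"Almost constant Q" means the bandwidth of each fiber's response grows in proportion to its center frequency. The sketch below makes that concrete; the Q value of 8 is a placeholder picked for illustration and is not taken from the slides.

```python
def bandwidth_hz(center_hz: float, q: float) -> float:
    """Bandwidth of a band-pass response with quality factor Q = f0 / BW."""
    return center_hz / q


if __name__ == "__main__":
    q = 8.0  # hypothetical value, for illustration only
    for f0 in (250.0, 1000.0, 4000.0):
        print(f"f0 = {f0:6.0f} Hz -> BW = {bandwidth_hz(f0, q):6.1f} Hz")
```

Whatever Q is chosen, quadrupling the center frequency quadruples the bandwidth, so the filters look equally wide on a logarithmic frequency axis.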

29 Auditory Neuron Carries impulses from both the cochlea and the semicircular canals Connections with both auditory areas of the brain Neurons encode –Steady state sounds –Onsets or rapidly changing frequencies Auditory Area of Brain

30 Auditory Neurons Adaptation At onset, the firing rate of an auditory nerve fiber increases rapidly If the stimulus remains (e.g. a steady tone), the rate decreases exponentially Spontaneous rate: neuron firings in the absence of a stimulus The neuron is more responsive to changes than to steady inputs
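The exponential decay described on this slide can be sketched as a firing rate that starts at an onset peak and relaxes toward a steady-state value. All three parameters below (peak rate, steady rate, time constant) are hypothetical values chosen only to show the shape of the curve.

```python
import math


def firing_rate(t_s: float, peak: float = 200.0, steady: float = 80.0,
                tau_s: float = 0.02) -> float:
    """Firing rate (spikes/s) t_s seconds after the onset of a steady tone.

    Simple exponential adaptation: rate = steady + (peak - steady) * exp(-t / tau).
    The default parameter values are illustrative, not measured data.
    """
    return steady + (peak - steady) * math.exp(-t_s / tau_s)


if __name__ == "__main__":
    for t_ms in (0, 10, 20, 50, 100):
        print(f"t = {t_ms:3d} ms -> {firing_rate(t_ms / 1000.0):6.1f} spikes/s")
```

The large response at t = 0 followed by a decay to a lower plateau is what makes the fiber more responsive to onsets and changes than to a sustained input.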

31 Perception of Sound Threshold of hearing –How it is measured –Age effects Equal Loudness curves Bass loss problem Critical bands Frequency Masking Temporal Masking

32 Threshold of Hearing Hearing area is the area between the Threshold in quiet and the threshold of pain

33 Bekesy Tracking STEPS: Play a tone Vary its amplitude until it is audible Then the tone's amplitude is reduced until it is definitely inaudible, and the frequency is slowly changed Continue

34 Threshold variation with age Presbycusis: hearing sensitivity decreases with age, especially at high frequencies Threshold of pain remains the same Reduced dynamic range

35 Equal Loudness Curves Loudness is not simply sound intensity! A factor-of-ten increase in intensity is needed for the sound to be perceived as twice as loud.
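The rule of thumb on this slide (ten times the intensity, twice the loudness) implies that perceived loudness doubles for every 10 dB. A minimal sketch of that rule follows; it is an approximation that ignores the frequency dependence shown by the equal loudness curves, and the function name is illustrative.

```python
import math


def loudness_ratio(intensity_ratio: float) -> float:
    """Perceived loudness ratio under the rule 'x10 intensity = x2 loudness'."""
    return 2.0 ** math.log10(intensity_ratio)


if __name__ == "__main__":
    for ratio in (10.0, 100.0, 1000.0):
        level_db = 10.0 * math.log10(ratio)
        print(f"+{level_db:.0f} dB (x{ratio:.0f} intensity) -> "
              f"about {loudness_ratio(ratio):.0f}x as loud")
```

Under this rule a hundredfold increase in intensity (+20 dB) sounds only about four times as loud.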

36 The Bass Loss Problem E.g. rock music: Too low → no bass Too high → too much bass For very soft sounds, near the threshold of hearing, the ear strongly discriminates against low frequencies. For mid-range sounds around 60 phons, the discrimination is not so pronounced. For very loud sounds in the neighborhood of 120 phons, the hearing response is more nearly flat.

37 Elephants Sound Production A typical male elephant's rumble has an average minimum frequency of around 12 Hz, a female's around 13 Hz and a calf's around 22 Hz. Produce sounds ranging over more than 10 octaves, from 5 Hz to over 9,000 Hz Produce very gentle, soft sounds as well as extremely powerful sounds (112 dB recorded a meter away) Hearing Wider tympanic membranes Longer ear canals (20 cm) Spacious middle ears Low frequency detection

