Sound Perception The ear, the brain & psychoacoustics
Plan About sound... How does the ear work? Absolute thresholds of hearing Auditory masking Sound spatialisation Summary
ABOUT SOUND Some definitions, and reminders about the nature of sound
Sound "Sound is a mechanical wave that is an oscillation of pressure transmitted through a solid, liquid, or gas, composed of frequencies within the range of hearing and of a level sufficiently strong to be heard, or the sensation stimulated in organs of hearing by such vibrations" – Wikipedia.
Sound (cont’d) Sound is a pattern of compression and rarefaction of the air –Record it using microphones –Perceive it with our ears –Generate it by speaking or using loudspeakers Energy per m² decreases with the square of the distance from the source...
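The inverse-square statement on this slide can be checked with a few lines of arithmetic. The sketch below assumes an idealised point source radiating equally in all directions with no reflections or absorption; the function names and the 1 W source power are illustrative, not from the slides.

```python
import math


def intensity_at(power_watts: float, distance_m: float) -> float:
    """Intensity (W/m^2) at a given distance from an idealised point source.

    The source power is spread over a sphere of area 4*pi*d^2.
    """
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return power_watts / (4.0 * math.pi * distance_m ** 2)


def level_change_db(d_near_m: float, d_far_m: float) -> float:
    """Change in level (dB) when moving from d_near_m to d_far_m."""
    return 10.0 * math.log10((d_near_m / d_far_m) ** 2)


if __name__ == "__main__":
    for d in (1.0, 2.0, 4.0):
        # Each doubling of distance divides the intensity by four.
        print(d, intensity_at(1.0, d), round(level_change_db(1.0, d), 2))
```

Under these assumptions each doubling of distance lowers the level by about 6 dB; real rooms deviate from this because of reflections.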
Sound is a waveform Sound can be reflected when it hits a non-transmissive surface If the surface is flat, it is reflected in a coherent way Otherwise the reflection depends on frequency and surface texture Example: a soundproof studio wall, textured to absorb high frequencies
Effect of weather Because sound is carried by compression and decompression of the air, its propagation can be affected by temperature and wind –E.g. air temperature near the ground is cooler –E.g. wind
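Temperature matters because the speed of sound in air depends on it, so a temperature gradient bends the sound path towards the cooler (slower) layer. A minimal sketch using the standard ideal-gas approximation for dry air; the formula and the function name are additions for illustration and do not come from the slides.

```python
import math


def speed_of_sound_m_s(temperature_celsius: float) -> float:
    """Approximate speed of sound in dry air (ideal-gas approximation).

    331.3 m/s is the speed at 0 degrees Celsius; the speed scales with the
    square root of the absolute temperature.
    """
    return 331.3 * math.sqrt(1.0 + temperature_celsius / 273.15)


if __name__ == "__main__":
    for t in (0.0, 10.0, 20.0, 30.0):
        print(t, round(speed_of_sound_m_s(t), 1))
```

Humidity and wind are ignored here; wind adds its own velocity to the propagation and is not modelled.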
THE EAR How does the ear work? How do we perceive sound? How does it relate to sound synthesis techniques?
The ear [Figure: the ear, with the pinna labelled]
Outer ear The outer ear is composed of: –The pinna (visible part) –Auditory canal (meatus) –Tympanic membrane (eardrum) The pinna –significantly modifies incoming sound (esp. high frequencies) –is important for sound localization
(Outer ear in bats) Bats rely heavily on sonar for localization, navigation and hunting. Generate high pitched ultrasounds and listen for echoes. Highly refined sound perception & localisation (accurate enough to catch a bug in flight!)
The middle ear Sounds make the eardrum vibrate Vibrations are transmitted through the middle ear by 3 bones: –Malleus (hammer) –Incus (anvil) –Stapes (stirrup) The stapes is connected to a membrane called the oval window, which transmits sounds to the inner ear. [Figure labels: outer ear, auditory canal, tympanic membrane, tensor tympani, malleus (hammer), incus (anvil), stapes (stirrup), Eustachian tube, inner ear]
The middle ear (function?) Efficient transfer of sound from the air to the fluid in the cochlea (inner ear) –Otherwise, sound would mostly be reflected. –The oval window’s resistance to movement is higher than that of air –... and also higher than the eardrum’s. –... But its surface is smaller! So the middle ear is an efficient transfer mechanism (like the gears of a bicycle), esp. within 500Hz-4kHz
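The "smaller surface" argument is a pressure calculation: roughly the same force collected over the eardrum is applied to the much smaller oval window, so the pressure rises by the ratio of the two areas. The sketch below illustrates this; the areas used in the example (55 mm² and 3.2 mm²) and the optional lever ratio are assumed round figures for illustration, not values given in the slides.

```python
import math


def pressure_gain(eardrum_area_mm2: float,
                  oval_window_area_mm2: float,
                  lever_ratio: float = 1.0) -> float:
    """Pressure gain of an idealised middle ear.

    pressure = force / area, so concentrating the same force onto a
    smaller area multiplies the pressure by the area ratio. lever_ratio
    is an optional extra factor for the ossicles acting as a lever.
    """
    if eardrum_area_mm2 <= 0 or oval_window_area_mm2 <= 0:
        raise ValueError("areas must be positive")
    return (eardrum_area_mm2 / oval_window_area_mm2) * lever_ratio


def gain_db(gain: float) -> float:
    """Express a pressure ratio in decibels."""
    return 20.0 * math.log10(gain)


if __name__ == "__main__":
    g = pressure_gain(55.0, 3.2)  # assumed, illustrative areas
    print(round(g, 2), round(gain_db(g), 1))
```

This ignores friction and the frequency dependence noted on the slide (the transfer is most efficient around 500Hz-4kHz).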
The middle ear (function - cont’d) "Barany (1938) suggested that middle ear reduces transmission of bone-conducted sound to the cochlea" Moore (2003) –Internally generated sounds: chewing, flow of air, blood, creaking of joints... –These would cause masking... –The middle ear only transmits differential movements between the ossicles and the skull (when the skull vibrates, the stapes vibrates in sync with it and transmits nothing)! –Note: birds & reptiles usually swallow food whole...
The middle ear Acoustic reflex Muscles attached to the ossicles (stapedius, tensor tympani) can contract to pull the stapes away from the oval window, and therefore drastically reduce transmission. The ear reacts to a loud sound by contracting these muscles and attenuating the sounds that follow. Usually needs 70-90dB sounds to trigger. Protects the ear against loud sounds BUT it is slow, so it doesn’t help against sudden loud noises!
The inner ear The inner ear is also called the labyrinth Vestibular system: –Balance –Spatial orientation –Horizontal, posterior and superior canals... The cochlea is used for hearing... [Figure labels: posterior canal, superior canal, horizontal canal, utricle, saccule, vestibule, cochlea]
The Cochlea Shaped like a spiral (no functional reason – space economy?) Filled with incompressible fluids Rigid, bony walls transmit sound pressure without loss! Divided along its length by two membranes –Basilar membrane –Reissner’s membrane
Basilar membrane When the oval window moves –The round window moves in the opposite direction –The basilar membrane (BM) moves too –A wave propagates along the BM
Basilar membrane (cont’d) The mechanical properties of the BM vary along its length: –Narrow & stiff at the base –Wider & less stiff at the apex So the position of the peak of vibration depends on frequency! –High frequency: near the base –Low frequency: near the apex
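The place-to-frequency mapping described above can be sketched with Greenwood's (1990) function, a published empirical fit for the human cochlea. It is not mentioned in the slides and is used here only as an illustration; the constants are the commonly quoted human values and the function name is hypothetical.

```python
def greenwood_cf_hz(position_from_apex: float) -> float:
    """Characteristic frequency (Hz) at a relative position along the BM.

    position_from_apex is 0.0 at the apex and 1.0 at the base.
    Greenwood's fit: F = A * (10**(a * x) - k), human constants below.
    """
    if not 0.0 <= position_from_apex <= 1.0:
        raise ValueError("position must be between 0 (apex) and 1 (base)")
    A, a, k = 165.4, 2.1, 0.88
    return A * (10.0 ** (a * position_from_apex) - k)


if __name__ == "__main__":
    for x in (0.0, 0.25, 0.5, 0.75, 1.0):
        # Low frequencies map near the apex, high frequencies near the base.
        print(x, round(greenwood_cf_hz(x)))
```

The mapping is roughly logarithmic over most of the membrane, which is one reason pitch intervals are perceived as frequency ratios.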
Basilar membrane (cont’d)
The BM acts as an (imperfect) Fourier analyser! The frequency that gives the best response at a point of the BM is called the Characteristic Frequency (CF) for this point. In response to a steady frequency, all points vibrate at the same frequency, but some points vibrate with greater amplitude.
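To make the "Fourier analyser" comparison concrete, the sketch below applies a textbook discrete Fourier transform to a mixture of two tones and recovers their frequencies. It is an illustration of Fourier analysis itself, not a model of basilar membrane mechanics; the tone frequencies, sample rate and function names are arbitrary choices.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Return |X[k]| / N for k = 0 .. N/2 using the direct DFT definition."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, s in enumerate(samples):
            acc += s * cmath.exp(-2j * math.pi * k * t / n)
        mags.append(abs(acc) / n)
    return mags


def strongest_frequencies(samples, sample_rate_hz, count):
    """Frequencies (Hz) of the `count` largest DFT bins, in ascending order."""
    mags = dft_magnitudes(samples)
    n = len(samples)
    top_bins = sorted(range(len(mags)), key=lambda k: mags[k], reverse=True)[:count]
    return sorted(k * sample_rate_hz / n for k in top_bins)


def two_tone(f1_hz, f2_hz, sample_rate_hz, n_samples):
    """Sum of two sinusoids; the second has half the amplitude of the first."""
    return [math.sin(2 * math.pi * f1_hz * t / sample_rate_hz)
            + 0.5 * math.sin(2 * math.pi * f2_hz * t / sample_rate_hz)
            for t in range(n_samples)]


if __name__ == "__main__":
    signal = two_tone(260.0, 440.0, 4000, 400)  # bins are 10 Hz apart
    print(strongest_frequencies(signal, 4000, 2))
```

Each DFT bin plays the role of one "place" tuned to one frequency; on the real BM the tuning is broader and the places overlap, hence "imperfect".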
Organ of Corti
Organ of Corti & Hair cells Between the BM & the tectorial membrane lie the hair cells, which form part of the Organ of Corti –Inner hair cells (about 3,500 cells, about 40 hairs each) –Tunnel of Corti –Outer hair cells (about 12,000 cells, about 140 hairs each) (The hairs are called stereocilia)
Stereocilia (cont’d) Transform mechanical movements into neural activity Stereocilia are joined by fine links ("tip links") Deflection of the stereocilia applies tension to those links, which opens "transduction channels": flow of potassium ions, voltage alteration, etc.
Inner hair cells Each inner hair cell is connected to about 20 neurons Most (all?) information is transferred by the inner hair cells.
Outer hair cells What about the outer hair cells? They actively influence the mechanics of the cochlea –High sensitivity –Sharp tuning –Evidenced by experiments with drugs that affect the outer hair cells’ performance. Control from above? 1,800 efferent nerve fibers! So hearing is not a passive phenomenon: even the earliest stages are influenced by higher brain areas!
Otoacoustic emissions Experiments by Kemp (1978): if a click is sounded next to the ear, it is possible to detect a sound coming out of the ear (using a microphone sealed into the ear canal) –Reflections? –Not only! Sound can be detected with delays from 5 to 60ms (Kemp echoes). –The relative level is greater at low input levels (the emission grows by about 3dB for each 10dB of input) –May be stronger than the actual input! Disappears even with moderate cochlear pathologies
Neurons in the auditory nerve Approximately 30,000 neurons in each auditory nerve (left, right) Studied using fine-tipped micro-electrodes to record the voltage in single cells Most neurons fire spontaneously (0-150 spikes per second) Most neurons are tuned to specific frequencies. Phase locking: spikes occur at a specific phase of the stimulating waveform, which gives temporal regularity.
ABSOLUTE THRESHOLDS Some comments on the absolute limits of hearing
Minimum Audible Pressure An absolute threshold is the minimum detectable level of a sound, in the absence of other external sounds. Depends on set-up: important to define precisely how the intensity is measured –Probe microphone (ideally close to the eardrum) Usually using headphones. Threshold is called the Minimum Audible Pressure (MAP)
Minimum Audible Field Alternatively, loudspeakers in large anechoic chamber (walls, floor, ceiling are highly sound absorbing) –Measurement made after the subject is removed, at the point occupied by the center of the listener’s head –Minimum Audible Field (MAF)
MAF vs. MAP MAF is binaural, MAP is monaural MAF factors in the head, pinna & meatus effects (broad resonance) Thresholds increase rapidly at very high and very low frequencies, because of the transmission characteristics of the middle ear!
Absolute Thresholds Depends on people: individuals may vary by up to 20dB and have “normal” hearing. Highest audible frequency depends on age: kids up to 20kHz, adults about 15kHz
Hearing loss MAP can be used for generating audiograms and evaluating hearing loss. Two types of hearing loss: –Conductive: middle ear problems reducing sound transmission to the cochlea (e.g. infection: otitis media, bone growth around the stapes or oval window, wax in the ear canal). Elevation in absolute threshold. Help: hearing aid, surgery –Sensorineural: defects in the cochlea, the auditory nerve or higher centers in the brain. The extent of the loss increases with frequency (esp. in the elderly). Makes it hard to understand speech, esp. in noisy environments. Generally, no surgery is possible.
Conductive hearing loss Middle ear problems reducing sound transmission to the cochlea –E.g. infection (otitis media), bone growth around the stapes or oval window, wax in the ear canal –Elevation in absolute threshold –Help: hearing aid, surgery
Sensorineural hearing loss Due to defects in: –the cochlea –the auditory nerve –higher centers in the brain. The extent of the loss increases with frequency (esp. in the elderly). Makes it hard to understand speech, esp. in noisy environments. Generally, no surgery is possible. UK data, using frequencies 0.5, 1, 2 and 4kHz: –61-71: 51% with loss > 20dB, 30% > 40dB –71-80: 74% > 20dB, 30% > 40dB Using 4, 6 and 8kHz: –71-80: 98% > 20dB, 81% > 40dB
Temporal effect in Absolute Threshold Absolute thresholds of sounds depend on duration (Exner, 1876) –For sounds > 500ms, no effect –For sounds < 200ms, the minimal sound intensity increases as duration decreases The ear appears to integrate the stimulus energy over time –In practice: $(I - I_L)\,t = I_L\,\tau$, where $I$ is the threshold intensity for a sound of duration $t$, $I_L$ is the threshold intensity for a long sound (>500ms), and $\tau$ is a constant of the auditory system (its integration time)
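Reading the relation on this slide as (I − I_L)·t = I_L·τ, the threshold intensity for a short sound is I = I_L·(1 + τ/t). The sketch below turns that into a threshold elevation in dB. The value τ = 200 ms is an assumption chosen for illustration; the slides do not give a number.

```python
import math


def threshold_intensity(duration_s: float, long_threshold: float,
                        tau_s: float = 0.2) -> float:
    """Threshold intensity I for a sound of the given duration.

    Solves (I - I_L) * t = I_L * tau for I. tau_s = 0.2 is an assumed,
    illustrative integration time.
    """
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return long_threshold * (1.0 + tau_s / duration_s)


def threshold_shift_db(duration_s: float, tau_s: float = 0.2) -> float:
    """Elevation of the threshold (dB) relative to a long sound."""
    return 10.0 * math.log10(threshold_intensity(duration_s, 1.0, tau_s))


if __name__ == "__main__":
    for t in (0.01, 0.05, 0.2, 0.5, 1.0):
        print(t, round(threshold_shift_db(t), 2))
```

For very short sounds the shift approaches 10 dB per tenfold reduction in duration, which is what constant-energy integration predicts; for long sounds it tends to zero.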
AUDITORY MASKING Limits of the human hearing, how one sound can hide another...
Auditory masking The human auditory system has a limited capacity to resolve the sinusoidal components of complex sounds –E.g. if we listen to two tuning forks, one tuned at C (262Hz) and the other at A (440Hz), we hear two separate tones, each with its own pitch. –Yet, one sound can be obscured or rendered inaudible by other sounds (music from a car radio may mask the car’s engine – or conversely!)
Auditory Masking (definition) Definition: “1. The process by which the threshold of audibility for one sound is raised by the presence of another (masking) sound; 2. The amount by which the threshold of audibility of a sound is raised by the presence of another (masking) sound. The unit customarily used is the decibel.” American Standards Association
Auditory masking A sound is more easily masked by another having a similar frequency. – Limitations of the Basilar membrane – Limits of frequency selectivity Masking is very dependent on time: –Simultaneous presentation of the sounds –Forward masking –Backward masking
The Critical Band Fletcher (1940) suggested the auditory system works as a bank of bandpass filters, with overlapping passbands, based on the BM. When detecting a sound in a noisy background, a listener is assumed to make use of the filter with the closest center frequency. The threshold is determined by the amount of noise passing through this filter: this is the "power spectrum" view of masking (Patterson & Moore, 1986)
Critical Band (cont’d) Fletcher’s idea: Only a narrow band of frequencies surrounding the tone contributes to masking the tone When the noise just masks the tone, the power of the tone divided by the power of the noise inside the band is a constant K Assuming rectangular bandpass filters (not true, but convenient!), we have: –P/(W.N0) = K –Where W is the bandwidth, N0 is the noise power per unit bandwidth (per Hz), and P is the tone power.
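The relation P/(W.N0) = K can be rearranged to estimate the bandwidth W from a measured masked threshold. The sketch below does this in linear units and in dB. Taking K = 1 as the default is an assumption made for the example, and the levels used are made up for illustration.

```python
def masked_threshold_power(bandwidth_hz: float, noise_density: float,
                           k: float = 1.0) -> float:
    """Tone power P at masked threshold: P = K * W * N0."""
    return k * bandwidth_hz * noise_density


def critical_bandwidth_hz(tone_power: float, noise_density: float,
                          k: float = 1.0) -> float:
    """Bandwidth W of the assumed rectangular filter: W = P / (K * N0)."""
    return tone_power / (k * noise_density)


def critical_bandwidth_from_levels(tone_level_db: float,
                                   noise_spectrum_level_db: float,
                                   k: float = 1.0) -> float:
    """Same estimate from levels in dB (tone level, noise level per Hz)."""
    ratio = 10.0 ** ((tone_level_db - noise_spectrum_level_db) / 10.0)
    return ratio / k


if __name__ == "__main__":
    # Illustrative numbers: the tone is just masked at 58 dB in noise with
    # a spectrum level of 40 dB per Hz.
    print(round(critical_bandwidth_from_levels(58.0, 40.0), 1))
```

The estimate depends directly on the assumed K, so an error in K scales the inferred bandwidth by the same factor.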
Shape of the Auditory Filter The theory is sound, but the rectangular bandpass assumption is wrong The filter shape is estimated by measuring psychophysical auditory tuning curves The notched-noise method is used to remove off-frequency components
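A non-rectangular filter shape can be sketched with two published fits that are not part of these slides: the equivalent rectangular bandwidth (ERB) formula of Glasberg & Moore (1990) and the rounded-exponential "roex(p)" shape often fitted to notched-noise data. Both are used here as illustrative assumptions; the function names are hypothetical.

```python
import math


def erb_hz(center_hz: float) -> float:
    """Equivalent rectangular bandwidth (Hz) of the auditory filter.

    Glasberg & Moore (1990) fit: ERB = 24.7 * (4.37 * F + 1), F in kHz.
    """
    return 24.7 * (4.37 * center_hz / 1000.0 + 1.0)


def roex_weight(freq_hz: float, center_hz: float) -> float:
    """Power weighting of a symmetric roex(p) filter at freq_hz.

    W(g) = (1 + p*g) * exp(-p*g), with g the relative deviation from the
    centre frequency and p chosen so that the filter's ERB matches erb_hz.
    """
    g = abs(freq_hz - center_hz) / center_hz
    p = 4.0 * center_hz / erb_hz(center_hz)
    return (1.0 + p * g) * math.exp(-p * g)


if __name__ == "__main__":
    print(round(erb_hz(1000.0), 1))
    for f in (800.0, 900.0, 1000.0, 1100.0, 1200.0):
        print(f, round(10.0 * math.log10(roex_weight(f, 1000.0)), 1))
```

Unlike the rectangular filter, this shape has a rounded top and sloping skirts, so noise slightly outside the nominal band still contributes some masking.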
Auditory masking curves
Auditory masking curves show how much masking occurs at which frequencies Useful for efficient compression: no need to encode a frequency if we can’t hear it! –Eg, MP3
Contralateral masking Another form of masking is when the signal is presented to one ear, and the noise is presented to the other. This is called contralateral masking When both sound and noise are presented to the same ear, this is called ipsilateral masking
Temporal Masking This occurs when the masking and masked signals are not simultaneous –If the masking sound precedes the masked sound, it is called forward masking. –If the masking sound follows the masked sound, it is called backward masking –Masking effectiveness falls off exponentially away from the onset and offset of the masker: over about 20ms before the onset, and over about 100ms after the offset. Note: this is different from the ear’s acoustic reflex (which reduces the ear’s sensitivity after a loud sound)
PERCEPTION OF LOUDNESS Psychoacoustic perception of loudness, versus sound pressure.
Loudness Fletcher & Munson (1933) –Subjects listened to pure tones at various frequencies, with the amplitude increased in 10dB steps Robinson & Dadson (1956) –More accurate –Basis for the ISO 226 standard Perceived loudness level (phons) –A 1kHz tone at N dB SPL has a loudness level of N phons [Figure: equal-loudness contours, British Standard BS ISO 226 (2003); source: Wikipedia]
SOUND SPATIALIZATION Sound localization in space and stereo hearing...
Sound Spatialization Localization accuracy is 1 degree for sources in front of the listener and 15 degrees for sources to the sides. Humans can discern interaural time differences of 10 microseconds or less
Cues for Localization Interaural time difference –The sound reaches the two ears at different times, depending on its location in space –Phase delay at low frequencies –Group delay at high frequencies Interaural level difference –The sound is louder in one ear than in the other. Distance can be estimated from –Spectrum: high frequencies are attenuated more quickly with distance –Loudness: distant sources are quieter –Movement: parallax depends on distance
Cues for Localization In addition, the pinnae modify the spectra of incoming sounds in a way that depends on the angle of incidence of the sound relative to the head Head+pinna form a direction-dependent filter Measured by comparing the spectrum of the sound source with the spectrum reaching the eardrum: Head Related Transfer Function (HRTF) High frequencies (>6kHz) interact especially strongly with the pinna.
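The interaural time difference can be estimated with Woodworth's spherical-head approximation, ITD = (r/c)·(θ + sin θ). This formula is not in the slides; the head radius (8.75 cm) and speed of sound (343 m/s) below are assumed typical values, and the function name is hypothetical.

```python
import math


def itd_seconds(azimuth_deg: float,
                head_radius_m: float = 0.0875,
                speed_of_sound_m_s: float = 343.0) -> float:
    """Interaural time difference for a distant source (spherical head).

    azimuth_deg is measured from straight ahead (0) to directly opposite
    one ear (90). Woodworth's approximation: ITD = (r / c) * (theta + sin(theta)).
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth must be between 0 and 90 degrees")
    theta = math.radians(azimuth_deg)
    return head_radius_m / speed_of_sound_m_s * (theta + math.sin(theta))


if __name__ == "__main__":
    for angle in (0.0, 1.0, 15.0, 45.0, 90.0):
        print(angle, round(itd_seconds(angle) * 1e6, 1), "microseconds")
```

With these assumed values, a source 1 degree off centre gives an ITD of roughly 9 microseconds, which is consistent with the 1-degree accuracy and the 10-microsecond sensitivity quoted on the previous slide.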
SYNESTHESIA An unusual case....
Synesthesia Synesthesia: perceiving one sense as another, e.g. sounds as colors. Prevalence: unknown, could be as high as 1 in 23 (Simner, Mulvenna, Sagiv et al., 2006) This is believed to have a neurological basis (fMRI evidence) Famous synesthetes include David Hockney, who perceives music as color, shape, and configuration, and who uses these perceptions when painting opera stage sets – but not while creating his other artworks
Synesthesia Some facts: –Synesthesia is involuntary and automatic –Synesthetic perceptions are spatially extended (sense of location) –Synesthetic percepts are consistent and generic –Synesthesia is highly memorable –Synesthesia is laden with affect Richard Cytowic (2002, 2003, 2009)
Plan About sound... How does the ear work? Absolute thresholds of hearing Auditory masking Sound spatialization Summary
Additional Reading Brian C.J. Moore (2003) An Introduction to the Psychology of Hearing, Academic Press.