Auditory scene analysis 2
This lecture: Gestalt organising principles; sequential grouping
Gestalt Principles of Organisation: Gestalt psychology was founded in the early 20th century. A group of psychologists, Max Wertheimer, Wolfgang Köhler and Kurt Koffka, formed the Gestalt school. Gestalt theory: the perceptual whole is more than the sum of its parts. They put forward a set of Gestalt grouping rules that describe which elements in an image belong together to form an object; the rules were mostly described in relation to vision.
Gestalt Principles of Organisation: These principles can also apply to hearing (auditory perception). Applying them generally groups the parts of the input sound that come from the same source and segregates those that do not. Each principle is discussed separately. Important: the principles work together to arrive at a correct interpretation of the input sound; no single rule will always work.
Gestalt Principles of Organisation: Similarity; Good Continuation; Common Fate; Closure; the Figure-Ground phenomenon and Attention.
Similarity: Sounds are grouped into a single perceptual stream if they are similar in pitch, timbre, loudness or subjective location. Demonstration 17: Failure of crossing trajectories to cross perceptually. A falling and a rising sequence are interleaved, so that tones from the two sequences alternate in time. How easy is it to hear out each of the four standards? Grouping is by timbre and frequency region.
Good Continuation: Changes in frequency, intensity, location or spectrum within a single sound source tend to be smooth and continuous rather than abrupt. A smooth change implies a change within a single source. An abrupt change signals a new stream and therefore a new source. ASA demo 12: Effects of connectedness on segregation.
Good Continuation: In this example, the tendency of a sequence of high and low tones to split into two streams is reduced when successive tones are connected by frequency glides. The tones heard are H1 (2000 Hz), L1 (614 Hz), H2 (1600 Hz) and L2 (400 Hz). Connecting the tones through frequency glides helps prevent the sequence from segregating into separate streams. Continuity helps hold auditory sequences together; streaming is stronger in the unconnected sequence.
Common Fate: This principle is based on the fact that different frequency components arising from a single sound source usually vary in a highly coherent way. They tend to start and finish together and to change in intensity and frequency together. Two or more frequency components in a complex sound are grouped together and perceived as part of the same source if they undergo the same kinds of changes at the same time.
Common Fate: For example, a group of frequency components in a complex sound that are frequency modulated at the same rate can be heard out as a separate group from the other components. ASA demo 24: Role of frequency micro-modulation in voice perception. Frequency components in speech contain small fluctuations called micromodulation. These micromodulations are present in all the harmonics of a vowel, so the harmonics move in parallel.
Common Fate: The demo presents, in order: a pure tone; the pure tone plus harmonics (the pure tone is heard continuing on); then micromodulation and vibrato added to all harmonics, causing them to fuse. The pure tone is no longer heard as a separate sound and a singing voice emerges. This correlated change causes the harmonics to group into a coherent speech sound, which is an example of common fate.
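The parallel movement of the harmonics can be sketched in code. This is a minimal illustration under stated assumptions, not the stimulus used in the demonstration: the 200 Hz fundamental in the usage note, the 6 Hz modulation rate and the 1% modulation depth are assumed values, and `harmonic_freqs` and `synth` are hypothetical helper names.

```python
import math

def harmonic_freqs(f0, n_harmonics, t, mod_rate=6.0, mod_depth=0.01):
    """Instantaneous frequency (Hz) of each harmonic at time t (seconds).

    Every harmonic is scaled by the same modulation factor, so all
    components rise and fall together (common fate).
    mod_rate and mod_depth are assumed values, not taken from the demo.
    """
    factor = 1.0 + mod_depth * math.sin(2.0 * math.pi * mod_rate * t)
    return [k * f0 * factor for k in range(1, n_harmonics + 1)]

def synth(f0, n_harmonics, dur, sr=16000, mod_rate=6.0, mod_depth=0.01):
    """Sum of harmonics sharing one frequency modulation, as a list of floats."""
    phases = [0.0] * n_harmonics
    out = []
    for n in range(int(dur * sr)):
        freqs = harmonic_freqs(f0, n_harmonics, n / sr, mod_rate, mod_depth)
        # Average the components so the output stays within [-1, 1].
        out.append(sum(math.sin(p) for p in phases) / n_harmonics)
        # Advance each phase by its own instantaneous frequency.
        phases = [p + 2.0 * math.pi * f / sr for p, f in zip(phases, freqs)]
    return out
```

Calling `synth(200.0, 5, 0.5)` returns half a second of a five-harmonic complex. Setting `mod_depth=0.0` gives the unmodulated version for comparison; at every instant the ratio between any harmonic and the fundamental stays fixed at the harmonic number, which is the coherent change the slide describes.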
Closure: A sound may be temporarily masked by other sounds, and a masked sound may be perceived as continuing behind the masker. This is demonstrated through the continuity effect. ASA demo 28: Apparent continuity. Does the tone appear to continue through the noise? ASA demo 29: Perceptual continuation of a gliding tone through a noise burst.
Perceptual organisation of sequences of sounds: Sequential grouping (integration) means connecting sounds over time, e.g. connecting the notes of the same instrument to create a melody; it leads to the formation of auditory streams. Stream segregation (fission): when we hear a rapid sequence of sounds, the sounds may be perceived as a single perceptual stream or they may split into a number of perceptual streams. Streaming denotes the processes determining whether one stream or multiple streams are heard.
Perceptual organisation of sequences of sounds: Streaming can occur if the elements making up the sequence differ markedly in frequency, amplitude, location or spectrum. It is more difficult to judge the temporal order of a pair of elements when they are part of separate streams than when they are in the same stream. ASA demo 1: Stream segregation in a cycle of six tones. There are 3 high and 3 low tones, in the order H1 (2500 Hz), L1 (350 Hz), H2 (2000 Hz), L2 (430 Hz), H3 (1600 Hz), L3 (550 Hz).
Stream segregation: When the sequence of tones is played slowly, we clearly hear the alternation of high and low tones as a single six-note melody, and it is easy to hear the temporal order of the tones. When it is played fast, we hear two streams, one high and one low, as a pair of three-note melodies; in this case it is more difficult to hear the order of the tones. Stream segregation becomes stronger at faster tone rates, and segregation affects the perceived melody.
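A cycle like the one in ASA demo 1 can be sketched as follows, using the six frequencies H1 (2500 Hz), L1 (350 Hz), H2 (2000 Hz), L2 (430 Hz), H3 (1600 Hz), L3 (550 Hz). The tone durations for the "slow" and "fast" versions, the sample rate, and the helper names `tone` and `cycle` are assumptions made for illustration; the demonstration's actual timings are not given here.

```python
import math

# High and low tones alternate: H1, L1, H2, L2, H3, L3.
CYCLE_HZ = [2500, 350, 2000, 430, 1600, 550]

def tone(freq, dur, sr=16000):
    """One sine tone as a list of samples in [-1, 1] (no onset/offset ramps)."""
    return [math.sin(2.0 * math.pi * freq * n / sr) for n in range(int(dur * sr))]

def cycle(tone_dur, repeats=1, sr=16000):
    """Concatenate the six tones; a shorter tone_dur gives a faster sequence."""
    samples = []
    for _ in range(repeats):
        for f in CYCLE_HZ:
            samples.extend(tone(f, tone_dur, sr))
    return samples

slow = cycle(tone_dur=0.4)  # assumed slow rate: 2.5 tones per second
fast = cycle(tone_dur=0.1)  # assumed fast rate: 10 tones per second
```

Only the tone duration differs between `slow` and `fast`; the frequencies and their order are identical, so any difference in what is heard comes from the presentation rate.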
Stream segregation: ASA demo 3: Loss of rhythmic information as a result of stream segregation. Triplets of tones separated by silences (HLH-HLH-HLH…) give the perception of a ‘galloping rhythm’. The ‘galloping rhythm’ is lost when streaming occurs, because each stream has its own separate melody and rhythm. This illustrates the importance of speed and frequency separation of sounds in the formation of streams.
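Why the gallop disappears can be seen by looking at the onset times within each stream. The sketch below assumes that the silence between triplets lasts exactly one tone slot, which the slide does not state; `gallop_onsets` and `stream_intervals` are hypothetical helper names.

```python
def gallop_onsets(n_triplets, slot_s):
    """(onset_time, label) pairs for the pattern H L H - repeated.

    The fourth slot is silent. Its length (one tone slot) is an assumption.
    """
    pattern = ["H", "L", "H", None]
    events = []
    for i in range(n_triplets * len(pattern)):
        label = pattern[i % len(pattern)]
        if label is not None:
            events.append((i * slot_s, label))
    return events

def stream_intervals(events, label):
    """Onset-to-onset intervals within one stream (all events with one label)."""
    times = [t for t, lab in events if lab == label]
    return [b - a for a, b in zip(times, times[1:])]
```

Under this assumption, the high tones taken alone are evenly spaced at two slots apart, and the low tones alone are evenly spaced at four slots apart. The uneven long-short-short grouping exists only when high and low tones are heard as one sequence, so once they split into two streams each stream carries a plain, regular rhythm.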
Stream segregation: Perceptual segregation of a sequence of tones requires a fast rate and a large separation between the frequencies of the high and low tones. There is no segregation at slow speeds. At high speeds there may be segregation, depending on the frequency separation: a large frequency separation is needed for the sequence to break into two separate streams.
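The two quantities that matter here, frequency separation and presentation rate, can be put on convenient scales. The sketch below expresses separation in semitones (12 per octave) and rate in tones per second; the helper names and the example values are assumptions, and no segregation threshold is implied.

```python
import math

def semitones(f_high, f_low):
    """Frequency separation in equal-tempered semitones (12 per octave)."""
    return 12.0 * math.log2(f_high / f_low)

def tones_per_second(onset_interval_s):
    """Presentation rate for a given onset-to-onset interval in seconds."""
    return 1.0 / onset_interval_s

# Example values only: a high/low pair one octave apart, one tone every 100 ms.
sep = semitones(880.0, 440.0)
rate = tones_per_second(0.1)
```

A logarithmic scale is used for separation because the same frequency ratio gives the same number of semitones wherever it falls in the frequency range.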
Stream segregation: ASA demo 5: Segregation of a melody from interfering tones. Note when you can identify the melody. ASA demo 10: Stream segregation based on spectral peak position. This shows how timbre differences promote segregation.
Stream segregation: Two tones have the same fundamental but different positions of their spectral peaks (i.e. where in the frequency range the most energy lies), which gives a difference in timbre. The duller tone has its spectral peak at 300 Hz; the brighter tone has its spectral peak at 2000 Hz. The tones are alternated in a galloping rhythm which gradually speeds up. Do you hear separate streams of brighter and duller tones?
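One way to picture two tones that share a fundamental but differ in spectral peak is to weight the same set of harmonics with differently centred envelopes. The sketch below is an illustration only: the 100 Hz fundamental, the Gaussian envelope on a log-frequency axis, its one-octave width, and the helper names are all assumptions, since the slide gives only the peak positions (300 Hz and 2000 Hz).

```python
import math

def harmonic_amps(f0, peak_hz, n_harmonics=40, bandwidth_oct=1.0):
    """Amplitude of each harmonic under a Gaussian envelope centred on peak_hz.

    Distance from the peak is measured in octaves. The envelope shape and
    width are assumed for illustration.
    """
    amps = []
    for k in range(1, n_harmonics + 1):
        dist_oct = math.log2(k * f0 / peak_hz)
        amps.append(math.exp(-0.5 * (dist_oct / bandwidth_oct) ** 2))
    return amps

def strongest_harmonic_hz(f0, amps):
    """Frequency of the harmonic with the largest amplitude."""
    k = max(range(len(amps)), key=lambda i: amps[i]) + 1
    return k * f0

F0 = 100.0  # assumed fundamental; the slide does not state it
dull = harmonic_amps(F0, peak_hz=300.0)
bright = harmonic_amps(F0, peak_hz=2000.0)
```

Both tones contain the same harmonic frequencies, so the fundamental and hence the pitch is shared; only the distribution of energy across the harmonics differs, which is the timbre difference the demo relies on.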
Figure-Ground phenomenon and Attention: We generally do not attend to every aspect of the auditory input; certain parts are selected for conscious analysis. A complex sound is analysed into streams and we attend to one stream at a time. The attended stream stands out perceptually while the rest of the sound becomes less salient. This separation into attended and unattended streams is the ‘figure-ground phenomenon’. For example, we attend to one conversation at a time at a party while the other conversations form a background.
Figure-Ground phenomenon and Attention: It is possible to switch attention from one conversation or melody to another, and we may be aware of other sounds, but it seems that one stream at a time is selected for complete conscious analysis. Importance of changes: the listener's attention is usually drawn to aspects of the sound that are changing. The changing part becomes the figure while the relatively unchanging part(s) become the background.