CSP05-06 - Auditory input processing 1 Auditory input processing Lecturer: Smilen Dimitrov Cross-sensorial processing – MED7.


1 CSP05-06 - Auditory input processing 1 Auditory input processing Lecturer: Smilen Dimitrov Cross-sensorial processing – MED7

2 CSP05-06 - Auditory input processing 2 Introduction The immobot base exercise Work on the auditory input Goal – sound source localization in 3D Setup: –PC –Two microphones –Sound card

3 CSP05-06 - Auditory input processing 3 Setup – microphone problems We need to use two microphones to obtain a stereo signal For regular PC microphones (like our Sandbergs): –Take note they are electret! –They demand +5V from the PC in order to work –All PC mic inputs follow this standard: although we have a tip-ring-sleeve jack connector, it is NOT a stereo jack. Thus a PC mic input will always show as mono (stereo button will be greyed out in Recording control of Windows mixer)

4 CSP05-06 - Auditory input processing 4 Setup – microphone problems We need to use two microphones to obtain a stereo signal For regular PC microphones (like our Sandbergs): Hence the connection cable below will NOT work (as it assumes that the electret connector is a stereo one)

5 CSP05-06 - Auditory input processing 5 Setup – microphone problems Hence, we will have to use: –a dedicated audio card, –with two microphone inputs, even if we want to use cheap electrets for stereo! One possible sound card: M-Audio MobilePre USB

6 CSP05-06 - Auditory input processing 6 Setup – microphone problems Interfacing two electrets for stereo input: –would involve a schematic cable like below: (assuming we have a stereo plug mic input on the card)

7 CSP05-06 - Auditory input processing 7 Setup – microphone problems To avoid these problems with electrets, we are going to use capacitor microphones (Generis). Note that these microphones must be connected using an XLR cable (the M-Audio card has such mic inputs). Note that condenser/capacitor microphones demand a power supply, so-called “phantom power” (the M-Audio card provides it). Thus, we should make sure the sound card and the microphones are compatible.

8 CSP05-06 - Auditory input processing 8 Setup Setup for a PC (in addition to the microphones and the sound card): 1. M-Audio MobilePre USB drivers 2. Max/MSP/Jitter. Microphone parameters need not be specified in the algorithm discussed today.

9 CSP05-06 - Auditory input processing 9 Goal of the auditory processing algorithm Object detection: –the application needs to detect the presence of a new object whenever it enters the monitored environment (say, a sound louder than a threshold) Object recognition: –once a new object is detected, it needs to be classified to determine its type (e.g., a car versus a truck, a tiger versus a deer); this involves comparing sounds (spectrum signatures) Object tracking: –assuming the new object is of interest to the application, it can be tracked as it moves through the environment. Tracking involves computing the current location of the object and its trajectory Preprocess audio; estimation of 3D location through ITD / cross-correlation Relation to the model we had for visual input processing: –not really applicable for the algorithm discussed, but could be – here we will directly do tracking
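The detection goal on this slide ("a sound louder than a threshold") can be illustrated with a small sketch. It is not the lecture's implementation: the frame-based RMS measure, the function names `rms` and `detect_onsets`, and the parameters `frame_size` and `threshold` are all assumptions chosen for illustration.

```python
import math


def rms(frame):
    """Root-mean-square level of one frame of samples."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def detect_onsets(samples, frame_size, threshold):
    """Return start indices of frames where the level first rises above threshold.

    The signal is cut into non-overlapping frames (an assumption); a frame
    counts as an onset only if the previous frame was below the threshold.
    """
    onsets = []
    was_loud = False
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        loud = rms(samples[start:start + frame_size]) > threshold
        if loud and not was_loud:
            onsets.append(start)
        was_loud = loud
    return onsets


# Hypothetical test signal: silence, a short burst, silence again.
signal = [0.0] * 8 + [0.5, -0.5] * 4 + [0.0] * 8
print(detect_onsets(signal, frame_size=4, threshold=0.1))
```

A real detector would probably also need some hysteresis or smoothing so that a level hovering near the threshold does not trigger repeatedly.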

10 CSP05-06 - Auditory input processing 10 Goal of the auditory processing algorithm

11 CSP05-06 - Auditory input processing 11 Sound-source localization using ITD and cross-correlation Small comparison between a stereo camera system and a two-microphone system –Camera – 2D sensor (2D array of photocells) –Single camera can give a vector of direction to tracked object –Two cameras can give a point (intersection of direction vectors – CPA) –Microphone – 1D sensor (senses values at a single point – corresponds to a single photocell in camera) –Single microphone cannot give any geometric information –Two microphones can only give azimuthal angle – which corresponds to a vector of direction, confined to the “horizontal” plane

12 CSP05-06 - Auditory input processing 12 Sound-source localization using ITD and cross-correlation Algorithm – computing the time delay of arrival (TDOA) of the wave front at the two microphones –In biological terms this is the equivalent of the Interaural Time Difference (ITD) –We compute the lag of the wave at a specific point received at both microphones (the Interaural Phase Difference (IPD)) –Must find the time difference between two identical points in the left and right sound signal – using cross-correlation

13 CSP05-06 - Auditory input processing 13 Sound-source localization using ITD and cross-correlation Cross-correlation – two arrays, representing the left and right audio signal: g and h – their correlation is also an array The length of the cross-correlation array is
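The formula on this slide did not survive in the transcript. As a hedged stand-in, the standard discrete cross-correlation can be written as follows, where N and M (the lengths of g and h) and the lag index k are symbols introduced here, not taken from the original:

```latex
(g \star h)[k] = \sum_{n=0}^{M-1} g[n+k]\, h[n],
\qquad k = -(M-1), \dots, N-1
```

Terms where n + k falls outside 0 … N−1 are treated as zero. Under this definition there are N + M − 1 possible lags, so two equally long buffers of N samples give a correlation array of 2N − 1 values.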

14 CSP05-06 - Auditory input processing 14 Sound-source localization using ITD and cross-correlation Cross-correlation – in essence, what we are doing is taking one array and “sliding” it across the other, finding the sum of the products between respective elements at each position.
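The "sliding" description above can be sketched directly in plain Python. This is a minimal, unoptimized illustration rather than the MSP object built in the lecture; the function names `cross_correlate` and `best_lag` and the sign convention (a positive lag means g is delayed relative to h) are assumptions.

```python
def cross_correlate(g, h):
    """Full cross-correlation of g and h by sliding h across g.

    Returns (lags, values) where values[i] is the sum over n of
    g[n + lag] * h[n] for lag = lags[i]; out-of-range samples count as zero.
    """
    n_g, n_h = len(g), len(h)
    lags = list(range(-(n_h - 1), n_g))
    values = []
    for lag in lags:
        total = 0.0
        for n in range(n_h):
            m = n + lag
            if 0 <= m < n_g:
                total += g[m] * h[n]
        values.append(total)
    return lags, values


def best_lag(g, h):
    """Lag (in samples) at which the cross-correlation peaks."""
    lags, values = cross_correlate(g, h)
    peak = max(range(len(values)), key=lambda i: values[i])
    return lags[peak]


# Hypothetical example: g is the same pulse as h, delayed by two samples.
h = [1, 2, 1, 0, 0, 0, 0, 0]
g = [0, 0, 1, 2, 1, 0, 0, 0]
print(best_lag(g, h))
```

The double loop costs on the order of N·M multiplications per pair of buffers, so a real-time version would normally restrict the lag range to what the microphone spacing physically allows.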

15 CSP05-06 - Auditory input processing 15 Sound-source localization using ITD and cross-correlation Cross-correlation – algorithm First, find the time increment between sampling: Assume the sound can be analyzed through the diagram below: Sound arriving at the left channel will arrive at the right channel after crossing distance b – we know the speed of sound, so we can also calculate the time difference

16 CSP05-06 - Auditory input processing 16 Sound-source localization using ITD and cross-correlation Cross-correlation – algorithm Assume the sound can be analyzed through the diagram below: Trigonometry:

17 CSP05-06 - Auditory input processing 17 Sound-source localization using ITD and cross-correlation Cross-correlation – algorithm Assume the sound can be analyzed through the diagram below: The time difference: –Where Δ = time between sound samples, and σ = the number of delay samples returned from the cross-correlation function.
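The formulas for the time increment and the time difference are not preserved in this transcript. A plausible reconstruction from the definitions of Δ and σ, assuming a sampling rate f_s and writing t for the time difference (both symbols introduced here), is:

```latex
\Delta = \frac{1}{f_s}, \qquad t = \sigma \, \Delta
```

For example, at an assumed f_s = 44100 Hz, a lag of σ = 10 samples corresponds to t = 10 / 44100 s ≈ 0.227 ms.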

18 CSP05-06 - Auditory input processing 18 Sound-source localization using ITD and cross-correlation Cross-correlation – algorithm Calculate the length of line a –Speed of sound v ≈ 343 m/s at room temperature (about 20 °C) Finally, calculate the angle θ –Where c is the known distance between the microphones
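The steps on this slide can be sketched as follows. The slide's diagram and formulas are not reproduced in the transcript, so this sketch assumes the common far-field relation sin θ = a / c, with θ measured from the direction perpendicular to the line joining the microphones. It also assumes a speed of sound of roughly 343 m/s (dry air near 20 °C). The function name `azimuth_from_lag` and its parameters are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for dry air near 20 degrees C


def azimuth_from_lag(lag_samples, sample_rate_hz, mic_distance_m,
                     speed=SPEED_OF_SOUND):
    """Estimate the azimuth angle (radians) from a cross-correlation lag.

    Assumes a distant source, so that sin(theta) = a / c, where
    a = speed * time difference and c = distance between the microphones.
    """
    delta = 1.0 / sample_rate_hz   # time between samples
    t = lag_samples * delta        # time difference between channels
    a = speed * t                  # extra path length travelled
    # Noise or rounding can push a / c slightly outside [-1, 1]; clamp it.
    ratio = max(-1.0, min(1.0, a / mic_distance_m))
    return math.asin(ratio)


# Hypothetical setup: 44.1 kHz sampling, microphones 0.2 m apart, lag of 10.
theta = azimuth_from_lag(10, 44100, 0.2)
print(round(math.degrees(theta), 1))
```

With this spacing the largest physically meaningful lag is about 0.2 × 44100 / 343 ≈ 25 samples, which also bounds the angular resolution: each sample of lag corresponds to a few degrees.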

19 CSP05-06 - Auditory input processing 19 Sound-source localization using ITD and cross-correlation When θ is finally computed, we obtain a direction vector by rotating the unit vector in the horizontal plane (xz) around the vertical axis (y) by the angle θ. So, the vector DA with components (-sin θ, 0, cos θ) will represent the direction of the detected audio source
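A minimal sketch of this last step, using the components given on the slide. The function name `direction_vector` is hypothetical, and the reference direction (the z axis at θ = 0) is inferred from the components rather than stated in the original.

```python
import math


def direction_vector(theta):
    """Unit vector DA in the horizontal (xz) plane for azimuth theta (radians).

    Uses the components from the slide: (-sin(theta), 0, cos(theta)).
    At theta = 0 this points along the z axis (inferred reference direction).
    """
    return (-math.sin(theta), 0.0, math.cos(theta))


x, y, z = direction_vector(math.radians(30.0))
print(x, y, z)
```

Because sin² θ + cos² θ = 1, the result always has unit length, so it can be scaled by any assumed distance when it is drawn.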

20 CSP05-06 - Auditory input processing 20 Sound-source localization using ITD and cross-correlation Overview of the algorithm (architecture)

21 CSP05-06 - Auditory input processing 21 Sound-source localization using ITD and cross-correlation Problems with the approach: We only retrieve a direction vector in a plane (azimuthal angle) – information about the “vertical” position of the sound source is lost. Localization of an audio source as a 3D point is possible using two microphones if some medium that changes the sound (a “head”) is placed between the microphones and a head-related transfer function is then calculated.

22 CSP05-06 - Auditory input processing 22 Implementation in Max/MSP We will program our own MSP object to perform audio cross-correlation in real time, then proceed to vector calculation and display

