Presentation transcript: Krzysztof Marasek, Summer 2002, Katedra Multimediów, PJWSTK

Acoustic transduction
- Speech sounds are rapid variations of air pressure and velocity around their normal values.
- Sound field: the variations of air density and pressure are functions of time and space and propagate as an acoustic wave.
- Assume the air in the room to be homogeneous.
- The speed of acoustic wave propagation depends on the temperature (in K).
- The wave equation describes the propagation of sound when the pressure is represented by a scalar field p(a,t), with a = [x y z]^T.

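The two formulas this slide refers to are not preserved in the transcript. As a hedged sketch, the standard textbook forms are given below. The constant 331.3 m/s (speed of sound in dry air at 273.15 K) and the symbol T_K for the temperature in kelvin are assumptions, not taken from the slide.

```latex
c \approx 331.3\,\sqrt{\frac{T_K}{273.15}}\ \ \text{m/s},
\qquad
\nabla^{2} p(\mathbf{a},t) = \frac{1}{c^{2}}\,\frac{\partial^{2} p(\mathbf{a},t)}{\partial t^{2}}
```

At T_K = 293.15 K (20 °C) the first expression gives c of roughly 343 m/s.
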
Wave propagation (2)
- One of the solutions of the wave equation is the monochromatic plane wave of frequency f, where A is the wave amplitude and k = [k_x, k_y, k_z]^T is the wavenumber vector, whose direction is normal to the propagating wavefront.
- The distance c/f is called the wavelength and describes the spatial period of the propagating wave.
- In spherical coordinates (r, θ, φ), the sound pressure of a spherical wave depends only on the distance r from the source.
- Any sound field can be expressed as a superposition of elementary plane and spherical waves.

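The plane-wave expression itself is missing from the transcript. The sketch below assumes the common complex form p(a,t) = A exp(j(2πft - k·a)) with |k| = 2πf/c; the sign convention, the function names and the value c = 343 m/s are illustrative assumptions. It checks that moving one wavelength c/f along the propagation direction leaves the pressure unchanged.

```python
import cmath
import math

def wavenumber_vector(f_hz, direction, c=343.0):
    """Wavenumber vector k: length 2*pi*f/c, pointing along the propagation direction."""
    norm = math.sqrt(sum(d * d for d in direction))
    k_mag = 2.0 * math.pi * f_hz / c
    return [k_mag * d / norm for d in direction]

def plane_wave(amplitude, k, a, f_hz, t):
    """Complex pressure A * exp(j(2*pi*f*t - k.a)) at position a and time t (assumed form)."""
    phase = 2.0 * math.pi * f_hz * t - sum(ki * ai for ki, ai in zip(k, a))
    return amplitude * cmath.exp(1j * phase)

f_hz, c = 1000.0, 343.0
wavelength = c / f_hz                              # spatial period of the wave
k = wavenumber_vector(f_hz, [1.0, 0.0, 0.0], c)    # wave travelling along x
p0 = plane_wave(1.0, k, [0.0, 0.0, 0.0], f_hz, 0.0)
p1 = plane_wave(1.0, k, [wavelength, 0.0, 0.0], f_hz, 0.0)
```

Here `p0` and `p1` agree up to rounding, because the phase changes by exactly 2π over one wavelength.
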
Formants

Room acoustics
- Reflections from surfaces, together with diffusion and diffraction by objects inside the room, produce the reverberation effect.
- T60 is the reverberation time, defined as the time needed for the acoustic power of the signal to decay by 60 dB after the sound source is abruptly stopped.
- T60 is nearly independent of the listening position in a given enclosure and can be approximated by the Sabine formula from V, the room volume in m^3, S, the total surface area of the room in m^2, and α, the average absorption coefficient of the surfaces.
- Reverberation times up to 1 s (for frequencies of 500-1000 Hz) do not cause any loss in speech intelligibility.
- Impulse response h(t): describes the path between source and receiver, including all reflections.
- Early reflections are perceived separately if their delay is > 50 ms; shorter delays are perceived as part of the direct sound.

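The Sabine formula is not preserved in the transcript. A minimal sketch, assuming its usual metric form T60 ≈ 0.161 V / (α S); the constant 0.161 s/m and the example room dimensions are assumptions, not values from the slide.

```python
def sabine_t60(volume_m3, surface_m2, mean_absorption):
    """Sabine estimate of the reverberation time in seconds: 0.161 * V / (alpha * S)."""
    if not 0.0 < mean_absorption <= 1.0:
        raise ValueError("mean absorption coefficient must be in (0, 1]")
    return 0.161 * volume_m3 / (mean_absorption * surface_m2)

# Hypothetical 5 m x 4 m x 3 m room with an average absorption coefficient of 0.2.
length, width, height = 5.0, 4.0, 3.0
volume = length * width * height
surface = 2.0 * (length * width + length * height + width * height)
t60 = sabine_t60(volume, surface, 0.2)
```

For this hypothetical room the estimate is about 0.51 s, below the 1 s limit mentioned above.
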
Room acoustics (2)
- Speech intelligibility measures: the "Deutlichkeit" (definition) index, the centre of gravity, the modulation index.

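The slide only names these measures. As a hedged illustration, the sketch below uses the definitions common in room acoustics: the "Deutlichkeit" D50 as the fraction of impulse-response energy arriving within the first 50 ms, and the centre of gravity (centre time) as the energy-weighted mean arrival time. Function names and the toy impulse response are assumptions; the modulation index is not covered.

```python
import numpy as np

def deutlichkeit_d50(h, fs):
    """Fraction of the energy of impulse response h that arrives in the first 50 ms."""
    energy = np.asarray(h, dtype=float) ** 2
    n50 = int(round(0.050 * fs))
    return energy[:n50].sum() / energy.sum()

def centre_time(h, fs):
    """Energy-weighted mean arrival time of impulse response h, in seconds."""
    energy = np.asarray(h, dtype=float) ** 2
    t = np.arange(len(energy)) / fs
    return (t * energy).sum() / energy.sum()

# Toy impulse response: direct sound at t = 0 and one reflection at 100 ms.
fs = 8000
h = np.zeros(fs)
h[0] = 1.0
h[800] = 0.5
d50 = deutlichkeit_d50(h, fs)
ts = centre_time(h, fs)
```

A higher D50 and a shorter centre time both indicate that more energy arrives early, which favours intelligibility.
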
Room Impulse Response
- Simplest method: apply an impulse excitation (balloon popping, gunshots) and observe the response of the system; however, this may not guarantee a sufficient SNR or a flat frequency response, and overload is possible.
- To overcome these difficulties: excitation with maximum-length pseudo-random sequences (Schroeder, 1979), which have a flat spectrum; the auto-correlation of a sequence of length L becomes a close approximation of the delta function when L is large.
- The room impulse response can then be obtained by reproducing the acoustic signal corresponding to the sequence and cross-correlating the excitation sequence p(n) with the signal y(n) acquired by the sensor.
- Sound ray concept: rays are diffracted by edges and scattered by small obstacles.

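The maximum-length-sequence procedure can be sketched as follows. This is an illustration under stated assumptions, not the measurement setup of the slides: the sequence comes from the order-7 recurrence a[k] = a[k-7] XOR a[k-6] (L = 127), the "room" is a short made-up impulse response `h_true`, and playback is modelled as a noise-free circular convolution.

```python
import numpy as np

def mls7():
    """+/-1 maximum-length sequence of length 127 from a[k] = a[k-7] XOR a[k-6]."""
    bits = [1] * 7
    for k in range(7, 127):
        bits.append(bits[k - 7] ^ bits[k - 6])
    return 1 - 2 * np.array(bits)          # map bit 0 -> +1, bit 1 -> -1

p = mls7()
L = len(p)

# Circular auto-correlation: L at lag 0 and -1 at every other lag.
acorr = np.array([np.dot(p, np.roll(p, -m)) for m in range(L)])

# Hypothetical room impulse response, zero-padded to length L.
h_true = np.zeros(L)
h_true[:6] = [1.0, 0.0, 0.6, 0.0, -0.3, 0.1]

# Simulated sensor signal y(n): circular convolution of p(n) with h_true.
y = sum(h_true[k] * np.roll(p, k) for k in range(L))

# Cross-correlate the excitation p(n) with y(n) to estimate the impulse response.
h_est = np.array([np.dot(p, np.roll(y, -m)) for m in range(L)]) / L
```

The estimate differs from `h_true` only by terms of order 1/L, which is why longer sequences give a better approximation.
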
Impulse response measurement
- How can it be measured? (Speecon, 2001)

Microphones
- A microphone converts the acoustic energy of sound into corresponding electrical energy; it is usually realized with a diaphragm whose movements are produced by sound pressure and vary the parameters of an electrical system (resistance, capacitance, etc.).
- Characterized by:
  - frequency response (flatness in the range of speech sounds)
  - signal-to-noise ratio (SNR)
  - impedance (better if low: connected to a low-impedance amplifier, it gives lower hum and electrical noise)
  - sensitivity: output voltage (in millivolts) or power (in dBm), usually specified for 94 dB SPL
  - directional pattern: cardioid (supercardioid, hypercardioid, shotgun, etc.), bidirectional (figure of eight) or omnidirectional (circle)
  - mountings: hand-held, head-mounted, table stand (desk-top), Lavalier
  - small or big diaphragm
- 0 dB SPL = 0.0002 µbar (threshold of hearing); 0 dBm corresponds to 0 dB referenced to 1 mW.
- Figure: microphone polar response.

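The reference levels and directional patterns listed on this slide can be made concrete with a small sketch. It assumes the 0 dB SPL reference of 20 µPa (equal to 0.0002 µbar) and the idealized first-order pattern a + (1 - a) cos θ; the function names are illustrative, and real microphones only approximate these patterns.

```python
import math

P_REF_PA = 20e-6  # 0 dB SPL reference pressure: 20 micropascals = 0.0002 microbar

def spl_to_pascal(spl_db):
    """RMS sound pressure in pascals for a level given in dB SPL."""
    return P_REF_PA * 10.0 ** (spl_db / 20.0)

def dbm_to_milliwatt(level_dbm):
    """Power in milliwatts for a level in dBm (0 dBm = 1 mW)."""
    return 10.0 ** (level_dbm / 10.0)

def first_order_pattern(theta_rad, a):
    """Idealized polar response: a = 1 omnidirectional, 0.5 cardioid, 0 figure of eight."""
    return a + (1.0 - a) * math.cos(theta_rad)

pressure_94 = spl_to_pascal(94.0)                 # close to 1 Pa
cardioid_rear = first_order_pattern(math.pi, 0.5)       # null behind a cardioid
eight_side = first_order_pattern(math.pi / 2.0, 0.0)    # null at the side of a figure of eight
```

The 94 dB SPL level corresponds to roughly 1 Pa, which is why it is a convenient reference for sensitivity figures.
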
Microphones: basic transduction categories
- Passive: converts sound directly into electrical energy; active: needs an additional energy source (battery, phantom power).
- Electromagnetic and electrodynamic microphones:
  - ribbon: duralumin ribbon moving in a permanent magnetic field
  - moving-coil: the inverse of a loudspeaker; bigger than a ribbon, thus a higher voltage is induced
  - widely used, good frequency and transient response, moderate cost
  - rather old
- Electrostatic microphones:
  - condenser: capacitor with a dielectric inside, one of whose plates can move; pre-polarization needed; very high output impedance; excellent frequency and transient response, low distortion
  - electret: condenser with built-in pre-polarization (100 V); power supply needed; good frequency and transient response, low distortion, but lower dynamic range and sensitivity than the condenser microphone
- Piezoresistive and piezoelectric microphones:
  - variation of resistance
  - carbon: small cylinder with carbon granules; under vibration the granules can separate, changing the electric resistance of the cylinder; low quality
  - crystal and ceramic: Rochelle salt; the same principle as the carbon microphone; low quality
- Special microphones: pressure-zone (PZM, for speech reinforcement), pressure-gradient (for directional acquisition), noise-canceling, micro-mechanical silicon microphones, optical wave-guide.

Ribbon microphones
- Principle of operation: duralumin ribbon moving in a permanent magnetic field.
- Can be very good and expensive (Royer Labs).
- Features:
  - very high overload characteristics: max SPL > 135 dB
  - extremely low noise
  - absence of high-frequency phase distortion
  - excellent phase linearity
  - equal sensitivity from front and back
  - consistent frequency response regardless of distance
  - no power supply required
  - strong proximity effect
  - strong wind effects

Moving coil
- A moving-coil microphone contains a diaphragm exposed to sound waves. The diaphragm carries a coil placed in a magnetic field. The voltage induced in the coil is proportional to its amplitude of vibration, which, in turn, depends on the sound pressure.
- Moving-coil microphones are cheap and robust, which makes them good for the rigors of live performance and touring. They are especially suited to the close miking of bass and guitar speaker cabinets and drum kits.
- They are also good for live vocals, as their resonance peak at around 5 kHz provides a built-in presence boost that improves speech and singing intelligibility.
- However, the inertia of the coil reduces the high-frequency response. Hence they are not best suited to studio applications where quality and subtlety are important, such as high-quality vocal recording or acoustic instrument miking.

Condenser microphone
- A condenser microphone incorporates a stretched metal diaphragm that forms one plate of a capacitor. A metal disk placed close to the diaphragm acts as a backplate. When a sound field excites the diaphragm, the capacitance between the two plates varies according to the variation in the sound pressure. A stable DC voltage is applied to the plates through a high resistance to keep electrical charges on the plates. The change in the capacitance generates an AC output proportional to the sound pressure. In order to convert ultralow-frequency pressure variations, a high-frequency voltage (carrier) is applied across the plates. The output signal is the modulated carrier.
- Condenser microphones are the best, but need a polarizing voltage.
- Figure: condenser microphone. AP = acoustic pressure, C = variable capacitance, 1 = metal diaphragm, 2 = metal disk, 3 = insulator, 4 = case.

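Why a change in capacitance yields an output proportional to the diaphragm movement can be sketched with an idealized parallel-plate model. The symbols below (plate area A, permittivity ε, rest gap d_0, polarizing voltage V_0, gap change Δd) are assumptions introduced for illustration; the model further assumes that the high series resistance keeps the charge Q approximately constant over an audio cycle.

```latex
\begin{aligned}
C &= \frac{\varepsilon A}{d}, \qquad
Q = C_0 V_0 = \frac{\varepsilon A}{d_0}\,V_0 \approx \text{const}, \\
v(t) &= \frac{Q}{C(t)} = V_0\,\frac{d_0 + \Delta d(t)}{d_0}
\quad\Longrightarrow\quad
v_{\mathrm{ac}}(t) = V_0\,\frac{\Delta d(t)}{d_0}.
\end{aligned}
```

Under these assumptions the AC component is linear in the gap change, and therefore follows the sound pressure as long as the diaphragm displacement does.
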
Electret microphone
- An electret-type microphone is a condenser microphone in which the electrical charges are created by a thin layer of polarized ceramic or plastic film (electret). The ability of the electret to keep its charge removes the need for a high-voltage polarization source.
- Output impedance is relatively high (typically about 1 kΩ to 5 kΩ).
- Signal output is limited (relatively low sensitivity).
- Noise is relatively high.
- Sound level handling ability is low (typically < 90 dB SPL).
- They are normally available from retail outlets very cheaply.
- Figure: electret-type microphone. AP = acoustic pressure, U_o = output voltage, 1 = diaphragm, 2 = electret, 3 = case.

Piezoresistive mics
- In a carbon-button microphone, the sound field acts upon an electroconductive diaphragm that develops pressure on a packet of carbon granules. The contact resistance between the granules depends on the pressure. When a DC voltage is applied across the packet, the alternating resistance produces an AC voltage drop, which is proportional to the sound intensity.
- Figure: carbon-button microphone. AP = acoustic pressure, R = variable resistance, 1 = electroconductive particles, 2 = diaphragm, 3 = electrode.

Microphone arrays
- Selective acquisition of speech in the spatial domain: automatic detection, tracking and selective acquisition of the speaker.
- Beamforming (spatial filtering), filter-and-sum approach: compensate for the difference in path length from the source to each of the microphones; a delay in the time domain corresponds to a linear phase shift in the frequency domain.
- Dereverberation; talker location: time difference of arrival, power field scanning, MUSIC.

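The path-length compensation described above can be illustrated with the simplest special case, a delay-and-sum beamformer, where each channel's filter is a pure delay. The sketch below is an assumption-laden toy: delays are whole samples, they are estimated from the cross-correlation peak against the first channel (a basic time-difference-of-arrival estimate), and alignment uses a circular shift on a zero-padded burst.

```python
import numpy as np

def estimate_delay(reference, signal):
    """Delay of `signal` relative to `reference`, in samples, from the cross-correlation peak."""
    xcorr = np.correlate(signal, reference, mode="full")
    return int(np.argmax(xcorr)) - (len(reference) - 1)

def delay_and_sum(channels):
    """Align every channel to the first one and average them."""
    delays = [estimate_delay(channels[0], ch) for ch in channels]
    aligned = [np.roll(ch, -d) for ch, d in zip(channels, delays)]
    return np.mean(aligned, axis=0), delays

# Synthetic example: a noise burst reaches three microphones with different delays.
rng = np.random.default_rng(0)
source = np.zeros(512)
source[100:200] = rng.standard_normal(100)
true_delays = [0, 3, 7]
channels = [np.roll(source, d) + 0.1 * rng.standard_normal(512) for d in true_delays]

output, estimated_delays = delay_and_sum(channels)
noise_single = np.mean((channels[0] - source) ** 2)
noise_beam = np.mean((output - source) ** 2)
```

After alignment the speech components add coherently while the uncorrelated sensor noise averages down, so `noise_beam` is smaller than `noise_single`.
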
Microphones in speech recognition
- Mismatch between training and testing conditions: the same microphone is preferred.
- Microphone normalization: multichannel recording and matching of signals.
- A noise-canceling headset is preferred in ASR, but users do not like it.
- Influence of room acoustics on recording and ASR.
- ASR in the car:
  - non-homogeneous acoustic environment: dependence on microphone position
- Speecon project: consumer-device environments.
- Gradient microphones in adverse conditions: aircraft cockpit.
- Feature selection: filtering.
- Cochlear model and binaural processing: special microphones and filtering methods.
- Use of microphone arrays.
- Active noise cancelling: a new buzzword.