Audio Henning Schulzrinne Dept. of Computer Science Columbia University Fall 2003.


1 Audio Henning Schulzrinne Dept. of Computer Science Columbia University Fall 2003

2 Common narrowband audio codecs

Codec    rate (kb/s)  delay (ms)  multi-rate  embedded  VBR  bit-robust/PLC  remarks
iLBC     15.2, 13.3   20, 30                                 --/X            quality higher than G.729A; no licensing
Speex    2.15-24.6    30          X           X         X    --/X            no licensing
AMR-NB   4.75-12.2    20          X                          X/X             3G wireless
G.729    8            15                                     X/X             TDMA wireless
GSM-FR   13           20                                                     GSM wireless (Cingular)
GSM-EFR  12.2         20                                     X/X             2.5G
G.728    16, 12.8     2.5                                    X/X             H.320 (ISDN videoconferencing)
G.723.1  5.3, 6.3     37.5                                   X/--            H.323, videoconferences

3 Common wideband audio codecs

Codec    rate (kb/s)  delay (ms)  multi-rate  embedded  VBR  bit-robust/PLC  remarks
Speex    4-44.4       34          X           X         X    --/X            no licensing
AMR-WB   6.6-23.85    20          X                          X/X             3G wireless
G.722    48, 56, 64   0.125 (1.5)                            X/--            2 sub-bands; now dated

4 iLBC – MOS behavior with packet loss

5 Recent audio codecs
iLBC: optimized for high packet loss rates (frames encoded independently)
AMR-NB
– 3G wireless codec
– 4.75-12.2 kb/s
– 20 ms coding delay
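The rate and delay figures above translate directly into a per-frame payload size. The following is a minimal sketch, assuming the frame duration equals the 20 ms coding delay listed for AMR-NB; the helper names `frame_payload_bits` and `frame_payload_bytes` are hypothetical and not part of any codec API.

```python
import math


def frame_payload_bits(rate_kbps: float, frame_ms: float) -> int:
    """Bits produced per frame: (kb/s) * (ms) = bits."""
    return round(rate_kbps * frame_ms)


def frame_payload_bytes(rate_kbps: float, frame_ms: float) -> int:
    """Whole bytes needed to carry one frame (rounded up)."""
    return math.ceil(frame_payload_bits(rate_kbps, frame_ms) / 8)


# Lowest and highest AMR-NB rates from the slide, assuming 20 ms frames.
for rate in (4.75, 12.2):
    bits = frame_payload_bits(rate, 20)
    size = frame_payload_bytes(rate, 20)
    print(f"AMR-NB {rate} kb/s: {bits} bits ({size} bytes) per 20 ms frame")
```

At these sizes the codec payload is small compared with typical packet headers, so the frame duration matters as much as the bit rate when estimating network load.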

6 Speex
Open-source, patent-free speech codec
CELP (code-excited linear prediction) codec
Operating modes:
– narrowband (8 kHz sampling rate): 2.15 – 24.6 kb/s, delay of 30 ms
– wideband (16 kHz sampling rate): 4 – 44.2 kb/s, delay of 34 ms
– ultra-wideband (32 kHz sampling rate)
Intensity stereo encoding
Variable bit rate (VBR) possible
Voice activity detection (VAD)

7 Ogg Vorbis
Similar in application to AAC, MP3, VQF, …, but claims to be free of patents
Ogg = container file format (also for Speex, FLAC)
Vorbis = music speech codec
Near-CD quality = 160 kb/s
Forward-adaptive modified DCT (discrete cosine transform)
– overlapping windows
– floor: carries frequency representation as piecewise linear interpolated representation on a dB amplitude scale and linear frequency scale
– residue: subtract out floor → cascaded (multi-pass) vector quantization
– entropy (Huffman) coding
Carries codec parameters in header
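The floor/residue split can be illustrated with a small sketch. It assumes the floor is given as a handful of (frequency bin, dB) control points joined by straight lines, and that the residue is the spectrum minus the floor in the dB domain. The function names and sample values are made up for illustration; the real Vorbis bitstream encodes and quantizes these quantities differently.

```python
def floor_curve(points, n_bins):
    """Piecewise-linear interpolation of (bin, dB) control points.

    Bins outside the control points hold the nearest endpoint value.
    Control-point bins are assumed to be distinct.
    """
    points = sorted(points)
    curve = []
    for k in range(n_bins):
        for (x0, y0), (x1, y1) in zip(points, points[1:]):
            if x0 <= k <= x1:
                t = (k - x0) / (x1 - x0)
                curve.append(y0 + t * (y1 - y0))
                break
        else:
            # k lies before the first or after the last control point
            curve.append(points[0][1] if k < points[0][0] else points[-1][1])
    return curve


def residue(spectrum_db, floor_db):
    """Subtract the floor from the spectrum, bin by bin (dB domain)."""
    return [s - f for s, f in zip(spectrum_db, floor_db)]


# Illustrative (made-up) spectrum in dB over 5 bins.
spectrum = [-21.0, -24.0, -33.0, -35.0, -38.5]
floor = floor_curve([(0, -20.0), (4, -40.0)], n_bins=5)
print(floor)
print(residue(spectrum, floor))
```

Subtracting in dB corresponds to dividing in linear amplitude, so the residue is a flattened spectrum with a much smaller dynamic range, which is what makes it a reasonable input for vector quantization.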

8 Sound localization
Human ear uses 3 metrics for stereo localization:
– intensity
– time of arrival (TOA) – 7 µs
– direction filtering and spectral shaping by outer ear
For shorter wavelengths (4 – 20 kHz), the head casts an acoustical shadow, giving rise to a lower sound level at the ear farthest from the sound source
At long wavelengths (20 Hz – 1 kHz), the head is very small compared to the wavelength
– in this case, localization is based on perceived interaural time differences (ITD)
UCSC CMPE250 Fall 2002
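The wavelength argument and the ITD can be checked with a few lines of arithmetic. This is a rough sketch under stated assumptions: a speed of sound of 343 m/s, an ear-to-ear distance of 0.18 m, and a simple straight-line path-difference model ITD = d·sin(θ)/c. None of these values come from the slide, and more accurate head models exist.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed (air at roughly room temperature)
HEAD_WIDTH = 0.18       # m, assumed ear-to-ear distance


def wavelength_m(freq_hz: float) -> float:
    """Wavelength of a tone in air: lambda = c / f."""
    return SPEED_OF_SOUND / freq_hz


def itd_seconds(azimuth_deg: float) -> float:
    """Interaural time difference for a source at the given azimuth.

    Uses the simple path-difference model d * sin(theta) / c,
    with 0 degrees meaning straight ahead.
    """
    return HEAD_WIDTH * math.sin(math.radians(azimuth_deg)) / SPEED_OF_SOUND


# Compare wavelengths at the band edges from the slide with the head size.
for f in (20, 1000, 4000, 20000):
    lam = wavelength_m(f)
    relation = "larger" if lam > HEAD_WIDTH else "smaller"
    print(f"{f} Hz: wavelength {lam:.3f} m ({relation} than the head)")

# ITD for a few source directions, in microseconds.
for angle in (0, 30, 90):
    print(f"azimuth {angle} deg: ITD {itd_seconds(angle) * 1e6:.0f} us")
```

Under these assumptions the wavelength exceeds the head width up to 1 kHz and is smaller from 4 kHz upward, which matches the split between ITD-based and level-based localization described above.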

9 Audio samples
http://www.cs.columbia.edu/~hgs/audio/codecs.html
Speex: http://www.speex.org/audio/samples/
– both narrowband and wideband

