Guerino Mazzola (Fall 2014 ©): Introduction to Music Technology. III Digital Audio. III.6 (Fr Oct 24): The MP3 algorithm with PAC.


1 Guerino Mazzola (Fall 2014 ©): Introduction to Music Technology. III Digital Audio. III.6 (Fr Oct 24): The MP3 algorithm with PAC

2 The MP3 encoder chain
[Block diagram of the encoder chain: Audio Data; Filter Bank, 32 Subbands; Psychoacoustical Model; Quantization and Encoding (with check of quantization loop); Encoding of Additional Information; Datastream Formatting to Frames; Data Stream.]
1. Digital Datastream
2. FFT with Filter Bank
3. Psychoacoustical Model (Perceptual-Audio-Coding Model, PAC)
4. Quantization
5. Huffman Compression
6. Frame Outputstream Formatting

3 The MP3 encoder chain
1. Digital Datastream: 2 channels (stereo), 768 kbit/s per channel (48 000 samples/s × 16 bit).
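The figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes uncompressed PCM input; the function and constant names are illustrative and not part of the original slides.

```python
SAMPLE_RATE_HZ = 48_000   # samples per second, from the slide
BITS_PER_SAMPLE = 16      # bit depth, from the slide
CHANNELS = 2              # stereo


def pcm_bitrate(sample_rate_hz, bits_per_sample, channels=1):
    """Return the raw (uncompressed) bitrate in bit/s."""
    return sample_rate_hz * bits_per_sample * channels


per_channel = pcm_bitrate(SAMPLE_RATE_HZ, BITS_PER_SAMPLE)
stereo = pcm_bitrate(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)
print(per_channel // 1000, "kbit/s per channel")
print(stereo // 1000, "kbit/s for both channels")
```

The per-channel value reproduces the 768 kbit/s quoted on the slide; the stereo stream carries twice that amount before any compression.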

4 The MP3 encoder chain
2. FFT with Filter Bank
2.1 Cut the spectrum 0 – 20 kHz into 32 subbands of 625 Hz each (32 × 625 = 20 000), for windows of 1/40 sec.
2.2 Use the MDCT (Modified Discrete Cosine Transformation, a variant of the FFT) to split each 625 Hz band into 18 subbands with variable widths, according to psychoacoustical criteria. This yields 576 = 18 × 32 “lines”.
Important: Since the sample rate equals the number of Fourier coefficients, we speak of “Fourier samples per second”.
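The band arithmetic of steps 2.1 and 2.2 can be sketched as follows. This is a minimal illustration, not the encoder's actual filter bank: it uses the plain textbook MDCT, which turns a block of 36 samples into 18 equally spaced coefficients, and it omits windowing and any psychoacoustically motivated choice of band widths. All names are hypothetical.

```python
import math

SUBBANDS = 32             # from the slide
SUBBAND_WIDTH_HZ = 625    # from the slide
LINES_PER_SUBBAND = 18    # from the slide


def subband_edges():
    """Return (low, high) frequency edges in Hz for the 32 subbands."""
    return [(i * SUBBAND_WIDTH_HZ, (i + 1) * SUBBAND_WIDTH_HZ)
            for i in range(SUBBANDS)]


def mdct(block):
    """Textbook MDCT: 2N input samples give N output coefficients.

    No window function is applied; this is a simplification.
    """
    n_out = len(block) // 2
    return [
        sum(x * math.cos(math.pi / n_out * (n + 0.5 + n_out / 2) * (k + 0.5))
            for n, x in enumerate(block))
        for k in range(n_out)
    ]


edges = subband_edges()
total_lines = SUBBANDS * LINES_PER_SUBBAND

# One block of 36 subband samples (an arbitrary test signal) gives 18 lines.
block = [math.sin(2 * math.pi * 3 * n / 36) for n in range(36)]
lines = mdct(block)
print(edges[0], edges[-1], total_lines, len(lines))
```

Applying the transform to each of the 32 subbands gives the 576 lines mentioned on the slide.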

5 The MP3 encoder chain
4. Lossy quantization
5. Lossless Huffman compression
Already discussed: 40 % of the compression.
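As a reminder of how steps 4 and 5 fit together, here is a minimal sketch: a uniform quantizer (lossy) followed by a generic Huffman code built from the symbol frequencies (lossless). The step size, the sample values, and the function names are assumptions for illustration, and the Huffman construction is the generic textbook one, not the code tables a real MP3 encoder uses.

```python
import heapq
from collections import Counter


def quantize(values, step):
    """Lossy step: map each value to the index of the nearest multiple of step."""
    return [round(v / step) for v in values]


def huffman_code(symbols):
    """Lossless step: build a prefix code from the symbol frequencies."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, unique tie breaker, {symbol: code so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


# Hypothetical spectral line values: mostly near zero, one strong line.
spectral_lines = [0.1, -0.2, 0.05, 0.0, 3.9, 0.1, -0.1, 0.2]
quantized = quantize(spectral_lines, step=1.0)
codes = huffman_code(quantized)
encoded_bits = sum(len(codes[s]) for s in quantized)
print(quantized, codes, encoded_bits)
```

Quantization discards the small differences between the near-zero lines, so the frequent symbol gets a short code and the whole block needs few bits.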

6 The MP3 encoder chain
3. Psychoacoustical Model (Perceptual-Audio-Coding Model, PAC)

7 The MP3 encoder chain
The Psychoacoustical Model (Perceptual-Audio-Coding Model, PAC) is the core feature of MP3; it accounts for 60 % of MP3 compression.
The PAC Model is based upon three limitations of human audio perception:
PAC 1: hearing thresholds
PAC 2: auditory masking
PAC 3: temporal masking
All three PAC components generate lossy compression.

8 The MP3 encoder chain
PAC 1: hearing thresholds
[Figure: hearing threshold curve, loudness versus frequency (kHz).]
You don’t hear sinusoidal sounds below this threshold of loudness.
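The threshold idea can be sketched in code. The slide shows the curve only as a figure, so the sketch below assumes Terhardt's widely used analytic approximation of the threshold in quiet (in dB SPL); that formula and the function names are not taken from the slides.

```python
import math


def threshold_in_quiet_db(freq_hz):
    """Approximate hearing threshold in dB SPL (Terhardt's approximation).

    Assumption: this formula stands in for the curve shown on the slide.
    """
    f_khz = freq_hz / 1000.0
    return (3.64 * f_khz ** -0.8
            - 6.5 * math.exp(-0.6 * (f_khz - 3.3) ** 2)
            + 1e-3 * f_khz ** 4)


def is_audible(freq_hz, level_db):
    """A sinusoidal component is audible only above the threshold."""
    return level_db > threshold_in_quiet_db(freq_hz)


for f in (100, 1000, 3300, 16000):
    print(f, "Hz:", round(threshold_in_quiet_db(f), 1), "dB")
```

Under this approximation the ear is most sensitive around 3 to 4 kHz, so a 10 dB tone is audible at 3.3 kHz but not at 100 Hz. An encoder can drop components that fall below the curve.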

9 The MP3 encoder chain
PAC 2: auditory masking
[Figure: masking surface, loudness versus frequency.]
For every sinusoidal frequency component of frequency f and loudness l, there is a surrounding masking surface, where other frequency/loudness components cannot be heard together with the given one. Example: the 4 kHz/40 dB component (red) masks the blue one.
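A rough sketch of such a masking surface follows. It assumes a triangular shape on the Bark scale (Zwicker and Terhardt's frequency-to-Bark approximation); the slopes of 27 dB/Bark below the masker and 10 dB/Bark above it, and the 10 dB offset, are illustrative values chosen for this example and are not taken from the slides or from the MP3 standard.

```python
import math

# Illustrative assumptions, not values from the slides.
LOWER_SLOPE_DB_PER_BARK = 27.0   # steep fall-off below the masker
UPPER_SLOPE_DB_PER_BARK = 10.0   # shallower fall-off above the masker
MASKER_OFFSET_DB = 10.0          # masked threshold sits below the masker level


def hz_to_bark(freq_hz):
    """Zwicker and Terhardt approximation of the Bark scale."""
    return (13.0 * math.atan(0.00076 * freq_hz)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))


def masked_threshold_db(masker_hz, masker_db, probe_hz):
    """Level below which a probe at probe_hz is hidden by the masker."""
    dz = hz_to_bark(probe_hz) - hz_to_bark(masker_hz)
    slope = UPPER_SLOPE_DB_PER_BARK if dz >= 0 else LOWER_SLOPE_DB_PER_BARK
    return masker_db - MASKER_OFFSET_DB - slope * abs(dz)


def is_masked(masker_hz, masker_db, probe_hz, probe_db):
    return probe_db < masked_threshold_db(masker_hz, masker_db, probe_hz)


# The slide's masker: 4 kHz at 40 dB. The probe components are hypothetical.
print(is_masked(4000, 40, 4200, 10))   # close neighbour
print(is_masked(4000, 40, 1000, 10))   # far away in frequency
```

With these assumed parameters a quiet neighbour at 4.2 kHz falls under the masking surface, while a component at 1 kHz does not.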

10 The MP3 encoder chain
PAC 3: temporal masking
[Figure: masking curve, loudness versus time.]
For every sinusoidal frequency component of frequency f and loudness l (red), a subsequent component (blue) cannot be heard below the given curve of loudness in time, because the ear needs some time to “recover” from the perception of the first component. This is even true for sounds before the given one (red curve), because the perception needs to be built up!
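The time dependence can be sketched with a simple piecewise-linear curve. The window lengths below (20 ms before the masker, 200 ms after it) and the linear decay are illustrative assumptions for this example; the slides give the curve only as a figure, and real psychoacoustic models use more refined shapes.

```python
# Illustrative assumptions, not values from the slides.
PRE_MASKING_MS = 20.0     # masking reaches this far before the masker
POST_MASKING_MS = 200.0   # masking lasts this long after the masker


def temporal_threshold_db(masker_db, dt_ms):
    """Masked threshold for a probe dt_ms after (or, if negative, before) the masker.

    The threshold falls linearly from masker_db at dt = 0 to 0 dB at the
    edges of the masking window; outside the window nothing is masked.
    """
    window = POST_MASKING_MS if dt_ms >= 0 else PRE_MASKING_MS
    remaining = 1.0 - abs(dt_ms) / window
    return masker_db * max(remaining, 0.0)


def is_temporally_masked(masker_db, probe_db, dt_ms):
    return probe_db < temporal_threshold_db(masker_db, dt_ms)


# Hypothetical 60 dB masker and 20 dB probe at different time offsets.
for dt in (-30, -5, 50, 300):
    print(dt, "ms:", is_temporally_masked(60, 20, dt))
```

Under these assumptions the probe is hidden shortly before and for a while after the masker, but audible once it lies outside the masking window.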

11 The MP3 encoder chain
6. Frame Outputstream Formatting

