Guerino Mazzola (Fall 2015 ©): Introduction to Music Technology. III Digital Audio. III.7 (M Nov 04) The MP3 frame format.


1 Guerino Mazzola (Fall 2015 ©): Introduction to Music Technology. III Digital Audio. III.7 (M Nov 04) The MP3 frame format

2 The MP3 encoder chain

[Diagram of the encoder chain. Block labels: Audio Data; Filter Bank (32 Subbands); Psychoacoustical Model; Quantization and Encoding (check of quantization loop); External Check; Encoding of Additional Information; Data Stream Formatting to Frames etc.; Additional Data; Data Stream. Further label: 2*16 to Line.]

1. Digital data stream
2. FFT with filter bank
3. Psychoacoustical model (Perceptual Audio Coding model, PAC)
4. Quantization
5. Huffman compression
6. Frame output stream formatting

3 The MP3 encoder chain

[Same encoder-chain diagram as on slide 2.]

6. Frame output stream formatting

4 The MP3 encoder chain: MP3 file identifier = ID3 tag

At the end of the MP3 file, there is often a 128-byte identifier (the ID3 tag, version 1). It is not an official standard, but it appears very often:

Bytes  Content
3      "TAG" = identification as ID3 tag
30     title of piece
30     name of performer(s)
30     name of album
4      year of publication
30     comment
1      genre identification
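The byte layout in the table can be read off with fixed slices. The following is a minimal sketch, assuming the 128-byte block has already been read from the file; the function name `parse_id3v1`, the Latin-1 text decoding, and the padding handling are illustrative assumptions, not part of the slide.

```python
def parse_id3v1(tag: bytes) -> dict:
    """Split a 128-byte ID3 block into the fields of the table (3+30+30+30+4+30+1 bytes)."""
    if len(tag) != 128 or tag[:3] != b"TAG":
        raise ValueError("not a 128-byte ID3 tag")

    def text(raw: bytes) -> str:
        # Assumption: fields are fixed-width, padded with NUL bytes or spaces,
        # and encoded as Latin-1.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
        "comment": text(tag[97:127]),
        "genre": tag[127],  # single byte, kept as a raw number
    }


# Hypothetical example block built by hand, only to exercise the slicing.
example_tag = (
    b"TAG"
    + b"Song".ljust(30, b"\x00")
    + b"Artist".ljust(30, b"\x00")
    + b"Album".ljust(30, b"\x00")
    + b"2015"
    + b"".ljust(30, b"\x00")
    + bytes([12])
)
example_fields = parse_id3v1(example_tag)
```

The genre byte is left as a number here because the slide does not list the genre codes.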

5 The MP3 encoder chain: Frame output stream formatting

The MP3 format, whether used for streaming or for storage, is built from units called frames. A frame is an autonomous information package: all encoding information is provided within every frame, so that a file can be played from any given time onset. A frame's duration is 1/38.28125 sec ≈ 1/40 sec, which makes playback virtually continuous for human listeners.

Each frame has these parts:
1. a 32-bit header indicating the layer number (1-3), the bitrate, and the sample frequency;
2. the Cyclic Redundancy Check (CRC) with 16 bits for error detection (there is no correction option; instead, frames are repeated until a correct frame appears);
3. 12 bits of additional information for the Huffman trees and quantization info;
4. a main data sample block of 3344 bits for the Huffman-encoded data.

6 The MP3 encoder chain: 32-bit frame header

Position  Task                                              Length in bits
A         Frame sync (for playing and "jumping around")     11
B         MPEG audio version (MPEG-1, -2, etc.)             2
C         MPEG layer (Layer I, II, III, etc.)               2
D         Protection                                        1
E         Bitrate index                                     4
F         Sampling frequency (e.g. 44.1 kHz)                2
G         Padding bit (compensates incomplete allocation)   1
H         Private bit (application-specific trigger)        1
I         Channel mode (Stereo, Joint Stereo)               2
J         Mode extension (for Joint Stereo)                 2
K         Copyright                                         1
L         Original ("0" if copy, "1" if original)           1
M         Emphasis (outdated)                               2

Important: you can only play entire frames!
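The field widths A to M add up to 32 bits, so a header can be unpacked by shifting through one 32-bit word from the most significant bit downwards. This is a sketch under stated assumptions: the field names, the function `parse_header`, and the example word `0xFFFB9064` are chosen for illustration, and only raw field values are extracted (no lookup of actual bitrates or sampling frequencies).

```python
# (name, width in bits) in the order A..M of the header table.
HEADER_FIELDS = [
    ("sync", 11),
    ("version", 2),
    ("layer", 2),
    ("protection", 1),
    ("bitrate_index", 4),
    ("sampling_frequency", 2),
    ("padding", 1),
    ("private", 1),
    ("channel_mode", 2),
    ("mode_extension", 2),
    ("copyright", 1),
    ("original", 1),
    ("emphasis", 2),
]


def parse_header(word: int) -> dict:
    """Split a 32-bit frame header into its raw field values."""
    fields = {}
    shift = 32
    for name, width in HEADER_FIELDS:
        shift -= width  # move down to the lowest bit of this field
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields


# Hypothetical example header word, used only to exercise the bit slicing.
header = parse_header(0xFFFB9064)
is_frame_start = header["sync"] == 0x7FF  # all 11 sync bits set
```

A player that "jumps around" in a file can scan for a word whose first 11 bits are all set and start decoding from that frame.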

7 The MP3 encoder chain: Frame sequence with reservoir technique

[Diagram: a sequence of five frames (blocks 1 to 5). Labels for each block n: "Header/Add. info block n", "Main data for block n", "Bits in reservoir for Block n"; for block 1 the bits in reservoir = 0. The main data area is marked as 3344 bits.]

8 The MP3 encoder chain: Important formulas relating to frame capacities

Fixed data:
1. # frames/sec = 38.28125
2. maximal audio data capacity per frame = 3,344 bit/frame
3. # frequency bands = 32

First formula: maximal bitrate
3,344 bit/frame × 38.28125 frame/sec = 128,012.5 bit/sec ≈ 128 kbit/sec, which guarantees CD quality.

Second formula: frequency samples per frame
44,100 time-sample/sec / 38.28125 frame/sec = 1152 frequency-sample/frame, which guarantees CD quality.
This yields 1152/32 = 36 frequency-samples/band.

Observe: 625 Hz/band / 38.28125 Hz = 16.3265 frequency-samples/band, so the 36 samples per band carry overlapping information, but this is fine because it minimizes measurement errors. Recall that time-samples ~ frequency-samples.
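The arithmetic on this slide can be checked in a few lines. This is a small sketch using only the figures given on the slide; the variable names are chosen for illustration.

```python
FRAMES_PER_SEC = 38.28125      # fixed data 1
BITS_PER_FRAME = 3344          # fixed data 2
BANDS = 32                     # fixed data 3
SAMPLE_RATE = 44_100           # time-samples per second
BAND_WIDTH_HZ = 625            # Hz per band, as used in the "Observe" line

# First formula: maximal bitrate in bit/sec.
max_bitrate = BITS_PER_FRAME * FRAMES_PER_SEC

# Second formula: samples per frame, then samples per band.
samples_per_frame = SAMPLE_RATE / FRAMES_PER_SEC
samples_per_band = samples_per_frame / BANDS

# "Observe" line: band width divided by the frame rate.
observed_per_band = BAND_WIDTH_HZ / FRAMES_PER_SEC

# Frame duration in seconds, roughly 1/40 sec.
frame_duration = 1 / FRAMES_PER_SEC
```

The frame rate itself is just 44,100 / 1152, so the value 38.28125 holds for a 44.1 kHz sampling rate.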

9 The MP3 encoder chain: Some performance values

MPEG procedure     Compression   Quality            Bitrate (kbit/sec)  Bandwidth (kHz)  Mode
MPEG-1 layer-3     14:1 - 12:1   CD                 128                 >15              stereo
MPEG-1 layer-3     16:1          Approximately CD   96-112              15               stereo
MPEG-2 layer-3     16:1 - 24:1   Radio quality      56-64               11               stereo
MPEG-2 layer-3     24:1          Speech             32                  7.5              mono
MPEG-2 layer-3     48:1          Shortwave radio    16                  4.5              mono
MPEG-2.5 layer-3   96:1          Telephone          8                   2.5              mono

Compression = input bitrate / output bitrate, e.g. (2 × 768 kbit/sec) / (128 kbit/sec) = 12.
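The compression column follows from the closing formula, input bitrate divided by output bitrate. The sketch below assumes that the 768 kbit/sec per channel on the slide corresponds to 48,000 samples/sec at 16 bit, and that the mono rows use a single such channel as input; both are interpretations, and the function name `compression_ratio` is illustrative.

```python
PCM_KBIT_PER_CHANNEL = 768  # assumption: 48,000 samples/sec x 16 bit = 768 kbit/sec


def compression_ratio(output_kbit: float, channels: int = 2) -> float:
    """Input bitrate (channels x 768 kbit/sec) divided by the MP3 output bitrate."""
    return channels * PCM_KBIT_PER_CHANNEL / output_kbit


stereo_128 = compression_ratio(128)            # CD-quality row
stereo_96 = compression_ratio(96)              # "approximately CD" row, lower bitrate
mono_32 = compression_ratio(32, channels=1)    # speech row
mono_16 = compression_ratio(16, channels=1)    # shortwave row
mono_8 = compression_ratio(8, channels=1)      # telephone row
```

Under these assumptions the computed ratios are 12, 16, 24, 48, and 96, which matches the corresponding table entries.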

10 The MP3 encoder chain: Remarks on Joint Stereo Coding

MP3 implements the Joint Stereo Coding compression method, which is based on these two principles:
1. Mid/Side Stereo Coding (MSSC): instead of the left and right channels (L, R), we take the equivalent data (L+R, L-R). This exploits the fact that L and R are usually strongly correlated, so that the difference is quite "tame".
2. Intensity Stereo Coding (ISC): the sum L+R and the direction of the signal are encoded (replacing the L-R information). This coding method also uses the fact that the human ear is weak at localizing low frequencies. Since direction is detected by phase differences, which are difficult to retrieve for low frequencies, these frequencies are encoded mono!
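The equivalence of (L, R) and (L+R, L-R) can be illustrated with a short round trip. This is a minimal sketch on integer sample lists with made-up values; it shows only the sum/difference idea from the slide, not the scaling or quantization an actual MP3 encoder applies, and the function names are illustrative.

```python
def ms_encode(left, right):
    """Replace (L, R) by the equivalent pair (L+R, L-R)."""
    mid = [l + r for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Recover (L, R): L = (M+S)/2, R = (M-S)/2 (exact for integer samples)."""
    left = [(m + s) // 2 for m, s in zip(mid, side)]
    right = [(m - s) // 2 for m, s in zip(mid, side)]
    return left, right


# Hypothetical, strongly correlated channels.
left = [100, 102, 98, 101]
right = [99, 101, 99, 100]

mid, side = ms_encode(left, right)
decoded_left, decoded_right = ms_decode(mid, side)
```

For these samples the side signal stays within -1 to 1 while the channels themselves are around 100, which is the "tame" difference that makes the L-R signal cheap to encode.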

11 The MP3 encoder chain: Legal aspects

The license rights of Fraunhofer IIS are represented by the French company Technicolor SA, formerly Thomson Multimedia. Here are the figures:
1. 0.50 USD per decoder
2. 5.- USD per encoder
3. 15,000.- USD annual lump sum

This means that an enterprise that sells a total of 25,000 copies of the encoder software per year pays 25,000 × 5.- + 15,000.- = 140,000.- USD for the first year and then 15,000.- USD in annual fees for every successive year.

