1 CMSHN1114/CMSCD1011 Introduction to Computer Audio Lecture 5: Digital audio formats Dr David England School of Computing and Mathematical Sciences

1 CMSHN1114/CMSCD1011 Introduction to Computer Audio
Lecture 5: Digital audio formats
Dr David England
School of Computing and Mathematical Sciences
Teaching/cmscd1011.html

2 In this session...
We will look at the various encoding mechanisms that are currently used to store and process digital audio
–PCM and ADPCM
–MP3
–RealAudio
These codecs employ various compression schemes to help reduce the size of digital audio

3 Compression techniques
There are two categories of compression:
Lossy compression techniques cause some information to be lost from the original data
–You can never recreate the source material exactly from the compressed version
Lossless compression techniques do not lose information
–You can always recreate an exact replica of the source information from the compressed version

4 Pulse Code Modulation (PCM)
PCM is a sampling technique for digitising analogue signals, especially audio signals
In its telephony form, PCM samples the signal 8000 times a second and represents each sample with 8 bits, for a total of 64 Kbps
–These settings can be altered (CD-DA uses PCM to store raw digital audio)
Two standards are used to encode the sample level
–The μ-Law (Mu-Law) standard is used in North America and Japan, while the A-Law standard is used in most other countries
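The 64 Kbps figure is just sample rate times sample size, and the same arithmetic applies to other PCM settings. The sketch below is illustrative only: the CD-DA parameters (44.1 kHz, 16 bits, stereo) and the continuous μ-Law curve with μ = 255 are not stated on the slide and are supplied here as commonly quoted values; the function names are made up for this example.

```python
import math


def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels=1):
    """Raw PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def mu_law_compress(x, mu=255):
    """Continuous mu-Law companding curve for a sample x in [-1.0, 1.0].

    Quiet samples are boosted before quantisation, so 8 bits cover a
    wider dynamic range than a straight linear scale would.
    """
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


telephone_bps = pcm_bit_rate(8000, 8)       # the slide's 8 kHz, 8-bit case
cd_audio_bps = pcm_bit_rate(44100, 16, 2)   # assumed CD-DA settings

print(telephone_bps, cd_audio_bps)
print(round(mu_law_compress(0.1), 3))       # a quiet sample after companding
```

A-Law follows the same idea with a slightly different curve, so it is not shown separately.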

5 Adaptive differential PCM (ADPCM)
ADPCM is a form of pulse code modulation (PCM) that produces a digital signal with a lower bit rate than standard PCM
ADPCM produces a lower bit rate by recording only the difference between samples and adjusting the coding scale dynamically to accommodate large and small differences
Depending on the actual values of the samples, ADPCM can save 25% - 50% of storage space over normal PCM

6 ADPCM example
Consider the following PCM samples:
–0, 1, 8, 16, 30, 40, 41, 45
–This would take 8 bytes of storage
Encoding these using ADPCM might look like this:
–0, 1, 7, 8, 14, 10, 1, 4
–i.e. only the difference between each value and the previous one is stored
Using PCM, each value occupies 8 bits (1 byte)
In the ADPCM version, the highest number we need to store is 14, which can be stored in 4 bits
Therefore we can store the ADPCM version in half the space (4 bytes instead of 8)
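The example above can be reproduced with a few lines of code. This is a minimal sketch of the differential step only: it assumes the first sample is stored as a difference from zero and that every difference fits in an unsigned 4-bit value, which happens to hold for this rising series. A real ADPCM codec also needs signed differences and adapts its step size, neither of which is shown here. The function names are invented for the illustration.

```python
def delta_encode(samples):
    """Store each sample as its difference from the previous one."""
    previous = 0
    deltas = []
    for sample in samples:
        deltas.append(sample - previous)
        previous = sample
    return deltas


def delta_decode(deltas):
    """Rebuild the samples by keeping a running total of the differences."""
    total = 0
    samples = []
    for delta in deltas:
        total += delta
        samples.append(total)
    return samples


def pack_nibbles(values):
    """Pack values in the range 0..15 two per byte (4 bits each)."""
    values = list(values)
    if any(not 0 <= v <= 15 for v in values):
        raise ValueError("every value must fit in 4 bits")
    if len(values) % 2:
        values.append(0)  # pad to a whole number of bytes
    return bytes((values[i] << 4) | values[i + 1]
                 for i in range(0, len(values), 2))


pcm = [0, 1, 8, 16, 30, 40, 41, 45]
deltas = delta_encode(pcm)
packed = pack_nibbles(deltas)

print(deltas)                      # the differences listed on the slide
print(len(bytes(pcm)), len(packed))  # bytes needed before and after
```

Decoding with delta_decode returns the original eight samples, so this simplified scheme loses nothing; the saving comes purely from the differences being smaller than the samples.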

7 Other PCM codecs
Other PCM-based codecs have been developed for specialist applications such as voice, telephone traffic and low-end audio
Some of these include:
–GSM (Global System for Mobile communications): the GSM full rate speech codec operates at 13 Kbps
–G.723: dual rate speech codec for multimedia communications, transmitting at 5.3 and 6.3 Kbps
–TrueSpeech™: high-quality, low-bandwidth (8.5 Kbps) codec that can be found in Windows 95 and 98

8 MPEG 1 Layer 3
Also known as MP3
MP3 is a “perceptual codec”
It uses knowledge of human psychoacoustics and standard compression techniques to help reduce the size of the audio
Typical MP3 files are one-tenth of the size of the corresponding uncompressed audio source
MP3 files can be encoded at various degrees of quality (determined by the bitrate)
Near CD-quality is achieved at about 128 Kbps
It should be noted that MP3 is a lossy compression scheme
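The "one-tenth" figure can be checked with simple arithmetic. The sketch below assumes the uncompressed source is CD audio at 44.1 kHz, 16 bits, stereo (1,411,200 bits per second), which the slide does not state, and uses a hypothetical four-minute track.

```python
CD_BIT_RATE_BPS = 44100 * 16 * 2   # assumed CD audio: 1,411,200 bits/s
MP3_BIT_RATE_BPS = 128_000         # the "near CD-quality" rate from the slide


def audio_size_bytes(bit_rate_bps, duration_s):
    """Size of an audio stream at a constant bit rate, ignoring headers and tags."""
    return bit_rate_bps * duration_s // 8


track_seconds = 4 * 60  # hypothetical track length
cd_bytes = audio_size_bytes(CD_BIT_RATE_BPS, track_seconds)
mp3_bytes = audio_size_bytes(MP3_BIT_RATE_BPS, track_seconds)

print(cd_bytes, mp3_bytes, cd_bytes / mp3_bytes)
```

Under these assumptions the ratio comes out at roughly 11 to 1, in line with the slide's rule of thumb.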

9 Human frequency response
[Figure: human frequency response curve, with the most sensitive range of hearing marked. Taken from “MP3: The Definitive Guide” by Scot Hacker (O’Reilly)]

10 Psychoacoustics
Psychoacoustics is the study of the interrelation between the ear, the mind and vibratory audio signals
Two masking effects are used by MP3
–Simultaneous (auditory) masking: if two sounds (one loud, one quiet) with frequencies very close to each other are played simultaneously, the brain will not be able to distinguish the quieter one
–Temporal (time) masking: humans have trouble hearing distinct sounds that are very close together in time. If a loud sound and a quiet sound are played almost simultaneously, you won’t hear the quiet sound

11 MP3 encoding
A two stage process
1st stage: Psychoacoustic modelling
–If you can’t hear it - why store it?
–The encoder divides the frequency spectrum into 576 frequency bands and compresses each band independently according to rules of psychoacoustics
2nd stage: Standard compression
–Huffman encoding similar to ZIP file compression (a lossless compression scheme)
–Further reduces size by about 20%
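To show why the second stage shrinks the data without losing anything, here is a small Huffman coder. It is a generic textbook sketch, not the MP3 bitstream format: it builds a code table from the data it is given, which is not necessarily how an MP3 encoder obtains its tables, and the input values are invented stand-ins for quantised frequency values.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Return a {symbol: bit string} table; frequent symbols get shorter codes."""
    counts = Counter(symbols)
    if not counts:
        return {}
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries are (weight, tie_breaker, partial code table).
    # The unique tie_breaker stops Python from ever comparing the dicts.
    heap = [(weight, i, {symbol: ""})
            for i, (symbol, weight) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, table1 = heapq.heappop(heap)
        w2, _, table2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in table1.items()}
        merged.update({s: "1" + code for s, code in table2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


# Hypothetical quantised values: mostly zeros, a few small numbers.
values = [0] * 8 + [1] * 4 + [-1] * 2 + [2] * 2
table = huffman_code(values)
encoded_bits = sum(len(table[v]) for v in values)
fixed_bits = 2 * len(values)  # four distinct symbols need 2 bits each

print(table)
print(encoded_bits, fixed_bits)
```

Because no code is a prefix of another, the bit string decodes back to exactly the original values, which is what makes this stage lossless.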

12 The MP3 format in more detail
MP3 breaks the audio signal into frames lasting a fraction of a second just as in a movie film
Each frame is analysed to determine its spectral energy distribution (audible frequencies)
The bitrate determines how much data can be stored in each frame
The frequency spread is compared to mathematical models of human psychoacoustics to determine which frequencies must be rendered accurately and which ones can be dropped or allocated fewer bits
The bitstream is then compressed to reduce redundant information
The frames are assembled into series with header info
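The link between bitrate and frame size can be made concrete. The sketch below uses the commonly documented figures for MPEG-1 Layer 3, which are not given on the slide: each frame carries 1152 samples, and its length in bytes is 144 × bitrate ÷ sample rate, plus an optional padding byte. The function names are made up for this example.

```python
SAMPLES_PER_FRAME = 1152  # assumed value for MPEG-1 Layer 3


def frame_length_bytes(bit_rate_bps, sample_rate_hz, padding=0):
    """Bytes in one frame: 144 * bitrate / sample rate, plus optional padding."""
    return 144 * bit_rate_bps // sample_rate_hz + padding


def frame_duration_ms(sample_rate_hz):
    """Playing time of one frame in milliseconds."""
    return 1000 * SAMPLES_PER_FRAME / sample_rate_hz


print(frame_length_bytes(128_000, 44_100))   # bytes per frame at 128 Kbps
print(frame_length_bytes(64_000, 44_100))    # a lower bitrate gives smaller frames
print(round(frame_duration_ms(44_100), 2))   # the "fraction of a second"
```

The frame duration depends only on the sample rate, so lowering the bitrate leaves fewer bytes to describe the same slice of time.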

13 ID3 tags in MP3 files
MP3 audio has a section allocated to storing identification information about that particular piece of audio
This is called the ID3 data and stores information such as:
–Name of the artist, Track title, Album from where the track came, Recording year, Genre, Personal comments
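The fields listed above map closely onto the original ID3 (version 1) layout, so that is the version assumed in this sketch; the slide does not say which version it means, and the later ID3v2 format is stored differently. An ID3v1 tag is the last 128 bytes of the file: the marker "TAG", then fixed-width title, artist, album, year and comment fields, then a one-byte genre number. The function names are invented for the illustration.

```python
import os
import struct

ID3V1_SIZE = 128
# title(30) artist(30) album(30) year(4) comment(30) genre(1), after "TAG"
ID3V1_LAYOUT = "<30s30s30s4s30sB"


def parse_id3v1(tag):
    """Decode a 128-byte ID3v1 block, or return None if it is not one."""
    if len(tag) != ID3V1_SIZE or tag[:3] != b"TAG":
        return None
    title, artist, album, year, comment, genre = struct.unpack(ID3V1_LAYOUT, tag[3:])

    def text(raw):
        # Fields are padded with NUL bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(title),
        "artist": text(artist),
        "album": text(album),
        "year": text(year),
        "comment": text(comment),
        "genre": genre,  # index into the ID3v1 genre list
    }


def read_id3v1(path):
    """Read the tag from the end of a file, if the file is long enough."""
    if os.path.getsize(path) < ID3V1_SIZE:
        return None
    with open(path, "rb") as f:
        f.seek(-ID3V1_SIZE, os.SEEK_END)
        return parse_id3v1(f.read(ID3V1_SIZE))
```

Calling read_id3v1 with the path of an MP3 file returns a dictionary of the fields, or None when no ID3v1 tag is present.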

14 MP3 players (software)
WinAmp
Real Jukebox

15 MP3 players (hardware)
There are a number of hardware “walkman-style” MP3 players currently flooding the market
–Tend to be limited by small memory capacity (e.g. 32MB)
There are also in-car players…
–The empeg product runs Linux! (see
…and home hi-fi units
–Usually have large hard-drives to store all your music collection

16 RealAudio™
RealAudio was created by Real Networks in 1995
It is an encoding mechanism that allows audio to be streamed in real time as opposed to being downloaded all at once
Two varieties of encoding are:
–Single rate - Encodes using a single bitrate. Your network must be able to support this speed in order to play the audio without any glitches
–SureStream™ - Encodes multiple bitrates inside the same stream. If your data rate is good enough it will use the highest rate, otherwise it will drop down to the highest quality that your network speed supports
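The SureStream behaviour described above amounts to picking the highest encoded rate that the connection can carry. The sketch below shows only that selection rule; the list of rates and the function name are hypothetical, and this is not RealNetworks' actual negotiation logic.

```python
def pick_stream(encoded_rates_kbps, available_kbps):
    """Return the highest encoded bitrate the connection can sustain, or None."""
    playable = [rate for rate in encoded_rates_kbps if rate <= available_kbps]
    return max(playable) if playable else None


# Hypothetical set of bitrates encoded inside one stream.
rates = [20, 34, 45, 80]

for bandwidth in (100, 56, 28, 10):
    print(bandwidth, pick_stream(rates, bandwidth))
```

A single-rate stream is the special case where the list holds one value: the audio either plays at that rate or cannot be played smoothly at all.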

17 Digital audio file extensions
WAV (Microsoft Wave audio file format)
AU (Sun Microsystems Inc. audio file format)
SND (Next audio file format)
AIF or AIFF (Audio Interchange File Format - used a lot on Macintosh machines)
MP3 (MPEG 1 Layer 3 audio)
RA or RM (RealAudio and RealMedia respectively)
Plus others dedicated to streaming (final lecture)

18 Summary
Today we have looked at the various encoding mechanisms used to store and play back digital audio
There are numerous encoding schemes currently on the market, each with its own unique characteristics
–e.g. low-quality voice, high-quality voice, high-quality music, etc.
The most important factor when encoding digital audio is the playback environment
–Speed of processing and supporting data rate

19 Next lecture...
We will begin the first part of two sessions looking at the MIDI standard (Musical Instrument Digital Interface)
We will begin by looking at the MIDI protocol
No lecture next week but there is a lab where the coursework will be handed out
Coursework to be done in pairs – send me the names of your team members by March 9th
Next: Cakewalk tutorial (see recent )
