MPEG/Audio Compression Tutorial Mike Blackstock CPSC 538a January 11, 2004.



CPSC 538a MPEG Audio Tutorial, January 12

Overview
Digital Sound
Psychoacoustics
Time to Frequency Domain Transformation
MPEG/Audio Basic Algorithm
Related Work
Web references

Digital Sound Basics
Sound is a continuous pressure wave travelling through the air.
It is made up of pressure differences and is detected by measuring the pressure level at a location.
A microphone converts analog sound pressure into analog voltage levels.
To digitize sound, the signal must be sampled in time and encoded into numbers.
Quantization divides signal strength into levels, linearly or logarithmically.
8 bits -> 256 levels; 16 bits -> 65,536 levels
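
The linear case of quantization can be sketched in a few lines. This is only an illustration: the function names and the [-1.0, 1.0] input range are assumptions, not something the slides specify, and the logarithmic variant is not shown.

```python
import math

def num_levels(bits):
    """Number of distinct quantization levels for a given bit depth."""
    return 2 ** bits

def quantize_linear(x, bits):
    """Map a sample x in [-1.0, 1.0] to an integer code in [0, 2**bits - 1]."""
    levels = num_levels(bits)
    x = max(-1.0, min(1.0, x))            # clip out-of-range input
    code = int((x + 1.0) / 2.0 * levels)  # uniform steps across the range
    return min(code, levels - 1)          # x == 1.0 falls into the top level

def dequantize_linear(code, bits):
    """Return the midpoint of the quantization interval for a code."""
    levels = num_levels(bits)
    return (code + 0.5) / levels * 2.0 - 1.0

# Sample one cycle of a sine wave 8 times and quantize each sample to 8 bits.
samples = [math.sin(2 * math.pi * n / 8) for n in range(8)]
codes = [quantize_linear(s, 8) for s in samples]
restored = [dequantize_linear(c, 8) for c in codes]
```

Each restored sample differs from the original by at most half a step, which is the quantization noise that the later slides refer to.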

Digital Audio Questions
How often should sound be sampled?
– Sample at a rate at least twice the highest frequency in the signal; otherwise the higher frequencies are lost (they alias to lower ones). This is the Nyquist theorem.
What quality is required?
– Telephone, radio, and CD have different quality requirements.
– Signal-to-noise ratio (SNR) is a measure of the quality of a signal.
– Noise may be introduced during conversion from sound to voltage and by sampling/quantization.
Which format to use?
– .au, .aiff, .wav, and of course .mp3
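
A small sketch of what happens when the sampling rate is too low. The 7 kHz tone and 10 kHz sampling rate are made-up numbers chosen for illustration, and the helper names are hypothetical.

```python
import math

def alias_frequency(f_signal, f_sample):
    """Apparent frequency (Hz) of a sampled tone, folded into [0, f_sample / 2]."""
    f = f_signal % f_sample
    return min(f, f_sample - f)

def sample_cosine(f_signal, f_sample, count):
    """Take `count` samples of a unit cosine at f_signal Hz, sampled at f_sample Hz."""
    return [math.cos(2 * math.pi * f_signal * n / f_sample) for n in range(count)]

# A 7 kHz tone sampled at 10 kHz violates the Nyquist criterion (7 > 10 / 2).
too_high = sample_cosine(7000, 10000, 20)
# Its samples coincide with those of a 3 kHz tone, so the two are indistinguishable.
folded = sample_cosine(3000, 10000, 20)
max_difference = max(abs(a - b) for a, b in zip(too_high, folded))

print(alias_frequency(7000, 10000), max_difference)
```

Once sampled, nothing in the data distinguishes the 7 kHz tone from the 3 kHz one, which is why the input is normally band-limited before sampling.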

Psychoacoustics
Principles of the human perception of sound.
The MPEG compression algorithm uses a model of human hearing to remove data (a perceptual coding algorithm).
Audible frequency range is about 20 Hz to 20 kHz; hearing is most sensitive at 2 to 4 kHz.
Dynamic range (quietest to loudest) is about 96 dB.
Normal voice range is about 500 Hz to 2 kHz.
Low frequencies -> vowels, bass; high frequencies -> consonants.

Human Hearing Sensitivity
Experiment: Put a person in a quiet room. Raise the level of a 1 kHz tone until it is just barely audible. Vary the frequency and plot the level at which the tone becomes audible.
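
The curve produced by this experiment is often approximated in the audio coding literature by a closed-form expression. The sketch below evaluates one such approximation. The formula is an outside assumption and does not come from these slides, and the function name is hypothetical.

```python
import math

def threshold_in_quiet_db(freq_hz):
    """Approximate threshold of hearing in quiet (dB SPL) at freq_hz.

    Closed-form approximation from the literature, not from the slides.
    """
    khz = freq_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)

# Print the approximate threshold at a few frequencies across the audible range.
for f in (100, 500, 1000, 3000, 10000, 16000):
    print(f"{f:>6} Hz: {threshold_in_quiet_db(f):6.1f} dB SPL")
```

Under this approximation the threshold is lowest in the 2 to 4 kHz region, consistent with the sensitivity range quoted on the Psychoacoustics slide.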

Human Frequency Masking
Experiment: Play a 1 kHz tone (the masking tone) at a fixed level (60 dB). Play a test tone at a different frequency (e.g., 1.1 kHz) and raise its level until it is just distinguishable. Vary the frequency of the test tone and plot the threshold at which it becomes audible.

Frequency Masking

Temporal Masking
If we hear a loud sound and it then stops, it takes a little while until we can hear a soft tone at a nearby frequency.
Experiment: Play a 1 kHz masking tone at 60 dB, plus a test tone at 1.1 kHz at 40 dB. The test tone can't be heard (it's masked). Stop the masking tone, then stop the test tone after a short delay. Adjust the delay to the shortest time at which the test tone can be heard (e.g., 5 ms). Repeat with different levels of the test tone and plot the results.

Combination

Time to Frequency Transform
Transforms time/level input signals into frequency/power.
FFT is the most popular: fast and easy, and covered in most numerical methods texts. Used by the psychoacoustic model.
DCT is often used for spatial frequency since it represents linear signals better. Something similar is used by the filter bank.
Wavelets use non-sine/cosine basis functions for better performance on data with sharp discontinuities.
Demo
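
A minimal sketch of the time-to-frequency step, using a naive discrete Fourier transform in place of an FFT (the FFT computes the same values, only faster). The test signal, sampling rate, and function names are assumptions made for illustration.

```python
import cmath
import math

def dft(samples):
    """Naive O(N^2) discrete Fourier transform; an FFT gives the same result faster."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def power_spectrum(samples):
    """Power in each frequency bin from 0 up to the Nyquist bin."""
    n = len(samples)
    return [abs(c) ** 2 / n for c in dft(samples)[: n // 2 + 1]]

# Hypothetical test signal: a 1 kHz tone sampled at 8 kHz for 64 samples.
sample_rate = 8000
n_samples = 64
signal = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(n_samples)]

spectrum = power_spectrum(signal)
peak_bin = max(range(len(spectrum)), key=lambda k: spectrum[k])
peak_hz = peak_bin * sample_rate / n_samples
print(peak_bin, peak_hz)
```

The time-domain samples say nothing obvious about pitch, but the power spectrum concentrates the energy in the bin that corresponds to the tone's frequency.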

MPEG Basics

Algorithm overview
1. Use convolution filters to divide the audio signal (e.g., 48 kHz sound) into 32 frequency subbands (subband filtering). A 512-sample FIFO buffer is used.
2. Determine the amount of masking for each band caused by nearby bands, using the psychoacoustic model shown above.
3. If the power in a band is below the masking threshold, don't encode it.
4. Otherwise, determine the number of bits needed to represent the coefficient such that the noise introduced by quantization is below the masking effect (recall that one fewer bit of quantization introduces about 6 dB of noise).
5. Format the bitstream.
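
The rule of thumb in step 4, about 6 dB of noise per bit, can be checked numerically. The sketch below quantizes a full-scale sine at several bit depths and measures the signal-to-quantization-noise ratio. The quantizer, the test tone, and the function names are assumptions chosen for illustration.

```python
import math

def quantize(x, bits):
    """Round x to the nearest multiple of the step size 2 / 2**bits."""
    step = 2.0 / (2 ** bits)
    return round(x / step) * step

def snr_db(bits, count=10000):
    """Measured signal-to-quantization-noise ratio (dB) for a full-scale sine."""
    signal_power = 0.0
    noise_power = 0.0
    for n in range(count):
        x = math.sin(2 * math.pi * 0.01234567 * n)  # arbitrary frequency, in cycles per sample
        err = x - quantize(x, bits)
        signal_power += x * x
        noise_power += err * err
    return 10 * math.log10(signal_power / noise_power)

# Compare the measured SNR at several bit depths.
for bits in (4, 6, 8, 10):
    print(bits, "bits:", round(snr_db(bits), 1), "dB")
```

Adding one bit halves the step size, which halves the error amplitude and so lowers the noise power by roughly 6 dB.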

Example
After analysis, the first levels of 16 of the 32 bands are:
[Table: Band, Level (dB)]
If the level of the 8th band is 60 dB, it gives a masking of 12 dB in the 7th band and 15 dB in the 9th.
The level in the 7th band is 10 dB (< 12 dB), so ignore it.
The level in the 9th band is 35 dB (> 15 dB), so send it.
Only the amount above the masking level needs to be sent, so instead of using 6 bits to encode it, we can use 4 bits, saving 2 bits (= 12 dB).
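
The arithmetic in this example can be reproduced with a short sketch. The function name is hypothetical, and rounding up to a whole number of bits at 6 dB per bit is an assumption that happens to match the 6-bit and 4-bit figures quoted above.

```python
import math

DB_PER_BIT = 6.0  # rule of thumb from the slides: each bit is worth about 6 dB

def bits_for_band(level_db, mask_db):
    """Bits to spend on a band, given its level and the masking threshold in that band."""
    if level_db <= mask_db:
        return 0  # masked: the band is inaudible, so it is not encoded
    # Only the part of the signal above the masking threshold needs to be represented.
    return math.ceil((level_db - mask_db) / DB_PER_BIT)

# Values from the example: the 60 dB level in band 8 masks 12 dB in band 7
# and 15 dB in band 9.
band7 = bits_for_band(10, 12)
band9 = bits_for_band(35, 15)
band9_unmasked = bits_for_band(35, 0)
print(band7, band9, band9_unmasked)
```

Band 7 is dropped entirely, and band 9 needs two fewer bits than it would without masking.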

MPEG layers
Layer 1
– DCT-type filter with one frame
– equal frequency spread per band
– psychoacoustic model only uses frequency masking
Layer 2
– uses three frames in the filter (before, current, next), a total of 1152 samples
– models a bit of temporal masking
Layer 3 (mp3)
– better critical band filter is used (non-equal frequencies)
– psychoacoustic model includes temporal masking effects
– takes into account stereo redundancy
– uses Huffman coder
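
To illustrate the idea behind the Huffman coder in Layer 3, here is a generic Huffman code built from symbol counts. It is a sketch of the technique only: MP3 itself uses predefined code tables rather than building a tree per signal, and the input values and function name below are hypothetical.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from observed symbol frequencies."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (total count, unique tie-breaker, {symbol: code so far}).
    heap = [(c, i, {s: ""}) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, low = heapq.heappop(heap)   # two least frequent subtrees
        c2, _, high = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in low.items()}
        merged.update({s: "1" + code for s, code in high.items()})
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]

# Hypothetical quantized values in which small magnitudes dominate.
values = [0, 0, 0, 0, 0, 0, 1, 1, 1, -1, -1, 2]
code = huffman_code(values)
encoded = "".join(code[v] for v in values)
print(code, len(encoded))
```

Frequent values receive short codes and rare values long ones, so the encoded stream is shorter than a fixed-width encoding of the same values.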

Related Work
MPEG phase 2
– Multichannel (5.1) audio support
– Significant in driving DVD sales
MPEG-4 Structured Audio
– Efficient, flexible description of synthetic music
Copy protection and copyright
Speech Processing
– Uses many similar techniques

References
SFU CMPT 365 Course Contents, Spring 2003
Basics of Digital Audio, retrieved January 7
Audio Compression, retrieved January 7
Audio and Multimedia Layer 3, January 8
MP3 Backgrounder, January 8, 2004
Scheirer, E. D., The MPEG-4 Structured Audio, Proceedings of ICASSP98
Scheirer, E. D., SAOL / MPEG-4 Structured Audio homepage
Perkowski, M. A., Speech Signals in Time and Frequency Domain, frequency-domain.pdf, January 11
Graps, A., "An Introduction to Wavelets", IEEE Computational Sciences and Engineering, Volume 2, Number 2, Summer 1995
Signals Demonstrations