Digital Representation of Audio Information Kevin D. Donohue Electrical Engineering University of Kentucky.


Elements of a DSP System Analog Signal → Quantizer → Discrete-time Signal → Coder → Digital Signal → Computing/Decoding → Processed Digital Signal → Interpolating/Smoothing → Processed Analog Signal

Critical Audio Issues Trade-off between resources to store/transmit and quality of audio information  Sampling rate  Quantization level  Compression techniques

Sound and Human Perception  Signal fidelity does not need to exceed the sensitivity of the auditory system

Audible Frequency Range and Sampling Rate  Frequency range: 20 to 20,000 Hz  Audible intensities: the threshold of hearing (1 picowatt/meter²) corresponds to 0 dB  Sample sweep at constant intensity: 0 to 20 kHz in 10 seconds

Sampling Requirement  A bandlimited signal can be completely reconstructed from a set of discrete samples by low-pass filtering (or interpolating) the sample sequence, if the original signal was sampled at a rate greater than twice its highest frequency.  Aliasing errors occur when the original signal contains frequencies greater than or equal to half the sampling rate.  Signal energy beyond 20 kHz is not audible, so sampling rates beyond 40 kHz should capture almost all audible detail (no perceived quality loss).
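The folding behavior described above can be sketched in a few lines. This is an illustrative helper (the name alias_frequency is mine, not from the slides) that maps any input tone frequency to the apparent frequency an ideal sampler at rate fs would produce:

```python
# Hedged sketch of the aliasing rule: an ideal sampler at rate fs maps any
# tone to an apparent frequency in [0, fs/2]. The helper name is illustrative.

def alias_frequency(f_signal, fs):
    """Apparent (folded) frequency of a tone at f_signal Hz sampled at fs Hz."""
    f = f_signal % fs        # the sampled spectrum repeats every fs
    if f > fs / 2:           # frequencies above Nyquist fold back down
        f = fs - f
    return f

print(alias_frequency(1300, 2000))   # a 1.3 kHz tone sampled at 2 kHz appears at 700 Hz
print(alias_frequency(900, 2000))    # below Nyquist: unchanged at 900 Hz
```

This is exactly why the 2 kHz sampling demonstration later in the slides produces audible aliasing unless the signal is first band-limited below 1 kHz.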

Sampling Standards  CD quality samples at 44.1 kHz  DVD quality samples at 48 kHz  Telephone quality samples at 8 kHz

Spectrogram of CD-Quality Sound

Spectrogram of Telephone-Rate Sound

Bandwidth and Sampling Errors  Original Sound  Limited Bandwidth (LPF with 900 Hz cutoff) and sampled at 2 kHz  Original Sound sampled at 2 kHz (aliasing)

Dynamic Range and Audible Sound  Intensity changes of less than 1 dB typically are not perceived by the human auditory system.  25 tones at 1 kHz, decreasing in 3 dB increments  The human ear can detect sounds from 10^-12 to 10 watts/meter² (a 130 dB dynamic range)

Quantization Levels and Dynamic Range  An N-bit word can represent 2^N levels  For an audio signal, an N-bit word corresponds to: N × 20 × log10(2) ≈ 6.02 N dB of dynamic range  16 bits achieve a dynamic range of about 96 dB; every bit added contributes about 6 dB.
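The formula on this slide is easy to check directly. A small sketch (function name is mine, not from the slides):

```python
import math

# The slide's dynamic-range formula as code: an N-bit word spans 2**N levels,
# giving N * 20 * log10(2) dB of dynamic range (about 6 dB per bit).

def dynamic_range_db(n_bits):
    return n_bits * 20 * math.log10(2)

print(round(dynamic_range_db(16), 2))   # 96.33 dB for CD-style 16-bit samples
print(round(dynamic_range_db(8), 2))    # 48.16 dB for 8-bit samples
```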

Quantization Error and Noise  Quantization has the same effect as adding noise to the signal (analog → discrete → digital).  The resulting quantization noise is proportional to the interval between quantization levels.  For uniform quantization, the interval between signal levels is the maximum signal amplitude divided by the number of quantization intervals.
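A minimal uniform quantizer makes the relationship concrete: the step size is the full-scale range divided by the number of intervals, so the worst-case error is half a step. This is an illustrative sketch, not code from the slides:

```python
# Minimal uniform quantizer sketch (illustrative, assuming a bipolar signal
# spanning [-max_amp, +max_amp]): the step size is the full-scale range
# divided by the number of intervals, so worst-case error is half a step.

def quantize(x, max_amp, n_bits):
    levels = 2 ** n_bits
    step = 2 * max_amp / levels                              # interval between levels
    index = round(x / step)
    index = max(-levels // 2, min(levels // 2 - 1, index))   # clip to valid range
    return index * step

x, max_amp, n_bits = 0.333, 1.0, 8
xq = quantize(x, max_amp, n_bits)
step = 2 * max_amp / 2 ** n_bits
print(abs(x - xq) <= step / 2)   # True: error stays within half a step
```

Halving the step (one extra bit) halves the worst-case error, which is the ~6 dB-per-bit rule from the previous slide seen from the noise side.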

Quantization Noise  Original CD clip quantized with 6 bits at the original sampling frequency  6-bit quantization at 2 kHz sampling

Encoding and Resources  Pulse code modulation (PCM) encodes each sample over uniformly spaced N-bit quantization levels.  The number of bits required to represent C channels of a d-second signal sampled at Fs with N-bit quantization is: d × C × N × Fs + bits of header information  A 4-minute CD-quality sound clip uses Fs = 44.1 kHz, C = 2, N = 16 (assume no header):  File size = (4×60) × 2 × 16 × 44.1k ≈ 339 Mb (about 42 MBytes)  Transmission in real time requires a rate greater than 1.4 Mb/s
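The slide's storage arithmetic, written out as code (the helper name is mine):

```python
# The slide's storage formula as code: bits = d * C * N * Fs
# (duration in seconds, channels, bits per sample, sample rate), header ignored.

def pcm_bits(duration_s, channels, bits_per_sample, fs):
    return duration_s * channels * bits_per_sample * fs

bits = pcm_bits(4 * 60, 2, 16, 44_100)
print(bits)                              # 338688000 bits for a 4-minute CD clip
print(round(bits / 8 / 1e6, 1))          # about 42.3 MBytes
print(round(bits / (4 * 60) / 1e6, 2))   # 1.41 Mb/s needed for real-time playback
```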

Compression Techniques  Compression methods take advantage of signal redundancies, patterns, and predictability via:  Efficient basis-function transforms (wavelet and DCT)  LPC modeling (linear predictive coding)  CELP (code-excited linear prediction)  ADPCM (adaptive differential pulse code modulation)  Huffman encoding

File Formats  Critical parameters for data encoding describe how samples are stored in the file:  signed or unsigned  bits per sample  byte order  number of channels and interleaving  compression parameters

File Formats  Extension, name origin: variable parameters (fixed; comments)  .au or .snd (NeXT, Sun): rate, #channels, encoding, info string  .aif(f), AIFF (Apple, SGI): rate, #channels, sample width, lots of info  .aif(f), AIFC (Apple, SGI): same (extension of AIFF with compression)  .voc (Soundblaster): rate (8 bits/1 ch; can use silence deletion)  .wav, WAVE (Microsoft): rate, #channels, sample width, lots of info  .sf (IRCAM): rate, #channels, encoding, info string  none, HCOM (Mac): rate (8 bits/1 ch; uses Huffman compression)  More details can be found at:

Subband Filtering and MPEG  Subband filtering transforms a block of time samples (a frame) into a parallel set of narrowband signals.
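A very rough stand-in for the idea (MPEG's actual 32-band polyphase filterbank is considerably more elaborate): split a frame's DFT into 32 equal-width frequency bands and sum the power in each. Names and frame contents here are illustrative:

```python
import cmath
import math

# Rough sketch of subband analysis: compute the frame's DFT and group the
# bins into 32 equal-width bands, summing each band's power. This only
# illustrates the time-to-subband transform, not MPEG's polyphase design.

def band_powers(frame, n_bands=32):
    n = len(frame)
    spectrum = [sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n)) for k in range(n // 2)]
    per_band = (n // 2) // n_bands
    return [sum(abs(spectrum[b * per_band + i]) ** 2 for i in range(per_band))
            for b in range(n_bands)]

frame = [math.sin(2 * math.pi * 5 * t / 384) for t in range(384)]  # one 384-sample frame
powers = band_powers(frame)
print(len(powers), powers.index(max(powers)))   # 32 bands; the tone's energy lands in band 0
```

The per-band powers are what the psychoacoustic model on the following slides compares against the masking thresholds.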

MPEG Layers  MPEG defines 3 layers for audio. The basic model is the same, but codec complexity increases with each layer.  Data is divided into frames, each containing 384 samples: 12 samples from each of the 32 filtered subbands.  Layer 1: DCT-type filter with one frame and equal frequency spread per band. The psychoacoustic model uses only frequency masking (4:1 compression).  Layer 2: uses three frames in the filter (previous, current, and next, a total of 1152 samples). This models some temporal masking (6:1).  Layer 3: a better critical-band filter is used (non-equal frequency bands), the psychoacoustic model includes temporal masking effects, stereo redundancy is exploited, and a Huffman coder is used (12:1).

MPEG Audio  Steps in the algorithm:  Filter the audio signal (e.g. 48 kHz sound) into frequency subbands that approximate the 32 critical bands (subband filtering).  Determine the amount of masking for each band caused by nearby bands (the psychoacoustic model).  If the power in a band is below the masking threshold, don't encode it. Otherwise, determine the number of bits needed so that the noise introduced by quantization stays below the masking effect.  Format the bitstream.

Example  After analysis, the levels of the first 16 of the 32 bands are known. If the level of the 8th band is 60 dB, it produces a masking of 12 dB in the 7th band and 15 dB in the 9th. The level in the 7th band is 10 dB (< 12 dB), so ignore it. The level in the 9th band is 35 dB (> 15 dB), so send it; it can be encoded with up to 2 bits (= 12 dB) of quantization error.
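The example's decision logic can be sketched as a small helper (the function name is mine, and the ~6 dB-per-bit figure comes from the earlier dynamic-range slide):

```python
import math

# Sketch of the slide's bit-allocation logic: a band below its mask is
# skipped entirely; an audible band can tolerate roughly floor(mask/6)
# dropped bits, since each dropped bit adds about 6 dB of quantization
# noise and noise under the mask is inaudible.

def droppable_bits(level_db, mask_db, db_per_bit=6):
    if level_db <= mask_db:
        return None          # masked band: don't encode it at all
    return math.floor(mask_db / db_per_bit)

print(droppable_bits(10, 12))   # None: the 7th band (10 dB < 12 dB mask) is skipped
print(droppable_bits(35, 15))   # 2: the 9th band tolerates 12 dB of quantization noise
```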