Digital Audio III. Sound compression (I) Compression of sound data requires different techniques from those for graphical data Requirements are less stringent.

Slides:



Advertisements
Similar presentations
Multimedia: Digitised Sound Data Section 3. Sound in Multimedia Types: Voice Overs Special Effects Musical Backdrops Sound can make multimedia presentations.
Advertisements

Speech Coding Techniques
T.Sharon-A.Frank 1 Multimedia Compression Basics.
15 Data Compression Foundations of Computer Science ã Cengage Learning.
1 CMSHN1114/CMSCD1011 Introduction to Computer Audio Lecture 5: Digital audio formats Dr David England School of Computing and Mathematical Sciences
A stereo audio file 1. Audio Channels Number of audio channels determines number of waveforms in a recording Two relevant types of recording Stereo recording.
Sound in multimedia How many of you like the use of audio in The Universal Machine? What about The Universal Computer? Why or why not? Does your preference.
Sound can make multimedia presentations dynamic and interesting.
4.1Different Audio Attributes 4.2Common Audio File Formats 4.3Balancing between File Size and Audio Quality 4.4Making Audio Elements Fit Our Needs.
School of Informatics CG087 Time-based Multimedia Assets Compression & StreamingDr Paul Vickers1 Compression & Streaming Serving, shrinking, and otherwise.
Time-Frequency Analysis Analyzing sounds as a sequence of frames
Dale & Lewis Chapter 3 Data Representation Analog and digital information The real world is continuous and finite, data on computers are finite  need.
Digital Audio Compression
Chapter 4: Representation of data in computer systems: Sound OCR Computing for GCSE © Hodder Education 2011.
I Power Higher Computing Multimedia technology Audio.
Digital Representation of Audio Information Kevin D. Donohue Electrical Engineering University of Kentucky.
1 Digital Audio Compression. 2 Formats  There are many different formats for storing and communicating digital audio:  CD audio  Wav  Aiff  Au 
Image and Sound Editing Raed S. Rasheed Sound What is sound? How is sound recorded? How is sound recorded digitally ? How does audio get digitized.
CELLULAR COMMUNICATIONS 5. Speech Coding. Low Bit-rate Voice Coding  Voice is an analogue signal  Needed to be transformed in a digital form (bits)
1 Audio Compression Techniques MUMT 611, January 2005 Assignment 2 Paul Kolesnik.
SWE 423: Multimedia Systems Chapter 7: Data Compression (1)
EE2F1 Speech & Audio Technology Sept. 26, 2002 SLIDE 1 THE UNIVERSITY OF BIRMINGHAM ELECTRONIC, ELECTRICAL & COMPUTER ENGINEERING Digital Systems & Vision.
Computer Science 335 Data Compression.
EE2F1 Speech & Audio Technology Sept. 26, 2002 SLIDE 1 THE UNIVERSITY OF BIRMINGHAM ELECTRONIC, ELECTRICAL & COMPUTER ENGINEERING Digital Systems & Vision.
5. Multimedia Data. 2 Multimedia Data Representation  Digital Audio  Sampling/Digitisation  Compression (Details of Compression algorithms – following.
Digital Audio Multimedia Systems (Module 1 Lesson 1)
1 Audio Compression Multimedia Systems (Module 4 Lesson 4) Summary: r Simple Audio Compression: m Lossy: Prediction based r Psychoacoustic Model r MPEG.
Sound in Multimedia and HCI
 Continuous sequence of vibrations of air  (Why no sound in space? Contrary to Star Wars etc.)  Abstraction of an audio wave:  Ear translates vibrations.
Media File Formats Jon Ivins, DMU. Text Files n Two types n 1. Plain text (unformatted) u ASCII Character set is most common u 7 bits are used u This.
Introduction to Interactive Media 10: Audio in Interactive Digital Media.
GODIAN MABINDAH RUTHERFORD UNUSI RICHARD MWANGI.  Differential coding operates by making numbers small. This is a major goal in compression technology:
School of Informatics CG087 Time-based Multimedia Assets Compression & StreamingDr Paul Vickers1 Compression & Streaming Serving, shrinking, and otherwise.
COMP Representing Sound in a ComputerSound Course book - pages
Audio Compression Usha Sree CMSC 691M 10/12/04. Motivation Efficient Storage Streaming Interactive Multimedia Applications.
Overview of Multimedia A multimedia presentation might contain: –Text –Animation –Digital Sound Effects –Voices –Video Clips –Photographic Stills –Music.
CIS679: Multimedia Basics r Multimedia data type r Basic compression techniques.
Digital Multimedia, 2nd edition Nigel Chapman & Jenny Chapman Chapter 9 This presentation © 2004, MacAvon Media Productions Sound.
1 Audio Compression. 2 Digital Audio  Human auditory system is much more sensitive to quality degradation then is the human visual system  redundancy.
Compression No. 1  Seattle Pacific University Data Compression Kevin Bolding Electrical Engineering Seattle Pacific University.
Submitted By: Santosh Kumar Yadav (111432) M.E. Modular(2011) Under the Supervision of: Mrs. Shano Solanki Assistant Professor, C.S.E NITTTR, Chandigarh.
CS Spring 2009 CS 414 – Multimedia Systems Design Lecture 3 – Digital Audio Representation Klara Nahrstedt Spring 2009.
Digital Media Dr. Jim Rowan ITEC 2110 Audio. What is audio?
Marwan Al-Namari 1 Digital Representations. Bits and Bytes Devices can only be in one of two states 0 or 1, yes or no, on or off, … Bit: a unit of data.
1 Audio Coding. 2 Digitization Processing Signal encoder Signal decoder samplingquantization storage Analog signal Digital data.
CSCI-100 Introduction to Computing Hardware Part II.
Digital Media Dr. Jim Rowan ITEC 2110 Audio. What is audio?
School of Informatics, Engineering, & Technology CM613 Multimedia Storage & Retrieval Compression & StreamingDr Paul Vickers1 Compression & Streaming Serving,
Audio Streaming © Nanda Ganesan, Ph.D.. Audio File Features Audio file is a record of captured sound that can be played back –The WAV File is an example.
CS Spring 2014 CS 414 – Multimedia Systems Design Lecture 3 – Digital Audio Representation Klara Nahrstedt Spring 2014.
Voice Sampling. Sampling Rate Nyquist’s theorem states that a signal can be reconstructed if it is sampled at twice the maximum frequency of the signal.
1 What is Multimedia? Multimedia can have a many definitions Multimedia means that computer information can be represented through media types: – Text.
Fundamentals of Multimedia Chapter 6 Basics of Digital Audio Ze-Nian Li and Mark S. Drew 건국대학교 인터넷미디어공학부 임 창 훈.
Audio Formats. Digital sound files must be organized and structured so that your media player can read them. It's just like being able to read and understand.
1 Part A Multimedia Production Chapter 2 Multimedia Basics Digitization, Coding-decoding and Compression Information and Communication Technology.
By :- Ishank Ranjan Akash Gupta. Audio & Audio File Formats Audio is an electrical or other representation of sound. An audio file format is a file format.
Lifecycle from Sound to Digital to Sound. Characteristics of Sound Amplitude Wavelength (w) Frequency ( ) Timbre Hearing: [20Hz – 20KHz] Speech: [200Hz.
Digital Audio (2/2) S.P.Vimal CSIS Group BITS-Pilani
Digital Sound Dr. Kairui Chen GGC.
Chapter 4 Fundamentals of Digital Audio
Level 3 Extended Diploma Unit 19 Computer Systems Architecture
Multimedia: Digitised Sound Data
Data Compression.
Sound Digital Multimedia, 2nd edition Nigel Chapman & Jenny Chapman
Chapter 4: Representing sound
15 Data Compression Foundations of Computer Science ã Cengage Learning.
Audio Compression Techniques
Govt. Polytechnic Dhangar(Fatehabad)
Digital Audio Application of Digital Audio - Selected Examples
15 Data Compression Foundations of Computer Science ã Cengage Learning.
Presentation transcript:

Digital Audio III

Sound compression (I) Compression of sound data requires different techniques from those for graphical data Requirements are less stringent than for video – data rate for CD-quality audio is much less than for video, but still exceeds the capacity of dial-up Internet connections (or, indeed, of most broadband connections) Data rate is 44100*2*2 bytes/sec = bytes/s = 1.41Mbits/sec – 3 minute song recorded in stereo occupies 31Mbytes Sound is difficult to compress using lossless methods – complex and unpredictable nature of sound waveforms Different requirements depending on the nature of the sound – speech – music – natural sounds – …and on nature of the application

Sound compression (II) A simple lossless compression method is to record the length of a period of silence – no need to record 44,100 samples of value zero for each second of silence – form of run-length encoding – in reality this is not lossless, as “silence” rarely corresponds to sample values of exactly zero; rather some threshold value is applied Difference between how we perceive sounds and images results in different lossy compression techniques for the two media – high spatial frequencies can be discarded in images – high sound frequencies, however, are highly significant So what can we discard from sound data?

Sound compression (III) Our perception of loudness is essentially logarithmic in the amplitude of a sound Nonlinear quantization techniques provide compression by requiring a smaller sample size to cover the full range of input Remember...

Adaptive Differential Pulse Code Modulation ADPCM Based on storing the difference between samples – related to inter-frame compression of video BUT less straightforward Audio waveforms change rapidly so no reason to assume that differences between samples are small – unlike video, where two consecutive frames may be very similar Differential Pulse Code Modulation (DPCM) computes are predicted value for a sample based on preceding samples – stores difference between predicted and actual sample – difference will be small if prediction is good ADPCM extends this by using an adaptive step (sample) size to obtain further compression – large differences quantized using large steps, while small differences are quantized using small steps – efficient use of bits, taking account of rate of change of signal

Linear Predictive Coding Radical approach to compression of speech Uses mathematical model of the vocal tract Instead of transmitting speech as audio samples, the parameters describing the state of the vocal tract are sent At the receiving end these parameters are used to reconstruct the speech by applying them to the same model Achieves very low data rates: 2.4kbps (usable on a poor quality telephone line!) Speech has a machine-like quality – suitable for accurate transmission of content, but not faithful rendition of a particular voice Similar in concept (but rather more complicated!) to vector coding of 2D and 3D graphics

Perceptually based compression (I) Is there data corresponding to sounds we do not perceive in a sound sample? – If so, then we can discard it, thereby achieving compression Sound may be too quiet to be heard One sound may be obscured by another sound Threshold of hearing – varies nonlinearly with frequency – very low or high frequency sounds must be much louder than mid-range sounds to be heard – we are most sensitive to sounds in the frequency range corresponding to human speech Sounds below the threshold can be discarded – compression algorithm uses a psychoacoustical model that describes how the threshold varies with frequency

Perceptually based compression (II) Loud tones can obscure softer tones that occur at the same time – not just a function of relative loudness, but also of the relative frequencies of the two tones This is known as masking – modification of the threshold of hearing curve in the region of a loud tone – sounds normally above the unmodified threshold are no longer heard Masking can also hide noise as well as other tones – courser quantization (smaller sample size = fewer bits) can be used in regions of loud sounds Actually making use of masking in a compression algorithm is very complicated – MPEG standards use this sort of compression for the audio tracks of video – MP3, which is MPEG-1 Layer 3 audio, achieves 10:1 compression – AAC, from MPEG 4, is even better: approx 128Kbit/second Used by Itunes and QuickTime 6.

Sound files and formats There are many sound file formats Windows PCM waveform (.wav)  a form of RIFF specification; basically uncompressed data Windows ADPCM waveform (.wav)  another form of RIFF file, but compressed to 4 bits/channel CCITT mu-law (A-law) waveforms (.wav)  another form using 8 bit logarithmic compression NeXT/SUN file format (.snd, or.au)  actually many different varieties: header followed by data  data may be in many forms, linear, or mu-law, etc. RealAudio (.ra)  used for streaming audio; a compressed format. MPEG format (includes MP3, AAC)  has various different forms of compression QuickTime, AVI and Shockwave Flash  can include audio as well as video

Examples of sizes A second stretch of sound, digitised at samples/second, 16 bits, mono takes – bytes as a.snd – bytes as an uncompressed.wav – bytes as A-law.wav – bytes as an compressed.wav – 5860 bytes as RealAudio – bytes as ASCII text …and if you listen hard you can hear the difference!

Music So far we have considered digitizing sounds recorded from the real world To store and transmit natural sounds it is generally necessary to use digitized recordings of the real sounds But we have seen that speech can be specified as states of the vocal tract Music can also be specified: musical scores We can send a piece of music to someone as either – a recording of an actual performance – or some notation of the score, provided the receiver has some means of recreating the music from the score e.g. they can play it on a piano etc This is akin to bitmapped versus vector-based graphics