Media Representations - Audio

Outline Audio signals Sampling Quantization Audio file formats (WAV/MIDI) Human auditory system

What is Sound? Sound is a wave phenomenon, like light, involving molecules of air being compressed and expanded under the action of some physical device. A speaker (or other sound generator) vibrates back and forth and produces a longitudinal pressure wave that is perceived as sound. Since sound is a pressure wave, it takes on continuous values, as opposed to digitized ones. If we wish to use a digital version of sound waves, we must form digitized representations of audio information.

Digitization Digitization means conversion to a stream of numbers, preferably integers for efficiency. Sound is 1-dimensional: amplitude values (sound pressure/level) depend on a single 1D variable, time.

Digitization cont’d Digitization must occur in both time and amplitude. Sampling: measuring the quantity we are interested in, usually at evenly spaced intervals. The first kind of sampling, using measurements only at evenly spaced time intervals, is simply called sampling, and the rate is called the sampling frequency. For audio, rates typically range from 8 kHz (8,000 samples per second) to 48 kHz (determined by the Nyquist theorem, discussed later). Sampling in the amplitude or voltage dimension is called quantization.

Sampling and Quantization

Audio Digitization (PCM) PCM: Pulse Code Modulation

Parameters in Digitizing To decide how to digitize audio data we need to answer the following questions: 1. What is the sampling rate? 2. How finely is the data to be quantized, and is the quantization uniform? 3. How is the audio data formatted (file format)?

Sampling Rate Signals can be decomposed into a sum of sinusoids: weighted sinusoids can build up quite complex signals (recall calculus and linear algebra).
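
The decomposition above can be sketched numerically. A minimal NumPy example (the frequencies and weights are arbitrary illustrative choices): three weighted sinusoids are summed into one signal, and the FFT recovers the dominant component.

```python
import numpy as np

fs = 8000                   # sampling rate in Hz (illustrative choice)
t = np.arange(fs) / fs      # one second of sample times

# A "complex" signal built as a weighted sum of sinusoids:
signal = (1.00 * np.sin(2 * np.pi * 440 * t)      # fundamental at 440 Hz
          + 0.50 * np.sin(2 * np.pi * 880 * t)    # first harmonic, half weight
          + 0.25 * np.sin(2 * np.pi * 1320 * t))  # second harmonic, quarter weight

# The magnitude spectrum recovers the components; the strongest
# bin sits at the 440 Hz fundamental.
spectrum = np.abs(np.fft.rfft(signal))
peak_hz = np.argmax(spectrum) * fs / len(signal)
```

Because the signal spans exactly one second, each FFT bin is 1 Hz wide, so the peak lands exactly on 440 Hz.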

Sampling Rate cont’d If the sampling rate exactly equals the actual frequency, a false (constant) signal is detected. If we sample at 1.5 times the actual frequency, we obtain an incorrect (alias) frequency that is lower than the correct one: it is half the correct one, since the apparent wavelength, from peak to peak, is double that of the actual signal.

Nyquist Theorem For correct sampling we must use a sampling rate equal to at least twice the maximum frequency content in the signal. This rate is called the Nyquist rate. Sampling theory – Nyquist theorem: if a signal is band-limited (frequency-limited), i.e., there is a lower limit f1 and an upper limit f2 on the frequency components in the signal, then the sampling rate should be at least 2(f2 − f1). Proof and more math: http://en.wikipedia.org/wiki/Nyquist-Shannon_sampling_theorem
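
The aliasing cases on the previous slide can be checked with a small helper (a sketch; the folding formula is standard, the example frequencies are arbitrary):

```python
def alias_frequency(f_signal, f_sample):
    """Apparent frequency (Hz) when a sinusoid at f_signal is sampled at
    f_sample: the true frequency folds back into the range [0, f_sample/2]."""
    f = f_signal % f_sample
    return min(f, f_sample - f)

# Sampling at exactly the signal frequency yields a false constant (0 Hz);
# sampling at 1.5x the signal frequency aliases to half the true frequency.
print(alias_frequency(1000, 1000))   # 0   -> constant signal
print(alias_frequency(1000, 1500))   # 500 -> half of the true 1000 Hz
```

A 440 Hz tone sampled at 8 kHz, by contrast, is above the Nyquist rate and is reported unchanged.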

Quantization (Pulse Code Modulation) At every time interval the sound is converted to a digital equivalent. Using 2 bits, a sound can be digitized into four amplitude levels. Telephone: 8 bits. CD: 16 bits.

More on quantization Sample Resolution/Sample Size: each sample can only be measured to a certain degree of accuracy. The accuracy depends on the number of bits used to represent the amplitude, which is also known as the sample resolution. How do we store each sample (quantized) value? 8-bit value (0-255) or 16-bit value (0-65535).

The amount of memory (in bits) required to store a sample t seconds long is as follows: 8-bit resolution, mono: memory = f*t*8*1 8-bit resolution, stereo: memory = f*t*8*2 16-bit resolution, mono: memory = f*t*16*1 16-bit resolution, stereo: memory = f*t*16*2 where f is the sampling frequency and t is the duration in seconds.
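
The formulas above translate directly into a small calculator (a sketch; the function name is our own):

```python
def audio_bytes(f, t, bits, channels):
    """Uncompressed PCM storage: f = sampling frequency (Hz), t = duration (s),
    bits = sample resolution, channels = 1 (mono) or 2 (stereo).
    Returns bytes (the slide's formulas give bits; divide by 8)."""
    return f * t * bits * channels // 8

# One minute of CD-quality audio: 44.1 kHz, 16-bit, stereo
print(audio_bytes(44100, 60, 16, 2))   # 10584000 bytes, roughly 10 MB
```

The same call with telephone parameters (8 kHz, 8-bit, mono) gives 8,000 bytes per second, which is why uncompressed speech is so much cheaper to store than music.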

Implications of Sample Rate and Bit Size Affects quality of audio. Affects size of data. Clipping: both analog and digital media have an upper limit beyond which they can no longer accurately represent amplitude. Analog clipping varies in quality depending on the medium.

Digitize audio Each sample is quantized, i.e., rounded: e.g., 2^8 = 256 possible quantized values. Each quantized value is represented by bits: 8 bits for 256 values. Example: 8,000 samples/sec, 256 quantized values --> 64,000 bps. The receiver converts it back to an analog signal, with some quality reduction. Example rates CD: 1.411 Mbps MP3: 96, 128, 160 kbps Internet telephony: 5.3 - 13 kbps Think about the number of bits required to represent these rates.
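
The example rates follow from one multiplication (a sketch; the helper name is our own):

```python
def pcm_bit_rate(sample_rate, bits_per_sample, channels=1):
    """Raw (uncompressed) PCM bit rate in bits per second."""
    return sample_rate * bits_per_sample * channels

print(pcm_bit_rate(8000, 8))                # 64000   -> the 64,000 bps example
print(pcm_bit_rate(44100, 16, channels=2))  # 1411200 -> CD's 1.411 Mbps
```

MP3 and Internet-telephony rates are much lower than these raw figures because they are compressed formats, not straight PCM.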

Audio Quality vs. Data Rate

More on Quantization Quantization is lossy! Round-off errors lead to quantization noise/error.

Sample values (e.g., A=3, B=1, C=3, D=1) are converted into binary according to the sample resolution (e.g., A=3 becomes 011 if three bits per sample are used).

Quantization Noise Quantization noise: the difference between the actual value of the analog signal, at the particular sampling time, and the nearest quantization interval value. This error can be at most half of the interval. The quality of the quantization is characterized by the Signal to Quantization Noise Ratio (SQNR), a special case of SNR (Signal to Noise Ratio).
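
SQNR can be estimated empirically. A minimal sketch, assuming a signal normalized to [-1, 1) and uniform quantization; a common rule of thumb is SQNR ~ 6.02*bits + 1.76 dB for a full-scale sine, i.e., each extra bit buys about 6 dB.

```python
import numpy as np

def sqnr_db(signal, bits):
    """Empirical SQNR (dB) after uniform quantization of a signal in [-1, 1)."""
    step = 2.0 / (2 ** bits)                    # quantization interval
    quantized = np.round(signal / step) * step  # round to nearest level
    noise = signal - quantized                  # at most step/2 per sample
    return 10 * np.log10(np.mean(signal ** 2) / np.mean(noise ** 2))

# Near-full-scale test sine (997 Hz at 48 kHz, arbitrary illustrative choices):
t = np.arange(48000) / 48000
sine = 0.99 * np.sin(2 * np.pi * 997 * t)
# sqnr_db(sine, 8) is about 50 dB; sqnr_db(sine, 16) is about 98 dB
```

This is why 16-bit CD audio sounds dramatically cleaner than 8-bit telephone audio: roughly 48 dB more headroom between signal and quantization noise.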

Common sound levels

Audio File Format: .WAV Microsoft format: Interleaved multi-channel samples http://ccrma.stanford.edu/courses/422/projects/WaveFormat/
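
Python's standard-library wave module handles the RIFF/WAVE header and sample interleaving, so a WAV file can be written and inspected in a few lines (a sketch; the tone parameters and temporary file are our own choices):

```python
import math
import struct
import tempfile
import wave

rate = 8000
path = tempfile.NamedTemporaryFile(suffix=".wav", delete=False).name

# Write one second of a 440 Hz tone as an 8 kHz, 16-bit mono WAV file.
with wave.open(path, "wb") as w:
    w.setnchannels(1)        # mono
    w.setsampwidth(2)        # 2 bytes = 16-bit samples
    w.setframerate(rate)
    frames = b"".join(
        struct.pack("<h", int(20000 * math.sin(2 * math.pi * 440 * n / rate)))
        for n in range(rate))
    w.writeframes(frames)

# Read the header back: (channels, bytes per sample, rate, frame count).
with wave.open(path, "rb") as w:
    info = (w.getnchannels(), w.getsampwidth(), w.getframerate(), w.getnframes())
```

For a stereo file the samples would be interleaved left/right within each frame, matching the "interleaved multi-channel samples" layout described above.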

Audio File Format: MIDI MIDI: Musical Instrument Digital Interface A simple scripting language and hardware setup MIDI Overview MIDI codes "events" that stand for the production of sounds. E.g., a MIDI event might include values for the pitch of a single note, its duration, and its volume. MIDI is a standard adopted by the electronic music industry for controlling devices, such as synthesizers and sound cards, that produce music. Supported by most sound cards

Computer vs. Ear Multimedia signals are interpreted by humans! Need to understand human perception Almost all original multimedia signals are analog signals: A/D conversion is needed for computer processing

Properties of HAS: Human Auditory System Range of human hearing: 20 Hz - 20 kHz  Minimal sampling rate for music: 40 kHz (Nyquist frequency) CD Audio: 44.1 kHz sampling rate; each sample is represented by a 16-bit signed integer; 2 channels are used to produce stereo sound 44100 * 16 * 2 = 1,411,200 bits/second (bps) Speech signal: 300 Hz – 4 kHz  Minimum sampling rate is 8 kHz (as in the telephone system) The extremes of the human voice http://www.noiseaddicts.com/2009/04/extremes-of-human-voice/

Properties of Human Auditory System Hearing threshold varies dramatically at different frequencies Most sensitive around 2KHz

Properties of Human Auditory System Critical Bands: Our brains perceive sounds through 25 distinct critical bands. The bandwidth grows with frequency (above 500 Hz): at 100 Hz the bandwidth is about 160 Hz; at 10 kHz it is about 2.5 kHz wide.

Properties of Human Auditory System Masking effect: what we hear depends on the audio environment we are in. One strong signal can overwhelm/hide another. Masking in the frequency domain: a masker inhibits perception of coexisting signals below the masking threshold.

Properties of Human Auditory System Masking thresholds in the time domain: Simultaneous masking: two sounds occur simultaneously and one is masked by the other. Backward masking (pre-): a softer sound that occurs just prior to a loud one will be masked by the louder sound. Forward masking (post-): softer sounds that occur up to 200 milliseconds after the loud sound will also be masked.

HAS: Audio Filtering Prior to sampling and AD (analog-to-digital) conversion, the audio signal is usually filtered to remove unwanted frequencies. For speech, typically 50 Hz to 10 kHz is retained; other frequencies are blocked by a band-pass filter that screens out lower and higher frequencies. An audio music signal will typically contain from about 20 Hz up to 20 kHz. At the DA converter end, high frequencies may reappear in the output. Why? Because after sampling and quantization, the smooth input signal is replaced by a series of step functions containing all possible frequencies. So on the decoder side, a low-pass filter is used after the DA circuit.
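
The smoothing role of that low-pass filter can be illustrated with the crudest possible example: a moving average, which attenuates high frequencies while passing low ones (a toy sketch; a real DA stage uses a properly designed reconstruction filter, and the frequencies below are arbitrary choices).

```python
import numpy as np

def moving_average(x, taps=9):
    """Crude low-pass filter: average each sample with its neighbors."""
    return np.convolve(x, np.ones(taps) / taps, mode="same")

fs = 8000
t = np.arange(fs) / fs
low = np.sin(2 * np.pi * 100 * t)       # in-band component
high = np.sin(2 * np.pi * 3500 * t)     # near-Nyquist component
filtered = moving_average(low + high)

# The 3500 Hz component is strongly attenuated, while the
# 100 Hz component passes almost unchanged, so the output
# closely tracks the smooth low-frequency signal.
```

The step-function artifacts mentioned above behave like the high-frequency component here: they are what the post-DA low-pass filter removes.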

HAS: Perceptual audio coding The HAS properties can be exploited in audio coding: Different quantizations for different critical bands (subband coding). If you can't hear the sound, don't encode it. Discard a weaker signal if a stronger one exists in the same band (frequency-domain masking). Discard soft sounds shortly after a loud sound (time-domain masking). Stereo redundancy: at low frequencies we can't detect where the sound is coming from, so encode it in mono.
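
A toy sketch of the frequency-domain masking idea: within each band, discard spectral components far below the strongest one. The band size and 30 dB threshold are arbitrary illustrative choices, not a real psychoacoustic model like those used in MP3 encoders.

```python
import numpy as np

def crude_mask(spectrum, band_size=32, threshold_db=30):
    """Zero out spectral lines more than threshold_db below the
    strongest line in their band ("if you can't hear it, don't encode it")."""
    out = spectrum.copy()
    for start in range(0, len(out), band_size):
        band = out[start:start + band_size]               # view into out
        limit = np.max(np.abs(band)) * 10 ** (-threshold_db / 20)
        band[np.abs(band) < limit] = 0
    return out

# A loud 1000 Hz masker plus a soft tone 60 dB down in the same band:
fs = 8000
t = np.arange(1024) / fs
loud = np.sin(2 * np.pi * 1000 * t)              # lands on FFT bin 128
soft = 0.001 * np.sin(2 * np.pi * 1093.75 * t)   # lands on FFT bin 140
spec = np.fft.rfft(loud + soft)
masked = crude_mask(spec)   # the soft tone's bin is discarded, the masker kept
```

A real coder would then spend its bit budget only on the surviving components, which is the core of the savings in perceptual audio coding.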