Submitted By: Santosh Kumar Yadav (111432) M.E. Modular(2011) Under the Supervision of: Mrs. Shano Solanki Assistant Professor, C.S.E NITTTR, Chandigarh.

Presentation on Audio Compression and Psychoacoustics

Content
History of Audio Compression; Basics of Audio Compression; Categorization of Audio Compression; Silence Compression; ADPCM; LPC; CELP; Psychoacoustics; Frequency Masking; Critical Band; Temporal Masking

History of Audio Compression
The first form of audio compression appeared in 1939, when Dudley introduced the vocoder to reduce the bandwidth needed to transmit speech over a telephone line. In the 1960s, compression was used in telephony. Nowadays, various compression techniques are used in storage devices and with various file formats.

Basics of Audio Compression
Compression can be accomplished in two ways: a. Take the data from a standard digital audio system and compress it in software. b. Encode the signal in a different system and compress it in hardware. The sounds we hear are caused by variations in air pressure, which are picked up by our ears. In an analog electronic audio system, these pressure variations are converted to an electrical voltage by a microphone.

[Figure: Voice Pattern]

[Figure: Quantization of Voice Signal]

[Figure: Signal Reconstruction]

Categorization of Audio Compression
Simple audio compression: 1. Silence Compression using RLE. 2. Adaptive Differential PCM. 3. Linear Predictive Coding. 4. Code Excited Linear Prediction.
Psychoacoustics: 1. Frequency Masking. 2. Temporal Masking.
MPEG Audio Compression.

Silence Compression using RLE
Run-length encoding (RLE) is itself lossless and easy to implement, but treating low-level samples as silence discards information, so the overall scheme is lossy. Each run of silence is replaced by a code and the number of consecutive silent samples.
Steps: a. Determine a threshold for the audio data. b. Any sample whose level is below the threshold is considered silence. c. Each run of silence is replaced by a code (e.g. "0") followed by the run length.
The higher the threshold, the greater the compression and the greater the loss of information. Silence encoding is important for human speech because it has flat pauses between spoken words.
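The steps above can be sketched in Python as follows. This is a minimal illustration, not a standard codec: the function names, the tuple format `(SILENCE, run_length)`, and the sample values are assumptions made for the example.

```python
SILENCE = 0  # code standing in for a silent sample (hypothetical choice)

def compress_silence(samples, threshold):
    """Replace each run of below-threshold samples with (SILENCE, run_length)."""
    out = []
    run = 0
    for s in samples:
        if abs(s) < threshold:
            run += 1                      # extend the current silent run
        else:
            if run:
                out.append((SILENCE, run))
                run = 0
            out.append(s)                 # loud samples pass through unchanged
    if run:
        out.append((SILENCE, run))        # flush a trailing silent run
    return out

def expand_silence(encoded):
    """Rebuild a sample list; silent runs come back as exact zeros."""
    out = []
    for item in encoded:
        if isinstance(item, tuple):
            out.extend([SILENCE] * item[1])
        else:
            out.append(item)
    return out

samples = [12, -9, 1, 0, -1, 2, 0, 15, -14, 0, 1]
encoded = compress_silence(samples, threshold=3)
decoded = expand_silence(encoded)
```

The decoded signal has the original length, but the quiet samples come back as zeros, which is where the loss of information appears.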

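Adaptive Differential PCM
Used for quantization of the audio signal. The scaled difference signal $f_n$ is defined as $f_n = e_n / \alpha$, where $e_n = s_n - \hat{s}_n$ is the difference between the actual signal $s_n$ and the predicted signal $\hat{s}_n$, and $\alpha$ is a multiplier constant. $f_n$ is fed into the quantizer for quantization.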
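A rough Python sketch of this idea follows. It is not G.726 or any standardized ADPCM: the predictor (the previous reconstructed sample), the 4-bit code range, and the rule for adapting α in `_adapt` are all assumptions chosen to keep the example short.

```python
def _adapt(alpha, code, levels):
    # Hypothetical adaptation rule: widen the step after large codes,
    # narrow it (down to a floor of 1.0) after small ones.
    if abs(code) >= levels // 2:
        return alpha * 1.5
    return max(1.0, alpha * 0.9)

def adpcm_encode(samples, alpha=4.0, levels=8):
    """Quantize the scaled difference f_n = e_n / alpha to codes in [-levels, levels - 1]."""
    codes = []
    predicted = 0.0
    for s in samples:
        e = s - predicted                          # e_n: actual minus predicted
        f = e / alpha                              # f_n: scaled difference
        code = max(-levels, min(levels - 1, round(f)))
        codes.append(code)
        predicted += code * alpha                  # same reconstruction the decoder performs
        alpha = _adapt(alpha, code, levels)
    return codes

def adpcm_decode(codes, alpha=4.0, levels=8):
    """Rebuild samples by accumulating the de-scaled codes."""
    out = []
    predicted = 0.0
    for code in codes:
        predicted += code * alpha
        out.append(predicted)
        alpha = _adapt(alpha, code, levels)
    return out

samples = [0, 10, 25, 40, 38, 20, 5, 0]
codes = adpcm_encode(samples)
decoded = adpcm_decode(codes)
```

Because encoder and decoder adapt α from the transmitted codes alone, α never has to be sent.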

Vocoders
Vocoder stands for voice coder. Vocoders are used in Linear Predictive Coding. They separate the signal into frequency ranges using sub-band filters. Sound is classified as voiced (pulse-like, e.g. vowels) or unvoiced (noise-like, e.g. consonants). Consonants such as M and N can be treated as voiced because they use the vocal cords.

Working of a Vocoder
The pitch period of the voice is determined. A voiced/unvoiced bit is set for voiced sound and reset for unvoiced sound. The frequency content of the sound is separated by a bank of filters. The signal is transmitted to the receiver and decoded there.

Linear Predictive Coding
LPC vocoders extract salient features of speech directly from the waveform rather than transforming the signal to the frequency domain. The bit rate is low because the sound itself is not sent; only its analyzed attributes are. These attributes, or description parameters, include the gain, maximum and minimum amplitude, etc. The sound signal is divided into segments (speech frames) of samples.

Linear Predictive Coding
LPC decides whether the current segment is voiced or unvoiced. For unvoiced segments, a noise generator is used to create the sample values f(n). For voiced segments, a pulse train generator is used to create the sample values f(n). The output is $s(n) = \sum_{i=1}^{p} a_i \, s(n-i) + G \, f(n)$, where s(n) is the current output, s(n-i) are the previous outputs, $a_i$ are the predictor coefficients, p is the prediction order, G is the gain factor, and f(n) is the current frame input. It is called linear because each output is a linear combination of previous outputs and the current input. The speech encoder works in a block-wise fashion. Advantage: simple and easy to implement. Disadvantage: the generated output has a larger error.
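The decoder side of this scheme can be sketched as below. This is a simplified illustration under stated assumptions: the coefficient list `coeffs`, the fixed `pitch_period`, and the uniform noise source are hypothetical choices, and a real LPC coder would estimate these parameters per speech frame.

```python
import random

def lpc_synthesize(coeffs, gain, voiced, pitch_period, n_samples, seed=0):
    """Generate s(n) = sum_i coeffs[i-1] * s(n-i) + gain * f(n) for one frame."""
    rng = random.Random(seed)
    history = [0.0] * len(coeffs)       # s(n-1), s(n-2), ..., s(n-p)
    out = []
    for n in range(n_samples):
        if voiced:
            # Pulse train: one unit pulse at the start of every pitch period.
            f = 1.0 if n % pitch_period == 0 else 0.0
        else:
            # Noise generator: uniform noise stands in for the unvoiced source.
            f = rng.uniform(-1.0, 1.0)
        s = sum(a * h for a, h in zip(coeffs, history)) + gain * f
        out.append(s)
        history = [s] + history[:-1]    # shift the newest output into the history
    return out

voiced_frame = lpc_synthesize([0.5], gain=2.0, voiced=True, pitch_period=4, n_samples=5)
unvoiced_frame = lpc_synthesize([0.5], gain=1.0, voiced=False, pitch_period=4, n_samples=5)
```

Only the coefficients, gain, pitch period, and voiced/unvoiced flag need to be transmitted per frame, which is why the bit rate stays low.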

Code Excited Linear Prediction
CELP is more complex than LPC. It keeps a codebook of excitation vectors against which the actual speech is matched, and the index of the best match is sent to the receiver. This complexity increases the bit rate to 4800-9600 bps. CELP coders use two kinds of prediction: A. STP (short time prediction): predicts the next sample from several previous ones, removing redundancy within a speech frame. B. LTP (long time prediction): finds the basic periodicity, or pitch, that makes the waveform more or less repeat, and removes that redundancy. Advantage: it reproduces the original sound closely. Disadvantage: it is complex and requires more bandwidth.
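The codebook matching step can be illustrated with a small Python sketch. It is deliberately simplified: real CELP coders compare candidates after passing them through the synthesis filters with perceptual weighting, whereas this example compares vectors directly by squared error. The codebook contents and function name are made up for the illustration.

```python
def best_codebook_index(codebook, target):
    """Return the index of the excitation vector closest to target (least squared error)."""
    def sq_error(vec):
        return sum((v - t) ** 2 for v, t in zip(vec, target))
    return min(range(len(codebook)), key=lambda i: sq_error(codebook[i]))

# Hypothetical 4-entry codebook of 4-sample excitation vectors.
codebook = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, -1.0, 0.0],
    [0.5, 0.5, 0.5, 0.5],
    [1.0, -1.0, 1.0, -1.0],
]
target = [0.9, 0.1, -0.8, 0.0]
index = best_codebook_index(codebook, target)   # only this index is transmitted
```

The receiver holds the same codebook, so the index alone is enough to recover the chosen excitation vector.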

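Psychoacoustics
Compression based on psychoacoustic modeling is referred to as perceptual coding. Range of human hearing: about 20 Hz to 20 kHz. Most audible range: about 500 Hz to 4 kHz. The dynamic range of human hearing, the ratio of the maximum sound amplitude to the quietest sound humans can hear, is about 120 dB.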
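To give a feel for the 120 dB figure, the short sketch below converts between decibels and amplitude ratios using the usual 20 log10 convention for amplitude (pressure) ratios. The function names are chosen only for this example.

```python
import math

def amplitude_ratio_to_db(ratio):
    """Convert an amplitude (pressure) ratio to decibels."""
    return 20.0 * math.log10(ratio)

def db_to_amplitude_ratio(db):
    """Convert decibels back to an amplitude ratio."""
    return 10.0 ** (db / 20.0)

ratio_for_120_db = db_to_amplitude_ratio(120.0)
```

Under this convention, 120 dB corresponds to an amplitude ratio of about one million to one.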

[Figure: Equal Loudness Relation]

Frequency Masking
Threshold of Hearing: [figure]
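The slide shows the threshold curve only as a figure. As a rough sketch, a widely quoted closed-form approximation of the threshold in quiet (attributed to Terhardt) can be evaluated as below; using this particular formula is an assumption of the example, since the slides do not say which curve is plotted.

```python
import math

def threshold_in_quiet_db(freq_hz):
    """Approximate threshold of hearing in dB SPL at freq_hz (Terhardt-style formula)."""
    k = freq_hz / 1000.0                # frequency in kHz
    return (3.64 * k ** -0.8
            - 6.5 * math.exp(-0.6 * (k - 3.3) ** 2)
            + 1e-3 * k ** 4)

low = threshold_in_quiet_db(100.0)      # the ear is insensitive at low frequencies
mid = threshold_in_quiet_db(3300.0)     # most sensitive region, near 3-4 kHz
high = threshold_in_quiet_db(15000.0)   # sensitivity falls off again at the top
```

A tone whose level lies below this curve at its frequency is inaudible even with no masker present, so a perceptual coder need not spend bits on it.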

Frequency Masking Curve
The greater the power in the masking tone, the wider its influence: the broader the range of frequencies it can mask. If two tones are widely separated in frequency, little masking occurs.

[Figure: Multiple Frequency Tone Masking]

Critical Band
The critical band represents the ear's resolving power for simultaneous tones or partials.

Bark Unit
One Bark is the width of one critical band. The unit is named after Heinrich Barkhausen.
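A frequency in hertz can be mapped to the Bark scale with a closed-form approximation. The sketch below uses the commonly cited Zwicker and Terhardt formula; choosing this formula is an assumption of the example, as the slides do not give one.

```python
import math

def hz_to_bark(freq_hz):
    """Approximate critical-band rate in Bark for a frequency in Hz."""
    return (13.0 * math.atan(0.00076 * freq_hz)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))

barks = [hz_to_bark(f) for f in (100.0, 500.0, 1000.0, 4000.0, 15000.0)]
```

Equal steps on the Bark scale correspond to roughly equal steps in the ear's frequency resolution, so the bands are narrow at low frequencies and wide at high frequencies.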

Temporal Masking
The louder the test tone, the shorter the amount of time required before the test tone is audible once the masking tone is removed.

Summary
Basics of audio compression. Types of audio compression. Fundamentals of psychoacoustics.

FAQs
Why is linear predictive coding called linear? What is the significance of the equal-loudness curve? How can RLE be applied to audio? What is the role of the noise generator and the pulse generator in a vocoder?

References
Fundamentals of Multimedia by Li & Drew. http://www.cs.cf.ac.uk http://www.cs.sfu.ca/CourseCentral/365.html

Queries?

Thank You For Your Patience
