Presentation on theme: "Ch 3 : Multimedia Compression Science and Technology Faculty Informatics Arini, ST, MT Com"— Presentation transcript:
Ch 3 : Multimedia Compression Science and Technology Faculty Informatics Arini, ST, MT arinizul@gmail. Com email@example.com
MULTIMEDIA COMPRESSION I.Fundamental of Data Compression II.Why need Compression III.What is Compression IV.Data Compression Scheme V.Principles of Data Compression VI.Classification of Data Compression
I. Fundamentals of Data Compression Motivation : huge data volumes Text 1 page with 80 characters/line and 64 lines/page and 1 byte/char results in 80 * 64 * 1 * 8 = 40 kbit/page Image 24 bits/pixel, 512 x 512 pixel/image results in 512 x 512 x 24 = 8 Mbit/image Audio CD quality, sampling rate 44,1 KHz, 16 bits per sample results in 44,1 x 16 = 706 kbit/s stereo: 1,412 Mbit/s Video Full-size frame 1024 x 768 pixel/frame, 24 bits/pixel, 30 frames/s results in 1024 x 768 x 24 x 30 = 566 Mbit/s. More realistic: 360 x 240 pixel/frame, 360 x 240 x 24 x 30 = 60 Mbit/s
II. Why need Compression Economy : Storage Transmission Bandwidth, Time, Delay Cost requirement
III. What is Compression Compression is a way to reduce the number of bits in a frame but retaining its meaning. Technique is to identify redundancy and to eliminate it Compression refers to a process of encoding information using fewer than the original representation by using specific encoding schemes. Coding is just some kind of mapping (maps a symbol/message (a number of consecutive symbols in X) to another message (a number of consecutive codewords in Y).
Data compression scheme can be broadly divided into two phases Modelling data about redundancy is analysed & represented as a model (transforming the original data from one form or representation to another) Coding the difference between the actual data and the model is coded (codeword) IV. Data Compression Scheme
2 Techniques : – Static/non adaptive : same codeword for all model data – Dynamic/adaptive : different codeword 4.1. Coding Technique
V. Principles of Data Compression 1. Lossless Compression The original object can be reconstructed perfectly Compression rates of 2:1 to 50:1 are typical 2. Lossy Compression There is a difference between the original object and the reconstructed object. Physiological and psychological properties of the ear and eye can be taken into account Higher compression rates are possible than with lossless compression (typically up to 100:1)
6.1.2. Run Length Coding Principle Replace all repetitions of the same symbol in the text (”runs“) by a repetition counter and the symbol. Example Text: AAAABBBAABBBBBCCCCCCCCDABCBAABBBBCCD Encoding: 4A3B2A5B8C1D1A1B1C1B2A4B2C1D As we can see, we can only expect a good compression rate when long runs occur frequently. Examples are long runs of blanks in text documents, leading zeroes in numbers or strings of „white“ in gray-scale images.
6.1.3. Variable Length Coding Classical character codes use the same number of bits for each character. When the frequency of occurrence is different for different characters, we can use fewer bits for frequent characters and more bits for rare characters. Example Code 1: A B C D E... 1 2 3 4 5 (binary) Encoding of ABRACADABRA with constant bit length (= 5 Bits): – 00001000101001000001000110000100100000010 0010 1001000001
6.1.4. Huffman Code Now the question arises how we can find the best variable-length code for given character frequencies (or probabilities). The algorithm that solves this problem was found by David Huffman in 1952. Algorithm Generate-Huffman-Code Determine the frequencies of the characters and mark the leaf nodes of a binary tree (to be built) with them. 1. Out of the tree nodes not yet marked as DONE, take the two with the smallest frequencies and compute their sum. 2. Create a parent node for them and mark it with the sum. Mark the branch to the left son with 0, the one to the right son with 1. 3. Mark the two son nodes as DONE. When there is only one node not yet marked as DONE, stop (the tree is complete). Otherwise, continue with step 2.
Example : Static Huffman Coding MAMA SAYA A = 4 4/8 = 0.5 M = 2 2/8 = 0.25 S = 1 1/8 = 0.125 Y = 1 1/8 = 0.125 Total = 8 Characters
Huffman Tree : w(A) = 1, w(M) = 00, w(S) = 010, dan w(Y) = 011 MAMA SAYA = 00100101010111
6.1.5. Lempel-Ziv Code Lempel-Ziv codes are an example of the large group of dictionary-based codes. Dictionary: A table of character strings which is used in the encoding process. Example : The word “lecture“ is found on page x4, line y4 of the dictionary. It can thus be encoded as (x4,y4). A sentence such as „this is a lecture“ could perhaps be encoded as a sequence of tuples (x1,y1) (x2,y2) (x3,y3) (x4,y4).
6.1.5. LZW at GIF Encoding process search the same sequence of pixel into image. The highest pattern occurrence will represent these sequence of image compression
6.3. Video Compression 1. MPEG MPEG stands for Moving Picture Experts Group (a committee of ISO). Goal of MPEG-1: compress a video signal (with audio) to a data stream of 1.5 Mbit/s, the data rate of a T1 link in the U.S. and the rate that can be streamed from a CDROM. Random access within 0.5 s while maintaining a good image quality for the video Fast forward / fast rewind Possibility to play the video backwards Allow easy and precise editing
2. H.261 Very widely used in practice, many products available in the market from many manufacturers. Has replaced earlier proprietary standards for video telephony. Currently being replaced by newer versions, such as H.263 and H.264. Pure software implementations are available as well as stand- alone hardware solutions (“black boxes“) and combined solutions, mainly for the PC.
6.4. Audio Compression Pulse Code Modulation Audio signals are analog waves. The acoustic perception is determined by the frequency (pitch) and the amplitude (loudness). For storage, processing and transmission in the computer audio signals must converted into a digital representation. The classical way to do that is called pulse code modulation (PCM). It consists of three steps: sampling, quantization and coding. Sampling The analog signal is sampled periodically. At each sampling interval the analog value of the signal (e.g., the voltage level) is recorded as a real number. After sampling the signal is no longer continuous but discrete in the temporal dimension.
Sampling Theorem of Nyquist In order to reconstruct the original analog signal without loss we obviously need a minimum sampling frequency. The minimum sampling frequency fA is given by the sampling theorem of Nyquist (1924): For noise-free channels the sampling frequency fA must be twice as high as the highest frequency occurring in the signal: –fA = 2 fS
Quantization (Linear – Non Linear) The range of values occurring in the analog signal is subdivided into a fixed number of discrete intervals. Since all analog values contained in an interval will be mapped tothe same interval number we introduce a quantization error. If the size of the quantization interval is a then the maximum quantization error is a/2. Binary Coding We now have to determine a unique binary representation for each quantization interval. In principle any binary code does the job. The simplest code (which is in fact often used in practice) is to encode each interval with a fixed- size binary number.
Non-Linear Quantization With linear quantization all intervals have the same size, they do not depend on the amplitude of the signal. However it would be desirable to have a smaller amount of quantization noise at small amplitude levels because quantization noise is more disturbing in “quiet times“. This goal can be reached with non-linear quantization. We simply chose larger quantization intervals at higher amplitude values. Technically this can be done by a “signal compressor“ which preceeds the coding step. At the receiver side an expander is used to reconstruct the original dynamics. Many compressors use a logarithmic mapping. In digital electronics this is often approximated by a piecewise-linear curve. The 13-segment compressor curve is a typical example.
6.5. Speech Coding Special codecs optimized for the human voice can reach a very high speech quality at very low data rates.They operate at the normal range of the voice, i.e. at 300 – 3400 Hz. Such special codecs are most often based on Linear Predictive Coding (LPC). G.723.1 Adaptive CELP Encoder (Code Excited Linear Predictor). Bit rate for G.723.1: 5,3 kbit/s - 6.3 kbit/s GSM 06.10 Regular Pulse Excitation – Long Term Prediction (RPE-LTP) LPC encoding The synthetically generated signal is based on earlier signal values. Bit rate for GSM 06.10: 13.2 kbit/s