Information Theory EE322 Al-Sanie.

Claude Shannon: April 30, 1916 - February 24, 2001
Information theory was introduced by Claude Shannon in "A Mathematical Theory of Communication" (1948).

Two foci: a) data compression and b) reliable communication through noisy channels
What is the irreducible complexity below which a signal cannot be compressed? Entropy. What is the ultimate transmission rate for reliable communication over a noisy channel? Channel capacity.

Amount of Information
Before the event occurs, there is an amount of uncertainty. When the event occurs, there is an amount of surprise. After the event has occurred, there is a gain in the amount of information, the essence of which may be viewed as the resolution of uncertainty. The amount of information is related to the inverse of the probability of occurrence.

Uncertainty Surprise Probability Information
The amount of information is related to the inverse of the probability of occurrence: the less probable an event, the more information its occurrence conveys. The base of the logarithm used to measure it is arbitrary; standard practice today is to use base 2, which gives information in bits.

Discrete Source
The discrete source is a source that emits symbols from a finite alphabet $\mathcal{S} = \{s_0, s_1, \ldots, s_{K-1}\}$. The source output is modeled as a discrete random variable $S$ that takes one symbol of the alphabet with probabilities $P(S = s_k) = p_k$, $k = 0, 1, \ldots, K-1$. Of course, this set of probabilities must satisfy the condition $\sum_{k=0}^{K-1} p_k = 1$.

Example of Discrete Source
Analog source → Sampler → Quantizer

Discrete memoryless source: a source for which the symbols emitted during successive signaling intervals are statistically independent. We define the amount of information gained after observing the event $S = s_k$, which occurs with probability $p_k$, as $I(s_k) = \log_2(1/p_k) = -\log_2 p_k$ bits.

Properties of amount of information:
1. $I(s_k) = 0$ for $p_k = 1$: if we are absolutely certain of the outcome of an event, even before it occurs, there is no information gained.
2. $I(s_k) \ge 0$ for $0 \le p_k \le 1$: the event yields a gain of information (or no information) but never a loss of information.
3. $I(s_k) > I(s_i)$ for $p_k < p_i$: the event with lower probability of occurrence has the higher information.
4. $I(s_k s_l) = I(s_k) + I(s_l)$ for statistically independent events $s_k$ and $s_l$.

Entropy (The average information content per source symbol)
Consider a discrete memoryless source that emits symbols from a finite alphabet $\mathcal{S} = \{s_0, s_1, \ldots, s_{K-1}\}$. The amount of information $I(s_k)$ is a discrete random variable that takes on the values $I(s_0), I(s_1), \ldots, I(s_{K-1})$ with probabilities $p_0, p_1, \ldots, p_{K-1}$, respectively.

The mean of I(sk) over the source alphabet is given by
$H = \sum_{k=0}^{K-1} p_k I(s_k) = \sum_{k=0}^{K-1} p_k \log_2(1/p_k)$ bits/symbol. $H$ is called the entropy of a discrete memoryless source. It is a measure of the average information content per source symbol.

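Example
A source emits one of four symbols $s_0$, $s_1$, $s_2$, and $s_3$ with probabilities 1/2, 1/4, 1/8, and 1/8, respectively. The successive symbols emitted by the source are statistically independent.
$p_0 = 1/2$, $p_1 = 1/4$, $p_2 = 1/8$, $p_3 = 1/8$
$I(s_0) = 1$, $I(s_1) = 2$, $I(s_2) = 3$, $I(s_3) = 3$ bits
$H = (1/2)(1) + (1/4)(2) + (1/8)(3) + (1/8)(3) = 1.75$ bits/symbol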
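
The numbers in this example can be checked with a few lines of Python. This is only a minimal sketch: the function names `self_information` and `entropy` are illustrative choices, not part of the original slides.

```python
from math import log2

def self_information(p):
    """Information in bits gained by observing an event of probability p."""
    return log2(1.0 / p)

def entropy(probs):
    """Average information per symbol in bits; zero-probability symbols add nothing."""
    return sum(p * self_information(p) for p in probs if p > 0)

# Four-symbol source from the example: s0, s1, s2, s3
probs = [1/2, 1/4, 1/8, 1/8]

info = [self_information(p) for p in probs]
H = entropy(probs)

print(info)  # [1.0, 2.0, 3.0, 3.0]
print(H)     # 1.75
```

The rarest symbols carry the most information (3 bits each), but because they occur seldom, the average comes out at 1.75 bits per symbol.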

Properties of the entropy
$0 \le H \le \log_2 K$, where $K$ is the number of symbols. $H = 0$ if $p_k = 1$ for some $k$ and $p_i = 0$ for all $i \ne k$ (no uncertainty). $H = \log_2 K$ if $p_k = 1/K$ for all $k$: the uncertainty is maximum when all symbols occur with the same probability.

Example: Entropy of Binary Memoryless Source
Consider a binary source for which symbol 0 occurs with probability $p_0$ and symbol 1 with probability $p_1 = 1 - p_0$. Its entropy is $H = -p_0 \log_2 p_0 - (1 - p_0) \log_2 (1 - p_0)$ bits.

Entropy function H(p0) of binary source.
$H = 0$ when $p_0 = 0$; $H = 0$ when $p_0 = 1$; $H = 1$ when $p_0 = 0.5$ (equally likely symbols).
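
The three values listed above can be reproduced numerically. The sketch below assumes the usual binary entropy expression $H(p_0) = -p_0 \log_2 p_0 - (1 - p_0) \log_2 (1 - p_0)$, with the convention that $0 \log_2 0 = 0$; the function name `binary_entropy` is an illustrative choice.

```python
from math import log2

def binary_entropy(p0):
    """Entropy in bits of a binary memoryless source with P(symbol 0) = p0."""
    if p0 in (0.0, 1.0):
        return 0.0  # by convention, 0 * log2(0) is taken as 0
    p1 = 1.0 - p0
    return -p0 * log2(p0) - p1 * log2(p1)

# Tabulate a few points of the entropy function
for p0 in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(f"p0 = {p0:.1f}  H = {binary_entropy(p0):.4f} bits")
```

The function is symmetric about $p_0 = 0.5$, where it reaches its maximum of 1 bit.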

Extension of Discrete Memoryless Source
It is useful to consider blocks rather than individual symbols, with each block consisting of $n$ successive symbols. We may view each block as being produced by an extended source with $K^n$ symbols, where $K$ is the number of distinct symbols in the alphabet of the original source. The entropy of the extended source is $H(S^n) = n H(S)$.
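
The relation between the entropy of the extended source and that of the original source, $H(S^n) = n H(S)$, can be checked by brute force. The sketch below uses a hypothetical three-symbol source with probabilities 1/2, 1/4, 1/4 (chosen here for illustration, not taken from the slides) and builds the block probabilities as products, which is valid because the source is memoryless.

```python
from itertools import product
from math import log2, prod

def entropy(probs):
    """Entropy in bits per symbol of a discrete memoryless source."""
    return sum(p * log2(1.0 / p) for p in probs if p > 0)

def extend(probs, n):
    """Probabilities of all K**n blocks of n independent symbols."""
    return [prod(block) for block in product(probs, repeat=n)]

probs = [1/2, 1/4, 1/4]  # hypothetical source, K = 3
n = 2
block_probs = extend(probs, n)

print(len(block_probs))      # K**n = 9 blocks
print(entropy(probs))        # entropy of the original source
print(entropy(block_probs))  # entropy of the extended source, n times larger
```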

Example: Entropy of Extended source

The entropy of the extended source:

Source Coding
Source coding is an efficient representation of the symbols generated by the discrete source. The device that performs source coding is called the source encoder. A source code assigns short code words to frequent source symbols and long code words to rare source symbols.

Example: Source code (Huffman code) for English alphabet

The source encoder should satisfy the following:
1. The code words produced by the encoder are in binary form. 2. The source code is uniquely decodable, so that the original source symbols can be reconstructed from the encoded binary sequence.
(Block diagram: source symbol $s_k$ → source encoder → binary code word → source decoder.)

Discrete source → Source encoder → Modulator → Channel → Demodulator → Source decoder

Example
A source emits one of four symbols $s_0$, $s_1$, $s_2$, and $s_3$ with probabilities 1/2, 1/4, 1/8, and 1/8, respectively. The successive symbols emitted by the source are statistically independent. Consider the following source code for this source:
Symbol  Probability  Code word
s0      1/2          0
s1      1/4          10
s2      1/8          110
s3      1/8          111
The average code word length is $(1/2)(1) + (1/4)(2) + (1/8)(3) + (1/8)(3) = 1.75$ bits/symbol.

Compare the two source codes (I and II) for the previous source
Symbol  Probability  Code I  Code II
s0      1/2          0       00
s1      1/4          10      01
s2      1/8          110     10
s3      1/8          111     11
Suppose the source emits symbols at a rate of 1000 symbols/s. With code I, the average bit rate is 1000 × 1.75 = 1750 bits/s. With code II, the average bit rate is 1000 × 2 = 2000 bits/s.
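
The two bit rates follow directly from the average code word length of each code. The sketch below recomputes them; it takes the code word of s0 in code I to be `0`, which is what the stated average of 1.75 bits/symbol implies, and the helper name `average_length` is an illustrative choice.

```python
probs = {"s0": 1/2, "s1": 1/4, "s2": 1/8, "s3": 1/8}

# Code I is variable-length; the "0" for s0 is inferred from the 1.75 average.
code_I = {"s0": "0", "s1": "10", "s2": "110", "s3": "111"}
# Code II is fixed-length, two bits per symbol.
code_II = {"s0": "00", "s1": "01", "s2": "10", "s3": "11"}

def average_length(probs, code):
    """Average number of code bits per source symbol."""
    return sum(probs[s] * len(code[s]) for s in probs)

symbol_rate = 1000  # symbols per second, as in the example

rate_I = symbol_rate * average_length(probs, code_I)
rate_II = symbol_rate * average_length(probs, code_II)

print(rate_I)   # 1750.0 bits/s
print(rate_II)  # 2000.0 bits/s
```

Code I saves 250 bits/s because it spends a single bit on the symbol that occurs half of the time.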

What is the minimum value of $\bar{L}$?
Let the binary code word assigned to symbol $s_k$ by the encoder have length $l_k$, measured in bits. We define the average code-word length of the source encoder (the average number of bits per symbol) as $\bar{L} = \sum_{k=0}^{K-1} p_k l_k$. What is the minimum value of $\bar{L}$? The answer to this question is given by Shannon's first theorem, the source coding theorem.

Source Coding Theorem
Given a discrete memoryless source of entropy $H$, the average code-word length $\bar{L}$ for any distortionless source encoding scheme is bounded as $\bar{L} \ge H$. The minimum length is therefore $\bar{L}_{\min} = H$. The efficiency of the source encoder is $\eta = H / \bar{L}$.
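
As an illustration of the bound, the sketch below evaluates the efficiency, taken here as the ratio of entropy to average code-word length, for the four-symbol source with probabilities 1/2, 1/4, 1/8, 1/8. The two sets of code-word lengths are assumptions for illustration: (1, 2, 3, 3) for a variable-length code and (2, 2, 2, 2) for a fixed-length code.

```python
from math import log2

probs = [1/2, 1/4, 1/8, 1/8]
lengths_variable = [1, 2, 3, 3]  # assumed variable-length code
lengths_fixed = [2, 2, 2, 2]     # assumed fixed-length code

def entropy(probs):
    """Entropy H in bits per symbol."""
    return sum(p * log2(1.0 / p) for p in probs if p > 0)

def average_length(probs, lengths):
    """Average code-word length in bits per symbol."""
    return sum(p * l for p, l in zip(probs, lengths))

def efficiency(probs, lengths):
    """Ratio of entropy to average code-word length; at most 1 for a uniquely decodable code."""
    return entropy(probs) / average_length(probs, lengths)

eta_variable = efficiency(probs, lengths_variable)
eta_fixed = efficiency(probs, lengths_fixed)

print(eta_variable)  # 1.0
print(eta_fixed)     # 0.875
```

The variable-length code meets the bound with equality because every probability is a power of 1/2 and each length equals $\log_2(1/p_k)$.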

Example: The previous example

Uniquely Decodable Source Code
A code is said to be uniquely decodable (U.D.) if the original symbols can be recovered uniquely from sequences of encoded bits. The source code should be a uniquely decodable code.
(Block diagram: source symbol $s_k$ → source encoder → binary code word; binary sequence → source decoder.)

Example
Symbol  Code word
s0      00
s1      00
s2      11
This code is not UD because the symbols s0 and s1 have the same code word.
Symbol  Code word
s0      ?
s1      1
s2      11
This code is not UD: the sequence 111… can be decoded as 1 1 1 … → s1 s1 s1, or 1 11 … → s1 s2, or 11 1 … → s2 s1.
Symbol  Code word
s0      00
s1      01
s2      11
This code is a UD code.
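
The ambiguity in the second code can be made concrete by counting how many ways a bit string splits into code words. The sketch below is a small dynamic program written for this illustration; the name `count_parsings` is hypothetical, and the code word `0` for s0 in the ambiguous code is an assumption, since the argument only needs the code words 1 and 11.

```python
def count_parsings(bits, codewords):
    """Number of distinct ways `bits` can be split into a sequence of code words."""
    ways = [0] * (len(bits) + 1)
    ways[0] = 1  # the empty prefix has exactly one parsing
    for end in range(1, len(bits) + 1):
        for word in codewords:
            start = end - len(word)
            if start >= 0 and bits[start:end] == word:
                ways[end] += ways[start]
    return ways[len(bits)]

ambiguous = ["0", "1", "11"]    # s0 (assumed), s1, s2
decodable = ["00", "01", "11"]  # s0, s1, s2

print(count_parsings("111", ambiguous))     # 3
print(count_parsings("000111", decodable))  # 1
```

More than one parsing of any single string shows that a code is not UD. A count of one for a particular string is consistent with unique decodability but does not by itself prove it.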

Prefix-free Source Codes
A prefix-free code is a code in which no code word is a prefix of any other code word. Example:
Symbol  Prefix-free code  Not prefix-free code
s0      0                 0
s1      10                01
s2      110               011
s3      111               0111
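
The prefix condition is easy to test mechanically. The sketch below compares every pair of code words; the function name `is_prefix_free` is an illustrative choice, and the two code tables assume that s0 is assigned the code word `0` in both examples.

```python
def is_prefix_free(codewords):
    """True if no code word is a prefix of a different entry in the list."""
    n = len(codewords)
    return not any(
        i != j and codewords[j].startswith(codewords[i])
        for i in range(n)
        for j in range(n)
    )

prefix_free_code = ["0", "10", "110", "111"]
not_prefix_free_code = ["0", "01", "011", "0111"]

print(is_prefix_free(prefix_free_code))      # True
print(is_prefix_free(not_prefix_free_code))  # False
```

In the second code every code word is a prefix of the next longer one, so the check fails on the first pair it examines.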

UD is not necessarily prefix-free
A prefix-free code has the important property that it is always uniquely decodable, but the converse is not necessarily true: prefix-free → UD, whereas UD does not imply prefix-free. Example of a code that is UD but not prefix-free:
Symbol  Code word
s0      0
s1      01
s2      011
s3      0111

Prefix-free codes have the advantage of being instantaneously decodable, i.e., a symbol can be decoded as soon as the last bit of its code word is received. Example:
Symbol  Code word
s0      0
s1      10
s2      110
s3      111
The sequence 1011111000… is decoded as s1 s3 s2 s0 s0 …
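
Instantaneous decoding can be sketched as a loop that emits a symbol the moment the bits read so far match a code word. The code table below assumes s0 → 0, s1 → 10, s2 → 110, s3 → 111, and the input string is the bit sequence that those code words produce for s1 s3 s2 s0 s0; the function name `decode_prefix_free` is illustrative.

```python
def decode_prefix_free(bits, code):
    """Decode a bit string symbol by symbol using a prefix-free code."""
    inverse = {word: symbol for symbol, word in code.items()}
    symbols, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # the last bit of a code word has been reached
            symbols.append(inverse[current])
            current = ""
    if current:
        raise ValueError("bit string ends in the middle of a code word")
    return symbols

code = {"s0": "0", "s1": "10", "s2": "110", "s3": "111"}

print(decode_prefix_free("1011111000", code))
# ['s1', 's3', 's2', 's0', 's0']
```

No look-ahead is needed: because no code word is a prefix of another, the first match is always the right one.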

Huffman Code
The Huffman code is an important prefix-free source code. The Huffman encoding algorithm proceeds as follows:
1. The source symbols are listed in order of decreasing probability. The two source symbols of lowest probability are assigned a 0 and a 1.
2. These two source symbols are regarded as being combined into a new symbol with probability equal to the sum of the two probabilities. The probability of the new symbol is placed in the list in accordance with its value.
3. The procedure is repeated until we are left with a final list of two symbols, to which a 0 and a 1 are assigned.
4. The code word for each symbol is found by working backward and tracing the sequence of 0s and 1s assigned to it.
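
The procedure above can be sketched in Python with a priority queue. This is a minimal illustration rather than a reference implementation: instead of tracing backward at the end, it prepends the assigned bit to every symbol in a group at the moment two groups are combined, which yields the same code words. The function name `huffman_code` and the tie-breaking rule (earlier entries first) are choices made here; other tie-breaking rules give different but equally short codes.

```python
import heapq
from itertools import count

def huffman_code(probs):
    """Return {symbol: code word} for a dict {symbol: probability}."""
    if len(probs) == 1:
        return {symbol: "0" for symbol in probs}
    order = count()  # unique tie-breaker so that dicts are never compared
    heap = [(p, next(order), {symbol: ""}) for symbol, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Take the two groups of lowest probability and assign them 0 and 1
        p_a, _, group_a = heapq.heappop(heap)
        p_b, _, group_b = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in group_a.items()}
        merged.update({s: "1" + w for s, w in group_b.items()})
        # The combined symbol re-enters the list with the summed probability
        heapq.heappush(heap, (p_a + p_b, next(order), merged))
    return heap[0][2]

probs = {"s0": 1/2, "s1": 1/4, "s2": 1/8, "s3": 1/8}
code = huffman_code(probs)
average_length = sum(probs[s] * len(code[s]) for s in probs)

print(code)
print(average_length)  # 1.75
```

For the four-symbol source with probabilities 1/2, 1/4, 1/8, 1/8 the code word lengths come out as 1, 2, 3, 3, so the average length equals the entropy of 1.75 bits/symbol.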

Example: Huffman code
A source emits one of four symbols $s_0$, $s_1$, $s_2$, and $s_3$ with probabilities 1/2, 1/4, 1/8, and 1/8, respectively.
Symbol  Probability  Code word
s0      1/2          0
s1      1/4          10
s2      1/8          110
s3      1/8          111

Example: The previous example

Example: Huffman code

Example: Huffman code
Symbol  Code word
s1      00
s2      010
s3      011
s4      100
s5      101
s6      110
s7      1110
s8      1111

Discrete memoryless channel

Discrete memoryless channels
A discrete memoryless channel is a statistical model with an input $X$ and an output $Y$ that is a noisy version of $X$; both $X$ and $Y$ are random variables. Every unit of time, the channel accepts an input symbol $X$ selected from an alphabet $\mathcal{X}$ and, in response, emits an output symbol $Y$ from an alphabet $\mathcal{Y}$. The channel is said to be discrete when both of the alphabets $\mathcal{X}$ and $\mathcal{Y}$ have finite size. It is said to be memoryless when the current output symbol depends only on the current input symbol and not on any of the previous ones.

The transition probabilities:
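The input alphabet: $\mathcal{X} = \{x_0, x_1, \ldots, x_{J-1}\}$. The output alphabet: $\mathcal{Y} = \{y_0, y_1, \ldots, y_{K-1}\}$. The transition probabilities: $p(y_k | x_j) = P(Y = y_k | X = x_j)$ for all $j$ and $k$.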

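The event that the channel input $X = x_j$ occurs with probability $p(x_j) = P(X = x_j)$, $j = 0, 1, \ldots, J-1$
The joint probability distribution of the random variables $X$ and $Y$ is given by $p(x_j, y_k) = p(y_k | x_j)\, p(x_j)$. The probabilities of the output symbols are $p(y_k) = \sum_{j=0}^{J-1} p(y_k | x_j)\, p(x_j)$, $k = 0, 1, \ldots, K-1$.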

Example: Binary Symmetric Channel (BSC)
It is a special case of the discrete memoryless channel with $J = K = 2$. The channel has two input symbols, $x_0 = 0$ and $x_1 = 1$, and two output symbols, $y_0 = 0$ and $y_1 = 1$. The channel is symmetric: the probability of receiving a 1 if a 0 is sent is the same as the probability of receiving a 0 if a 1 is sent, i.e., $P(y = 1 | x = 0) = P(y = 0 | x = 1) = p$.
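
For a concrete feel, the sketch below builds the transition matrix of a binary symmetric channel and computes the joint and output probabilities from the standard relations $p(x_j, y_k) = p(y_k | x_j)\, p(x_j)$ and $p(y_k) = \sum_j p(y_k | x_j)\, p(x_j)$. The crossover probability 0.1 and the input probabilities 0.7 and 0.3 are hypothetical values chosen for illustration, as are the function names.

```python
def joint_distribution(p_x, transition):
    """joint[j][k] = p(y_k | x_j) * p(x_j), with transition[j][k] = p(y_k | x_j)."""
    return [[p_x[j] * t for t in transition[j]] for j in range(len(p_x))]

def output_distribution(p_x, transition):
    """p(y_k) obtained by summing the joint distribution over the inputs."""
    joint = joint_distribution(p_x, transition)
    return [sum(row[k] for row in joint) for k in range(len(transition[0]))]

p = 0.1                 # hypothetical crossover probability
bsc = [[1 - p, p],      # row 0: p(y=0 | x=0), p(y=1 | x=0)
       [p, 1 - p]]      # row 1: p(y=0 | x=1), p(y=1 | x=1)
p_x = [0.7, 0.3]        # hypothetical input probabilities

p_y = output_distribution(p_x, bsc)
print(p_y)  # approximately [0.66, 0.34]
```

The channel pulls the output distribution toward uniform: an input that is 0 with probability 0.7 produces an output that is 0 with probability about 0.66.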

Transition probability diagram of binary symmetric channel

Information Capacity Theorem (Shannon’s third theorem)
The information capacity of a continuous channel of bandwidth $B$ hertz, perturbed by additive white Gaussian noise of power spectral density $N_0/2$ and limited in bandwidth to $B$, is given by $C = B \log_2\left(1 + \frac{P}{N_0 B}\right)$ bits/s, where $P$ is the average transmitted signal power and $N_0 B = \sigma^2$ is the noise power.

The dependence of $C$ on $B$ is linear, whereas its dependence on the signal-to-noise ratio $P/(N_0 B)$ is logarithmic. Accordingly, it is easier to increase the capacity of a channel by expanding the bandwidth than by increasing the transmitted power for a prescribed noise variance.
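
The contrast between linear and logarithmic dependence shows up clearly in numbers. The sketch below uses the capacity formula $C = B \log_2(1 + \mathrm{SNR})$ with hypothetical values (3 kHz bandwidth, signal-to-noise ratio of 1000, i.e. 30 dB) and compares doubling the bandwidth with doubling the power while the noise variance, and hence the other quantity, is held fixed. If the noise density $N_0$ were held fixed instead, widening the band would also admit more noise and the gain would be smaller.

```python
from math import log2

def capacity(bandwidth_hz, snr):
    """Information capacity in bits/s for bandwidth B and signal-to-noise ratio P/(N0*B)."""
    return bandwidth_hz * log2(1.0 + snr)

B = 3000.0    # hypothetical bandwidth in Hz
snr = 1000.0  # hypothetical signal-to-noise ratio (30 dB)

base = capacity(B, snr)
doubled_bandwidth = capacity(2 * B, snr)  # noise variance held fixed
doubled_power = capacity(B, 2 * snr)      # same bandwidth, twice the power

print(f"baseline:          {base:.0f} bits/s")
print(f"bandwidth doubled: {doubled_bandwidth:.0f} bits/s")
print(f"power doubled:     {doubled_power:.0f} bits/s")
```

Doubling the bandwidth doubles the capacity, while doubling the power adds less than $B$ bits/s, because $\log_2$ of the ratio $(1 + 2\,\mathrm{SNR})/(1 + \mathrm{SNR})$ is always below 1.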

The channel capacity theorem implies that we can transmit information at the rate of $C$ bits per second with arbitrarily small probability of error by employing a sufficiently complex encoding system. It is not possible to transmit at a rate higher than $C$ by any encoding system without a definite probability of error. Hence, the channel capacity defines the fundamental limit on the rate of error-free transmission for a power-limited, band-limited Gaussian channel.

If $R_b \le C$, it is possible to transmit with arbitrarily small probability of error by employing a sufficiently complex encoding system. If $R_b > C$, it is not possible to transmit without a definite probability of error.

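Implications of the Information Capacity Theorem
Consider an ideal system that transmits data at a bit rate $R_b$ equal to the information capacity $C$: $R_b = C$. The average transmitted power may be expressed as $P = E_b C$, where $E_b$ is the transmitted energy per bit. Substituting into $C = B \log_2\left(1 + \frac{P}{N_0 B}\right)$ gives $\frac{C}{B} = \log_2\left(1 + \frac{E_b}{N_0} \frac{C}{B}\right)$, or equivalently $\frac{E_b}{N_0} = \frac{2^{C/B} - 1}{C/B}$.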

A plot of the bandwidth efficiency $R_b/B$ versus $E_b/N_0$ is called the bandwidth-efficiency diagram (figure on the next slide). The curve labeled "capacity boundary" corresponds to the ideal system for which $R_b = C$.

Bandwidth-efficiency diagram.

Based on the previous figure, we make the following observations:
For infinite bandwidth, the ratio $E_b/N_0$ approaches the limiting value $\lim_{B \to \infty} \frac{E_b}{N_0} = \ln 2 = 0.693 = -1.6$ dB. This value is called the Shannon limit. The capacity boundary curve ($R_b = C$) separates two regions: for $R_b < C$, error-free transmission is possible; for $R_b > C$, error-free transmission is not possible.
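
The limiting value can be checked numerically. Setting $P = E_b C$ in the capacity formula $C = B \log_2(1 + P/(N_0 B))$ and solving for $E_b/N_0$ gives $(2^{r} - 1)/r$, where $r = C/B$ is the bandwidth efficiency; infinite bandwidth corresponds to $r \to 0$. The sketch below evaluates this expression for a few values of $r$; the function name and the sample values of $r$ are illustrative choices.

```python
from math import log, log10

def ebn0_on_capacity_boundary(r):
    """Eb/N0 (as a plain ratio) required by the ideal system at bandwidth efficiency r = C/B."""
    return (2.0 ** r - 1.0) / r

def to_db(ratio):
    """Convert a power ratio to decibels."""
    return 10.0 * log10(ratio)

for r in (4.0, 1.0, 0.1, 0.001):
    ratio = ebn0_on_capacity_boundary(r)
    print(f"C/B = {r:<6} Eb/N0 = {ratio:.4f} = {to_db(ratio):+.2f} dB")

shannon_limit_db = to_db(log(2))
print(f"Shannon limit: {shannon_limit_db:.2f} dB")  # about -1.59 dB
```

As $r$ shrinks, the required $E_b/N_0$ falls toward $\ln 2$ but never below it, which is why no system can operate to the left of the Shannon limit on the bandwidth-efficiency diagram.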

(a) Comparison of M-ary PSK against the ideal system for $P_e = 10^{-5}$ and increasing $M$. (b) Comparison of M-ary FSK against the ideal system for $P_e = 10^{-5}$ and increasing $M$.