CY2G2 Information Theory 1 (presentation transcript)
Slide 1: CY2G2 Information Theory 1. A review of important bits of Part I.

Quantity of information (noiseless system):
a) depends on the probability of the event;
b) depends on the length of the message.

To introduce the concept of information, we start by comparing the amount of information contained in two statements: "The sun has risen today" and "I am pleased to tell you that you have just won the lottery" (assumed true).

We know that a communication system carries information-bearing baseband signals from one place to another over a communication channel. What information does the receiver gain if it receives a fixed-amplitude, fixed-frequency sinusoidal signal? None of interest: the signal is deterministic (already known to the receiver), so it brings no information.

Two questions naturally arise: (1) What is information? (2) How can the information conveyed in a signal be quantified? A qualitative description of information is not helpful in this context; a quantitative measure is necessary, and information clearly involves an element of uncertainty.

The amount of information conveyed by an event depends on the probability of the event, that is, on how much the receiver is surprised by the message. In other words, information is the resolution of uncertainty. The smaller the probability (chance) of the message, the larger the quantity of information. Information is therefore related to the inverse of probability:

    I = log(1/p) = -log p

For a source producing many symbols with probabilities p1, p2, p3, etc., the average information per symbol is the entropy (next slide).
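A minimal sketch of the measure I = -log2 p (the function name `self_information` is mine, not from the slides):

```python
from math import log2

def self_information(p: float) -> float:
    """Information conveyed by an event of probability p, in bits."""
    return -log2(p)

# The less probable the message, the more information it carries:
print(self_information(0.5))    # a fair coin flip: 1 bit
print(self_information(0.05))   # a rare event: ~4.3 bits
```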
Slide 2: Maximum entropy.

The average information of a source with symbol probabilities p1, ..., pn is the entropy:

    H = - sum_i p_i log2 p_i   bits/symbol

For a binary source with P(0) = p and P(1) = 1 - p:

    H = -p log2 p - (1 - p) log2 (1 - p)

which is maximised (H = 1 bit) when p = 1/2.

The base of the logarithm is not important here, but base 2 is the most common; it gives the unit of information as the bit. Taking base e gives the unit of nats (natural units).

We illustrate this with a few examples:
i. Letters of the alphabet. If the 26 letters are assumed equiprobable, the probability of a letter is 1/26, and I = log2 26 ≈ 4.7 bits.
ii. Numerals. Assuming the numbers 0 to 9 are equiprobable, the probability of a number is 1/10, and I = log2 10 ≈ 3.32 bits.

Note that in both cases equiprobability (a uniform probability distribution) is assumed.
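The entropy formula and both examples can be checked directly (a sketch; `entropy` is my name for the helper):

```python
from math import log2

def entropy(probs):
    """Average information H = -sum(p * log2 p), in bits per symbol."""
    return -sum(p * log2(p) for p in probs if p > 0)

# 26 equiprobable letters: H = log2(26) ≈ 4.7 bits
print(entropy([1/26] * 26))

# Binary source: H reaches its maximum of 1 bit when p(0) = p(1) = 1/2
print(entropy([0.5, 0.5]))   # 1.0
print(entropy([0.9, 0.1]))   # ≈ 0.469, below the maximum
```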
Slide 3: Redundancy.

Conditional entropy H(j|i), built from the conditional probability p(j|i) (probability of j given i) and the joint probability p(i,j) = p(i) p(j|i):

    H(j|i) = - sum_i p(i) sum_j p(j|i) log2 p(j|i)

If there is intersymbol influence, the average information per symbol is given by this conditional entropy.

Redundancy is the presence of more symbols in a message than is strictly necessary. Spoken languages usually have high redundancy; English has a redundancy of about 80 percent.

Redundancy is an important concept in information theory. To overcome noise and interference, and so increase the reliability of the data transmitted through the channel, it is often necessary to introduce some redundancy into the binary sequence from the source in a controlled manner.
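A sketch of H(j|i) computed from a joint-probability table (the function name and the two example tables are mine): when successive symbols are independent, H(j|i) equals the plain entropy H(j); intersymbol influence lowers it, and the shortfall is the redundancy.

```python
from math import log2

def conditional_entropy(joint):
    """H(j|i) = -sum over i,j of p(i,j) * log2 p(j|i),
    where p(j|i) = p(i,j) / p(i) and joint[i][j] = p(i,j)."""
    h = 0.0
    for row in joint:
        p_i = sum(row)                 # marginal probability p(i)
        for p_ij in row:
            if p_ij > 0:
                h -= p_ij * log2(p_ij / p_i)
    return h

# Independent binary symbols: H(j|i) = H(j) = 1 bit
independent = [[0.25, 0.25], [0.25, 0.25]]
print(conditional_entropy(independent))        # 1.0

# Strong intersymbol influence lowers the average information per symbol
influenced = [[0.45, 0.05], [0.05, 0.45]]
print(conditional_entropy(influenced))         # ≈ 0.469
print(1 - conditional_entropy(influenced))     # redundancy ≈ 0.531 (H_max = 1 bit)
```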
Slide 4: Coding in a noiseless channel: source coding (speed of transmission is the main consideration).

Important properties of codes:
- uniquely decodable (all combinations of code words distinct);
- instantaneous (no code word is a prefix of another);
- compact (shorter code words given to more probable symbols).
Slide 5: CY2G2 Information Theory 1. Important parameters.

Average code-word length:

    L = sum_i p_i l_i

where l_i is the length (in binary digits) of the code word for symbol i. The coding efficiency is H/L.

Coding methods: the Fano-Shannon method and Huffman's method.
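The two parameters can be sketched in a few lines (function names are mine; the example code 0, 10, 110, 111 is a standard one I chose because its lengths match the symbol probabilities exactly):

```python
from math import log2

def average_length(probs, lengths):
    """L = sum p_i * l_i, in binary digits per symbol."""
    return sum(p * l for p, l in zip(probs, lengths))

def efficiency(probs, lengths):
    """Coding efficiency = H / L (1.0 for a perfectly matched code)."""
    H = -sum(p * log2(p) for p in probs if p > 0)
    return H / average_length(probs, lengths)

probs   = [0.5, 0.25, 0.125, 0.125]
lengths = [1, 2, 3, 3]          # e.g. code words 0, 10, 110, 111
print(average_length(probs, lengths))   # 1.75 digits/symbol
print(efficiency(probs, lengths))       # 1.0, since here L equals H
```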
Slide 6: Coding methods: the Fano-Shannon method.

1. Write the symbols in a table in descending order of probability.
2. Insert dividing lines to successively divide the probabilities into halves, quarters, etc. (or as near as possible).
3. Add a '0' and a '1' to the code at each division.
4. The final code for each symbol is obtained by reading from the left towards that symbol.
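The steps above can be sketched as a recursive split (a minimal implementation; the function name and the even-split tie-breaking are my choices):

```python
def fano_shannon(symbols):
    """symbols: list of (symbol, probability) in descending order of probability.
    Returns {symbol: code}. Each call splits the group as near to half the
    total probability as possible, giving '0' to one part and '1' to the other."""
    codes = {s: "" for s, _ in symbols}

    def split(group):
        if len(group) < 2:
            return
        total = sum(p for _, p in group)
        best, cut = float("inf"), 1
        for k in range(1, len(group)):
            head = sum(p for _, p in group[:k])
            d = abs(total - 2 * head)       # distance of this cut from an even split
            if d < best:
                best, cut = d, k
        for s, _ in group[:cut]:
            codes[s] += "0"
        for s, _ in group[cut:]:
            codes[s] += "1"
        split(group[:cut])
        split(group[cut:])

    split(symbols)
    return codes

# The A, B, C source used in the worked example later in these slides:
print(fano_shannon([("A", 0.80), ("B", 0.15), ("C", 0.05)]))
# {'A': '0', 'B': '10', 'C': '11'}
```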
Slide 9: Coding methods: Huffman's method.

1. Write the symbols in a table in descending order of probability.
2. Add the two smallest probabilities at the bottom of the table and reorder; repeat until only two entries remain.
3. Place a '0' or '1' at each branch.
4. The final code for each symbol is obtained by reading back from the final column towards that symbol.
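A compact sketch of the same merging procedure using a priority queue (the function name is mine; the integer ticket is only there so the heap never has to compare dictionaries):

```python
import heapq
from itertools import count

def huffman(symbols):
    """symbols: dict {symbol: probability}. Returns {symbol: code}.
    Repeatedly merges the two least probable entries, prepending '0'
    to one branch and '1' to the other."""
    ticket = count()
    heap = [(p, next(ticket), {s: ""}) for s, p in symbols.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(ticket), merged))
    return heap[0][2]

codes = huffman({"A": 0.8, "B": 0.15, "C": 0.05})
print(codes)   # code-word lengths 1, 2, 2: the most probable symbol gets the shortest word
```

Note that Huffman codes are not unique (swapping '0' and '1' at any branch gives another valid code), but every Huffman code achieves the same minimal average length.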
Slide 14: Shannon's first theorem.

Shannon proved formally that if the source symbols are coded in groups of n, the average code length per symbol tends to the source entropy H as n tends to infinity. In consequence, a further increase in efficiency can be obtained by grouping the source symbols (in pairs, threes, ...) and applying the coding procedure to the probabilities of the chosen groups.

Matching source to channel: the coding process is sometimes known as 'matching the source to the channel', that is, making the output of the coder as suitable as possible for the channel.
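The theorem can be illustrated numerically: code the A, B, C source (p = 0.8, 0.15, 0.05, used in the next example) in groups of n = 1, 2, 3 and watch the average length per symbol fall toward H. This is a sketch under my own helper names; `huffman_lengths` computes only the code-word lengths, which is all the average length needs.

```python
import heapq
from itertools import count, product
from math import log2

def huffman_lengths(probs):
    """Code-word lengths produced by Huffman's merging procedure."""
    ticket = count()                      # tie-breaker so the heap never compares lists
    heap = [(p, next(ticket), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, _, a = heapq.heappop(heap)
        p2, _, b = heapq.heappop(heap)
        for i in a + b:                   # each symbol in the merged subtree goes one digit deeper
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, next(ticket), a + b))
    return lengths

p = [0.8, 0.15, 0.05]
H = -sum(x * log2(x) for x in p)          # source entropy, ≈ 0.884 bits/symbol

per_symbol = []
for n in (1, 2, 3):                       # code the symbols in groups of n
    group_probs = []
    for combo in product(p, repeat=n):    # independent symbols: multiply probabilities
        q = 1.0
        for x in combo:
            q *= x
        group_probs.append(q)
    lengths = huffman_lengths(group_probs)
    L = sum(q * l for q, l in zip(group_probs, lengths)) / n
    per_symbol.append(L)
    print(n, round(L, 4), round(H / L, 4))   # L per symbol falls toward H; efficiency rises
```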
Slide 15: Example: coding singly, using the Fano-Shannon method.

An information source produces a long sequence of three independent symbols A, B, C with probabilities 16/20, 3/20 and 1/20 respectively; 100 such symbols are produced per second. The information is to be transmitted via a noiseless binary channel which can transmit up to 100 binary digits per second. Design a suitable compact instantaneous code and find the probabilities of the binary digits produced.

    source --> coder --> channel (0, 1) --> decoder
    100 symbols/s; P(A) = 16/20, P(B) = 3/20, P(C) = 1/20

Coding singly, the Fano-Shannon method gives:

    Symbol  Probability  Code
    A       16/20        0
    B       3/20         10
    C       1/20         11

The average length is L = 0.8(1) + 0.15(2) + 0.05(2) = 1.2 digits per symbol, i.e. 120 digits/s, which exceeds the channel's 100 digits/s; coding in pairs (next slide) is required. The digit probabilities are p(0) = 0.95/1.2 ≈ 0.79 and p(1) ≈ 0.21.
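The numbers in this example can be checked in a few lines (a sketch; variable names are mine):

```python
# Probabilities and the Fano-Shannon code from this example
p = {"A": 16/20, "B": 3/20, "C": 1/20}
code = {"A": "0", "B": "10", "C": "11"}

L = sum(p[s] * len(code[s]) for s in p)   # binary digits per symbol
print(round(L, 2))                        # 1.2 -> 120 digits/s at 100 symbols/s

zeros = sum(p[s] * code[s].count("0") for s in p)
print(round(zeros / L, 2))                # fraction of 0s in the coded stream, ≈ 0.79
```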
Slide 16: Coding in pairs, using the Fano-Shannon method.

    Pair  Probability  Code
    AA    0.64         0
    AB    0.12         10
    BA    0.12         110
    AC    0.04         11100
    CA    0.04         11101
    BB    0.0225       11110
    BC    0.0075       111110
    CB    0.0075       1111110
    CC    0.0025       1111111

L = 1.8675 digits per pair (about 0.934 digits per symbol), so the digit rate is R ≈ 93.4 digits/s, now within the channel's 100 digits/s. The digit probabilities are p(0) ≈ 0.556 and p(1) ≈ 0.444. The entropy of the output stream is -(p(0) log2 p(0) + p(1) log2 p(1)) ≈ 0.99 bits, close to the maximum value of 1 bit (attained when p(0) = p(1)).
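The pair-coding figures can be verified the same way (a sketch; the CB and CC code words, not legible on the slide, are the ones the Fano-Shannon procedure produces for the remaining 11111 prefix):

```python
from math import log2

# Pair probabilities and Fano-Shannon codes for the A/B/C source
pairs = {
    "AA": (0.64, "0"),        "AB": (0.12, "10"),        "BA": (0.12, "110"),
    "AC": (0.04, "11100"),    "CA": (0.04, "11101"),     "BB": (0.0225, "11110"),
    "BC": (0.0075, "111110"), "CB": (0.0075, "1111110"), "CC": (0.0025, "1111111"),
}

L = sum(p * len(c) for p, c in pairs.values())        # digits per pair
zeros = sum(p * c.count("0") for p, c in pairs.values())
p0 = zeros / L
p1 = 1 - p0
H_out = -(p0 * log2(p0) + p1 * log2(p1))              # entropy of the digit stream

print(L * 50)                        # digit rate at 50 pairs/s, ≈ 93.4 digits/s
print(round(p0, 3), round(p1, 3))    # ≈ 0.556, 0.444
print(round(H_out, 2))               # ≈ 0.99 bits, close to the 1-bit maximum
```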