
§1 Entropy and mutual information


1 §1 Entropy and mutual information
§1.1 Discrete random variables
§1.2 Discrete random vectors
§1.1.1 Discrete memoryless source and entropy
§1.1.2 Discrete memoryless channel and mutual information

2 §1.1.1 Discrete memoryless source and entropy
1. DMS (Discrete Memoryless Source) Probability space. Example 1.1.1 Let X represent the outcome of a single roll of a fair die; each of the six faces occurs with probability 1/6.
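The probability-space array itself did not survive the transcript; for this fair-die example a standard way to write it (the [X; P(X)] array form common in information-theory texts, an assumption on my part) is:

\begin{bmatrix} X \\ P(X) \end{bmatrix}
=
\begin{bmatrix}
1 & 2 & 3 & 4 & 5 & 6 \\
\tfrac{1}{6} & \tfrac{1}{6} & \tfrac{1}{6} & \tfrac{1}{6} & \tfrac{1}{6} & \tfrac{1}{6}
\end{bmatrix}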

3 §1.1.1 Discrete memoryless source and entropy
2. Self information. Example 1.1.2 Two bags of balls: bag X contains red and white balls; bag Y contains red, white, blue and black balls. Analyse the uncertainty of a red ball being selected from X and from Y. (Blackboard: Example 2, ball drawing and word guessing, single and double guess. The function I(ai) = f[P(ai)] should satisfy the conditions listed on the next slide.)

4 §1.1.1 Discrete memoryless source and entropy
2. Self information. I(ai) = f[p(ai)] should satisfy:
1) I(ai) is a monotone decreasing function of p(ai): if p(a1) > p(a2), then I(a1) < I(a2);
2) if p(ai) = 1, then I(ai) = 0;
3) if p(ai) = 0, then I(ai) → ∞;
4) if p(ai aj) = p(ai) p(aj), then I(ai aj) = I(ai) + I(aj).
These conditions lead to the logarithmic measure I(ai) = −log p(ai).

5 §1.1.1 Discrete memoryless source and entropy
Self information: I(ai) = −log p(ai), measured in bits (log base 2), nats (base e) or hartleys (base 10). Remark: I(ai) decreases as p(ai) increases and equals 0 at p(ai) = 1; it measures both the uncertainty about the event ai before it occurs and the amount of information provided when ai is observed; and if a and b are statistically independent, then I(ab) = I(a) + I(b).
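A minimal Python sketch (function and variable names are my own, not from the slides) of I(a) = −log p(a) in the three units:

import math

def self_information(p, base=2.0):
    """Self-information -log_base(p) of an event with probability p."""
    return -math.log(p) / math.log(base)

p = 1 / 6                              # e.g. one face of the fair die in Example 1.1.1
print(self_information(p, 2))          # ≈ 2.585 bits
print(self_information(p, math.e))     # ≈ 1.792 nats
print(self_information(p, 10))         # ≈ 0.778 hartleys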

6 §1.1.1 Discrete memoryless source and entropy
Definition: Suppose X is a discrete random variable whose range R = {a1, a2, …} is finite or countable. Let p(ai) = P{X = ai}. The entropy of X is defined by H(X) = −Σi p(ai) log p(ai). It is the average measure of the uncertainty (or randomness) about X, and equally a measure of the average amount of information provided by an observation of X.

7 §1.1.1 Discrete memoryless source and entropy
Entropy: the amount of “information” provided by an observation of X. Example 1.1.3 A bag holds 100 balls: 80 are red and the remaining 20 are white. One ball is drawn at random. How much information does each drawing provide on average? Let X represent the colour of the ball, a1 = red, a2 = white. Then H(X) = H(0.8, 0.2) = 0.722 bit/sig.
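A small Python sketch (names are mine) that reproduces the 0.722 bit/sig figure:

import math

def entropy(probs, base=2):
    """H(X) = -sum p*log(p), with the convention 0*log(0) = 0."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# Example 1.1.3: a1 = red with probability 0.8, a2 = white with probability 0.2
print(entropy([0.8, 0.2]))   # ≈ 0.722 bit/sig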

8 §1.1.1 Discrete memoryless source and entropy
Entropy: the average “uncertainty” or “randomness” about X. Example 1.1.4

9 §1.1.1 Discrete memoryless source and entropy
Note: 1) Units: bit/sig, nat/sig, hart/sig. 2) If p(ai) = 0, we take p(ai) log p(ai)^(-1) = 0 (the convention 0·log 0 = 0). 3) If R is infinite, H(X) may be +∞ (for instance, probabilities proportional to 1/(k log² k), k ≥ 2, sum to a finite value but give infinite entropy).

10 §1.1.1 Discrete memoryless source and entropy
Example: entropy of a BS (binary source). Viewed as a function of the probability vector (p, 1 − p), the entropy is the entropy function H(p) = −p log p − (1 − p) log(1 − p).
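A sketch of this entropy function in Python (the helper name is my own):

import math

def binary_entropy(p):
    """Entropy of a binary source with probability vector (p, 1 - p), in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(binary_entropy(0.5))    # 1.0 bit: the maximum, reached at p = 1/2
print(binary_entropy(0.11))   # ≈ 0.5 bit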

11 4. The properties of entropy
§1.1.1 Discrete memoryless source and entropy 4. The properties of entropy. Theorem 1.1 Let X assume values in R = {x1, x2, …, xr} with probabilities pi. (Theorem 1.1 in textbook) 1) Non-negativity: H(X) ≥ 0. 2) Certainty: H(X) = 0 iff pi = 1 for some i. 3) Extremum: H(X) ≤ log r, with equality iff pi = 1/r for all i; this is the basis of data compression. Proof:
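A quick numerical illustration of property 3 (the non-uniform distribution below is an arbitrary choice of mine):

import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

r = 4
p = [0.4, 0.3, 0.2, 0.1]        # an arbitrary distribution on r = 4 symbols
print(entropy(p))               # ≈ 1.846 bit, strictly below log2(r)
print(entropy([1 / r] * r))     # = 2.0 bit = log2(4), attained only by the uniform distribution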

12 §1.1.1 Discrete memoryless source and entropy
4. The properties of entropy. 4) Symmetry: H(p1, p2, …, pr) is unchanged under any permutation of (p1, p2, …, pr). Example 1.1.6 Let X, Y, Z all be discrete random variables:

13 §1.1.1 Discrete memoryless source and entropy
4. The properties of entropy. 5) Additivity: if X and Y are independent, then H(XY) = H(X) + H(Y). Proof:

14 §1.1.1 Discrete memoryless source and entropy
Proof (continued): for the joint source, independence gives p(xy) = p(x)p(y), so H(XY) = −Σx,y p(x)p(y)[log p(x) + log p(y)] = H(X) + H(Y).
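A numerical check of the additivity property under the stated independence assumption; the two distributions below are my own choices:

import math
from itertools import product

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

px = [0.5, 0.3, 0.2]
py = [0.7, 0.3]
pxy = [a * b for a, b in product(px, py)]   # independence: p(xy) = p(x)p(y)
print(entropy(pxy))                         # H(XY) ≈ 2.367 bit
print(entropy(px) + entropy(py))            # H(X) + H(Y), the same value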

15 §1.1.1 Discrete memoryless source and entropy
4. The properties of entropy. 6) Convexity. Theorem 1.2 The entropy function H(p1, p2, …, pr) is a concave (∩-convex) function of the probability vector (p1, p2, …, pr). [Figure: the entropy of a BS, H(p), rising from 0 at p = 0 to its maximum of 1 bit at p = 1/2 and back to 0 at p = 1.] Example (continued): entropy of a BS.

16 §1.1.1 Discrete memoryless source and entropy
5. Conditional entropy. Definition: Let X, Y be a pair of random variables with (X, Y) ~ p(x, y). The conditional entropy of X given Y is defined by H(X|Y) = −Σx,y p(xy) log p(x|y) = Σy p(y) H(X|Y = y).
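A sketch of this definition in Python, for a joint pmf stored as a dictionary {(x, y): p(x, y)} (the representation and names are mine):

import math

def conditional_entropy(joint):
    """H(X|Y) = -sum_{x,y} p(x,y) * log2 p(x|y)."""
    py = {}
    for (x, y), p in joint.items():
        py[y] = py.get(y, 0.0) + p          # marginal p(y)
    return -sum(p * math.log2(p / py[y]) for (x, y), p in joint.items() if p > 0)

# two independent fair bits: observing Y leaves the full 1 bit of uncertainty about X
joint = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
print(conditional_entropy(joint))           # 1.0 bit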

17 §1.1.1 Discrete memoryless source and entropy
5. conditional entropy Analyse:

18 §1.1.1 Discrete memoryless source and entropy
5. Conditional entropy. Example 1.1.7 A binary source with pX(0) = 2/3, pX(1) = 1/3 is sent through an erasure-type channel: input 0 is received as 0 with probability 3/4 and as ? with probability 1/4; input 1 is received as 1 with probability 1/2 and as ? with probability 1/2. Then H(X) = H(2/3, 1/3) ≈ 0.918 bit/sig; H(X|Y=0) = 0; H(X|Y=1) = 0; H(X|Y=?) = H(1/2, 1/2) = 1 bit/sig; and H(X|Y) = P(Y=?) · 1 = 1/3 bit/sig.
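A self-contained check of these numbers (the channel layout above is my reading of the garbled slide):

import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

# joint p(x, y) for pX = (2/3, 1/3) and the erasure-type channel described above
joint = {(0, '0'): 2/3 * 3/4, (0, '?'): 2/3 * 1/4,
         (1, '1'): 1/3 * 1/2, (1, '?'): 1/3 * 1/2}
py = {}
for (x, y), p in joint.items():
    py[y] = py.get(y, 0.0) + p
h_x_given_y = -sum(p * math.log2(p / py[y]) for (_, y), p in joint.items())
print(entropy([2/3, 1/3]))    # H(X)   ≈ 0.918 bit/sig
print(h_x_given_y)            # H(X|Y) = 1/3 bit/sig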

19 5. conditional entropy Theorem1.3
§1.1.1 Discrete memoryless source and entropy 5. Conditional entropy. Theorem 1.3 (conditioning reduces entropy): H(X|Y) ≤ H(X), with equality iff X and Y are independent. Proof:

20 Measure of information
Review KeyWords (measure of information): self information, entropy, properties of entropy, conditional entropy.

Homework P44: T1.1, P44: T1.4, P44: T1.6. 4. Let X be a random variable taking on a finite number of values. What is the relationship between H(X) and H(Y) if (1) Y = 2X? (2) Y = cos X?

22 Homework

23 Homework 6. Consider a chessboard with 8×8 = 64 squares. A chessman is placed in a square at random, and we must guess its location. Find the uncertainty of the result. If every square is labelled by its row and column number and the row number of the chessman is already known, what is the remaining uncertainty?

24 Homework (thinking): Coin flip. A fair coin is flipped until the first head occurs. Let X denote the number of flips required. Find the entropy H(X) in bits. Hint: the identities Σn≥1 r^n = r/(1−r) and Σn≥1 n·r^n = r/(1−r)² are useful.

25 §1 Entropy and mutual information
§1.1 Discrete random variables
§1.2 Discrete random vectors
§1.1.1 Discrete memoryless source and entropy
§1.1.2 Discrete memoryless channel and mutual information

26 §1.1.2 Discrete memoryless channel and mutual information
[Channel model diagram: input X, transition probabilities p(y|x), output Y.]

27 §1.1.2 Discrete memoryless channel and mutual information
1. DMC (Discrete Memoryless Channel). The model of a DMC: r input symbols {0, 1, …, r−1} and s output symbols {0, 1, …, s−1}, connected by transition probabilities p(y|x).

28 1. DMC (Discrete Memoryless Channel)
§1.1.2 Discrete memoryless channel and mutual information 1. DMC (Discrete Memoryless Channel). Representation of a DMC by a graph: each input x is joined to each output y by an edge labelled with the transition probability p(y|x), where p(y|x) ≥ 0 for all x, y and Σy p(y|x) = 1 for all x.

29 1. DMC (Discrete Memoryless Channel)
§1.1.2 Discrete memoryless channel and mutual information 1. DMC (Discrete Memoryless Channel). Representation of a DMC by the transition probability matrix: the r × s matrix [p(y|x)], each of whose rows sums to 1.

30 §1.1.2 Discrete memoryless channel and mutual information
1. DMC (Discrete Memoryless Channel) representation of DMC formula

31 §1.1.2 Discrete memoryless channel and mutual information
1. DMC (Discrete Memoryless Channel). Example 1.1.8: BSC (Binary Symmetric Channel), r = s = 2, with p(0|0) = p(1|1) = 1 − p and p(0|1) = p(1|0) = p.

32 §1.1.2 Discrete memoryless channel and mutual information
1. DMC (Discrete Memoryless Channel). Example 1.1.9: BEC (Binary Erasure Channel). [Channel diagram; parameters on the next slide.]

33 §1.1.2 Discrete memoryless channel and mutual information
1. DMC (Discrete Memoryless Channel). Example 1.1.9: BEC (Binary Erasure Channel), r = 2, s = 3, with p(0|0) = p, p(?|0) = 1 − p, p(1|1) = q, p(?|1) = 1 − q.

34 2. average mutual information
§1.1.2 Discrete memoryless channel and mutual information 2. Average mutual information. Definition: the reduction in uncertainty about X conveyed by the observation of Y, i.e. the information about X obtained from Y. For a channel with transition probabilities p(y|x): H(X) is the entropy of the input, H(X|Y) is the equivocation (the uncertainty about X that remains after Y is observed), and the average mutual information is I(X;Y) = H(X) − H(X|Y).

35 §1.1.2 Discrete memoryless channel and mutual information
2. Average mutual information. Definition: I(X;Y) = H(X) − H(X|Y) = Σx,y p(xy) log [ p(x|y) / p(x) ].
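A sketch of I(X;Y) computed directly from a joint pmf (the dictionary representation and names are mine). Applied to my reading of the joint distribution of Example 1.1.7, it gives H(X) − H(X|Y) ≈ 0.918 − 0.333 ≈ 0.585 bit:

import math

def mutual_information(joint):
    """I(X;Y) = sum_{x,y} p(x,y) * log2[ p(x,y) / (p(x) p(y)) ]."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

joint = {(0, '0'): 1/2, (0, '?'): 1/6, (1, '1'): 1/6, (1, '?'): 1/6}
print(mutual_information(joint))   # ≈ 0.585 bit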

36 §1.1.2 Discrete memoryless channel and mutual information
2. Average mutual information. Definition: the mutual information of a pair of symbols is I(x;y) = log [ p(x|y) / p(x) ], and the average mutual information is its expectation, I(X;Y) = EXY[ I(x;y) ]. The relation between I(X;Y) and H(X) is taken up on the following slides.

37 2. Average mutual information
§1.1.2 Discrete memoryless channel and mutual information 2. Average mutual information. Properties. 1) Non-negativity. Theorem 1.4 For any discrete random variables X and Y, I(X;Y) ≥ 0; moreover I(X;Y) = 0 iff X and Y are independent. (Theorem 1.3 in textbook) In other words, on average we are never misled by observing the output of the channel. Proof:

38 §1.1.2 Discrete memoryless channel and mutual information
2. Average mutual information. Properties: the total-loss case. [Diagram of a cryptosystem: source S, encrypt (with key), channel, decrypt, destination D, and a listener-in tapping X → Y on the channel.] For the listener-in an ideal cipher behaves as a total-loss channel: observing Y gives (ideally) no information about the message, I(X;Y) = 0. Example, Caesar cryptography: message "arrive at four", ciphertext "duulyh dw irxu".

39 2. Average mutual information
§1.1.2 Discrete memoryless channel and mutual information 2. Average mutual information. Properties. 2) Symmetry: I(X;Y) = I(Y;X). 3) Relationship between entropy and average mutual information (mnemonic Venn diagram with regions H(X|Y), I(X;Y), H(Y|X) inside H(X), H(Y) and the joint entropy H(XY)): I(X;Y) = H(X) − H(X|Y) = H(Y) − H(Y|X) = H(X) + H(Y) − H(XY).

40 §1.1.2 Discrete memoryless channel and mutual information
2. Average mutual information. Properties: recognising channels. [Channel diagrams: a noiseless channel matching inputs a1 … ar one-to-one with outputs b1 … br; a channel whose inputs a1, a2 fan out to outputs b1 … b5 with probabilities such as 1/2, 1/5, 2/5; and a deterministic channel mapping inputs a1, a2, a3 onto outputs b1, b2.]

41 I(X;Y)=f [P(x),P(y|x)]
§1.1.2 Discrete memoryless channel and mutual information 2. Average mutual information. Properties. 4) Convexity: I(X;Y) = f[P(x), P(y|x)] is a function of both the input distribution P(x) and the channel transition probabilities P(y|x).

42 I(X;Y)=f [P(x),P(y|x)]
§1.1.2 Discrete memoryless channel and mutual information 2. Average mutual information. Properties. 4) Convexity: I(X;Y) = f[P(x), P(y|x)]. Theorem 1.5 For fixed transition probabilities P(y|x), I(X;Y) is a concave (∩-convex) function of the input probabilities P(x). (Theorem 1.6 in textbook) Theorem 1.6 For a fixed input distribution P(x), I(X;Y) is a convex (∪-convex) function of the transition probabilities P(y|x). (Theorem 1.7 in textbook)

43 §1.1.2 Discrete memoryless channel and mutual information
2. Average mutual information. Example: analyse I(X;Y) of the BSC. Source: an input distribution, written here as (ω, 1 − ω); channel: BSC with crossover probability p.

44 §1.1.2 Discrete memoryless channel and mutual information
2. Average mutual information. Example: analyse I(X;Y) of the BSC (derivation on the blackboard).

45 §1.1.2 Discrete memoryless channel and mutual information
2. Average mutual information. Example: analyse I(X;Y) of the BSC (result). With input distribution (ω, 1 − ω) and crossover probability p, I(X;Y) = H(Y) − H(Y|X) = H(ω(1 − p) + (1 − ω)p) − H(p); for fixed p it is largest at ω = 1/2, where it equals 1 − H(p).
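A short Python sketch of this expression (variable names are my own):

import math

def h2(p):
    """Binary entropy function in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_mutual_information(w, p):
    """I(X;Y) of a BSC with input distribution (w, 1 - w) and crossover probability p."""
    return h2(w * (1 - p) + (1 - w) * p) - h2(p)

print(bsc_mutual_information(0.5, 0.1))   # 1 - H(0.1) ≈ 0.531 bit, the maximum over w
print(bsc_mutual_information(0.9, 0.1))   # ≈ 0.211 bit for a skewed input distribution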

46 Review KeyWords (channel and its information measure): channel model, equivocation, average mutual information, mutual information, properties of average mutual information.

47 §1.1.2 Discrete memoryless channel and mutual information
Thinking

48 §1.1.2 Discrete memoryless channel and mutual information
Example [Encoder diagram: the input a(t) drives two D flip-flops forming a shift register; the three lines b0, b1, b2 make up one output symbol.] Let the source have alphabet A = {0,1} with p0 = p1 = 0.5. Let encoder C have alphabet B = {0, 1, …, 7} and let the elements of B have the binary representation shown on the slide. The encoder is shown below. Find the entropy of the coded output, and find the output sequence if the input sequence is a(t) = { } and the initial contents of the registers are as given, where the "addition" blocks are modulo-2 adders (i.e., exclusive-or gates).

49 §1.1.2 Discrete memoryless channel and mutual information
[State table for the encoder: current state Yt and next state Yt+1, states numbered 1 to 7.] a(t) = { }, b = { }

50 Homework
Homework P45: T1.10, P46: T1.19 (except c). 3. Let the DMS convey messages through a channel (source and channel as specified). Calculate: H(X) and H(Y); the mutual information I(xi; yj) (i, j = 1, 2); the equivocation H(X|Y) and the average mutual information I(X;Y).

51 Homework 4. Suppose that I(X;Y) = 0. Does this imply that I(X;Z) = I(X;Z|Y)? 5. In a joint ensemble XY, the mutual information I(x;y) is a random variable. In this problem we are concerned with the variance of that random variable, VAR[I(x;y)]. Prove that VAR[I(x;y)] = 0 iff there is a constant α such that, for all x, y with P(xy) > 0, P(xy) = αP(x)P(y). Express I(X;Y) in terms of α and interpret the special case α = 1. (continued)

52 Homework 5. (3) For each of the channels in Fig. 5, find a probability assignment P(x) such that I(X;Y) > 0 and VAR[I(x;y)] = 0, and calculate I(X;Y). [Fig. 5: two channel diagrams with inputs a1, a2, a3 and outputs b1, b2 (and b3), with transition probabilities such as 1 and 1/2.]

53 §1 Entropy and mutual information
§1.1 Discrete random variables
§1.2 Discrete random vectors
§1.2.1 Extended source and joint entropy
§1.2.2 Extended channel and mutual information

54 §1.2.1 Extended source and joint entropy
Source model. Example 1.2.1 The N-times extended source. Example: the second extension (N = 2) of a binary source.

55 2. Joint entropy Definition:
§1.2.1 Extended source and joint entropy 2. Joint entropy. Definition: The joint entropy H(XY) of a pair of discrete random variables (X, Y) with a joint distribution p(x,y) is defined as H(XY) = −Σx,y p(xy) log p(xy), which can also be expressed as H(XY) = −E[ log p(X,Y) ].

56 §1.2.1 Extended source and joint entropy
Extended DMS

57 §1.2.1 Extended source and joint entropy
A source with memory: 1) conditional entropy, 2) joint entropy, 3) (per symbol) entropy, each measured in bit/sig.

58 §1.2.1 Extended source and joint entropy
3. Properties of joint entropy. Theorem 1.7 (Chain rule): H(XY) = H(X) + H(Y|X). Proof:
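A numerical check of the chain rule on an arbitrary joint pmf of my own choosing:

import math

def H(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

joint = {('a', 0): 0.3, ('a', 1): 0.2, ('b', 0): 0.1, ('b', 1): 0.4}
px = {}
for (x, y), p in joint.items():
    px[x] = px.get(x, 0.0) + p                      # marginal p(x)
h_xy = H(joint.values())                            # H(XY)
h_y_given_x = -sum(p * math.log2(p / px[x]) for (x, y), p in joint.items())
print(h_xy)                                         # ≈ 1.846 bit
print(H(px.values()) + h_y_given_x)                 # H(X) + H(Y|X), the same value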

59 §1.2.1 Extended source and joint entropy
3. Properties of joint entropy. Example 1.2.3 Let X be a random variable. [Slide shows its probability space and the joint probability table of two successive symbols (X1, X2), with entries including 1/4, 1/24 and 1/8.] Find H(X), the conditional entropy, the joint entropy and the (per symbol) entropy.

60 §1.2.1 Extended source and joint entropy
3. Properties of joint entropy. Relationships (X1 and X2 are two successive symbols of the same source, so they share one marginal distribution): H(X2) ≥ H(X2|X1), and H(X1X2) = H(X1) + H(X2|X1) ≤ 2H(X1).

61 §1.2.1 Extended source and joint entropy
3. Properties of joint entropy. General stationary source: let X1, X2, …, XN be dependent; the joint probability is p(x1 x2 … xN) = p(x1) p(x2|x1) ⋯ p(xN|x1 … xN−1).

62 §1.2.1 Extended source and joint entropy
3. Properties of joint entropy. Definition of entropies: the conditional entropy H(XN|X1X2…XN−1); the joint entropy H(X1X2…XN); and the (per symbol) entropy HN(X) = H(X1X2…XN)/N.

63 3. Properties of joint entropy
§1.2.1 Extended source and joint entropy 3. Properties of joint entropy. Theorem 1.8 (Chain rule for entropy): Let X1, X2, …, Xn be drawn according to p(x1, x2, …, xn). Then H(X1X2…Xn) = Σi H(Xi|X1 … Xi−1). Proof: do it by yourself.

64 §1.2.1 Extended source and joint entropy ——base of data compressing
3. Properties of joint entropy. Relation of entropies: if H(X1) < ∞, then for a stationary source H(XN|X1 … XN−1) ≤ HN(X) ≤ HN−1(X) ≤ H(X1), and as N → ∞ both the conditional entropy and the per-symbol entropy decrease to the same limit H∞(X), the entropy rate, which is the basis of data compression.

65 3. Properties of joint entropy
§1.2.1 Extended source and joint entropy 3. Properties of joint entropy. Theorem 1.9 (Independence bound on entropy): Let X1, X2, …, Xn be drawn according to p(x1, x2, …, xn). Then H(X1X2…Xn) ≤ Σi H(Xi), with equality iff the Xi are independent. (p. 37, corollary, in textbook)

66 §1.2.1 Extended source and joint entropy
3. Properties of joint entropy. Example 1.2.4 Suppose a memoryless source with A = {0, 1} and equal probabilities emits a sequence of six symbols. Following the sixth symbol, a seventh symbol is transmitted which is the sum modulo 2 of the six previous symbols. What is the entropy of the seven-symbol sequence?
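A brute-force check of this example (my own sketch): the seventh symbol is a deterministic function of the first six, so the 2^6 = 64 equally likely sequences carry H = log2 64 = 6 bits rather than 7.

import math
from itertools import product

sequences = set()
for bits in product([0, 1], repeat=6):     # the 64 equally likely six-symbol messages
    parity = sum(bits) % 2                 # seventh symbol: modulo-2 sum of the six
    sequences.add(bits + (parity,))
# the 64 seven-symbol sequences are equally likely, so H = log2(number of sequences)
print(math.log2(len(sequences)))           # 6.0 bits for the whole seven-symbol sequence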

67 §1 Entropy and mutual information
§1.1 Discrete random variables
§1.2 Discrete random vectors
§1.2.1 Extended source and joint entropy
§1.2.2 Extended channel and mutual information

68 §1.2.2 Extended channel and mutual information
1. The model of the extended channel. A general communication system: source (U1, U2, …, Uk) → source encoder → XN = (X1, X2, …, XN) → channel → YN = (Y1, Y2, …, YN) → decoder → (V1, V2, …, Vk).

69 §1.2.2 Extended channel and mutual information
1. The model of extended channel Extended channel

70 §1.2.2 Extended channel and mutual information
1. The model of extended channel

71 §1.2.2 Extended channel and mutual information
2. Average mutual information. Example 1.2.5

72 §1.2.2 Extended channel and mutual information
3. The properties. Theorem 1.11 If the components (X1, X2, …, XN) of XN are independent, then I(XN; YN) ≥ Σi I(Xi; Yi). (Theorem 1.8 in textbook)

73 §1.2.2 Extended channel and mutual information
3. The properties. Theorem 1.12 If XN = (X1, X2, …, XN) and YN = (Y1, Y2, …, YN) are random vectors and the channel is memoryless, that is p(yN|xN) = Πi p(yi|xi), then I(XN; YN) ≤ Σi I(Xi; Yi). (Theorem 1.9 in textbook)

74 §1.2.2 Extended channel and mutual information
Example 1.2.6 Let X1, X2, …, X5 be independent, identically distributed random variables with common entropy H. Also let T be a permutation of the set {1, 2, 3, 4, 5}, and let Yi = XT(i), i = 1, 2, 3, 4, 5. Show that H(Y1Y2…Y5) = H(X1X2…X5) = 5H.

75 Measure of information
Review Keywords: vector, extended source, stationary source, extended channel; measure of information: joint entropy, (per symbol) entropy, conditional entropy, entropy rate.

76 conditioning reduces entropy Independence bound on entropy
Review Conclusions: conditioning reduces entropy; chain rule for entropy; independence bound on entropy; properties of …

77 Homework P47: T1.23, P47: T1.24,

78 Homework
4. Let X1, X2 be identically distributed random variables. Let … be defined as on the slide: 1) show that …; 2) when …; 3) when …

79 Homework (Thinking): 5. Shuffles increase entropy. Argue that, for any distribution on shuffles T and any distribution on card positions X, H(TX) ≥ H(TX|T) = H(X) if X and T are independent, so a random shuffle can only increase the entropy of the card positions.

