
§1 Entropy and mutual information


1 §1 Entropy and mutual information
§1.1 Discrete random variables
§1.2 Discrete random vectors
§1.1.1 Discrete memoryless source and entropy
§1.1.2 Discrete memoryless channel and mutual information

2 §1.1.1 Discrete memoryless source and entropy
1. DMS (Discrete Memoryless Source) Probability space. Example 1.1.1 Let X represent the outcome of a single roll of a fair die; each of the six faces occurs with probability 1/6.
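The probability-space array itself did not survive the transcript; for this fair-die example a standard way to write it (the [X; P(X)] array form common in information-theory texts, an assumption on my part) is:

\begin{bmatrix} X \\ P(X) \end{bmatrix}
=
\begin{bmatrix}
1 & 2 & 3 & 4 & 5 & 6 \\
\tfrac{1}{6} & \tfrac{1}{6} & \tfrac{1}{6} & \tfrac{1}{6} & \tfrac{1}{6} & \tfrac{1}{6}
\end{bmatrix}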

3 §1.1.1 Discrete memoryless source and entropy
2. Self information. Example 1.1.2 Two bags of balls: bag X contains red and white balls; bag Y contains red, white, blue and black balls. Analyse the uncertainty of a red ball being selected from X and from Y. (Blackboard: Example 2, ball drawing and word guessing, single and double guess. The function I(ai) = f[P(ai)] should satisfy the conditions listed on the next slide.)

4 §1.1.1 Discrete memoryless source and entropy
2. Self information. I(ai) = f[p(ai)] should satisfy:
1) I(ai) is a monotone decreasing function of p(ai): if p(a1) > p(a2), then I(a1) < I(a2);
2) if p(ai) = 1, then I(ai) = 0;
3) if p(ai) = 0, then I(ai) → ∞;
4) if p(ai aj) = p(ai) p(aj), then I(ai aj) = I(ai) + I(aj).
These conditions lead to the logarithmic measure I(ai) = −log p(ai).

5 §1.1.1 Discrete memoryless source and entropy
Self information: I(ai) = −log p(ai), measured in bits (log base 2), nats (base e) or hartleys (base 10). Remark: I(ai) decreases as p(ai) increases and equals 0 at p(ai) = 1; it measures both the uncertainty about the event ai before it occurs and the amount of information provided when ai is observed; and if a and b are statistically independent, then I(ab) = I(a) + I(b).
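A minimal Python sketch (function and variable names are my own, not from the slides) of I(a) = −log p(a) in the three units:

import math

def self_information(p, base=2.0):
    """Self-information -log_base(p) of an event with probability p."""
    return -math.log(p) / math.log(base)

p = 1 / 6                              # e.g. one face of the fair die in Example 1.1.1
print(self_information(p, 2))          # ≈ 2.585 bits
print(self_information(p, math.e))     # ≈ 1.792 nats
print(self_information(p, 10))         # ≈ 0.778 hartleys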

6 §1.1.1 Discrete memoryless source and entropy
Definition: Suppose X is a discrete random variable whose range R = {a1, a2, …} is finite or countable. Let p(ai) = P{X = ai}. The entropy of X is defined by H(X) = −Σi p(ai) log p(ai). It is the average measure of the uncertainty (or randomness) about X, and equally a measure of the average amount of information provided by an observation of X.

7 §1.1.1 Discrete memoryless source and entropy
Entropy: the amount of “information” provided by an observation of X. Example 1.1.3 A bag holds 100 balls: 80 are red and the remaining 20 are white. One ball is drawn at random. How much information does each drawing provide on average? Let X represent the colour of the ball, a1 = red, a2 = white. Then H(X) = H(0.8, 0.2) = 0.722 bit/sig.
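A small Python sketch (names are mine) that reproduces the 0.722 bit/sig figure:

import math

def entropy(probs, base=2):
    """H(X) = -sum p*log(p), with the convention 0*log(0) = 0."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# Example 1.1.3: a1 = red with probability 0.8, a2 = white with probability 0.2
print(entropy([0.8, 0.2]))   # ≈ 0.722 bit/sig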

8 §1.1.1 Discrete memoryless source and entropy
Entropy: the average “uncertainty” or “randomness” about X. Example 1.1.4

9 §1.1.1 Discrete memoryless source and entropy
Note: 1) Units: bit/sig, nat/sig, hart/sig. 2) If p(ai) = 0, we take p(ai) log p(ai)^(-1) = 0 (the convention 0·log 0 = 0). 3) If R is infinite, H(X) may be +∞ (for instance, probabilities proportional to 1/(k log² k), k ≥ 2, sum to a finite value but give infinite entropy).

10 §1.1.1 Discrete memoryless source and entropy
Example: entropy of a BS (binary source). Viewed as a function of the probability vector (p, 1 − p), the entropy is the entropy function H(p) = −p log p − (1 − p) log(1 − p).
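A sketch of this entropy function in Python (the helper name is my own):

import math

def binary_entropy(p):
    """Entropy of a binary source with probability vector (p, 1 - p), in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(binary_entropy(0.5))    # 1.0 bit: the maximum, reached at p = 1/2
print(binary_entropy(0.11))   # ≈ 0.5 bit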

11 4. The properties of entropy
§1.1.1 Discrete memoryless source and entropy 4. The properties of entropy. Theorem 1.1 Let X assume values in R = {x1, x2, …, xr} with probabilities pi. (Theorem 1.1 in textbook) 1) Non-negativity: H(X) ≥ 0. 2) Certainty: H(X) = 0 iff pi = 1 for some i. 3) Extremum: H(X) ≤ log r, with equality iff pi = 1/r for all i; this is the basis of data compression. Proof:
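A quick numerical illustration of property 3 (the non-uniform distribution below is an arbitrary choice of mine):

import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

r = 4
p = [0.4, 0.3, 0.2, 0.1]        # an arbitrary distribution on r = 4 symbols
print(entropy(p))               # ≈ 1.846 bit, strictly below log2(r)
print(entropy([1 / r] * r))     # = 2.0 bit = log2(4), attained only by the uniform distribution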

12 §1.1.1 Discrete memoryless source and entropy
4. The properties of entropy. 4) Symmetry: H(p1, p2, …, pr) is unchanged under any permutation of (p1, p2, …, pr). Example 1.1.6 Let X, Y, Z all be discrete random variables:

13 §1.1.1 Discrete memoryless source and entropy
4. The properties of entropy. 5) Additivity: if X and Y are independent, then H(XY) = H(X) + H(Y). Proof:

14 §1.1.1 Discrete memoryless source and entropy
Proof (continued): for the joint source, independence gives p(xy) = p(x)p(y), so H(XY) = −Σx,y p(x)p(y)[log p(x) + log p(y)] = H(X) + H(Y).
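A numerical check of the additivity property under the stated independence assumption; the two distributions below are my own choices:

import math
from itertools import product

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

px = [0.5, 0.3, 0.2]
py = [0.7, 0.3]
pxy = [a * b for a, b in product(px, py)]   # independence: p(xy) = p(x)p(y)
print(entropy(pxy))                         # H(XY) ≈ 2.367 bit
print(entropy(px) + entropy(py))            # H(X) + H(Y), the same value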

15 §1.1.1 Discrete memoryless source and entropy
4. The properties of entropy. 6) Convexity. Theorem 1.2 The entropy function H(p1, p2, …, pr) is a concave (∩-convex) function of the probability vector (p1, p2, …, pr). [Figure: the entropy of a BS, H(p), rising from 0 at p = 0 to its maximum of 1 bit at p = 1/2 and back to 0 at p = 1.] Example (continued): entropy of a BS.

16 §1.1.1 Discrete memoryless source and entropy
5. Conditional entropy. Definition: Let X, Y be a pair of random variables with (X, Y) ~ p(x, y). The conditional entropy of X given Y is defined by H(X|Y) = −Σx,y p(xy) log p(x|y) = Σy p(y) H(X|Y = y).
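A sketch of this definition in Python, for a joint pmf stored as a dictionary {(x, y): p(x, y)} (the representation and names are mine):

import math

def conditional_entropy(joint):
    """H(X|Y) = -sum_{x,y} p(x,y) * log2 p(x|y)."""
    py = {}
    for (x, y), p in joint.items():
        py[y] = py.get(y, 0.0) + p          # marginal p(y)
    return -sum(p * math.log2(p / py[y]) for (x, y), p in joint.items() if p > 0)

# two independent fair bits: observing Y leaves the full 1 bit of uncertainty about X
joint = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
print(conditional_entropy(joint))           # 1.0 bit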

17 §1.1.1 Discrete memoryless source and entropy
5. conditional entropy Analyse:

18 §1.1.1 Discrete memoryless source and entropy
5. Conditional entropy. Example 1.1.7 A binary source with pX(0) = 2/3, pX(1) = 1/3 is sent through an erasure-type channel: input 0 is received as 0 with probability 3/4 and as ? with probability 1/4; input 1 is received as 1 with probability 1/2 and as ? with probability 1/2. Then H(X) = H(2/3, 1/3) ≈ 0.918 bit/sig; H(X|Y=0) = 0; H(X|Y=1) = 0; H(X|Y=?) = H(1/2, 1/2) = 1 bit/sig; and H(X|Y) = P(Y=?) · 1 = 1/3 bit/sig.
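A self-contained check of these numbers (the channel layout above is my reading of the garbled slide):

import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

# joint p(x, y) for pX = (2/3, 1/3) and the erasure-type channel described above
joint = {(0, '0'): 2/3 * 3/4, (0, '?'): 2/3 * 1/4,
         (1, '1'): 1/3 * 1/2, (1, '?'): 1/3 * 1/2}
py = {}
for (x, y), p in joint.items():
    py[y] = py.get(y, 0.0) + p
h_x_given_y = -sum(p * math.log2(p / py[y]) for (_, y), p in joint.items())
print(entropy([2/3, 1/3]))    # H(X)   ≈ 0.918 bit/sig
print(h_x_given_y)            # H(X|Y) = 1/3 bit/sig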

19 5. conditional entropy Theorem1.3
§1.1.1 Discrete memoryless source and entropy 5. Conditional entropy. Theorem 1.3 (conditioning reduces entropy): H(X|Y) ≤ H(X), with equality iff X and Y are independent. Proof:

20 Measure of information
Review KeyWords (measure of information): self information, entropy, properties of entropy, conditional entropy.

Homework P44: T1.1, P44: T1.4, P44: T1.6. 4. Let X be a random variable taking on a finite number of values. What is the relationship between H(X) and H(Y) if (1) Y = 2X? (2) Y = cos X?

22 Homework

23 Homework 6. Consider a chessboard with 8×8 = 64 squares. A chessman is placed in a square at random, and we must guess its location. Find the uncertainty of the result. If every square is labelled by its row and column number and the row number of the chessman is already known, what is the remaining uncertainty?

24 Homework (thinking): Coin flip. A fair coin is flipped until the first head occurs. Let X denote the number of flips required. Find the entropy H(X) in bits. Hint: the identities Σn≥1 r^n = r/(1−r) and Σn≥1 n·r^n = r/(1−r)² are useful.

25 §1 Entropy and mutual information
§1.1 Discrete random variables
§1.2 Discrete random vectors
§1.1.1 Discrete memoryless source and entropy
§1.1.2 Discrete memoryless channel and mutual information

26 §1.1.2 Discrete memoryless channel and mutual information
[Channel model diagram: input X, transition probabilities p(y|x), output Y.]

27 §1.1.2 Discrete memoryless channel and mutual information
1. DMC (Discrete Memoryless Channel). The model of a DMC: r input symbols {0, 1, …, r−1} and s output symbols {0, 1, …, s−1}, connected by transition probabilities p(y|x).

28 1. DMC (Discrete Memoryless Channel)
§1.1.2 Discrete memoryless channel and mutual information 1. DMC (Discrete Memoryless Channel). Representation of a DMC by a graph: each input x is joined to each output y by an edge labelled with the transition probability p(y|x), where p(y|x) ≥ 0 for all x, y and Σy p(y|x) = 1 for all x.

29 1. DMC (Discrete Memoryless Channel)
§1.1.2 Discrete memoryless channel and mutual information 1. DMC (Discrete Memoryless Channel). Representation of a DMC by the transition probability matrix: the r × s matrix [p(y|x)], each of whose rows sums to 1.

30 §1.1.2 Discrete memoryless channel and mutual information
1. DMC (Discrete Memoryless Channel) representation of DMC formula

31 §1.1.2 Discrete memoryless channel and mutual information
1. DMC (Discrete Memoryless Channel). Example 1.1.8: BSC (Binary Symmetric Channel), r = s = 2, with p(0|0) = p(1|1) = 1 − p and p(0|1) = p(1|0) = p.

32 §1.1.2 Discrete memoryless channel and mutual information
1. DMC (Discrete Memoryless Channel). Example 1.1.9: BEC (Binary Erasure Channel). [Channel diagram; parameters on the next slide.]

33 §1.1.2 Discrete memoryless channel and mutual information
1. DMC (Discrete Memoryless Channel). Example 1.1.9: BEC (Binary Erasure Channel), r = 2, s = 3, with p(0|0) = p, p(?|0) = 1 − p, p(1|1) = q, p(?|1) = 1 − q.

34 2. average mutual information
§1.1.2 Discrete memoryless channel and mutual information 2. Average mutual information. Definition: the reduction in uncertainty about X conveyed by the observation of Y, i.e. the information about X obtained from Y. For a channel with transition probabilities p(y|x): H(X) is the entropy of the input, H(X|Y) is the equivocation (the uncertainty about X that remains after Y is observed), and the average mutual information is I(X;Y) = H(X) − H(X|Y).

35 §1.1.2 Discrete memoryless channel and mutual information
2. Average mutual information. Definition: I(X;Y) = H(X) − H(X|Y) = Σx,y p(xy) log [ p(x|y) / p(x) ].
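A sketch of I(X;Y) computed directly from a joint pmf (the dictionary representation and names are mine). Applied to my reading of the joint distribution of Example 1.1.7, it gives H(X) − H(X|Y) ≈ 0.918 − 0.333 ≈ 0.585 bit:

import math

def mutual_information(joint):
    """I(X;Y) = sum_{x,y} p(x,y) * log2[ p(x,y) / (p(x) p(y)) ]."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

joint = {(0, '0'): 1/2, (0, '?'): 1/6, (1, '1'): 1/6, (1, '?'): 1/6}
print(mutual_information(joint))   # ≈ 0.585 bit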

36 §1.1.2 Discrete memoryless channel and mutual information
2. Average mutual information. Definition: the mutual information of a pair of symbols is I(x;y) = log [ p(x|y) / p(x) ], and the average mutual information is its expectation, I(X;Y) = EXY[ I(x;y) ]. The relation between I(X;Y) and H(X) is taken up on the following slides.

37 2. Average mutual information
§1.1.2 Discrete memoryless channel and mutual information 2. Average mutual information. Properties. 1) Non-negativity. Theorem 1.4 For any discrete random variables X and Y, I(X;Y) ≥ 0; moreover I(X;Y) = 0 iff X and Y are independent. (Theorem 1.3 in textbook) In other words, on average we are never misled by observing the output of the channel. Proof:

38 §1.1.2 Discrete memoryless channel and mutual information
2. Average mutual information. Properties: the total-loss case. [Diagram of a cryptosystem: source S, encrypt (with key), channel, decrypt, destination D, and a listener-in tapping X → Y on the channel.] For the listener-in an ideal cipher behaves as a total-loss channel: observing Y gives (ideally) no information about the message, I(X;Y) = 0. Example, Caesar cryptography: message "arrive at four", ciphertext "duulyh dw irxu".

39 2. Average mutual information
§1.1.2 Discrete memoryless channel and mutual information 2. Average mutual information. Properties. 2) Symmetry: I(X;Y) = I(Y;X). 3) Relationship between entropy and average mutual information (mnemonic Venn diagram with regions H(X|Y), I(X;Y), H(Y|X) inside H(X), H(Y) and the joint entropy H(XY)): I(X;Y) = H(X) − H(X|Y) = H(Y) − H(Y|X) = H(X) + H(Y) − H(XY).

40 §1.1.2 Discrete memoryless channel and mutual information
2. Average mutual information. Properties: recognising channels. [Channel diagrams: a noiseless channel matching inputs a1 … ar one-to-one with outputs b1 … br; a channel whose inputs a1, a2 fan out to outputs b1 … b5 with probabilities such as 1/2, 1/5, 2/5; and a deterministic channel mapping inputs a1, a2, a3 onto outputs b1, b2.]

41 I(X;Y)=f [P(x),P(y|x)]
§1.1.2 Discrete memoryless channel and mutual information 2. Average mutual information. Properties. 4) Convexity: I(X;Y) = f[P(x), P(y|x)] is a function of both the input distribution P(x) and the channel transition probabilities P(y|x).

42 I(X;Y)=f [P(x),P(y|x)]
§1.1.2 Discrete memoryless channel and mutual information 2. Average mutual information. Properties. 4) Convexity: I(X;Y) = f[P(x), P(y|x)]. Theorem 1.5 For fixed transition probabilities P(y|x), I(X;Y) is a concave (∩-convex) function of the input probabilities P(x). (Theorem 1.6 in textbook) Theorem 1.6 For a fixed input distribution P(x), I(X;Y) is a convex (∪-convex) function of the transition probabilities P(y|x). (Theorem 1.7 in textbook)

43 §1.1.2 Discrete memoryless channel and mutual information
2. Average mutual information. Example: analyse I(X;Y) of the BSC. Source: an input distribution, written here as (ω, 1 − ω); channel: BSC with crossover probability p.

44 §1.1.2 Discrete memoryless channel and mutual information
2. Average mutual information. Example: analyse I(X;Y) of the BSC (derivation on the blackboard).

45 §1.1.2 Discrete memoryless channel and mutual information
2. Average mutual information. Example: analyse I(X;Y) of the BSC (result). With input distribution (ω, 1 − ω) and crossover probability p, I(X;Y) = H(Y) − H(Y|X) = H(ω(1 − p) + (1 − ω)p) − H(p); for fixed p it is largest at ω = 1/2, where it equals 1 − H(p).
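A short Python sketch of this expression (variable names are my own):

import math

def h2(p):
    """Binary entropy function in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_mutual_information(w, p):
    """I(X;Y) of a BSC with input distribution (w, 1 - w) and crossover probability p."""
    return h2(w * (1 - p) + (1 - w) * p) - h2(p)

print(bsc_mutual_information(0.5, 0.1))   # 1 - H(0.1) ≈ 0.531 bit, the maximum over w
print(bsc_mutual_information(0.9, 0.1))   # ≈ 0.211 bit for a skewed input distribution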

46 Review KeyWords (channel and its information measure): channel model, equivocation, average mutual information, mutual information, properties of average mutual information.

47 §1.1.2 Discrete memoryless channel and mutual information
Thinking

48 §1.1.2 Discrete memoryless channel and mutual information
Example [Encoder diagram: the input a(t) drives two D flip-flops forming a shift register; the three lines b0, b1, b2 make up one output symbol.] Let the source have alphabet A = {0,1} with p0 = p1 = 0.5. Let encoder C have alphabet B = {0, 1, …, 7} and let the elements of B have the binary representation shown on the slide. The encoder is shown below. Find the entropy of the coded output, and find the output sequence if the input sequence is a(t) = { } and the initial contents of the registers are as given, where the "addition" blocks are modulo-2 adders (i.e., exclusive-or gates).

49 §1.1.2 Discrete memoryless channel and mutual information
[State table for the encoder: current state Yt and next state Yt+1, states numbered 1 to 7.] a(t) = { }, b = { }

50 Homework
Homework P45: T1.10, P46: T1.19 (except c). 3. Let the DMS convey messages through a channel (source and channel as specified). Calculate: H(X) and H(Y); the mutual information I(xi; yj) (i, j = 1, 2); the equivocation H(X|Y) and the average mutual information I(X;Y).

51 Homework 4. Suppose that I(X;Y) = 0. Does this imply that I(X;Z) = I(X;Z|Y)? 5. In a joint ensemble XY, the mutual information I(x;y) is a random variable. In this problem we are concerned with the variance of that random variable, VAR[I(x;y)]. Prove that VAR[I(x;y)] = 0 iff there is a constant α such that, for all x, y with P(xy) > 0, P(xy) = αP(x)P(y). Express I(X;Y) in terms of α and interpret the special case α = 1. (continued)

52 Homework 5. (3) For each of the channels in Fig. 5, find a probability assignment P(x) such that I(X;Y) > 0 and VAR[I(x;y)] = 0, and calculate I(X;Y). [Fig. 5: two channel diagrams with inputs a1, a2, a3 and outputs b1, b2 (and b3), with transition probabilities such as 1 and 1/2.]

53 §1 Entropy and mutual information
§1.1 Discrete random variables
§1.2 Discrete random vectors
§1.2.1 Extended source and joint entropy
§1.2.2 Extended channel and mutual information

54 §1.2.1 Extended source and joint entropy
Source model. Example 1.2.1 The N-times extended source. Example: the second extension (N = 2) of a binary source.

55 2. Joint entropy Definition:
§1.2.1 Extended source and joint entropy 2. Joint entropy. Definition: The joint entropy H(XY) of a pair of discrete random variables (X, Y) with a joint distribution p(x,y) is defined as H(XY) = −Σx,y p(xy) log p(xy), which can also be expressed as H(XY) = −E[ log p(X,Y) ].

56 §1.2.1 Extended source and joint entropy
Extended DMS

57 §1.2.1 Extended source and joint entropy
A source with memory: 1) conditional entropy, 2) joint entropy, 3) (per symbol) entropy, each measured in bit/sig.

58 §1.2.1 Extended source and joint entropy
3. Properties of joint entropy. Theorem 1.7 (Chain rule): H(XY) = H(X) + H(Y|X). Proof:
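A numerical check of the chain rule on an arbitrary joint pmf of my own choosing:

import math

def H(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

joint = {('a', 0): 0.3, ('a', 1): 0.2, ('b', 0): 0.1, ('b', 1): 0.4}
px = {}
for (x, y), p in joint.items():
    px[x] = px.get(x, 0.0) + p                      # marginal p(x)
h_xy = H(joint.values())                            # H(XY)
h_y_given_x = -sum(p * math.log2(p / px[x]) for (x, y), p in joint.items())
print(h_xy)                                         # ≈ 1.846 bit
print(H(px.values()) + h_y_given_x)                 # H(X) + H(Y|X), the same value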

59 §1.2.1 Extended source and joint entropy
3. Properties of joint entropy. Example 1.2.3 Let X be a random variable. [Slide shows its probability space and the joint probability table of two successive symbols (X1, X2), with entries including 1/4, 1/24 and 1/8.] Find H(X), the conditional entropy, the joint entropy and the (per symbol) entropy.

60 §1.2.1 Extended source and joint entropy
3. Properties of joint entropy. Relationships (X1 and X2 are two successive symbols of the same source, so they share one marginal distribution): H(X2) ≥ H(X2|X1), and H(X1X2) = H(X1) + H(X2|X1) ≤ 2H(X1).

61 §1.2.1 Extended source and joint entropy
3. Properties of joint entropy. General stationary source: let X1, X2, …, XN be dependent; the joint probability is p(x1 x2 … xN) = p(x1) p(x2|x1) ⋯ p(xN|x1 … xN−1).

62 §1.2.1 Extended source and joint entropy
3. Properties of joint entropy. Definition of entropies: the conditional entropy H(XN|X1X2…XN−1); the joint entropy H(X1X2…XN); and the (per symbol) entropy HN(X) = H(X1X2…XN)/N.

63 3. Properties of joint entropy
§1.2.1 Extended source and joint entropy 3. Properties of joint entropy. Theorem 1.8 (Chain rule for entropy): Let X1, X2, …, Xn be drawn according to p(x1, x2, …, xn). Then H(X1X2…Xn) = Σi H(Xi|X1 … Xi−1). Proof: do it by yourself.

64 §1.2.1 Extended source and joint entropy ——base of data compressing
3. Properties of joint entropy. Relation of entropies: if H(X1) < ∞, then for a stationary source H(XN|X1 … XN−1) ≤ HN(X) ≤ HN−1(X) ≤ H(X1), and as N → ∞ both the conditional entropy and the per-symbol entropy decrease to the same limit H∞(X), the entropy rate, which is the basis of data compression.

65 3. Properties of joint entropy
§1.2.1 Extended source and joint entropy 3. Properties of joint entropy. Theorem 1.9 (Independence bound on entropy): Let X1, X2, …, Xn be drawn according to p(x1, x2, …, xn). Then H(X1X2…Xn) ≤ Σi H(Xi), with equality iff the Xi are independent. (p. 37, corollary, in textbook)

66 §1.2.1 Extended source and joint entropy
3. Properties of joint entropy. Example 1.2.4 Suppose a memoryless source with A = {0, 1} and equal probabilities emits a sequence of six symbols. Following the sixth symbol, a seventh symbol is transmitted which is the sum modulo 2 of the six previous symbols. What is the entropy of the seven-symbol sequence?
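A brute-force check of this example (my own sketch): the seventh symbol is a deterministic function of the first six, so the 2^6 = 64 equally likely sequences carry H = log2 64 = 6 bits rather than 7.

import math
from itertools import product

sequences = set()
for bits in product([0, 1], repeat=6):     # the 64 equally likely six-symbol messages
    parity = sum(bits) % 2                 # seventh symbol: modulo-2 sum of the six
    sequences.add(bits + (parity,))
# the 64 seven-symbol sequences are equally likely, so H = log2(number of sequences)
print(math.log2(len(sequences)))           # 6.0 bits for the whole seven-symbol sequence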

67 §1 Entropy and mutual information
§1.1 Discrete random variables
§1.2 Discrete random vectors
§1.2.1 Extended source and joint entropy
§1.2.2 Extended channel and mutual information

68 §1.2.2 Extended channel and mutual information
1. The model of the extended channel. A general communication system: source (U1, U2, …, Uk) → source encoder → XN = (X1, X2, …, XN) → channel → YN = (Y1, Y2, …, YN) → decoder → (V1, V2, …, Vk).

69 §1.2.2 Extended channel and mutual information
1. The model of extended channel Extended channel

70 §1.2.2 Extended channel and mutual information
1. The model of extended channel

71 §1.2.2 Extended channel and mutual information
2. Average mutual information. Example 1.2.5

72 §1.2.2 Extended channel and mutual information
3. The properties. Theorem 1.11 If the components (X1, X2, …, XN) of XN are independent, then I(XN; YN) ≥ Σi I(Xi; Yi). (Theorem 1.8 in textbook)

73 §1.2.2 Extended channel and mutual information
3. The properties. Theorem 1.12 If XN = (X1, X2, …, XN) and YN = (Y1, Y2, …, YN) are random vectors and the channel is memoryless, that is p(yN|xN) = Πi p(yi|xi), then I(XN; YN) ≤ Σi I(Xi; Yi). (Theorem 1.9 in textbook)

74 §1.2.2 Extended channel and mutual information
Example 1.2.6 Let X1, X2, …, X5 be independent, identically distributed random variables with common entropy H. Also let T be a permutation of the set {1, 2, 3, 4, 5}, and let Yi = XT(i), i = 1, 2, 3, 4, 5. Show that H(Y1Y2…Y5) = H(X1X2…X5) = 5H.

75 Measure of information
Review Keywords: vector, extended source, stationary source, extended channel; measure of information: joint entropy, (per symbol) entropy, conditional entropy, entropy rate.

76 conditioning reduces entropy Independence bound on entropy
Review Conclusions: conditioning reduces entropy; chain rule for entropy; independence bound on entropy; properties of …

77 Homework P47: T1.23, P47: T1.24,

78 Homework
4. Let X1, X2 be identically distributed random variables. Let … be defined as on the slide: 1) show that …; 2) when …; 3) when …

79 Homework (Thinking): 5. Shuffles increase entropy. Argue that, for any distribution on shuffles T and any distribution on card positions X, H(TX) ≥ H(TX|T) = H(X) if X and T are independent, so a random shuffle can only increase the entropy of the card positions.

