1 COT 5611 Operating Systems Design Principles Spring 2012
Dan C. Marinescu
Office: HEC 304
Office hours: M, Wd 5:00-6:00 PM

2 Lecture 16 – Monday, March 12, 2012
Reading assignment:
Chapter 8 from the on-line text
Claude Shannon's paper

3 Today: Information Theory
Information theory - a statistical theory of communication
Random variables, probability density functions (PDF), cumulative distribution functions (CDF)
Thermodynamic entropy
Shannon entropy

4 Continuous and discrete random variables, pdf, cdf
Given a discrete random variable X which takes the values xi with probability pi, 1 ≤ i ≤ n:
the expected value is E[X] = ∑i xi pi
the variance is Var(X) = ∑i pi (xi − E[X])^2
Given a continuous random variable X which takes values x in some interval I:
the probability density function is fX(x) = dFX(x)/dx
the cumulative distribution function is FX(x) = ∫ fX(t) dt, the integral taken over t ≤ x
the expected value is E[X] = ∫I x fX(x) dx
the variance is Var(X) = ∫I (x − E[X])^2 fX(x) dx
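A minimal Python sketch, added here as an illustration (the values and probabilities below are made up, not from the lecture), of the expected value and variance of a discrete random variable:

# Expected value and variance of a discrete random variable
# (illustrative values, not from the lecture).
values = [1, 2, 3, 4]
probs  = [0.1, 0.2, 0.3, 0.4]          # must sum to 1

expected = sum(x * p for x, p in zip(values, probs))
variance = sum(p * (x - expected) ** 2 for x, p in zip(values, probs))

print(f"E[X]   = {expected:.3f}")      # 3.000
print(f"Var(X) = {variance:.3f}")      # 1.000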

5 Joint and conditional probability density functions
Discrete random variables X and Y:
Joint probability density function pXY(x,y): pXY(xi, yj) is the probability that X=xi and, at the same time, Y=yj
Conditional probability density function of X given Y, pX|Y(x|y): pX|Y(xi|yj) is the probability that X=xi given that Y=yj
Continuous random variables X and Y:
Joint probability density function pXY(x,y): the density of the event that X=x and, at the same time, Y=y
Conditional probability density function pX|Y(x|y): the density of X=x given that Y=y, with pX|Y(x|y) = pXY(x,y)/pY(y)
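A short Python sketch, added as an illustration with made-up joint probabilities, showing how a conditional distribution is obtained from a joint distribution of two discrete random variables:

# Conditional pmf p_X|Y(x|y) = p_XY(x,y) / p_Y(y) for discrete X, Y
# (joint probabilities below are illustrative, not from the lecture).
joint = {                      # p_XY(x, y)
    (0, 0): 0.10, (0, 1): 0.30,
    (1, 0): 0.20, (1, 1): 0.40,
}

# Marginal p_Y(y) = sum over x of p_XY(x, y)
p_y = {}
for (x, y), p in joint.items():
    p_y[y] = p_y.get(y, 0.0) + p

# Conditional p_X|Y(x|y)
cond = {(x, y): p / p_y[y] for (x, y), p in joint.items()}

print(cond[(1, 1)])            # p_X|Y(1|1) = 0.40 / 0.70 ≈ 0.571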

6 Normal distribution
probability density function: fX(x) = (1 / (σ √(2π))) exp(−(x − μ)^2 / (2σ^2))
cumulative distribution function: FX(x) = (1/2) [1 + erf((x − μ) / (σ √2))]
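A small Python sketch, added for illustration, evaluating the normal pdf and cdf directly from these formulas; the default μ and σ are arbitrary example values:

import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal probability density function."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

print(normal_pdf(0.0))   # ≈ 0.3989 for the standard normal
print(normal_cdf(0.0))   # = 0.5    for the standard normal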

7 Exponential distribution
probability density function: fX(x) = λ e^(−λx) for x ≥ 0 (and 0 for x < 0)
cumulative distribution function: FX(x) = 1 − e^(−λx) for x ≥ 0
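Another short illustrative Python sketch (not from the slides) for the exponential distribution; the rate λ used below is an arbitrary example value:

import math

def exp_pdf(x, lam=2.0):
    """Exponential probability density function with rate lam."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def exp_cdf(x, lam=2.0):
    """Exponential cumulative distribution function with rate lam."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0

print(exp_pdf(0.5))   # 2 * e^(-1) ≈ 0.7358
print(exp_cdf(0.5))   # 1 - e^(-1) ≈ 0.6321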

8 Information theory
The statistical theory of communication introduced by Claude Shannon in 1949 answers fundamental questions:
How much information can be generated by an agent acting as a source of information?
How much information can be squeezed through a channel?
Communication model: sender – communication channel – receiver; the sender and the receiver share a common alphabet.
Entropy – the amount of uncertainty in a system. An important concept for any physical system. Do not forget that information is physical!
Thermodynamic entropy: related to the number of microstates of the system, S = kB ln Ω, where kB is the Boltzmann constant and Ω is the number of microstates.
Shannon entropy: measures the quantity of information necessary to remove the uncertainty. Used to measure the quantity of information a source can produce.
The two are related: the Shannon entropy represents the number of bits required to label the individual microstates, as we can see on the next slide.
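A brief Python sketch, added here as an illustration (the function name and the value of Ω are mine, not the lecture's), of the relation just stated: for a system with Ω equally likely microstates the Shannon entropy is log2 Ω bits, exactly the number of bits needed to label each microstate.

import math

def shannon_entropy_bits(probs):
    """Shannon entropy H = -sum p log2 p, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A system with Omega = 8 equally likely microstates (illustrative value).
omega = 8
uniform = [1.0 / omega] * omega

print(shannon_entropy_bits(uniform))   # 3.0 bits
print(math.log2(omega))                # 3.0 bits: label each microstate with 3 bits (000..111)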

9 [Figure: labeling the individual microstates of a system with binary strings; the Shannon entropy gives the number of bits required.]

10 Shannon's entropy
Consider an event which happens with probability p; we wish to quantify the information content of a message communicating the occurrence of this event. The measure should reflect the "surprise" brought by the occurrence of this event.
An initial guess for a measure of this surprise would be 1/p: the lower the probability of the event, the larger the surprise.
But the surprise should be additive. If an event is composed of two independent events which occur with probabilities q and r, then the probability of the event should be p = qr, but we see that 1/p = 1/(qr) ≠ 1/q + 1/r.
If the surprise is measured by the logarithm of 1/p, then additivity is obeyed: log(1/p) = log(1/q) + log(1/r).
The entropy of a random variable X with a probability density function pX(x) is H(X) = −∑x pX(x) log pX(x).
Example: if X is a binary random variable, x ∈ {0,1}, and p = pX(x = 1), then H(p) = −p log p − (1 − p) log(1 − p).
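A minimal Python sketch, added for illustration only, of the binary entropy function defined above; logarithms are taken base 2, so the entropy is in bits.

import math

def binary_entropy(p):
    """H(p) = -p log2 p - (1-p) log2 (1-p), the entropy of a biased coin."""
    if p in (0.0, 1.0):
        return 0.0            # no uncertainty when the outcome is certain
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

print(binary_entropy(0.5))    # 1.0 bit: a fair coin is maximally uncertain
print(binary_entropy(0.1))    # ≈ 0.469 bits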

11 Example: p1=1/2, p2=1/4, p3=1/8, p4=1/16, p5–p8=1/64
Eight cars compete in several Formula 1 races. The probabilities of winning, calculated from the past race history of the eight cars, are: p1=1/2, p2=1/4, p3=1/8, p4=1/16, p5–p8=1/64.
To send a binary message revealing the winner of a particular race we could encode the identity of the winning car in several ways. For example, we can use an "obvious" encoding scheme: the identities of the eight cars could be encoded using three bits, the binary representations of the integers 0-7, namely 000, 001, 010, 011, 100, 101, 110, 111, respectively. In this case the average length of the string used to communicate the winner of any race is l=3 bits.
The cars have different probabilities of winning a race, so it makes sense to assign a shorter string to a car which has a higher probability of winning. Thus, a better encoding of the identities is: 0, 10, 110, 1110, 111100, 111101, 111110, 111111. In this case the corresponding lengths of the strings encoding the identity of each car are 1, 2, 3, 4, 6, 6, 6, 6, for an average of l=2 bits. Note that we have computed the average as l=∑i pi li.
This optimal encoding is possible because the Shannon entropy of the winning probabilities is exactly 2 bits.
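A short Python sketch, added as an illustration, checking the numbers in this example: the entropy of the winning probabilities and the average codeword length of the two encodings (the four 6-bit codewords above are the natural completion of the prefix code whose lengths the slide gives).

import math

# Winning probabilities of the eight cars (from the example above).
probs = [1/2, 1/4, 1/8, 1/16, 1/64, 1/64, 1/64, 1/64]

# Shannon entropy H = -sum p log2 p
entropy = -sum(p * math.log2(p) for p in probs)

# Average length of the fixed 3-bit encoding vs. the variable-length code.
fixed_lengths    = [3] * 8
variable_lengths = [1, 2, 3, 4, 6, 6, 6, 6]   # lengths of 0, 10, 110, 1110, 111100, ...

avg_fixed    = sum(p * l for p, l in zip(probs, fixed_lengths))
avg_variable = sum(p * l for p, l in zip(probs, variable_lengths))

print(entropy)       # 2.0 bits
print(avg_fixed)     # 3.0 bits
print(avg_variable)  # 2.0 bits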

