
1 Introduction to information theory
LING 572
Fei Xia, Dan Jinguji
Week 1: 1/10/06

2 Today
Information theory
Hw #1
Exam #1

3 Information theory

4 Information theory
Reading: M&S 2.2
It is the use of probability theory to quantify and measure “information”. Basic concepts:
Entropy
Cross entropy and relative entropy
Joint entropy and conditional entropy
Entropy of a language and perplexity
Mutual information

5 Entropy
Entropy is a measure of the uncertainty associated with a distribution. It is also the lower bound on the average number of bits it takes to transmit messages drawn from that distribution.
An example: display the results of horse races.
Goal: minimize the number of bits needed to encode the results.

6 An example
Uniform distribution over the 8 horses: pi = 1/8 for each i. A fixed-length code needs 3 bits per result.
Non-uniform distribution: (1/2, 1/4, 1/8, 1/16, 1/64, 1/64, 1/64, 1/64)
An optimal code: (0, 10, 110, 1110, 111100, 111101, 111110, 111111), averaging 2 bits per result.
The uniform distribution has higher entropy.
MaxEnt: make the distribution as “uniform” as possible.
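
A quick numeric check of the two distributions, as a minimal Python sketch (not part of the original slides):

import math

def entropy(probs):
    # H(p) = -sum_x p(x) * log2 p(x), in bits
    return -sum(p * math.log2(p) for p in probs if p > 0)

uniform = [1/8] * 8
nonuniform = [1/2, 1/4, 1/8, 1/16, 1/64, 1/64, 1/64, 1/64]

print(entropy(uniform))     # 3.0 bits: a fixed-length code needs 3 bits per result
print(entropy(nonuniform))  # 2.0 bits: the variable-length code above averages 2 bits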

7 Cross Entropy
Entropy: H(X) = - Σ_x p(x) log2 p(x)
Cross entropy: H(X, q) = - Σ_x p(x) log2 q(x)
Cross entropy is a distance measure between p(x) and q(x): p(x) is the true distribution; q(x) is our estimate of p(x).
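
A minimal Python sketch of both quantities (an illustration, not from the original slides; the example distributions are made up):

import math

def entropy(p):
    # H(p) = -sum_x p(x) log2 p(x)
    return -sum(px * math.log2(px) for px in p if px > 0)

def cross_entropy(p, q):
    # H(p, q) = -sum_x p(x) log2 q(x); p is the true distribution, q the estimate
    return -sum(px * math.log2(qx) for px, qx in zip(p, q) if px > 0)

p = [0.5, 0.25, 0.25]
print(cross_entropy(p, p))                # 1.5 bits, equal to entropy(p)
print(cross_entropy(p, [1/3, 1/3, 1/3]))  # ~1.585 bits: any q != p costs extra bits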

8 Relative Entropy
Also called Kullback-Leibler (KL) divergence:
D(p || q) = Σ_x p(x) log2 ( p(x) / q(x) )
Another “distance” measure between probability functions p and q.
KL divergence is asymmetric (not a true distance): in general, D(p || q) ≠ D(q || p).
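
A Python sketch of KL divergence that also demonstrates the asymmetry (the distributions are made up for illustration):

import math

def kl_divergence(p, q):
    # D(p||q) = sum_x p(x) log2( p(x) / q(x) ); assumes q(x) > 0 wherever p(x) > 0
    return sum(px * math.log2(px / qx) for px, qx in zip(p, q) if px > 0)

p = [0.9, 0.1]
q = [0.5, 0.5]
print(kl_divergence(p, q))  # ~0.531 bits
print(kl_divergence(q, p))  # ~0.737 bits: D(p||q) != D(q||p)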

9 Reading assignment #1
Read M&S 2.2: Essential Information Theory
Questions: For a random variable X, p(x) and q(x) are two distributions, where p is the true distribution:
p(X=a) = p(X=b) = 1/8, p(X=c) = 1/4, p(X=d) = 1/2
q(X=a) = q(X=b) = q(X=c) = q(X=d) = 1/4
(a) What is H(X)? What is H(X, q)? What is the KL divergence D(p||q)? What is D(q||p)?

10 H(X) and H(X, q)
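
The derivation on this slide was lost in extraction; reconstructing it from the slide-9 distributions:

\begin{align*}
H(X) &= -\sum_x p(x)\log_2 p(x)
      = -\left(2\cdot\tfrac{1}{8}\log_2\tfrac{1}{8} + \tfrac{1}{4}\log_2\tfrac{1}{4} + \tfrac{1}{2}\log_2\tfrac{1}{2}\right)
      = \tfrac{3}{4} + \tfrac{1}{2} + \tfrac{1}{2} = 1.75 \text{ bits} \\
H(X, q) &= -\sum_x p(x)\log_2 q(x)
         = -\log_2\tfrac{1}{4}\cdot\sum_x p(x) = 2 \text{ bits}
\end{align*}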

11 D(p||q)
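
Again reconstructing the lost derivation from the slide-9 distributions:

\begin{align*}
D(p \,\|\, q) &= \sum_x p(x)\log_2\frac{p(x)}{q(x)}
= 2\cdot\tfrac{1}{8}\log_2\frac{1/8}{1/4} + \tfrac{1}{4}\log_2\frac{1/4}{1/4} + \tfrac{1}{2}\log_2\frac{1/2}{1/4}
= -\tfrac{1}{4} + 0 + \tfrac{1}{2} = 0.25 \text{ bits}
\end{align*}

Equivalently, D(p||q) = H(X, q) - H(X) = 2 - 1.75 = 0.25 bits.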

12 D(q||p)
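
And for the other direction:

\begin{align*}
D(q \,\|\, p) &= \sum_x q(x)\log_2\frac{q(x)}{p(x)}
= 2\cdot\tfrac{1}{4}\log_2\frac{1/4}{1/8} + \tfrac{1}{4}\log_2\frac{1/4}{1/4} + \tfrac{1}{4}\log_2\frac{1/4}{1/2}
= \tfrac{1}{2} + 0 - \tfrac{1}{4} = 0.25 \text{ bits}
\end{align*}

Here the two directions happen to be equal; in general D(p||q) ≠ D(q||p).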

13 Joint and conditional entropy
Joint entropy: H(X, Y) = - Σ_x Σ_y p(x, y) log2 p(x, y)
Conditional entropy: H(Y | X) = - Σ_x Σ_y p(x, y) log2 p(y | x) = H(X, Y) - H(X)
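
A small Python sketch of both formulas over a made-up joint distribution (illustrative only):

import math

# Hypothetical joint distribution p(x, y) over X, Y in {0, 1}
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def joint_entropy(pxy):
    # H(X, Y) = -sum_{x,y} p(x,y) log2 p(x,y)
    return -sum(p * math.log2(p) for p in pxy.values() if p > 0)

def conditional_entropy(pxy):
    # H(Y|X) = H(X, Y) - H(X), with p(x) obtained by marginalizing over y
    px = {}
    for (x, _), p in pxy.items():
        px[x] = px.get(x, 0.0) + p
    h_x = -sum(p * math.log2(p) for p in px.values() if p > 0)
    return joint_entropy(pxy) - h_x

print(joint_entropy(joint))        # H(X, Y)
print(conditional_entropy(joint))  # H(Y | X)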

14 Entropy of a language (per-word entropy)
The entropy of a language L:
H(L) = - lim(n→∞) (1/n) Σ_{x1n} p(x1n) log2 p(x1n)
If we make certain assumptions that the language is “nice” (stationary and ergodic), then the cross entropy under a model m can be calculated from a single long sample:
H(L, m) = - lim(n→∞) (1/n) log2 m(x1n)

15 Per-word entropy (cont)
p(x1n) can be calculated by n-gram models.
Ex: unigram model: p(x1n) = Π_i p(xi)
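
A sketch of per-word cross entropy under a unigram model (the model probabilities here are made up for illustration):

import math

# Hypothetical unigram model: word -> probability
unigram = {"the": 0.5, "cat": 0.2, "sat": 0.2, "mat": 0.1}

def per_word_cross_entropy(words, model):
    # H = -(1/n) log2 m(x1n) = -(1/n) * sum_i log2 m(xi) under a unigram model
    return -sum(math.log2(model[w]) for w in words) / len(words)

text = ["the", "cat", "sat", "the", "mat"]
print(per_word_cross_entropy(text, unigram))  # bits per word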

16 Perplexity
Perplexity is 2^H.
Perplexity is the weighted average number of choices a random variable has to make.
=> We learned how to calculate perplexity in LING570.
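
As a one-line sketch, perplexity follows directly from a per-word entropy value:

def perplexity(h):
    # Perplexity = 2^H, where H is the (per-word) entropy in bits
    return 2 ** h

print(perplexity(2.0))  # 4.0: an entropy of 2 bits means 4 equally likely choices per word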

17 Mutual information
It measures how much information X and Y have in common:
I(X; Y) = D(p(x, y) || p(x) p(y)) = Σ_x Σ_y p(x, y) log2 ( p(x, y) / (p(x) p(y)) )
Mutual information is symmetric: I(X; Y) = I(Y; X)
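
A Python sketch of the formula, reusing the style of the joint-entropy example (the joint distribution is made up):

import math

# Hypothetical joint distribution p(x, y) over X, Y in {0, 1}
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def mutual_information(pxy):
    # I(X;Y) = sum_{x,y} p(x,y) log2( p(x,y) / (p(x) p(y)) )
    px, py = {}, {}
    for (x, y), p in pxy.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in pxy.items() if p > 0)

print(mutual_information(joint))  # 0 only if X and Y are independent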

18 Summary on Information theory
Reading: M&S 2.2
It is the use of probability theory to quantify and measure “information”. Basic concepts:
Entropy
Cross entropy and relative entropy
Joint entropy and conditional entropy
Entropy of a language and perplexity
Mutual information

19 Hw1

20 Hw1
Q1-Q5: Information theory
Q6: Condor submission
Q7: Hw10 from LING570
You are not required to turn in anything for Q7: it won’t be graded, and you get its 30 points for free. If you want feedback on it, you can choose to turn it in.

21 Q6: Condor submission
See especially Slides #22 - #28.

22 For a command we can run as:
mycommand -a -n <mycommand.in >mycommand.out
the submit file might look like this (save it as *.cmd):
Executable = mycommand  ← the command
Universe = vanilla
getenv = true
input = mycommand.in  ← STDIN
output = mycommand.out  ← STDOUT
error = mycommand.error  ← STDERR
Log = /tmp/brodbd/mycommand.log  ← a log file that records the results of the Condor submission
arguments = "-a -n"  ← the arguments for the command
transfer_executable = false
Queue

23 Submitting and monitoring jobs on Condor
condor_submit mycommand.cmd => you get a job number
List the job queue: condor_q
Status changes from “I” (idle) to “R” (run). “H” (held) means the job failed; look at the log file specified in the *.cmd file.
When the job disappears from the queue, you will receive an email.
Use “man condor_q” etc. to learn more about these commands.

24 The path names for files in *.cmd
In the *.cmd file:
Executable = aa194.exec
input = file1
The environment (e.g., ~/.bash_profile) might not be set properly.
Condor assumes that the files are in the current directory (the directory where the job is submitted).
=> Use full path names if needed.

