1 CS 621 Artificial Intelligence, Lecture 12 – 30/08/05. Prof. Pushpak Bhattacharyya, IIT Bombay. Fundamentals of Information Theory

2 Weather (O) | Temp (T) | Humidity (H) | Windy (W) | Decision (D)
Sunny | High | High | F | N
Sunny | High | High | T | N
Cloudy | High | High | F | Y
Rain | Med | High | F | Y
Rain | Cold | Low | F | Y
Rain | Cold | Low | T | N
Cloudy | Cold | Low | T | Y

3 Weather (O) | Temp (T) | Humidity (H) | Windy (W) | Decision (D)
Sunny | Med | High | F | N
Sunny | Cold | Low | F | Y
Rain | Med | Low | F | Y
Sunny | Med | Low | T | Y
Cloudy | Med | High | T | Y
Cloudy | High | Low | F | Y
Rain | High | High | T | N

4 [Decision tree learned from the data]
Outlook = Sunny → test Humidity: High → No, Low → Yes
Outlook = Cloudy → Yes
Outlook = Rain → test Windy: T → No, F → Yes

5 Rule Base
R1: If outlook is sunny and if humidity is high then Decision is No.
R2: If outlook is sunny and if humidity is low then Decision is Yes.
R3: If outlook is cloudy then Decision is Yes.
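As a rough illustration (added here, not part of the original slides), the rule base can be written as a small Python function. The lowercase attribute values and the behaviour for the rain case, which R1-R3 do not cover, are assumptions made for this sketch.

def decide(outlook, humidity):
    """Apply rules R1-R3 from the slide above."""
    if outlook == "sunny" and humidity == "high":   # R1
        return "No"
    if outlook == "sunny" and humidity == "low":    # R2
        return "Yes"
    if outlook == "cloudy":                         # R3
        return "Yes"
    # The rain case is not covered by R1-R3; the decision tree above
    # tests Windy there, so no rule fires here.
    return None

print(decide("sunny", "high"))   # -> No  (R1)
print(decide("cloudy", "high"))  # -> Yes (R3)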

6 Making Sense of Information
Classification
Clustering
Giving a short and nice description

7 Short Description
Occam's Razor principle: the shortest/simplest description is the best for generalization.

8 Representation Language
Decision tree
Neural network
Rule base
Boolean expression

9 Information & Entropy
The example data, presented as rows with labels, has little order/structure compared to the succinct descriptions (the decision tree and the rule base). Define "information"; measure the lack of structure in information by "entropy".

10 Define the entropy of S (labeled data):
E(S) = - (P+ log2 P+ + P- log2 P-)
where P+ = proportion of positively labeled data and P- = proportion of negatively labeled data.

11 Example
P+ = 9/14, P- = 5/14
E(S) = - (9/14) log2(9/14) - (5/14) log2(5/14) ≈ 0.94
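This calculation can be sketched in a few lines of Python (an illustration added here, not part of the slides; the helper name entropy is arbitrary):

import math

def entropy(p_pos, p_neg):
    """Entropy of a binary-labeled set: E(S) = -(P+ log2 P+ + P- log2 P-).
    A term with probability 0 contributes 0 by convention."""
    return sum(-p * math.log2(p) for p in (p_pos, p_neg) if p > 0)

print(entropy(9/14, 5/14))  # approximately 0.940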

12 Partitioning the Data
Take "Windy" as the attribute: Windy = [T, F]. Partition the data into the rows with Windy = T and the rows with Windy = F.

13 Partitioning the Data (Contd)
Partitioning by focusing on a particular attribute produces "information gain", a reduction in entropy.

14 Partitioning the Data (Contd)
Information gain when we choose Windy = [T, F]:
Windy = T: P+ = 6, P- = 2
Windy = F: P+ = 3, P- = 3

15 Partitioning the Data (Contd)
[Figure: split on Windy. T branch: 6 positive, 2 negative. F branch: 3 positive, 3 negative.]

16 Partitioning the Data (Contd)
Gain(S, A) = E(S) - ∑_{v ∈ Values(A)} (|S_v| / |S|) E(S_v)
E(S) ≈ 0.94
E(Windy = T) = - (6/8) log2(6/8) - (2/8) log2(2/8) ≈ 0.81

17 Partitioning the Data (Contd)
E(Windy = F) = - (3/6) log2(3/6) - (3/6) log2(3/6) = 1.0

18 Partitioning the Data (Contd)
Gain(S, Windy) = 0.94 - (8/14 * 0.81 + 6/14 * 1.0) ≈ 0.048
Exercise: find the information gain for each attribute: Outlook, Temp, Humidity and Windy.
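These numbers can be checked with a short Python sketch (illustrative, not from the slides) that works directly from the per-branch class counts given on slide 14:

import math

def entropy2(p_pos, p_neg):
    # Binary entropy; a zero-probability term contributes 0
    return sum(-p * math.log2(p) for p in (p_pos, p_neg) if p > 0)

def gain_from_counts(total_pos, total_neg, branches):
    """Information gain of a split, given (pos, neg) counts per branch."""
    n = total_pos + total_neg
    e_s = entropy2(total_pos / n, total_neg / n)
    remainder = sum(
        (pos + neg) / n * entropy2(pos / (pos + neg), neg / (pos + neg))
        for pos, neg in branches
    )
    return e_s - remainder

# Windy: T branch has counts (6, 2), F branch (3, 3), as on slide 14
print(round(gain_from_counts(9, 5, [(6, 2), (3, 3)]), 3))  # ~0.048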

19 ID3 Algorithm
Calculating the gain for every attribute, choosing the one with maximum gain, and repeating this on each resulting partition to build the decision tree is the "ID3" algorithm for building a classifier.
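A compact sketch of ID3 in Python (an illustration added here, not from the slides; representing each row as a dict and using a majority-vote fallback at exhausted nodes are assumptions of this sketch):

import math
from collections import Counter

def entropy_of(rows, target):
    counts = Counter(r[target] for r in rows)
    n = len(rows)
    return sum(-c / n * math.log2(c / n) for c in counts.values())

def id3(rows, attributes, target="Decision"):
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:               # pure node: stop
        return labels[0]
    if not attributes:                      # no attributes left: majority vote
        return Counter(labels).most_common(1)[0][0]

    def gain(a):
        # weighted entropy of the partition induced by attribute a
        rem = 0.0
        for v in set(r[a] for r in rows):
            subset = [r for r in rows if r[a] == v]
            rem += len(subset) / len(rows) * entropy_of(subset, target)
        return entropy_of(rows, target) - rem

    best = max(attributes, key=gain)        # attribute with maximum gain
    tree = {best: {}}
    for v in set(r[best] for r in rows):
        subset = [r for r in rows if r[best] == v]
        rest = [a for a in attributes if a != best]
        tree[best][v] = id3(subset, rest, target)
    return tree

Called as id3(rows, ["Weather", "Temp", "Humidity", "Windy"]) on the 14 rows above, it would return a nested dict encoding a decision tree like the one on slide 4.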

20 Origin of Information Theory
1) Shannon, "A Mathematical Theory of Communication", Bell System Technical Journal, 1948.
2) Cover and Thomas, "Elements of Information Theory", 1991.

21 Example
Motivation with the example of a horse race: 8 horses, h1, h2, ..., h8. A person P would like to bet on one of the horses. The horses have the following probabilities of winning:

22 Example (Contd 1)
h1 = 1/2, h5 = 1/64
h2 = 1/4, h6 = 1/64
h3 = 1/8, h7 = 1/64
h4 = 1/16, h8 = 1/64
∑ hi = 1

23 Example (Contd 2)
Send a message specifying the horse on which to bet. If the situation is "unbiased", i.e., all horses have equal probability of winning, then we need 3 binary digits, since 3 = log2 8.

24 Example (Contd 3)
Compute the bias:
E(S) = - ∑ Pi log2 Pi, i = 1, ..., 8, where Pi = probability of hi winning.
E(S) = - (1/2 log2 1/2 + 1/4 log2 1/4 + ... + 1/64 log2 1/64) = 2

25 Example (Contd 4)
On average we do not need more than 2 bits to communicate the desired horse. What is the actual length of the code? Designing an optimal code is a separate problem.
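As an aside added here (not from the slides): because these probabilities are all powers of 1/2, an optimal prefix code can assign each horse a codeword of length -log2(p), and the average code length then equals the 2-bit entropy. A quick check in Python:

import math

probs = [1/2, 1/4, 1/8, 1/16, 1/64, 1/64, 1/64, 1/64]  # h1..h8, from slide 22

entropy = -sum(p * math.log2(p) for p in probs)
# Codeword lengths an optimal prefix code would use for these dyadic probabilities
lengths = [math.ceil(-math.log2(p)) for p in probs]
avg_len = sum(p * l for p, l in zip(probs, lengths))

print(entropy, avg_len)  # both equal 2.0 bits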

26 Example 2 (Letter Guessing Game)
A 20-questions style game over six letters with probabilities:
p: 1/8, t: 1/4, k: 1/8, a: 1/8, i: 1/4, u: 1/8
E(S) = - ∑ Pi log2 Pi, i ∈ {p, t, k, a, i, u} = 2.5

27 On average we need no more than 2.5 questions. Design a code:
p: 1/8 → 100
t: 1/4 → 00
k: 1/8 → 101
a: 1/8 → 110
i: 1/4 → 01
u: 1/8 → 111

28 Q1) Is the letter t or i? Q2) Is it a consonant?
Expected number of questions = ∑ Pi * Ni, where Ni = number of questions for situation i.
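The expected number of questions (equivalently, the average code length) for the code on slide 27 can be verified with a short sketch (illustrative; the letter-to-codeword mapping is the reconstruction above):

# (probability, codeword) pairs as reconstructed on slide 27
code = {"p": (1/8, "100"), "t": (1/4, "00"), "k": (1/8, "101"),
        "a": (1/8, "110"), "i": (1/4, "01"), "u": (1/8, "111")}

expected_questions = sum(p * len(bits) for p, bits in code.values())
print(expected_questions)  # 2.5, matching the entropy on slide 26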

29 What has all this got to do with AI? Why entropy? Why design codes? Why communicate?

30 Bridge
Multiparty participation is intelligent information processing. Information theory sets up theoretical limits on communicability.

31 Summary
Haphazard presentation of data is not acceptable to the mind.
Focusing attention on an attribute automatically leads to information gain.
Defined entropy, and in parallel defined information gain.
Related this to message communication.

