
1 Machine Learning. Mehdi Ghayoumi, MSB rm 132, mghayoum@kent.edu. Office hours: Thursday, 11–12 a.m.

2 (image-only slide; no transcript text)

3 “Learning denotes changes in a system that... enable a system to do the same task more efficiently the next time.” –Herbert Simon
“Learning is constructing or modifying representations of what is being experienced.” –Ryszard Michalski
“Learning is making useful changes in our minds.” –Marvin Minsky

4 Decision Trees. In the 1960s, Hunt and colleagues used exhaustive-search decision-tree methods (CLS) to model human concept learning. In the late 1970s, Quinlan developed ID3 with the information-gain heuristic to learn expert systems from examples. Quinlan’s updated decision-tree package, C4.5, was released in 1993.

5 Classification: predict a categorical output from categorical and/or real inputs. Decision trees are among the most popular data-mining tools:
–Easy to understand
–Easy to implement
–Easy to use
–Computationally cheap
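For a feel of how easy and cheap they are to use, a minimal sketch with scikit-learn (an assumption; the slides name no specific tool):

    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)          # real-valued inputs, categorical output
    clf = DecisionTreeClassifier(max_depth=3)  # a shallow tree stays easy to read
    clf.fit(X, y)
    print(clf.predict(X[:5]))                  # predicted class labels for 5 examples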

6 An extremely popular method:
–Credit risk assessment
–Medical diagnosis
–Market analysis
–Bioinformatics
–Chemistry
…

7 (image-only slide; no transcript text)

8 Internal decision nodes:
–Univariate: uses a single attribute, x_i
–Multivariate: uses all attributes, x
Leaves:
–Classification: class labels, or class proportions
–Regression: a numeric value r (the average, or a local fit)
Learning is greedy: find the best split, then recurse on the resulting partitions (a sketch follows below).
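A minimal sketch of this greedy recursion, assuming examples are (attribute-dict, label) pairs and score_split is a caller-supplied split-scoring function (one candidate is sketched after slide 10); the names are illustrative, not from the slides:

    from collections import Counter

    def majority_label(labels):
        """Most common class label in a list."""
        return Counter(labels).most_common(1)[0][0]

    def grow_tree(examples, attributes, score_split):
        """Greedily grow a decision tree from (attribute-dict, label) pairs."""
        labels = [y for _, y in examples]
        if len(set(labels)) == 1 or not attributes:  # pure node, or nothing left to split on
            return {"leaf": majority_label(labels)}
        best = max(attributes, key=lambda a: score_split(examples, a))  # greedy choice
        children = {}
        for value in {x[best] for x, _ in examples}:  # one branch per attribute value
            subset = [(x, y) for x, y in examples if x[best] == value]
            rest = [a for a in attributes if a != best]
            children[value] = grow_tree(subset, rest, score_split)
        return {"attr": best, "children": children}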

9 Occam’s razor (c. 1320):
–Prefer the simplest hypothesis that fits the data.
–The principle states that the explanation of any phenomenon should make as few assumptions as possible, eliminating those that make no difference in the observable predictions of the explanatory hypothesis or theory.
Albert Einstein: “Make everything as simple as possible, but not simpler.”
Why?
–It is a philosophical principle.
–Simple explanations/classifiers are more robust.
–Simple classifiers are more understandable.

10 Objective: shorter trees are preferred over larger trees.
Idea: we want attributes that classify the examples well, and the best attribute is selected: the one that partitions the learning set into subsets that are as “pure” as possible (one possible purity score is sketched below).
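One way to score a candidate split is the weighted majority-class fraction of the resulting subsets; this is an illustrative stand-in, not the slides’ measure (information gain, built on the entropy defined on slide 13, is the classical choice). It plugs into grow_tree above as score_split:

    from collections import Counter

    def purity_after_split(examples, attr):
        """Weighted fraction of majority-class examples after splitting on attr."""
        total, score = len(examples), 0.0
        for value in {x[attr] for x, _ in examples}:
            subset = [y for x, y in examples if x[attr] == value]
            majority = Counter(subset).most_common(1)[0][1]
            score += (len(subset) / total) * (majority / len(subset))
        return score  # 1.0 means every subset is pure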

11 (image-only slide; no transcript text)

12 Each branch corresponds to an attribute value.
Each internal node has a splitting predicate.
Each leaf node assigns a classification.
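Classification with such a tree is a walk from the root to a leaf. A minimal sketch using the dict layout from the grow_tree sketch above (unseen attribute values are not handled):

    def classify(tree, x):
        """Follow the branch matching each attribute value until a leaf assigns a class."""
        while "leaf" not in tree:
            tree = tree["children"][x[tree["attr"]]]  # take the branch for this value
        return tree["leaf"]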

13 Entropy (disorder, impurity) of a set of examples S, relative to a binary classification, is:

Entropy(S) = −p1·log2(p1) − p0·log2(p0)

where p1 is the fraction of positive examples in S and p0 is the fraction of negatives.
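A direct translation of this definition, with the 0·log(0) = 0 convention from slide 15 handled by skipping zero fractions (a sketch; the function name is illustrative):

    import math

    def entropy(p1):
        """Binary entropy in bits; p1 = fraction of positives, p0 = 1 - p1."""
        return sum(-p * math.log2(p) for p in (p1, 1.0 - p1) if p > 0)

    print(entropy(0.5))  # 1.0 bit: equally mixed, maximum entropy
    print(entropy(1.0))  # 0.0 bits: all examples in one category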

14 (image-only slide; no transcript text)

15 If all examples are in one category, entropy is zero (we define 0·log(0) = 0).
If examples are equally mixed (p1 = p0 = 0.5), entropy is at its maximum of 1.
Entropy can be viewed as the average number of bits required to encode the class of an example in S when data compression (e.g. Huffman coding) is used to give shorter codes to more likely cases.
For multi-class problems with c categories, entropy generalizes to:

Entropy(S) = −Σ_{i=1..c} pi·log2(pi)
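The same generalization, computed from a raw list of labels (entropy_of is an illustrative name, not from the slides):

    import math
    from collections import Counter

    def entropy_of(labels):
        """Entropy in bits of a list of class labels, for any number of categories c."""
        n = len(labels)
        return sum(-(k / n) * math.log2(k / n) for k in Counter(labels).values())

    print(entropy_of(["a", "b", "c", "d"]))  # 2.0 = log2(4): four equally likely classes
    print(entropy_of(["a", "a", "a", "a"]))  # 0.0: a pure set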

16 (image-only slide; no transcript text)

17 Thank you!

