Decision Trees - Intermediate


1 Decision Trees - Intermediate
Some material from Russell and Norvig, Artificial Intelligence: A Modern Approach, 2009.
Villanova University Machine Learning Project

2 The Inductive Learning Problem
Extrapolate from a given set of examples to make accurate predictions about future examples.
Concept learning, or classification: given a set of examples of some concept/class/category, determine whether a given example is an instance of the concept.
- If it is an instance, we call it a positive example.
- If it is not, we call it a negative example.
Usually called supervised learning.

3 Inductive Learning Framework
The representation must extract from possible observations a feature vector of relevant features for each example.
The number of attributes and the set of values for each attribute are fixed (although values can be continuous).
Each example is represented as a specific feature vector and is identified as a positive or negative instance.
Each example can be interpreted as a point in an n-dimensional feature space, where n is the number of attributes.
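As a minimal sketch of this representation: the attribute names, the `Example` type, and the `make_example` helper below are illustrative assumptions, not part of the slides. The point is only that every example ends up as a fixed-length vector plus a positive/negative label.

```python
from typing import NamedTuple

# Fixed attribute order. These names are hypothetical, chosen for illustration.
ATTRIBUTES = ("hungry", "raining", "wait_minutes")

class Example(NamedTuple):
    features: tuple  # one value per attribute, in ATTRIBUTES order
    label: bool      # True = positive example, False = negative example

def make_example(observation: dict, label: bool) -> Example:
    """Extract the fixed-length feature vector from a raw observation.

    Keys of the observation that are not in ATTRIBUTES are ignored.
    """
    return Example(tuple(observation[a] for a in ATTRIBUTES), label)

raw = {"hungry": True, "raining": False, "wait_minutes": 25.0, "note": "ignored"}
ex = make_example(raw, label=True)
```

Here `ex.features` is a point in a 3-dimensional feature space, with one continuous coordinate (`wait_minutes`) and two Boolean ones.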

4 Hypotheses
The task of a supervised learning system can be viewed as learning a function that predicts the outcome from the inputs.
Given a training set of N example pairs (x1, y1), (x2, y2), ..., (xN, yN), where each yj was generated by an unknown function y = f(x), discover a function h that approximates the true function f.
h is our hypothesis, and learning is the process of finding a good h in the space of possible hypotheses.
Prefer the simplest hypothesis consistent with the data.
- Tradeoff between fit and generalizability
- Tradeoff between fit and computational complexity

5 Decision Tree Induction
Very common machine learning and data mining technique.
One of the earliest methods for inductive learning: Induction of Decision Trees, J. Ross Quinlan, Machine Learning, Vol. 1, Kluwer Academic Publishers, 1986.
Given:
- Examples
- Attributes
- Goal (classes)
Pick an “important” attribute: one which divides the set cleanly.
Recur with the subsets not yet classified.

6 A Restaurant Domain
Develop a decision tree to model the decision a patron makes when deciding whether or not to wait for a table at a restaurant.
Two classes: wait, leave
Ten attributes:
- Alternative available?
- Bar in restaurant?
- Is it Friday/Saturday?
- Are we hungry?
- How full is the restaurant?
- How expensive?
- Is it raining?
- Do we have a reservation?
- What type of restaurant is it?
- What’s the purported waiting time?
Training set of 12 examples
~7000 possible cases

7 What Might Your First Question Be?
- Alternative available?
- Bar in restaurant?
- Is it Friday or Saturday?
- Are we hungry?
- How full is the restaurant?
- How expensive?
- Is it raining?
- Do we have a reservation?
- What type of restaurant is it?
- What’s the purported waiting time?

8 A Decision Tree from Introspection

9 A Training Set

10 Thinking About It
Looking at these examples, what might you now expect the first question to be? The second?

11 Tree by Inspection
You have a copy of this table.
Get together in threes and decide on a decision tree.
Choose a representative to come up and draw your tree on the whiteboard (someone with legible handwriting!).
What issues came up?
- How many decisions did your tree have? Was it balanced?
- How do you decide what to split next?
How good was it?
- Did every case get classified correctly?
- How many decisions would cases take?
- How many cases at the leaves?
- Do you think it would generalize?

12 What Does your Group’s Tree Look Like?

13 Choosing an attribute
Idea: a good attribute splits the examples into subsets that are (ideally) "all positive" or "all negative".
Which is the better choice?
Patrons makes a cleaner split; we get 2 clean categories.
Type doesn’t gain us anything at all.
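One standard way to make "cleaner split" precise is entropy. The definitions below are the usual textbook ones, stated here as an assumption because the slide's figure is not reproduced in the transcript. For a set with p positive and n negative examples, and an attribute A whose k-th value selects a subset with p_k positives and n_k negatives:

```latex
H(p, n) = -\frac{p}{p+n}\log_2\frac{p}{p+n} \;-\; \frac{n}{p+n}\log_2\frac{n}{p+n},
\qquad 0 \log_2 0 := 0

\mathrm{Gain}(A) = H(p, n) \;-\; \sum_{k} \frac{p_k + n_k}{p + n}\, H(p_k, n_k)
```

A subset that is all positive or all negative has H = 0, so an attribute that produces only such subsets achieves the maximum possible gain, H(p, n). An attribute whose subsets all have the same positive/negative proportion as the parent has gain 0.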

14 Best Attribute
What’s the best attribute to choose? The one with the best information gain.
- If we choose Bar, we have no: 3-; yes: 3-, 3+
- If we choose Hungry, we have no: 4-; yes: 1-, 5+
Hungry has given us more information about the correct classification.
So we want to choose the attribute split which gives us the most useful division of our data.
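The comparison can be checked numerically. This is a sketch using the standard entropy-based definition of information gain, with the positive/negative counts taken exactly as they appear on the slide; the function names are my own.

```python
from math import log2

def entropy(pos: int, neg: int) -> float:
    """Entropy in bits of a set with `pos` positive and `neg` negative examples."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count:  # treat 0 * log2(0) as 0
            p = count / total
            result -= p * log2(p)
    return result

def information_gain(subsets: list[tuple[int, int]]) -> float:
    """Gain of a split, given (pos, neg) counts for each attribute value."""
    pos = sum(p for p, _ in subsets)
    neg = sum(n for _, n in subsets)
    total = pos + neg
    # Expected entropy remaining after the split, weighted by subset size.
    remainder = sum((p + n) / total * entropy(p, n) for p, n in subsets)
    return entropy(pos, neg) - remainder

# (positives, negatives) for the "no" and "yes" branches, as given on the slide.
bar = [(0, 3), (3, 3)]
hungry = [(0, 4), (5, 1)]

print(f"Bar:    {information_gain(bar):.3f} bits")
print(f"Hungry: {information_gain(hungry):.3f} bits")
```

With these counts the gain for Hungry comes out higher than the gain for Bar, which matches the slide's conclusion.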

15 ID3
A greedy algorithm for decision tree construction, originally developed by Ross Quinlan, 1987.
Top-down construction of the decision tree by recursively selecting the “best attribute” to use at the current node:
- Once an attribute is selected, generate child nodes, one for each possible value of the selected attribute.
- Partition the examples using the possible values of the attribute, and assign the subsets of examples to the appropriate child nodes.
- Repeat for each child node until all examples associated with a node are either all positive or all negative.
In Weka, J48 is an implementation of C4.5, Quinlan's improved successor to ID3.
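The recursion described above can be sketched in a few lines. This is a simplified illustration, not Quinlan's code: it assumes nominal attributes, uses information gain as the "best attribute" criterion, creates children only for attribute values that actually occur in the examples, and falls back to a majority vote when attributes run out. The toy data set is invented.

```python
from collections import Counter
from math import log2

def entropy(labels):
    total = len(labels)
    return -sum(c / total * log2(c / total) for c in Counter(labels).values())

def gain(examples, attr):
    """Information gain of splitting (features_dict, label) pairs on attr."""
    labels = [y for _, y in examples]
    remainder = 0.0
    for value in {x[attr] for x, _ in examples}:
        subset = [y for x, y in examples if x[attr] == value]
        remainder += len(subset) / len(examples) * entropy(subset)
    return entropy(labels) - remainder

def id3(examples, attributes):
    """Return a label (leaf) or an (attribute, {value: subtree}) pair (node)."""
    labels = [y for _, y in examples]
    if len(set(labels)) == 1:      # all positive or all negative: stop
        return labels[0]
    if not attributes:             # nothing left to test: majority vote
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: gain(examples, a))  # greedy choice
    rest = [a for a in attributes if a != best]
    children = {}
    for value in {x[best] for x, _ in examples}:
        subset = [(x, y) for x, y in examples if x[best] == value]
        children[value] = id3(subset, rest)
    return (best, children)

# Hypothetical toy data: the label depends only on "hungry".
toy = [
    ({"hungry": True,  "raining": False}, True),
    ({"hungry": True,  "raining": True},  True),
    ({"hungry": False, "raining": False}, False),
    ({"hungry": False, "raining": True},  False),
]
tree = id3(toy, ["hungry", "raining"])
```

On this toy data the algorithm picks `hungry` at the root and never needs to test `raining`, since both child subsets are already pure.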

16 One Possible Learned Tree
Substantially simpler than the “true” tree: a more complex hypothesis isn’t justified by the small amount of data.
Note that it is much simpler than my induced tree, and just as accurate.

17 How well does it work?
Many case studies have shown that decision trees are at least as accurate as human experts.
- A study for diagnosing breast cancer had humans correctly classifying the examples 65% of the time; the decision tree classified 72% correctly.
- British Petroleum designed a decision tree for gas-oil separation for offshore oil platforms that replaced an earlier rule-based expert system.
- Cessna designed an airplane flight controller using 90,000 examples and 20 attributes per example.

18 More on Attribute Splits
Each node tests one attribute.
The split does not need to be binary; note the “Outlook” split in the Weka weather data.
ID3 required nominal attributes; C4.5 extends it to numeric attributes, such as humidity.
Figure: tree from running Weka’s J48 on weather.numeric.arff.
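One common way to handle a numeric attribute is to turn it into a binary test of the form `value <= threshold`, trying candidate thresholds between adjacent observed values and keeping the one with the highest information gain. The sketch below illustrates that idea only; it is not a description of what J48 does internally, and the humidity values and labels are invented for the example.

```python
from math import log2

def entropy(labels):
    total = len(labels)
    result = 0.0
    for label in set(labels):
        p = labels.count(label) / total
        result -= p * log2(p)
    return result

def best_threshold(values, labels):
    """Return (threshold, gain) for the best binary test `value <= threshold`."""
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))
    base = entropy(labels)
    best = (None, 0.0)
    for low, high in zip(distinct, distinct[1:]):
        t = (low + high) / 2  # midpoint between adjacent distinct values
        left = [y for v, y in pairs if v <= t]
        right = [y for v, y in pairs if v > t]
        remainder = (len(left) * entropy(left)
                     + len(right) * entropy(right)) / len(pairs)
        if base - remainder > best[1]:
            best = (t, base - remainder)
    return best

# Hypothetical data, not the actual contents of weather.numeric.arff.
humidity = [65, 70, 75, 80, 85, 90, 95]
play = ["yes", "yes", "yes", "yes", "no", "no", "no"]
threshold, g = best_threshold(humidity, play)
```

For this invented data the best cut falls between 80 and 85, where the two sides are perfectly separated.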

19 Pruning
With enough levels of a decision tree we can always get the leaves to be 100% positive or negative (if there is no inconsistency in the data).
But if we are down to one or two cases in each leaf, we are probably overfitting.
Useful to prune leaves; stop when:
- we reach a certain level
- we reach a small enough leaf size
- our information gain is increasing too slowly
If exactly the same values of xi lead to different yi, you can’t get a perfect tree.
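The three stopping rules listed above could be expressed as a single check made before each split. This is only a sketch: the function name, parameter names, and default thresholds are arbitrary placeholders, not values recommended by the slides.

```python
def should_stop(depth: int, n_examples: int, best_gain: float,
                max_depth: int = 5, min_leaf_size: int = 3,
                min_gain: float = 0.01) -> bool:
    """Return True if a node should become a leaf instead of being split.

    All three thresholds are hypothetical defaults chosen for illustration.
    """
    if depth >= max_depth:           # we reached a certain level
        return True
    if n_examples <= min_leaf_size:  # the node is already a small leaf
        return True
    if best_gain < min_gain:         # the best split adds too little information
        return True
    return False
```

A tree builder would call this at each node and, when it returns True, label the leaf with the majority class of the examples that reached it.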

20 Expressiveness
Decision trees can express any function of the input attributes.
E.g., for Boolean functions, truth table row → path to leaf.
Trivially, there is a consistent decision tree for any training set, with one path to a leaf for each example (unless f is nondeterministic in x), but it probably won't generalize to new examples.
Prefer to find more compact decision trees.
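The "truth table row → path to leaf" construction can be written down directly. The sketch below builds the complete tree for any Boolean function by testing the attributes in a fixed order; the nested-tuple tree format and the function name are my own choices for illustration.

```python
def tree_from_function(f, names, assignment=()):
    """Build a complete decision tree: one root-to-leaf path per truth-table row.

    A leaf is the function's value; an internal node is (attribute, {value: subtree}).
    """
    if len(assignment) == len(names):
        return f(*assignment)  # every attribute has been tested: emit the row's value
    attr = names[len(assignment)]
    return (attr, {v: tree_from_function(f, names, assignment + (v,))
                   for v in (False, True)})

def xor(a, b):
    return a != b

tree = tree_from_function(xor, ("A", "B"))
```

The resulting tree has one leaf per truth-table row, which is exactly why this construction always fits the data and rarely generalizes.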

21 BUT!
Decision tree tests are univariate: one attribute at a time.
In the OR tree we have captured the “OR” by essentially replicating the B question under both A answers.
- Inefficient if we have many attributes and/or values.
- Really inefficient if our attributes are real-valued.
So a decision tree can express a function or model with a complex relationship among attributes, but it may be unusably complicated and inefficient.
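To make the replication concrete, here is the OR tree described above written out as nested tuples (a representation chosen for this sketch), together with a count of how fast a tree that tests every attribute on every path grows.

```python
# A OR B, with the B test repeated under both answers to A, as on the slide.
or_tree = ("A", {
    False: ("B", {False: False, True: True}),
    True:  ("B", {False: True,  True: True}),
})

def count_tests(tree) -> int:
    """Number of internal (attribute-test) nodes in a nested-tuple tree."""
    if not isinstance(tree, tuple):
        return 0  # a leaf holds a class label, not a test
    _, children = tree
    return 1 + sum(count_tests(child) for child in children.values())

def full_tree_leaves(n_attributes: int, values_per_attribute: int = 2) -> int:
    """Leaves in a tree that tests every attribute on every path."""
    return values_per_attribute ** n_attributes

print(count_tests(or_tree))   # three tests for a two-input function
print(full_tree_leaves(10))   # ten Boolean attributes
```

The growth is exponential in the number of attributes, which is the inefficiency the slide warns about.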

22 Decision Tree Architecture
Knowledge base: the decision tree itself.
Performer: tree walker.
Critic: actual outcome in training case.
Learner: ID3 or its variants.
This is an example of a large class of learners that need all of the examples at once in order to learn: batch, not incremental.
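The performer and critic roles are small enough to sketch directly. The tree format, the function names, and the hand-written tree and examples below are all hypothetical; they only illustrate how a tree walker classifies a case and how the actual outcome is used to score it.

```python
def walk(tree, example):
    """Performer: follow attribute tests from the root until a leaf is reached."""
    while isinstance(tree, tuple):
        attribute, children = tree
        tree = children[example[attribute]]
    return tree

def accuracy(tree, labeled_examples):
    """Critic: fraction of examples whose predicted label matches the actual one."""
    correct = sum(walk(tree, x) == y for x, y in labeled_examples)
    return correct / len(labeled_examples)

# Knowledge base: a hand-written tree for illustration, not one from the slides.
tree = ("hungry", {
    False: "leave",
    True: ("raining", {False: "wait", True: "leave"}),
})

data = [
    ({"hungry": True,  "raining": False}, "wait"),
    ({"hungry": False, "raining": True},  "leave"),
    ({"hungry": True,  "raining": True},  "wait"),   # the tree gets this one wrong
]
print(accuracy(tree, data))
```

The learner is the only missing piece: a batch algorithm such as ID3 would take all of `data` at once and produce the tree.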

23 Strengths of Decision Trees
Strengths include:
- Fast to learn and to use
- Simple to implement
- Can look at the tree and see what is going on: relatively “white box”
- Has been empirically validated many times
- Handles noisy data (with pruning)
Quinlan’s C4.5 and C5.0 are extensions of ID3 that account for unavailable values, continuous attribute value ranges, pruning of decision trees, and rule derivation.

24 Decision Tree Weaknesses
Weaknesses include:
- Univariate splits/partitioning (one attribute at a time) limits the types of possible trees
- Large decision trees may be hard to understand
- Requires fixed-length feature vectors
- Non-incremental (i.e., batch method)
- Continuous or real-valued features require additional complexity to choose decision points
- Prone to over-fitting

25 Summary: Decision Tree Learning
The model being learned is a tree of nodes.
- Each node is a test of the value of one attribute.
- The series of test results for an example leads to classifying that example at a leaf.
One of the earliest techniques to demonstrate machine learning from examples.
A widely used learning method in practice.
Can out-perform human experts in many problems.
Not really suitable for a large number of attributes and values.

