Classification with CART

Presentation transcript:

1 Classification with CART
Splitting: at each node, choose the split that maximizes the decrease in impurity (e.g. Gini index, entropy, Bayes error).
Split-stopping: minimum terminal node size, pruning.
Class assignment: each terminal node is assigned the class with the majority vote among its training cases.
[Figure: decision tree — root node split on x > a vs. x ≤ a, internal node split on z > b vs. z ≤ b, terminal nodes labeled Class A and Class B]
Slides adapted from slides by Xuelian Wei, Ruhai Cai, Leo Breiman
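The Gini-based split criterion described above can be sketched in a few lines of Python. The function names (`gini`, `impurity_decrease`) and the toy data are illustrative, not part of any CART implementation:

```python
# Minimal sketch of CART's split criterion using the Gini index:
# gini = 1 - sum_k(p_k^2); the best split maximizes the impurity decrease.

from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def impurity_decrease(parent, left, right):
    """Decrease in Gini impurity from splitting parent into left/right."""
    n = len(parent)
    return gini(parent) - (len(left) / n) * gini(left) - (len(right) / n) * gini(right)

# Toy data: the split x > 2 separates the classes perfectly.
points = [(1, "A"), (2, "A"), (3, "B"), (4, "B")]
parent = [y for _, y in points]
left = [y for x, y in points if x <= 2]   # x <= 2 branch
right = [y for x, y in points if x > 2]   # x > 2 branch
print(impurity_decrease(parent, left, right))  # 0.5: parent Gini 0.5, both children pure
```

A pure split removes all the parent's impurity, so the decrease equals the parent's Gini of 0.5; a useless split would score 0.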

2 Drawbacks of CART
Accuracy: more recent methods have lower error rates than CART in many cases.
Instability: small changes in the data can result in large changes in the tree.

3 Classification with Random Forests
[Figure: an input vector is passed to Tree 1, Tree 2, and Tree 3 of the random forest; their individual votes for Class A or Class B are combined]

4 Building Your Forest
Construct a large number of trees, but for each tree:
Use a bootstrap sample of the data set.
Use a random sample of the predictors at each split.
Grow the tree without pruning.
Each tree then "votes" for a class; the prediction is the class with the most "votes".
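The forest-building steps above can be sketched in pure Python. This is a toy illustration with hypothetical helper names (`grow_tree`, `build_forest`, `forest_predict`) and a tiny hard-coded dataset, not Breiman's reference implementation:

```python
# Sketch of a random forest: bootstrap samples, random predictor subsets
# at each split, unpruned trees, and majority voting.

import random
from collections import Counter

def gini(labels):
    n = len(labels)
    return (1.0 - sum((c / n) ** 2 for c in Counter(labels).values())) if n else 0.0

def grow_tree(rows, n_features):
    """Grow an unpruned CART-style tree; rows are (feature..., label) tuples."""
    labels = [r[-1] for r in rows]
    if len(set(labels)) == 1:
        return labels[0]                      # pure leaf
    best = None
    for f in random.sample(range(len(rows[0]) - 1), n_features):  # random predictor subset
        for t in {r[f] for r in rows}:
            left = [r for r in rows if r[f] <= t]
            right = [r for r in rows if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini([r[-1] for r in left]) +
                     len(right) * gini([r[-1] for r in right])) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:                          # sampled feature cannot split: majority leaf
        return Counter(labels).most_common(1)[0][0]
    _, f, t, left, right = best
    return (f, t, grow_tree(left, n_features), grow_tree(right, n_features))

def tree_predict(node, x):
    while isinstance(node, tuple):
        f, t, l, r = node
        node = l if x[f] <= t else r
    return node

def build_forest(rows, n_trees=25, n_features=1):
    forest = []
    for _ in range(n_trees):
        boot = [random.choice(rows) for _ in rows]   # bootstrap sample (with replacement)
        forest.append(grow_tree(boot, n_features))   # grown without pruning
    return forest

def forest_predict(forest, x):
    votes = Counter(tree_predict(t, x) for t in forest)  # each tree votes
    return votes.most_common(1)[0][0]                    # majority wins

random.seed(0)
rows = [(0, 1, "A"), (0, 0, "A"), (1, 1, "B"), (1, 0, "B")] * 5  # class set by feature 0
forest = build_forest(rows)
print(forest_predict(forest, (0, 1)))
```

Individual trees can err when a bootstrap sample or a random feature subset hides the informative predictor, but the majority vote across trees is far more stable — which is exactly the point of the ensemble.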

5 Random Forest Features (in the words of Breiman)
"Unexcelled" accuracy.
Does not overfit.
Can determine variable importance.
Does not require cross-validation (uses out-of-bag (oob) estimates instead).
Runs efficiently on large datasets.
Easy-to-use randomForest package in R, based on Breiman's Fortran code.
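The oob estimate works because each bootstrap sample leaves out roughly 1/e ≈ 37% of the rows, and those held-out rows act as a free test set for the tree grown on that sample. A quick simulation (illustrative, not from the slides) confirms the fraction:

```python
# Out-of-bag fraction: how many rows a bootstrap sample of size n misses.
# Each row is left out with probability (1 - 1/n)^n, which approaches 1/e.

import random

random.seed(0)
n, trees = 1000, 200
oob_fraction = 0.0
for _ in range(trees):
    boot = {random.randrange(n) for _ in range(n)}  # distinct indices drawn with replacement
    oob_fraction += (n - len(boot)) / n             # rows never drawn are "out of bag"
oob_fraction /= trees
print(round(oob_fraction, 3))  # close to 1/e ≈ 0.368
```

Because every row is out of bag for about a third of the trees, aggregating those trees' votes on it yields an honest error estimate without a separate validation split.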

