
Multiple Decision Trees ISQS7342



Presentation on theme: "Multiple Decision Trees ISQS7342"— Presentation transcript:

1 Multiple Decision Trees ISQS7342
Xiayi Kuang, Chin Hwa Tan

2 Outline
1. Multiple decision trees
2. Advantages of multiple decision trees
3. Major multiple decision tree methods
4. Multiple random classification decision trees

3 Multiple decision trees
The original data set is used to create derived data sets or alternative decision tree models. These data sets or models are used to develop multiple decision tree models and scoring algorithms.
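For illustration, a minimal sketch of this idea in Python with scikit-learn, assuming a synthetic placeholder data set (X, y) stands in for the original data: several trees are fit on derived (resampled) data sets and their scores are averaged.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.utils import resample

    # Placeholder for the original data set.
    X, y = make_classification(n_samples=500, random_state=0)

    trees = []
    for seed in range(10):
        # Each derived data set is a random resample of the original data.
        X_s, y_s = resample(X, y, random_state=seed)
        trees.append(DecisionTreeClassifier(random_state=seed).fit(X_s, y_s))

    # The score for each record is the average of the individual tree scores.
    avg_score = np.mean([t.predict_proba(X)[:, 1] for t in trees], axis=0)
    print(avg_score[:5])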

4 Comparison
Single decision tree sample results:
IF Age > 20 AND Weight > 180 (lbs.) THEN Gender = Male
Multiple decision trees sample results:
Rule 1: IF Age > 20 AND Weight > 180 (lbs.)
Rule 2: IF Height > 160 (cm.) AND BodyMass < .3
THEN Probability of Gender = Female is .3

5 Advantages
Use a randomization approach.
Create an aggregate outcome that summarizes all of the decision trees.
Significantly improve model results.
Because many models are developed and averaged, the results are highly stable.

6 Major multiple decision tree methods
Cross-validation
Bootstrapping
Bagging
Boosting
AdaBoost

7 Cross-Validation
Used in the CRT decision tree approach.
The decision tree is allowed to grow larger than normal because cross-validation is then used to trim its branches.
The data are partitioned into ten parts; nine of the partitions are used as cross-validation training data, and the remaining partition is used as an independent test sample.
This is repeated ten times, and the average error rate is calculated; this average is known as the cross-validation cost.
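A minimal sketch of this ten-fold procedure with scikit-learn, assuming a synthetic placeholder data set (X, y): cross_val_score fits the tree on nine partitions and tests on the remaining one, ten times over, and the average error rate gives the cross-validation cost.

    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=500, random_state=0)

    # Ten-fold cross-validation: train on 9 partitions, test on the tenth.
    accuracy = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=10)
    cv_cost = 1.0 - accuracy.mean()   # average error rate over the ten folds
    print(f"cross-validation cost: {cv_cost:.3f}")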

8 Bootstrapping
The bootstrap process begins with the original (training) data set and forms a bootstrap sample by repeatedly selecting observations from the original data set at random, with replacement.
Bootstrapping can also be used to assess the stability of various predictors, which serves as an indicator of how well the predictors will perform on new data sets.
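A minimal sketch of both uses, assuming a synthetic placeholder data set (X, y): bootstrap samples are drawn with replacement, a tree is fit on each, and the spread of the trees' feature importances indicates how stable each predictor is.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=500, n_features=10, random_state=0)
    rng = np.random.default_rng(0)

    importances = []
    for _ in range(50):
        idx = rng.integers(0, len(X), size=len(X))   # sample with replacement
        tree = DecisionTreeClassifier(random_state=0).fit(X[idx], y[idx])
        importances.append(tree.feature_importances_)

    # Low spread across bootstrap fits suggests a stable predictor.
    print(np.std(importances, axis=0))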

9 Bagging
Bagging stands for bootstrap aggregation.
Bagging refers to the creation of a pooled estimate of the target.
Bagging can improve the predictive accuracy of unstable models.
Bagging helps smooth out the predictions.
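A minimal sketch with scikit-learn's BaggingClassifier, again assuming a synthetic placeholder data set (X, y); the bagged ensemble typically smooths the predictions of the unstable single tree.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import BaggingClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=500, random_state=0)

    single = DecisionTreeClassifier(random_state=0)
    bagged = BaggingClassifier(single, n_estimators=100, random_state=0)

    print("single tree :", cross_val_score(single, X, y, cv=10).mean())
    print("bagged trees:", cross_val_score(bagged, X, y, cv=10).mean())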

10 Boosting
Boosting uses varying probabilities when selecting the observations to include in each sample.
The goal of boosting is to increase the selection probability of observations that the current model predicts poorly, so that later trees concentrate on the hard cases.
Boosting builds a series of decision trees, and the prediction is improved incrementally by each successive tree.
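A minimal sketch of the reweighting idea, assuming a synthetic placeholder data set (X, y); the doubling of weights on misclassified records is only illustrative, not the exact update of any particular boosting algorithm.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=500, random_state=0)
    weights = np.full(len(y), 1.0 / len(y))

    trees = []
    for i in range(5):
        # Fit the next tree with the current observation weights.
        tree = DecisionTreeClassifier(max_depth=1, random_state=i)
        tree.fit(X, y, sample_weight=weights)
        trees.append(tree)
        # Records this tree gets wrong count for more in the next round.
        miss = tree.predict(X) != y
        weights[miss] *= 2.0
        weights /= weights.sum()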

11 AdaBoost
AdaBoost is a form of boosting that builds an initial model from the training data set.
In that model, some records are correctly classified by the decision algorithm and some are misclassified; the misclassified records are given greater weight when the next model in the series is built.
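A minimal usage sketch with scikit-learn's AdaBoostClassifier, which implements this reweighting scheme; (X, y) is again a synthetic placeholder data set.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.model_selection import cross_val_score

    X, y = make_classification(n_samples=500, random_state=0)

    # By default each boosted model is a shallow decision tree (a stump).
    ada = AdaBoostClassifier(n_estimators=100, random_state=0)
    print(cross_val_score(ada, X, y, cv=10).mean())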

12 Multiple random classification decision trees
Random forests
Each decision tree in the random forest is grown on a bootstrap sample of the training data set, and only a random subset of the predictors is considered at each split.
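A minimal usage sketch with scikit-learn's RandomForestClassifier, assuming a synthetic placeholder data set (X, y).

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    X, y = make_classification(n_samples=500, random_state=0)

    # Each tree is grown on a bootstrap sample; each split considers a random
    # subset of the predictors.
    forest = RandomForestClassifier(n_estimators=200, random_state=0)
    print(cross_val_score(forest, X, y, cv=10).mean())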

13 Thank You!

