KAIR 2013 Nov 7, 2013 A Data Driven Analytic Strategy for Increasing Yield and Retention at Western Kentucky University Matt Bogard Office of Institutional.


1 KAIR 2013 Nov 7, 2013 A Data Driven Analytic Strategy for Increasing Yield and Retention at Western Kentucky University Matt Bogard Office of Institutional Research Western Kentucky University

2 Purpose
Are there opportunities at the applicant stage to improve our yield, implement cost savings, and shape our freshman class to maximize retention?
Is there a way of knowing which applicants are most likely to enroll and be retained?

3 Methodology: Machine Learning vs. Statistical Inference
Emphasis on accurate predictions vs. inferences about the particular roles of specific variables.
Decision Trees
Ensemble Methods
Gradient Boosting
Neural Networks

4 Decision Tree Basics: Algorithm
Chooses variables and split values, creating data partitions that differ on the outcome of interest (retention).
Evaluates all possible splits based on an adjusted χ² p-value.
Prunes the tree to derive the most accurate predictions with the fewest possible splits, based on validation data.
The final model is characterized by the split values for each explanatory variable, which create a set of rules for classifying new cases.
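The split search described above can be sketched as follows. This is a minimal illustration, not the SAS Enterprise Miner procedure: it assumes a single numeric predictor, a binary outcome, and a Bonferroni multiplier as a stand-in for the "adjusted" p-value (the slides do not say which adjustment is used). The data and the names `best_split`, `gpa`, and `retained` are hypothetical.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 0.0  # degenerate table carries no information
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)

def best_split(x, y):
    """Return (threshold, adjusted_p) for the best binary split of x on a 0/1 outcome y."""
    values = sorted(set(x))
    # Candidate thresholds: midpoints between consecutive distinct values.
    candidates = [(lo + hi) / 2 for lo, hi in zip(values, values[1:])]
    best = (None, 1.0)
    for t in candidates:
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        a, b = sum(left), len(left) - sum(left)
        c, d = sum(right), len(right) - sum(right)
        stat = chi2_2x2(a, b, c, d)
        p = math.erfc(math.sqrt(stat / 2))      # chi-square upper tail, 1 degree of freedom
        p_adj = min(1.0, p * len(candidates))   # Bonferroni adjustment (assumption)
        if p_adj < best[1]:
            best = (t, p_adj)
    return best

# Hypothetical data: high-school GPA and whether the student was retained.
gpa = [2.0, 2.2, 2.5, 2.8, 3.0, 3.2, 3.5, 3.8]
retained = [0, 0, 0, 0, 1, 1, 1, 1]
threshold, p_adj = best_split(gpa, retained)
print(threshold, p_adj)
```

A full tree would apply `best_split` recursively to each partition and then prune against validation data; that part is omitted here.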

5 Basic Decision Tree Visualization

6 Benefits of Decision Trees
"Approaching problems by looking for a data model imposes an a priori straight jacket that restricts the ability of statisticians to deal with a wide range of statistical problems." – Leo Breiman, Statistical Modeling: The Two Cultures (Statistical Science, 2001)
Non-parametric and non-linear
No distributional assumptions
Treat the data generation process as unknown
No required functional form for predictors
Identify complex interactions

7 Ensemble Methods
Generalization error: how well does a model predict across training, validation, and test data sets?
Ensemble: the combined predictions of several learners or models.
The generalization error of a weighted combination of predictors in an ensemble is equal to the average error of the individual predictors minus the 'disagreement' among them (Krogh & Sollich, 1997, Statistical Mechanics of Ensemble Learning, Physical Review E).
Ensemble error is smaller than the weighted average of the errors of the individual predictors.
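The "average error minus disagreement" statement can be written out for squared error as the ambiguity decomposition. The notation below is an assumption made for illustration (it does not appear on the slides): f_k are the individual predictors, w_k their non-negative weights summing to one, and y the target.

```latex
\begin{aligned}
\bar{f}(x) &= \sum_k w_k f_k(x), \qquad w_k \ge 0,\ \sum_k w_k = 1 \\
E &= \mathbb{E}\big[(\bar{f}(x) - y)^2\big] = \bar{E} - \bar{A} \\
\bar{E} &= \sum_k w_k \,\mathbb{E}\big[(f_k(x) - y)^2\big] \\
\bar{A} &= \sum_k w_k \,\mathbb{E}\big[(f_k(x) - \bar{f}(x))^2\big] \;\ge\; 0
\end{aligned}
```

Because the disagreement term is never negative, the ensemble error cannot exceed the weighted average of the individual errors, and it is strictly smaller whenever the predictors disagree.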

8 Gradient Boosting
Boosting algorithms build an ensemble from a series of weak learners.
A series of trees is fit using resampled training data, weighted by the classification accuracy of the previous tree.
The combined series of trees forms a single model.
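To make the reweighting idea concrete, here is a minimal sketch using AdaBoost-style weights with one-split trees (stumps). This is a swapped-in simplification chosen for brevity, not the gradient boosting procedure in SAS Enterprise Miner, and it reweights cases rather than resampling them. The data and function names are hypothetical.

```python
import math

def stump_predict(xi, t, s):
    """One-split tree: predict s when xi > t, otherwise -s (s is +1 or -1)."""
    return s if xi > t else -s

def fit_stump(x, y, w):
    """Return (weighted_error, threshold, sign) of the best stump under weights w."""
    best = None
    for t in sorted(set(x)):
        for s in (1, -1):
            err = sum(wi for xi, yi, wi in zip(x, y, w)
                      if stump_predict(xi, t, s) != yi)
            if best is None or err < best[0]:
                best = (err, t, s)
    return best

def boost(x, y, rounds=3):
    """Fit a series of stumps, up-weighting the cases the previous stump got wrong."""
    n = len(x)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, t, s = fit_stump(x, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)    # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # more accurate stumps get a larger vote
        model.append((alpha, t, s))
        # Misclassified cases (yi * prediction == -1) have their weight increased.
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, t, s))
             for xi, yi, wi in zip(x, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, xi):
    """Combined model: sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(xi, t, s) for alpha, t, s in model)
    return 1 if score > 0 else -1

# Hypothetical data that no single stump can classify perfectly.
x = [1, 2, 3, 4, 5, 6]
y = [1, 1, -1, -1, 1, 1]
model = boost(x, y, rounds=3)
print([predict(model, xi) for xi in x])
```

On this toy data the best single stump still misclassifies a third of the cases, while the three-stump combination classifies all six correctly, which is the sense in which a series of weak learners forms a single stronger model.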

9 Neural Networks
A nonlinear model of complex relationships with 'hidden' layers.
Using logistic activation functions, NNETs can be visualized as an ensemble of logits:
$$Y = W_0 + W_1 H_1 + W_2 H_2 + W_3 H_3 + W_4 H_4$$
where
$$H_j = \operatorname{logit}(w_{j0} + w_{j1} x_1 + w_{j2} x_2), \qquad j = 1, \dots, 4$$
Here logit(·) denotes the logistic activation function, 1 / (1 + e^{-z}).
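The network on this slide (two inputs, four hidden units, a linear output) can be sketched as a forward pass. The weights below are made-up values for illustration only, and the slide's "logit" is read as the logistic function 1 / (1 + e^{-z}), consistent with "logistic activation functions".

```python
import math

def logistic(z):
    """Logistic activation, written as logit(.) on the slide."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, hidden_weights, output_weights):
    """Compute Y = W0 + sum_j Wj * Hj, with Hj = logistic(wj0 + wj1*x1 + wj2*x2).

    x              : [x1, x2]
    hidden_weights : one [wj0, wj1, wj2] row per hidden unit
    output_weights : [W0, W1, ..., W4]
    """
    hidden = [logistic(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))
              for w in hidden_weights]
    return output_weights[0] + sum(W * h for W, h in zip(output_weights[1:], hidden))

# Hypothetical weights for the four hidden units and the output layer.
hidden_weights = [
    [0.1, 0.5, -0.3],
    [-0.2, 0.8, 0.1],
    [0.0, -0.4, 0.6],
    [0.3, 0.2, 0.2],
]
output_weights = [0.05, 1.0, -0.5, 0.7, 0.2]
print(forward([1.0, 2.0], hidden_weights, output_weights))
```

Each hidden unit is a small logistic regression on the inputs, and the output combines them linearly, which is why the slide describes the network as an ensemble of logits.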

10 Gradient Boosting vs. Decision Trees vs. NNETs vs. Logistic Regression
Decision Trees and Gradient Boosting are both robust to the data generation process.
Decision Trees have a more transparent model structure, which is lost in ensemble methods like gradient boosting and in neural networks.
Neural Networks have issues with input selection and are more complex to train.
The decision tree posterior probability distribution may not be very smooth.

11 Gradient Boosting vs. Decision Trees vs. Logistic Regression
Logistic Regression provides:
A smooth posterior probability distribution.
A model structure that is less transparent than decision trees but more transparent than gradient boosting.
It can be used for inference, or as an agnostic learning algorithm based on a specified functional form.*
*Some may refuse to make this distinction and make inferences where inappropriate.

12 Machine Learning vs. Inference
Trees can guide and direct further inferential work, but can be misleading in terms of causal relationships if you are not careful

13 Fitting the Models

14 Results
Focus: how well the model predicts behavior vs. inferences about the roles of specific variables.
There is a tradeoff between discrimination (measured by ROC) and model calibration (Cook, 2007).
Gradient Boosting outperformed the other models based on calibration.
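The difference between discrimination and calibration can be illustrated with two hypothetical models that rank applicants identically but report different probabilities. The sketch below computes the area under the ROC curve from pairwise comparisons and uses the Brier score as one simple calibration-sensitive summary; the slides do not say which calibration measure was actually used, so that choice is an assumption.

```python
def auc(y_true, y_score):
    """Area under the ROC curve: the chance that a random positive case
    is scored above a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def brier(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

# Hypothetical outcomes (1 = enrolled) and predicted probabilities.
y = [0, 0, 1, 0, 1, 1]
model_a = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]
model_b = [p / 5 for p in model_a]  # same ranking, probabilities far too low

print(auc(y, model_a), auc(y, model_b))      # identical discrimination
print(brier(y, model_a), brier(y, model_b))  # model_b scores worse
```

Both models have the same ROC area because ROC depends only on the ordering of the scores, while the second model's systematically low probabilities show up only in the calibration-sensitive measure.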

15 Scorecard Using our models, we can sort applicants into 4 categories for enrollment propensity and predicted retention.
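One way such a scorecard could be built is a two-by-two grid over the two model scores. This is a sketch under stated assumptions: the slides do not define the four categories or any cutoffs, so the 0.5 thresholds, the category labels, and the function name `scorecard_cell` are all hypothetical.

```python
def scorecard_cell(p_enroll, p_retain, enroll_cut=0.5, retain_cut=0.5):
    """Place an applicant in one of four cells using two predicted probabilities.

    The cutoffs are illustrative defaults, not values from the presentation.
    """
    enroll = "high" if p_enroll >= enroll_cut else "low"
    retain = "high" if p_retain >= retain_cut else "low"
    return f"{enroll} enrollment / {retain} retention"

# Hypothetical applicants: (enrollment propensity, predicted retention).
applicants = {
    "A": (0.8, 0.9),
    "B": (0.8, 0.3),
    "C": (0.2, 0.7),
    "D": (0.1, 0.2),
}
for name, (p_enroll, p_retain) in applicants.items():
    print(name, scorecard_cell(p_enroll, p_retain))
```

In practice the cutoffs would be chosen from the score distributions and recruiting capacity rather than fixed at 0.5.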

16 Implementation: Use advanced analytics to develop a strategic recruitment and retention strategy

17 Ad hoc Reports
Report by Counselor/Region/Territory
Report by County/School
Report by Student demographics
Report by Prospect Source
…other??

18 IR-DSS

19 Detail Reporting

20 Additional Reading
Bogard, M.T. (2013). A Data Driven Analytic Strategy for Increasing Yield and Retention at Western Kentucky University Using SAS Enterprise BI and SAS Enterprise Miner. Proceedings of the SAS® Global Forum 2013 Conference. Cary, NC: SAS Institute Inc.
de Ville, Barry. (2006). Decision Trees for Business Intelligence and Data Mining Using SAS® Enterprise Miner. SAS® Institute.
de Ville, Barry, and Neville, Padraic. SAS® Institute.
Friedman, Jerome H. (2001). Greedy function approximation: A gradient boosting machine. The Annals of Statistics, 29.
Hastie, Tibshirani, and Friedman. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Second Edition. Springer-Verlag.
Breiman, L. (2001). Statistical Modeling: The Two Cultures. Statistical Science, 16 (3), 199–231.
Cook, Nancy R. (2007). Use and Misuse of the Receiver Operating Characteristic Curve in Risk Prediction. Circulation, 115 (7).
Krogh, A. & Sollich, P. (1997, January). Statistical mechanics of ensemble learning. Physical Review E (Statistical Physics, Plasmas, Fluids, and Related Interdisciplinary Topics), 55 (1).

