Presentation is loading. Please wait.

Presentation is loading. Please wait.

Eco 6380 Predictive Analytics For Economists Spring 2016 Professor Tom Fomby Department of Economics SMU.

Similar presentations


Presentation on theme: "Eco 6380 Predictive Analytics For Economists Spring 2016 Professor Tom Fomby Department of Economics SMU."— Presentation transcript:

1 Eco 6380 Predictive Analytics For Economists Spring 2016 Professor Tom Fomby Department of Economics SMU

2 Presentation 5 Building a Predictive Regression Model using Data Mining Techniques (Backward, Forward, Stepwise, All Regressions, Cp Statistic) Chapter 6 in SPB

3 Building a Data Mining Model: Example with Linear Regression – The Boston Housing Data Set Selecting a Best Subset Regression Selection Methods: Forward, Backward, Stepwise, and All Subsets Methods. See the file “Multiple Linear Regression and Subset Selection.pdf” on the class website. The critical p-value as a “tuning parameter” Thus the best “architecture” of the regression equation is determined by the choice of best subset selection method as well as the choice of the p-value tuning parameter of the respective subset selection methods

4 Building a Model: Example with Linear Regression – The Boston Housing Data Set continued Demonstrate the danger of over-fitting a multiple regression equation by running the SAS simulation program “Cross Validation.sas” The Validation data set reveals the true bogus nature of “spurious” regressions

5 Evaluation of Predictive Performance I.Various Measures of Predictive Accuracy II.Comparisons of Predictive Accuracies across competing prediction models in the validation data set III.Is the Predictive Accuracy of one method statistically better than the Predictive Accuracy of a competing method?

6 Now go directly to the file “Scoring Measures for Prediction Problems.pdf” on the class website

7 Building a Model: Example with Linear Regression – The Boston Housing Data Set continued Rule of Thumb Adjustment of p-values in linear regressions arising from multiple testing vis-à-vis subset selection methods. See the paper on Data Mining by Michael C. Lovell I have posted on the class website in the file Data Mining_Lovell.pdf and, in particular, equation (3).

8 Building a Model: Example with Linear Regression – The Boston Housing Data Set continued Demonstrate the Best Subset Selection Methods on the Boston Housing Data Set using the SAS program “Boston Best Subset Partitioned.sas” that can be found on the course website Demonstrate Best Subset Selection Procedure in SPSS Modeler

9 Classroom Exercise: Exercise 3


Download ppt "Eco 6380 Predictive Analytics For Economists Spring 2016 Professor Tom Fomby Department of Economics SMU."

Similar presentations


Ads by Google