Statistics 350 Lecture 25. Last day: started Chapter 9 (9.1-9.3); please read 9.1 and 9.2 thoroughly. Today: more Chapter 9…stepwise regression.


Statistics 350 Lecture 25

Today
Last Day: Started Chapter 9 (9.1-9.3)…please read 9.1 and 9.2 thoroughly
Today: More Chapter 9…stepwise regression

Stepwise Variable Selection
Three categories of stepwise variable selection: Forward Selection, Backward Elimination, and Stepwise Selection

Forward Selection
For all methods considered today, there are P-1 possible predictors and 2^(P-1) possible models
Start with no variables in the model
Select a significance level at which variables can be included in the model
Find the critical value of F for this level: F_ENTER

Forward Selection
Consider every possible 1-variable model; for each, compute the extra sum of squares F statistic, F*, and enter the variable with the largest F*, provided F* >= F_ENTER

Forward Selection
Each time a variable is entered into the model (i.e., the maximum F* is big enough), use the newly-augmented model as the base model
Check the extra SS for each remaining variable
For example, if X_a is entered, then at the next step check SSR(X_k | X_a) for all variables X_k other than X_a, which is already in the model
Keep adding variables and revising the base model until at some step F* < F_ENTER; then no more variables can be added

Forward Selection
The final model is the last base model
The procedure gives a single model, declared best by the procedure
Also, once a variable is added, it can never be removed, even if subsequent additions render it unimportant (e.g., through multicollinearity)
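As a concrete illustration, the forward procedure above can be sketched in plain Python. Everything here is invented for the example, not taken from the lecture: the tiny orthogonal data set, the `ols_sse` helper (least squares via the normal equations), and the cutoff F_ENTER = 4.0 standing in for the critical value of F.

```python
def ols_sse(cols, y):
    """Residual sum of squares from regressing y on an intercept plus
    the given predictor columns (normal equations + Gaussian elimination)."""
    n = len(y)
    X = [[1.0] + [c[i] for c in cols] for i in range(n)]
    p = len(X[0])
    A = [[sum(X[i][j] * X[i][k] for i in range(n)) for k in range(p)] for j in range(p)]
    b = [sum(X[i][j] * y[i] for i in range(n)) for j in range(p)]
    for c in range(p):                      # forward elimination with partial pivoting
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        b[c], b[piv] = b[piv], b[c]
        for r in range(c + 1, p):
            f = A[r][c] / A[c][c]
            for k in range(c, p):
                A[r][k] -= f * A[c][k]
            b[r] -= f * b[c]
    beta = [0.0] * p
    for r in range(p - 1, -1, -1):          # back-substitution
        beta[r] = (b[r] - sum(A[r][k] * beta[k] for k in range(r + 1, p))) / A[r][r]
    return sum((y[i] - sum(X[i][j] * beta[j] for j in range(p))) ** 2 for i in range(n))

def forward_selection(predictors, y, f_enter=4.0):
    """Forward selection: repeatedly enter the variable with the largest
    extra-SS F statistic, stopping once the maximum F* < f_enter."""
    n = len(y)
    selected, remaining = [], sorted(predictors)
    sse_base = ols_sse([], y)               # start from the intercept-only model
    while remaining:
        best_name, best_f, best_sse = None, -1.0, None
        for name in remaining:
            cand = selected + [name]
            sse = ols_sse([predictors[k] for k in cand], y)
            f_star = (sse_base - sse) / (sse / (n - len(cand) - 1))
            if f_star > best_f:
                best_name, best_f, best_sse = name, f_star, sse
        if best_f < f_enter:                # even the best candidate is too weak
            break
        selected.append(best_name)          # once in, never removed
        remaining.remove(best_name)
        sse_base = best_sse
    return selected

# Invented toy data: orthogonal +/-1 design; y depends on x1 and x2 only.
x1 = [1, 1, 1, 1, -1, -1, -1, -1]
x2 = [1, 1, -1, -1, 1, 1, -1, -1]
x3 = [1, -1, 1, -1, 1, -1, 1, -1]
e = [0.2, -0.2, -0.2, 0.2, -0.2, 0.2, 0.2, -0.2]
y = [5 + 3 * a + 1.5 * b + c for a, b, c in zip(x1, x2, e)]
print(forward_selection({"x1": x1, "x2": x2, "x3": x3}, y))  # ['x1', 'x2']
```

In practice f_enter would come from the F distribution at the chosen α rather than being fixed at 4.0.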

Backward Elimination
Start with all P-1 variables in the model
Select a significance level at which variables can remain in the model, and find the corresponding critical value of F: F_STAY

Backward Elimination
Consider every possible 1-variable reduction of the model; for each variable in the model, compute its partial F statistic, F_k, and remove the variable with the smallest F_k, provided F_k < F_STAY

Backward Elimination
Each time a variable is dropped, use the revised model as the base model and check the extra SS for each variable remaining in the model
Keep eliminating variables and revising the model until all variables remaining in the model have F_k > F_STAY

Backward Elimination
Gives a best model according to this criterion; it may differ from the one given by Forward Selection
Once a variable is removed, it remains out of the model, even if subsequent eliminations render it useful
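The backward procedure can be sketched the same way. Again everything is an assumption made for illustration: the toy data, the `ols_sse` least-squares helper, and the cutoff F_STAY = 4.0 standing in for the critical value.

```python
def ols_sse(cols, y):
    """Residual sum of squares from regressing y on an intercept plus
    the given predictor columns (normal equations + Gaussian elimination)."""
    n = len(y)
    X = [[1.0] + [c[i] for c in cols] for i in range(n)]
    p = len(X[0])
    A = [[sum(X[i][j] * X[i][k] for i in range(n)) for k in range(p)] for j in range(p)]
    b = [sum(X[i][j] * y[i] for i in range(n)) for j in range(p)]
    for c in range(p):                      # forward elimination with partial pivoting
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        b[c], b[piv] = b[piv], b[c]
        for r in range(c + 1, p):
            f = A[r][c] / A[c][c]
            for k in range(c, p):
                A[r][k] -= f * A[c][k]
            b[r] -= f * b[c]
    beta = [0.0] * p
    for r in range(p - 1, -1, -1):          # back-substitution
        beta[r] = (b[r] - sum(A[r][k] * beta[k] for k in range(r + 1, p))) / A[r][r]
    return sum((y[i] - sum(X[i][j] * beta[j] for j in range(p))) ** 2 for i in range(n))

def backward_elimination(predictors, y, f_stay=4.0):
    """Backward elimination: repeatedly drop the variable with the smallest
    partial F statistic, stopping once every remaining F_k > f_stay."""
    n = len(y)
    kept = sorted(predictors)               # start with the full model
    sse_cur = ols_sse([predictors[k] for k in kept], y)
    while kept:
        mse = sse_cur / (n - len(kept) - 1)
        worst_name, worst_f, worst_sse = None, None, None
        for name in kept:
            reduced = [k for k in kept if k != name]
            sse_red = ols_sse([predictors[k] for k in reduced], y)
            f_k = (sse_red - sse_cur) / mse  # partial F for dropping `name`
            if worst_f is None or f_k < worst_f:
                worst_name, worst_f, worst_sse = name, f_k, sse_red
        if worst_f > f_stay:                # every variable earns its keep
            break
        kept.remove(worst_name)             # once out, never re-enters
        sse_cur = worst_sse
    return kept

# Invented toy data: orthogonal +/-1 design; y depends on x1 and x2 only.
x1 = [1, 1, 1, 1, -1, -1, -1, -1]
x2 = [1, 1, -1, -1, 1, 1, -1, -1]
x3 = [1, -1, 1, -1, 1, -1, 1, -1]
e = [0.2, -0.2, -0.2, 0.2, -0.2, 0.2, 0.2, -0.2]
y = [5 + 3 * a + 1.5 * b + c for a, b, c in zip(x1, x2, e)]
print(backward_elimination({"x1": x1, "x2": x2, "x3": x3}, y))  # ['x1', 'x2']
```

On this orthogonal toy design the backward answer agrees with forward selection, but as the slides note, the two procedures need not agree in general.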

Stepwise Selection
Alternates between Forward and Backward steps to address the problems noted above
Start with no variables in the model

Stepwise Selection
After each Backward phase, use the revised model as the base model from which to begin another round of Forward/Backward
Continue until no further variables can be added or removed

Stepwise Selection
Note that in each forward phase, only one variable can be added before the new model is trimmed with (possibly multiple steps of) backward elimination
The final model may or may not match either of the models obtained using Forward Selection or Backward Elimination alone
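A sketch of the alternating procedure, with the same caveats: the data, the `ols_sse` least-squares helper, and the fixed cutoffs F_ENTER = F_STAY = 4.0 are all illustrative assumptions. Each forward phase admits at most one variable, and each backward phase may trim several.

```python
def ols_sse(cols, y):
    """Residual sum of squares from regressing y on an intercept plus
    the given predictor columns (normal equations + Gaussian elimination)."""
    n = len(y)
    X = [[1.0] + [c[i] for c in cols] for i in range(n)]
    p = len(X[0])
    A = [[sum(X[i][j] * X[i][k] for i in range(n)) for k in range(p)] for j in range(p)]
    b = [sum(X[i][j] * y[i] for i in range(n)) for j in range(p)]
    for c in range(p):                      # forward elimination with partial pivoting
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        b[c], b[piv] = b[piv], b[c]
        for r in range(c + 1, p):
            f = A[r][c] / A[c][c]
            for k in range(c, p):
                A[r][k] -= f * A[c][k]
            b[r] -= f * b[c]
    beta = [0.0] * p
    for r in range(p - 1, -1, -1):          # back-substitution
        beta[r] = (b[r] - sum(A[r][k] * beta[k] for k in range(r + 1, p))) / A[r][r]
    return sum((y[i] - sum(X[i][j] * beta[j] for j in range(p))) ** 2 for i in range(n))

def stepwise_selection(predictors, y, f_enter=4.0, f_stay=4.0):
    """Alternate one forward step with backward trimming until nothing changes."""
    assert f_enter >= f_stay, "need F_ENTER >= F_STAY to avoid an infinite loop"
    n = len(y)
    selected = []
    sse_cur = ols_sse([], y)
    while True:
        # Forward phase: enter at most one variable.
        best_name, best_f, best_sse = None, -1.0, None
        for name in (k for k in sorted(predictors) if k not in selected):
            sse = ols_sse([predictors[k] for k in selected + [name]], y)
            f_star = (sse_cur - sse) / (sse / (n - len(selected) - 2))
            if f_star > best_f:
                best_name, best_f, best_sse = name, f_star, sse
        if best_name is None or best_f < f_enter:
            return selected                 # nothing can enter: done
        selected.append(best_name)
        sse_cur = best_sse
        # Backward phase: trim variables that no longer earn their keep.
        while True:
            mse = sse_cur / (n - len(selected) - 1)
            worst_name, worst_f, worst_sse = None, None, None
            for name in selected:
                reduced = [k for k in selected if k != name]
                sse_red = ols_sse([predictors[k] for k in reduced], y)
                f_k = (sse_red - sse_cur) / mse
                if worst_f is None or f_k < worst_f:
                    worst_name, worst_f, worst_sse = name, f_k, sse_red
            if worst_f is None or worst_f > f_stay:
                break
            selected.remove(worst_name)
            sse_cur = worst_sse

# Invented toy data: orthogonal +/-1 design; y depends on x1 and x2 only.
x1 = [1, 1, 1, 1, -1, -1, -1, -1]
x2 = [1, 1, -1, -1, 1, 1, -1, -1]
x3 = [1, -1, 1, -1, 1, -1, 1, -1]
e = [0.2, -0.2, -0.2, 0.2, -0.2, 0.2, 0.2, -0.2]
y = [5 + 3 * a + 1.5 * b + c for a, b, c in zip(x1, x2, e)]
print(stepwise_selection({"x1": x1, "x2": x2, "x3": x3}, y))  # ['x1', 'x2']
```

The assert at the top enforces the F_ENTER >= F_STAY requirement discussed in the Comments slides.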

Comments
In all cases, the methods are based on insertion or deletion criteria
In forward steps: the candidate X_k with the largest F* = SSR(X_k | base model) / MSE(base model plus X_k) enters, provided F* >= F_ENTER
In backward steps: the variable X_k with the smallest F_k = SSR(X_k | all other variables in the model) / MSE(current model) is removed, provided F_k < F_STAY

Comments
The significance level is a personal decision
It is common practice in regression to use slightly higher levels of α when allowing variables to enter into or remain in the model than in other testing situations

Comments
Note that in Stepwise Selection, you must arrange for α_ENTER <= α_STAY or, equivalently, F_ENTER >= F_STAY
Otherwise, a variable's p-value could be small enough to enter the model but large enough to be eliminated at each step, leading to an infinite loop
One suggestion is to use α_STAY = 2 α_ENTER