2018-12-06 Lecture 5 732G21/732G28/732A35


Lecture 5
732G21/732G28/732A35
Linköpings universitet
2018-12-06

Extra sums of squares
The extra sum of squares is the difference between the SSE for a model with a given set of predictors and the SSE for the model with the same predictors plus one or more additional predictors.
Consider the model
Y = β0 + β1X1 + ε
Then the extra sum of squares from adding X2 to the model is
SSR(X2 | X1) = SSE(X1) − SSE(X1, X2)

Salary example

Regression Analysis: Salary (Y) versus Age (X1)

The regression equation is
Salary (Y) = 8.45 + 0.547 Age (X1)

Predictor   Coef     SE Coef   T      P
Constant    8.454    4.848     1.74   0.132
Age (X1)    0.5471   0.1099    4.98   0.003

S = 4.05592   R-Sq = 80.5%   R-Sq(adj) = 77.2%

Analysis of Variance
Source          DF   SS       MS       F       P
Regression       1   407.30   407.30   24.76   0.003
Residual Error   6    98.70    16.45
Total            7   506.00

Data for the salary example (one row per person):

Salary (Y)   Age (X1)   Highschool points (X2)
17           21           0
30           32         120
27           40          40
35           56          90
44           61         160
38           55         160
36           39         140
25           33          80

Regression Analysis: Salary (Y) versus Age (X1), Highschool points (X2)

The regression equation is
Salary (Y) = 10.1 + 0.319 Age (X1) + 0.0805 Highschool points (X2)

Predictor                Coef      SE Coef   T      P
Constant                 10.126    2.347     4.32   0.008
Age (X1)                 0.31869   0.07225   4.41   0.007
Highschool points (X2)   0.08049   0.01746   4.61   0.006

S = 1.93941   R-Sq = 96.3%   R-Sq(adj) = 94.8%

Analysis of Variance
Source          DF   SS       MS       F       P
Regression       2   487.19   243.60   64.76   0.000
Residual Error   5    18.81     3.76
Total            7   506.00

Source                   DF   Seq SS
Age (X1)                  1   407.30
Highschool points (X2)    1    79.90
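Reading the two outputs together gives a quick numerical check of the extra sum of squares; a minimal sketch in Python, with the SSE values copied from the Minitab output above:

```python
# Extra sum of squares from adding Highschool points (X2) to the model
# that already contains Age (X1); SSE values are from the Minitab output.
sse_x1 = 98.70       # SSE for the model with X1 only
sse_x1_x2 = 18.81    # SSE for the model with X1 and X2

ssr_x2_given_x1 = sse_x1 - sse_x1_x2
print(round(ssr_x2_given_x1, 2))  # 79.89, matching Seq SS = 79.90 up to rounding
```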

Partial F-test
Tests whether a subset of the regression coefficients are all zero:
H0: βq = βq+1 = … = βp−1 = 0
Ha: not all of the β's in H0 are zero
Test statistic:
F* = [SSR(Xq, …, Xp−1 | X1, …, Xq−1) / (p − q)] / [SSE(X1, …, Xp−1) / (n − p)]
Reject H0 if F* > F(1 − α; p − q; n − p)
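Applied to the salary example, the partial F-test of H0: β2 = 0 (dropping Highschool points from the two-predictor model) uses the SSE values from the outputs above, with n = 8, p = 3 and q = 2. A small sketch in Python:

```python
# Partial F-test for H0: beta2 = 0 in the salary example.
n, p, q = 8, 3, 2
sse_reduced = 98.70   # SSE(X1): model with Age only
sse_full = 18.81      # SSE(X1, X2): model with Age and Highschool points

f_star = ((sse_reduced - sse_full) / (p - q)) / (sse_full / (n - p))
print(round(f_star, 2))  # ~21.24; when a single coefficient is tested this
                         # equals the squared t-value for X2 (4.61^2 = 21.25)
```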

Data with the indicator variable Female/Male (X3) added:

Salary (Y)   Age (X1)   Highschool points (X2)   Female/Male (X3)
17           21           0                      1
30           32         120                      1
27           40          40                      0
35           56          90                      0
44           61         160                      1
38           55         160                      0
36           39         140                      1
25           33          80                      0

Regression Analysis: Salary (Y) versus Age (X1), Highschool points (X2), Female/Male (X3)

The regression equation is
Salary (Y) = 7.13 + 0.393 Age (X1) + 0.0652 Highschool points (X2) + 2.73 Female/Male (X3)

Predictor                Coef      SE Coef   T      P
Constant                 7.132     2.155     3.31   0.030
Age (X1)                 0.39317   0.06201   6.34   0.003
Highschool points (X2)   0.06521   0.01441   4.52   0.011
Female/Male (X3)         2.732     1.185     2.31   0.082

S = 1.42101   R-Sq = 98.4%   R-Sq(adj) = 97.2%

Analysis of Variance
Source          DF   SS       MS       F       P
Regression       3   497.92   165.97   82.20   0.000
Residual Error   4     8.08     2.02
Total            7   506.00

Source                   DF   Seq SS
Age (X1)                  1   407.30
Highschool points (X2)    1    79.90
Female/Male (X3)          1    10.73

Summary of tests of regression coefficients
Test whether a single βk = 0: t-test
Test whether all β's = 0: overall F-test
Test whether a subset of the β's = 0: partial F-test

Coefficient of partial determination
Measures the proportional reduction in the error sum of squares when another predictor is added to the model. Consider the model with X1 and add X2:
R²(Y2|1) = [SSE(X1) − SSE(X1, X2)] / SSE(X1)
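With the numbers from the salary example, the coefficient of partial determination for X2 given X1 works out as follows (a sketch, with SSE values taken from the Minitab outputs above):

```python
# Coefficient of partial determination R^2(Y2|1) in the salary example:
# the proportion of SSE(X1) that is removed by also including X2.
sse_x1 = 98.70
sse_x1_x2 = 18.81

r2_partial = (sse_x1 - sse_x1_x2) / sse_x1
print(round(r2_partial, 3))  # 0.809: adding X2 removes about 81% of the
                             # error sum of squares left by Age alone
```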

Multicollinearity
Multicollinearity is present when there is high correlation among the predictors. Its consequences:
Adding or deleting a predictor changes the estimated regression coefficients substantially.
The standard errors of the regression coefficients become very large, so conclusions from the model become imprecise.
Estimated regression coefficients can be individually nonsignificant even though the corresponding predictors are highly correlated with Y.
When we interpret a regression coefficient, we interpret the effect of one predictor while keeping the others constant. With high correlation among the predictors this interpretation breaks down, because changing one of them changes the others too (keeping them constant is possible mathematically, but not logically).
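The first consequence above can be illustrated with a small simulation (the data here are synthetic, not the salary data): when a near-copy of an existing predictor is added, the coefficient of the original predictor becomes unstable.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 40
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)  # nearly a duplicate of x1
y = 2.0 * x1 + rng.normal(size=n)        # the true model uses x1 only

# Least squares fit of y on x1 alone, then on x1 and x2 together.
A1 = np.column_stack([np.ones(n), x1])
A2 = np.column_stack([np.ones(n), x1, x2])
b_alone, *_ = np.linalg.lstsq(A1, y, rcond=None)
b_both, *_ = np.linalg.lstsq(A2, y, rcond=None)

print("x1 coefficient, x1 alone:", round(b_alone[1], 2))  # close to 2
print("x1 coefficient, x2 added:", round(b_both[1], 2))   # unstable estimate
```

With x2 almost equal to x1, the design matrix is nearly singular, so the second fit spreads the true effect of x1 across the two columns in a way that varies wildly from sample to sample.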

Indications of the presence of multicollinearity
Large changes in the regression coefficients when a predictor is added or deleted.
Nonsignificant t-tests on the regression coefficients of variables that, judging from the scatterplot matrix, the correlation matrix, and subject-matter reasoning, seemed very important.
Estimated regression coefficients with a sign opposite to what we expect.

Formal test of the presence of multicollinearity
The variance inflation factor for predictor Xk is
VIFk = 1 / (1 − R²k)
where R²k is the coefficient of determination from regressing Xk on the other X-variables in the model.
Consider a model with predictors X1, X2 and X3:
Regressing X1 on X2 and X3 gives R²1 and VIF1 = 1 / (1 − R²1)
Regressing X2 on X1 and X3 gives R²2 and VIF2 = 1 / (1 − R²2)
Regressing X3 on X1 and X2 gives R²3 and VIF3 = 1 / (1 − R²3)
Decision rule: if the largest VIF is greater than 10, or the average of the VIFs is considerably larger than 1, multicollinearity may be present in the model.
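The VIF computation can be sketched directly from the definition: regress each predictor on the remaining ones and invert 1 − R². The sketch below uses NumPy and synthetic, deliberately collinear data (not the salary data).

```python
import numpy as np

def vif(X):
    """VIF_k = 1 / (1 - R2_k), where R2_k is the coefficient of determination
    from regressing column k of X on the other columns (with an intercept)."""
    n, m = X.shape
    out = []
    for k in range(m):
        y = X[:, k]
        A = np.column_stack([np.ones(n), np.delete(X, k, axis=1)])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
        out.append(1.0 / (1.0 - r2))
    return out

# Synthetic example: x3 is almost exactly x1 + x2, so all VIFs are huge.
rng = np.random.default_rng(1)
x1 = rng.normal(size=50)
x2 = rng.normal(size=50)
x3 = x1 + x2 + rng.normal(scale=0.05, size=50)
print([round(v, 1) for v in vif(np.column_stack([x1, x2, x3]))])  # all far above 10
```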