Class 23: The most over-rated statistic; the four assumptions; the most important hypothesis test yet; using yes/no variables in regressions.



Adjusted R-square
Pp. 9-12 of Pfeifer note
Descriptive statistics for Hours: Median 7.08, Mode 7.17, Range 13.08, Minimum 2, Maximum 15.08, Count 15.
Our better method of forecasting hours would use a mean of 7.9 and a standard deviation of 3.89 (and the t-distribution with 14 dof).
This is the variation in Hours that regression will try to explain.

Adjusted R-square
Pp. 9-12 of Pfeifer note
Regression of Hours on MSF (summary output): Observations 15; ANOVA df: Regression 1, Residual 13, Total 14. Coefficients are reported for Intercept and MSF.
Our better method of forecasting hours for job A would use a mean of [...] and a standard deviation of 2.77 (and the t-distribution with 13 dof).
This is the variation in Hours that regression leaves unexplained.

Adjusted R-square Pg 9-12 Pfeifer note

From the Pfeifer note: a scale of adjusted R-square running from 0.0 through 0.5 to 1.0.
Adj R-square = 0.0 corresponds to standard error = s.
Adj R-square = 1.0 corresponds to standard error = 0.
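
The scale above follows from the standard identity adj R-square = 1 - (standard error / s)^2, where s is the sample standard deviation of the dependent variable. A minimal sketch, using the two rounded values quoted on the Hours slides (the function name is ours, for illustration only):

```python
def adjusted_r_square(std_error: float, s: float) -> float:
    """Adjusted R-square from the regression standard error and the
    sample standard deviation s of the dependent variable."""
    return 1.0 - (std_error / s) ** 2

# Rounded values quoted on the Hours slides: s = 3.89 before regression,
# standard error = 2.77 after regressing Hours on MSF.
adj_r2 = adjusted_r_square(std_error=2.77, s=3.89)
print(round(adj_r2, 3))  # roughly 0.493

# The two ends of the scale.
print(adjusted_r_square(std_error=3.89, s=3.89))  # standard error = s -> 0.0
print(adjusted_r_square(std_error=0.0, s=3.89))   # standard error = 0 -> 1.0
```

Because the inputs are rounded, the result only approximates the adjusted R-square in the original regression output.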

Why Pfeifer says R² is over-rated
There is no standard for how large it should be.
– In some situations an adjusted R² of 0.05 would be FANTASTIC. In others, an adjusted R² of 0.96 would be DISAPPOINTING.
It has no real use.
– Unlike “standard error”, which is needed to make probability forecasts.
It is usually redundant.
– When comparing models, lower standard errors mean higher adj R².
– The correlation coefficient (which shares the same sign as b) ≈ the square root of adj R².
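
The redundancy point can be checked numerically. The sketch below uses a small data set invented for illustration (not from the class) and computes the slope b, the correlation coefficient r, and the adjusted R-square for a one-variable regression.

```python
import math

# Hypothetical data, invented for illustration (not from the class).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

b = sxy / sxx                    # slope coefficient
r = sxy / math.sqrt(sxx * syy)   # correlation coefficient
r_square = r ** 2
# Adjusted R-square with one X variable: rescale by (n - 1) / (n - 2).
adj_r_square = 1.0 - (1.0 - r_square) * (n - 1) / (n - 2)

print(f"b = {b:.3f}, r = {r:.3f}")
print(f"|r| = {abs(r):.3f}, sqrt(adj R-square) = {math.sqrt(adj_r_square):.3f}")
```

The sign of r matches the sign of b, and |r| sits close to the square root of the adjusted R-square. The gap widens when n is small or the fit is weak.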

The Coal Pile Example
The firm needed a way to estimate the weight of a coal pile (based on its dimensions).
Regression of W on D, h, and d (summary output): Observations 10; ANOVA df: Regression 3, Residual 6, Total 9. Coefficients are reported for Intercept, D, h, and d.
[...]% of the variation in W is explained by this regression.
We just used MULTIPLE regression.

The Coal Pile Example
Engineer Bob calculated the Volume of each pile and used simple regression…
Regression of W on Vol (summary output): Observations 10; ANOVA df: Regression 1, Residual 8, Total 9. Coefficients are reported for Intercept and Vol.
100% of the variation in W is explained by this regression.
Standard error went from 20.6 to 2.8!!!
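
The slides do not say how the volume was computed. As a hedged illustration only, suppose each pile is a truncated cone with base diameter D, top diameter d, and height h; that shape, and that reading of D, h, and d, are assumptions rather than something the source confirms. Under them, the three dimensions collapse into a single X variable like this:

```python
import math

def frustum_volume(D: float, d: float, h: float) -> float:
    """Volume of a truncated cone with base diameter D, top diameter d
    and height h. ASSUMPTION: the pile shape and the meaning of D, d, h
    are hypothetical; the slides do not state Bob's formula."""
    return math.pi * h * (D ** 2 + D * d + d ** 2) / 12.0

# Hypothetical pile dimensions, invented for illustration.
vol = frustum_volume(D=20.0, d=8.0, h=6.0)
print(round(vol, 1))
```

The general idea is that a physically meaningful transformation of the inputs can let a simple regression (W on Vol) beat a multiple regression on the raw dimensions.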

The Four Assumptions
Sec 5 of Pfeifer note; Sec 12.4 of EMBS

The four assumptions
Sec 5 of Pfeifer note; Sec 12.4 of EMBS
Regression of Hours on MSF (summary output): Observations 15; ANOVA df: Regression 1, Residual 13, Total 14. Coefficients are reported for Intercept and MSF.
Our better method of forecasting hours for job A would use a mean of [...] and a standard deviation of 2.77 (and the t-distribution with 13 dof).
Using regression to make this probability forecast requires four assumptions:
1. Linearity
2. Independence (all 15 points count equally)
3. Homoskedasticity
4. Normality

Hypotheses
P 13 of Pfeifer note; Sec 12.5 of EMBS
– H0: P = 0.5 (LTT, wunderdog)
– H0: Independence (supermarket job and response, treatment and heart attack, light and myopia, tosser and outcome)
– H0: μ = 100 (IQ)
– H0: μ_M = μ_F (heights, weights, batting average)
– H0: μ_compact = μ_mid = μ_large (displacement)

H0: b = 0
P 13 of Pfeifer note; Sec 12.5 of EMBS

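Testing b=0 is EASY!!!
P 13 of Pfeifer note; Sec 12.5 of EMBS
Regression of Hours on MSF: the coefficient table has rows Intercept and MSF and columns Coefficients, Standard Error, t Stat, P-value, Lower 95%, Upper 95%.
In the MSF row:
– Standard Error is the standard error of the coefficient.
– t Stat is the t-stat to test b=0.
– P-value is the 2-tailed p-value.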
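
For readers who want to see where those three numbers come from, here is a minimal sketch of the arithmetic for a one-variable regression. The (MSF, Hours) pairs are invented for illustration; they are not the 15 observations from the class.

```python
import math

# Hypothetical (MSF, Hours) pairs, invented for illustration.
msf = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
hours = [3.2, 4.1, 4.9, 7.2, 6.8, 9.1, 9.4, 11.3]
n = len(msf)

mean_x = sum(msf) / n
mean_y = sum(hours) / n
sxx = sum((x - mean_x) ** 2 for x in msf)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(msf, hours))

b = sxy / sxx              # slope coefficient
a = mean_y - b * mean_x    # intercept
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(msf, hours))
std_error = math.sqrt(sse / (n - 2))   # regression standard error
se_b = std_error / math.sqrt(sxx)      # standard error of the coefficient
t_stat = b / se_b                      # t-stat to test H0: b = 0

print(f"b = {b:.3f}, se(b) = {se_b:.3f}, t = {t_stat:.2f}, dof = {n - 2}")
```

The 2-tailed p-value is the probability that a t-distributed variable with n - 2 dof lands farther from zero than t_stat. If SciPy is available, `2 * scipy.stats.t.sf(abs(t_stat), n - 2)` should give it.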

Using Yes/No variable in Regression
Sec 8 of Pfeifer note; Sec 13.7 of EMBS
Car    Class    Displacement  Fuel Type  Hwy MPG
1      Midsize  3.5           R          28
2      Midsize  3             R          26
3      Large    3             P          26
4      Large    3.5           P          [...]
…
[...]  Compact  6             P          20
59     Midsize  2.5           R          30
60     Midsize  2             R          32
Fuel Type is categorical; Hwy MPG is numerical. n=60.
Does MPG “depend” on fuel type?

Fuel type (yes/no) and mpg (numerical)
Sec 8 of Pfeifer note; Sec 13.7 of EMBS
Un-stack the data so there are two columns of MPG data, then run Data Analysis, t-Test: Two-Sample Assuming Equal Variances.
H0: μ_P = μ_R, or H0: μ_P - μ_R = 0
t-Test output (columns P and R): Observations 36 and 24; Hypothesized Mean Difference 0; df 58. The output also reports Mean, Variance, Pooled Variance, t Stat, and the one-tail and two-tail P(T<=t) and t Critical values.

Using Yes/No variables in Regression
Sec 8 of Pfeifer note; Sec 13.7 of EMBS
1. Convert the categorical variable into a 1/0 DUMMY variable.
– Use an IF statement to do this.
– It won’t matter which is assigned 1, which is assigned 0.
– It doesn’t even matter what 2 numbers you assign to the two categories (regression will adjust).
2. Regress MPG (numerical) on DUMMY (1/0 numerical).
3. Test H0: b=0 using the regression output.
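
Steps 1 and 2 can be sketched in a few lines. The rows below are a handful of (fuel type, highway MPG) pairs used for illustration only. They are not the full n=60 data set, so the numbers will not match the class output.

```python
# Illustrative rows only (not the full n = 60 data set).
rows = [("R", 28), ("R", 26), ("P", 26), ("P", 21),
        ("P", 25), ("P", 20), ("R", 30), ("R", 32)]

# Step 1: convert the categorical variable into a 1/0 dummy (1 = premium).
dprem = [1 if fuel == "P" else 0 for fuel, _ in rows]
mpg = [m for _, m in rows]
n = len(rows)

# Step 2: regress MPG on the dummy (least squares with one X variable).
mean_d = sum(dprem) / n
mean_mpg = sum(mpg) / n
sxx = sum((d - mean_d) ** 2 for d in dprem)
sxy = sum((d - mean_d) * (m - mean_mpg) for d, m in zip(dprem, mpg))
b = sxy / sxx
intercept = mean_mpg - b * mean_d

# Group means, for comparison with the regression coefficients.
mean_regular = sum(m for d, m in zip(dprem, mpg) if d == 0) / dprem.count(0)
mean_premium = sum(m for d, m in zip(dprem, mpg) if d == 1) / dprem.count(1)

print(f"intercept = {intercept:.2f}, mean MPG of regular = {mean_regular:.2f}")
print(f"b = {b:.2f}, premium mean - regular mean = {mean_premium - mean_regular:.2f}")
```

With a 1/0 dummy, the intercept is the mean of the group coded 0 and b is the difference between the two group means. Swapping which group gets the 1 only flips the sign of b.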

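Using Yes/No variables in Regression
Sec 8 of Pfeifer note; Sec 13.7 of EMBS
Data with the dummy variable Dprem (1 = P, 0 = R):
Fuel Type  Dprem  Hwy MPG
R          0      28
R          0      26
P          1      [...]
P          [...]  [...]
P          1      21
P          1      25
P          1      20
R          0      30
R          0      32
Regression of Hwy MPG on Dprem (summary output): Observations 60. The output reports Adj R Square and Standard Error, an ANOVA table (df, SS, MS, F, Sig F), and a coefficient table (Coeff, Std Error, t Stat, P-value) with rows Intercept and Dprem. The surviving fragments show Sig F and the Dprem P-value on the order of E-04, and the Intercept P-value on the order of E-44.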

Regression with one Dummy variable
H0: μ_P = μ_R
Or H0: μ_P - μ_R = 0
Or H0: b = 0
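
The three hypotheses are the same test, which a small numerical check makes concrete. The sketch below uses a few illustrative MPG values (not the class data). It computes the equal-variance two-sample t-stat and the regression t-stat for the dummy coefficient.

```python
import math

# Illustrative values only (not the full n = 60 data set).
regular = [28, 26, 30, 32]   # Hwy MPG, fuel type R (dummy = 0)
premium = [26, 21, 25, 20]   # Hwy MPG, fuel type P (dummy = 1)

def mean(values):
    return sum(values) / len(values)

def sum_sq_dev(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

# Route 1: two-sample t-test assuming equal variances, H0: mu_P - mu_R = 0.
n_r, n_p = len(regular), len(premium)
pooled_var = (sum_sq_dev(regular) + sum_sq_dev(premium)) / (n_r + n_p - 2)
t_two_sample = (mean(premium) - mean(regular)) / math.sqrt(
    pooled_var * (1 / n_p + 1 / n_r))

# Route 2: regress MPG on the dummy and test H0: b = 0.
dummy = [0] * n_r + [1] * n_p
mpg = regular + premium
n = len(mpg)
sxx = sum_sq_dev(dummy)
sxy = sum((d - mean(dummy)) * (m - mean(mpg)) for d, m in zip(dummy, mpg))
b = sxy / sxx
a = mean(mpg) - b * mean(dummy)
sse = sum((m - (a + b * d)) ** 2 for d, m in zip(dummy, mpg))
se_b = math.sqrt(sse / (n - 2)) / math.sqrt(sxx)
t_regression = b / se_b

print(f"two-sample t = {t_two_sample:.4f}")
print(f"regression t = {t_regression:.4f}")
```

Both routes give the same t-stat with the same degrees of freedom (n - 2), so they give the same p-value.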

What we learned today
We learned about “adjusted R square”.
– The most over-rated statistic of all time.
We learned the four assumptions required to use regression to make a probability forecast of Y|X.
– And how to check each of them.
We learned how to test H0: b=0.
– And why this is such an important test.
We learned how to use a yes/no variable in a regression.
– Create a dummy variable.