More Simple Linear Regression

Presentation transcript:

More Simple Linear Regression (slide 1)

Variation (slide 2)
Remember that to calculate the standard deviation of a variable we take each value, subtract off the mean, and then square the result. (We also divide by something, but that is not important in this discussion.) In a regression setting, for the dependent variable Y, we define the total sum of squares SST as $\sum (Y_i - \bar{Y})^2$. SST can be rewritten as

$SST = \sum (Y_i - \hat{Y}_i + \hat{Y}_i - \bar{Y})^2 = \sum (\hat{Y}_i - \bar{Y})^2 + \sum (Y_i - \hat{Y}_i)^2 = SSR + SSE.$

Note: you may recall from algebra that $(a + b)^2 = a^2 + 2ab + b^2$. In our story here $a = Y_i - \hat{Y}_i$ and $b = \hat{Y}_i - \bar{Y}$, and the cross term 2ab adds up to zero over all the data points. That is not true in general in algebra, but it is true in this context of least squares regression. If this note makes no sense to you, do not worry; just use SST = SSR + SSE.
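For readers who want to see why the cross term drops out, here is a short derivation sketch. The symbols $b_0$ and $b_1$ (the estimated intercept and slope) and $e_i = Y_i - \hat{Y}_i$ (the residual) are notation assumed here, not taken from the slides, and the argument assumes a least squares line that includes an intercept.

```latex
\begin{align*}
\hat{Y}_i - \bar{Y} &= (b_0 + b_1 X_i) - (b_0 + b_1 \bar{X}) = b_1 (X_i - \bar{X}), \\
\sum 2ab &= 2 \sum (Y_i - \hat{Y}_i)(\hat{Y}_i - \bar{Y})
          = 2 b_1 \sum e_i (X_i - \bar{X})
          = 2 b_1 \Big( \sum e_i X_i - \bar{X} \sum e_i \Big) = 0.
\end{align*}
```

The first line uses the fact that a least squares line passes through the point of means $(\bar{X}, \bar{Y})$. The last step uses the two conditions that define the least squares fit, $\sum e_i = 0$ and $\sum e_i X_i = 0$.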

Variation (slide 3)
So we have $SST = \sum (Y_i - \bar{Y})^2$, $SSR = \sum (\hat{Y}_i - \bar{Y})^2$ and $SSE = \sum (Y_i - \hat{Y}_i)^2$. On the next slide I have a graph of the data with the regression line put in and a line showing the mean of Y. For each point we could look at how far the point is from the mean line. This is what SST is looking at. SSR indicates that, of the total difference between the point and the mean, the regression line is able to account for some of that variation. The rest of the difference is SSE.
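The three sums of squares can be checked numerically. The sketch below is illustrative only: the five (x, y) pairs are made-up numbers, not the apparel data from the text, and the helper name `sums_of_squares` is hypothetical. It fits the least squares line with the usual formulas and then computes SST, SSR and SSE.

```python
def sums_of_squares(x, y):
    """Fit a least squares line and return (SST, SSR, SSE)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Least squares slope and intercept.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    # Predicted value (Y-hat) for each observation.
    y_hat = [b0 + b1 * xi for xi in x]

    sst = sum((yi - y_bar) ** 2 for yi in y)               # point vs. mean line
    ssr = sum((fi - y_bar) ** 2 for fi in y_hat)           # regression line vs. mean line
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # point vs. regression line
    return sst, ssr, sse


# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

sst, ssr, sse = sums_of_squares(x, y)
print(sst, ssr, sse)
print(abs(sst - (ssr + sse)) < 1e-9)  # SST = SSR + SSE, up to rounding
```

For this made-up data SST is 6, of which about 3.6 is accounted for by the regression line and about 2.4 is left over, so the identity holds up to floating point rounding.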

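Variation (slide 4)
[Figure: the data plotted with Y on the vertical axis and X on the horizontal axis, showing the least squares regression line (Ŷi) and a horizontal line at Ybar, with labels marking two examples of what goes into SSR and two examples of what goes into SSE.]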

The Coefficient of Determination (slide 5)
The coefficient of determination, often denoted $r^2$, measures the proportion of the variation in Y that is explained by the independent variable X in the regression model: $r^2 = SSR/SST$. In our example from the text about the apparel company we have $r^2 = SSR/SST = 0.9042$. This means that 90.42 percent of the variation in sales is explained by the variability in the store square footage. That leaves only 9.58% of the variability in sales due to other factors.

Coefficient of Determination (slide 6)
Say we didn't have an X variable to help us predict the Y variable. Then a reasonable way to predict Y would be to just use its average or mean value. With a regression, by using an X variable, it is thought we can do better than just using the mean of Y as a predictor. In a simple linear regression $r^2$ is an indicator of the strength of the relationship between two variables: compared with predicting sales by just using the mean of sales, using the regression model reduces the variability in the predictions by the percentage obtained. In different areas of study (like marketing, management, and so on) the idea of what a good $r^2$ is varies. As a rule of thumb, though, if $r^2$ is 0.8 or above you have a strong relationship.
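The "do better than the mean" idea can be sketched in a few lines. This is a minimal illustration under stated assumptions: the data are invented, and `r_squared` is a hypothetical helper name. It compares the squared prediction error from using the mean of Y (which is SST) with the squared error from using the regression line (which is SSE), and reports the proportional reduction, which equals SSR/SST.

```python
def r_squared(x, y):
    """Proportional reduction in squared prediction error: (SST - SSE) / SST."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Least squares slope and intercept.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    # Error when every prediction is just the mean of Y.
    sst = sum((yi - y_bar) ** 2 for yi in y)
    # Error when predictions come from the regression line.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

    return (sst - sse) / sst  # same number as SSR / SST


# Invented data with a nearly straight-line pattern.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [3.1, 4.9, 7.2, 8.8, 11.1, 12.9]

r2 = r_squared(x, y)
print(f"r squared = {r2:.4f}")
print("strong by the 0.8 rule of thumb" if r2 >= 0.8 else "below the 0.8 rule of thumb")
```

Writing $r^2$ as (SST - SSE)/SST rather than SSR/SST is only a design choice for this sketch; it makes the comparison against the mean-only prediction explicit.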

t Test for slope (slide 7)
Hypothesis test about the population slope $\beta_1$. Remember we have taken a sample of data. In this context we have taken a sample and estimated the unknown population regression. Our real point in a study like this is to see if a relationship exists between the two variables in the population. If the slope is not zero in the population, then the X variable has an influence on the outcome of Y. Now, in a sample, the estimated slope may or may not be zero. But the sample provides a basis for a test of the true unknown population slope being zero. For the test we will use the t distribution. Admittedly, degrees of freedom is a term without much meaning to you, but in the context of simple regression it equals the sample size minus 2.

t Test for slope (slide 8)
Back to our hypothesis test about the slope. The null hypothesis is that $\beta_1 = 0$, and the alternative is that $\beta_1$ is not equal to zero. Since the alternative is "not equal to zero" we have a two-tailed test. If we have 14 data points (pairs of values in regression), then df = 12. If we want alpha = .05 we divide that in half because of the two-tailed test, and our critical values are -2.1788 and 2.1788. If the sample-based statistic, tstat, is between the two critical values we cannot reject the null, and we would conclude the data do not show a relationship between X and Y. If the tstat is outside the critical values we reject the null, go with the alternative, and say the data support that a relationship exists between the variables.
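The decision rule can be sketched as code. Everything specific here is an assumption made for illustration: the 14 data pairs are invented, `slope_t_stat` is a hypothetical helper name, the critical value 2.1788 is the standard t table entry for df = 12 with .025 in each tail, and the standard error of the slope uses the usual textbook formula (square root of SSE/(n - 2), divided by the square root of the sum of squared X deviations), which the slides do not spell out.

```python
import math

# Two-tailed critical value for alpha = .05 and df = 12, from a t table.
T_CRIT_DF12 = 2.1788


def slope_t_stat(x, y):
    """Return (tstat, df) for the test of H0: population slope = 0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Least squares slope and intercept.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    # Standard error of the slope estimate.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    df = n - 2
    se_b1 = math.sqrt(sse / df) / math.sqrt(sxx)

    return b1 / se_b1, df


# Invented sample of 14 pairs, so df = 14 - 2 = 12.
x = list(range(1, 15))
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.3, 13.8, 16.1, 18.2, 19.7, 22.1, 24.0, 25.9, 28.2]

t_stat, df = slope_t_stat(x, y)
reject = abs(t_stat) > T_CRIT_DF12  # outside the two critical values?

print(f"tstat = {t_stat:.2f}, df = {df}")
print("reject H0" if reject else "cannot reject H0")
```

The hard-coded critical value only applies when df = 12; a different sample size would need a different table entry.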

t Test for the Slope (slide 9)
In a class such as ours the point is usually not to do a lot of calculations in regression, but to interpret results. On page 579 we see an Excel printout for the apparel company. Note that cell D18 holds the calculated tstat for the problem. Since it is outside the critical values we reject Ho and go with the alternative. The p-value approach is that if the p-value < alpha we reject the null. The p-value printed in Excel in this area is a two-tail p-value. Since we have a p-value of essentially 0 we can reject the null.

Confidence Interval For the Slope (slide 10)
You may recall from previous work that when we get a point estimate we often want to build an interval around the point estimate because we know about sampling variability. Excel also gives the confidence interval for the slope estimate. On page 579, for our apparel example, we see in cells F18 and G18 the lower and upper interval values. We have the interval (1.3280, ). This interval has the interpretation that we are 95% confident that the UNKNOWN POPULATION SLOPE is somewhere in this interval.
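Excel reports the interval directly, but the arithmetic behind it is short: the slope estimate plus or minus a t critical value times the standard error of the slope. The sketch below uses invented inputs (a slope of 2.0 and a standard error of 0.25, not the apparel values), a hypothetical function name, and the t table value 2.1788 for 95% confidence with df = 12.

```python
def slope_confidence_interval(b1, se_b1, t_crit):
    """Return (lower, upper) limits: b1 plus or minus t_crit * se_b1."""
    margin = t_crit * se_b1
    return b1 - margin, b1 + margin


# Invented inputs, for illustration only.
lower, upper = slope_confidence_interval(b1=2.0, se_b1=0.25, t_crit=2.1788)

print(f"95% interval for the slope: ({lower:.4f}, {upper:.4f})")
print("interval excludes 0" if lower > 0 or upper < 0 else "interval includes 0")
```

When a 95% interval built this way excludes zero, the two-tailed t test at alpha = .05 on the same data also rejects the null of a zero slope, so the two approaches agree.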

(slide 11)
This is to be used on an assignment. Note in cell E57 the notation E-11. The E-11 means move the decimal 11 places to the left. So the p-value is a very small number, essentially 0. If the E has a plus after it, you move the decimal to the right.
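The E notation can be expanded mechanically. In this small sketch the strings "2.5E-11" and "3.1E+4" are invented examples (not the value from the assignment output), and `expand` is a hypothetical helper name. It uses Python's standard `decimal` module to write the number out in ordinary decimal form.

```python
from decimal import Decimal


def expand(e_notation):
    """Write a number given in E notation as an ordinary decimal string."""
    return format(Decimal(e_notation), "f")


# E-11: move the decimal 11 places to the left.
small = expand("2.5E-11")
# E+4: move the decimal 4 places to the right.
large = expand("3.1E+4")

print(small)
print(large)
```

A value like the first one is far below any common alpha such as .05, which is why a p-value shown with E-11 is treated as essentially zero.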
