Calculating the Least Squares Regression Line


Calculating the Least Squares Regression Line Lecture 45 Sec. 13.3.2 Fri, Nov 30, 2007

The Least Squares Regression Line The equation of the regression line is of the form ŷ = a + bx. We need to find the coefficients a and b from the data.

The Least Squares Regression Line The formula for b is

    b = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)²        (Formula 1)

or, equivalently,

    b = (Σxy − (Σx)(Σy)/n) / (Σx² − (Σx)²/n)        (Formula 2)

The formula for a is

    a = ȳ − b·x̄
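Both formulas translate directly into code. Here is a minimal Python sketch (the helper name regression_line is ours, not the lecture's) implementing Formula 1, demonstrated on the data set worked through in the next slides:

```python
def regression_line(xs, ys):
    """Least squares intercept a and slope b via Formula 1 (deviations)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # b = sum of deviation products over sum of squared x-deviations
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar          # a = ȳ − b·x̄
    return a, b

xs = [1, 3, 4, 5, 8, 9, 11, 15]
ys = [8, 12, 9, 14, 16, 20, 17, 24]
print(regression_line(xs, ys))     # ≈ (7.3, 1.1), i.e. ŷ = 7.3 + 1.1x
```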

Formula 1 Consider again the data set

     x:   1   3   4   5   8   9  11  15
     y:   8  12   9  14  16  20  17  24

Formula 1 Compute the x and y deviations (here x̄ = 56/8 = 7 and ȳ = 120/8 = 15).

     x    y   x − x̄   y − ȳ
     1    8     −6      −7
     3   12     −4      −3
     4    9     −3      −6
     5   14     −2      −1
     8   16      1       1
     9   20      2       5
    11   17      4       2
    15   24      8       9

Formula 1 Compute the squared deviations and the products.

     x    y   x − x̄   y − ȳ   (x − x̄)²   (y − ȳ)²   (x − x̄)(y − ȳ)
     1    8     −6      −7        36         49             42
     3   12     −4      −3        16          9             12
     4    9     −3      −6         9         36             18
     5   14     −2      −1         4          1              2
     8   16      1       1         1          1              1
     9   20      2       5         4         25             10
    11   17      4       2        16          4              8
    15   24      8       9        64         81             72

Formula 1 Find the sums.

     x    y   x − x̄   y − ȳ   (x − x̄)²   (y − ȳ)²   (x − x̄)(y − ȳ)
     1    8     −6      −7        36         49             42
     3   12     −4      −3        16          9             12
     4    9     −3      −6         9         36             18
     5   14     −2      −1         4          1              2
     8   16      1       1         1          1              1
     9   20      2       5         4         25             10
    11   17      4       2        16          4              8
    15   24      8       9        64         81             72
    Sum:                         150        206            165

Formula 1 Compute b:

    b = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² = 165/150 = 1.1

Then compute a:

    a = ȳ − b·x̄ = 15 − (1.1)(7) = 7.3

The equation is ŷ = 7.3 + 1.1x.

Formula 2 Consider yet again the data set

     x:   1   3   4   5   8   9  11  15
     y:   8  12   9  14  16  20  17  24

Formula 2 Square x and y and find xy.

     x    y    x²    y²    xy
     1    8     1    64     8
     3   12     9   144    36
     4    9    16    81    36
     5   14    25   196    70
     8   16    64   256   128
     9   20    81   400   180
    11   17   121   289   187
    15   24   225   576   360

Formula 2 Add up the columns.

     x    y    x²    y²    xy
     1    8     1    64     8
     3   12     9   144    36
     4    9    16    81    36
     5   14    25   196    70
     8   16    64   256   128
     9   20    81   400   180
    11   17   121   289   187
    15   24   225   576   360
    56  120   542  2006  1005

Formula 2 Compute b:

    b = (Σxy − (Σx)(Σy)/n) / (Σx² − (Σx)²/n)
      = (1005 − (56)(120)/8) / (542 − 56²/8)
      = 165/150 = 1.1

Then compute a as before:

    a = ȳ − b·x̄ = 15 − (1.1)(7) = 7.3

The equation is ŷ = 7.3 + 1.1x.

Example The second formula is usually easier (really) if you are doing the computation by hand. By either formula, we get the equation ŷ = 7.3 + 1.1x.

TI-83 – Regression Line On the TI-83, we could use 2-Var Stats to get the basic summations. Then use Formula 2 for b. Try it: Enter 2-Var Stats L1, L2.

TI-83 – Regression Line 2-Var Stats L1, L2 reports that Σx = 56, Σx² = 542, Σy = 120, Σy² = 2006, and Σxy = 1005. Then use the formulas.
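The same Formula 2 arithmetic works from just these five sums; a quick Python sketch for anyone following along without the calculator:

```python
n = 8
sx, sy, sxx, sxy = 56, 120, 542, 1005    # Σx, Σy, Σx², Σxy from 2-Var Stats

b = (sxy - sx * sy / n) / (sxx - sx ** 2 / n)   # Formula 2: 165/150 = 1.1
a = sy / n - b * sx / n                         # a = ȳ − b·x̄ = 7.3
print(a, b)
```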

TI-83 – Regression Line Or we can use the LinReg function (#8). Put the x values in L1 and the y values in L2. Select STAT > CALC > LinReg(a+bx). Press Enter. LinReg(a+bx) appears in the display. Enter L1, L2. Press Enter.

TI-83 – Regression Line The following appear in the display: the title LinReg, the equation y = a + bx, the value of a, the value of b, the value of r² (to be discussed later), and the value of r (to be discussed later).

TI-83 – Regression Line To graph the regression line along with the scatterplot: Put the x values in L1 and the y values in L2. Select STAT > CALC > LinReg(a+bx). Press Enter. LinReg(a+bx) appears in the display. Enter L1, L2, Y1. Press Enter.

TI-83 – Regression Line Press Y= to see the equation. Press ZOOM > ZoomStat to see the graph.

Free Lunch Participation vs. Graduation Rate Find the equation of the regression line for the school-district data on the free-lunch participation rate vs. the graduation rate. Let x be the free-lunch participation. Let y be the graduation rate.

Free Lunch Participation vs. Graduation Rate

    District          Free Lunch (%)   Grad. Rate (%)
    Amelia                 41.2             68.9
    King and Queen         59.9             64.1
    Caroline               40.2             62.9
    King William           27.9             67.0
    Charles City           45.8             67.7
    Louisa                 44.9             80.1
    Chesterfield           22.5             80.5
    New Kent               13.9             77.0
    Colonial Hgts          25.7             73.0
    Petersburg             61.6             54.6
    Cumberland             55.3             63.9
    Powhatan               12.2             89.3
    Dinwiddie              45.2             71.4
    Prince George          30.9             85.0
    Goochland              23.3             76.3
    Richmond               74.0             46.9
    Hanover                13.7             90.1
    Sussex                 74.8             59.0
    Henrico                30.2             81.1
    West Point             19.1             82.0
    Hopewell               63.1             63.4

Free Lunch Participation vs. Graduation Rate The regression equation is ŷ = 91.047 − 0.494x.
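As a cross-check on the calculator output, here is the same fit in Python, a sketch using the standard library's statistics.linear_regression (Python 3.10 or later), with the district data keyed in from the table above:

```python
from statistics import linear_regression

# (free-lunch rate, graduation rate) for the 21 districts
data = [
    (41.2, 68.9), (59.9, 64.1), (40.2, 62.9), (27.9, 67.0), (45.8, 67.7),
    (44.9, 80.1), (22.5, 80.5), (13.9, 77.0), (25.7, 73.0), (61.6, 54.6),
    (55.3, 63.9), (12.2, 89.3), (45.2, 71.4), (30.9, 85.0), (23.3, 76.3),
    (74.0, 46.9), (13.7, 90.1), (74.8, 59.0), (30.2, 81.1), (19.1, 82.0),
    (63.1, 63.4),
]
xs, ys = zip(*data)
slope, intercept = linear_regression(xs, ys)
print(intercept, slope)    # ≈ 91.05 and −0.494
```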

Scatter Plot [Figure: scatter plot of Graduation Rate (vertical axis) vs. Free Lunch Rate (horizontal axis).]

Scatter Plot with Regression Line [Figure: the same scatter plot with the regression line ŷ = 91.047 − 0.494x overlaid.]

Predicting y What graduation rate would we predict in a district if we knew that the free-lunch participation rate was 50%?
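Plugging x = 50 into the regression equation gives the answer read off the next two slides:

    ŷ = 91.047 − 0.494(50) = 91.047 − 24.7 ≈ 66.3,

a predicted graduation rate of about 66.3%.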

Scatter Plot with Regression Line [Figure: the scatter plot and regression line, with the predicted point at x = 50 marked.]

Scatter Plot with Regression Line [Figure: the same plot; reading from x = 50 up to the line and across gives a predicted graduation rate of 66.3.]

Variation in the Model There is a very simple relationship between the variation in the observed y values and the variation in the predicted y values.

Observed y and Predicted y [Figure: the scatter plot with both the observed y values and the corresponding predicted y values on the regression line marked.]

SST = Variation in the Observed y [Figure: the observed y values and their deviations from ȳ; SST measures this spread.]

SSR = Variation in the Predicted y [Figure: the predicted y values on the line and their deviations from ȳ; SSR measures this spread.]

Variation in Observed y The variation in the observed y is measured by SST = Σ(y − ȳ)² (the same quantity as SSY). For the graduation rate data (in L2), SST = 2598.18.

Variation in Predicted y The variation in the predicted y is measured by SSR = Σ(ŷ − ȳ)². For the predicted graduation rate data, let L3 = Y1(L1). Then SSR = 1896.67.

SSE = Residual Sum of Squares It turns out that SST = SSE + SSR. That is,

    Σ(y − ȳ)² = Σ(y − ŷ)² + Σ(ŷ − ȳ)²
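A short Python sketch verifying this identity, and the two sums quoted above, on the school-district data (re-keyed here so the block runs on its own):

```python
from statistics import linear_regression

# x = free-lunch rate, y = graduation rate for the 21 districts
xs = [41.2, 59.9, 40.2, 27.9, 45.8, 44.9, 22.5, 13.9, 25.7, 61.6, 55.3,
      12.2, 45.2, 30.9, 23.3, 74.0, 13.7, 74.8, 30.2, 19.1, 63.1]
ys = [68.9, 64.1, 62.9, 67.0, 67.7, 80.1, 80.5, 77.0, 73.0, 54.6, 63.9,
      89.3, 71.4, 85.0, 76.3, 46.9, 90.1, 59.0, 81.1, 82.0, 63.4]

slope, intercept = linear_regression(xs, ys)    # Python 3.10+
y_bar = sum(ys) / len(ys)
y_hat = [intercept + slope * x for x in xs]     # predicted y values

sst = sum((y - y_bar) ** 2 for y in ys)                  # ≈ 2598.18
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)             # ≈ 1896.67
sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))     # ≈ 701.5
print(round(sst, 2), round(ssr + sse, 2))   # the two agree: SST = SSR + SSE
```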

Sum of Squared Errors In the example, SST − SSR = 2598.18 − 1896.67 = 701.51. If we compute the sum of the squared residuals directly, we get SSE = 701.52 (the small discrepancy is rounding).

Explaining the Variability In the equation SST = SSE + SSR, SSR is the amount of variability in y that is explained by the model. SSE is the amount of variability in y that is not explained by the model.

Explaining the Variability In the last example, how much variability in graduation rate is explained by the model (by free-lunch participation)?
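Working this out from the sums already computed: SSR/SST = 1896.67/2598.18 ≈ 0.73, so free-lunch participation explains about 73% of the variability in graduation rate. This ratio is the squared correlation r² reported on the calculator's LinReg display.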