Modeling a Linear Relationship


Modeling a Linear Relationship Lecture 47 Secs. 13.1 – 13.3.1 Tue, Dec 6, 2005

Bivariate Data Data is called bivariate if each observation consists of a pair of values (x, y). x is the explanatory variable. y is the response variable. x is also called the independent variable. y is also called the dependent variable.

Scatterplots Scatterplot – A display in which each observation (x, y) is plotted as a point in the xy plane.

Example Draw a scatterplot of the percent on-time arrivals vs. percent on-time departures for the 22 airports listed in Exercise 4.29, p. 252, and also in Exercise 13.5, p 822. OnTimeArrivals.xls. Does there appear to be a relationship? How can we tell? How would we describe that relationship?

Linear Association Draw an oval around the data set. If the set of points forms a tilted oval, then there is some linear association. If the oval is tilted upwards from left to right, then there is positive association. If the oval is tilted downwards from left to right, then there is negative association. If the oval is not tilted at all, then there is no association.

Positive Linear Association [scatterplot: point cloud tilted upwards from left to right]

Negative Linear Association [scatterplot: point cloud tilted downwards from left to right]

No Linear Association [scatterplot: point cloud with no tilt]

Practice Draw a scatterplot of the data in Example 13.2, p. 816. How should we label the x-axis? How should we label the y-axis?

Example Is there a linear association? Is it positive or negative?

Strong vs. Weak Association The association is strong if the oval is narrow. The association is weak if the oval is wide.

Strong Positive Linear Association [scatterplot: narrow upward-tilted oval]

Weak Positive Linear Association [scatterplot: wide upward-tilted oval]

Example In Example 13.2, is the linear association strong or weak? In Exercise 13.5, how should we describe the association?

TI-83 - Scatterplots To set up a scatterplot, Enter the x values in L1. Enter the y values in L2. Press 2nd STAT PLOT. Select Plot1 and press ENTER.

TI-83 - Scatterplots The Stat Plot display appears. Select On and press ENTER. Under Type, select the first icon (a small image of a scatterplot) and press ENTER. For XList, enter L1. For YList, enter L2. For Mark, select the one you want and press ENTER.

TI-83 - Scatterplots To draw the scatterplot, Press ZOOM. The Zoom menu appears. Select ZoomStat (#9) and press ENTER. The scatterplot appears. Press TRACE and use the arrow keys to inspect the individual points.

Simple Linear Regression To quantify the linear relationship between x and y, we wish to find the equation of the line that “best” fits the data. Typically, there will be many lines that all look pretty good. How do we measure how well a line fits the data?

Measuring the Goodness of Fit Start with the scatterplot.

Measuring the Goodness of Fit Draw a line through the scatterplot.

Measuring the Goodness of Fit Measure the vertical distances from every point to the line.

Measuring the Goodness of Fit Each of these distances represents a deviation, called a residual e, from the line.

Residuals The ith residual – The difference between the observed value of yi and the predicted value of yi. Use yi^ for the predicted yi. The formula for the ith residual is ei = yi – yi^. Notice that the residual is positive if the data point is above the line and negative if the data point is below the line.
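In code, the sign convention is easy to check; here is a minimal Python sketch using two of the points from the example that follows:

```python
# Residual: observed y minus predicted y^ (y-hat).
def residual(y_observed, y_predicted):
    return y_observed - y_predicted

# A point above the line has a positive residual;
# a point below the line has a negative residual.
print(residual(12, 11))  # observed above the line -> 1
print(residual(16, 17))  # observed below the line -> -1
```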

Measuring the Goodness of Fit Find the sum of the squared residuals.

Measuring the Goodness of Fit The smaller the sum of squared residuals, the better the fit.

Example Consider the data points

x    y
2    3
3    5
5    9
6   12
9   16

Example [scatterplot of the five data points]

Least Squares Line Let’s see how good the fit is for the line y^ = -1 + 2x, where y^ represents the predicted value of y, not the observed value.

Sum of Squared Residuals Begin with the data set.

x    y
2    3
3    5
5    9
6   12
9   16

Sum of Squared Residuals Compute the predicted y, using y^ = -1 + 2x.

x    y   y^
2    3    3
3    5    5
5    9    9
6   12   11
9   16   17

Sum of Squared Residuals Compute the residuals, y – y^.

x    y   y^   y – y^
2    3    3    0
3    5    5    0
5    9    9    0
6   12   11    1
9   16   17   -1

Sum of Squared Residuals Compute the squared residuals.

x    y   y^   y – y^   (y – y^)²
2    3    3    0        0
3    5    5    0        0
5    9    9    0        0
6   12   11    1        1
9   16   17   -1        1

Sum of Squared Residuals Compute the sum of the squared residuals.

x    y   y^   y – y^   (y – y^)²
2    3    3    0        0
3    5    5    0        0
5    9    9    0        0
6   12   11    1        1
9   16   17   -1        1

Σ(y – y^)² = 2.00

Sum of Squared Residuals Now let’s see how good the fit is for the line y^ = -0.5 + 1.9x.

Sum of Squared Residuals Begin with the data set.

x    y
2    3
3    5
5    9
6   12
9   16

Sum of Squared Residuals Compute the predicted y, using y^ = -0.5 + 1.9x.

x    y    y^
2    3    3.3
3    5    5.2
5    9    9.0
6   12   10.9
9   16   16.6

Sum of Squared Residuals Compute the residuals, y – y^.

x    y    y^     y – y^
2    3    3.3    -0.3
3    5    5.2    -0.2
5    9    9.0     0.0
6   12   10.9     1.1
9   16   16.6    -0.6

Sum of Squared Residuals Compute the squared residuals.

x    y    y^     y – y^   (y – y^)²
2    3    3.3    -0.3      0.09
3    5    5.2    -0.2      0.04
5    9    9.0     0.0      0.00
6   12   10.9     1.1      1.21
9   16   16.6    -0.6      0.36

Sum of Squared Residuals Compute the sum of the squared residuals.

x    y    y^     y – y^   (y – y^)²
2    3    3.3    -0.3      0.09
3    5    5.2    -0.2      0.04
5    9    9.0     0.0      0.00
6   12   10.9     1.1      1.21
9   16   16.6    -0.6      0.36

Σ(y – y^)² = 1.70

Sum of Squared Residuals We conclude that y^ = -0.5 + 1.9x is a better fit than y^ = -1 + 2x.
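This comparison is easy to reproduce in code. Here is a short Python sketch using the data set from the example (the function name is illustrative):

```python
# Data set from the example.
xs = [2, 3, 5, 6, 9]
ys = [3, 5, 9, 12, 16]

def sum_squared_residuals(a, b):
    """Sum of (y - y^)^2 for the candidate line y^ = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

ssr_line1 = sum_squared_residuals(-1.0, 2.0)   # y^ = -1 + 2x
ssr_line2 = sum_squared_residuals(-0.5, 1.9)   # y^ = -0.5 + 1.9x
print(ssr_line1)             # 2.0
print(round(ssr_line2, 2))   # 1.7
```

The smaller sum of squared residuals (1.70 vs. 2.00) confirms that the second line fits better.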

Sum of Squared Residuals y^ = -1 + 2x [scatterplot of the data with this line]

Sum of Squared Residuals y^ = -0.5 + 1.9x [scatterplot of the data with this line]

Least Squares Line Least squares line – The line for which the sum of the squared residuals (the vertical distances) is as small as possible. The least squares line is also called the regression line or the line of best fit.

Example For all the lines that one could draw through this data set, it turns out that 1.70 is the smallest possible value for the sum of the squares of the residuals.

x    y
2    3
3    5
5    9
6   12
9   16

Example Therefore, y^ = -0.5 + 1.9x is the regression line for this data set.
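The coefficients themselves come from the standard least-squares formulas b = Sxy / Sxx and a = ȳ – b·x̄. A Python sketch applying them to this data set recovers the same line:

```python
xs = [2, 3, 5, 6, 9]
ys = [3, 5, 9, 12, 16]

n = len(xs)
x_bar = sum(xs) / n   # mean of x = 5.0
y_bar = sum(ys) / n   # mean of y = 9.0

# S_xy = sum of (x - x_bar)(y - y_bar); S_xx = sum of (x - x_bar)^2
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))  # 57.0
s_xx = sum((x - x_bar) ** 2 for x in xs)                       # 30.0

b = s_xy / s_xx        # slope: 57 / 30 = 1.9
a = y_bar - b * x_bar  # intercept: 9 - 1.9 * 5 = -0.5
print(a, b)
```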

Regression Line We will write the regression line as y^ = a + bx. a is the y-intercept. b is the slope. This is the usual slope-intercept form y = mx + b with the two terms rearranged.

TI-83 – Computing Residuals It is not hard to compute the residuals and the sum of their squares on the TI-83. Later, we will see a faster method. Enter the x-values in list L1 and the y-values in list L2. Compute a + b*L1 and store in list L3 (y^ values). Compute (L2 – L3)². This is a list of the squared residuals. Compute sum(Ans). This is the sum of the squared residuals.

TI-83 – Computing Residuals Enter the data set and use the equation y^ = -0.5 + 1.9x to compute the sum of squared residuals.

x    y
2    3
3    5
5    9
6   12
9   16

Prediction Use the regression line to predict y when x = 4, x = 7, and x = 20. Interpolation – Using an x value within the observed extremes of x values to predict y. Extrapolation – Using an x value beyond the observed extremes of x values to predict y.
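A quick Python sketch of these predictions with the regression line from the example (values rounded for display):

```python
def predict(x):
    """Predicted y from the regression line y^ = -0.5 + 1.9x."""
    return -0.5 + 1.9 * x

# x = 4 and x = 7 lie within the observed x range (2 to 9): interpolation.
# x = 20 lies far beyond that range: extrapolation.
for x in (4, 7, 20):
    print(x, round(predict(x), 1))
```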

Interpolation vs. Extrapolation Interpolated values are more reliable than extrapolated values. The farther out the values are extrapolated, the less reliable they are.