R-Squared Explained The Coefficient Of Determination.


What is r²? The coefficient of determination (r-sq, r²). Mathematically, it is the r-value squared (r² = r * r).

The LSRL (least-squares regression line, or y-hat) is just one way to model data. Remember… since our goal is to predict the y-values with the least amount of error, the LSRL is the line that makes the sum of the squared vertical (y) distances from the points to the line as small as possible.
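To make the "least amount of error" idea concrete, here is a minimal Python sketch. It borrows the y-values 2, 4, 6, 8, 15 that appear later in these slides; the x-values 1 through 5 are an assumption made only for illustration, and the function names are hypothetical. It fits the LSRL, then nudges the intercept and slope to show that every nudge makes the total squared error larger.

```python
def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # intercept: the line passes through (x-bar, y-bar)
    return a, b


def sum_squared_error(xs, ys, a, b):
    """Total of the squared vertical distances from each point to the line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


xs = [1, 2, 3, 4, 5]    # assumed x-values (not given in the slides)
ys = [2, 4, 6, 8, 15]   # y-values taken from the slides

a, b = fit_lsrl(xs, ys)
best = sum_squared_error(xs, ys, a, b)
print(f"y-hat = {a:.2f} + {b:.2f}x, total squared error = {best:.2f}")

# Nudge the intercept or the slope a little: the error always goes up.
nudged_errors = []
for da, db in [(0.5, 0.0), (-0.5, 0.0), (0.0, 0.2), (0.0, -0.2)]:
    err = sum_squared_error(xs, ys, a + da, b + db)
    nudged_errors.append(err)
    print(f"a{da:+.1f}, b{db:+.1f}: total squared error = {err:.2f}")
```

With these assumed x-values the fitted line is y-hat = -2 + 3x with a total squared error of 10, and each nudged line does worse.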

What if we didn’t have the LSRL?

Because a modeling equation is used to predict y-values, without fancy tools (like a calculator) we might do something simple: predict that every y-value will be the average (mean) y.

Would a horizontal line be a good fit here? Why? Why not?

Not a good fit! The error (difference between “best fit” line and data) is shown by the red lines.

The y-values are not all the same. (Can you see that the y-values are 2, 4, 6, 8, 15? The y-values vary!) We need a line that slants!!

Look at how small the total error is now. That is, the y-hat equation gets closer to all y-values of the data.

In a way, r-squared tells us how much BETTER the y-hat equation is at predicting y-values than the mean (y-bar) is.

Another way to think about it: If we used the mean to make predictions, the variation in y would not be taken into account (all predicted y-values would be the same). R-squared tells us what percent of the variation (differences) in y-values is explained by using a slanted line (y-hat) to make predictions instead (so that we get variety in predicted y-values).
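The comparison between the mean line and the slanted line can be sketched in Python. This is only an illustration: it uses the y-values 2, 4, 6, 8, 15 from the earlier slide, but the x-values 1 through 5 are assumed, since the slides do not list them. It computes the squared error when every prediction is y-bar, the squared error when predictions come from y-hat, and the fraction of the original error that the slanted line removes. That fraction matches r * r.

```python
import math

xs = [1, 2, 3, 4, 5]    # assumed x-values (not given in the slides)
ys = [2, 4, 6, 8, 15]   # y-values taken from the slides

n = len(ys)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

# Least-squares line y-hat = a + b*x
b = sxy / sxx
a = y_bar - b * x_bar

# Squared error if we predict the mean for every point (horizontal line)
error_using_mean = syy
# Squared error if we predict with the slanted line instead
error_using_line = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Fraction of the variation in y that the slanted line accounts for
r_squared = 1 - error_using_line / error_using_mean

# Correlation coefficient, for comparison with r * r
r = sxy / math.sqrt(sxx * syy)

print(f"error using y-bar: {error_using_mean:.2f}")   # 100.00
print(f"error using y-hat: {error_using_line:.2f}")   # 10.00
print(f"r-squared:         {r_squared:.4f}")          # 0.9000
print(f"r * r:             {r * r:.4f}")              # 0.9000
```

Under these assumed x-values, the slanted line removes 90% of the squared error left by the mean line, so about 90% of the variation in y is explained by the linear relationship.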

R-squared is… _____ (write r-sq as a %) percent of the variation in the _______ (y-variable name) that is due to (or explained by) the linear relationship between _______ (x-variable name) and ____ (y-variable name).

Example: From the other day, we found the r-value for coffee price and deforestation (r = [value missing]). Squaring it gives r² = .9123. So… 91.23% of the variation in deforestation percent can be explained by the linear relationship between coffee price and deforestation percent. So, our y-hat equation is a very good fit and predictor.

r² can be any number from 0 to 1. When r² is close to 0 … the y-hat equation is not good for making predictions. When r² is close to 1 … the y-hat equation makes reliable predictions.

EXAMPLE Using the data for height and hand-span… (Let’s say for this example that I am using the data to try to predict heights (y) using hand span (x).)
– For my Room 12 Stats class, the r-value is .7927
– For my Room 17 Stats class, the r-value is .4859
Determine r² for each class and interpret each. (Interpretation sentence starter: “___ % of the variation…”) Should I use the corresponding y-hat equations to predict for each class?

EXAMPLE ANSWERS
– For my Room 12 Stats class, the r-value is .7927, so r² = .6284. So 62.84% of the variation in heights can be explained by the linear relationship between hand span and height.
– For my Room 17 Stats class, the r-value is .4859, so r² = .2361. So 23.61% of the variation in heights can be explained by the linear relationship between hand span and height.
Should I use the corresponding y-hat equations to predict for each class? We definitely should not use the Room 17 y-hat equation to predict (it is a very bad fit). We might use the Room 12 y-hat equation to predict, but honestly, I wouldn’t. 62% is still too low for good predictions.
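The arithmetic in these answers can be checked with a short Python sketch. The helper name `interpret_r_squared` is hypothetical and exists only for this illustration; it squares an r-value and fills in the interpretation sentence from the slides.

```python
def interpret_r_squared(r, x_name, y_name):
    """Square r and fill in the interpretation sentence from the slides."""
    r_sq = r ** 2
    return (
        f"{r_sq * 100:.2f}% of the variation in {y_name} can be explained "
        f"by the linear relationship between {x_name} and {y_name}."
    )


# r-values given in the slides for the two classes
room_12_r = 0.7927
room_17_r = 0.4859

print(round(room_12_r ** 2, 4))  # 0.6284
print(round(room_17_r ** 2, 4))  # 0.2361

print(interpret_r_squared(room_12_r, "hand span", "height"))
print(interpret_r_squared(room_17_r, "hand span", "height"))
```

Whether a given percentage is "good enough" to predict with stays a judgment call, as the answer above notes; the code only confirms the squared values.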