Linear Regression.

Essentials: Regression (Predictions based upon the known.) Understand what the regression process does: prediction. Be able to state the steps we use leading up to the decision to conduct regression. Be able to calculate the slope of a line and the y-intercept. Be able to calculate a regression equation and apply it to the prediction of other values. Know that these are estimates, not necessarily the actual values that might occur. Know what the Least-Squares Property and the Line of Best Fit are. Residual – what’s that?

Cartesian Coordinate System We’ve seen the positive section of this. It also has a negative section. [Figure: x-y coordinate axes showing both positive and negative values]

Points (x, y) are plotted in the negative section in the same way we plot points in the positive section. Here we see the points (-2, 2) and (-1, 4). [Figure: x-y coordinate axes with the points (-2, 2) and (-1, 4) marked]

A Linear Equation in One Independent Variable y = mx + b A linear equation will always have a straight line as its graph. x and y are variables; m and b are fixed values. Linear equations are often used to express the relationship between two variables. x is the independent variable (also known as the predictor variable). y is the dependent variable (also called the response variable); its value depends on the value of x. m is the slope of the line. The slope indicates how much the y-value increases (or decreases, if the slope is negative) when the x-value increases by 1 unit. When m is positive, the line slopes upward; when m is negative, it slopes downward. b is the y-intercept (the point at which the line intersects the y-axis); it is the value of y when x = 0.

y = mx + b Example: y = 2x + 1

 x     y
 0     1
 1     3
 2     5
-1    -1
-2    -3
-3    -5

[Figure: graph of y = 2x + 1 on x-y axes]
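
The (x, y) pairs for y = 2x + 1 can be reproduced with a few lines of Python. This is only a minimal sketch; the function name `line` is an illustrative choice and does not come from the slides.

```python
def line(x, m=2, b=1):
    """Evaluate y = m*x + b (the defaults give the slide's example y = 2x + 1)."""
    return m * x + b

# x-values used in the slide's table
xs = [0, 1, 2, -1, -2, -3]
table = [(x, line(x)) for x in xs]

for x, y in table:
    print(f"x = {x:>2}   y = {y:>2}")
```

Changing `m` shows the slope at work: each 1-unit increase in x changes y by exactly m.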

Regression Definitions Regression Equation: Given a collection of paired data, the regression equation ŷ = b0 + b1x algebraically describes the relationship between the two variables. Regression Line (line of best fit or least-squares line): The regression line is the graph of the regression equation.

Always Look at a Scatterplot First You should be able to “see” a straight line being passed through the data points. If data doesn’t exhibit a linear pattern, then what we are about to do does not apply, and therefore should not be used.

Regression Line Plotted on Scatterplot We said that a “good” line would be one that minimizes the distance from each point to the line. In this block, we will find the line that does this best for a given set of data.

The Regression Line is calculated to minimize the distance of the line from the observed values, measured as the sum of the squared vertical distances between each point and the line.

The Regression Equation ŷ = b0 + b1x Where: x is the independent variable (predictor variable), ŷ is the dependent variable (response variable), b0 = y-intercept, b1 = slope (recall, y = mx + b).

Notation for Regression Equation

                                      Population Parameter    Sample Statistic
y-intercept of regression equation    β0                      b0
Slope of regression equation          β1                      b1
Equation of the regression line       y = β0 + β1x            ŷ = b0 + b1x

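Formulas for b0 and b1 (the formula images were lost from this transcript; the standard least-squares formulas are given here) Slope: b1 = [n(Σxy) - (Σx)(Σy)] / [n(Σx²) - (Σx)²] y-intercept: b0 = ȳ - b1·x̄ NOTE: If you do not find b1 first, then b0 may be determined by: b0 = [(Σy)(Σx²) - (Σx)(Σxy)] / [n(Σx²) - (Σx)²]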
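
The slope and y-intercept calculation can be sketched in Python. The formula images did not survive in this transcript, so the code assumes the standard least-squares formulas written in terms of sums: b1 = [nΣxy - ΣxΣy] / [nΣx² - (Σx)²] and b0 = ȳ - b1·x̄, plus the alternative form for b0 that does not require b1. The function name `least_squares` is hypothetical, and the sample data are the four (x, y) pairs used in the residuals slide later in this deck.

```python
def least_squares(xs, ys):
    """Return (b0, b1) for the line y-hat = b0 + b1*x fitted by least squares."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all x-values are identical; the slope is undefined")

    b1 = (n * sum_xy - sum_x * sum_y) / denom   # slope
    b0 = sum_y / n - b1 * sum_x / n             # intercept: y-bar - b1 * x-bar
    return b0, b1


def intercept_without_slope(xs, ys):
    """Alternative b0 formula that does not need b1 to be computed first."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    return (sum_y * sum_x2 - sum_x * sum_xy) / (n * sum_x2 - sum_x ** 2)


xs = [1, 2, 4, 5]
ys = [4, 24, 8, 32]
b0, b1 = least_squares(xs, ys)
print(b0, b1)                            # 5.0 4.0
print(intercept_without_slope(xs, ys))   # 5.0
```

Both routes give the same intercept, which matches the line ŷ = 5 + 4x used for these data later in the slides.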

The Regression Line ŷ = b0 + b1x fits the sample points best. Distances between this line and the sample points are at a minimum.

When Is It Reasonable to Do Regression? Start by asking the following: Does it make sense to look at the relationship between these two variables? Does a scatterplot show a relationship (either positive or negative)? If yes to both, calculate r (the correlation). Is the correlation statistically significant? If yes, go on to regression. If no, the best estimate becomes the mean of the y variable. When the correlation is significant, conduct the regression analysis and use the regression equation to calculate (estimate) a y-value for a specific x-value.

Predictions In predicting a value of y based on some given value of x ... 1. If there is not a significant linear correlation, the best predicted y-value is ȳ (the sample mean of the y-values). 2. If there is a significant linear correlation, the best predicted y-value is found by substituting the x-value into the regression equation.

Predicting the Value of a Variable Start: Calculate the value of r and test the hypothesis that ρ = 0. Is there a significant linear correlation? Yes: Use the regression equation to make predictions. Substitute the given value in the regression equation. No: Given any value of one variable, the best predicted value of the other variable is its sample mean.
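
The decision procedure can be sketched as a small Python function. The slides do not say which test of ρ = 0 is used, so this sketch assumes the common t-test, t = r·sqrt((n - 2) / (1 - r²)) with n - 2 degrees of freedom, and expects the caller to supply the critical value from a t table. All function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Return (b0, b1) for the least-squares line y-hat = b0 + b1*x."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    b1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    b0 = sy / n - b1 * sx / n
    return b0, b1


def correlation(xs, ys):
    """Return the linear correlation coefficient r."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / math.sqrt(
        (n * sxx - sx ** 2) * (n * syy - sy ** 2)
    )


def best_prediction(x_new, xs, ys, t_critical):
    """Follow the flowchart: use the regression line only if r is significant."""
    n = len(xs)
    r = correlation(xs, ys)
    remainder = 1 - r * r
    # A perfect correlation would divide by zero, so treat it as significant.
    t = math.inf if remainder <= 0 else abs(r) * math.sqrt((n - 2) / remainder)
    if t > t_critical:
        b0, b1 = fit_line(xs, ys)
        return b0 + b1 * x_new      # "Yes" branch: substitute into the equation
    return sum(ys) / n              # "No" branch: the sample mean of y


xs = [1, 2, 4, 5]
ys = [4, 24, 8, 32]
# 4.303 is the two-tailed 0.05 critical t-value for n - 2 = 2 degrees of freedom.
print(best_prediction(5, xs, ys, t_critical=4.303))
```

With only four scattered points the correlation is not significant, so the function falls back to the mean of the y-values rather than the regression line.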

Guidelines for Using The Regression Equation If there is no significant linear correlation, don’t use the regression equation to make predictions. When using the regression equation for predictions, stay within the scope of the available sample data. A regression equation based on old data is not necessarily valid now. Don’t make predictions about a population that is different from the population from which the sample data was drawn.

Definitions Marginal Change: the amount a variable changes when the other variable changes by exactly one unit. Outlier: a point lying far away from the other data points. Influential Points: points which strongly affect the graph of the regression line. The slope b1 in the regression equation represents the marginal change in y that occurs when x changes by one unit.

Residuals and the Least-Squares Property Definitions Residual: For a sample of paired (x, y) data, the difference (y - ŷ) between an observed sample y-value and the value of ŷ, which is the value of y that is predicted by using the regression equation. Least-Squares Property: A straight line satisfies this property if the sum of the squares of the residuals is the smallest sum possible.

Residuals and the Least-Squares Property Sample data with the regression line ŷ = 5 + 4x:

x     y     Residual (y - ŷ)
1     4      -5
2    24      11
4     8     -13
5    32       7

[Figure: scatterplot of the four points with the line ŷ = 5 + 4x and each residual marked]
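
The residuals for this small data set, and the least-squares property itself, can be checked numerically. This is a minimal sketch; the helper names and the two comparison lines (ŷ = 4 + 4x and ŷ = 5 + 5x) are illustrative choices, not part of the slides.

```python
xs = [1, 2, 4, 5]
ys = [4, 24, 8, 32]


def residuals(b0, b1):
    """Observed y minus predicted y-hat = b0 + b1*x, for every data point."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]


def sum_squared_residuals(b0, b1):
    """The quantity that the least-squares line makes as small as possible."""
    return sum(e * e for e in residuals(b0, b1))


print(residuals(5, 4))               # [-5, 11, -13, 7]
print(sum_squared_residuals(5, 4))   # 364

# Any other line should give a larger sum of squared residuals.
print(sum_squared_residuals(4, 4))   # 368
print(sum_squared_residuals(5, 5))   # 410
```

Shifting the intercept or the slope away from ŷ = 5 + 4x increases the sum of squares, which is what the least-squares property says.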

Example: Orion Cars The age and price for a sample of 11 Orions are noted below. Calculate a correlation coefficient and, if appropriate, a regression equation for the relationship. Determine the value of cars that are 4.5 years and 10 years old.

Car    Age (yrs.)    Price ($100’s)
 1     5              85
 2     4             103
 3     6              70
 4     5              82
 5     5              89
 6     5              98
 7     6              66
 8     6              95
 9     2             169
10     7              70
11     7              48
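
The Orion calculation can be worked through in Python. The slides' own results were images and are not in this transcript, so the numbers below are computed directly from the table, assuming the standard least-squares and correlation formulas. Variable names are illustrative.

```python
import math

# Age in years and price in $100's for the 11 Orions in the table
ages = [5, 4, 6, 5, 5, 5, 6, 6, 2, 7, 7]
prices = [85, 103, 70, 82, 89, 98, 66, 95, 169, 70, 48]

n = len(ages)
sum_x, sum_y = sum(ages), sum(prices)
sum_xy = sum(x * y for x, y in zip(ages, prices))
sum_x2 = sum(x * x for x in ages)
sum_y2 = sum(y * y for y in prices)

# Linear correlation coefficient
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

# Slope and y-intercept of the regression line
b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b0 = sum_y / n - b1 * sum_x / n


def predict(age):
    """Predicted price in $100's for a car of the given age."""
    return b0 + b1 * age


print(f"r = {r:.3f}")
print(f"y-hat = {b0:.2f} + ({b1:.2f})x")
for age in (4.5, 10):
    in_range = min(ages) <= age <= max(ages)
    note = "" if in_range else "  (outside the sampled ages; not reliable)"
    print(f"age {age}: {predict(age):.2f}{note}")
```

On these data r is roughly -0.92 and the fitted line is roughly ŷ = 195.47 - 20.26x, so a 4.5-year-old car is predicted at about 104.29, or about $10,429. The prediction for a 10-year-old car comes out negative, because 10 years lies outside the sampled ages of 2 to 7. That is why the guidelines say to stay within the scope of the available sample data.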

[Figures omitted: Example: Orion Cars (Price in thousands)]

[Figure omitted: regression line with the influential point and without the influential point (Price in thousands)]