Chapter 5 Regression. Chapter outline The least-squares regression line Facts about least-squares regression Residuals Influential observations Cautions.

Presentation transcript:

Chapter 5 Regression

Chapter outline: The least-squares regression line; Facts about least-squares regression; Residuals; Influential observations; Cautions about correlation and regression; Association does not imply causation.

Correlation and Regression: In a scatterplot, regression is depicted by the slope of the fitted line, and correlation by how tightly the points cluster around that line. The more the points spread out around the regression line, the less well x predicts y, and consequently the weaker the correlation.

Correlation r = 1

Imperfect Correlation and Relationships: We rarely see perfect correlation. Even when the correlation is not perfect, we can draw a line to summarize the trend in the data points. This is the regression line.

Regression Line: A straight line that describes how a response variable y changes as an explanatory variable x changes. It can sometimes be used to predict the value of y for a given value of x.

Making Predictions

Where do we Draw the Line?

Minimize the sum of the distances between the points and the line. Square the distances.

The best-fitting line minimizes the sum of the squared vertical distances of every point in the scatterplot from the regression line. Minimize $\sum_i (y_i - \hat{y}_i)^2$, where $y_i$ is an observed value and $\hat{y}_i$ is the value the line predicts for the same x. Compared with any other line you could plot through the points, the best-fitting line produces the smallest sum of squared deviations.
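The least-squares criterion can be checked numerically. The sketch below is only an illustration: the data values and the function names `sum_squared_errors` and `least_squares_line` are made up for this example, and the line is found with the standard closed-form solution. It then confirms that nudging the intercept or slope away from the fitted values increases the sum of squared deviations.

```python
def sum_squared_errors(xs, ys, a, b):
    """Sum of squared vertical distances between the points and the line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def least_squares_line(xs, ys):
    """Return (intercept a, slope b) of the least-squares line of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


# Hypothetical data, invented for this illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = least_squares_line(xs, ys)
best = sum_squared_errors(xs, ys, a, b)

# Try a few other lines by shifting the intercept and/or the slope.
shifts = [(da, db) for da in (-0.5, 0.0, 0.5) for db in (-0.2, 0.0, 0.2)
          if (da, db) != (0.0, 0.0)]
worse = [sum_squared_errors(xs, ys, a + da, b + db) for da, db in shifts]

print("least-squares line: y-hat = %.3f + %.3f x" % (a, b))
print("sum of squared deviations for that line:", round(best, 4))
print("smallest value among the other lines tried:", round(min(worse), 4))
```

Every alternative line tried here has a larger sum of squared deviations than the fitted line, which is exactly what "best-fitting" means in the least-squares sense.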

The regression line has the equation $\hat{y} = a + bx$. The slope b is the amount by which the predicted y changes when x increases by 1. The intercept a is the predicted value of y when x = 0.
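To make the roles of the slope and intercept concrete, here is a minimal sketch that computes them from summary statistics using the standard formulas b = r * s_y / s_x and a = y-bar - b * x-bar, where r is the correlation and s_x, s_y are the standard deviations. The data and the helper names `correlation`, `regression_line`, and `predict` are assumptions made for this example.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Correlation r: average product of standardized x and y values (divisor n - 1)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (n - 1)


def regression_line(xs, ys):
    """Return (intercept a, slope b) of the least-squares line of y on x."""
    b = correlation(xs, ys) * stdev(ys) / stdev(xs)
    a = mean(ys) - b * mean(xs)
    return a, b


def predict(a, b, x):
    """Predicted response y-hat = a + b*x."""
    return a + b * x


# Hypothetical data, invented for this illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

a, b = regression_line(xs, ys)
print("intercept a =", round(a, 3), " slope b =", round(b, 3))

# The intercept is the prediction at x = 0 ...
print("prediction at x = 0:", round(predict(a, b, 0.0), 3))
# ... and raising x by 1 changes the prediction by exactly b.
print("change from x = 3 to x = 4:", round(predict(a, b, 4.0) - predict(a, b, 3.0), 3))
```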

Finding the equation of the regression line Exercise 5.16 (Page 125)

Facts about the least-squares regression line
Fact 1: It is a mathematical model for the data.
Fact 2: The distinction between explanatory and response variables is essential in regression.
Fact 3: There is a close connection between correlation and the slope of the least-squares line: the slope is $b = r\, s_y / s_x$.
Fact 4: The least-squares regression line always passes through the point $(\bar{x}, \bar{y})$, where $\bar{x}$ is the mean of the x values and $\bar{y}$ is the mean of the y values.
Fact 5: The correlation r describes the strength of a straight-line relationship. In the regression setting, this description takes a specific form: the square of the correlation, $r^2$, is the fraction of the variation in the values of y that is explained by the least-squares regression of y on x.
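Facts 2, 4, and 5 can be verified on a small example. The following sketch uses made-up data and hypothetical variable names. It checks that the fitted line passes through the point of means, that the squared correlation equals the explained fraction of variation (computed as 1 minus the ratio of residual variation to total variation in y), and that swapping the roles of x and y gives a different line.

```python
from statistics import mean

# Hypothetical data, invented for this illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

x_bar, y_bar = mean(xs), mean(ys)
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

# Least-squares regression of y on x.
b = sxy / sxx
a = y_bar - b * x_bar
r = sxy / (sxx * syy) ** 0.5

# Fact 4: the prediction at the mean of x is the mean of y.
prediction_at_x_bar = a + b * x_bar
print("prediction at x-bar:", round(prediction_at_x_bar, 6), " y-bar:", y_bar)

# Fact 5: r squared is the fraction of the variation in y explained by the line.
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
explained_fraction = 1 - sse / syy
print("r squared:", round(r ** 2, 6), " explained fraction:", round(explained_fraction, 6))

# Fact 2: regressing x on y (roles swapped) gives x-hat = a2 + b2 * y.
b2 = sxy / syy
a2 = x_bar - b2 * y_bar
# Drawn on the same axes (x horizontal), that line has slope 1 / b2.
print("slope of y on x:", round(b, 3), " slope of the x-on-y line on the same axes:", round(1 / b2, 3))
```

The two slopes agree only when the points lie exactly on a line (r = 1 or r = -1), which is why the choice of explanatory and response variable matters.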

Residuals: A residual is the difference between an observed value of the response variable and the value predicted by the regression line. That is, residual = observed y - predicted y = $y - \hat{y}$. A residual plot is a scatterplot of the regression residuals against the explanatory variable. Residual plots help us assess the fit of a regression line.
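As an illustration of the definition, the sketch below computes the residuals for a small made-up data set and lists the (x, residual) pairs that a residual plot would display. The data and the function name `fit_line` are assumptions for this example. For a least-squares line the residuals always sum to zero, up to rounding, which the code also checks.

```python
def fit_line(xs, ys):
    """Return (intercept a, slope b) of the least-squares line of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - b * x_bar, b


# Hypothetical data, invented for this illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

a, b = fit_line(xs, ys)

# residual = observed y - predicted y
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# These pairs are what a residual plot shows: x on the horizontal axis,
# residual on the vertical axis, with a reference line at zero.
for x, e in zip(xs, residuals):
    print("x = %.1f   residual = %+.2f" % (x, e))

print("sum of residuals:", round(sum(residuals), 10))
```

A residual plot with no systematic pattern, such as a curve or a fan shape, suggests the straight line is a reasonable summary of the data.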

Outliers and Influential Observations: An outlier is an observation that lies outside the overall pattern of the other observations. An observation is influential for a statistical calculation if removing it would markedly change the result of the calculation. Points that are outliers in the x direction of a scatterplot are often influential for the least-squares regression line. Influential observations can also be described as outliers.
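One way to see whether a point is influential is to fit the line with and without it and compare the results. The sketch below does this for made-up data in which the last point is an outlier in the x direction. The data and the function name `fit_line` are assumptions for this example.

```python
def fit_line(xs, ys):
    """Return (intercept a, slope b) of the least-squares line of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - b * x_bar, b


# Hypothetical data: five points with an upward trend, plus one point
# far to the right (an outlier in the x direction).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 20.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0, 3.0]

a_all, b_all = fit_line(xs, ys)
a_without, b_without = fit_line(xs[:-1], ys[:-1])

print("with the outlying point:    y-hat = %.3f + %.3f x" % (a_all, b_all))
print("without the outlying point: y-hat = %.3f + %.3f x" % (a_without, b_without))
```

In this example, removing the single far-right point changes the slope from slightly negative to clearly positive, so that point is influential for the regression line.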

Outliers and Influential Observations

Beware extrapolation: Extrapolation is the use of a regression line for prediction far outside the range of values of the explanatory variable x that you used to obtain the line. Such predictions are often not accurate. Example: Suppose Angela was 1.20m tall on January 1st 1975, and 1.40m tall on January 1st 1976. By extrapolation, estimate her height on January 1st 1977. By extrapolation, it could be estimated that by January 1st 1977 she would have grown another 0.20m to be 1.60m tall. This, however, assumes that she continued to grow at the same rate. That assumption must eventually become false; otherwise, by January 1st 1980 she would be 2.20m tall, a giantess.
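The arithmetic behind the Angela example can be sketched as follows. The sketch assumes the two measurements are exactly one year apart (January 1st 1975 and January 1st 1976), which is what the 1977 estimate of 1.60m implies. The function names `predicted_height` and `is_extrapolation` are hypothetical.

```python
# Observed data (the 1976 date is an assumption, see the note above).
years = [1975, 1976]
heights_m = [1.20, 1.40]

# With two points, the fitted line simply passes through both.
slope = (heights_m[1] - heights_m[0]) / (years[1] - years[0])  # metres per year
intercept = heights_m[0] - slope * years[0]


def predicted_height(year):
    """Height in metres predicted by the straight line through the two observations."""
    return intercept + slope * year


def is_extrapolation(year):
    """True when the year lies outside the range of the observed x values."""
    return year < min(years) or year > max(years)


for year in (1977, 1980, 1990):
    note = " (extrapolation)" if is_extrapolation(year) else ""
    print("%d: %.2f m%s" % (year, predicted_height(year), note))
```

The line predicts 1.60m for 1977, 2.20m for 1980, and 4.20m for 1990. The further the prediction moves from the observed years, the less plausible the constant-growth assumption becomes.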

Lurking variable: A lurking variable is a variable that has an important effect on the relationship among the variables in a study but is not included among the variables studied. Example: Studies of the relationship between treatment of heart disease and the patients' gender show that women are in general treated less aggressively than men with similar symptoms. Women are less likely to undergo bypass operations. Question: Might this be discrimination? Answer: Not necessarily. Be aware of the lurking variable: although half of heart disease victims are women, they are on average much older than male victims, and age may explain some of the difference in treatment.

Association does not imply causation. Example: Sales of rum and the number of Methodist ministers are positively correlated, but a large number of ministers does not encourage rum drinking. Is there a lurking variable that influences both rum sales and the number of Methodist ministers? In this example, both the sales of rum and the number of Methodist ministers are correlated with the number of people in the U.S. As the population increases, demand rises for both Methodist ministers and rum.
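A small numerical sketch shows how a common cause can produce a strong correlation between two variables that do not influence each other. All numbers below are invented for illustration and are not real figures for population, ministers, or rum sales. The function name `correlation` is also hypothetical. Dividing by population is used here as one simple way to adjust for the lurking variable.

```python
def correlation(xs, ys):
    """Correlation r between two equal-length lists of numbers."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Invented values: both counts grow roughly in proportion to population.
population = [10.0, 20.0, 30.0, 40.0, 50.0]      # lurking variable
ministers = [105.0, 195.0, 310.0, 390.0, 505.0]  # about 10 per unit of population
rum_sales = [52.0, 98.0, 149.0, 204.0, 248.0]    # about 5 per unit of population

r_raw = correlation(ministers, rum_sales)
print("ministers vs rum sales:       r =", round(r_raw, 3))
print("population vs ministers:      r =", round(correlation(population, ministers), 3))
print("population vs rum sales:      r =", round(correlation(population, rum_sales), 3))

# Adjust for population by comparing per-capita rates instead of raw counts.
ministers_rate = [m / p for m, p in zip(ministers, population)]
rum_rate = [s / p for s, p in zip(rum_sales, population)]
r_adjusted = correlation(ministers_rate, rum_rate)
print("per-capita ministers vs rum:  r =", round(r_adjusted, 3))
```

The raw counts are almost perfectly correlated because both track population. Once population is taken into account, the association in this made-up data is much weaker.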