Basic Practice of Statistics - 3rd Edition

Slides:



Advertisements
Similar presentations
AP Statistics Chapters 3 & 4 Measuring Relationships Between 2 Variables.
Advertisements

Chapter 2: Looking at Data - Relationships /true-fact-the-lack-of-pirates-is-causing-global-warming/
Looking at Data-Relationships 2.1 –Scatter plots.
BPS - 5th Ed. Chapter 51 Regression. BPS - 5th Ed. Chapter 52 u Objective: To quantify the linear relationship between an explanatory variable (x) and.
CHAPTER 3 Describing Relationships
Chapter 5 Regression. Chapter 51 u Objective: To quantify the linear relationship between an explanatory variable (x) and response variable (y). u We.
Chapter 5 Regression. Chapter outline The least-squares regression line Facts about least-squares regression Residuals Influential observations Cautions.
2.4: Cautions about Regression and Correlation. Cautions: Regression & Correlation Correlation measures only linear association. Extrapolation often produces.
Chapter 3 concepts/objectives Define and describe density curves Measure position using percentiles Measure position using z-scores Describe Normal distributions.
Notes Bivariate Data Chapters Bivariate Data Explores relationships between two quantitative variables.
AP STATISTICS LESSON 3 – 3 LEAST – SQUARES REGRESSION.
AP Statistics Chapter 8 & 9 Day 3
Lecture PowerPoint Slides Basic Practice of Statistics 7 th Edition.
Chapter 15 Describing Relationships: Regression, Prediction, and Causation Chapter 151.
Essential Statistics Chapter 41 Scatterplots and Correlation.
Notes Bivariate Data Chapters Bivariate Data Explores relationships between two quantitative variables.
Chapter 151 Describing Relationships: Regression, Prediction, and Causation.
BPS - 3rd Ed. Chapter 51 Regression. BPS - 3rd Ed. Chapter 52 u Objective: To quantify the linear relationship between an explanatory variable (x) and.
Chapter 5 Regression BPS - 5th Ed. Chapter 51. Linear Regression  Objective: To quantify the linear relationship between an explanatory variable (x)
BPS - 5th Ed. Chapter 51 Regression. BPS - 5th Ed. Chapter 52 u Objective: To quantify the linear relationship between an explanatory variable (x) and.
Lecture Presentation Slides SEVENTH EDITION STATISTICS Moore / McCabe / Craig Introduction to the Practice of Chapter 2 Looking at Data: Relationships.
Relationships If we are doing a study which involves more than one variable, how can we tell if there is a relationship between two (or more) of the.
The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers CHAPTER 3 Describing Relationships 3.2 Least-Squares.
Examining Bivariate Data Unit 3 – Statistics. Some Vocabulary Response aka Dependent Variable –Measures an outcome of a study Explanatory aka Independent.
CHAPTER 5 Regression BPS - 5TH ED.CHAPTER 5 1. PREDICTION VIA REGRESSION LINE NUMBER OF NEW BIRDS AND PERCENT RETURNING BPS - 5TH ED.CHAPTER 5 2.
Chapter 5 Regression. u Objective: To quantify the linear relationship between an explanatory variable (x) and response variable (y). u We can then predict.
AP STATISTICS LESSON 4 – 2 ( DAY 1 ) Cautions About Correlation and Regression.
Chapter 3-Examining Relationships Scatterplots and Correlation Least-squares Regression.
^ y = a + bx Stats Chapter 5 - Least Squares Regression
CHAPTER 3 Describing Relationships
BPS - 3rd Ed. Chapter 51 Regression. BPS - 3rd Ed. Chapter 52 u To describe the change in Y per unit X u To predict the average level of Y at a given.
Stat 1510: Statistical Thinking and Concepts REGRESSION.
Lecture PowerPoint Slides Basic Practice of Statistics 7 th Edition.
Get out p. 193 HW and notes. LEAST-SQUARES REGRESSION 3.2 Interpreting Computer Regression Output.
Lecture PowerPoint Slides Basic Practice of Statistics 7 th Edition.
CHAPTER 5: Regression ESSENTIAL STATISTICS Second Edition David S. Moore, William I. Notz, and Michael A. Fligner Lecture Presentation.
Describing Relationships. Least-Squares Regression  A method for finding a line that summarizes the relationship between two variables Only in a specific.
Chapter 5: 02/17/ Chapter 5 Regression. 2 Chapter 5: 02/17/2004 Objective: To quantify the linear relationship between an explanatory variable (x)
Chapter 4.2 Notes LSRL.
Statistics 101 Chapter 3 Section 3.
Essential Statistics Regression
Cautions About Correlation and Regression
Cautions about Correlation and Regression
Basic Practice of Statistics - 3rd Edition Lecture PowerPoint Slides
Chapter 2 Looking at Data— Relationships
Chapter 2: Looking at Data — Relationships
Chapter 2 Looking at Data— Relationships
Least-Squares Regression
Chapter 3: Describing Relationships
^ y = a + bx Stats Chapter 5 - Least Squares Regression
Least-Squares Regression
Chapter 2 Looking at Data— Relationships
Examining Relationships
Basic Practice of Statistics - 5th Edition Regression
HS 67 (Intro Health Stat) Regression
Daniela Stan Raicu School of CTI, DePaul University
Chapter 3: Describing Relationships
Chapter 3: Describing Relationships
Least-Squares Regression
Basic Practice of Statistics - 3rd Edition Regression
Chapter 3: Describing Relationships
Least-Squares Regression
CHAPTER 3 Describing Relationships
Chapter 3: Describing Relationships
Chapter 3: Describing Relationships
Chapter 3: Describing Relationships
Chapter 3: Describing Relationships
Chapter 3: Describing Relationships
Basic Practice of Statistics - 3rd Edition Lecture Powerpoint
Chapter 3: Describing Relationships
Presentation transcript:

Basic Practice of Statistics - 3rd Edition CHAPTER 5: REGRESSION The Basic Practice of Statistics 6th Edition David S. Moore Chapter 5

Chapter 5 Concepts Regression Lines Least-Squares Regression Line Facts about Regression Residuals Influential Observations Cautions about Correlation and Regression Correlation Does Not Imply Causation

Chapter 5 Objectives Quantify the linear relationship between an explanatory variable (x) and a response variable (y). Use a regression line to predict values of (y) for values of (x). Calculate and interpret residuals. Describe cautions about correlation and regression.

Regression Line A regression line is a straight line that describes how a response variable y changes as an explanatory variable x changes. We can use a regression line to predict the value of y for a given value of x. Example: Predict the number of new adult birds that join the colony based on the percent of adult birds that return to the colony from the previous year. If 60% of adults return, how many new birds are predicted?

Regression Line

Regression Line ^ Regression equation: y = a + bx x is the value of the explanatory variable. “y-hat” is the predicted value of the response variable for a given value of x. b is the slope, the amount by which y changes for each one-unit increase in x. a is the intercept, the value of y when x = 0.

Least-Squares Regression Line Since we are trying to predict y, we want the regression line to be as close as possible to the data points in the vertical (y) direction. Least-Squares Regression Line (LSRL): The line that minimizes the sum of the squares of the vertical distances of the data points from the line.

Regression equation: y = a + bx Equation of the LSRL ^ Regression equation: y = a + bx where sx and sy are the standard deviations of the two variables, and r is their correlation.

Prediction via Regression Line For the returning birds example, the LSRL is y-hat = 31.9343  0.3040x y-hat is the predicted number of new birds for colonies with x percent of adults returning Suppose we know that an individual colony has 60% returning. What would we predict the number of new birds to be for just that colony? For colonies with 60% returning, we predict the average number of new birds to be: 31.9343  (0.3040)(60) = 13.69 birds

Facts about Least-Squares Regression The distinction between explanatory and response variables is essential. The slope b and correlation r always have the same sign. The LSRL always passes through (x-bar, y-bar). The square of the correlation, r2, is the fraction of the variation in the values of y that is explained by the least- squares regression of y on x. r2 is called the coefficient of determination.

Residuals A residual is the difference between an observed value of the response variable and the value predicted by the regression line: residual = y  y ^ A residual plot is a scatterplot of the regression residuals against the explanatory variable used to assess the fit of a regression line look for a “random” scatter around zero

Gesell Adaptive Score and Age at First Word

Gesell Adaptive Score and Age at First Word

Outliers and Influential Points An outlier is an observation that lies far away from the other observations. Outliers in the y direction have large residuals. Outliers in the x direction are often influential for the least-squares regression line, meaning that the removal of such points would markedly change the equation of the line.

Outliers and Influential Points Gesell Adaptive Score and Age at First Word After removing child 18 r2 = 11% From all of the data r2 = 41% Chapter 5

Cautions about Correlation and Regression Both describe linear relationships. Both are affected by outliers. Always plot the data before interpreting. Beware of extrapolation. predicting outside of the range of x Beware of lurking variables. These have an important effect on the relationship among the variables in a study, but are not included in the study. Correlation does not imply causation!

Caution: Beware of Extrapolation Sarah’s height was plotted against her age. Can you predict her height at age 42 months? Can you predict her height at age 30 years (360 months)?

Caution: Beware of Extrapolation Regression line: y-hat = 71.95 + .383 x Height at age 42 months? y-hat = 88 Height at age 30 years? y-hat = 209.8 She is predicted to be 6’10.5” at age 30!

Basic Practice of Statistics - 3rd Edition Caution: Beware of Lurking Variables Example: Meditation and Aging (Noetic Sciences Review, Summer 1993, p. 28) Explanatory variable: observed meditation practice (yes/no) Response variable: level of age-related enzyme General concern for one’s well being may also be affecting the response (and the decision to try meditation). Chapter 5

Basic Practice of Statistics - 3rd Edition Correlation Does Not Imply Causation Even very strong correlations may not correspond to a real causal relationship (changes in x actually causing changes in y). Correlation may be explained by a lurking variable Social Relationships and Health House, J., Landis, K., and Umberson, D. “Social Relationships and Health,” Science, Vol. 241 (1988), pp 540-545. Does lack of social relationships cause people to become ill? (there was a strong correlation) Or, are unhealthy people less likely to establish and maintain social relationships? (reversed relationship) Or, is there some other factor that predisposes people both to have lower social activity and become ill? Chapter 5

Basic Practice of Statistics - 3rd Edition Evidence of Causation A properly conducted experiment may establish causation. Other considerations when we cannot do an experiment: The association is strong. The association is consistent. The connection happens in repeated trials. The connection happens under varying conditions. Higher doses are associated with stronger responses. Alleged cause precedes the effect in time. Alleged cause is plausible (reasonable explanation). Chapter 5

Regression Case Study Country Per Capita GDP (x) Life Expectancy (y) Per Capita Gross Domestic Product and Average Life Expectancy for Countries in Western Europe Country Per Capita GDP (x) Life Expectancy (y) Austria 21.4 77.48 Belgium 23.2 77.53 Finland 20.0 77.32 France 22.7 78.63 Germany 20.8 77.17 Ireland 18.6 76.39 Italy 21.5 78.51 Netherlands 22.0 78.15 Switzerland 23.8 78.99 United Kingdom 21.2 77.37

Regression Case Study y = 68.716 + 0.420x ^ Least-Squares Regression Line Equation: y = 68.716 + 0.420x ^

Chapter 5 Objectives Review Quantify the linear relationship between an explanatory variable (x) and a response variable (y). Use a regression line to predict values of (y) for values of (x). Calculate and interpret residuals. Describe cautions about correlation and regression.