Chapter 4: More on Two-Variable Data. Sec. 4.2 – Cautions about Correlation and Regression



Cautions about Correlation and Regression
Recall from Chapter 3:
Correlation and regression describe only linear relationships.
Correlation and the least-squares regression line (LSRL) are not resistant: one influential point or incorrectly entered data point can completely change the correlation and the fitted line.
Always plot your data before interpreting regression or correlation.

Extrapolation
Extrapolation is the use of a regression line far outside the domain of values of the explanatory variable x that you used to obtain the line or curve. Such predictions are often not accurate.
Example: Suppose that you have data on a child’s growth between ages 3 and 8. You find a strong linear relationship between age x and height y. If you fit a regression line to these data and use it to predict the child’s height at age 25, you would predict a height of 8 feet.
Don’t stray far from the domain of x that actually appears in your data.

Lurking Variables
Sometimes the relationship between two variables is influenced by other variables that we did not measure or even think about.
A lurking variable is a variable that is not among the explanatory or response variables in a study and yet may influence the interpretation of relationships among those variables.
The relationship between two variables can be strongly influenced by lurking variables. A lurking variable can falsely suggest a strong relationship between x and y, or it can hide a relationship that is really there.

Lurking Variables
Because lurking variables are often unrecognized and unmeasured, detecting their effect is a challenge.
Many lurking variables change systematically over time. One way to detect whether time has an influence is to plot both the residuals and the response variable against the time order of the observations, when that order is available.

The Question of Causation
In many studies of the relationship between two variables, the goal is to establish that changes in the explanatory variable cause changes in the response variable. Even when a strong association is present, the conclusion that this association is due to a causal link between the variables is often elusive.

Explaining Association
Strong associations can generally be explained by one of three relationships:
1. Causation
2. Common response
3. Confounding
Variables x and y show a strong association (dashed line). This association may be the result of any of several causal relationships (solid arrow).

Causation: x causes y.
Common response: x and y are both reacting to a lurking variable z.
Confounding: x may cause y, but y may instead be caused by a confounding variable z.

Causation
Causation is not easily established. The best evidence for causation comes from experiments that change x while holding all other factors fixed. Even a very strong association between two variables is not by itself good evidence that there is a cause-and-effect link between the variables.

Examples of Direct Causation
The following relationships are examples of direct causation, but “causation” is not a simple idea. Refer to p. 233 for explanations.
1. x = mother’s BMI, y = daughter’s BMI
2. x = amount of saccharin in a rat’s diet, y = count of tumors in the rat’s bladder

Common Response
Beware of lurking variables when thinking about an association between two variables. In common response, the observed association between the variables x and y is explained by a lurking variable z: both x and y change in response to changes in z. This common response creates an association even though there may be no direct causal link between x and y.

Examples of Common Response
The following relationships are examples of how common response can create an association. Refer to p. 233 for explanations.
3. x = a high school senior’s SAT score, y = the student’s first-year college GPA
4. x = monthly flow of money into stock mutual funds, y = monthly rate of return for the stock market

Confounding
Two variables are confounded when their effects on a response variable cannot be distinguished from each other. The confounded variables may be either explanatory variables or lurking variables. Confounding of several variables often prevents us from drawing conclusions about causation.

Examples of Confounding
The following relationships are examples of confounding. Refer to p. 234 for explanations.
5. x = whether a person regularly attends religious services, y = how long the person lives
6. x = the number of years of education a worker has, y = the worker’s income

Homework: p #’s 33-36, 38 & 41