The Practice of Statistics Third Edition Chapter 4: More about Relationships between Two Variables Copyright © 2008 by W. H. Freeman & Company Daniel S.

Presentation transcript:

The Practice of Statistics Third Edition Chapter 4: More about Relationships between Two Variables Copyright © 2008 by W. H. Freeman & Company Daniel S. Yates

Section 4.1 Modeling Nonlinear Data: Linear relationship. Table columns: x, y, Δy/Δx. A constant value of Δy/Δx indicates a linear relationship, y = a + bx.

Section 4.1 Modeling Nonlinear Data: Exponential relationship. Table columns: x, y, y_n/y_{n-1}. A constant value of y_n/y_{n-1}, called the common ratio, indicates an exponential relationship, y = ab^x.

Section 4.1 Modeling Nonlinear Data: Power relationship. Table columns: x, y, y_n/y_{n-1}, Δy/Δx. If neither y_n/y_{n-1} nor Δy/Δx is constant, that indicates a possible power relationship, y = ax^b.
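
The three checks above can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function name classify_pattern, the tolerance, and the sample data are assumptions, and the ratio test is meaningful only when the x values are evenly spaced and the y values are nonzero.

```python
def classify_pattern(xs, ys, tol=1e-9):
    """Guess the pattern in (x, y) data from successive slopes and ratios.

    Assumes evenly spaced x values and nonzero y values (illustrative only).
    """
    pairs = list(zip(xs, ys))
    # Successive slopes: delta y / delta x
    slopes = [(y1 - y0) / (x1 - x0) for (x0, y0), (x1, y1) in zip(pairs, pairs[1:])]
    # Successive ratios: y_n / y_{n-1}
    ratios = [y1 / y0 for y0, y1 in zip(ys, ys[1:])]

    def is_constant(values):
        return max(values) - min(values) <= tol

    if is_constant(slopes):
        return "linear"
    if is_constant(ratios):
        return "exponential"
    return "possibly power"


xs = [1, 2, 3, 4]
linear_guess = classify_pattern(xs, [5, 8, 11, 14])      # y = 2 + 3x
exponential_guess = classify_pattern(xs, [3, 6, 12, 24])  # y = 1.5 * 2^x
power_guess = classify_pattern(xs, [2, 8, 18, 32])        # y = 2x^2
print(linear_guess, exponential_guess, power_guess)
```

Real data are noisy, so in practice the differences and ratios will only be roughly constant; a looser tolerance or a residual plot is a more realistic check.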

Section 4.1 Modeling Nonlinear Data: Many important real-world situations exhibit exponential or power relationships. Exponential and power relationships can be transformed into linear forms so that linear regression analysis can be used.

Linear regression only works for linear models. (That sounds obvious, but when you fit a regression, you can’t take it for granted.) A curved relationship between two variables might not be apparent when looking at a scatterplot alone, but will be more obvious in a plot of the residuals. –Remember, we want to see “nothing” in a plot of the residuals.

No regression analysis is complete without a display of the residuals to check that the linear model is reasonable. Residuals often reveal subtleties that were not clear from a plot of the original data.
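
As a rough illustration of why residuals matter, the sketch below fits a least-squares line to data that are actually curved (y = x^2) and prints the sign of each residual. The helper names fit_line and residuals and the data are assumptions made for this example, not material from the slides.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def residuals(xs, ys):
    """Observed y minus predicted y for each point."""
    a, b = fit_line(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]


xs = [1, 2, 3, 4, 5, 6]
ys = [x ** 2 for x in xs]  # curved data, deliberately not linear
res = residuals(xs, ys)
signs = "".join("+" if r > 0 else "-" for r in res)
print(signs)  # a run of one sign in the middle is a pattern, not "nothing"
```

Positive residuals at both ends and negative residuals in the middle form a clear U shape, which is the kind of structure a residual plot exposes even when the scatterplot looks nearly straight.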

Section 4.1 Modeling Nonlinear Data: For an exponential relationship, log y is linear with respect to x. For a power relationship, log y is linear with respect to log x.
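
The reason is that taking logs of y = ab^x gives log y = log a + x log b, and taking logs of y = ax^b gives log y = log a + b log x. A minimal sketch of using these transformations follows; the function names and the noise-free sample data are assumptions chosen so the recovered parameters are easy to check.

```python
import math


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_exponential(xs, ys):
    """Fit y = a * b**x by regressing log10(y) on x."""
    intercept, slope = fit_line(xs, [math.log10(y) for y in ys])
    return 10 ** intercept, 10 ** slope


def fit_power(xs, ys):
    """Fit y = a * x**b by regressing log10(y) on log10(x)."""
    intercept, slope = fit_line(
        [math.log10(x) for x in xs], [math.log10(y) for y in ys]
    )
    return 10 ** intercept, slope


xs = [1, 2, 3, 4, 5]
exp_a, exp_b = fit_exponential(xs, [3 * 2 ** x for x in xs])  # y = 3 * 2^x
pow_a, pow_b = fit_power(xs, [2 * x ** 3 for x in xs])        # y = 2 * x^3
print(round(exp_a, 6), round(exp_b, 6))
print(round(pow_a, 6), round(pow_b, 6))
```

Both transformations require positive y values (and positive x values for the power model), since the logarithm is undefined otherwise.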

Linear models give a predicted value for each case in the data. We cannot assume that a linear relationship in the data exists beyond the range of the data. Once we venture into new x territory, such a prediction is called an extrapolation.

Section 4.2 Interpreting Correlation and Regression: The correlation r and the least-squares regression line (LSRL) describe only linear relationships. Both r and the LSRL are strongly influenced by a few extreme observations (influential points). Always plot your data. Using a regression line to predict outside the range of values of the explanatory variable x is called extrapolation, and such predictions cannot be trusted.
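
To see how much a single extreme observation can move r, the sketch below computes the correlation for a small made-up data set with and without one far-away point. The data and the helper name correlation are assumptions for illustration only.

```python
import math


def correlation(xs, ys):
    """Pearson correlation coefficient r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


# Five points with only a moderate linear association (made-up data)
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 3, 2, 3]
r_without = correlation(xs, ys)

# Add one extreme observation far from the rest
r_with = correlation(xs + [20], ys + [20])

print(round(r_without, 3), round(r_with, 3))
```

One added point raises r from moderate to nearly 1, which is why the data should be plotted before r or the LSRL is interpreted.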

Lurking Variables and Causation: No matter how strong the association, no matter how large the R^2 value, no matter how straight the line, there is no way to conclude from a regression alone that one variable causes the other. There’s always the possibility that some third variable is driving both of the variables you have observed. With observational data, as opposed to data from a designed experiment, there is no way to be sure that a lurking variable is not the cause of any apparent association.

Section 4.2 Interpreting Correlation and Regression Lurking variables are variables that can influence the relationship of two variables. Lurking variables are not measured or even considered. Lurking variables can falsely suggest a strong relationship between two variables or even hide a relationship.

Lurking Variables and Causation (cont.) The following scatterplot shows that the average life expectancy for a country is related to the number of doctors per person in that country:

Lurking Variables and Causation (cont.) This new scatterplot shows that the average life expectancy for a country is related to the number of televisions per person in that country:

Lurking Variables and Causation (cont.) Since televisions are cheaper than doctors, send TVs to countries with low life expectancies in order to extend lifetimes. Right? How about considering a lurking variable? That makes more sense… –Countries with higher standards of living have both longer life expectancies and more doctors (and TVs!). –If higher living standards cause changes in these other variables, improving living standards might be expected to prolong lives and increase the numbers of doctors and TVs.

A strong association between variables x and y can reflect any of the following underlying relationships. Causation: changes in x cause changes in y. Example: consuming more calories with no change in physical activity causes weight gain. Common response: both x and y respond to some unobserved variable or variables. Example: there may be a perceived cause and effect between SAT scores and undergraduate GPA, but both variables are likely responding to student knowledge and ability.

Confounding: the effect on y of the explanatory variable x is mixed up with the effects on y of other lurking variables. Example: minority students have lower average SAT scores than white students, but minority students on average grew up in poorer households and attended poorer schools. These socioeconomic variables make cause and effect suspect.

Strong Association

A carefully designed experiment is the best way to get evidence that x causes y. Lurking variables must be kept under control.

Section 4.3 Relations in Categorical Data: Some data are inherently categorical, such as sex, race, and occupation. Categorical data may also be created by grouping quantitative data. Two-way tables hold categorical data.

Two-way table example. Row variable: income. Column variable: age group.

Income           Age group                      Total
0-19,999         4,506     2,738     3,400      10,644
20,000-39,999    8,724     5,622     4,789      19,135
40,000-…,999     12,643    16,893    7,642      37,178
Total            25,873    25,253    15,831     66,957

The totals of the rows and columns are called marginal distributions. The totals may differ slightly from the sums of the table entries due to rounding error. The data may also be represented as percents. Relationships between categorical variables may be calculated from the two-way table. Data may be represented by a bar chart.

Conditional distributions satisfy a certain condition on the table. –Ex. Distribution of income level for year olds. –Ex. Distribution of age for people making $20,000 - $39,999
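
Using the counts from the income-by-age table, marginal and conditional distributions can be computed as in the sketch below. The age-group column labels did not survive in the transcript, so the labels here are placeholders, and the variable names are assumptions for illustration.

```python
# Counts from the income-by-age two-way table (rows: income, columns: age group).
# Row and column labels are placeholders; the original labels are not all recoverable.
income_labels = ["income bracket 1", "income bracket 2", "income bracket 3"]
age_labels = ["age group 1", "age group 2", "age group 3"]
counts = [
    [4506, 2738, 3400],
    [8724, 5622, 4789],
    [12643, 16893, 7642],
]

# Marginal totals: row sums (income) and column sums (age group)
row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
grand_total = sum(row_totals)

# Marginal distribution of income, as proportions of the grand total
income_marginal = [total / grand_total for total in row_totals]

# Conditional distribution of income given the first age group:
# each count in that column divided by the column total
income_given_age1 = [row[0] / col_totals[0] for row in counts]

for label, p in zip(income_labels, income_given_age1):
    print(f"{label}: {p:.1%} of {age_labels[0]}")
```

Comparing the conditional distributions across age groups (rather than the raw counts) is what shows whether income and age are related.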

Example

Outcome     Hospital A     Hospital B     Total
Died        63 (3%)        16 (2%)        79
Survived    2,037 (97%)    784 (98%)      2,821
Total       2,100          800            2,900

Lurking variable: patient condition

            Good condition                Poor condition
Outcome     Hospital A     Hospital B     Hospital A       Hospital B
Died        6 (1%)         8 (1.3%)       57 (3.8%)        8 (4%)
Survived    594 (99%)      592 (98.7%)    1,443 (96.2%)    192 (96%)
Total       600            600            1,500            200
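
The reversal in the hospital example can be checked directly from the counts. The sketch below is illustrative only; the dictionary layout and the function name death_rate are assumptions, while the numbers are taken from the tables.

```python
# (died, survived) counts by hospital and patient condition, from the tables
data = {
    "Hospital A": {"good": (6, 594), "poor": (57, 1443)},
    "Hospital B": {"good": (8, 592), "poor": (8, 192)},
}


def death_rate(died, survived):
    return died / (died + survived)


overall = {}
by_condition = {}
for hospital, groups in data.items():
    # Overall rate: pool both conditions together
    total_died = sum(d for d, _ in groups.values())
    total_survived = sum(s for _, s in groups.values())
    overall[hospital] = death_rate(total_died, total_survived)
    # Rate within each condition separately
    by_condition[hospital] = {c: death_rate(d, s) for c, (d, s) in groups.items()}

for hospital in data:
    print(
        f"{hospital}: overall {overall[hospital]:.1%}, "
        f"good {by_condition[hospital]['good']:.1%}, "
        f"poor {by_condition[hospital]['poor']:.1%}"
    )
```

Hospital A has the higher overall death rate but the lower rate within each condition, because it treats far more patients in poor condition. Patient condition is the lurking variable that reverses the comparison.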