Chapter 7 Scatterplots, Association, and Correlation
Scatterplots A scatterplot displays the relationship between two quantitative variables measured on the same cases. Scatterplots are very common and a very effective way to display relationships and to see patterns and trends.
Examples Relationships between variables are often at the heart of what we would like to learn from data. Are grades actually higher now than they used to be? Do people tend to reach puberty at a younger age than in previous generations? Does applying magnets to parts of the body relieve pain? If so, are stronger magnets more effective? Do students learn better with the use of computer technology? These questions relate two quantitative variables and ask whether there is an association between them.
Unusual Features Be sure to mention any outliers or subgroups
Cartesian Plane Created by René Descartes (1596 – 1650)
Variables x-variable: the explanatory or predictor variable. It accounts for, explains, predicts, or is otherwise responsible for the y-variable. y-variable: the response variable. It is the variable you hope to predict or explain.
Assigning the Variables We want to compare peak period freeway speed to cost per person per year. x = speed and y = cost: the slower you go, the more it costs in delays. x = cost and y = speed: the more you spend on highway improvements, the more the speed would increase.
Determining Variables Do heavier smokers develop lung cancer at younger ages? Is birth order an important factor in predicting future income? Can we estimate a person’s % body fat more simply by just measuring waist or wrist size?
Examples: Describe what the scatterplot might look like. Drug dosage and degree of pain relief. Calories consumed and weight loss. Hours of sleep and score on a test. Shoe size and grade point average. Time for a mile run and age. Age of car and cost of repairs.
Calculator Making scatterplots Naming lists
Correlation measures the strength of the linear association between two quantitative variables. The sign of the correlation coefficient gives the direction of the association. It is always between -1 and 1; a value of exactly -1 or 1 would mean the points fall on a perfect straight line (possible but very rare). Correlation treats x and y symmetrically. It has no units. It is NOT affected by changes in the center or scale of either variable, because it depends only on the z-scores. It measures the strength of ONLY LINEAR association. It is sensitive to outliers: a single value can drastically change the coefficient.
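Since correlation depends on the z-scores, one way to see how it works is to compute it from them directly. The sketch below assumes the usual textbook form, the sum of the products of paired z-scores divided by n - 1; the function names and the five data points are made up for illustration and do not come from the slides.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def sample_sd(values):
    # Sample standard deviation (divides by n - 1).
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def z_scores(values):
    # Standardize: subtract the mean, divide by the standard deviation.
    m, s = mean(values), sample_sd(values)
    return [(v - m) / s for v in values]


def correlation(x, y):
    # r = sum(z_x * z_y) / (n - 1)
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)


# Hypothetical data, chosen only to illustrate the calculation.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = correlation(x, y)
print(f"r = {r:.3f}")

# Swapping x and y gives the same value (correlation is symmetric).
print(f"r with x and y swapped = {correlation(y, x):.3f}")

# Points that fall exactly on a line give r = 1 or r = -1.
print(f"perfect positive line: {correlation(x, [2 * v + 1 for v in x]):.3f}")
print(f"perfect negative line: {correlation(x, [10 - v for v in x]):.3f}")
```

Because the z-scores have no units, neither does r, which is why rescaling either variable leaves it unchanged.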
Correlation Conditions Quantitative Variables Condition: correlation applies only to quantitative variables. Check that you know the variables' units and what they measure. Straight Enough Condition: the correlation coefficient measures the strength of LINEAR association only, so check that the scatterplot is reasonably straight. Outlier Condition: outliers can distort the correlation dramatically. When you see an outlier, you should report the correlation with AND without the outlier.
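The Outlier Condition can be illustrated with a small sketch: compute the correlation once without a suspect point and once with it, then report both. The data and the outlier here are invented for the example, and the `correlation` helper is an illustrative implementation rather than a calculator routine.

```python
from math import sqrt


def correlation(x, y):
    # Pearson correlation from sums of squared and cross deviations.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical, nearly linear data.
x = [1, 2, 3, 4, 5, 6]
y = [1.2, 1.9, 3.1, 4.0, 5.1, 5.9]

# One hypothetical point that falls far from the pattern.
outlier_x, outlier_y = 7, 0.5

r_without = correlation(x, y)
r_with = correlation(x + [outlier_x], y + [outlier_y])

print(f"r without the outlier: {r_without:.3f}")
print(f"r with the outlier:    {r_with:.3f}")
```

In this made-up case a single point pulls a strong positive correlation down to a weak one, which is why both values are worth reporting.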
Checking In Your Statistics teacher tells you that the correlation between the scores (points out of 50) on Exam 1 and Exam 2 was 0.75. Before answering any questions about the correlation, what would you like to see? Why? If she added 10 points to each Exam 1 score, how would this change the correlation? If she standardizes both scores, how will this affect the correlation? In general, if someone does poorly on Exam 1, are they likely to do poorly or well on Exam 2? Explain. If someone does poorly on Exam 1, will they definitely do poorly on Exam 2 as well?
Looking at Association When your blood pressure is measured, it is reported as two values, systolic blood pressure and diastolic blood pressure. How are these variables related to each other? Do they tend to be both high or both low?
Think!! Plan: I'll examine the relationship between two measures of blood pressure. Variables: systolic blood pressure and diastolic blood pressure, both measured in millimeters of mercury. W's: 1406 participants in a health study in Framingham, MA. Plot: create a scatterplot.
Show!! Mechanics: We will calculate the correlation on the calculator. Correlation = 0.792
Tell!! Conclusion: The scatterplot shows a positive direction, with a higher SBP going with a higher DBP. The plot is generally straight with a moderate amount of scatter. The correlation of 0.792 is consistent with what I saw in the scatterplot. A few cases stand out with unusually high SBP compared with their DBP. It seems far less common for the DBP to be high by itself.
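The shifting and standardizing questions can be checked numerically. The sketch below uses invented exam scores (they are not the teacher's data and are not chosen to give a correlation of 0.75) and compares the correlation before and after adding 10 points to Exam 1 and after standardizing both exams.

```python
from math import sqrt


def correlation(x, y):
    # Pearson correlation from sums of squared and cross deviations.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def standardize(values):
    # Convert raw scores to z-scores using the sample standard deviation.
    n = len(values)
    m = sum(values) / n
    s = sqrt(sum((v - m) ** 2 for v in values) / (n - 1))
    return [(v - m) / s for v in values]


# Hypothetical scores out of 50 for eight students.
exam1 = [31, 38, 42, 45, 27, 49, 35, 40]
exam2 = [35, 36, 45, 41, 30, 47, 39, 37]

r_original = correlation(exam1, exam2)
r_shifted = correlation([score + 10 for score in exam1], exam2)
r_standardized = correlation(standardize(exam1), standardize(exam2))

print(f"original:     {r_original:.4f}")
print(f"Exam 1 + 10:  {r_shifted:.4f}")
print(f"standardized: {r_standardized:.4f}")
```

All three printed values agree up to floating-point rounding, since shifting changes only the center and standardizing changes only the center and scale.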