Scatterplots, by Wendy Knight. Presentation transcript:
Review of Scatterplots. Scatterplots show the relationship between two quantitative variables measured on the same individuals. They cannot be made with categorical data. Explanatory variable: explains or causes changes in the response variable (plotted on the x-axis). Response variable: measures an outcome or result of a study (plotted on the y-axis). If there is no explanatory-response distinction, either variable can go on either axis.
Example: Label the graph. Label the axes. Use no breaks in the graph. Plot the points.
Example 2: Label the graph. Label the axes. Use no breaks in the graph. Plot the points.
Interpreting Scatterplots. Look for the overall pattern. Describe the overall pattern using the DIRECTION, FORM, and STRENGTH of the relationship. Look for outliers.
Direction. Positively associated: slopes upward from left to right. Negatively associated: slopes downward from left to right. No association: the points show no upward or downward trend. [Figure panels: Positive, Negative, No Correlation, Positive]
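One rough way to check direction numerically is to look at the sign of the summed products of deviations from the means. The sketch below assumes that approach; the function name `direction` and all of the data values are made up for illustration and do not come from the slides.

```python
from statistics import mean

def direction(xs, ys):
    """Classify the association by the sign of sum((x - x_bar) * (y - y_bar))."""
    x_bar, y_bar = mean(xs), mean(ys)
    s = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if s > 0:
        return "positive"   # points tend to slope upward, left to right
    if s < 0:
        return "negative"   # points tend to slope downward, left to right
    return "none"           # exactly zero is rare with real data

# Hypothetical data, for illustration only.
hours = [1, 2, 3, 4, 5]
earned = [10, 20, 30, 40, 50]      # rises with hours
remaining = [50, 40, 30, 20, 10]   # falls with hours
flat = [3, 1, 2, 1, 3]             # no upward or downward trend

print(direction(hours, earned))
print(direction(hours, remaining))
print(direction(hours, flat))
```

With real data the sum is almost never exactly zero, so a plot is still the better guide for deciding that there is no association.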
Form: Linear or Non-linear (Quadratic, Exponential, Trigonometric). Strength: how closely the points follow the form.
Example: The scatter plot below shows a relationship between hours worked and money earned. Which best describes the relationship between the variables?
Review: Scatterplots display the direction, form, and strength of the relationship between two variables. Straight-line relationships are simple and common patterns. A straight-line relationship is strong if the points lie close to the line and weak if they are widely scattered.
Thought Question 1: Use the following two pictures to speculate on what influence outliers have on correlation. For each picture, do you think the correlation is higher or lower than it would be without the outlier? (Hint: Correlation measures how closely points fall to a straight line.)
Thought Question 2: A strong correlation has been found in a certain city in the northeastern United States between weekly sales of hot chocolate and weekly sales of facial tissues. Would you interpret that to mean that hot chocolate causes people to need facial tissues? Explain.
Thought Question 3: Researchers have shown that there is a positive correlation between the average fat intake and the breast cancer rate across countries. In other words, countries with higher fat intake tend to have higher breast cancer rates. Does this correlation prove that dietary fat is a contributing cause of breast cancer? Explain.
Thought Question 4: If you were to draw a scatterplot of number of women in the work force versus number of Christmas trees sold in the United States for each year between 1930 and the present, you would find a very strong correlation. Why do you think this would be true? Does one cause the other?
How can we counteract this? We can quantify the correlation with a numerical value. Find the r value with the outlier and without it.
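The comparison can be tried on small data sets. The sketch below assumes r is the usual Pearson correlation coefficient, computed by hand from standardized scores; the helper name `pearson_r` and every data point are hypothetical, chosen so that one outlier lowers r and another raises it.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Pearson correlation: average product of standardized scores (n - 1 divisor)."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    n = len(xs)
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (n - 1)

# Case 1 (made-up data): points on a perfect line, plus one outlier off the line.
xs1, ys1 = [1, 2, 3, 4, 5], [2, 4, 6, 8, 10]
r_without_1 = pearson_r(xs1, ys1)
r_with_1 = pearson_r(xs1 + [6], ys1 + [0])      # outlier at (6, 0)

# Case 2 (made-up data): a patternless cluster, plus one far-away outlier.
xs2, ys2 = [1, 2, 1, 2], [1, 1, 2, 2]
r_without_2 = pearson_r(xs2, ys2)
r_with_2 = pearson_r(xs2 + [10], ys2 + [10])    # outlier at (10, 10)

print(f"Case 1: r without outlier = {r_without_1:.3f}, with outlier = {r_with_1:.3f}")
print(f"Case 2: r without outlier = {r_without_2:.3f}, with outlier = {r_with_2:.3f}")
```

In the first case the outlier pulls r down from 1 to about 0.14; in the second it pushes r up from 0 to about 0.98. A single point can move r in either direction.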
Correlation "r". Correlation describes the direction and strength of a straight-line relationship between two quantitative variables. Correlation is usually written as r. Positive r indicates a positive association between the variables. Negative r indicates a negative association. r always falls between -1 and 1. Because r uses standardized scores, the correlation does not change when we change units of measurement. Correlation ignores the distinction between explanatory and response variables. Correlation measures the strength of only straight-line association between two variables. Correlation is strongly affected by a few outlying observations.
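Several of these properties can be checked with a short calculation. The sketch below assumes the standard formula r = (1 / (n - 1)) * sum(z_x * z_y), where z_x and z_y are standardized scores; the helper name `pearson_r`, the data values, and the miles-to-kilometers conversion are illustrative assumptions, not part of the slides.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """r = (1 / (n - 1)) * sum of products of standardized scores."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    n = len(xs)
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

r_xy = pearson_r(xs, ys)             # about 0.8: positive association
r_yx = pearson_r(ys, xs)             # swapping the variables leaves r unchanged

# Changing units (here, miles to kilometers) leaves r unchanged.
xs_km = [x * 1.609344 for x in xs]
r_km = pearson_r(xs_km, ys)

# A perfect curved relationship (y = x squared) can still have r of zero,
# because r measures only straight-line association.
curve_x = [-2, -1, 0, 1, 2]
curve_y = [x * x for x in curve_x]
r_curve = pearson_r(curve_x, curve_y)

print(round(r_xy, 3), round(r_yx, 3), round(r_km, 3))
print(abs(round(r_curve, 3)))
```

The last result is a reminder to look at the scatterplot first: an r near zero rules out a straight-line pattern, not every pattern.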
Example 1: Highway Deaths and Speed Limits. The correlation between death rate and speed limit is 0.55. If Italy is removed, the correlation drops to 0.098. If Britain is then removed, the correlation jumps to 0.70.