Scatterplots, Association, and Correlation

Week 4, Lecture 1. Chapter 6: Scatterplots, Association, and Correlation

Relationship Between Two Quantitative Variables
Previously, we explored single quantitative variables. Now we look at two quantitative variables and examine their relationship (association). The first tool is a graphical display, the scatterplot, which plots the values of two quantitative variables against each other. Sometimes one variable is an outcome, or response variable, and the other, the explanatory variable, explains that outcome. When examining the relationship between two or more variables, first decide which variables are response (dependent) variables and which are explanatory (independent) variables. To decide, think about the context of the study and the research questions it aims to investigate.

Relationship Between Two Quantitative Variables
Let’s consider a data set on geographic socio-political areas in each state in the United States that includes information about births of low birthweight as a percent of all births (a quantitative variable) and the percent of adults who smoke (a quantitative variable). Research question: Is there a relationship between the percent of low birthweights and the percent of adults who smoke? Response variable: ? Explanatory variable: ?

Relationship Between Two Quantitative Variables
Research question: Is there a relationship between the percent of low birthweights and the percent of adults who smoke? We can also phrase the question as: Does the percent of low birthweights depend on the percent of adults who smoke? In other words, can variation in the percent of adults who smoke explain the variation in the percent of low birthweights? Response variable: percent of low birthweights. Explanatory variable: percent of adults who smoke.

Describing a Scatterplot
[Figure: scatterplot of percent of low birthweights against percent of adults who smoke]
Look for the overall pattern and for striking deviations from that pattern. The overall pattern of a scatterplot can be described by the form, direction, and strength of the relationship. An important kind of deviation is an outlier, an individual value that falls outside the overall pattern. Obtaining the scatterplot in StatCrunch: Graph > Scatter Plot; X variable: Adult Smokers %; Y variable: Low Birthweights %; click Compute.
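For readers without StatCrunch, the idea of plotting one variable against the other can be sketched in plain Python. This is only a rough text-mode illustration: the function name text_scatter is hypothetical, the data values are made up (they are not the state data from the slides), and the sketch assumes the x values and the y values are not all identical.

```python
def text_scatter(xs, ys, width=40, height=12):
    """Return a rough character-grid scatterplot of ys (vertical) against xs (horizontal)."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column/row index inside the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # grid row 0 is the top of the plot
    body = "\n".join("|" + "".join(line) for line in grid)
    return body + "\n+" + "-" * width

# Made-up values for illustration only (NOT the data from the slides).
smokers_pct = [12.0, 15.5, 17.0, 18.5, 20.0, 21.5, 23.0, 25.5]  # explanatory, x-axis
low_bw_pct = [6.5, 7.8, 7.0, 8.4, 7.9, 9.1, 8.2, 9.6]           # response, y-axis

print(text_scatter(smokers_pct, low_bw_pct))
```

The explanatory variable goes on the horizontal axis and the response on the vertical axis, matching the X and Y choices in the StatCrunch steps.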

Describing the Scatterplot of Percent of Low Birthweights and Percent of Adult Smokers
Direction: The points form an upward trend, so the association is positive. Form: Overall, the points follow a trend that is close to a straight line. No curvature is observed in the overall pattern. Strength: There appears to be a moderate linear association. In the context of this study, we interpret the scatterplot as follows: There appears to be a moderate positive linear association between the percent of adults who smoke and the percent of low birthweights. This means that higher percentages of adult smokers are associated with higher percentages of low birthweights.

Correlation Between Two Quantitative Variables
Correlation measures the direction and the strength of the linear association between two quantitative variables. We estimate the correlation in the population based on our sample information, and we denote the estimated correlation with the letter r. r is always between -1 and 1. Values of r near 0 indicate a very weak linear relationship. The strength of the linear relationship increases as r moves away from 0. Values of r close to -1 or 1 indicate that the points lie close to a straight line. r is not resistant: it is sensitive to, and can be strongly affected by, extreme outliers in the data. When you calculate a correlation, it doesn’t matter which variable is 𝑥 (explanatory variable) and which is 𝑦 (response variable).
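The properties listed above can be checked with a small sketch. The correlation helper below is a hand-written illustration of the usual sample correlation, not StatCrunch's implementation, and the data are made up for the example.

```python
import math

def correlation(xs, ys):
    """Sample correlation r between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up, nearly linear data (illustration only).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9, 8.2, 8.8]

r = correlation(x, y)          # close to 1: points lie near a straight line
r_swapped = correlation(y, x)  # same value: the roles of x and y do not matter

# Add one extreme outlier far from the overall pattern.
r_with_outlier = correlation(x + [20], y + [0.0])  # r is not resistant

print(round(r, 3), round(r_swapped, 3), round(r_with_outlier, 3))
```

In this made-up example a single extreme point is enough to turn a correlation near 1 into a negative one, which is why a scatterplot should be inspected before r is interpreted.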

Correlation Between Percent of Low Birthweights and Percent of Adult Smokers
We can use StatCrunch to obtain the estimated correlation: Stat > Summary Stats > Correlation. Select the two variables: Adult Smoker. The correlation between Adult Smokers % and Low Birthweight % is 0.38384072, so r = 0.38. r is positive here because the overall direction of the points is positive; thus, our interpretation of the relationship between the percent of low birthweights and the percent of adult smokers matches the sign of r. Also, r is about 0.38, which indicates a moderate linear association between the two variables.

Correlation Does Not Imply Causation
Important note: Correlation does not imply causation. Why? Because there are other variables (perhaps not included in the study’s data analysis) that could contribute to understanding the variation in the percent of low birthweights. These variables are often called lurking or hidden variables. Can you propose some variables that could contribute to understanding the variation in the percent of low birthweights?

Correlation Does NOT Prove Causation
In many studies of the relationship between two variables, the goal is to establish that changes in the explanatory variable cause changes in the response variable. Even a strong association between two variables does not necessarily imply a causal link between them. Some explanations for an observed association are shown in the slide’s diagram: the dashed double-arrow lines show an association, and the solid arrows show a cause-and-effect link. The variable 𝑥 is explanatory, 𝑦 is response, and 𝑧 is a lurking variable.
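One of these explanations, a lurking variable 𝑧 driving both 𝑥 and 𝑦, can be illustrated with a small simulation. Everything here is an assumption made for the sketch: the data are simulated rather than taken from the slides, the noise levels are arbitrary, and correlation is a hand-written helper. By construction, 𝑥 has no direct effect on 𝑦.

```python
import math
import random

def correlation(xs, ys):
    """Sample correlation r between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

random.seed(0)  # fixed seed so the simulation is repeatable
n = 2000

# z is the lurking variable; x and y each depend on z plus their own noise.
z = [random.gauss(0, 1) for _ in range(n)]
x = [zi + random.gauss(0, 0.5) for zi in z]
y = [zi + random.gauss(0, 0.5) for zi in z]

r_xy = correlation(x, y)  # strong positive association, with no causal link x -> y

# Once z's contribution is removed, what is left of x and y is essentially unrelated.
x_rest = [xi - zi for xi, zi in zip(x, z)]
y_rest = [yi - zi for yi, zi in zip(y, z)]
r_rest = correlation(x_rest, y_rest)

print(round(r_xy, 2), round(r_rest, 2))
```

The strong correlation between x and y in this sketch comes entirely from their shared dependence on z, which is the situation the slide warns about.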

Correlation Does Not Depend On Units of Measurement
Correlation does not depend on the units of measurement of the response and explanatory variables. Why? Consider the relationship between ACT scores and the percent of adult smokers. The units of measurement differ: ACT score is measured in score points and percent of adult smokers in percent. The correlation between Adult Smokers % and ACT is -0.44544125. r is negative here because the overall direction of the points is negative. That means higher ACT scores go with lower percentages of adult smokers. Note: The estimated correlation uses the standardized values of the observations for the response and explanatory variables. Thus, r has no units of measurement.
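The note about standardized values can be made concrete with a short sketch. It assumes the common textbook form of r, the sum of products of standardized values divided by n - 1. The helper names and the data are made up for illustration and are not the values behind the -0.445 reported above.

```python
import math

def standardize(values):
    """Convert values to z-scores using the sample standard deviation (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def correlation(xs, ys):
    """r as the sum of products of standardized values, divided by n - 1."""
    zx = standardize(xs)
    zy = standardize(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)

# Made-up values for illustration only (NOT the data from the slides).
smokers_pct = [14.0, 16.5, 18.0, 20.5, 22.0, 24.5]  # measured in percent
act_score = [23.1, 22.4, 22.8, 21.0, 21.5, 20.2]    # measured in score points

r_percent = correlation(smokers_pct, act_score)

# Change the units of x from percent to proportion; the z-scores are unchanged.
smokers_prop = [p / 100 for p in smokers_pct]
r_proportion = correlation(smokers_prop, act_score)

print(round(r_percent, 4), round(r_proportion, 4))
```

Because standardizing subtracts the mean and divides by the standard deviation, rescaling a variable leaves its z-scores, and therefore r, unchanged apart from floating-point rounding.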