Describing Relationships


Describing Relationships 3.1 Scatterplots

Questions To Ask What individuals do the data describe? What are the variables? How are they measured? Are all of the variables quantitative or is at least one a categorical variable? What did we do before? Plot the data. Describe the overall distribution (SOCS) Look at numerical summaries Check for Normality

Explanatory vs. Response Explanatory variable: predicts changes in the outcome (also called domain, independent, x, input, cause). Response variable: the outcome (also called range, dependent, y, output, effect).

Example p. 144 – Explanatory or Response? Linking SAT Math and Critical Reading Scores Julie asks, “Can I predict a state’s mean SAT Math score if I know its mean SAT Critical Reading score?” Jim wants to know how the mean SAT Math and Critical Reading scores this year in the 50 states are related to each other. For each student, identify the explanatory variable and the response variable if possible. Julie is treating the mean SAT Critical Reading score as the explanatory variable and the mean SAT Math score as the response variable. Jim is just interested in exploring the relationship between the two variables, so there are no clear explanatory and response variables.

Be careful with “cause”. Just because two variables have a relationship does not mean one causes the other!

Scatterplots A scatterplot shows the relationship between two quantitative variables measured on the same individuals. One variable goes on the horizontal axis, the other on the vertical. (The eXplanatory variable goes on the x-axis.) Each individual is represented by a point on the plot. We had several ways to plot one-variable distributions. The scatterplot is the standard way to plot two quantitative variables.

How to make a Scatterplot 1. Decide which variable should go on each axis. 2. Label and scale your axes. 3. Plot individual data values. Many students lose credit because they do not label their graphs. Including the proper labels is more important than graphing each point in precisely the right place. Pick nice values to mark each axis. The axes will not always start at 0.

Example p. 148 – The Endangered Manatee The identified point represents the year 1996. In 1996, there were 732,000 powerboat registrations in Florida. That year, 60 manatees were killed by boats. Powerboats registered in Florida (1000s) and number of manatees killed from 1977 to 2010.

Describing Scatterplots - FODS Form – One big group? Clusters? Linear? Curved? Outliers – Any points that deviate significantly from the overall pattern. Direction – Positively associated (+ slope) or negatively associated (- slope). Strength – How closely do the points follow the overall pattern? When we were describing one-variable distributions we used SOCS. Now that we are describing two quantitative variables, we will use FODS.

Example p. 148 – The Endangered Manatee Form – Overall linear pattern Outliers – No clear outliers Direction – Positive association Strength – Fairly strong USE MODIFIERS! Don’t freak about strength. We will discuss this more later.

Example p. 149 Form – Roughly linear with two clusters Outliers – No clear outliers Direction – Positive association Strength – Fairly strong Scatterplot shows the relationship between the duration of an eruption and the time until the next eruption. Why are there clusters? There seem to be a lot of shorter eruptions that last around 2 minutes and longer eruptions that last around 4.5 minutes.

Adding Categorical Variables To add categorical variables, use different types of marks (●, ○, □, +) for your points. WV, GA, and SC have lower SAT scores than we would expect. DC is a city rather than a state.

Which one is stronger? Our eyes are not always reliable when looking to see how strong a linear relationship is. This is why we rely on a number to help us.

Measuring Linear Association: Correlation The correlation r measures the direction and strength of the linear relationship between two quantitative variables. r is always a number between -1 and 1. r > 0 indicates a positive association. r < 0 indicates a negative association. Values of r near 0 indicate a very weak linear relationship. The strength of the linear relationship increases as r moves away from 0 towards -1 or 1. r = -1 and r = 1 occur only in the case of a perfect linear relationship.

Correlation Practice For each graph, estimate the correlation r and interpret it in context. Answer choices: 1. r ≈ -0.6 2. r ≈ -0.1 3. r ≈ 0.3 4. r ≈ 0.9 5. r ≈ -0.8 6. r ≈ 0.5 (a) Choice 4, r ≈ 0.9: pretty strong, positive relationship.

Correlation Practice (same answer choices) (b) Choice 6, r ≈ 0.5: moderate, positive relationship.

Correlation Practice (same answer choices) (c) Choice 3, r ≈ 0.3: weak, positive relationship.

Correlation Practice (same answer choices) (d) Choice 2, r ≈ -0.1: weak, negative relationship.

Example, p. 153 r = 0.936 Interpret the value of r in context. The correlation of 0.936 confirms what we see in the scatterplot; there is a strong, positive linear relationship between points per game and wins in the SEC.

Example, p. 153 r = 0.936 The point highlighted in red on the scatterplot is Mississippi. What effect does Mississippi have on the correlation? Justify your answer. Mississippi makes the correlation closer to 1 (stronger). If Mississippi were not included, the remaining points wouldn’t be as tightly clustered in a linear pattern.

Calculating Correlation How to Calculate the Correlation r Suppose that we have data on variables x and y for n individuals. The values for the first individual are x_1 and y_1, the values for the second individual are x_2 and y_2, and so on. The means and standard deviations of the two variables are x-bar and s_x for the x-values and y-bar and s_y for the y-values. The correlation r between x and y is:

$$ r = \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right) $$

Notice what the formula has in it: z-scores. Each factor in the sum is the standardized value of x_i or y_i.
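The z-score calculation described on this slide can be sketched in Python. This is a minimal illustration, not a library routine: the function name `correlation` and the small data set are made up for the example, and the sample standard deviation (dividing by n - 1) is assumed.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r: sum of products of z-scores, divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    z_products = [
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)
    ]
    return sum(z_products) / (n - 1)

# Made-up data for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation(xs, ys)
print(round(r, 3))  # prints 0.775

# A perfect linear relationship gives r = 1 (up to rounding error)
r_line = correlation([1, 2, 3], [2, 4, 6])
```

Points that are above average in both x and y (or below average in both) contribute positive products, which is why a positive association produces r > 0.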

Facts About Correlation Correlation makes no distinction between explanatory and response variables. r does not change when we change the units of measurement of x, y, or both. The correlation r itself has no unit of measurement.
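These facts can be checked numerically. The sketch below uses made-up heights and weights (not data from the text) and a hypothetical helper `correlation`; it swaps the roles of x and y and then converts the units, and r stays the same each time.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r computed from z-scores (hypothetical helper)."""
    x_bar, y_bar, s_x, s_y = mean(xs), mean(ys), stdev(xs), stdev(ys)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (len(xs) - 1)

# Made-up heights (inches) and weights (pounds)
heights_in = [60, 62, 65, 68, 70, 72]
weights_lb = [110, 120, 140, 155, 160, 180]

r_original = correlation(heights_in, weights_lb)

# Fact 1: swapping explanatory and response does not change r
r_swapped = correlation(weights_lb, heights_in)

# Fact 2: changing units (inches to cm, pounds to kg) does not change r
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.4536 for w in weights_lb]
r_metric = correlation(heights_cm, weights_kg)

print(r_original, r_swapped, r_metric)
```

The units cancel because each value is standardized before the products are added, which is also why r itself has no unit.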

Cautions Correlation requires that both variables be quantitative. Correlation does not describe curved relationships between variables, no matter how strong the relationship is. Because of this, a value of r close to -1 or 1 does not guarantee a linear relationship. Correlation is not resistant. r is strongly affected by a few outlying observations. Correlation is not a complete summary of two-variable data.
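To see what "not resistant" means, here is a small sketch with made-up data: six points on a perfect line, then the same points plus a single outlier. The helper `correlation` and all values are assumptions for illustration.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r computed from z-scores (hypothetical helper)."""
    x_bar, y_bar, s_x, s_y = mean(xs), mean(ys), stdev(xs), stdev(ys)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (len(xs) - 1)

# Six made-up points that fall exactly on the line y = x + 1
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 3, 4, 5, 6, 7]
r_without = correlation(xs, ys)

# Add one outlying point far below the pattern
xs_outlier = xs + [7]
ys_outlier = ys + [0]
r_with = correlation(xs_outlier, ys_outlier)

print(round(r_without, 3), round(r_with, 3))
```

One unusual point is enough to pull r from a perfect 1 down to a weak value, so always check the scatterplot for outliers before trusting r.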

This set of data has a correlation close to -1, but we can see a slight curve in the scatterplot. Always plot your data!
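The same thing can be reproduced with made-up numbers. The sketch below does not use the data set from the slide; it assumes points that lie exactly on the curve y = -x² and a hypothetical helper `correlation`.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r computed from z-scores (hypothetical helper)."""
    x_bar, y_bar, s_x, s_y = mean(xs), mean(ys), stdev(xs), stdev(ys)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (len(xs) - 1)

# Made-up data on a curve, not a line: y = -x^2 for x = 1, ..., 10
xs = list(range(1, 11))
ys = [-(x ** 2) for x in xs]

r_curve = correlation(xs, ys)
print(round(r_curve, 3))  # prints -0.975

# On a straight line, consecutive differences in y would all be equal
steps = [ys[i + 1] - ys[i] for i in range(len(ys) - 1)]
is_linear = len(set(steps)) == 1
print(is_linear)  # prints False
```

The value of r is very close to -1 even though no straight line fits these points exactly, which is why the scatterplot comes first.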

Example p. 157 – Why correlation doesn’t tell the whole story Scoring Figure Skaters. Until a scandal at the 2002 Olympics brought change, figure skating was scored by judges on a scale from 0.0 to 6.0. The scores were often controversial. We have the scores awarded by two judges, Pierre and Elena, for many skaters. How well do they agree? We calculate that the correlation between their scores is r = 0.9. But the mean of Pierre’s scores is 0.8 point lower than Elena’s mean. These facts don’t contradict each other. They simply give different kinds of information. The mean scores show that Pierre awards lower scores than Elena. But because Pierre gives every skater a score about 0.8 point lower than Elena does, the correlation remains high. Adding the same number to all values of either x or y does not change the correlation. If both judges score the same skaters, the competition is scored consistently because Pierre and Elena agree on which performances are better than others. The high r shows their agreement. But if Pierre scores some skaters and Elena others, we should add 0.8 point to Pierre’s scores to arrive at a fair comparison.
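A quick numerical check of this example, using made-up scores (the actual scores are not given in the text). The judges' values and the helper `correlation` are assumptions chosen so that Pierre's mean is 0.8 point below Elena's.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r computed from z-scores (hypothetical helper)."""
    x_bar, y_bar, s_x, s_y = mean(xs), mean(ys), stdev(xs), stdev(ys)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (len(xs) - 1)

# Made-up scores for six skaters
elena = [5.6, 5.2, 4.9, 5.8, 5.0, 5.5]
pierre = [4.9, 4.3, 4.2, 4.9, 4.1, 4.8]

mean_gap = mean(elena) - mean(pierre)  # about 0.8
r_before = correlation(pierre, elena)

# Add 0.8 to every one of Pierre's scores
pierre_adjusted = [score + 0.8 for score in pierre]
r_after = correlation(pierre_adjusted, elena)
adjusted_gap = mean(elena) - mean(pierre_adjusted)  # about 0

print(round(mean_gap, 2), round(adjusted_gap, 2))
print(abs(r_before - r_after) < 1e-9)  # prints True
```

Shifting Pierre's scores removes the gap between the means but leaves r unchanged, so the mean and the correlation really are reporting different things.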

HW Due: Friday p. 159 # 5, 7, 15, 17, 21, 27, 28