Chapter 7 Scatterplots, Association, and Correlation Stats: modeling the world Second edition Raymond Dahlman IV.


1 Chapter 7 Scatterplots, Association, and Correlation Stats: modeling the world Second edition
Raymond Dahlman IV

2 Scatterplots Scatterplots are an effective way of displaying data and let us see whether there is an association between two quantitative variables. When looking at a scatterplot, we look for the direction, form, and strength of the relationship, and for any unusual features such as outliers.

3 Features of Scatterplots
Direction The direction of the pattern tells us whether the variables have a positive association (y tends to increase as x increases) or a negative association (y tends to decrease as x increases).

4 Features of Scatterplots
Form The form, or overall shape of the data, tells us how the two variables are related: linearly or by a non-linear function.

5 Features of Scatterplots
Strength The strength of the association describes how closely the data points cluster around the overall form, such as the line of best fit. The less scatter there is, the stronger the association.

6 Determining Variables
When presented with two variables, it is necessary to assign each one a role. The variable that you believe determines the other is the independent, or explanatory, variable. The other is the dependent, or response, variable. When graphing, the explanatory variable goes on the x-axis and the response variable goes on the y-axis.

7 Correlation Correlation measures the strength of the linear association between two quantitative variables. It is summarized by the correlation coefficient, $r$, computed from the z-scores of the two variables: $r = \frac{\sum z_x z_y}{n - 1}$
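The formula on this slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the original presentation: the function name `correlation` and the data are made up, and it assumes the sample standard deviation (dividing by n − 1), which is what makes the n − 1 in the formula work out.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Return r = sum(z_x * z_y) / (n - 1) for paired data."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)  # sample standard deviations (n - 1 divisor)
    zx = [(x - mx) / sx for x in xs]  # z-scores of the explanatory variable
    zy = [(y - my) / sy for y in ys]  # z-scores of the response variable
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)

# Made-up illustration data, not taken from the slides.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 75]
r = correlation(hours, scores)
print(round(r, 3))
```

Points that lie exactly on an upward-sloping line, such as `correlation([1, 2, 3], [2, 4, 6])`, give r = 1 up to floating-point rounding.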

8 Correlation Conditions
Because the correlation coefficient only measures linear association, the data must satisfy these conditions before we use it.
The Quantitative Variables Condition: Both variables must be quantitative, with known units.
The Straight Enough Condition: The scatterplot must show a roughly linear form. If it does not, there are methods to straighten the data.
The Outlier Condition: Be aware of outliers, which can strongly distort the correlation coefficient.
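To see why the Outlier Condition matters, here is a small sketch with made-up data (not from the slides) in which a single unusual point changes the sign of r. The helper `correlation` is a hypothetical name that applies the z-score formula with sample standard deviations.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation coefficient from z-scores: sum(z_x * z_y) / (n - 1)."""
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    return sum((x - mx) / sx * (y - my) / sy for x, y in zip(xs, ys)) / (len(xs) - 1)

# Made-up data with a clear positive, roughly linear trend.
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 5, 4, 6, 7]
r_clean = correlation(x, y)

# The same data plus one outlier far from the rest of the points.
r_outlier = correlation(x + [20], y + [1])

print(f"without outlier: r = {r_clean:.2f}")
print(f"with outlier:    r = {r_outlier:.2f}")
```

With these numbers the correlation is strongly positive without the outlier and negative with it, which is why the scatterplot should be checked before r is reported.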

9 Correlation Properties
Correlation always satisfies −1 ≤ r ≤ 1, and the sign of r tells us whether the association is negative or positive.
When r is equal to 1 or −1, all the data points fall exactly on a single straight line, a rare occurrence.
A correlation near 0 corresponds to a weak linear association.
Correlation treats the variables symmetrically: the correlation of x with y is equal to the correlation of y with x.
Correlation depends only on the z-scores, so it has no units and does not change when the units of either variable are changed.
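These properties can be checked numerically. The sketch below uses made-up temperature and sales figures (an assumption for illustration only) and a hypothetical helper `correlation` built from the z-score formula. It swaps the variables, converts Celsius to Fahrenheit, and negates x to show what does and does not change r.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation coefficient from z-scores: sum(z_x * z_y) / (n - 1)."""
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    return sum((x - mx) / sx * (y - my) / sy for x, y in zip(xs, ys)) / (len(xs) - 1)

# Made-up illustration data.
temps_c = [10, 15, 20, 25, 30]
sales = [12, 18, 21, 30, 33]
temps_f = [c * 9 / 5 + 32 for c in temps_c]  # same temperatures, new units

r_xy = correlation(temps_c, sales)
r_yx = correlation(sales, temps_c)        # roles of x and y swapped
r_units = correlation(temps_f, sales)     # units of x changed
r_flipped = correlation([-c for c in temps_c], sales)  # x negated

print(f"r(x, y) = {r_xy:.4f}, r(y, x) = {r_yx:.4f}")
print(f"after unit change: {r_units:.4f}")
print(f"after negating x:  {r_flipped:.4f}")
```

Swapping the variables or changing units leaves r the same, while negating one variable reverses the direction of the association and so flips the sign of r.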

10 Straightening Scatterplots
When a scatterplot shows a bent form that consistently increases or decreases, we can often straighten the form of the plot by re-expressing one or both variables (for example, by taking a logarithm), enabling us to apply the correlation coefficient.

11 Problem #33 People who responded to a July 2004 Discovery Channel poll named the 10 best roller coasters in the United States. The table below shows the length of the initial drop (in feet) and the duration of the ride (in seconds). What do these data indicate about the height of a roller coaster and the length of the ride you can expect?

12 Make a Scatterplot

13
      Drop (ft)   Duration (s)
μ     171         133
σ     74.75       51.33
After making a scatterplot, we find the mean and standard deviation of each variable. Then we convert each data value into a z-score using $z = \frac{x - \mu}{\sigma}$. Plugging the z-scores into the correlation coefficient equation gives an r value of about 0.35. The overall trend is a weak positive association between the drop length and the duration of the ride.

14 Problem #35 Is there any pattern to the locations of the planets in our solar system? The table shows the average distance of each of the nine planets from the sun. a) Make a scatterplot and describe the association. (Remember: direction, form, and strength!)  b) Why would you not want to talk about the correlation between planet position and distance from the sun?  c) Make a scatterplot showing the logarithm of distance vs. position. What is better about this scatterplot? 

15 A: The association between position and distance is positive and non-linear (curved). It is strong: there is very little scatter in the data.
B: The relation is not linear, so it fails the Straight Enough Condition and correlation is not an appropriate summary.

16 C: The relation between the position and the log of the distance appears to be roughly linear.
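The re-expression in part C can be sketched as follows. The problem's table is not reproduced in this transcript, so the distances below are approximate average distances from the sun in millions of miles, supplied here as an assumption for illustration. The helper `correlation` is a hypothetical name that applies the z-score formula.

```python
from math import log10
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation coefficient from z-scores: sum(z_x * z_y) / (n - 1)."""
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    return sum((x - mx) / sx * (y - my) / sy for x, y in zip(xs, ys)) / (len(xs) - 1)

# Approximate distances (millions of miles), Mercury through Pluto.
# Assumed values; they are not copied from the slide's table.
position = list(range(1, 10))
distance = [36, 67, 93, 142, 484, 887, 1784, 2796, 3666]
log_distance = [log10(d) for d in distance]  # re-expressed response variable

r_raw = correlation(position, distance)
r_log = correlation(position, log_distance)

print(f"position vs distance:      r = {r_raw:.2f}")
print(f"position vs log(distance): r = {r_log:.2f}")
```

With these values r is already fairly high for the raw distances even though the plot is clearly curved, a reminder that a large r does not confirm a linear form. After taking logarithms the points fall much closer to a straight line, so correlation becomes a reasonable summary.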

