Unit 3 Correlation

Homework Assignment
For the A: 1, 5, 7, 11, 13, 14 - 18, 21, 27 - 32, 35, 37, 39, 41, 43, 45, 47 – 51, 55, 58, 59, 61, 63, 65, 69, R1 – R6
For the C: 1, 7, 13, 21, 32, 35, 39, 43, 47 – 51, 55, 58, 61, 65, 69, 71 – 78, R1 – R6
For the D-: 1, 7, 21, 32, 39, 43, 47 – 51, 55, 58, 65, 69, R1 – R6
All problems must be complete, including explanations in complete sentences and/or work shown where the question asks for it.

The Guessing Game Activity

Suppose we found the age and weight of a sample of 10 adults. Create a scatterplot of the data below. Is there any relationship between the age and weight of these adults?
[Data table: Age, Wt]

Suppose we found the height and weight of a sample of 10 adults. Create a scatterplot of the data below. Is there any relationship between the height and weight of these adults? Is it positive or negative? Weak or strong?
[Data table: Ht, Wt]

The closer the points in a scatterplot are to a straight line, the stronger the relationship. The farther they are from a straight line, the weaker the relationship.

Identify as having a positive association, a negative association, or no association.
1. Heights of mothers & heights of their adult daughters: +
2. Age of a car in years and its current value: -
3. Weight of a person and calories consumed: +
4. Height of a person and the person’s birth month: NO
5. Number of hours spent in safety training and the number of accidents that occur: -

Self Check #14

Correlation Coefficient (r): A quantitative assessment of the strength & direction of the linear relationship between bivariate, quantitative data.
Pearson’s sample correlation is used most.
Parameter: ρ (rho)
Statistic: r
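As a rough sketch of what a calculator does when it reports r, the Python below computes Pearson's sample correlation from its definition. The function name `pearson_r` and the height/weight values are made up for illustration; they are not the data from the slides.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation: r = S_xy / sqrt(S_xx * S_yy)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical heights (in) and weights (lb), for illustration only.
heights = [60, 62, 65, 68, 70, 72]
weights = [115, 120, 140, 155, 165, 180]

r = pearson_r(heights, weights)
print(f"r = {r:.3f}")  # close to +1: strong, positive, linear
```

A value near +1 or -1 indicates a strong linear relationship; the sign gives the direction.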

Calculate r. Interpret r in context.
[Data table: Speed Limit (mph), Avg. # of accidents (weekly)]
There is a strong, positive, linear relationship between speed limit and average number of accidents per week.

Properties of r (correlation coefficient)
Legitimate values of r are in [-1, 1].
[Figure labels: No Correlation, Weak correlation, Moderate Correlation, Strong correlation]

The value of r does not depend on the unit of measurement for either variable.
[Data table: x (in mm), y]
Find r. Change to cm & find r. The correlations are the same.

The value of r does not depend on which of the two variables is labeled x.
[Data table: x, y]
Switch x & y & find r. The correlations are the same.
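The last two properties can be checked numerically. This is a minimal sketch with invented measurements (the slide's own table did not survive); `pearson_r` is a helper written here, not a library function.

```python
import math

def pearson_r(xs, ys):
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical data: x measured in millimetres.
x_mm = [10, 25, 30, 45, 60]
y = [3, 7, 6, 11, 14]

x_cm = [v / 10 for v in x_mm]      # same lengths, different unit
r_mm = pearson_r(x_mm, y)
r_cm = pearson_r(x_cm, y)          # unit change
r_swapped = pearson_r(y, x_mm)     # x and y switched

print(math.isclose(r_mm, r_cm), math.isclose(r_mm, r_swapped))
```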

The value of r is non-resistant.
[Data table: x, y]
Find r. Outliers affect the correlation coefficient.
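To see how much a single outlier can move r, here is a small sketch. The data points and the outlier (7, 1) are invented for illustration.

```python
import math

def pearson_r(xs, ys):
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical, nearly linear data.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.0]
r_without = pearson_r(x, y)

# Add one point far below the pattern (an outlier).
r_with = pearson_r(x + [7], y + [1.0])

print(f"without outlier: r = {r_without:.3f}")
print(f"with outlier:    r = {r_with:.3f}")
```

One added point is enough to turn a strong correlation into a weak one, which is what non-resistant means here.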

Internet Regression Activity

The value of r is a measure of the extent to which x & y are linearly related.
A value of r close to zero does not rule out a strong relationship between x and y.
r = 0, but has a definite relationship!
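One way to produce such a case (an illustrative choice, not necessarily the picture on the slide) is a perfect parabola with x values symmetric about zero: y is completely determined by x, yet r is zero.

```python
import math

def pearson_r(xs, ys):
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# y = x^2 exactly: a definite (but not linear) relationship.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]

r = pearson_r(x, y)
print(f"r = {r:.3f}")
```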

Methodist Ministers vs. Cuban Rum Give handout

Minister data: r = .9999
So does an increase in ministers cause an increase in consumption of rum?

Correlation does not imply causation

Self Check #15

Multiple Choice Test #5

Assignment #6

Least Squares Regression Line (LSRL)

Bivariate data
x-variable: the independent or explanatory variable
y-variable: the dependent or response variable
Use x to predict y.

Least Squares Regression Line (LSRL)
The line that gives the best fit to the data set.
The line that minimizes the sum of the squares of the deviations from the line.

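Points: (0,0), (3,10), (6,2)
Line: y = .5x + 4
(0,0): y = .5(0) + 4 = 4, and 0 – 4 = -4
(3,10): y = .5(3) + 4 = 5.5, and 10 – 5.5 = 4.5
(6,2): y = .5(6) + 4 = 7, and 2 – 7 = -5
Sum of the squares = (-4)² + (4.5)² + (-5)² = 61.25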

Points: (0,0), (3,10), (6,2)
Use a calculator to find the line of best fit.
Find y - ŷ: -3, 6, -3
Sum of the squares = 54
What is the sum of the deviations from the line? Will it always be zero?
The line that minimizes the sum of the squares of the deviations from the line is the LSRL.
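The three points on these slides are small enough to check by machine. The sketch below fits the LSRL with the usual least-squares formulas (slope = S_xy / S_xx, intercept = ȳ - slope · x̄) and compares its sum of squares with that of the line y = .5x + 4. The function names are illustrative.

```python
def fit_lsrl(xs, ys):
    """Return (intercept a, slope b) of the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b

def sum_sq_residuals(xs, ys, a, b):
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

xs, ys = [0, 3, 6], [0, 10, 2]      # points from the slides

a, b = fit_lsrl(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

sse_lsrl = sum_sq_residuals(xs, ys, a, b)          # LSRL
sse_other = sum_sq_residuals(xs, ys, 4.0, 0.5)     # y = .5x + 4

print(f"LSRL: y-hat = {a:.3f} + {b:.3f}x")
print("residuals:", [round(e, 3) for e in residuals])
print("sum of residuals:", round(sum(residuals), 9))
print("sum of squares (LSRL):", round(sse_lsrl, 3))
print("sum of squares (y = .5x + 4):", round(sse_other, 3))
```

The residuals from the LSRL sum to zero, and no other line gives a smaller sum of squares for these points.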

Regression Activity

Interpretations
Slope: For each unit increase in x, there is an approximate average increase/decrease of b in y.
Correlation coefficient: There is a (direction), (strength), linear association between x and y.

The ages (in months) and heights (in inches) of seven children are given.
[Data table: x, y]
Find the LSRL. Interpret the slope and correlation coefficient in the context of the problem.

Correlation coefficient: There is a strong, positive, linear association between the age and height of children.
Slope: For an increase in age of one month, there is an approximate increase of .34 inches in heights of children.

The ages (in months) and heights (in inches) of seven children are given.
[Data table: x, y]
Predict the height of a child who is 4.5 years old. Predict the height of someone who is 20 years old.

Extrapolation: The LSRL should not be used to predict y for values of x outside the data set. It is unknown whether the pattern observed in the scatterplot continues outside this range.
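A short sketch of why the 20-year-old prediction goes wrong. The ages and heights below are invented (the slide's table is not available), and the `predict` helper is hypothetical; it returns the prediction together with a flag saying whether x lies inside the observed range.

```python
def fit_lsrl(xs, ys):
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - b * x_bar, b

# Hypothetical ages (months) and heights (inches) of seven children.
ages_months = [24, 30, 36, 42, 48, 54, 60]
heights_in = [33.0, 35.2, 37.1, 39.3, 41.2, 43.4, 45.3]
a, b = fit_lsrl(ages_months, heights_in)

def predict(age):
    """Return (predicted height, True if age is within the observed data range)."""
    inside = min(ages_months) <= age <= max(ages_months)
    return a + b * age, inside

child_height, child_ok = predict(4.5 * 12)   # 4.5 years = 54 months
adult_height, adult_ok = predict(20 * 12)    # 20 years = 240 months

print(f"4.5 years: {child_height:.1f} in (within data range: {child_ok})")
print(f"20 years:  {adult_height:.1f} in (within data range: {adult_ok})")
```

With these made-up numbers the line predicts a 20-year-old well over eight feet tall, because children do not keep growing at the same rate into adulthood.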

The ages (in months) and heights (in inches) of seven children are given.
[Data table: x, y]
Calculate x̄ & ȳ. Plot the point (x̄, ȳ) on the LSRL. Will this point always be on the LSRL?
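The answer is yes: since the intercept is a = ȳ - b · x̄, plugging x̄ into the line gives a + b · x̄ = ȳ. The sketch below checks this on a few randomly generated data sets (random data chosen purely for illustration).

```python
import math
import random

def fit_lsrl(xs, ys):
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - b * x_bar, b

random.seed(0)
checks = []
for _ in range(5):
    xs = random.sample(range(100), 8)            # 8 distinct x values
    ys = [random.uniform(0, 50) for _ in xs]     # unrelated y values
    a, b = fit_lsrl(xs, ys)
    x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
    # Does the fitted line pass through (x-bar, y-bar)?
    checks.append(math.isclose(a + b * x_bar, y_bar, abs_tol=1e-9))

all_on_line = all(checks)
print(all_on_line)
```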

Disk Game

Regression Assignment

The correlation coefficient and the LSRL are both non-resistant measures.

Formulas – on chart

The following statistics are found for the variables posted speed limit and the average number of accidents. Find the LSRL & predict the number of accidents for a posted speed limit of 50 mph.
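When only summary statistics are given, the LSRL can be built from the standard formulas b = r · (s_y / s_x) and a = ȳ - b · x̄. The sketch below assumes those are the formulas on the chart, and every number in it is a hypothetical placeholder, since the slide's statistics are not reproduced in this transcript.

```python
def lsrl_from_summary(x_bar, s_x, y_bar, s_y, r):
    """Return (intercept a, slope b) from means, standard deviations, and r."""
    b = r * s_y / s_x
    a = y_bar - b * x_bar
    return a, b

# Hypothetical summary statistics (NOT the values from the slide).
x_bar, s_x = 40.0, 15.8    # posted speed limit (mph)
y_bar, s_y = 18.6, 8.4     # average number of accidents (weekly)
r = 0.98

a, b = lsrl_from_summary(x_bar, s_x, y_bar, s_y, r)
predicted_at_50 = a + b * 50

print(f"y-hat = {a:.2f} + {b:.3f}x")
print(f"predicted accidents at 50 mph: {predicted_at_50:.1f}")
```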

Matching Descriptions to Scatter Plots

Self Check #16

Regression Assignment #2

Residuals, Residual Plots, & Influential points

Residual plot: A scatterplot of the (x, residual) pairs. Residuals can be graphed against other statistics besides x.
The purpose is to tell if a linear association exists between the x & y variables.
If no pattern exists between the points in the residual plot, then the association is linear.

Linear Not linear

One measure of the success of knee surgery is post-surgical range of motion for the knee joint following a knee dislocation. Is there a linear relationship between age & range of motion?
[Data table: Age, Range of Motion]
Sketch a residual plot.
Since there is no pattern in the residual plot, there is a linear relationship between age and range of motion.

Plot the residuals against the y-hats. How does this residual plot compare to the previous one?
[Data table: Age, Range of Motion]

Residual plots are the same no matter if plotted against x or y-hat.

Coefficient of determination (r²): Gives the proportion of variation in y that can be attributed to an approximate linear relationship between x & y. It remains the same no matter which variable is labeled x.

Let’s examine r². Suppose you were going to predict a future y but you didn’t know the x-value. Your best guess would be the overall mean of the existing y’s.
[Data table: Age, Range of Motion]
SS y = Sum of the squared residuals (errors) using the mean of y.

Now suppose you were going to predict a future y but you DO know the x-value. Your best guess would be the point on the LSRL for that x-value (y-hat).
[Data table: Age, Range of Motion]
Sum of the squared residuals (errors) using the LSRL. SS y =

By what percent did the sum of the squared error go down when you went from just an “overall mean” model to the “regression on x” model?
SS y = SS y =
This is r²: the amount of the variation in the y-values that is explained by the x-values.
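The same "percent reduction" idea can be tried on the three points used earlier in the LSRL slides, (0,0), (3,10), (6,2), since the knee-surgery values are not reproduced here. This is a sketch; the variable names are illustrative.

```python
xs, ys = [0, 3, 6], [0, 10, 2]      # points from the earlier LSRL slides
n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n

# Fit the LSRL.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
b = s_xy / s_xx
a = y_bar - b * x_bar

# Squared error using only the mean of y ("overall mean" model).
ss_mean = sum((y - y_bar) ** 2 for y in ys)
# Squared error using the LSRL ("regression on x" model).
ss_lsrl = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Proportional reduction in squared error = r^2.
r_squared = (ss_mean - ss_lsrl) / ss_mean

# Cross-check against r computed directly, then squared.
r = s_xy / (s_xx * ss_mean) ** 0.5

print(f"SS using mean: {ss_mean:.1f}")
print(f"SS using LSRL: {ss_lsrl:.1f}")
print(f"r^2 = {r_squared:.4f}  (r squared directly: {r ** 2:.4f})")
```

For these points the squared error drops only from 56 to 54, so x explains very little of the variation in y.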

How well does age predict the range of motion after knee surgery?
Approximately 30.6% of the variation in range of motion after knee surgery can be explained by the linear regression of age and range of motion.

Interpretation of r²
Approximately r²% of the variation in y can be explained by the LSRL of x & y.

Computer-generated regression analysis of knee surgery data:
Predictor    Coef    Stdev    T    P
Constant
Age
s = 10.42    R-sq = 30.6%    R-sq(adj) = 23.7%
What is the equation of the LSRL? Find the slope & y-intercept.
What are the correlation coefficient and the coefficient of determination?
NEVER use adjusted r²!
Be sure to convert r² to decimal before taking the square root!
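A small sketch of the conversion from the printed R-sq to r. The helper name and its `slope_sign` argument are made up for this example: r takes the sign of the slope, and because the coefficient column is not legible in this transcript, a positive slope is assumed below.

```python
import math

def r_from_r_sq_percent(r_sq_percent, slope_sign):
    """Convert R-sq given as a percent to r; r has the same sign as the slope."""
    r_sq = r_sq_percent / 100          # convert to a decimal FIRST
    r = math.sqrt(r_sq)
    return r if slope_sign >= 0 else -r

r_sq_percent = 30.6                    # R-sq from the output (not R-sq(adj))

# Assumption: positive slope (the Age coefficient is missing from the transcript).
r = r_from_r_sq_percent(r_sq_percent, slope_sign=+1)
wrong = math.sqrt(r_sq_percent)        # common mistake: root of the percent

print(f"r = {r:.3f}")
print(f"sqrt of the percent (not a valid r): {wrong:.2f}")
```

Taking the square root of 30.6 directly gives a number larger than 1, which cannot be a correlation.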

Residual Notes and Assignment

Outlier: In a regression setting, an outlier is a data point with a large residual.

Influential point: A point that influences where the LSRL is located. If removed, it will significantly change the slope of the LSRL.

One factor in the development of tennis elbow is the impact-induced vibration of the racket and arm at ball contact.
[Data table: Racket, Resonance (Hz), Acceleration (m/sec/sec)]
Sketch a scatterplot of these data. Calculate the LSRL & correlation coefficient. Does there appear to be an influential point? If so, remove it and then calculate the new LSRL & correlation coefficient.

(189,30) could be influential. Remove & recalculate LSRL

(189,30) was influential since it moved the LSRL
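The remove-and-recalculate step can be sketched as follows. The racket data are not available in this transcript, so the points below, including the far-right point (15, 3.0), are entirely made up to show the effect.

```python
def fit_lsrl(xs, ys):
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - b * x_bar, b

# Hypothetical data; the last point sits far from the others in x.
xs = [1, 2, 3, 4, 5, 15]
ys = [2.0, 2.9, 4.1, 5.0, 6.1, 3.0]

a_with, slope_with = fit_lsrl(xs, ys)
a_without, slope_without = fit_lsrl(xs[:-1], ys[:-1])   # point removed

print(f"with the point:    y-hat = {a_with:.2f} + {slope_with:.3f}x")
print(f"without the point: y-hat = {a_without:.2f} + {slope_without:.3f}x")
```

Removing the single far-right point changes the slope from nearly flat to about 1, so in this made-up set that point is influential.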

Which of these measures are resistant?
LSRL
Correlation coefficient
Coefficient of determination
NONE
NONE: all are affected by outliers.

Self Check #17

Multiple Choice Test #6

Assignment #7