Scatter Diagrams and Linear Correlation

Scatter Diagrams and Linear Correlation
Chapters 1-3: single-variable data. Now: two variables.
Examples of two variables: age of person vs. time to master a cell phone task, grade point average vs. time studying, grade point average vs. time playing video games, amount of smoking vs. rate of lung cancer.
Scatter diagram: (x, y) data plotted as individual points.
x: explanatory variable (independent)
y: response variable (dependent)
A scatterplot of y vs. x values shows the relationship between 2 quantitative variables measured on the same individual.

Scatter Diagrams and Linear Correlation
Look at the overall pattern. Are there any striking deviations (outliers)?
Describe the pattern by:
a) form (linear or curved)
b) direction: positively associated (positive slope) or negatively associated (negative slope)
c) strength: how closely the points follow the form
Examples: age of person vs. time to master a cell phone task, grade point average vs. time studying, grade point average vs. time playing video games, amount of smoking vs. rate of lung cancer.

Degrees of correlation

Scatter Diagrams and Linear Correlation
Tips for drawing a scatterplot
Scale the axes: the intervals along each axis must be uniform; the two axes may use different scales.
Label both axes.
Adopt a scale that uses the entire grid (do not compress the plot into one corner of the grid).

Scatter Diagrams and Linear Correlation
Correlation coefficient (r)
Assesses the strength and direction of the linear relationship between x and y.
Unitless.
-1 ≤ r ≤ 1
r = -1 or 1: perfect correlation (all points exactly on the line).
The closer r is to 1 or -1, the better a line describes the relationship (better fit of the data).
r > 0: positive association (as x increases, y tends to increase).
r < 0: negative association (as x increases, y tends to decrease).
x and y are interchangeable in calculating r.
r does not change if either (or both) variables have unit changes (inches to cm, or °F to °C).
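
The last two properties (x and y are interchangeable, and unit changes leave r unchanged) can be checked numerically. The following is a minimal sketch, assuming the standard sample-correlation definition; the helper name `correlation` and the data values are hypothetical, made up only for illustration.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sample correlation r, using sample standard deviations (n - 1 divisor)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross / ((n - 1) * s_x * s_y)

# Hypothetical measurements (not from the slides): temperature in deg F, height in inches
temp_f = [50, 59, 68, 77, 86]
height_in = [10.0, 12.5, 13.0, 16.0, 18.5]

# Unit changes: deg F to deg C, inches to cm
temp_c = [(f - 32) * 5 / 9 for f in temp_f]
height_cm = [h * 2.54 for h in height_in]

r_original = correlation(temp_f, height_in)
r_converted = correlation(temp_c, height_cm)   # same value after unit changes
r_swapped = correlation(height_in, temp_f)     # same value with x and y switched

print(r_original, r_converted, r_swapped)
```

Up to floating-point rounding, the three printed values agree.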

Linear and non-linear correlations

Scatter Diagrams and Linear Correlation
$r = \frac{1}{n-1} \sum \left( \frac{x - \bar{x}}{s_x} \right) \left( \frac{y - \bar{y}}{s_y} \right)$
Using TI-83: example p. 129 (number of police vs. muggings)
Cautions:
Association does not imply causation.
Lurking variables may play a role.
r is only good for linear models.
Correlation between averages is usually higher than between individual points.
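
The formula for r (the sum of the products of the standardized x and y values, divided by n - 1) can be sketched in code as an alternative to the calculator. This is an illustrative sketch only; the function name and the data are hypothetical and do not come from the textbook example.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """r = (1 / (n - 1)) * sum of (z-score of x) * (z-score of y)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    z_products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                  for x, y in zip(xs, ys)]
    return sum(z_products) / (n - 1)

# Hypothetical data, made up for illustration
hours_studied = [1, 2, 3, 4, 5]
gpa = [2.0, 2.5, 2.9, 3.6, 3.8]

r = correlation(hours_studied, gpa)
print(round(r, 3))

# Points exactly on a line with positive slope give r = 1 (up to rounding)
r_perfect = correlation([1, 2, 3], [2, 4, 6])
print(r_perfect)
```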

Scatter Diagrams and Linear Correlation
Facts
No distinction between the x and y variables: the value of r is unaffected by switching x and y.
Both x and y must be quantitative.
Only good for linear relationships.
Not resistant to outliers.
Correlation (r) is not a complete description of 2-variable data; the means and standard deviations of x and y should also be included.
HW: p. 131: 2, 4, 6, 8 a,b,c, 10 a,b,c, 12 a,b,c. For "c", use a calculator to compute r.

4.2 Least Squares Regression
Method for finding a line (best fit) that summarizes the relationship between 2 variables, x (explanatory) and y (response).
Use the line to predict the value of y for a given x.
Must have a specific response variable y and explanatory variable x (they cannot be switched as with r).

4.2 Least Squares Regression
Least Squares Regression Line (LSRL)
Minimizes the sum of the squared errors (in the y-values): Σ(y - ŷ)²
Error = observed value - predicted value (y is the actual value; ŷ, called "y hat", is the predicted value).
The LSRL of y on x is the line that makes the sum of the squared vertical distances from the data points to the fitted line as small as possible.
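
To see what "as small as possible" means, the sum of squared errors can be compared for the least-squares line and for nearby lines. The sketch below assumes the standard closed-form least-squares slope and intercept; the data and the function name `sse` are hypothetical.

```python
def sse(xs, ys, a, b):
    """Sum of squared errors, sum((y - y_hat)^2), for the line y_hat = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, made up for illustration
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

# Closed-form least-squares slope and intercept
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
     / sum((x - x_bar) ** 2 for x in xs))
a = y_bar - b * x_bar

sse_best = sse(xs, ys, a, b)
sse_shifted = sse(xs, ys, a + 0.5, b)   # same slope, intercept moved
sse_tilted = sse(xs, ys, a, b + 0.2)    # same intercept, slope changed

print(sse_best, sse_shifted, sse_tilted)
```

Any change to the intercept or slope increases the sum of squared errors.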

4.2 Least Squares Regression
LSRL equation: ŷ = a + bx
ŷ: predicted value of y
Slope: b = r(s_y / s_x)
y-intercept: a = ȳ - b x̄
x̄ and ȳ are the means of the x and y data, respectively; the point (x̄, ȳ) is on the LSRL.
s_x and s_y are the standard deviations of the x and y data.
r is the correlation.
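
The slope and intercept formulas can be applied directly from the summary statistics. The following is a rough sketch, with a hypothetical function name `lsrl` and made-up data lying exactly on y = 1 + 2x so the result is easy to check by hand.

```python
from statistics import mean, stdev

def lsrl(xs, ys):
    """Return (a, b) for y_hat = a + b*x using b = r * s_y / s_x and a = y_bar - b * x_bar."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)
    b = r * s_y / s_x
    a = y_bar - b * x_bar
    return a, b

# Hypothetical data on the line y = 1 + 2x
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]

a, b = lsrl(xs, ys)
print(a, b)  # approximately 1 and 2

# The point (x_bar, y_bar) lies on the fitted line
y_hat_at_mean = a + b * mean(xs)
print(y_hat_at_mean, mean(ys))
```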

4.2 Least Squares Regression
TI-83: enter the data into L1, L2 (x, y). Use STAT CALC and select #8: LinReg(a+bx) to get the best-fit line.
Slope: important for interpreting the data; it is the predicted change in y for each one-unit increase in x.
Intercept: may not be practically important for the problem.

4.2 Least Squares Regression
Plot the LSRL: using the formula ŷ = a + bx, find 2 points on the line, (x1, ŷ1) and (x2, ŷ2). Make sure x1 and x2 are near opposite ends of the data.
Influential observations and outliers
Influential: extreme in the x-direction; removing an influential point changes the LSRL significantly.
Outlier: extreme in the y-direction; removing it often does not change the LSRL significantly.
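
The difference between an influential point and an outlier in the y-direction can be illustrated by refitting the line after adding each kind of point. This is only a sketch with hypothetical data: the base points lie exactly on y = 1 + x, the y-outlier is placed at the mean of x, and the influential point is placed far out in x.

```python
def fit_line(points):
    """Return (a, b) for the least-squares line y_hat = a + b*x."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    s_xx = sum((x - x_bar) ** 2 for x, _ in points)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b

# Hypothetical base data on the line y = 1 + x
base = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7)]

with_y_outlier = base + [(3.5, 15)]    # extreme in y, ordinary in x
with_influential = base + [(20, 2)]    # extreme in x

a_base, b_base = fit_line(base)
a_outlier, b_outlier = fit_line(with_y_outlier)
a_influential, b_influential = fit_line(with_influential)

print("base:       ", a_base, b_base)
print("y-outlier:  ", a_outlier, b_outlier)
print("influential:", a_influential, b_influential)
```

In this particular setup the y-outlier shifts the intercept but leaves the slope unchanged, while the point that is extreme in x pulls the slope far from 1. Other placements of the outlier would change the slope somewhat.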

Coefficient of Determination
r²: coefficient of determination
r describes the strength and direction of a straight-line relationship.
r² is the fraction of the variation in the values of y that is explained by the LSRL of y on x.
r = 1 (or -1), r² = 1: perfect correlation; 100% of the variation in y is explained by the LSRL.
r = 0.7, r² = 0.49: about 49% of the variation in y is explained by the LSRL.
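
The interpretation of r² as the fraction of variation explained can be checked by computing that fraction directly, as 1 minus the ratio of the squared errors about the LSRL to the total squared variation of y about its mean. This is a sketch under that standard definition; the function name and the data are hypothetical.

```python
def r_and_explained_fraction(xs, ys):
    """Return (r, fraction of the variation in y explained by the least-squares line)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)   # total variation in y

    r = s_xy / (s_xx * s_yy) ** 0.5
    b = s_xy / s_xx
    a = y_bar - b * x_bar

    # Variation left unexplained: squared errors about the fitted line
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return r, 1 - sse / s_yy

# Hypothetical data, made up for illustration
xs = [1, 2, 3, 4, 5, 6]
ys = [2.0, 2.9, 4.5, 4.8, 6.9, 7.0]

r, explained = r_and_explained_fraction(xs, ys)
print(r ** 2, explained)  # the two values agree up to rounding
```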

Residuals
Residual: difference between the observed value and the predicted value.
Residual = y - ŷ
The mean of the least-squares residuals is 0.
Residual plot: scatterplot of the regression residuals against the explanatory variable (x). Useful in assessing the fit of the regression line, i.e. do we have a straight line?
Uniform scatter: a linear model fits.
Curved pattern: the relationship is not linear.
Spread that increases as x increases: prediction of y will be less accurate for larger x (spread that decreases: less accurate for smaller x).
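
A short sketch of how residuals expose a curved relationship: fitting a line to hypothetical data that follow y = x² gives residuals that average 0 but follow a clear pattern (positive at both ends, negative in the middle). The data and the function name are made up for illustration.

```python
def residuals_from_lsrl(xs, ys):
    """Fit y_hat = a + b*x by least squares and return the residuals y - y_hat."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    return [y - (a + b * x) for x, y in zip(xs, ys)]

# Hypothetical curved data: y = x squared
xs = [1, 2, 3, 4, 5]
ys = [1, 4, 9, 16, 25]

residuals = residuals_from_lsrl(xs, ys)
print(residuals)                         # [2.0, -1.0, -2.0, -1.0, 2.0]
print(sum(residuals) / len(residuals))   # 0.0
```

Plotting these residuals against x would show a U shape rather than uniform scatter, which signals that a straight line is not the right model for this data.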