2 Some Vocabulary
Response Variable
  Measures an outcome of a study
  AKA dependent variable
Explanatory Variable
  Attempts to explain the observed outcomes
  AKA independent variable
Scatterplot
  Shows the relationship between two quantitative variables measured on the same individuals
4 Scatterplots
Examining
  Look for overall pattern and any deviations
  Describe pattern with form, strength, and direction
Drawing
  Uniformly scale the vertical and horizontal axes
  Label both axes
  Adopt a scale that uses the entire available grid
Categorical Variables
  Add a different color/shape to distinguish between categorical variables
Classwork: p125 #3.7
Homework: #3.16, 3.22 and 3.2 Blueprint
6 Correlation
  Measures the direction and strength of the linear relationship between two quantitative variables
7 Facts About Correlation
  Makes no distinction between explanatory and response variables
  Requires that both variables be quantitative
  Does not change when we change the units of measurement
  Sign of r indicates positive or negative association
  r is always between -1 and 1, inclusive
  Only measures the strength of linear relationships
  Is not resistant (it is strongly affected by outliers)
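Several of these facts can be checked numerically. The sketch below is only an illustration: it assumes the usual textbook formula for r (the sum of the products of the standardized x and y values, divided by n - 1), and the height and weight data and the `correlation` helper are hypothetical, not taken from the slides.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Pearson r: sum of products of z-scores, divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical data: heights in inches, weights in pounds
heights_in = [60, 62, 65, 68, 70, 72]
weights_lb = [115, 120, 135, 150, 160, 175]

r = correlation(heights_in, weights_lb)

# Swapping the roles of x and y leaves r unchanged
r_swapped = correlation(weights_lb, heights_in)

# Changing units (inches to centimeters) leaves r unchanged
heights_cm = [h * 2.54 for h in heights_in]
r_metric = correlation(heights_cm, weights_lb)

# r is not resistant: one unusual point pulls it down noticeably
r_with_outlier = correlation(heights_in + [74], weights_lb + [100])

print(round(r, 3), round(r_swapped, 3), round(r_metric, 3), round(r_with_outlier, 3))
```

The first three printed values agree, while the last one is much smaller because of the single added point.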
8 Correlation Guessing Game
In-Class Exercises
  p146 #3.28, 3.34 and 3.37
  Correlation Guessing Game
Homework
  3.3 Blueprint
11 Regression
Regression Line
  Describes how a response variable y changes as an explanatory variable x changes
LSRL of y on x
  Makes the sum of the squares of the vertical distances of the data points from the line as small as possible
  Line should be as close as possible to the points in the vertical direction
  Error = Observed (Actual) – Predicted
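As a rough sketch of the least-squares idea, the code below fits a line with the standard closed-form slope and intercept and then checks that shifting or tilting the line only increases the sum of squared vertical errors. The data and the helper names `lsrl` and `sum_squared_errors` are hypothetical, chosen for illustration.

```python
from statistics import mean

def lsrl(xs, ys):
    """Return (intercept, slope) of the least-squares line of y on x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def sum_squared_errors(xs, ys, intercept, slope):
    """Sum of squared vertical distances: (observed - predicted)^2."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = lsrl(xs, ys)
best = sum_squared_errors(xs, ys, a, b)

# Any other line does worse in the vertical direction
nudged = sum_squared_errors(xs, ys, a + 0.1, b)   # shifted up
tilted = sum_squared_errors(xs, ys, a, b + 0.05)  # steeper

print(a, b, best, nudged, tilted)
```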
13 Coefficient of determination – r^2
  The fraction of the variation in the values of y that is explained by the least-squares regression of y on x
  Measures the contribution of x in predicting y
  If x is a poor predictor of y, then the sum of the squares of the deviations about the mean (SST) and the sum of the squares of the deviations about the regression line (SSE) would be approximately the same.
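One way to see this is to compute SST and SSE directly and form r^2 = (SST - SSE) / SST. The sketch below assumes that standard relationship; the two data sets (one where x predicts y well, one where it does not) and the function names are hypothetical.

```python
from statistics import mean

def sst_and_sse(xs, ys):
    """Return (SST, SSE) for the least-squares line of y on x."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    sst = sum((y - y_bar) ** 2 for y in ys)  # deviations about the mean
    sse = sum((y - (intercept + slope * x)) ** 2
              for x, y in zip(xs, ys))       # deviations about the line
    return sst, sse

def r_squared(xs, ys):
    sst, sse = sst_and_sse(xs, ys)
    return (sst - sse) / sst

xs = [1, 2, 3, 4, 5, 6]
good_ys = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1]  # x is a good predictor of y
poor_ys = [5, 3, 6, 4, 6, 4]               # x is a poor predictor of y

print(sst_and_sse(xs, good_ys), r_squared(xs, good_ys))
print(sst_and_sse(xs, poor_ys), r_squared(xs, poor_ys))
```

For the second data set SSE is almost as large as SST, so r^2 is close to 0.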
14 Understanding r-squared: A single point simplification
  Al Coons
  Buckingham Browne & Nichols School
  Cambridge, MA
15 [Diagram labels: y; error w.r.t. mean model; error eliminated by y-hat model]
  Proportion of error eliminated by y-hat model = (Error eliminated by y-hat model) / (Error w.r.t. mean model)
  r^2 = proportion of variability accounted for by the given model (w.r.t. the mean model).
16 [Diagram labels: y ~; error w.r.t. mean model; error eliminated by y-hat model]
  Proportion of error eliminated by y-hat model = (Error eliminated by y-hat model) / (Error w.r.t. mean model) = ~
17 Facts about Least-Squares Regression
  Distinction between explanatory and response variables is essential
  A change of one standard deviation in x corresponds to a change of r standard deviations in the predicted y
  LSRL always passes through the point (x-bar, y-bar)
  The square of the correlation is the fraction of the variation in the values of y that is explained by the least-squares regression of y on x
Classwork: Transformations and LSRL WS
Homework: #3.39 and ABS Matching to Plots Extension Question (we’ll finish the others in class)
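Two of these facts lend themselves to a quick numerical check: the line passes through the point of means, and moving one standard deviation in x moves the prediction by r standard deviations of y. The sketch below uses hypothetical data and a hypothetical `predict` helper, with the standard formulas for the slope, intercept, and r.

```python
from statistics import mean, stdev

# Hypothetical data
xs = [2, 4, 5, 7, 8, 10]
ys = [3.1, 4.8, 6.2, 6.9, 9.0, 9.7]

n = len(xs)
x_bar, y_bar = mean(xs), mean(ys)
s_x, s_y = stdev(xs), stdev(ys)

sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)

slope = sxy / sxx
intercept = y_bar - slope * x_bar
r = sxy / ((n - 1) * s_x * s_y)

def predict(x):
    """Predicted y (y-hat) from the least-squares line."""
    return intercept + slope * x

# Fact: the LSRL passes through the point of means
passes_through_means = abs(predict(x_bar) - y_bar) < 1e-9

# Fact: one standard deviation in x changes y-hat by r standard deviations of y
change_in_sd_units = (predict(x_bar + s_x) - predict(x_bar)) / s_y

print(passes_through_means, r, change_in_sd_units)
```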
18 Residuals
  Residual = observed y – predicted y, or y – y-hat
  Positive values show that the data point lies above the LSRL
  The mean of the least-squares residuals is always zero
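A small sketch of these statements, using hypothetical data: the residuals are computed as observed minus predicted, their mean comes out to zero (up to rounding), and the points with positive residuals are the ones above the line.

```python
from statistics import mean

# Hypothetical data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(xs), mean(ys)
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar

# residual = observed y - predicted y
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Points with a positive residual lie above the LSRL
above_line = [(x, y) for x, y, e in zip(xs, ys, residuals) if e > 0]

print([round(e, 2) for e in residuals])
print(round(mean(residuals), 10))
print(above_line)
```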
19 Residual Plots
  A scatterplot of the regression residuals against the explanatory variable
  Helps us assess the fit of a regression line
  Want a random pattern
  Watch for individual points with large residuals or that are extreme in the x direction
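To illustrate what a non-random pattern looks like, the sketch below fits a line to hypothetical data that actually follow a curve (y = x^2) and lists the residuals in order of x. This is a text stand-in for the plot, not a plotting routine.

```python
from statistics import mean

# Hypothetical curved data: y = x^2
xs = [0, 1, 2, 3, 4]
ys = [x ** 2 for x in xs]

x_bar, y_bar = mean(xs), mean(ys)
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar

residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Sign of each residual, read left to right along the x axis
signs = "".join("+" if e > 0 else "-" for e in residuals)

for x, e in zip(xs, residuals):
    print(x, e)
print(signs)
```

The residuals run positive, then negative, then positive again. A systematic curve like this in a residual plot suggests that a straight line is not a good description of the data.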
20 Outliers vs. Influential Observations
Outlier
  An observation that lies outside the overall pattern of the other observations
Influential observation
  Removing this point would markedly change the result of the calculation
Classwork: Residual Plots WS
Homework: p177 #3.52 and 3.61
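The difference can be seen by refitting the line with one extra point added. In the sketch below (hypothetical data and a hypothetical `slope_of` helper), a point that is unusual only in the y direction and sits at the mean of x leaves the slope unchanged, while a point that is extreme in the x direction changes the slope drastically.

```python
from statistics import mean

def slope_of(xs, ys):
    """Slope of the least-squares line of y on x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical base data with a clear linear pattern
base_xs = [1, 2, 3, 4, 5]
base_ys = [2.0, 2.9, 4.1, 5.0, 6.1]
slope_base = slope_of(base_xs, base_ys)

# Outlier in y located at the mean of x: large residual, slope unaffected
slope_with_y_outlier = slope_of(base_xs + [3], base_ys + [9.0])

# Point extreme in the x direction: influential, slope changes markedly
slope_with_influential = slope_of(base_xs + [15], base_ys + [3.0])

print(round(slope_base, 3),
      round(slope_with_y_outlier, 3),
      round(slope_with_influential, 3))
```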
21 Doctors for the Poor
  This will be graded for accuracy!
Ch 3 Review: p176 # , 3.56, 3.59, 3.69,