Looking at Data-Relationships 2.1 –Scatter plots.

Looking at Data-Relationships 2.1 –Scatter plots

Definitions
- Scatter plot: shows the relationship between two quantitative variables measured on the same individuals
- Explanatory variable: a variable that may explain or even cause changes in another variable
- Response variable: a variable that changes in response to the explanatory variable
- Scatter plot axes: x-axis (explanatory variable), y-axis (response variable)
Examining a scatter plot
- Look for the overall pattern (linear, nonlinear, quadratic, etc.) and for deviations from it
- Describe the overall pattern by the form (line, parabola), direction (positive, negative), and strength (strong, weak) of the relationship
- An important kind of deviation is an outlier
- Positive association: high values of the two variables tend to occur together
- Negative association: high values of one variable tend to occur with low values of the other variable
- Strength: determined by how closely the points in the scatter plot follow a simple form such as a line

Prep work  Do problem 2.7 in the text book. Store second-test scores in list L3 & Final-exam score in list L4  Do problem 2.11

Looking at Data-Relationships 2.2 –Correlation

Definitions
- Correlation r: measures the direction and strength of the linear (straight-line) association between two quantitative variables x and y
- You can calculate a correlation for any scatter plot, but r measures only linear relationships
- r > 0 -> positive association
- r < 0 -> negative association
- r is always between -1 and 1, including the endpoints
- Perfect correlation, r = +1 or r = -1, occurs only when the points lie exactly on a straight line
- Formula for the correlation coefficient between x and y: $r = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{x_i-\bar{x}}{s_x}\right)\left(\frac{y_i-\bar{y}}{s_y}\right)$
- Standard deviation of x: $s_x = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2}$ (and likewise $s_y$ for y)
- Correlation ignores the distinction between explanatory and response variables
- Correlation is not resistant: outliers can greatly change the value of r
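The properties above can be checked with a short Python sketch. It computes r as the average product of standardized values, dividing by n - 1. The data are made-up illustrative numbers, not taken from the textbook problems, and the function names are hypothetical.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def sample_sd(values):
    # Sample standard deviation with the n - 1 divisor.
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def correlation(x, y):
    # Sum of products of standardized x and y values, divided by n - 1.
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = sample_sd(x), sample_sd(y)
    total = sum(((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
                for xi, yi in zip(x, y))
    return total / (n - 1)

# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_xy = correlation(x, y)
r_yx = correlation(y, x)  # swapping the roles of x and y leaves r unchanged
perfect = correlation(x, [3 + 2 * xi for xi in x])  # points exactly on a line
with_outlier = correlation(x + [20], y + [0])  # one extra far-away point

print(round(r_xy, 4), round(r_yx, 4), round(perfect, 4), round(with_outlier, 4))
```

In this made-up data set, adding a single far-away point changes the sign of r, which is what "not resistant" means in practice.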

Prep work
- Do problem 2.29. Store price in list L5 and deforestation in list L6

Looking at Data-Relationships 2.3 –Least-Squares Regression

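Definitions
- Regression line: a straight line that describes the relationship between x and y
- Requires an explanatory variable and a response variable
- Fitting a line to data: drawing a line that comes as close as possible to the points; a straight line relating y to x has the form $y = a + bx$, where b is the slope and a is the intercept
- Extrapolation: use of a regression line for prediction far outside the range of values of the explanatory variable x used to obtain the line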

Definitions
- Least-squares regression line of y on x: the line that makes the sum of the squares of the vertical distances of the data points from the line as small as possible
- Requires an explanatory variable and a response variable
- Equation of the least-squares regression line: $\hat{y} = a + bx$, with slope $b = r\,\frac{s_y}{s_x}$ and intercept $a = \bar{y} - b\bar{x}$
- $r^2$ in regression: $r^2$ is the fraction of the variation in the values of y that is explained by the least-squares regression of y on x
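As a rough illustration of the least-squares line and r², the following Python sketch fits ŷ = a + bx to a small made-up data set. The slope is computed as the sum of cross-deviations divided by the sum of squared x-deviations, which is algebraically the same as r·s_y/s_x. The data and function names are assumptions for illustration, not part of the slides.

```python
def least_squares(x, y):
    # Returns intercept a and slope b of the least-squares line y-hat = a + b*x.
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b

def r_squared(x, y, a, b):
    # Fraction of the variation in y explained by the fitted line.
    y_bar = sum(y) / len(y)
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_resid = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return 1 - ss_resid / ss_total

# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = least_squares(x, y)
r2 = r_squared(x, y, a, b)
print(f"y-hat = {a:.4f} + {b:.4f} x, r^2 = {r2:.4f}")
```

Predictions from this line are only trustworthy for x values inside the observed range (here 1 to 5); plugging in something like x = 50 would be extrapolation.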

Looking at Data-Relationships 2.4 –Cautions about Correlation and Regression

Definitions
- Residual: the difference between an observed value of the response and the value predicted by the regression line
- Requires an explanatory variable and a response variable
- Residual equation: residual = observed y - predicted y = $y - \hat{y}$
- Special property: the mean of the least-squares residuals is always zero
- Residual plot: a scatter plot of the regression residuals against the explanatory variable. It helps us assess the fit of a regression line
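A minimal Python sketch of these ideas, using made-up data: it fits a least-squares line, computes each residual as observed minus predicted, and confirms that the residuals average to zero up to floating-point rounding. The (x, residual) pairs are the points a residual plot would show. All names and values here are illustrative assumptions.

```python
def fit_line(x, y):
    # Least-squares intercept a and slope b for y-hat = a + b*x.
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - b * x_bar, b

# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_line(x, y)

# residual = observed y - predicted y
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
mean_residual = sum(residuals) / len(residuals)

# Points for a residual plot: explanatory variable vs. residual.
residual_plot_points = list(zip(x, residuals))

for xi, res in residual_plot_points:
    print(xi, round(res, 4))
print("mean residual:", round(mean_residual, 10))
```

If the residual plot points show a curved pattern or a fan shape rather than an unstructured band around zero, a straight line is probably not a good description of the data.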

Definitions
- Outlier: an observation that lies outside the overall pattern of the other observations
- Influential observation: an observation that, if removed, would markedly change the result of the calculation
- Lurking variable: a variable that is not among the explanatory or response variables but may still influence the interpretation of relationships among those variables
- Association does not imply causation

Prep work: Brain activity vs. empathy score example
Will women who are higher in empathy respond more strongly when their partner has a painful experience?
1) Store the empathy scores in list L1 and brain activity in list L2
2) Use the TI-84 to find the equation of the least-squares regression line of brain activity on empathy score (use 4 decimals for the coefficients)
3) Use the equation to predict the brain activity for subject 1 from her empathy score
4) Find the residual for subject 1
5) Subject 16 can be considered a possible outlier. Find the equation of the least-squares regression line of brain activity on empathy score without this outlier
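The steps above can also be sketched in Python rather than on the TI-84. The numbers below are placeholders, not the textbook's data: six made-up subjects stand in for the full data set, and the last one plays the role of the possible outlier (subject 16 in the actual exercise). Replace the two lists with the real values from the text before drawing any conclusions.

```python
def fit_line(x, y):
    # Least-squares intercept a and slope b for y-hat = a + b*x.
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - b * x_bar, b

# Step 1: placeholder data (NOT the textbook values).
empathy = [10, 20, 30, 40, 50, 90]            # explanatory variable (list L1)
brain = [0.1, 0.2, 0.2, 0.4, 0.5, 0.6]        # response variable (list L2)

# Step 2: regression of brain activity on empathy score.
a_all, b_all = fit_line(empathy, brain)
print(f"all subjects: y-hat = {a_all:.4f} + {b_all:.4f} x")

# Step 3: predicted brain activity for subject 1 (index 0 in Python).
predicted_1 = a_all + b_all * empathy[0]

# Step 4: residual for subject 1 = observed - predicted.
residual_1 = brain[0] - predicted_1
print(f"subject 1: predicted = {predicted_1:.4f}, residual = {residual_1:.4f}")

# Step 5: refit without the possible outlier (here, the last subject).
a_trim, b_trim = fit_line(empathy[:-1], brain[:-1])
print(f"without outlier: y-hat = {a_trim:.4f} + {b_trim:.4f} x")
```

Comparing the two fitted equations shows whether the suspected outlier is also influential: if the slope or intercept changes noticeably when it is dropped, it is.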

Looking at Data-Relationships 2.6 –The Question of Causation

Definitions
- Some observed associations between two variables are due to a cause-and-effect relationship between these variables, but others are explained by lurking variables
- Common response: the effect of lurking variables operates through common response if changes in both the explanatory and response variables are caused by changes in lurking variables
- Confounding: two variables (either explanatory or lurking variables) are confounded when we cannot distinguish their effects on the response variable