13-2 The Goal
This chapter covers methods for:
1. Measuring the linear correlation between two variables
2. Describing a linear relationship between two variables with a linear equation
3. Making predictions with a linear regression model
4. Describing the usefulness of a linear regression model
13-3 Different Values of the Correlation Coefficient
13-4 Measure the Linear Relationship: The Simple Correlation Coefficient
The linear correlation coefficient (or simple correlation coefficient) r is a numerical measure of the strength of the linear relationship between two quantitative variables.
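The slide does not show how r is computed. As a minimal sketch, the function below uses the usual sample formula, r = SS_xy / sqrt(SS_xx * SS_yy). The function name `pearson_r` and the data are hypothetical and chosen only for illustration.

```python
import math

def pearson_r(xs, ys):
    """Sample linear correlation coefficient r for paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of squares and cross-products about the means
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_yy = sum((y - y_bar) ** 2 for y in ys)
    if ss_xx == 0 or ss_yy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return ss_xy / math.sqrt(ss_xx * ss_yy)

# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))
```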
13-5 Interpret the Correlation Coefficient r
If r > 0, the two variables are positively correlated; if r < 0, they are negatively correlated.
If |r| ≥ 0.8, the linear relationship is strong.
If 0.5 ≤ |r| < 0.8, the linear relationship is moderate.
If |r| < 0.5, the linear relationship is weak.
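The interpretation rules above can be written as a small helper. This is only a sketch: the function name `describe_correlation` is hypothetical, and the label "none" for r = 0 is an added assumption, since the slide does not name that case.

```python
def describe_correlation(r):
    """Return (direction, strength) using the 0.8 / 0.5 cutoffs from the slide."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"  # assumption: the slide does not label r = 0
    size = abs(r)
    if size >= 0.8:
        strength = "strong"
    elif size >= 0.5:
        strength = "moderate"
    else:
        strength = "weak"
    return direction, strength

print(describe_correlation(-0.6))
```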
13-6 Properties of the Correlation Coefficient r
1. The value of r is always between -1 and +1.
2. The value of r does not change if all values of either variable are converted to a different scale.
3. The value of r is not affected by the choice of x or y. Interchange all x and y values and the value of r will not change.
4. r measures the strength of a linear relationship. It is not designed to measure the strength of a relationship that is not linear.
13-7 The Simple Linear Regression Model and the Least Squares Point Estimates
The dependent (or response) variable is the variable we wish to understand or predict, denoted by Y.
The independent (or predictor or explanatory) variable is the variable we use to understand or predict the dependent variable, denoted by X.
Regression analysis is a statistical technique that uses observed data to relate the dependent variable to one or more independent variables.
13-8 Form of the Simple Linear Regression Model
$Y = \beta_0 + \beta_1 X + \varepsilon$
$\beta_0 + \beta_1 X$ is the mean value of the dependent variable Y when the value of the independent variable is X. The mean is a linear function of X and determines the overall trend of the relationship between X and Y.
$\beta_0$ is the y-intercept, the mean of Y when X is 0.
$\beta_1$ is the slope, the change in the mean of Y per unit change in X.
$\varepsilon$ is an error term that describes the effect on Y of all factors other than X.
$\hat{y} = b_0 + b_1 x$, where $b_0$ and $b_1$ are the point estimates of $\beta_0$ and $\beta_1$, and $\hat{y}$ is the estimate of the mean value of Y when X = x.
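To make the roles of the mean line and the error term concrete, the sketch below simulates the model at one fixed x. The parameter values are hypothetical, and the error term is assumed to be normal with mean 0 and standard deviation `sigma`. That distribution is an assumption added for the simulation, not something stated on the slide.

```python
import random

def simulate_y(x, beta0, beta1, sigma, rng):
    """Draw one Y from Y = beta0 + beta1 * x + eps (eps assumed Normal(0, sigma))."""
    mean_y = beta0 + beta1 * x   # mean of Y at this x
    eps = rng.gauss(0.0, sigma)  # effect of all factors other than x
    return mean_y + eps

rng = random.Random(0)                # fixed seed so the run is repeatable
beta0, beta1, sigma = 10.0, 2.0, 1.0  # hypothetical parameter values
x = 3.0

draws = [simulate_y(x, beta0, beta1, sigma, rng) for _ in range(10_000)]
avg = sum(draws) / len(draws)

# Individual draws scatter around the line, but their average should sit
# close to the mean value beta0 + beta1 * x = 16
print(avg)
```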
13-10 The Simple Linear Regression Model Illustrated
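13-11 The Least Squares Point Estimates
Estimation/prediction equation: $\hat{y} = b_0 + b_1 x$
Least squares point estimate of the slope $\beta_1$: $b_1 = SS_{xy} / SS_{xx}$, where $SS_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})$ and $SS_{xx} = \sum (x_i - \bar{x})^2$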
13-12 The Least Squares Point Estimates Continued
Least squares point estimate of the y-intercept $\beta_0$: $b_0 = \bar{y} - b_1 \bar{x}$
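The two point estimates can be computed directly from the data. This is a minimal sketch using the standard least squares formulas, written out in the comments. The function name `least_squares` and the sample data are hypothetical.

```python
def least_squares(xs, ys):
    """Return (b0, b1), the least squares intercept and slope."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    if ss_xx == 0:
        raise ValueError("slope is undefined when all x values are equal")
    b1 = ss_xy / ss_xx       # slope estimate: SS_xy / SS_xx
    b0 = y_bar - b1 * x_bar  # intercept estimate: y_bar - b1 * x_bar
    return b0, b1

# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = least_squares(x, y)
print(b0, b1)
```

With these estimates, the fitted line passes through the point of means (x̄, ȳ), which follows from the intercept formula.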
13-13 Testing the Significance of the Slope
A regression model is not likely to be useful unless there is a significant relationship between x and y.
To test significance, we test the null hypothesis $H_0: \beta_1 = 0$ versus the alternative hypothesis $H_a: \beta_1 \neq 0$.
13-14 Testing the Significance of the Slope #2
Alternative | Reject $H_0$ If | p-Value
$H_a: \beta_1 \neq 0$ | $|t| > t_{\alpha/2}$ * | Twice the area under the t distribution to the right of $|t|$
* That is, $t > t_{\alpha/2}$ or $t < -t_{\alpha/2}$.
$t$, $t_{\alpha/2}$, and p-values are based on $n - 2$ degrees of freedom.
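The formula for the test statistic does not survive in the slide text. The sketch below assumes the standard one, t = b1 / s_b1, with s_b1 = s / sqrt(SS_xx) and s = sqrt(SSE / (n - 2)). The function name `slope_t_test` and the data are hypothetical, and the critical value is taken from a t table rather than computed.

```python
import math

def slope_t_test(xs, ys):
    """Return (t, df) for testing H0: beta1 = 0.

    Assumes at least 3 points, some spread in x, and data that are not
    perfectly linear (otherwise the standard error of the slope is zero).
    """
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least 3 paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    # Sum of squared residuals about the fitted line
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))  # standard error of the regression
    s_b1 = s / math.sqrt(ss_xx)   # standard error of the slope
    return b1 / s_b1, n - 2

# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
t, df = slope_t_test(x, y)

# Critical value t_{alpha/2} for alpha = 0.05 and 3 degrees of freedom,
# read from a t table
t_crit = 3.182
reject_h0 = abs(t) > t_crit
print(round(t, 4), df, reject_h0)
```

For this small sample, |t| falls below the critical value, so the slope is not significant at the 0.05 level even though the fitted slope is positive.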
13-15 The Simple Coefficient of Determination and Correlation
How useful is a particular regression model? One measure of usefulness is the simple coefficient of determination. It is represented by the symbol $r^2$ because it equals the square of the simple correlation coefficient r. It is interpreted as the proportion (often quoted as a percentage) of the variation in Y that is explained by the regression line $\hat{y} = b_0 + b_1 x$.
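The "explained variation" reading can be checked against the "square of r" definition. The sketch below computes r² as (total variation - unexplained variation) / total variation and compares it with the squared correlation coefficient. The function name and data are hypothetical.

```python
def coefficient_of_determination(xs, ys):
    """Return (r2_from_variation, r2_from_r) for a simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_yy = sum((y - y_bar) ** 2 for y in ys)  # total variation in y
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    # Unexplained variation: squared residuals about the fitted line
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    r2_from_variation = (ss_yy - sse) / ss_yy
    r2_from_r = ss_xy ** 2 / (ss_xx * ss_yy)   # square of r
    return r2_from_variation, r2_from_r

# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r2_a, r2_b = coefficient_of_determination(x, y)
print(r2_a, r2_b)
```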
13-16 Prediction
To estimate the mean value of Y for $X = x_0$, plug $x_0$ into the regression equation: $\hat{y} = b_0 + b_1 x_0$. We call $\hat{y}$ the fitted value for $X = x_0$.