 # Describing Relationships Using Correlation and Regression

Chapter 10 Describing Relationships Using Correlation and Regression

Going Forward
Your goals in this chapter are to learn:
- How to create and interpret a scatterplot
- What a regression line is
- When and how to compute the Pearson r
- How to perform significance testing of the Pearson r
- The logic of predicting scores using linear regression and …

Understanding Correlations

Correlation Coefficient
A correlation coefficient is a statistic that describes the important characteristics of a relationship. It simplifies a complex relationship involving many scores into one number that is easily interpreted.

Distinguishing Characteristics
A scatterplot is a graph of the individual data points from a set of X-Y pairs. When a relationship exists, as the X scores increase, the Y scores change such that different values of Y tend to be paired with different values of X.

A Scatterplot Showing the Existence of a Relationship Between the Two Variables

Linear Relationships
A linear relationship forms a pattern following one straight line. The linear regression line is the straight line that summarizes a relationship by passing through the center of the scatterplot.

Positive and Negative Relationships
In a positive linear relationship, as the X scores increase, the Y scores also tend to increase. In a negative linear relationship, as the X scores increase, the Y scores tend to decrease.

Scatterplot of a Positive Linear Relationship

Scatterplot of a Negative Linear Relationship

Nonlinear Relationships
In a nonlinear relationship, as the X scores increase, the Y scores do not only increase or only decrease: at some point, the Y scores alter their direction of change.

Scatterplot of a Nonlinear Relationship

Strength of a Relationship
The strength of a relationship is the extent to which one value of Y is consistently paired with one and only one value of X. The larger the absolute value of the correlation coefficient, the stronger the relationship. The sign of the correlation coefficient indicates the direction of a linear relationship.

Correlation Coefficients
Correlation coefficients may range between –1 and +1. The closer to ±1 the coefficient is, the stronger the relationship; the closer to 0 the coefficient is, the weaker the relationship. As the variability in the Y scores at each X becomes larger, the relationship becomes weaker.
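The link between variability in Y at each X and the size of the coefficient can be checked numerically. The sketch below uses made-up scores (three Y scores at each of three X values) and computes the Pearson r from deviations around the means; the function name and the data are illustrative assumptions, not part of the original slides.

```python
from statistics import mean, pstdev

def pearson_r_from_deviations(xs, ys):
    """Pearson r as the average product of deviations divided by both standard deviations."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = mean((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return covariance / (pstdev(xs) * pstdev(ys))

# Hypothetical data: three Y scores paired with each of X = 1, 2, 3.
xs = [1, 1, 1, 2, 2, 2, 3, 3, 3]
tight_ys = [2, 2, 2, 4, 4, 4, 6, 6, 6]    # no spread in Y at each X
loose_ys = [1, 2, 3, 3, 4, 5, 5, 6, 7]    # Y spreads by +/-1 at each X
looser_ys = [0, 2, 4, 2, 4, 6, 4, 6, 8]   # Y spreads by +/-2 at each X

r_tight = pearson_r_from_deviations(xs, tight_ys)
r_loose = pearson_r_from_deviations(xs, loose_ys)
r_looser = pearson_r_from_deviations(xs, looser_ys)

# The coefficient shrinks toward 0 as the spread in Y at each X grows.
print(round(r_tight, 3), round(r_loose, 3), round(r_looser, 3))
```

All three data sets rise by the same amount per step in X; only the spread of Y around each X differs, and that alone is enough to lower the coefficient.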

Correlation Coefficient
A correlation coefficient tells you:
- The relative degree of consistency with which Ys are paired with Xs
- The variability in the group of Y scores paired with each X
- How closely the scatterplot fits the regression line
- The relative accuracy of prediction

A Perfect Correlation (±1)

Intermediate Strength Correlation

No Relationship

The Pearson Correlation Coefficient

Pearson Correlation Coefficient
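The Pearson correlation coefficient (r) describes the linear relationship between two interval variables, two ratio variables, or one interval and one ratio variable. The computing formula is

$$r = \frac{N(\Sigma XY) - (\Sigma X)(\Sigma Y)}{\sqrt{\left[N(\Sigma X^2) - (\Sigma X)^2\right]\left[N(\Sigma Y^2) - (\Sigma Y)^2\right]}}$$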

Step-by-Step
Step 1. Compute the necessary components: ΣX, ΣX², ΣY, ΣY², ΣXY, and N.
Step 2. Use these values to compute the numerator.
Step 3. Use these values to compute the denominator, and then divide to find r.
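The three steps can be sketched in Python. This is a minimal illustration that assumes the standard computing formula for r; the function names are hypothetical. The six X-Y pairs are an assumption as well: they are reconstructed to match the sums reported in Example 1 later in this chapter (ΣX = 21, ΣX² = 91, ΣY = 29, ΣY² = 171, ΣXY = 81).

```python
from math import sqrt

def pearson_components(xs, ys):
    """Step 1: compute the sums that the computing formula needs."""
    if len(xs) != len(ys):
        raise ValueError("X and Y must contain the same number of scores")
    return {
        "n": len(xs),
        "sum_x": sum(xs),
        "sum_y": sum(ys),
        "sum_x2": sum(x * x for x in xs),
        "sum_y2": sum(y * y for y in ys),
        "sum_xy": sum(x * y for x, y in zip(xs, ys)),
    }

def pearson_r(xs, ys):
    c = pearson_components(xs, ys)
    # Step 2: numerator, N(sum XY) - (sum X)(sum Y)
    numerator = c["n"] * c["sum_xy"] - c["sum_x"] * c["sum_y"]
    # Step 3: denominator, then divide
    x_term = c["n"] * c["sum_x2"] - c["sum_x"] ** 2
    y_term = c["n"] * c["sum_y2"] - c["sum_y"] ** 2
    denominator = sqrt(x_term * y_term)
    if denominator == 0:
        raise ValueError("r is undefined when X or Y has no variability")
    return numerator / denominator

# Assumed data, reconstructed from the sums in Example 1.
xs = [1, 2, 3, 4, 5, 6]
ys = [8, 6, 6, 5, 1, 3]

components = pearson_components(xs, ys)
r = pearson_r(xs, ys)
print(components)
print(round(r, 2))  # -0.88
```

Keeping the sums in a separate function mirrors Step 1 and makes it easy to compare each component with a hand calculation before dividing.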

Significance Testing of the Pearson r

Two-Tailed Test of the Pearson r
The statistical hypotheses for a two-tailed test are H0: ρ = 0 and Ha: ρ ≠ 0, where ρ is the correlation in the population. This H0 states that the r value obtained from our sample is due to sampling error. The sampling distribution of r shows all possible values of r that occur when samples are drawn from a population in which ρ = 0.

Two-Tailed Test of the Pearson r
Find the appropriate r_crit from the table based on:
- Whether you are using a two-tailed or one-tailed test
- Your chosen α
- The degrees of freedom (df), where df = N – 2 and N is the number of X-Y pairs in the data

If r_obt is beyond r_crit, reject H0 and accept Ha. Otherwise, fail to reject H0.
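The decision rule can be written as a small helper. This sketch assumes the caller has already looked up the positive critical value in a table for the chosen α, df, and number of tails; the function names and the `tail` labels are hypothetical. The values in the usage lines are the ones used in Example 2 of this chapter.

```python
def degrees_of_freedom(n_pairs):
    """df for testing the Pearson r, where n_pairs is the number of X-Y pairs."""
    return n_pairs - 2

def rejects_h0(r_obt, r_crit, tail="two-tailed"):
    """Return True when r_obt lies beyond the tabled critical value.

    r_crit is the positive critical value from the table. The tail argument
    is "two-tailed", "positive" (one-tailed, predicting a positive r), or
    "negative" (one-tailed, predicting a negative r).
    """
    if not 0 < r_crit <= 1:
        raise ValueError("r_crit must be a positive value no larger than 1")
    if tail == "two-tailed":
        return abs(r_obt) > r_crit
    if tail == "positive":
        return r_obt > r_crit
    if tail == "negative":
        return r_obt < -r_crit
    raise ValueError("tail must be 'two-tailed', 'positive', or 'negative'")

df = degrees_of_freedom(6)          # six X-Y pairs
decision = rejects_h0(-0.88, 0.811)  # two-tailed, alpha = .05, df = 4
print(df, decision)  # 4 True
```

The lookup itself is left to the caller because the critical value depends on the table being used; the helper only encodes the comparison.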

One-Tailed Test of the Pearson r
- One-tailed, predicting a positive correlation
- One-tailed, predicting a negative correlation
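The hypotheses for these two cases are not preserved in the transcript. In the usual formulation, with ρ denoting the correlation in the population, they would read as follows; treat the exact notation as an assumption about what the slide showed.

$$
\begin{aligned}
\text{Predicting a positive correlation:}\quad & H_0\colon \rho \le 0, \qquad H_a\colon \rho > 0 \\
\text{Predicting a negative correlation:}\quad & H_0\colon \rho \ge 0, \qquad H_a\colon \rho < 0
\end{aligned}
$$

In both cases the region of rejection lies entirely in the tail named by Ha, so the critical value comes from the one-tailed column of the table.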

An Introduction to Linear Regression

Linear Regression
Linear regression is the procedure for predicting unknown Y scores based on known correlated X scores.
- X is the predictor variable
- Y is the criterion variable
- The symbol for the predicted Y score is Y′ (pronounced Y prime)

The equation that produces the value of Y′ at each X and defines the straight line that summarizes the relationship is called the linear regression equation.
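The transcript does not preserve the regression equation itself. A common form is Y′ = bX + a, with slope b and Y intercept a found by least squares; the sketch below assumes that form. The function names are hypothetical, and the data are the six X-Y pairs reconstructed from the sums reported in Example 1 of this chapter.

```python
def fit_regression_line(xs, ys):
    """Return (slope, intercept) of the least-squares line Y' = slope * X + intercept."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = sum_y / n - slope * (sum_x / n)
    return slope, intercept

def predict(x, slope, intercept):
    """Predicted Y score (Y prime) for a given X."""
    return slope * x + intercept

# Assumed data, reconstructed from the sums in Example 1.
xs = [1, 2, 3, 4, 5, 6]
ys = [8, 6, 6, 5, 1, 3]

slope, intercept = fit_regression_line(xs, ys)
print(round(slope, 3), round(intercept, 3))
print(round(predict(3, slope, intercept), 3))  # Y' for X = 3
```

The slope is negative here, which matches the negative r found for the same data: higher X scores predict lower Y scores.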

Proportion of Variance Accounted For
The proportion of variance accounted for describes the proportion of all differences in Y scores that are associated with changes in the X variable. The proportion of variance accounted for equals r², the squared Pearson correlation coefficient.

Example 1
For the following data set of interval/ratio scores, calculate the Pearson correlation coefficient.

| X | Y |
|---|---|
| 1 | 8 |
| 2 | 6 |
| 3 | 6 |
| 4 | 5 |
| 5 | 1 |
| 6 | 3 |

Example 1 Pearson Correlation Coefficient
1. Determine N.
2. Calculate ΣX, ΣX², ΣY, ΣY², and ΣXY.
3. Insert each value into the computing formula for r and …

Example 1 Pearson Correlation Coefficient
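| X | X² | Y | Y² | XY |
|---|----|---|----|----|
| 1 | 1 | 8 | 64 | 8 |
| 2 | 4 | 6 | 36 | 12 |
| 3 | 9 | 6 | 36 | 18 |
| 4 | 16 | 5 | 25 | 20 |
| 5 | 25 | 1 | 1 | 5 |
| 6 | 36 | 3 | 9 | 18 |
| ΣX = 21 | ΣX² = 91 | ΣY = 29 | ΣY² = 171 | ΣXY = 81 |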

Example 1 Pearson Correlation Coefficient
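The worked calculation for this slide is not preserved in the transcript. Assuming the standard computing formula for r, substituting N = 6 and the sums reported for Example 1 gives the value used in Example 2:

$$
\begin{aligned}
r &= \frac{N(\Sigma XY) - (\Sigma X)(\Sigma Y)}{\sqrt{\left[N(\Sigma X^2) - (\Sigma X)^2\right]\left[N(\Sigma Y^2) - (\Sigma Y)^2\right]}} \\
  &= \frac{6(81) - (21)(29)}{\sqrt{\left[6(91) - (21)^2\right]\left[6(171) - (29)^2\right]}} \\
  &= \frac{486 - 609}{\sqrt{(546 - 441)(1026 - 841)}} \\
  &= \frac{-123}{\sqrt{(105)(185)}} = \frac{-123}{\sqrt{19425}} \approx \frac{-123}{139.37} \approx -0.88
\end{aligned}
$$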

Example 2 Significance Test of the Pearson r
Conduct a two-tailed significance test of the Pearson r just calculated. Use α = .05.
- df = N – 2 = 6 – 2 = 4
- r_crit = 0.811

Since r_obt of –0.88 falls beyond the critical value of –0.811, reject H0 and accept Ha. The correlation in the population is significantly different from 0.

Example 3 Proportion of Variance Accounted For
Calculate the proportion of variance accounted for, using the given data. The proportion of variance accounted for is r² = (–0.88)² ≈ 0.77.
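As a quick check, the proportion of variance accounted for can be computed as the square of the Pearson r. The sketch below assumes the standard computing formula for r and starts from the sums reported in Example 1. It also shows that squaring the rounded r (–0.88) and squaring the unrounded r differ slightly in the second decimal place.

```python
from math import sqrt

# Sums reported in Example 1 of this chapter.
n, sum_x, sum_x2, sum_y, sum_y2, sum_xy = 6, 21, 91, 29, 171, 81

numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

r_squared = r ** 2                      # from the unrounded r
r_squared_from_rounded = (-0.88) ** 2   # from r rounded to two decimals

print(round(r_squared, 2), round(r_squared_from_rounded, 2))  # 0.78 0.77
```

Either way, a little over three quarters of the differences in Y scores are associated with changes in X for these data.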