1
**PPA 415 – Research Methods in Public Administration**

Lecture 10 – Linear Regression and Correlation

2
**Scattergrams**

The first step in examining a relationship between interval-ratio variables is to prepare a scattergram. A scattergram, like a bivariate table, has two dimensions. The scores on the independent variable (X) are arrayed along the horizontal axis. The scores on the dependent variable (Y) are arrayed along the vertical axis. Each dot in the scattergram represents the X and Y scores for a case.

3
**Scattergrams**

The pattern can be made clearer by drawing a straight line that comes as close as possible to all of the points in the scattergram. Two variables are associated if the values of Y are conditional on the values of X (that is, Y increases or decreases depending on X).

4
**Scattergrams**

The strength of the relationship can be visualized by examining how tightly the points fit around the line. Positive relationships will slope up; negative relationships will slope down. Zero relationships will appear to be a cloud of random points.

5
**Scattergrams – Positive Relationship**

6
**Scattergrams – Negative Relationship**

7
**Scattergram – Little or No Relationship**

8
**Regression and Prediction**

One key assumption underlying the statistical techniques to be discussed here is that the two variables have an essentially linear relationship. That is, the scattergram approximates a straight line. Non-linearity requires adjustments to the model that will not be discussed in this course.

9
**Regression and Prediction**

10
**Regression and Prediction**

The mean of any distribution of scores is the point around which the variation of the scores (the sum of squared deviations) is at a minimum; no other point gives a smaller sum. The same is true of conditional means, the means of Y computed separately for each value of X.
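The minimum-variation property of the mean can be checked numerically. The sketch below is only an illustration: the scores are hypothetical, and `sum_of_squares` is a helper name introduced here, not something from the lecture.

```python
def sum_of_squares(scores, point):
    """Sum of squared deviations of the scores around a chosen point."""
    return sum((s - point) ** 2 for s in scores)


# Hypothetical scores, used only to illustrate the property.
scores = [2, 4, 5, 4, 5]
mean = sum(scores) / len(scores)

# Compare the mean against a few nearby candidate points.
candidates = [mean - 1, mean - 0.5, mean, mean + 0.5, mean + 1]
for point in candidates:
    print(f"point={point:.1f}  sum of squares={sum_of_squares(scores, point):.2f}")
```

In this sketch the smallest sum of squares occurs at the mean itself, and it grows as the candidate point moves away in either direction.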

11
**Regression and Prediction**

Because the conditional means will rarely lie exactly on a straight line, the best-fitting line is the one that comes as close as possible to all of them. The formula for the best-fitting (least-squares) line is: Y = a + bX, where a is the Y intercept and b is the slope.

12
**Regression and Prediction**

The Y intercept is the point at which the regression line crosses the Y axis. The slope of the least-squares regression line is the amount of change in the dependent variable (Y) that is produced by a unit change in the independent variable (X). You can use the formula for the regression line to predict scores not in the data set.
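As a rough illustration of the intercept, the slope, and prediction, the following sketch fits a least-squares line using the standard textbook formulas (slope = sum of cross-products of deviations divided by the sum of squared deviations of X; intercept = mean of Y minus slope times mean of X). The data are hypothetical, and the names `least_squares`, `predict`, `a`, and `b` are assumptions made for this example.

```python
def least_squares(xs, ys):
    """Return (a, b): the Y intercept and slope of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and sum of squared deviations of X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope: change in Y per unit change in X
    a = mean_y - b * mean_x  # intercept: where the line crosses the Y axis
    return a, b


def predict(a, b, x):
    """Predicted Y score for a given X score."""
    return a + b * x


# Hypothetical data, not taken from the lecture.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares(xs, ys)
print(f"intercept a = {a:.2f}, slope b = {b:.2f}")
print(f"predicted Y for X = 6: {predict(a, b, 6):.2f}")
```

The last line predicts a score for X = 6, a value that does not appear in the data set, which is the kind of use the slide describes.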

13
**Regression and Prediction**

14
**Regression and Prediction**

15
**Regression and Prediction**

16
**The Correlation Coefficient (Pearson’s r)**

The slope of the regression line is a measure of the effect of X on Y. Because it is expressed in the units of measurement of the variables (units of Y per unit of X), it is not restricted to fall between zero and one. As a result, researchers rely on Pearson’s r as a measure of association for interval-ratio variables. Pearson’s r varies from -1 to +1, with 0 indicating no association.
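The contrast between the slope and Pearson’s r can be sketched in a few lines. The example below uses the usual definitional formula for r (the sum of cross-products of deviations divided by the square root of the product of the two sums of squares) on hypothetical data; the function names are assumptions for this illustration. Rescaling Y (for example, from dollars to cents) changes the slope but leaves r unchanged.

```python
import math


def deviations_sums(xs, ys):
    """Return (sxy, sxx, syy): cross-product and squared-deviation sums."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy, sxx, syy


def slope(xs, ys):
    sxy, sxx, _ = deviations_sums(xs, ys)
    return sxy / sxx


def pearson_r(xs, ys):
    sxy, sxx, syy = deviations_sums(xs, ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, not taken from the lecture.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
ys_rescaled = [y * 100 for y in ys]  # same scores in a smaller unit

print(f"slope: {slope(xs, ys):.2f} vs. rescaled {slope(xs, ys_rescaled):.2f}")
print(f"r:     {pearson_r(xs, ys):.4f} vs. rescaled {pearson_r(xs, ys_rescaled):.4f}")
```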

17
**The Correlation Coefficient (Pearson’s r)**

18
**The Correlation Coefficient (Pearson’s r)**

19
**The Correlation Coefficient (Pearson’s r)**

The r suggests a moderately strong, positive relationship.

20
**Interpreting the Correlation Coefficient: r²**

The interpretation of Pearson’s r focuses on r², the coefficient of determination. The coefficient of determination is based on the following formula for r-squared.

21
**Interpreting the Correlation Coefficient: r²**

R-squared can be interpreted in two ways: as the percentage of total variation in Y explained by the independent variable, and as a proportional reduction in error. Thus, the r-squared for the ideology-party ID problem, based on the r calculated on slide 19, can be read either way. Ideology explains 25.65% of the variation in party identification. Equivalently, knowing a person’s ideology reduces the error in predicting their party identification by 25.65%. The unexplained variation is represented by the scatter of points around the regression line.
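The two readings of r-squared can be shown to give the same number. The sketch below assumes the usual decomposition of variation (total = explained + unexplained) and uses hypothetical data rather than the ideology and party ID scores, which are not reproduced in this text. All names are introduced here for illustration only.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope (standard textbook formulas)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b


# Hypothetical data, not taken from the lecture.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
mean_y = sum(ys) / len(ys)
predicted = [a + b * x for x in xs]

# Total variation: error made when predicting every Y with the mean of Y.
total = sum((y - mean_y) ** 2 for y in ys)
# Explained variation: how far the regression predictions move from the mean.
explained = sum((p - mean_y) ** 2 for p in predicted)
# Unexplained variation: scatter of the actual scores around the line.
unexplained = sum((y - p) ** 2 for y, p in zip(ys, predicted))

r_squared = explained / total                    # share of variation explained
error_reduction = (total - unexplained) / total  # proportional reduction in error

print(f"r-squared as explained/total:       {r_squared:.4f}")
print(f"r-squared as reduction in error:    {error_reduction:.4f}")
```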

22
**The Five-Step Model for Testing Pearson’s r.**

Step 1. Making assumptions: random sampling, interval-ratio measurement, bivariate normal distribution, linear relationship, homoscedasticity, and a normal sampling distribution.

23
**The Five-Step Model for Testing Pearson’s r.**

Step 2. Stating the null hypothesis. H0: ρ = 0.0. H1: ρ > 0.0.

Step 3. Selecting the sampling distribution and establishing the critical region. Sampling distribution = t distribution. Alpha = .05, one-tailed. Degrees of freedom = N - 2 = 10 - 2 = 8. t(critical) = +1.860.

24
**The Five-Step Model for Testing Pearson’s r.**

Step 4. Computing the test statistic.
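The computation for this step is not reproduced in the text, so the following is only a sketch using the standard test statistic for Pearson’s r, t = r·sqrt((N - 2) / (1 - r²)). It assumes r is the positive square root of the r-squared of 0.2565 (25.65%) reported earlier and N = 10 from the degrees-of-freedom calculation. The critical value 1.860 is the standard t-table entry for 8 degrees of freedom at alpha = .05, one-tailed. The function name `t_obtained` is hypothetical.

```python
import math


def t_obtained(r, n):
    """Standard t statistic for testing whether Pearson's r differs from zero."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Assumed inputs: r-squared of 0.2565 (the 25.65% figure) and N = 10.
r_squared = 0.2565
n = 10
r = math.sqrt(r_squared)  # positive root, since the relationship is positive

t = t_obtained(r, n)
t_critical = 1.860        # t table: df = 8, alpha = .05, one-tailed

print(f"r = {r:.4f}, t(obtained) = {t:.3f}, t(critical) = {t_critical:.3f}")
print("reject H0" if t > t_critical else "fail to reject H0")
```

Under these assumptions the obtained t falls short of the critical value, which agrees with the decision reported in Step 5.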

25
**The Five-Step Model for Testing Pearson’s r.**

Step 5. Making a decision. t(obtained) is less than t(critical), so the test statistic does not fall in the critical region. Therefore, we cannot reject the null hypothesis that the relationship between ideology and party ID is zero in the population.
