September 18-19, 2006 – Denver, Colorado Sponsored by the U.S. Department of Housing and Urban Development Conducting and interpreting multivariate analyses.


Presentation on theme: "September 18-19, 2006 – Denver, Colorado Sponsored by the U.S. Department of Housing and Urban Development Conducting and interpreting multivariate analyses."— Presentation transcript:

1 September 18-19, 2006 – Denver, Colorado Sponsored by the U.S. Department of Housing and Urban Development Conducting and interpreting multivariate analyses using SPSS and Excel David Patterson, College of Social Work The University of Tennessee

2 Regression Analysis of HMIS Data
What is regression analysis?
–Also known as linear regression or Ordinary Least Squares (OLS).
–Bivariate regression measures the association or relationship between a dependent variable (DV) and an independent variable (IV). It estimates the change in the DV for each one-unit change in the IV.
–Multiple regression measures the relationship between a single DV and two or more IVs.
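
The "change in the DV for each one-unit change in an IV" is the slope of the least-squares line. A minimal sketch of the bivariate case follows, assuming plain Python lists as input. The function name and the age and duration values are made up for illustration and are not HMIS data.

```python
def fit_bivariate_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares about the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical values chosen so the relationship is exactly linear
age = [20, 30, 40, 50, 60]            # IV
days_homeless = [100, 130, 160, 190, 220]  # DV
intercept, slope = fit_bivariate_ols(age, days_homeless)
print(intercept, slope)  # slope = change in days per additional year of age
```

With these invented values each additional year of age goes with 3 more days, so the slope is 3 and the intercept is 40. Real HMIS data would not fall exactly on a line.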

3 What can regression analysis tell us about HMIS data?
–It can help us understand possible causal relationships between certain outcomes (DV) and possible causal factors (IV). E.g., length of stay prior to housing (DV) and age (IV1), duration of homelessness (IV2), and current income (IV3).
–Stated another way, how is length of stay prior to housing predicted independently by each IV and through their combined influence?
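
One way to see how each IV contributes independently is to fit all of them at once. The sketch below solves the OLS normal equations in plain Python. It is an illustration under stated assumptions, not the procedure SPSS or Excel uses internally. The function name, the client values, and the "true" coefficients are all hypothetical, and the DV is constructed to be exactly linear so the fit can be checked.

```python
def fit_multiple_ols(iv_rows, dv):
    """Return [intercept, b1, b2, ...] solving the normal equations (X'X) b = X'y."""
    X = [[1.0] + [float(v) for v in row] for row in iv_rows]  # leading 1.0 = intercept
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(X, dv))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("IVs are perfectly collinear; no unique solution")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical clients: (age, duration of homelessness in days, current income)
iv_rows = [(25, 30, 0), (40, 200, 500), (33, 90, 0),
           (51, 365, 700), (29, 14, 250), (60, 120, 900)]
true_b = [10.0, 0.5, 0.1, -0.02]  # invented coefficients used to build the DV
length_of_stay = [true_b[0] + sum(b * v for b, v in zip(true_b[1:], row))
                  for row in iv_rows]
coefs = fit_multiple_ols(iv_rows, length_of_stay)
print(coefs)  # should recover the invented coefficients up to rounding error
```

Each coefficient is the estimated change in the DV for a one-unit change in that IV with the other IVs held constant.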

4 Challenges of regression analysis with HMIS data
–Level of measurement: multiple regression requires a continuous DV, but many HMIS variables are NOT continuous. Most are nominal, e.g. race, zip code, gender, disability status.
–High levels of missing data in many variables: common in social services data sets. Requires careful evaluation of the extent and pattern of missing data, followed by selection and implementation of a missing data procedure. Nominal-level data add complexity.
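
Evaluating the extent and pattern of missing data can start with simple counts. The sketch below assumes records are held as dictionaries and that a missing value is stored as None or an empty string. The field names and client records are hypothetical, and a real HMIS export may code missing values differently.

```python
from collections import Counter

MISSING = (None, "")  # assumption: how missing values appear in the export


def missing_rates(records, fields):
    """Extent: share of records in which each field is missing."""
    n = len(records)
    return {f: sum(1 for r in records if r.get(f) in MISSING) / n for f in fields}


def missing_patterns(records, fields):
    """Pattern: how often each combination of missing fields occurs."""
    return Counter(
        tuple(f for f in fields if r.get(f) in MISSING) for r in records
    )


# Hypothetical client records; an income of 0 is a valid value, not missing
clients = [
    {"age": 34, "income": 0, "education_years": 12},
    {"age": None, "income": 450, "education_years": ""},
    {"age": 51, "income": None, "education_years": 10},
    {"age": 28, "income": None, "education_years": ""},
]
fields = ["age", "income", "education_years"]
rates = missing_rates(clients, fields)
patterns = missing_patterns(clients, fields)
print(rates)
print(patterns)
```

The pattern counts show whether the same records are missing several variables at once, which matters when choosing a missing data procedure.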

5 Assumptions of regression analysis
–Normality - scores or observations would be normally distributed in the population of interest. Assumed if sampling is random or includes random assignment. Generally not the case in HMIS data. (Explore with a frequency distribution.) Note - age is not quite normally distributed in this graph.
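
A frequency distribution can be tabulated directly, and a skewness value gives a rough numeric companion to the graph. This is a minimal sketch. The bin width, the function names, and the age and duration values are assumptions for illustration only.

```python
from collections import Counter


def frequency_distribution(values, bin_width):
    """Count values per bin; each key is the lower edge of its bin."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


def skewness(values):
    """Population skewness: 0 for a symmetric distribution, > 0 for a long right tail."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical values, not taken from the presentation's data sets
ages = [19, 22, 25, 31, 34, 38, 41, 45, 47, 52, 58, 63]
durations = [5, 10, 12, 20, 30, 45, 400]  # one very long stay

age_bins = frequency_distribution(ages, 10)
print(age_bins)
print(skewness(durations))  # positive: long right tail
```

A skewness near zero is consistent with symmetry but does not by itself establish normality, so the histogram is still worth inspecting.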

6 Equality of Variances - Homoscedasticity
–Points in the scatterplot of the residuals (the differences between the observed and predicted values) are randomly distributed about a horizontal line at the mean of the residuals.
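
The residual scatterplot is the primary check. As a rough numeric companion, the sketch below computes residuals and compares their variance in the lower and upper halves of the predicted values. The split-in-half comparison is a simplification introduced here, not a formal test, and the numbers are invented.

```python
def residuals(observed, predicted):
    """Residual = observed value minus predicted value."""
    return [o - p for o, p in zip(observed, predicted)]


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def spread_by_half(predicted, resid):
    """Variance of residuals for the lower and upper halves of the predicted values."""
    pairs = sorted(zip(predicted, resid))
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[half:]]
    return variance(low), variance(high)


# Invented values where the spread grows with the predicted value
predicted = [1, 2, 3, 4]
observed = [1.5, 1.5, 5, 2]
resid = residuals(observed, predicted)
low_var, high_var = spread_by_half(predicted, resid)
print(resid, low_var, high_var)
```

Very different variances in the two halves suggest the fan shape that signals unequal variances. Similar values are consistent with homoscedasticity.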

7 Independence of Observations
–Scores or observations are independent of each other. Independence means that the observations or values are independently derived and that one event or value does not depend on another event or value.
–Evaluate with the Durbin-Watson statistic: a value between 1.5 and 2.5 is consistent with independence.
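
The Durbin-Watson statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. It ranges from 0 to 4, with values near 2 indicating no first-order autocorrelation. A minimal sketch follows, assuming the residuals are listed in case order. The residual values are invented.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Invented residuals: alternating signs push the statistic above 2,
# runs of the same sign push it below 2
dw_alternating = durbin_watson([1, -1, 1, -1])
dw_runs = durbin_watson([1, 1, -1, -1])
print(dw_alternating, dw_runs)
```

Both invented series fall outside the 1.5 to 2.5 range given on the slide.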

8 Linearity - a linear relationship exists between variables. Evaluate with a scatterplot of DV and IV.

9 Two Methods Compared

10 Exploratory Research Question
–Is there a relationship between duration of homelessness (days) and age, years of education, and weight?
Method
–Regression analysis with SPSS and Excel using two data sets.
Intention
–Demonstrate the utility of these two tools in regression analysis with HMIS data.
–Demonstrate the challenges of regression analysis with HMIS data.

11 Steps of SPSS Regression - N = 1550
1. Report downloaded from HMIS data system.
2. Data cleaned and file prepared in Excel.
3. Excel file opened in SPSS.

12 Opening linear regression dialog window

13 Linear regression dialog window for stats

14 Linear regression dialog window for plots

15 Correlations measure the strength of the relationship between two variables. Correlation values range between 1.0 and -1.0. The closer to zero, the weaker the correlation. Note the weak correlations between the DV and each of the IVs.
Mean = average value for each variable. Standard deviation = measures the dispersion of values around the mean. Together they describe the center and spread of the distribution for each variable.
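
These descriptive statistics can be computed by hand. The sketch below uses the sample standard deviation (dividing by n - 1) and the Pearson correlation coefficient. The function names and values are illustrative assumptions, not figures from the SPSS output.

```python
import math


def mean(values):
    return sum(values) / len(values)


def sample_sd(values):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def pearson_r(x, y):
    """Pearson correlation: ranges from -1.0 to 1.0; near 0 means a weak linear relationship."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Invented values for illustration
r_perfect_positive = pearson_r([1, 2, 3], [2, 4, 6])
r_perfect_negative = pearson_r([1, 2, 3], [3, 2, 1])
sd_example = sample_sd([2, 4, 4, 4, 5, 5, 7, 9])
print(r_perfect_positive, r_perfect_negative, sd_example)
```

The two invented pairs sit at the ends of the range. The weak correlations noted on the slide would be values close to zero.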

16 Results - While the results may be significant, are there problems with the model? Note the R Square value. R Square indicates the proportion of variation in the DV explained by the IVs. In this model, an R Square of .031 means that the 3 IVs account for very little (about 3%) of the variance in the DV. The fact that the model is statistically significant may be due to the large N (1550).
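
The effect of a large N can be seen from the relationship between R Square and the F-stat for the overall model: F = (R² / k) / ((1 - R²) / (n - k - 1)), where k is the number of IVs. The sketch below applies it to the R Square of .031 with 3 IVs. It assumes all 1550 cases entered the model, which the slides do not confirm, and the N of 100 is a hypothetical comparison.

```python
def r_squared(observed, predicted):
    """Proportion of variation in the DV explained by the model."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def f_statistic(r2, n, k):
    """Overall model F for k IVs and n cases, computed from R Square."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


# Same R Square, different sample sizes
f_large_n = f_statistic(0.031, 1550, 3)  # assumes no cases were dropped
f_small_n = f_statistic(0.031, 100, 3)   # hypothetical smaller sample
print(f_large_n, f_small_n)
```

Holding R Square fixed, F grows roughly in proportion to N, so a model that explains about 3% of the variance can still test as significant in a large sample.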

17 What do the plots tell us? Distribution of the DV is highly skewed. Departure from the straight line indicates the data are not normally distributed.

18 Second SPSS regression analysis with smaller sample.
–N = 626
–Constrained data set limiting homelessness to > 1 month and < 1 year.
Is the distribution normal or skewed? The F-stat, used in regression to test the significance of the model, is quite robust to violations of the assumption of normality.

19 Results: Note the weak correlations between the DV and each of the IVs. Note there is no statistically significant bivariate correlation between the DV and any of the IVs.

20 Note the regression model is not significant. The results suggest that, for this sample, age, years of education, and weight cannot be used to predict duration (days) of homelessness.

21 What do the plots tell us? In this data set (N = 626), the distribution is much less skewed than in the N = 1550 data set. The data are more normally distributed than in the N = 1550 data set.

22 Steps of Excel regression analysis
1. Report downloaded from HMIS data system.
2. Data cleaned and file prepared in Excel.

23 Excel produced histograms to examine the shape of the distributions.

24 Steps of Excel regression analysis
–Can use the Chart Wizard to produce scatterplots to examine the bivariate correlation between the DV and the IV of the model.
There is a weak correlation between the variables. The variables are not correlated.

25 Steps of Excel regression analysis
1. Select Data Analysis under Tools in the menu bar.
2. If Data Analysis does not appear, select Add-ins, then the Analysis ToolPak.

26 Steps of Excel regression analysis
Specify the input range for the Y (DV) and the X (IV) variables. Check boxes for all plots.

27 Excel regression statistics are the same as the results from the second SPSS analysis.

28 Excel Regression Plots
Residual plots are used to check for regression assumptions. Significant patterns in the scatterplot suggest a violation of regression assumptions. Use to check for the Normality assumption.

29 Results Comparison
Stat            SPSS     Excel
R Square        .002     .00
F-stat          .506     .51
Sig.            .678     .68
Durbin-Watson   1.972    Not reported

30 Spreadsheet Data Analysis Resources

