Beyond Bivariate: Exploring Multivariate Analysis


Topics Covered
1. Logic of introducing a third variable
2. Multiple linear regression: which independent (predictor) variables are significantly related to the dependent (outcome) variable?
3. Logistic regression: a binary outcome variable

A Focal Relationship
Residential mobility and school achievement. This is a negative or inverse relationship: higher residential mobility → lower achievement. WHY?

The 0-Order Bivariate Relationship
We are going to call our initial bivariate relationship the 0-order relationship: residential mobility → school achievement.
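As an aside, here is a minimal Python sketch of a 0-order correlation, using simulated data; the variable names and numbers are invented purely to mimic a negative mobility/achievement relationship.

```python
# Minimal sketch of a 0-order (bivariate) relationship, using simulated data.
# 'moves' (residential mobility) and 'gpa' (school achievement) are hypothetical.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(42)
moves = rng.poisson(2, size=200)                         # residential moves per student
gpa = 3.0 - 0.25 * moves + rng.normal(0, 0.5, size=200)  # achievement falls as mobility rises

r, p = pearsonr(moves, gpa)
print(f"0-order correlation: r = {r:.2f}, p = {p:.3f}")  # expect a clearly negative r
```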

Spurious Relationship / Explanation
Could there be variables that are associated with high levels of residential mobility and with low school achievement, creating an apparent but spurious relationship between residential mobility and achievement, thus EXPLAINING AWAY the initial bivariate relationship?

Spurious Relationship
Do taller people like action movies more than shorter people do? What is the third variable?
Do days with high lemonade sales have more drowning fatalities than days with low lemonade sales? What is the third variable?
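To make the lemonade example concrete, here is a small simulation sketch (all variable names and data are invented): temperature drives both lemonade sales and drownings, so the 0-order correlation is large while the partial correlation, controlling for temperature, is close to zero.

```python
# Spurious relationship, simulated: temperature drives both lemonade sales and
# drownings, so their 0-order correlation is large while the partial correlation
# (controlling for temperature) is near zero. All names and numbers are invented.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
temperature = rng.normal(25, 8, 500)
lemonade = 10 + 2.0 * temperature + rng.normal(0, 5, 500)
drownings = 1 + 0.1 * temperature + rng.normal(0, 1, 500)

print("0-order r:", round(pearsonr(lemonade, drownings)[0], 2))

def residuals(y, x):
    """Remove the linear effect of x from y."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (intercept + slope * x)

r_partial = pearsonr(residuals(lemonade, temperature),
                     residuals(drownings, temperature))[0]
print("partial r, controlling for temperature:", round(r_partial, 2))
```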

Intervening Variables: Interpretation
What variables can you suggest that "go in between" residential mobility and school achievement and might help us understand our focal relationship better? These intervening variables do NOT explain away the relationship; they clarify why and how it comes about.

Intervening Variables: Interpretation Examples
Why do women have lower incomes than men? Maybe they have not acquired the technical and managerial skills that men have. Maybe they are less interested in promotions into management than men are. (These interpretations suggest that gender discrimination in salary decisions is not the only reason women have lower incomes than men.)

The Difference between Interpretation (Intervening) and Explanation (Spurious)
Gender → height and gender → movie preferences: gender, the third variable, explains away the spurious height → movie preference relationship.
Gender → career choices → income: career choices, the intervening third variable, contributes to interpreting the initial relationship between gender and income.

Specification or Interaction Effects
Sometimes when we introduce a third variable, we find that the initial bivariate (0-order) relationship is different for different categories of the third variable.

Specification: Examples [1]
In research on school achievement we (Prof. Bootcheck and I) looked at the relationship between living in a nuclear family and grades. For whites, this relationship was positive; for all other racial-ethnic categories, there was no relationship.

Specification: Examples [2]
Can you think of a variable we could introduce into our analysis of the relationship between residential mobility and school achievement that might show different bivariate relationships (one strong, one absent) for different categories of the third variable?

Specification in a Crosstab
In a crosstab, this specification or interaction effect shows up as a strong, significant relationship in the partial table for one category of the layer variable (the third variable) and a "Not Significant" result in the partial table for the other category. In other words, the chi-square for one partial table is significant, but the chi-square for the other partial table is not.
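A minimal sketch of how these partial-table chi-square tests might be run in Python with pandas and SciPy; the data frame and the column names (family_type, grades, race_ethnicity) are hypothetical placeholders.

```python
# Sketch of specification (interaction) in a crosstab: run the chi-square test
# separately within each category of the layer (third) variable. The data frame
# and the column names ('family_type', 'grades', 'race_ethnicity') are hypothetical.
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

def partial_table_tests(df, row, col, layer):
    """Print a chi-square test for each partial table defined by the layer variable."""
    for level, subset in df.groupby(layer):
        table = pd.crosstab(subset[row], subset[col])
        chi2, p, dof, _ = chi2_contingency(table)
        print(f"{layer} = {level}: chi-square = {chi2:.2f}, df = {dof}, p = {p:.3f}")

# Example with made-up survey data:
rng = np.random.default_rng(5)
n = 800
race_ethnicity = rng.choice(["white", "other"], n)
family_type = rng.choice(["nuclear", "other"], n)
# The family-type / grades relationship exists only in the 'white' subgroup.
p_high = np.where((race_ethnicity == "white") & (family_type == "nuclear"), 0.7, 0.5)
grades = np.where(rng.random(n) < p_high, "high", "low")
df = pd.DataFrame({"race_ethnicity": race_ethnicity,
                   "family_type": family_type,
                   "grades": grades})
partial_table_tests(df, row="family_type", col="grades", layer="race_ethnicity")
```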

Suppressed Effects [1]
Introducing a third variable can reveal suppressed effects: two effects that work in opposing directions and cancel each other out. Fictitious example: religious intensity and death penalty views. At the 0-order level, there appears to be no relationship.

Suppressed Effects [2]
When we introduce region (north or south), we see that the effects are opposite. For people living in the north of this fictitious country, high religious intensity goes with opposition to the death penalty; for people living in the south, high religious intensity goes with support for it. The two inverse, opposed relationships cancel each other out unless we break the data down by the regional variable.
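A small simulation sketch of suppression, under the same fictitious setup (all names and numbers invented): the pooled comparison shows almost no difference, while the within-region comparisons run in opposite directions.

```python
# Suppression, simulated: within each region the religiosity / death-penalty
# relationship is strong but runs in opposite directions, so the pooled table
# shows almost nothing. All variables and numbers are fictitious.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 2000
region = rng.choice(["north", "south"], n)
religious = rng.integers(0, 2, n)                      # 1 = high religious intensity
p_support = np.where(region == "south",
                     0.3 + 0.4 * religious,            # in the south, religiosity raises support
                     0.7 - 0.4 * religious)            # in the north, it lowers support
support = rng.binomial(1, p_support)

df = pd.DataFrame({"region": region, "religious": religious, "support": support})
print(df.groupby("religious")["support"].mean())               # pooled: barely any difference
print(df.groupby(["region", "religious"])["support"].mean())   # by region: opposite effects
```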

Final Possibility: Replication
It is possible that the initial bivariate relationship persists when we introduce the third variable: the partial tables for the categories of the third (layer) variable look just the same as the initial two-variable table.

Multivariate or Multiple Linear Regression
We specify two or more independent variables. Each may have a significant, perhaps moderate or even strong, correlation with the dependent variable. When they are placed together in the regression model, "only the strongest survive": predictors that do not have a relationship with the DV independent of their relationship with each other will not be significant in the model.
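A minimal sketch of a multiple linear regression in Python with statsmodels; the variables (mobility, family_income, achievement) and the simulated numbers are hypothetical.

```python
# Sketch of a multiple linear regression with statsmodels, on simulated data.
# 'mobility' and 'family_income' are correlated predictors of 'achievement';
# all names and numbers are hypothetical, chosen only to illustrate the output.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 300
family_income = rng.normal(50, 15, n)
mobility = 5 - 0.05 * family_income + rng.normal(0, 1, n)   # correlated with income
achievement = 60 + 0.4 * family_income - 1.0 * mobility + rng.normal(0, 8, n)

df = pd.DataFrame({"achievement": achievement,
                   "mobility": mobility,
                   "family_income": family_income})

X = sm.add_constant(df[["mobility", "family_income"]])
model = sm.OLS(df["achievement"], X).fit()
print(model.summary())   # coefficients, p-values, R-squared and adjusted R-squared
```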

Examples from the Country Data Set
Look at adjusted R². Which variables have significant coefficients? What do the relative sizes of the betas (standardized coefficients) tell you? Multivariate relationships are hard to visualize. Building models: all variables can be entered at the same time, or stepwise. See Nardi (2006, p. 97), cited in Garner (2010, p. 333).
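A sketch of standardized coefficients (betas) and adjusted R², assuming the simulated data frame from the previous sketch.

```python
# Standardized coefficients (betas): z-score each variable so the slopes are
# comparable in size. Continues from the simulated data frame 'df' built above.
import statsmodels.api as sm

cols = ["achievement", "mobility", "family_income"]
z = df[cols].apply(lambda s: (s - s.mean()) / s.std())

Xz = sm.add_constant(z[["mobility", "family_income"]])
betas = sm.OLS(z["achievement"], Xz).fit()
print(betas.params.round(2))                          # relative importance of predictors
print("adjusted R-squared:", round(betas.rsquared_adj, 2))
```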

Logistic Regression [1]
Currently, logistic regression is a very popular statistical technique. It involves a dichotomous (binary) outcome variable; we can compute the overall odds of the two possible outcomes of this variable. It then examines predictor variables (IVs) to see whether each one is related to a change in the odds from that overall level. EXAMPLE: Does growing up in a bilingual family raise or lower an individual's odds of completing high school, compared to the overall odds of doing so?
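A minimal logistic-regression sketch with statsmodels, using simulated data; the variables (completed_hs, bilingual_home, family_income) are hypothetical, and the exponentiated coefficients are read as odds ratios.

```python
# Sketch of a logistic regression with statsmodels on simulated data. The
# variables ('completed_hs', 'bilingual_home', 'family_income') are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 1000
bilingual_home = rng.binomial(1, 0.3, n)
family_income = rng.normal(50, 15, n)
logit_p = -2 + 0.3 * bilingual_home + 0.05 * family_income
completed_hs = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

X = sm.add_constant(pd.DataFrame({"bilingual_home": bilingual_home,
                                  "family_income": family_income}))
fit = sm.Logit(completed_hs, X).fit()
print(fit.summary())
print(np.exp(fit.params))   # odds ratios: above 1 raises the odds, below 1 lowers them
```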

Logistic Regression [2]
Independent variables need to be interval-ratio or dummy variables (a categoric variable broken down into binary variables). Alert: which categories are coded 0 and which are coded 1 for each of the binary variables? Negative coefficients mean lower odds: the odds ratio falls below 1.
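A small sketch of dummy coding with pandas; the region variable and the data are hypothetical.

```python
# Sketch of dummy coding a categoric predictor with pandas. drop_first=True leaves
# one category out as the reference (coded 0 on every dummy); the remaining column
# names show which categories are coded 1. 'region' and the data are hypothetical.
import pandas as pd

df = pd.DataFrame({"region": ["north", "south", "west", "south", "north"],
                   "income": [42, 55, 61, 48, 39]})
dummies = pd.get_dummies(df, columns=["region"], drop_first=True, dtype=int)
print(dummies.columns.tolist())   # e.g. ['income', 'region_south', 'region_west']
print(dummies)
```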

Logistic Regression: Example 1
Are income, race-ethnicity, gender, region, and religion related to a vote for the Republican presidential candidate? Which characteristics raise the odds of a Republican vote, and which lower them? Which categories are coded 1 and which 0? (This will make a difference in how to read the table of coefficients.)

Logistic Regression: Example 2
What individual characteristics are related to experiencing foreclosure on one's home? The outcome is binary (foreclosed or not foreclosed), so logistic regression is appropriate. Contrast this with a question that could be answered with linear regression: what neighbourhood characteristics are related to a high foreclosure rate?