# Selecting a Data Analysis Technique: The First Steps

Bivariate Analysis (two variables)
The questions we want to answer are these: Are these variables related, or are they independent of each other? Does the variability of one distribution tell us anything about the variability of the other? To select the right technique for answering these questions, we first have to determine THE LEVEL OF MEASUREMENT OF THE VARIABLES.

Both Variables are Categoric
If both variables are categoric variables (nominal, ordinal, or dichotomous), then we examine their relationship using:
- Crosstabs (we make a table)
- Chi-square (test of significance)
- Measures of association
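As a rough illustration of the three steps for categoric variables, here is a minimal Python sketch that builds a crosstab, computes the Pearson chi-square statistic, and reports Cramér's V as one possible measure of association. The function names and the data are made up for illustration; the slides do not say which measure of association to use.

```python
import math
from collections import Counter


def crosstab(row_var, col_var):
    """Count how often each (row category, column category) pair occurs."""
    counts = Counter(zip(row_var, col_var))
    row_levels = sorted(set(row_var))
    col_levels = sorted(set(col_var))
    table = [[counts[(r, c)] for c in col_levels] for r in row_levels]
    return row_levels, col_levels, table


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def cramers_v(table):
    """Cramér's V: 0 means no association, 1 means perfect association."""
    stat, _ = chi_square(table)
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(stat / (n * k))


# Hypothetical data: 80 respondents, residence by "adventurous eater" response
residence = ["urban"] * 40 + ["rural"] * 40
adventurous = (["agree"] * 30 + ["disagree"] * 10      # urban respondents
               + ["agree"] * 10 + ["disagree"] * 30)   # rural respondents

row_levels, col_levels, table = crosstab(residence, adventurous)
stat, df = chi_square(table)
v = cramers_v(table)
print(row_levels, col_levels, table)
print(f"chi-square = {stat:.2f}, df = {df}, Cramer's V = {v:.2f}")
```

With these made-up counts the statistic is 20 on 1 degree of freedom, well above the 3.841 critical value for the .05 level, so we would reject independence.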

Both Variables are Interval-Ratio
If both variables are interval-ratio variables (for the independent variable, that can also include dichotomous “dummy” variables):
- Look at the scatterplot. Does it look linear?
- Use linear regression analysis:
  - correlation coefficient (r)
  - ordinary least squares regression coefficient (Is it significant?)
  - the coefficient of determination (R²)
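The three regression quantities can be computed by hand for a small example. The sketch below assumes a single independent variable and uses made-up numbers; `bivariate_regression` is a hypothetical helper name, not part of any library. It also returns the t statistic for the slope, which would be compared against a t table with n − 2 degrees of freedom to judge significance.

```python
import math


def bivariate_regression(x, y):
    """Return r, the OLS slope and intercept, R-squared, and the slope's t statistic."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)

    r = sxy / math.sqrt(sxx * syy)          # correlation coefficient
    slope = sxy / sxx                       # OLS regression coefficient
    intercept = mean_y - slope * mean_x
    r_squared = r ** 2                      # coefficient of determination
    # t statistic for H0: slope = 0, with n - 2 degrees of freedom
    t_stat = r * math.sqrt((n - 2) / (1 - r_squared))
    return r, slope, intercept, r_squared, t_stat


# Hypothetical data for five cases
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r, slope, intercept, r_squared, t_stat = bivariate_regression(x, y)
print(f"r = {r:.3f}, slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"R^2 = {r_squared:.2f}, t = {t_stat:.2f} on {len(x) - 2} df")
```

In the bivariate case R² is simply the square of r, which is why the function derives one from the other.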

Independent Variable is Categoric, Dependent Variable is Interval-Ratio
- ANOVA: analysis of variance, comparing variance within groups and variance between groups
- A special type of “compare means” procedure (Are the means of the dependent variable different among the independent variable categories?)
- F-test of significance
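To make the within-groups versus between-groups comparison concrete, here is a minimal sketch of the one-way ANOVA F statistic in plain Python. The group labels and scores are invented, and `one_way_anova` is a hypothetical helper name.

```python
def mean(values):
    return sum(values) / len(values)


def one_way_anova(groups):
    """Return F and its degrees of freedom for a list of groups of scores."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)

    # Between-groups variability: how far each group mean is from the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups variability: how far each score is from its own group mean
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical test scores for three categories of the independent variable
scores = {
    "a lot": [2, 3, 4],
    "some": [4, 5, 6],
    "little": [6, 7, 8],
}

f_stat, df_between, df_within = one_way_anova(list(scores.values()))
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A large F means the group means differ by more than the spread inside the groups would lead us to expect; the value is then compared against an F table for the two degrees of freedom.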

Interval-Ratio Independent Variables, Dichotomous Dependent Variable
Use logistic regression. (This is not bivariate, but we’re just looking ahead.)

Selecting Data Analysis Techniques: Examples
- Is an individual’s religious choice (e.g., atheist, Buddhist, Catholic) related to self-description as an “adventurous eater” (agree, not sure, disagree)?
  - Crosstabs
- Are countries’ suicide rates related to their homicide rates?
  - Regression analysis
- Do individuals with different sexual orientations (heterosexual, gay/lesbian, bisexual) have different (mean) GPAs at this university?
  - ANOVA
- Is residential mobility (a lot, some, little) related to performance on standardized achievement tests?
  - ANOVA
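The selection rules from the preceding slides can be summarized as a small lookup. This is only a sketch of the decision logic as presented here: the function name and the level labels are assumptions, and combinations the slides do not cover raise an error rather than guess.

```python
CATEGORIC = {"nominal", "ordinal", "dichotomous"}


def select_technique(iv_level, dv_level):
    """Suggest a technique from the levels of measurement of the IV and DV."""
    if iv_level in CATEGORIC and dv_level in CATEGORIC:
        return "crosstabs"
    if dv_level == "interval-ratio":
        # A dichotomous IV can enter a regression as a 0/1 dummy variable
        if iv_level in {"interval-ratio", "dichotomous"}:
            return "regression"
        if iv_level in {"nominal", "ordinal"}:
            return "ANOVA"
    if iv_level == "interval-ratio" and dv_level == "dichotomous":
        return "logistic regression"
    raise ValueError(f"not covered by these slides: {iv_level} IV, {dv_level} DV")


# The four examples from this slide
print(select_technique("nominal", "ordinal"))                # religion, adventurous eater
print(select_technique("interval-ratio", "interval-ratio"))  # homicide rate, suicide rate
print(select_technique("nominal", "interval-ratio"))         # sexual orientation, GPA
print(select_technique("ordinal", "interval-ratio"))         # residential mobility, test scores
```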

Logistic Regression Examples (not bivariate)
- Are income, years of education, gender, and race-ethnicity (changed into 0/1 dummy variables) related to voting Republican or not-Republican?
- Are mother’s years of education, respondent’s household income, and % below poverty line at the high school attended related to a dichotomous variable: childbearing before high school graduation or not?
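For a feel of what logistic regression estimates, the sketch below fits a model with a single predictor by plain gradient ascent on the log-likelihood. It is a simplification under stated assumptions: the examples above use several independent variables, statistical packages use a different fitting routine, and the data, the learning rate, and the step count are all made up for illustration.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(x, y, learning_rate=0.1, steps=5000):
    """Estimate intercept b0 and slope b1 of P(y = 1) = sigmoid(b0 + b1 * x)."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(steps):
        grad0 = grad1 = 0.0
        for xi, yi in zip(x, y):
            error = yi - sigmoid(b0 + b1 * xi)
            grad0 += error
            grad1 += error * xi
        # Move the coefficients uphill on the average log-likelihood
        b0 += learning_rate * grad0 / n
        b1 += learning_rate * grad1 / n
    return b0, b1


# Hypothetical data: years of education and a 0/1 dependent variable
education = [8, 10, 12, 12, 14, 16, 16, 18]
outcome = [0, 0, 0, 1, 0, 1, 1, 1]

# Center the predictor so gradient ascent behaves well
mean_education = sum(education) / len(education)
centered = [e - mean_education for e in education]

b0, b1 = fit_logistic(centered, outcome)
p_low = sigmoid(b0 + b1 * (8 - mean_education))
p_high = sigmoid(b0 + b1 * (18 - mean_education))
print(f"slope = {b1:.3f}")
print(f"P(outcome = 1) at 8 years = {p_low:.2f}, at 18 years = {p_high:.2f}")
```

A positive slope means the predicted probability of the outcome rises with the predictor; the model always returns probabilities between 0 and 1, which is why it suits a dichotomous dependent variable.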

SPSS/PASW: What to Click
- Both variables categoric: Analyze–Descriptive Statistics–Crosstabs.
- Both variables interval-ratio: Graphs–Legacy Dialogs–Scatter/Dot (for scatterplot), and then Analyze–Regression–Linear and Analyze–Correlate.
- Categoric IV (with more than two categories) and interval-ratio DV: Analyze–Compare Means–One-Way ANOVA.

Warning: A Relationship Does NOT Mean Causality!
When we find that two variables are related to each other (using one of the data analysis techniques), that does NOT necessarily mean that the independent variable is a cause of the dependent variable. What do we mean by “cause”? To be continued….