APPLIED DATA ANALYSIS IN CRIMINAL JUSTICE CJ 525 MONMOUTH UNIVERSITY Juan P. Rodriguez
Perspective Research Techniques Accessing, Examining and Saving Data Univariate Analysis – Descriptive Statistics Constructing (Manipulating) Variables Association – Bivariate Analysis Association – Multivariate Analysis Comparing Group Means – Bivariate Multivariate Analysis - Regression
Lecture 6 Comparing Group Means Bivariate Analysis
Relationships between categorical and numerical variables ANOVA: Compares group means Test for significance Bar Charts and Box Plots Tests for Differences in means
One Way ANOVA How much the Mean Values of a Numerical Variable differ among the categories of a categorical variable
One Way ANOVA Example: Relationship between television viewing and marital status in GSS98 dataset TVHOURS: numerical variable – number of hours spent watching TV per day MARITAL: categorical variable – married, widowed, divorced, separated and never married
One Way ANOVA Null Hypothesis: No relationship - People in all groups watch, on average, the same amount of television Alternate Hypothesis: There is a relationship – At least 2 of the categories differ in the number of hours of television watched
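To make the two hypotheses concrete, here is a minimal sketch of the one-way ANOVA F statistic computed by hand. The hours below are made-up illustration values, not GSS98 data, and the helper name `one_way_anova_f` is hypothetical. F is the variation between group means divided by the variation within groups; a large F counts against the null hypothesis.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)              # number of categories
    n_total = len(all_values)    # total number of cases

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the individual values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical TVHOURS values for three MARITAL categories (not real GSS98 data).
tv_hours = {
    "married": [2, 3, 2, 3],
    "widowed": [4, 5, 4, 5],
    "never married": [3, 3, 4, 2],
}

f_stat, df1, df2 = one_way_anova_f(list(tv_hours.values()))
print(f"F({df1}, {df2}) = {f_stat:.2f}")  # prints F(2, 9) = 9.75
```

A standard F table gives about 4.26 as the 5% critical value for (2, 9) degrees of freedom, so with these invented numbers the null hypothesis of equal means would be rejected.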
Post-hoc Tests ANOVA found significant differences among means with respect to TV viewing. Are only 2 means significantly different? Are all of them significantly different? Or anything in between? Post-hoc tests tell us this.
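One simple post-hoc approach, swapped in here for illustration because the slides do not name a specific procedure, is pairwise comparison with a Bonferroni adjustment: each pair of groups is compared, and the significance level is divided by the number of pairs. The sketch below uses made-up data (not GSS98) and the pooled within-group variance from the ANOVA; with all five MARITAL categories there would be 10 pairs.

```python
from itertools import combinations
from math import sqrt
from statistics import mean

# Hypothetical TVHOURS values (not real GSS98 data).
groups = {
    "married": [2, 3, 2, 3],
    "widowed": [4, 5, 4, 5],
    "never married": [3, 3, 4, 2],
}

n_total = sum(len(g) for g in groups.values())
k = len(groups)

# Pooled within-group variance (the ANOVA mean square error).
ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)
mse = ss_within / (n_total - k)

pairs = list(combinations(groups, 2))
alpha_adjusted = 0.05 / len(pairs)  # Bonferroni: split alpha across all pairs

t_stats = {}
for a, b in pairs:
    diff = mean(groups[a]) - mean(groups[b])
    se = sqrt(mse * (1 / len(groups[a]) + 1 / len(groups[b])))
    t_stats[(a, b)] = diff / se
    print(f"{a} vs {b}: mean difference = {diff:.2f}, t = {t_stats[(a, b)]:.2f}")

print(f"{len(pairs)} pairs, each tested at alpha = {alpha_adjusted:.4f}")
```

Each t statistic would then be compared with a t distribution on N - k degrees of freedom at the adjusted alpha; a statistics package would normally supply that p-value.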
Assumptions in ANOVA Within each sample, the values are independent and identically normally distributed (same mean and variance). The samples are independent of each other. The different samples are all assumed to come from populations with the same variance, allowing for a pooled estimate of the variance. For a multiple comparisons test of the sample means to be meaningful, the populations are viewed as fixed, so that the populations in the experiment include all those of interest.
Assumptions of ANOVA Distributions are normal: The one-way ANOVA's F test is not affected much if the population distributions are skewed, unless the sample sizes are seriously unbalanced. If the sample sizes are balanced, the F test will not be seriously affected by light-tailedness or heavy-tailedness, unless the sample sizes are small (less than 5) or the departure from normality is extreme (kurtosis less than -1 or greater than 2). In cases of nonnormality, using a nonparametric test or a transformation may result in a more powerful test.
Assumptions of ANOVA Samples are independent: A lack of independence within a sample is often caused by the existence of an implicit factor in the data. Values collected over time may be serially correlated (here time is the implicit factor). If the data are in a particular order, consider the possibility of dependence. (If the row order of the data reflects the order in which the data were collected, an index plot of the data [data value plotted against row number] can reveal patterns that suggest possible time effects.)
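As a numeric companion to the index plot, the lag-1 autocorrelation measures how strongly each value resembles the one collected just before it. This is a minimal sketch with invented values; the function name `lag1_autocorrelation` is hypothetical. Values near zero are consistent with independence, while values far from zero in either direction suggest an order effect.

```python
from statistics import mean

def lag1_autocorrelation(values):
    """Correlation between each value and the next one in row order."""
    m = mean(values)
    numerator = sum(
        (values[i] - m) * (values[i + 1] - m) for i in range(len(values) - 1)
    )
    denominator = sum((v - m) ** 2 for v in values)
    return numerator / denominator

# Hypothetical measurements listed in the order they were collected.
trending = [1, 2, 3, 4, 5, 6, 7, 8]     # steady drift over time
alternating = [1, 8, 2, 7, 3, 6, 4, 5]  # high/low flip-flop pattern

r_trend = lag1_autocorrelation(trending)
r_alt = lag1_autocorrelation(alternating)
print(f"trending:    r1 = {r_trend:.3f}")
print(f"alternating: r1 = {r_alt:.3f}")
```

The drifting series gives a clearly positive value and the flip-flop series a clearly negative one; both patterns would also be visible in an index plot.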
Assumptions of ANOVA Variances are homogeneous: Assessed by examination of the relative size of the sample variances, either informally (including graphically) or by a robust variance test such as Levene's test. The risk of having unequal sample variances is incorrectly reporting a significant difference in the means when none exists. The risk is higher with greater differences between variances, particularly if there is one sample variance very much larger than the others.
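Both checks mentioned above can be sketched in a few lines. The informal check compares the sample variances directly; Levene's statistic W is the one-way ANOVA F statistic applied to the absolute deviations of each value from its group mean. The data are invented and the helper name `levene_w` is hypothetical; this version centers on the group mean (the variant that centers on the median is usually called Brown-Forsythe).

```python
from statistics import mean, variance

def levene_w(groups):
    """Levene's W: ANOVA F computed on absolute deviations from group means."""
    abs_dev = [[abs(x - mean(g)) for x in g] for g in groups]
    all_dev = [z for g in abs_dev for z in g]
    grand = mean(all_dev)
    k = len(groups)
    n_total = len(all_dev)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in abs_dev)
    ss_within = sum((z - mean(g)) ** 2 for g in abs_dev for z in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Hypothetical samples with visibly different spread.
samples = {
    "group A": [1, 2, 3, 4, 5],
    "group B": [2, 3, 3, 3, 4],
    "group C": [0, 3, 6, 9, 12],
}

# Informal check: relative size of the sample variances.
variances = {name: variance(values) for name, values in samples.items()}
ratio = max(variances.values()) / min(variances.values())
print("sample variances:", variances)
print(f"largest / smallest variance = {ratio:.1f}")

# Formal check: Levene's W, compared with F(k - 1, N - k).
w = levene_w(list(samples.values()))
print(f"Levene W = {w:.3f}")
```

Here one variance is far larger than the others, which is the situation the slide flags as riskiest. W would be compared with an F distribution on (2, 12) degrees of freedom.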
Assumptions of ANOVA Variances are homogeneous (continued): The F test is fairly robust against inequality of variances if the sample sizes are equal. If both nonnormality and unequal variances are present, use a transformation. A nonparametric test like the Kruskal-Wallis test still assumes that the population variances are comparable.
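For reference, the Kruskal-Wallis test replaces the raw values with their ranks in the pooled data and asks whether the rank sums differ across groups. The sketch below computes the H statistic on invented data; the helper names are hypothetical, and the correction factor for tied values is deliberately left out to keep the example short.

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for idx in order[i:j + 1]:
            ranks[idx] = avg
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """H statistic without the tie correction (an intentional simplification)."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

# Hypothetical samples with no tied values.
groups = [[1.0, 2.5, 3.0], [4.0, 5.5, 6.0], [7.0, 8.5, 9.0]]
h = kruskal_wallis_h(groups)
print(f"H = {h:.2f}")  # prints H = 7.20
```

H is usually compared with a chi-square distribution on k - 1 degrees of freedom, an approximation that is rough for groups this small.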
t Test - Paired Categories are related. Are rates of incarceration the same for blacks (PRC61) and whites (PRC58) in the states dataset? The assumption is that states with high incarceration rates will tend to have high rates for both blacks and whites.
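The paired t test works on the within-state differences rather than on the two columns separately. A minimal sketch follows, using invented rates for five hypothetical states; only the variable names PRC61 and PRC58 come from the slides, and the numbers are not from the states dataset.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical incarceration rates for five states (not the real states dataset).
prc61_black = [10, 12, 9, 14, 11]
prc58_white = [4, 5, 3, 6, 4]

# Each state contributes one difference, so the pairing is preserved.
diffs = [b - w for b, w in zip(prc61_black, prc58_white)]

n = len(diffs)
mean_diff = mean(diffs)
se_diff = stdev(diffs) / sqrt(n)  # standard error of the mean difference
t_stat = mean_diff / se_diff
df = n - 1

print(f"mean difference = {mean_diff:.2f}")
print(f"t({df}) = {t_stat:.2f}")
```

Because high-rate states tend to be high on both measures, the differences vary much less than the raw rates do, which is what makes the paired test more sensitive than treating the two columns as independent samples.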