12 Inferential Analysis.

12.1 Foundations of Analysis for Research Design
The analysis procedure you choose is based on your research design. All of the procedures in this chapter are based on the General Linear Model (GLM): a system of equations that is used as the mathematical framework for most of the statistical analyses used in applied social research.

12.2 Inferential Statistics
Inferential statistics: statistical analyses used to reach conclusions that extend beyond the immediate data alone. The GLM uses dummy variables. Dummy variable: a variable that uses discrete numbers, usually 0 and 1, to represent different groups in your study.

12.2 Inferential Statistics – Statistical Significance
Inferential analysis uses the GLM to estimate statistical significance. p value: an estimate of the probability of obtaining a result at least as extreme as yours if the null hypothesis is true. Statistical significance is not enough; we need an effect size as well.
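The slide does not name a specific effect size. As a minimal sketch, assuming a two-group comparison, one common choice is Cohen's d: the difference between the group means divided by the pooled standard deviation. The function name and the scores below are hypothetical and used only for illustration.

```python
import math
import statistics


def cohens_d(treated, control):
    """Standardized mean difference using the pooled standard deviation."""
    n_t, n_c = len(treated), len(control)
    var_t = statistics.variance(treated)  # sample variance (n - 1 denominator)
    var_c = statistics.variance(control)
    pooled_sd = math.sqrt(
        ((n_t - 1) * var_t + (n_c - 1) * var_c) / (n_t + n_c - 2)
    )
    return (statistics.mean(treated) - statistics.mean(control)) / pooled_sd


# Hypothetical posttest scores, for illustration only.
treated = [52, 55, 49, 58, 54, 56]
control = [48, 51, 47, 50, 53, 49]
d = cohens_d(treated, control)
```

Unlike a p value, d does not shrink or grow simply because the sample gets larger, which is why it is reported alongside the significance test.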

12.2 Statistical and Practical Significance
Table 12.1 Possible outcomes of a study with regard to statistical and practical significance

12.3 General Linear Model
Foundation for: the t-test, ANOVA, and ANCOVA; regression, factor, and cluster analyses; multidimensional scaling; discriminant function analysis; canonical correlation.

12.3 General Linear Model Assumptions
The GLM assumes that: 1) the relationships between variables are linear; 2) samples are random and independently drawn from the population; 3) variables have equal (homogeneous) variances; and 4) variables have normally distributed error.

12.3a The Two-Variable Linear Model
Figure 12.2 A bivariate plot. Figure 12.3 A straight-line summary of the data. Linear model: Any statistical model that uses equations to estimate lines.

12.3a The Straight Line Model
Figure 12.4 The straight-line model. Regression line: A line that describes the relationship between two or more variables. Regression analysis: A general statistical analysis that enables us to model relationships in data and test for treatment effects. In regression analysis, we model relationships that can be depicted in graphic form with lines that are called regression lines.

12.3a Estimates Using the Two-Variable Linear Model
Figure 12.5 The two-variable linear model. Figure 12.6 What the model estimates. Error term: A term in a regression equation that captures the degree to which the line is in error (that is, the residual) in describing each point.
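The figures are not reproduced in this transcript. As a rough sketch, assuming the usual form y = b0 + b1*x + e, the code below estimates the intercept (b0) and slope (b1) by least squares and then computes the residuals, which play the role of the error term. The function name and data are hypothetical.

```python
import statistics


def fit_line(x, y):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx             # slope
    b0 = mean_y - b1 * mean_x  # intercept
    return b0, b1


# Hypothetical pretest (x) and posttest (y) scores.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.0]
b0, b1 = fit_line(x, y)

# Residuals: the part of each y value that the line does not capture.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
```

With an intercept in the model, the least-squares residuals sum to zero, so the line passes through the point of means.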

12.3b The “General” in the General Linear Model

12.3b The “General” in the General Linear Model (cont’d.)
The GLM allows you to summarize a wide variety of research outcomes. The major problem for the researcher who uses the GLM is model specification: how to identify the equation that best summarizes the data for a study. If the model is misspecified, the estimates of the coefficients (the b-values) that you get from the analysis are likely to be biased.

12.3c Dummy Variables
Dummy variables enable you to use a single regression equation to represent multiple groups. They act like switches that turn various values on and off in an equation. Figure 12.7 Use of a dummy variable in a regression equation.

12.3c Using Dummy Variables
Figure 12.8 Using a dummy variable to create separate equations for each dummy variable value. Figure 12.9 Determine the difference between two groups by subtracting the equations generated through their dummy variables.
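The equations in these figures are not reproduced here. Assuming the model y = b0 + b1*Z + e with Z = 0 for the control group and Z = 1 for the treatment group, setting Z = 0 leaves y = b0 and setting Z = 1 gives y = b0 + b1, so subtracting the two equations leaves b1 as the group difference. The sketch below checks this numerically with hypothetical data and a hypothetical helper function.

```python
import statistics


def fit_line(z, y):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * z."""
    mean_z, mean_y = statistics.mean(z), statistics.mean(y)
    szy = sum((zi - mean_z) * (yi - mean_y) for zi, yi in zip(z, y))
    szz = sum((zi - mean_z) ** 2 for zi in z)
    b1 = szy / szz
    b0 = mean_y - b1 * mean_z
    return b0, b1


# Hypothetical posttest scores.
control = [48, 51, 47, 50, 53, 49]
treated = [52, 55, 49, 58, 54, 56]

# The dummy variable "switches" the b1 term off (0) or on (1).
z = [0] * len(control) + [1] * len(treated)
y = control + treated
b0, b1 = fit_line(z, y)

control_mean = statistics.mean(control)  # matches b0 (Z = 0)
mean_difference = statistics.mean(treated) - control_mean  # matches b1
```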

12.3d The t-Test
The t-test assesses whether the means of two groups (for example, the treatment and control groups) are statistically different from each other. Figure Idealized distributions for treated and control group posttest values.

12.3d Three Scenarios
Figure Three scenarios for differences between means.

12.3d Low-, Medium-, and High-Variability Scenarios
Table 12.2 shows the low-, medium-, and high-variability scenarios represented with data that correspond to each case. The first thing to notice about the three situations is that the difference between the means is the same in all three.

12.3d Difference Between the Means
When you are looking at the differences between scores for two groups, you have to judge the difference between their means relative to the spread or variability of their scores. The t-test does just this: it assesses whether the difference between the means of two groups is large relative to the variability of the scores in those groups.

12.3d Formula for the t-Test
Figure Formula for the t-test. (left) Figure Formula for the standard error of the difference between the means. (top right) Figure Final formula for the t-test. (bottom right)
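The formulas themselves are not reproduced in this transcript. The sketch below assumes the common form in which t is the difference between the group means divided by the standard error of the difference, sqrt(var_T/n_T + var_C/n_C). The data are hypothetical and are built so that both comparisons have the same 5-point mean difference but different variability.

```python
import math
import statistics


def t_value(treated, control):
    """Difference between means divided by the standard error of the difference."""
    se_diff = math.sqrt(
        statistics.variance(treated) / len(treated)
        + statistics.variance(control) / len(control)
    )
    return (statistics.mean(treated) - statistics.mean(control)) / se_diff


# Low-variability scenario: means of 55 and 50, tightly clustered scores.
low_t = [54, 55, 56, 55, 55]
low_c = [49, 50, 51, 50, 50]

# High-variability scenario: same means of 55 and 50, widely spread scores.
high_t = [40, 70, 55, 45, 65]
high_c = [35, 65, 50, 40, 60]

t_low = t_value(low_t, low_c)
t_high = t_value(high_t, high_c)
```

The same mean difference yields a much larger t when the scores vary little, which is the point of the low-, medium-, and high-variability scenarios.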

12.3d The t-Test: Key Terms
t-value: The estimate of the difference between the groups relative to the variability of the scores in the groups.
Standard error of the difference: A statistical estimate of the standard deviation one would obtain from the distribution of an infinite number of estimates of the difference between the means of two groups.
Variance: A statistic that describes the variability in the data for a variable. The variance is the spread of the scores around the mean of a distribution. Specifically, the variance is the sum of the squared deviations from the mean divided by the number of observations minus 1.
Standard deviation (SD): The spread or variability of the scores around their average in a single sample. The standard deviation is mathematically the square root of the variance. The standard deviation and variance both measure dispersion, but because the standard deviation is measured in the same units as the original measure and the variance is measured in squared units, the standard deviation is usually more directly interpretable and meaningful.
Alpha level (α): The p value selected as the significance level. Specifically, alpha is the Type I error rate: the probability of concluding that there is a treatment effect when, in reality, there is not.
Degrees of freedom (df): A statistical term that is a function of the sample size. In the t-test formula, for instance, the df is the number of persons in both groups minus 2.

12.3d The Regression Formula for the t-Test and ANOVA
Figure The regression formula for the t-test (and also the two-group one-way posttest-only Analysis of Variance or ANOVA model).

12.4a The Two-Group Posttest-Only Randomized Experiment
This design meets the following requirements: it has two groups; it uses a post-only measure; it has a distribution for each group on the response measure, each with an average and variation; and it assesses treatment effect as the statistical (non-chance) difference between the groups.

12.4a The Two-Group Posttest-Only Randomized Experiment (cont’d.)
Three tests meet these requirements, and they all yield the same results: the independent t-test, one-way ANOVA, and regression analysis.
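The equivalence can be checked numerically. The sketch below uses hypothetical data and function names, and it assumes the pooled (equal-variance) form of the t-test. It computes the t-value, the one-way ANOVA F ratio, and the t for the dummy-variable coefficient in the regression y = b0 + b1*Z + e. For two groups, F equals t squared and the regression t equals the t-test value.

```python
import math
import statistics


def pooled_t(a, b):
    """Independent t-test with a pooled (equal-variance) standard error."""
    n_a, n_b = len(a), len(b)
    pooled_var = (
        (n_a - 1) * statistics.variance(a) + (n_b - 1) * statistics.variance(b)
    ) / (n_a + n_b - 2)
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(a) - statistics.mean(b)) / se_diff


def anova_f(a, b):
    """One-way ANOVA F ratio for two groups (1 between-groups df)."""
    grand_mean = statistics.mean(a + b)
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in (a, b)
    )
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in (a, b) for v in g)
    df_within = len(a) + len(b) - 2
    return (ss_between / 1) / (ss_within / df_within)


def regression_t(a, b):
    """t for the slope b1 in y = b0 + b1*Z, with Z = 1 for a and 0 for b."""
    z = [1] * len(a) + [0] * len(b)
    y = a + b
    mean_z, mean_y = statistics.mean(z), statistics.mean(y)
    szz = sum((zi - mean_z) ** 2 for zi in z)
    b1 = sum((zi - mean_z) * (yi - mean_y) for zi, yi in zip(z, y)) / szz
    b0 = mean_y - b1 * mean_z
    ss_resid = sum((yi - (b0 + b1 * zi)) ** 2 for zi, yi in zip(z, y))
    se_b1 = math.sqrt(ss_resid / (len(y) - 2) / szz)
    return b1 / se_b1


# Hypothetical posttest scores.
treated = [52, 55, 49, 58, 54, 56]
control = [48, 51, 47, 50, 53, 49]
t = pooled_t(treated, control)
f = anova_f(treated, control)
t_reg = regression_t(treated, control)
```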

12.4b Factorial Design Analysis
Analysis requires results for two main effects and one interaction effect in a 2 x 2 factorial design. Figure Regression model for a 2 x 2 factorial design. Main effect: An outcome that shows consistent differences between all levels of a factor. Interaction effect: An effect that occurs when differences on one factor depend on which level you are on another factor.
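The regression model in the figure is not reproduced here. A minimal sketch, assuming the form y = b0 + b1*Z1 + b2*Z2 + b3*Z1*Z2 + e with 0/1 dummy codes for the two factors: because this model has one coefficient per cell, the least-squares coefficients can be read directly from the four cell means. The data are hypothetical.

```python
import statistics

# Hypothetical outcome scores keyed by the dummy codes (Z1, Z2).
cells = {
    (0, 0): [10, 12, 11],
    (1, 0): [14, 15, 16],
    (0, 1): [13, 12, 14],
    (1, 1): [20, 22, 21],
}
m = {codes: statistics.mean(scores) for codes, scores in cells.items()}

b0 = m[(0, 0)]                                      # reference cell mean
b1 = m[(1, 0)] - m[(0, 0)]                          # Z1 effect when Z2 = 0
b2 = m[(0, 1)] - m[(0, 0)]                          # Z2 effect when Z1 = 0
b3 = m[(1, 1)] - m[(1, 0)] - m[(0, 1)] + m[(0, 0)]  # interaction


def predict(z1, z2):
    """Predicted cell mean from the fitted coefficients."""
    return b0 + b1 * z1 + b2 * z2 + b3 * z1 * z2
```

Note that with 0/1 coding, b1 and b2 describe each factor's effect at the reference level of the other factor. A nonzero b3 means the effect of one factor depends on the level of the other.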

12.4c Randomized Block Analysis
The dummy variable Z1 represents the treatment group. The other dummy variables indicate the blocks. The beta values (β) are the coefficients for the corresponding treatment and block variables. Figure Regression model for a Randomized Block design.

12.4d Analysis of Covariance
Analysis of Covariance (ANCOVA): an analysis that estimates the difference between the groups on the posttest after adjusting for differences on the pretest. Figure Regression model for the ANCOVA.
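The figure is not reproduced in this transcript. The sketch below assumes the model y = b0 + b1*X + b2*Z + e, where X is the pretest, Z is the treatment dummy, and b2 is the adjusted group difference. For this model, the least-squares slope b1 is the pooled within-group slope, and the two intercepts follow from the group means. Function names and data are hypothetical.

```python
import statistics


def _centered_sums(x, y):
    """Return mean of x, mean of y, and centered cross-product and square sums."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return mean_x, mean_y, sxy, sxx


def ancova_estimates(pre_c, post_c, pre_t, post_t):
    """Estimate (b0, b1, b2) in post = b0 + b1*pre + b2*Z (Z = 1 if treated)."""
    mx_c, my_c, sxy_c, sxx_c = _centered_sums(pre_c, post_c)
    mx_t, my_t, sxy_t, sxx_t = _centered_sums(pre_t, post_t)
    b1 = (sxy_c + sxy_t) / (sxx_c + sxx_t)  # pooled within-group slope
    b0 = my_c - b1 * mx_c                   # control-group intercept
    b2 = (my_t - b1 * mx_t) - b0            # adjusted treatment effect
    return b0, b1, b2


# Hypothetical pretest and posttest scores for each group.
pre_control, post_control = [10, 12, 14, 16], [15.5, 16.8, 19.2, 20.9]
pre_treated, post_treated = [11, 13, 15, 17], [19.1, 21.3, 22.8, 25.2]
b0, b1, b2 = ancova_estimates(
    pre_control, post_control, pre_treated, post_treated
)
```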

12.5 Quasi-Experimental Analysis
Quasi-experimental designs still use the GLM, but it has to be adjusted for measurement error: any influence on an observed score not related to what you are attempting to measure. This adjustment for error makes the analyses more complicated.

12.5a Nonequivalent Groups Analysis
Figure Formula for adjusting pretest values for unreliability in the reliability-corrected ANCOVA. Figure The regression model for the reliability-corrected ANCOVA.
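The adjustment formula appears only in a figure that this transcript does not reproduce. The sketch below assumes the commonly cited form X_adj = mean(X) + r*(X - mean(X)), where r is an estimate of the pretest's reliability and the adjustment is applied separately within each group. The function name, reliability value, and scores are hypothetical.

```python
import statistics


def adjust_pretest(pretest, reliability):
    """Shrink each pretest score toward its group mean by the reliability."""
    mean_x = statistics.mean(pretest)
    return [mean_x + reliability * (x - mean_x) for x in pretest]


# Hypothetical pretest scores for one group, with an assumed reliability of 0.5.
pretest = [40, 50, 60]
adjusted = adjust_pretest(pretest, reliability=0.5)
```

The adjusted values keep the group mean but have less spread. They would then replace the raw pretest scores in the ANCOVA regression.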

12.5b Regression-Discontinuity Analysis
Figure Adjusting the pretest by subtracting the cutoff in the Regression-Discontinuity (RD) analysis model. Figure The regression model for the basic regression-discontinuity design.
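Neither figure is reproduced here. The sketch below assumes the basic model y = b0 + b1*(X - cutoff) + b2*Z + e, and it also assumes that units at or above the cutoff receive the treatment (the direction of assignment is not stated in the source). Subtracting the cutoff moves the zero point of the pretest to the cutoff, so b2 estimates the jump between the two regression lines exactly there. The data are hypothetical and were built with a jump of 4.

```python
import statistics


def rd_estimates(pre, post, cutoff):
    """Estimate (b0, b1, b2) in post = b0 + b1*(pre - cutoff) + b2*Z."""
    groups = {0: ([], []), 1: ([], [])}
    for x, y in zip(pre, post):
        centered = x - cutoff            # pretest adjusted by the cutoff
        z = 1 if centered >= 0 else 0    # assumed assignment rule
        groups[z][0].append(centered)
        groups[z][1].append(y)

    sxy = sxx = 0.0
    means = {}
    for z, (xs, ys) in groups.items():
        mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
        means[z] = (mean_x, mean_y)
        sxy += sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))
        sxx += sum((xi - mean_x) ** 2 for xi in xs)

    b1 = sxy / sxx                               # common slope
    b0 = means[0][1] - b1 * means[0][0]          # comparison line at the cutoff
    b2 = (means[1][1] - b1 * means[1][0]) - b0   # jump at the cutoff
    return b0, b1, b2


# Hypothetical data: post = 20 + 0.5*(pre - 50) + 4*Z, with no noise.
pre = [40, 44, 48, 52, 56, 60]
post = [15, 17, 19, 25, 27, 29]
b0, b1, b2 = rd_estimates(pre, post, cutoff=50)
```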

12.5c Regression Point Displacement Analysis
Figure The regression model for the RPD design assuming a linear pre-post relationship.

Summary
Table Summary of the statistical models for the experimental and quasi-experimental research designs.

Discuss and Debate
What is the difference between statistical significance and practical significance? Give an example to support your answer.
Discuss how the four assumptions underlying the GLM impact the data analysis process.
Statistical significance tells us how likely it is that a difference between groups this large would arise by chance alone. Practical significance tells us the degree to which the results have meaning in real life. Examples will vary.
By running the descriptive statistics first, researchers can check the data to be sure it conforms to the four assumptions: 1) the relationships between variables are linear; 2) samples are random and independently drawn from the population; 3) variables have equal (homogeneous) variances; and 4) variables have normally distributed error. A researcher must test these assumptions, or conclusion validity will be threatened.
