EVAL 6970: Experimental and Quasi-Experimental Designs

1 EVAL 6970: Experimental and Quasi-Experimental Designs
Dr. Chris L. S. Coryn, Kristin A. Hobson, Fall 2013

2 Agenda
Randomized experiments

3 Important Caveats

4 Caveats
Not every phenomenon of interest or value can be studied experimentally. Many variables of interest cannot be manipulated or isolated in the way experiments require. This includes most trait variables, particularly gender, race, and ethnicity. These variables can still be the subject of cause-probing studies, but they cannot be manipulated in the formal sense.

5 Caveats
Many phenomena of interest also cannot be manipulated or isolated for ethical reasons. Investigators cannot withhold potentially effective treatments from participants (example: the Tuskegee syphilis study). Nor can they assign participants to potentially harmful conditions, whether physiological (requiring participants to smoke, or exposing them to a pathogen) or psychological (the Stanford prison experiment).

6 Theory of Random Assignment

7 Random Assignment
Random assignment is any procedure by which units are assigned to (selected into) conditions based only on chance. Each unit has a known, nonzero probability of being assigned to each condition. This method of assignment reduces the plausibility of many alternative explanations for observed effects, particularly selection. By definition, randomization rules out selection threats: random chance cannot introduce systematic bias into the selection process. However, this works best with large samples. Random assignment distributes systematic differences (biases) equally over groups, in expectation, on every variable, whether observed or not. This is why random assignment is superior even to pretesting with statistical matching.
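
The procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the participant IDs, the condition names, and the `randomly_assign` helper are hypothetical and do not come from the slides.

```python
import random

def randomly_assign(units, conditions, seed=None):
    """Assign units to conditions by chance alone, keeping group sizes as equal as possible."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)  # chance, not the investigator, decides the order
    groups = {condition: [] for condition in conditions}
    for index, unit in enumerate(shuffled):
        # Deal the shuffled units out to the conditions in turn.
        groups[conditions[index % len(conditions)]].append(unit)
    return groups

# Hypothetical participant IDs; with two conditions, each unit has a
# known probability of 1/2 of landing in either one.
participants = [f"P{i:02d}" for i in range(1, 21)]
groups = randomly_assign(participants, ["treatment", "control"], seed=2013)
```

Shuffling first and then dealing units out keeps group sizes balanced while still giving every unit the same known chance of each condition.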

8 Random Assignment
Unlike other controls for validity threats (like pretests and nonequivalent dependent variables), random assignment yields unbiased estimates of average treatment effects. Here, unbiased means that any between-group differences are due solely to chance, rather than to systematic sources of error. Regression discontinuity also yields unbiased effect estimates, but randomized experiments are more flexible, and their analysis is often more straightforward.

9 Random Assignment versus Random Sampling
Random assignment is not the same thing as random sampling; the two procedures serve entirely different purposes.
Random sampling: places units from the population into the sample; makes the sample more representative of the population; strengthens external validity.
Random assignment: places units from the sample into treatment conditions; makes the groups equivalent to each other; strengthens internal validity.
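
The distinction can be made concrete with a short sketch. The population of 1,000 units, the sample of 60, and the group sizes are hypothetical values chosen only for illustration.

```python
import random

rng = random.Random(6970)

# Hypothetical population of 1,000 units.
population = [f"unit_{i}" for i in range(1000)]

# Random sampling: population -> sample (supports external validity).
sample = rng.sample(population, k=60)

# Random assignment: sample -> conditions (supports internal validity).
shuffled = sample[:]
rng.shuffle(shuffled)
treatment, control = shuffled[:30], shuffled[30:]
```

A study can use either step without the other. Many randomized experiments assign a convenience sample at random, which supports internal validity but not external validity.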

10 Why Randomization Works
It reduces the plausibility of threats to validity by distributing them randomly over conditions. It equates groups on the expected value of all variables at pretest, regardless of whether those variables are measured. It allows the selection process to be completely known and completely modeled; this property is unique to randomized experiments and regression discontinuity designs. It allows valid estimation of error variance that is orthogonal to treatment. It ensures that alternative causes are not confounded with a unit's treatment condition.

11 Why Randomization Works
Groups are equated before treatment, eliminating pretest selection differences as a plausible cause of posttest differences. The posttest of the control group serves as a very good counterfactual for the treatment group posttest. Threats are randomly distributed over conditions, so control and treatment units have the same average characteristics in expectation. The only remaining systematic difference between conditions is treatment. Note that random assignment equates groups on expectation: the groups are equal on average over many possible randomizations, not necessarily in any single one.
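
A small simulation can illustrate what "equates groups on expectation" means. This is a sketch under assumptions: the 40 pretest scores are simulated (mean 50, SD 10), and the `pretest_gap` helper is hypothetical.

```python
import random

def mean(values):
    return sum(values) / len(values)

def pretest_gap(scores, rng):
    """Randomly split the units in half and return the difference in pretest means."""
    shuffled = scores[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return mean(shuffled[:half]) - mean(shuffled[half:])

rng = random.Random(11)
pretest_scores = [rng.gauss(50, 10) for _ in range(40)]  # hypothetical pretest scores

# Re-randomize the same 40 units many times and record the pretest gap each time.
gaps = [pretest_gap(pretest_scores, rng) for _ in range(5000)]
average_gap = mean(gaps)                     # near 0: groups are equal on expectation
largest_gap = max(abs(gap) for gap in gaps)  # any single randomization can still be unlucky
```

The average gap across randomizations is close to zero, while individual randomizations can leave a noticeable pretest difference between groups.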

12 Randomization Doesn’t Fix Everything
First and foremost, randomization works best in large samples. The smaller the sample, the more likely it is that sizable chance differences remain between groups. Attrition is the largest threat to randomized experiments (as selection is to quasi-experiments). Attrition is often differential: there are usually differences between those who remain in a study and those who drop out. Randomization does nothing to prevent maturation, and it cannot prevent historical events from affecting groups (likewise, pretests can still cause a testing effect, and changes in instrumentation can still occur). However, random assignment does reduce the likelihood that these threats are confounded with treatment effects.
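
The large-sample point can be checked with a rough simulation. The covariate is simulated as standard normal, and the sample sizes, number of replications, and `typical_imbalance` helper are hypothetical choices for illustration.

```python
import random

def typical_imbalance(n_units, n_replications=500, seed=12):
    """Average absolute difference in covariate means between two randomly formed groups."""
    rng = random.Random(seed)
    half = n_units // 2
    gaps = []
    for _ in range(n_replications):
        scores = [rng.gauss(0, 1) for _ in range(n_units)]  # simulated covariate
        rng.shuffle(scores)  # random assignment: first half vs. second half
        first, second = scores[:half], scores[half:]
        gaps.append(abs(sum(first) / len(first) - sum(second) / len(second)))
    return sum(gaps) / len(gaps)

# Chance imbalance shrinks as the number of randomized units grows.
imbalance_by_n = {n: typical_imbalance(n) for n in (10, 100, 1000)}
```

With only 10 units the groups typically differ by a large fraction of a standard deviation; with 1,000 units the chance difference is small.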

13 Randomization Doesn’t Fix Everything
Attrition also affects statistical power, because it reduces the number of units that remain in a study. A priori power analysis provides information about how many units are necessary to detect a minimum detectable effect size (MDES). Oversampling can help avoid loss of power due to attrition. As a general guideline, include 25%-50% more participants than would be required for minimum power, so that when you lose participants, you can still detect the expected effect.
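
The oversampling guideline is simple arithmetic, sketched below. The required sample size of 128 and the 20% attrition rate are hypothetical numbers, not results of an actual power analysis. Rates are passed as whole percentages to keep the arithmetic exact.

```python
import math

def recruitment_target(required_n, oversample_percent):
    """Number of units to recruit when oversampling by the given percentage."""
    return math.ceil(required_n * (100 + oversample_percent) / 100)

def expected_completers(recruited_n, attrition_percent):
    """Units expected to remain after the given percentage drops out."""
    return recruited_n * (100 - attrition_percent) // 100

required_n = 128  # hypothetical result of an a priori power analysis

# The 25%-50% guideline from the slide, applied to the hypothetical requirement.
recruitment_range = (recruitment_target(required_n, 25),
                     recruitment_target(required_n, 50))

# With 20% attrition (hypothetical), recruiting at the low end still leaves 128 completers.
remaining = expected_completers(recruitment_range[0], 20)
```

Under these assumed numbers, recruiting 160 to 192 participants protects the planned power against attrition of up to roughly 20% to 33%.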

14 Randomization and Units
A unit can be viewed as an opportunity to apply or withhold treatment. Units can be individuals (like people or animals) or higher-order aggregates (like families, job sites, or classrooms). It is often easier to obtain the required power with individual units. Higher-order or nested units sometimes require larger sample sizes, because power is based on the unit of randomization. If the higher-order unit were classrooms, for instance, increasing power would require a larger number of classrooms, not just more students per classroom.
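
One common way to quantify this is the design effect, 1 + (m - 1) x ICC, where m is the cluster size and ICC is the intraclass correlation. This formula is not on the slide; it is brought in here as an assumption to illustrate the point. The classroom counts, class sizes, and ICC of 0.2 are hypothetical.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from randomizing intact clusters (assumes equal cluster sizes)."""
    return 1 + (cluster_size - 1) * icc

def effective_sample_size(n_clusters, cluster_size, icc):
    """Number of independent units the clustered sample is roughly worth."""
    return n_clusters * cluster_size / design_effect(cluster_size, icc)

icc = 0.2  # hypothetical intraclass correlation

baseline = effective_sample_size(20, 25, icc)         # 20 classrooms of 25 students
more_classrooms = effective_sample_size(40, 25, icc)  # double the classrooms
more_students = effective_sample_size(20, 50, icc)    # double the students per classroom
```

Both alternatives enroll 1,000 students, but under these assumptions doubling the number of classrooms roughly doubles the effective sample size, while doubling class size adds very little.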

15 Limitations of Randomization
Randomized experiments are often considered the gold standard of cause-probing studies. However, randomized experiments are most useful for answering questions about local molar causation. The valid generalization of results from randomized experiments relies on correspondence between the units sampled and the population of interest.

16 Basic Designs

17 Basic Designs
Note that none of these designs use pretests.
Basic design
    R   X    O
    R        O
Two treatments
    R   XA   O
    R   XB   O
Good if treatment A is known to be effective; otherwise there is no way to determine whether both were equally effective or equally ineffective.
Two treatments and a control
    R   XA   O
    R   XB   O
    R        O
Why would you skip pretests? There might be a concern over sensitization (i.e., a testing effect). Administration might be infeasible. The variable of interest might be a constant, as in studies of mortality (all patients are alive at the start).
Why use pretests? Pretests allow you to study attrition: do those who drop out of one condition differ from those who drop out of another?
Although groups are assumed to be equated…the problem is…

18 Basic Designs
These designs can be used for dismantling studies (the study of specific components or parts of a treatment). They are also used for dose-response studies (differing doses of the same treatment).
Pretest-posttest
    R   O   X    O
    R   O        O
Alternative-treatments
    R   O   XA   O
    R   O   XB   O
This design allows investigators to explore attrition. It also results in increased power, because pretests can be used as covariates in ANCOVA.
Two treatments and a control
    R   O   XA   O
    R   O   XB   O
    R   O        O
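
The power gain from using the pretest as a covariate can be illustrated with a simulation. Everything here is an assumption made for the sketch: the simulated score distributions, the true effect of 5 points, and the helper names. The adjusted estimate uses the pooled within-group slope, one classical way to compute an ANCOVA-adjusted difference.

```python
import random
import statistics

def mean(values):
    return sum(values) / len(values)

def simulate_trial(rng, n_per_group=50, true_effect=5.0):
    """Return lists of (pretest, posttest) pairs for a treatment and a control group."""
    def unit(effect):
        pretest = rng.gauss(50, 10)
        posttest = pretest + effect + rng.gauss(0, 5)  # posttest tracks pretest
        return pretest, posttest
    treated = [unit(true_effect) for _ in range(n_per_group)]
    control = [unit(0.0) for _ in range(n_per_group)]
    return treated, control

def unadjusted_effect(treated, control):
    """Posttest-only estimate: simple difference in posttest means."""
    return mean([y for _, y in treated]) - mean([y for _, y in control])

def ancova_effect(treated, control):
    """Posttest difference adjusted for pretest using the pooled within-group slope."""
    sxy = sxx = 0.0
    group_means = []
    for group in (treated, control):
        mx = mean([x for x, _ in group])
        my = mean([y for _, y in group])
        sxy += sum((x - mx) * (y - my) for x, y in group)
        sxx += sum((x - mx) ** 2 for x, _ in group)
        group_means.append((mx, my))
    slope = sxy / sxx
    (mx_t, my_t), (mx_c, my_c) = group_means
    return (my_t - my_c) - slope * (mx_t - mx_c)

rng = random.Random(18)
trials = [simulate_trial(rng) for _ in range(400)]
unadjusted = [unadjusted_effect(t, c) for t, c in trials]
adjusted = [ancova_effect(t, c) for t, c in trials]

# Both estimators center on the true effect; the adjusted one varies less across trials.
sd_unadjusted = statistics.stdev(unadjusted)
sd_ancova = statistics.stdev(adjusted)
```

A smaller spread of estimates across simulated trials means a smaller standard error, which is where the added power comes from.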

19 Factorial Designs
In a factorial design, two or more independent variables (factors) are investigated concurrently. Each factor must have at least 2 levels (treatment/control, low dose/high dose, etc.). The number of factors and the number of levels within each factor determine the number of cells in the design: there are 4 cells in a 2 x 2 design, 8 cells in a 2 x 2 x 2 design, and 12 cells in a 3 x 2 x 2 design. The main advantage of factorial designs is that the joint contribution of two or more independent variables can be studied simultaneously (rather than requiring two or more separate studies).
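
The cell counts above are the product of the number of levels in each factor, which a short sketch can confirm. The `design_cells` helper and the tuple representation of a design are hypothetical conveniences.

```python
from itertools import product
from math import prod

def design_cells(levels_per_factor):
    """Enumerate every cell of a factorial design as a tuple with one level per factor."""
    return list(product(*(range(1, levels + 1) for levels in levels_per_factor)))

# Number of cells = product of the number of levels in each factor.
cell_counts = {design: prod(design) for design in [(2, 2), (2, 2, 2), (3, 2, 2)]}

# The four cells of a 2 x 2 design as (level of A, level of B).
cells_2x2 = design_cells((2, 2))
```

Each tuple in `cells_2x2` identifies one treatment combination to which units could be randomly assigned.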

20 Basic Factorial Design
    R   XA1B1   O
    R   XA1B2   O
    R   XA2B1   O
    R   XA2B2   O
2 x 2 factorial design: Factor A (Level 1 and Level 2) and Factor B (Level 1 and Level 2), which results in four cells: A1B1, A1B2, A2B1, and A2B2.
                      Factor B, Level 1     Factor B, Level 2
Factor A, Level 1     Cell A1B1             Cell A1B2             Row mean for A1
Factor A, Level 2     Cell A2B1             Cell A2B2             Row mean for A2
                      Column mean for B1    Column mean for B2

21 Notation for Factorial Designs
Example: a 2 x 3 x 4 design.
The number of numbers here is the number of factors in the design: 3.
The numbers themselves indicate the number of levels in each factor: 2, 3, and 4.

22 Main Effects and Interactions
In factorial designs we also discuss main effects and interactions. In a 2 x 2 design there are two main effects (one for Factor A and one for Factor B) and one interaction (Factor A x Factor B). Main effects reflect the separate treatment effect of one independent variable (i.e., factor), averaged over the levels of the other independent variables. Interactions occur when treatment effects are not constant, but vary over the levels of another factor. When the effect of one factor depends on the level of another, that other factor is sometimes referred to as a moderator.
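
These definitions can be worked through with the cell means of a 2 x 2 design. The four cell means below are hypothetical numbers chosen for illustration, and the sketch assumes equal cell sizes.

```python
# Hypothetical cell means for a 2 x 2 design, keyed by (level of A, level of B).
cell_means = {
    ("A1", "B1"): 10.0, ("A1", "B2"): 14.0,
    ("A2", "B1"): 12.0, ("A2", "B2"): 22.0,
}

def marginal_mean(factor_index, level):
    """Average the cell means over the levels of the other factor."""
    values = [m for cell, m in cell_means.items() if cell[factor_index] == level]
    return sum(values) / len(values)

# Main effects: difference between marginal (row or column) means.
main_effect_a = marginal_mean(0, "A2") - marginal_mean(0, "A1")
main_effect_b = marginal_mean(1, "B2") - marginal_mean(1, "B1")

# Simple effects of B at each level of A; they differ when there is an interaction.
effect_of_b_at_a1 = cell_means[("A1", "B2")] - cell_means[("A1", "B1")]
effect_of_b_at_a2 = cell_means[("A2", "B2")] - cell_means[("A2", "B1")]
interaction = effect_of_b_at_a2 - effect_of_b_at_a1
```

Here the effect of B is 4 points at A1 but 10 points at A2, so the effect of B is not constant over the levels of A.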

23 Example
As an example, consider a 2 x 3 factorial design.
Factor A is Gender, with 2 levels: male and female. Factor B is Age, with 3 levels: young, middle-aged, and old. The outcome variable is performance on a mathematical aptitude test. Is this a randomized experiment? What would random assignment of participants to conditions look like? If the performance of the male group differs as a function of age (that is, males perform worse as age increases), but the performance of the female group is consistent across age groups, then there is an Age x Gender interaction.

24 Longitudinal Designs
    R   O ... O   X   O   O ... O
    R   O ... O       O   O ... O
Longitudinal designs allow investigators to study how effects change over time, and they add power to small samples. They are similar to a time series, but with fewer pretest and posttest observations. They can be used to study different outcomes over time that are causally related, for example: aspirations → expectations → achievement → educational success → quality of life (e.g., income, status). However, attrition is a serious problem, and it can be unethical to withhold effective treatment for a long period of time.

25 Crossover Designs
    R   O   XA   O   XB   O
    R   O   XB   O   XA   O
Crossover designs allow counterbalancing and assessment of order effects. After the first posttest, units cross over to receive the treatment they did not previously get. The effects of the first treatment must dissipate before another begins (otherwise, the effects of later treatments are confounded with carryover from earlier ones). This is essentially a variation of the factorial design. It is also a variant of the Latin squares design, in which all units and all possible orders of a treatment are presented in a within-subjects design. If there were three treatment conditions (A, B, and C), there would be 6 possible orders (ABC, ACB, BAC, BCA, CAB, and CBA), so subjects would be divided into 6 groups.
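
The six orders for three treatments can be enumerated directly, and subjects can then be randomized to orders. The 12 subject IDs and the equal-sized groups are hypothetical choices for this sketch.

```python
import random
from itertools import permutations

treatments = ["A", "B", "C"]

# Every possible order in which a subject could receive the three treatments.
orders = ["".join(order) for order in permutations(treatments)]

# Randomly assign hypothetical subjects to orders, two subjects per order.
rng = random.Random(25)
subjects = [f"S{i:02d}" for i in range(1, 13)]
rng.shuffle(subjects)
subjects_by_order = {order: [] for order in orders}
for index, subject in enumerate(subjects):
    subjects_by_order[orders[index % len(orders)]].append(subject)
```

The number of orders grows factorially with the number of treatments (24 orders for four treatments), which limits how many treatments a fully counterbalanced crossover can handle.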

26 Factors Conducive to Randomized Designs
Demand for a treatment outstrips the supply.
An innovation cannot be delivered to all units at once.
Experimental units can be temporally isolated.
Experimental units are spatially or geographically separated, or communication between units is otherwise low.
Change is mandated, but the quality or effectiveness of solutions is unknown.
A tie can be broken, or ambiguity about need can be resolved.
Some persons (participants) express no preference among alternatives.
Investigators can create their own organization.
Investigators have control over experimental units.
Lotteries are expected as part of the treatment.

27 Inhibiting Factors
Randomized experiments take a lot of time, in both design and execution (a time frame of several years from conceptualization to results is not unusual), whereas policymakers and other stakeholders often need answers now. Randomized experiments provide very precise and valid answers about whether a treatment is effective, but at substantial cost, and policymakers and other stakeholders may not need such precise answers. Randomized experiments can only answer a fairly narrow set of questions, and the investigator must be able to actively manipulate treatment; many questions of interest to policy and decision makers are not necessarily causal, or involve variables that cannot be manipulated.

28 Inhibiting Factors
Before a randomized experiment is conducted, investigators must demonstrate (have evidence for, or have a reasonable expectation of) all of the following:
Present conditions need improvement.
The proposed improvement is of unclear value, or there are several changes whose relationship is unclear.
The results of the experiment would clarify the situation.
The results would be used to change the policy or practice relating to present conditions.
The rights of participants will be protected throughout the process.
