Handling Mobility in Cluster-Randomized Cohort Trials

Extent of Mobility
- Mobility occurs across cities, neighborhoods, schools, hospitals, and clinics
- 12% of the total US population moved within a single year, with a higher rate among those below the poverty line
- 36% of students changed schools between kindergarten and 3rd grade
- Annual student mobility varies widely, from 0.3% to 66.7% across Chicago city schools
- Even with 30% annual mobility, only 49% of the original students remain after 2 years and only 34% after 3 years
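
The 49% and 34% figures follow from compounding the annual retention rate; a minimal sketch of that arithmetic, assuming the 30% turnover applies independently each year:

```python
# Share of the original cohort still present after t years, assuming an
# independent 30% annual mobility (turnover) rate, as on the slide above.
annual_mobility = 0.30
for t in (1, 2, 3):
    retained = (1 - annual_mobility) ** t
    print(f"after {t} year(s): {retained:.0%} of the original students remain")
# -> 70%, 49%, 34%
```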

Mobility Threatens Validity
- Internal validity: differential mobility/attrition between conditions, in amount or in type
- Statistical conclusion validity: an ITT analysis may not be possible
- External validity: effects are generalizable only to those who stay
- Construct validity: "non-compliance" with treatment; those who leave or enter receive only part of the treatment

Strategies to Address Mobility
Two common approaches:
- Do all you can to minimize mobility/attrition, e.g., selective sampling or costly retention efforts (but selective sampling limits external validity)
- Statistical adjustments: multiple imputation, potential outcome models, maximum likelihood estimation, growth curve analysis, propensity scores

Our Proposal
- Systematic consideration of internal validity and external validity
- Cross-classification of:
  - Focus on the person or on the cluster
  - Nature of the analysis: ITT (assesses all randomized units at end point, regardless of whether or not they complied with the intervention) or compliance analysis (takes into account whether or not units complied)

ITT Analysis
- Always important; it is the "primary" analysis
  - Conducted at the level of the unit of assignment
  - Yields an unbiased estimate of the effect of random assignment to receive an intervention
- In the presence of attrition or non-compliance, the intervention effect estimate is a composite of:
  - The actual effect on those who complied with the intervention
  - The effect on non-compliers, even though they were assigned to receive it
  - The effect on those who dropped out before receiving the complete treatment

Two Major Designs
- Repeated cross-section design
  - New samples of group members are measured at each time of assessment
  - No issues of person mobility or attrition (though there could still be unit-level attrition, e.g., of schools)
  - Outcomes are assessed as rates at the cluster level
- Cohort (or panel) design
  - A cohort or panel of persons is followed over time
  - Outcomes and covariates are assessed at the individual level (but analyzed hierarchically)
  - Moderators at the individual level can be examined
- Few studies to date have included both design features

Handling of Late Entrants/Joiners
- Historically not included in the analysis; instead, much effort went into tracking leavers
- In many contexts, including joiners helps maintain internal and external validity
  - In schools, leavers and joiners are often similar (but this needs to be checked)
  - Helps statistical conclusion validity by maintaining sample size and power
  - Helps generalizability to other similar schools
- Differential dosage is a form of non-compliance with treatment

Missingness
- Missing data at later waves for leavers
- Missing data at earlier waves for joiners
- Both types can be handled with imputation or analytical approaches (see the sketch below)
- All approaches need careful consideration of covariates and predictors of leaving or joining
- Leavers and joiners may differ under some conditions, e.g., when changing neighborhoods
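
As one concrete, purely illustrative example of the imputation route, the sketch below uses scikit-learn's IterativeImputer to fill in wave-level outcomes for movers; the slides do not prescribe this (or any particular) tool, and the data values are made up.

```python
# Sketch only: imputing missing later-wave outcomes for leavers and earlier-wave
# outcomes for joiners before analysis. Values and tool choice are illustrative.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (activates IterativeImputer)
from sklearn.impute import IterativeImputer

# rows = students, columns = outcomes at waves 1-3; NaN marks waves a mover missed
scores = np.array([
    [10.0, 12.0, np.nan],   # leaver: missing at wave 3
    [np.nan, 11.0, 13.0],   # joiner: missing at wave 1
    [9.0, 10.0, 12.0],
    [11.0, 13.0, 14.0],
])

imputed = IterativeImputer(random_state=0).fit_transform(scores)
print(imputed.round(1))
```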

Multilevel classification of approaches to mobility (rows: focus of the trial; columns: nature of the analysis)

Focus of the Trial | Intent to Treat                  | Compliance
PERSON             | 1. Person-focused ITT approach   | 3. Person-focused compliance approach
CLUSTER            | 2. Cluster-focused ITT approach  | 4. Cluster-focused compliance approach

1. Person-focused ITT approach
Question answered: What is the impact of intervention assignment on persons when implemented under real-world conditions (including mobility)?
- Uses data from all persons originally in the clusters assigned to conditions
- Focus is on estimating the person-level program effect
- Persons who leave clusters (e.g., schools) during the trial are followed (leavers are retained)
- Persons who enter research clusters during the trial are not assessed or analyzed (joiners are NOT added)

2. Cluster-focused ITT approach
Question answered: What is the impact of intervention assignment on clusters of persons?
- Uses data from all clusters originally assigned to conditions
- Focus is on estimating the cluster-level intervention effect, i.e., prevalence rather than incidence
- Requires assessing persons who enter research clusters during the trial (joiners are added)
- Persons who leave research clusters during the trial are not followed (leavers are dropped)

3. Person-focused compliance approach
Question answered: What is the impact of the intervention on those persons who complied with it to receive an adequate "dosage"?
- Focus is on estimating the effect on persons who comply with the intervention; CACE analysis is preferred to "as-treated" analysis
- Dropouts are not followed
- Late entrants are not assessed or analyzed
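
For readers unfamiliar with CACE, its simplest version is an instrumental-variables (Wald) estimator that rescales the ITT effect by the difference in compliance rates between arms; the sketch below uses made-up numbers and is not part of the original slides.

```python
# Simplest CACE (complier average causal effect) estimate: the ITT effect divided by the
# difference in compliance rates between arms (Wald/IV estimator). Numbers are illustrative.
itt_effect = 0.15            # assigned-to-treatment vs. control difference in mean outcome
compliance_treatment = 0.70  # share of the treatment arm that actually received the program
compliance_control = 0.05    # share of the control arm that obtained the program anyway
cace = itt_effect / (compliance_treatment - compliance_control)
print(round(cace, 3))        # ~0.231: estimated effect among compliers
```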

4. Cluster-focused compliance approach
Question answered: What is the impact of the intervention on clusters that complied with it?
- Focus is only on those clusters that stay in the trial or in which the intervention is fully implemented
- Joiners and leavers are handled as in Option 2:
  - Requires assessing persons who enter research clusters during the trial (joiners are added)
  - Persons who leave research clusters during the trial are not followed (leavers are dropped)

Flay & Collins (2005): a historical review of school-based randomized trials

Improvements in School-based RCTs
- Approaches to the randomization of whole schools
- The choice of appropriate comparison or control groups
- Solutions when randomization breaks down
- Limiting and handling variation in the integrity of the intervention received
- Limiting biases introduced by data collection
- Awareness of the effects of intensive and long-term data collection
- Limiting and analyzing subject attrition and other missing data

Improvements in School-based RCTs (continued)
- Approaches to obtaining parental consent for children to engage in research
- Design and analysis issues when only small numbers of schools are available or can be afforded
- The choice of the unit of analysis
- Phases of research
- Optimizing and extending the reach of interventions
- Differential effects in subpopulations

Six Important Issues
- Sequence planning
- Time
- Keeping up with, and being open to, methodological advances
- Publication of all results
- Accumulation of knowledge
- The devil is in the details

More on Sampling: What About Random Sampling?
- Sampling models are often ignored in intervention research in education and public health
- BUT: sampling is where the randomness comes from in education and public health research
- Sampling therefore has profound consequences for statistical analysis and research design

Sampling Models
- Simple random samples are rare in field research
- Many populations are hierarchically nested: students in classrooms in schools; schools in districts in states; people in communities/neighborhoods; patients in clinics
- We usually exploit the population structure to sample people (e.g., students) by first sampling places (e.g., schools)
- Even then, most samples are not probability samples, but they are intended to be representative (of some population)

Sampling Models
- Survey research calls this strategy multistage (multilevel) clustered sampling
- We often sample clusters (schools) first, then individuals within clusters (students within schools); this is a two-stage (two-level) cluster sample
- We might sample schools, then classrooms, then students; this is a three-stage (three-level) cluster sample
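
A toy sketch of the two-stage procedure just described, schools first and then students within the sampled schools; the population, sample sizes, and names below are all made up for illustration.

```python
# Toy two-stage (school -> student) cluster sample; the population is synthetic.
import random

random.seed(0)
population = {f"school_{s}": [f"student_{s}_{i}" for i in range(100)] for s in range(50)}

m, n = 10, 20  # stage 1: sample m schools; stage 2: sample n students within each
sampled_schools = random.sample(sorted(population), m)
sample = {school: random.sample(population[school], n) for school in sampled_schools}

print(len(sample), "schools,", sum(len(v) for v in sample.values()), "students")
```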

Precision of Estimates Depends on the Sampling Model
- Suppose the total population variance is σ_T² and the intraclass correlation (ICC) is ρ
- Consider two samples of size N = mn:
  - A simple random sample or stratified sample: the variance of the mean is σ_T²/(mn)
  - A clustered sample of n students from each of m schools: the variance of the mean is (σ_T²/(mn))[1 + (n − 1)ρ]
- The inflation factor [1 + (n − 1)ρ] is called the design effect
- For example, if n = 10 and ρ = .1, the design effect is 1.9; the variance is almost double what it would be for 100 students who were not clustered in m schools
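
A minimal numeric sketch of the two-level design effect just defined (the variance and sizes are illustrative values only):

```python
# Two-level design effect and the resulting variance inflation, as defined above.
def design_effect_two_level(n, rho):
    """Variance inflation for a cluster sample with n individuals per cluster and ICC rho."""
    return 1 + (n - 1) * rho

sigma2_T = 1.0            # total population variance (illustrative)
m, n, rho = 10, 10, 0.10  # 10 schools, 10 students per school, ICC = .1

var_srs = sigma2_T / (m * n)                               # simple random sample of N = mn
var_clustered = var_srs * design_effect_two_level(n, rho)  # clustered sample

print(design_effect_two_level(n, rho), var_srs, var_clustered)  # 1.9, 0.01, 0.019
```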

Precision of Estimates Depends on the Sampling Model
- Suppose the total population variance is σ_T², the school-level ICC is ρ_S, and the class-level ICC is ρ_C
- Consider two samples of size N = mpn:
  - A simple random sample or stratified sample: the variance of the mean is σ_T²/(mpn)
  - A clustered sample of n students from each of p classes in each of m schools: the variance of the mean is (σ_T²/(mpn))[1 + (pn − 1)ρ_S + (n − 1)ρ_C]
- The three-level design effect is [1 + (pn − 1)ρ_S + (n − 1)ρ_C]
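
The same sketch extended to the three-level design effect above (again with made-up ICC values and sample sizes):

```python
# Three-level design effect: n students in each of p classes in each of m schools,
# with school-level ICC rho_S and class-level ICC rho_C, as defined above.
def design_effect_three_level(p, n, rho_S, rho_C):
    return 1 + (p * n - 1) * rho_S + (n - 1) * rho_C

sigma2_T = 1.0
m, p, n = 10, 3, 20          # illustrative sample sizes
rho_S, rho_C = 0.15, 0.10    # illustrative ICCs

deff = design_effect_three_level(p, n, rho_S, rho_C)
var_clustered = (sigma2_T / (m * p * n)) * deff
print(round(deff, 2), round(var_clustered, 5))  # 11.75, 0.01958
```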

Precision of Estimates Depends on the Sampling Model
- Treatment effects in experiments and quasi-experiments are mean differences
- Therefore the precision of treatment effects, and statistical power, depend on the sampling model

Sampling Models in ED & PH Research
- The fact that the population is structured does not mean the sample must be a clustered sample
- Whether it is a clustered sample depends on:
  - How the sample is drawn (e.g., are schools sampled first, then individuals randomly within schools?)
  - What the inferential population is (e.g., is the inference to the schools studied or to a larger population of schools?)

Sampling Models in ED & PH Research
- A necessary condition for a clustered sample is that it is drawn in stages using population subdivisions: schools then students within schools, or schools then classrooms then students
- However, if all subdivisions in the population are present in the sample, the sample is not clustered but stratified
- Stratification has different implications than clustering
- Whether there is stratification or clustering depends on the definition of the population to which we draw inferences (the inferential population)

Sampling Models in ED & PH Research
- The clustered/stratified distinction matters because it influences the precision of statistics estimated from the sample
- If all population subdivisions are included in every sample, there is no sampling (or exhaustive sampling) of subdivisions, so differences between subdivisions add no uncertainty to estimates
- If only some population subdivisions are included in the sample, it matters which ones you happen to sample, so differences between subdivisions add to uncertainty

- Inference Population
- Inference Models
- Fixed and Random Effects
- Statistical Power
- Unit of Randomization
Courtesy of Larry V. Hedges, Northwestern University, IES Summer Research Training Institute, June 18 – 29, 2007

Inferential Population and Inference Models
- The inferential population, or inference model, has implications for analysis and therefore for the design of experiments
- Do we make inferences to the schools in this sample or to a larger population of schools?
- Inferences to the schools or classes in the sample are called conditional inferences
- Inferences to a larger population of schools or classes are called unconditional inferences

Inferential Population and Inference Models
- The inferences (what we are estimating) are different in conditional versus unconditional inference models
- In conditional inference, we are estimating the mean (or treatment effect) in the observed schools
- In unconditional inference, we are estimating the mean (or treatment effect) in the population of schools from which the observed schools were sampled
- We are still estimating a mean (or a treatment effect), but they are different parameters with different uncertainties

Fixed and Random Effects
- When the levels of a factor (e.g., the particular blocks included) in a study are sampled and the inference model is unconditional, that factor is called random and its effects are called random effects
- When the levels of a factor (e.g., the particular blocks included) in a study constitute the entire inference population and the inference model is conditional, that factor is called fixed and its effects are called fixed effects

Comparing Fixed and Mixed Effects Statistical Procedures (Hierarchical Design)
- Conditional and unconditional inference models estimate different treatment effects and have different contaminating factors that add uncertainty
- Mixed procedures are appropriate for unconditional inference
- Fixed procedures are not generally recommended, even though they have higher power

Comparing Hierarchical Designs to Randomized Block Designs
- Randomized block designs usually have higher power, but assignment of different treatments within schools or classes may be:
  - Practically difficult
  - Politically infeasible
  - Theoretically impossible
- It may also be methodologically unwise because of the potential for:
  - Contamination or diffusion of treatments
  - Compensatory rivalry or demoralization

Applications to Experimental Design
- We will address the two most widely used experimental designs in education:
  - Randomized block designs, with 2 levels and with 3 levels
  - Hierarchical designs, with 2 levels and with 3 levels
- We also examine the effect of covariates
- Hereafter, we generally take schools to be random

Precision of the Estimated Treatment Effect
- Precision here is the standard error of the estimated treatment effect
- Precision in simple (simple random sample) designs depends on:
  - The standard deviation in the population, σ
  - The total sample size, N
- With equal allocation of the N individuals to two groups, the precision (standard error of the treatment effect) is 2σ/√N

Precision of the Estimated Treatment Effect
- Precision in complex (clustered sample) designs depends on:
  - The (total) standard deviation σ_T
  - The sample size at each level of sampling (e.g., m clusters, n individuals per cluster)
  - The intraclass correlation structure
- It is a little harder to compute than in simple designs, but it is important because it helps you see what matters in design

Precision in Two-level Hierarchical Design With No Covariates
- The standard error (SE) of the treatment effect decreases as m (the number of schools) increases
- The SE decreases as n increases, but only up to a point
- The SE increases as ρ increases
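
A numeric sketch of these three patterns. It assumes the standard two-level cluster-randomized formula SE = sqrt(2 σ_T² [1 + (n − 1)ρ] / (m n)) with m schools per arm; that formula and the values below are assumptions for illustration, not taken from the slides.

```python
# Illustrative SE of the treatment effect for a two-level hierarchical (cluster-randomized)
# design, assuming m schools per arm and n students per school:
#   SE = sqrt(2 * sigma_T^2 * [1 + (n - 1) * rho] / (m * n))
from math import sqrt

def se_treatment_effect(sigma_T, m, n, rho):
    return sqrt(2 * sigma_T**2 * (1 + (n - 1) * rho) / (m * n))

# Increasing n helps only up to a point (the SE levels off near sqrt(2 * rho / m)),
# while increasing m keeps reducing the SE.
for n in (5, 10, 20, 40, 80):
    print(f"m=20, n={n:>2}: SE = {se_treatment_effect(1.0, 20, n, 0.10):.3f}")
```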

Statistical Power
- Power in simple (simple random sample) designs depends on:
  - The significance level
  - The effect size
  - The sample size
- Look the power up in a table for the given sample size and effect size

Fragment of Cohen's Table
[Table not preserved in this transcript: power values indexed by sample size n (rows) and effect size d (columns).]

Computing Statistical Power
- Power in complex (clustered sample) designs depends on:
  - The significance level
  - The effect size δ
  - The sample size at each level of sampling (e.g., m clusters, n individuals per cluster)
  - The intraclass correlation structure
- This makes it seem a lot harder to compute

Computing Statistical Power
- Computing statistical power in complex designs is only a little harder than computing it for simple designs:
  - Compute the operational effect size Δ_T (which incorporates the sample design information)
  - Look the power up in a table for the operational sample size and operational effect size
- This is the same table that you use for simple designs
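
As an alternative to the table lookup, power for a two-arm cluster-randomized comparison can also be approximated directly. The sketch below uses a normal approximation and the same assumed SE formula as earlier (m clusters per arm); this is a swapped-in method, not the operational-effect-size procedure the slides describe.

```python
# Normal-approximation power for a two-arm cluster-randomized design (illustrative only;
# an alternative to the table-lookup method described on the slide).
from math import sqrt
from scipy.stats import norm

def approx_power(delta, sigma_T, m, n, rho, alpha=0.05):
    """delta: raw mean difference; m clusters per arm; n individuals per cluster."""
    se = sqrt(2 * sigma_T**2 * (1 + (n - 1) * rho) / (m * n))
    return norm.cdf(abs(delta) / se - norm.ppf(1 - alpha / 2))

print(round(approx_power(delta=0.25, sigma_T=1.0, m=20, n=25, rho=0.10), 2))
```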

Randomized Block Designs
- In randomized block designs, as in hierarchical designs, the intraclass correlation has an impact on precision and power
- However, in randomized block designs there is also a parameter reflecting the degree of heterogeneity of treatment effects across schools
- We define this heterogeneity parameter ω_S as the amount of heterogeneity of treatment effects relative to the heterogeneity of school means
- Thus ω_S = σ_{T×S}² / σ_S²
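
A two-line illustration of that definition, using made-up variance components rather than values from the slides:

```python
# omega_S as defined above: treatment-by-school interaction variance relative to the
# between-school variance of means. Both variance components are made-up numbers.
var_treatment_by_school = 0.05
var_between_schools = 0.20
omega_S = var_treatment_by_school / var_between_schools
print(omega_S)  # 0.25
```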

Precision in Two-level Randomized Block Design With No Covariates
- The standard error (SE) of the treatment effect decreases as m (the number of schools) increases
- The SE decreases as n increases, but only up to a point
- The SE increases as ρ increases
- The SE increases as ω_S = σ_{T×S}²/σ_S² increases

Power in Two-level Randomized Block Design With No Covariates
- Basic idea: Operational Effect Size = (Effect Size) × (Design Effect), i.e., Δ_T = δ × (Design Effect)
- For the two-level hierarchical design with no covariates, the operational sample size is the number of schools (clusters)

What Unit Should Be Randomized? (Schools, Classrooms, or Students)
- Experiments cannot estimate the causal effect on any individual
- Experiments estimate average causal effects on the units that have been randomized:
  - If you randomize schools, the (average) causal effects are effects on schools
  - If you randomize classes, the (average) causal effects are on classes
  - If you randomize individuals, the (average) causal effects estimated are on individuals

What Unit Should Be Randomized? (Schools, Classrooms, or Students)
Theoretical considerations:
- Decide what level you care about, then randomize at that level
- Randomization at lower levels may affect the generalizability of the causal inference (and it is generally a lot more trouble)
- Suppose you randomize classrooms; should you also randomly assign students to classes? It depends: are you interested in the average causal effect of treatment on naturally occurring classes or on randomly assembled ones?

What Unit Should Be Randomized? (Schools, Classrooms, or Students)
Relative power/precision of the treatment effect, by unit assigned:
- Assign schools (hierarchical design)
- Assign classrooms (randomized block design)
- Assign students (randomized block design)

What Unit Should Be Randomized? (Schools, Classrooms, or Students)
- Precision of estimates and statistical power dictate assigning at the lowest level possible
- But the individual (or even classroom) level will not always be feasible, or even theoretically desirable

Class Exercise: Questions About Design
- Divide into 3 groups
- 4 questions each
- Discuss, then present to the class

Questions About Assigning Schools
1. I assigned treatments to schools and am not using classes in the analysis. Do I have to take them into account in the design? Do I have to include classes as a nested factor?
2. My schools all come from two districts, but I am randomly assigning the schools. Do I have to take district into account in some way?
3. I didn't really sample the schools in my experiment (who does?). Do I still have to treat schools as random effects? What population can I generalize to anyway?
4. I am using a randomized block design with fixed effects. Do you really mean I can't say anything about effects in schools that are not in the sample?

Questions About Failed Randomization
1. We randomly assigned, but our assignment was corrupted by treatment switchers. What do we do?
2. We randomly assigned, but our assignment was corrupted by attrition. What do we do?
3. We randomly assigned but got a big imbalance on characteristics we care about (gender, race, language, SES). What do we do?
4. We randomly assigned, but when we looked at the pretest scores of our DV, we saw a big imbalance (a "bad randomization"). What do we do?

Questions About Treatment Effects
1. We care about treatment effects, but we really want to know about mechanism. How do we find out whether implementation affects treatment effects?
2. We want to know where (under what conditions) the treatment works. Can we analyze the relation between conditions and treatment effect to find this out?
3. We have a randomized block design and find heterogeneous treatment effects. What can we say about the main effect of treatment in the presence of interactions?
4. I have heard of using "school fixed effects" to analyze a randomized block design. Is it a good alternative to use ANOVA or HLM?