Challenges arising from the analysis of randomized trials in education


1 Challenges arising from the analysis of randomized trials in education
Professor Steve Higgins, School of Education, Durham University
Randomised Controlled Trials in the Social Sciences, 7–9 September 2016, University of York

2 Overview
- The work of the Education Endowment Foundation
- The EEF Archive Analysis project (with Adetayo Kasim and ZhiMin Xiao)
- Trial findings and the Toolkit
- Some implications for research, policy and practice

3 The work of the Education Endowment Foundation
- Established in 2011 with a £125M endowment (aim: £200M in total)
- Independent grant-making charity dedicated to breaking the link between family income and educational achievement
- Identifies, tests and scales successful approaches and interventions
- Meta-analytic evidence database to identify promise (the ‘Teaching and Learning Toolkit’): the “What Works Centre for improving education outcomes for school-aged children”
- Projects commissioned from a range of organisations
- Independently evaluated by a team drawn from a ‘panel’ of 25
- Results fed back into the Toolkit (alongside other similar project findings and new meta-analyses)

4 Projects and campaigns
Trial stages: Pilot, Efficacy, Effectiveness
Campaigns:
- Making Best Use of Teaching Assistants (£5M)
- North East Literacy Campaign (with the Northern Rock Foundation, £10M)

5 EEF since 2011
- £75.4m funding awarded to date
- 127 projects funded to date
- 106 RCTs
- 60 independent evaluation reports published
- 7,500 schools participating in projects
- 750,000 pupils currently involved in EEF projects
- 64% of school leaders say they have used the Toolkit

6 The EEF Archive Analysis project
- Independent evaluators submit project data to FFT (given appropriate permissions)
- Data are re-matched with National Pupil Database (NPD) records and updated as new data become available
- Data are released to the archive team at Durham twice a year
Aims:
- undertake methodological exploration of trials data
- provide comparative analyses of impact across projects
- analyse long-term/follow-up impact as NPD data become available
- develop an R package for the analysis of educational trials

7 Principles and caveats
Principles:
- Exploratory analysis to inform EEF’s evaluation strategy
- Help to explain variation in trial impact
- The independent evaluator’s peer-reviewed published estimate is always the official EEF finding for impact on attainment
Caveats:
- Sample sizes cannot always be matched (4/32 projects show more than a 10% difference in N)
- NPD data are removed and re-matched
- Some raw scores are transformed
- Gain score distributions differ from post-test distributions

8 R package: eefAnalytics
To support the analysis of randomised trials in education (individual, cluster and multi-site designs):
- difference-in-means
- ordinary least squares
- multi-level models (frequentist and Bayesian)
- permutation p-values and bootstrapped confidence intervals (sketched below)
- Complier Average Causal Effect (CACE)
- cumulative quantile analysis
eefAnalytics is in development and will be available on CRAN in November; it is available to try out now, so contact us!
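As a rough illustration of the two resampling methods in the list above, here is a generic sketch (in Python rather than R, with invented data); it shows the underlying logic only and is not the eefAnalytics API.

```python
# A generic sketch of a permutation p-value and a percentile-bootstrap
# confidence interval for a difference in means. Illustration only;
# this is not the eefAnalytics implementation.
import numpy as np

rng = np.random.default_rng(seed=1)

def diff_in_means(a, b):
    return a.mean() - b.mean()

def permutation_p(treat, control, n_perm=10_000):
    """How often does a random relabelling of pupils produce a difference
    at least as large as the one observed? (two-sided)"""
    observed = abs(diff_in_means(treat, control))
    pooled = np.concatenate([treat, control])
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(diff_in_means(pooled[:len(treat)], pooled[len(treat):])) >= observed:
            hits += 1
    return hits / n_perm

def bootstrap_ci(treat, control, n_boot=10_000, alpha=0.05):
    """Percentile bootstrap: resample each arm with replacement."""
    diffs = [diff_in_means(rng.choice(treat, len(treat)),
                           rng.choice(control, len(control)))
             for _ in range(n_boot)]
    return np.quantile(diffs, [alpha / 2, 1 - alpha / 2])

# Hypothetical pupil outcomes from a simple, individually randomised trial.
treat = rng.normal(0.2, 1.0, 120)
control = rng.normal(0.0, 1.0, 120)
print(permutation_p(treat, control), bootstrap_ci(treat, control))
```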

9 Initial findings
17 EEF projects, four analytic models.
Findings:
- Results converge in larger, unproblematic trials
- Point estimates and estimates of precision vary when trials are problematic (e.g. testing or randomisation issues), when design and analysis are not matched, when different covariates are added, and when outcomes differ or are transformed (e.g. z-scores)
- Results tend to diverge when the ICC ≥ 0.2 and there are few clusters/schools (see the sketch below)
- MLM using total variance is the ‘most conservative’ model (wider CIs)
- Bayesian point estimates are identical but more precise (narrower CIs)
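The clustering point can be made concrete with the design effect, DEFF = 1 + (m - 1) × ICC, where m is the average cluster (school) size. A back-of-envelope sketch with illustrative numbers, not figures from any EEF trial:

```python
# Why high ICC plus few schools widens confidence intervals: clustering
# inflates the variance of the estimated effect by the design effect.
# Illustrative numbers only.

def design_effect(icc: float, cluster_size: float) -> float:
    return 1.0 + (cluster_size - 1.0) * icc

def effective_n(n_pupils: int, icc: float, cluster_size: float) -> float:
    """Sample size after discounting for clustering."""
    return n_pupils / design_effect(icc, cluster_size)

# e.g. 20 schools of 30 pupils each (600 pupils in total)
for icc in (0.0, 0.1, 0.2):
    print(f"ICC = {icc}: effective n = {effective_n(600, icc, 30):.0f}")
# At ICC = 0.2 the 600 pupils carry roughly the information of 88
# independent pupils, so confidence intervals widen accordingly.
```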

10 Convergence
Pre-test imbalance

11 Clustering
Point estimates similar, CIs vary

12 Divergence
Pre-test imbalance: post-test ANCOVA models aim to correct for this, and the evaluator’s gain-score estimate is higher (see the sketch below).
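A minimal simulation of the underlying issue (Lord’s paradox, revisited on a later slide): gain-score and ANCOVA analyses adjust for the pre-test differently, so they can disagree when arms are imbalanced at baseline. All data below are simulated, and the direction and size of the discrepancy depend on the imbalance.

```python
# Gain-score vs ANCOVA under pre-test imbalance (Lord's paradox).
# Simulated data; only numpy required.
import numpy as np

rng = np.random.default_rng(seed=7)
n = 5000
group = rng.integers(0, 2, n)                 # 1 = intervention
pre = rng.normal(0, 1, n) + 0.3 * group       # built-in pre-test imbalance
post = 0.6 * pre + 0.2 * group + rng.normal(0, 1, n)  # true effect = 0.2

# Gain-score analysis: difference in mean (post - pre) between arms.
gain = post - pre
gain_est = gain[group == 1].mean() - gain[group == 0].mean()

# ANCOVA: regress post-test on group and pre-test.
X = np.column_stack([np.ones(n), group, pre])
coefs, *_ = np.linalg.lstsq(X, post, rcond=None)
ancova_est = coefs[1]  # coefficient on the group indicator

print(f"gain-score: {gain_est:.2f}  ANCOVA: {ancova_est:.2f}")
# The post-test regresses on the pre-test with slope < 1, so subtracting
# the whole pre-test (the gain score) over-adjusts: expect ~0.08 vs ~0.20.
```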

13 Analytic heterogeneity
[Stem-and-leaf plot of the differences between evaluator and archive MLM (total variance) models: 32 projects, 64 outcomes; the majority of differences are 0.05 or less.]

14 Archive Analysis development
- R package release: November 2016
- Post-test/gain-score paper (revisiting Lord’s paradox)
- Local Influence Index (a binomial index for evaluating intervention benefit)
- CACE, to help interpret ITT analysis (see the sketch below)
- Follow-up data in the NPD
Reflections: with an ITT approach, sample sizes predicated on the minimum necessary to detect a probably overestimated effect size, and MLM total variance, are we setting ourselves up for disappointment?
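For the CACE item above, a minimal sketch of the standard Wald/instrumental-variable estimator, which rescales the ITT effect by the compliance gap between arms; the uptake figures are hypothetical.

```python
# Complier Average Causal Effect (CACE) via the Wald/IV estimator:
# the intention-to-treat (ITT) effect divided by the difference in
# compliance rates between arms. Numbers are hypothetical.

def cace(itt_effect: float, uptake_treat: float, uptake_control: float) -> float:
    """Effect among compliers, assuming randomisation is a valid instrument."""
    compliance_gap = uptake_treat - uptake_control
    if compliance_gap <= 0:
        raise ValueError("no compliance gap: the instrument is uninformative")
    return itt_effect / compliance_gap

# e.g. an ITT effect size of 0.10 with 70% uptake in the intervention arm
# and 5% contamination in the control arm:
print(f"CACE = {cace(0.10, 0.70, 0.05):.2f}")  # ~0.15 among compliers
```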

15 Sutton Trust/EEF Teaching & Learning Toolkit
- Best ‘buys’ on average from research
- Key messages for Pupil Premium spending in schools
- Currently used by over 60% of school leaders

16 Toolkit overview and aims
- Cost-effectiveness estimates for a range of educational approaches
- Based on average effects from meta-analyses and estimates of the additional outlay needed to put each approach in place
- Robustness of the evidence indicated by ‘padlock’ ratings
- To inform professional decision-making about school spending
- To create a framework for evidence use
- To provide a structure to improve the utility of evidence

17 Aggregating inferences

18 Inferences from findings across meta-analyses
- Requires the assumption that variation, bias and inaccuracy are randomly distributed across the included studies
- Probably unwarranted, but the best we’ve got
- A starting point for improving precision and predictability
- Better than saying nothing? (see the sketch below)
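As a concrete instance of the aggregation being discussed, here is a sketch of simple inverse-variance (fixed-effect) pooling; the effect sizes and standard errors are invented for illustration.

```python
# Inverse-variance (fixed-effect) pooling: studies are weighted by the
# precision of their effect size estimates. Invented numbers.
import numpy as np

effects = np.array([0.24, 0.43, -0.05, 0.12])  # hypothetical study ESs
ses = np.array([0.10, 0.15, 0.08, 0.12])       # their standard errors

weights = 1.0 / ses**2
pooled = np.sum(weights * effects) / np.sum(weights)
pooled_se = np.sqrt(1.0 / np.sum(weights))

print(f"pooled ES = {pooled:.2f} (SE = {pooled_se:.2f})")
# This weighting is only unbiased if study-level bias and error really are
# random across studies, which is the assumption this slide flags as shaky.
```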

19 Toolkit as a predictor?

Toolkit theme (average ES) | EEF projects (effect size)
Digital technology (0.28) | Accelerated Reader (0.24); Improving Lit and Num (0.20); Units of Sound (-0.08)
Phonics (0.35) | Butterfly Phonics (0.43); Fresh Start; Catch Up Literacy (0.12); Vocabulary Enrichment (0.07); Rapid Phonics (-0.05)
Meta-cogn & SRL (0.62) | Improving Writing Quality (0.74); Changing Mindsets (0.18); P4C
Teaching assistants (0.08) | Catch Up Numeracy (0.21); Talk for Literacy
One to one (0.40) | Perry Beeches (0.36); Switch on Reading
Summer school | Future Foundations (Eng) (0.17); Future Foundations (maths) (0.00); Summer Active Reading (0.14)
Peer tutoring | Shared Maths (0.02); Paired Reading (-0.02)

Overall, the impact of the specific programmes is broadly in agreement with the corresponding Toolkit theme.

20 So, “what works” or “what’s worked”?
- Internal validity is necessary for external validity: did it actually work there?
- The causal ‘black box’ makes replication challenging
- Defining ‘approaches’ or ‘interventions’: the unit of description
- Problematic ‘populations’: what inference, and for whom?
- The importance of knowing what hasn’t worked (on average)
- In education, a null result (i.e. possibly no different from zero) means the intervention is only as good as the counterfactual
- Mean or range: “on average”, or better estimates of probability?
- Generalisability or predictability? Small-scale research findings may optimise rather than typify ‘in the wild’ application or scale-up

21 A distributional view
Visualised distribution of Toolkit effects; distribution of effects in Kluger & DeNisi (1996), from Dylan Wiliam.

22 Current Toolkit developments
- Formalising the methodology (translating/simplifying existing models): Cochrane/Campbell/EPPI approaches, PRISMA for reviews, CONSORT for trials, GRADE guidelines for evidence
- New comparable and updatable meta-analyses for each strand
- Identifying factors affecting current effect size estimates: design (sample size, randomisation, clustering), measurement issues (outcome complexity, outcome alignment), intervention (duration, intensity)
- International partnerships: Australia (an Australian version of the Toolkit; 3 RCTs commissioned); Chile (under development)

23 Implications for research, policy and practice
Research:
- Further discussion about optimal analysis approaches and Statistical Analysis Plans
- May explain some of the heterogeneity in meta-analyses (limits to precision)
- Importance of replication and meta-analysis
- More methodological exploration!
Policy and practice:
- Need to communicate uncertainty estimates carefully, as they often depend on the analytic approach
- “What works?” or “what’s worked”?
- Communicate the range as well as the mean?

24 References
Higgins, S. (2016) Meta-synthesis and comparative meta-analysis of education research findings: some risks and benefits. Review of Education 4.1: 31–53.
Higgins, S. & Katsipataki, M. (2016) Communicating comparative findings from meta-analysis in educational research: some examples and suggestions. International Journal of Research & Method in Education 39.3.
Kasim, A., Xiao, Z. & Higgins, S. (in preparation) eefAnalytics: A Package for Trial Data Analysis. The R Journal.
Xiao, Z., Kasim, A. & Higgins, S.E. (2016) Same Difference? Understanding Variation in the Estimation of Effect Sizes from Educational Trials. International Journal of Educational Research 77.

25 View of Durham City from the train station

