The suitability of using RCTs in educational research. Dr Carole Torgerson, Senior Research Fellow, Institute for Effective Education, University of York. ESRC Conference: Methodological challenges for the 21st Century, Nov 22nd, 2007
A careful look at randomized experiments will make clear that they are not the gold standard. But then, nothing is. And the alternatives are usually worse. Berk RA. (2005) Journal of Experimental Criminology 1,
History of RCTs: The first known RCT in humans was a study in 1932 looking at counselling for improving academic achievement in undergraduates; several other RCTs in educational settings followed (e.g., a 1933 trial of the efficacy of examinations for undergraduate students). The first health trials were the patulin study in 1944, followed by the 1948 streptomycin trial. Forsetlund et al, Econ. Innov. New Techn. 2007: 371.
Background: Intense opposition to RCTs from some educational researchers and policy makers. No tradition of UK funding of large-scale randomised field trials in education evaluating policy initiatives. Large rigorous trials are possible in education research. For important policy issues RCTs could and should be used.
Large randomised field trials: Project STAR: Phase 1 of the Tennessee Class-Size Experiment, US; Computers and literacy learning, US; Vouchers for private schools, US; Minimally qualified, minimally trained classroom assistants, India; Comparing briefly trained with fully qualified teachers, US.
Computers and literacy: Governments across the world have made massive investments in educational computer technology (£1.7 billion in the UK since 1997), yet few trials have been undertaken. Rouse et al (US) evaluated Fast ForWord in a trial of 512 children; no noticeable effects were observed. Rouse et al, 2004, NBER working paper 10315.
UK computer trial: We recently completed a trial of 156 children looking at the effect of computers for literacy learning. In a typical English secondary school we randomly allocated all Year 7 pupils (aged 11) either to have 10 hours of computer teaching over two weeks or to a control group that received normal teaching. Randomisation was performed independently by the York Trials Unit. The sample size had > 80% power to show an effect size of 0.5 (moderate) of the computer programme. We measured two outcomes: spelling and reading ability (spelling was our a priori primary outcome measure). Results were similar to those of the US study: no evidence of any benefit of using computers for literacy learning. Brooks et al, Ed Studies, 32(2), 2006.
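The power statement on this slide can be checked roughly with a normal approximation. The sketch below is not the trial's own calculation: it assumes the 156 pupils were split equally (78 per arm), a two-sided test at alpha = 0.05, and a simple comparison of two means. The function name `power_two_sample` is made up for illustration.

```python
from statistics import NormalDist


def power_two_sample(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided comparison of two group means
    (normal approximation) for a standardised effect size d."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the z statistic when the true effect size is d.
    ncp = d * (n_per_group / 2) ** 0.5
    # Probability of rejecting the null in either tail.
    return z.cdf(ncp - z_crit) + z.cdf(-ncp - z_crit)


# ASSUMPTION: equal allocation, 78 pupils per arm (156 in total).
power = power_two_sample(d=0.5, n_per_group=78)
print(f"Approximate power: {power:.2f}")
```

Under these assumptions the approximation gives a power of about 0.88, which is consistent with the "> 80%" quoted on the slide.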
Results: P = […]; P = 0.74
Conclusion: This trial, which is the largest ever undertaken in the UK, shows no evidence of benefit of computer technology on spelling progression and is consistent with the large field trial in the US. The small difference in spelling scores is not statistically significant. There is a statistically significant difference in reading scores; however, this favours the control group. The use of software packages for literacy learning in schools could and should be tested in large, rigorously designed RCTs. Brooks et al, (2006). Educational Studies 2006;32:
School vouchers: New York voucher lottery. Initial results suggested a positive benefit of educational vouchers; however, an intention-to-treat analysis found little or no benefit of attending a private school (chosen by the parents). Krueger and Zhu 2002; NBER working paper 9418.
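The difference between an intention-to-treat comparison and a comparison by what pupils actually received can be shown with a toy example. The records below are invented purely for illustration and have nothing to do with the New York data; the function names are likewise hypothetical.

```python
from statistics import mean

# HYPOTHETICAL records: (offered_voucher, attended_private_school, test_score)
pupils = [
    (True, True, 62), (True, True, 58), (True, False, 41), (True, False, 45),
    (False, False, 52), (False, False, 50), (False, False, 49), (False, False, 53),
]


def itt_effect(rows):
    """Compare groups as randomised, whatever school was actually attended."""
    offered = [score for was_offered, _, score in rows if was_offered]
    not_offered = [score for was_offered, _, score in rows if not was_offered]
    return mean(offered) - mean(not_offered)


def as_treated_effect(rows):
    """Compare by school actually attended, which breaks the randomisation."""
    attended = [score for _, did_attend, score in rows if did_attend]
    did_not = [score for _, did_attend, score in rows if not did_attend]
    return mean(attended) - mean(did_not)


print("Intention to treat:", itt_effect(pupils))
print("As treated:", round(as_treated_effect(pupils), 2))
```

In this made-up data the families who took up the voucher differ from those who did not, so the as-treated comparison shows a large gap while the intention-to-treat comparison, which preserves the randomised groups, shows almost none.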
Classroom assistants: In India, due to resource scarcity, classroom assistants with 2 weeks of training were introduced using random allocation. An RCT of > 15,000 students was used to evaluate the intervention. The programme was effective, increasing maths and English scores (effect sizes […] and 0.28 in the first and second years). Note: the effect size is small to modest, but the programme is cost effective due to the low cost of the intervention. Banerjee et al, 2005; NBER working paper 11904.
Teach For America (TFA): In 1989 the TFA programme was introduced: –Selected highly qualified people and gave them 5 weeks of training over the summer; –TFA teachers placed in schools in poor neighbourhoods; –RCT comparing TFA teachers vs control teachers (traditionally and alternatively certified, and uncertified), published in 2004; –1800 pupils were randomised to 100 classes: 44 with TFA teachers and 56 with conventional teachers.
TFA evaluation: On admission to primary school, children were randomised to be taught by TFA teachers or control teachers. For maths scores: children's gains were significantly better when taught by TFA teachers compared with control teachers. For literacy scores: no differences. Decker et al. Mathematica Policy Research, 2004.
Some characteristics of a high quality pragmatic/field trial: Large number of schools, or classes, at least […], with 30 or so children per school. Long intervention with post- and follow-up tests. Randomisation independent from the researchers who develop the intervention and collect the data. Data collection and testing undertaken by researchers or teachers blind to group allocation. Such trials are often the norm for health education – why not for non-health education?
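Independent randomisation can be as simple as a third party running a seeded allocation that the intervention developers never see in advance. The following is a minimal sketch, assuming allocation at school level into two equal arms; the school names, the seed and the function name `allocate` are all hypothetical.

```python
import random


def allocate(units, seed):
    """Randomly split units into two arms of (near-)equal size."""
    rng = random.Random(seed)  # a recorded seed makes the allocation auditable
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "intervention": sorted(shuffled[:half]),
        "control": sorted(shuffled[half:]),
    }


# HYPOTHETICAL list of 20 participating schools.
schools = [f"school_{i:02d}" for i in range(1, 21)]
arms = allocate(schools, seed=2007)
print(arms["intervention"])
print(arms["control"])
```

Keeping the seed and the script with an independent unit, rather than with the research team, is one way to let the allocation be reproduced and checked later.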
Conclusion? Large RCTs of important educational policy are possible, and are regularly undertaken in the US and other countries. Are there important policy areas that need evaluating in the UK? E.g. Sure Start. What about the use of systematic synthetic phonics teaching? This applies to all children aged 5 from September 2007 (> 500,000 children). What is the evidence for this policy?
Systematic review of phonics instruction: How effective are different approaches to phonics teaching in comparison to each other (including the specific area of analytic versus synthetic phonics)?
Findings: 12 individually randomised trials were identified. All were very small and only one was from the UK. Putting all the trials together in a meta-analysis found a small, statistically significant effect on reading accuracy (moderate weight of evidence). 3 trials directly compared synthetic with analytic phonics instruction. No difference between the two approaches was found, although this was based on weak evidence.
Meta-analysis: Forest plot
Sensitivity analysis: The phonics meta-analysis had significant heterogeneity. Removal of a single small outlier reduced this heterogeneity. However, removal of this study resulted in the advantage of phonics instruction no longer being statistically significant.
Different types of phonics: Three small trials compared synthetic vs analytic phonics. A meta-analysis of these trials found no difference between the two approaches.
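The kind of sensitivity analysis described here can be sketched as an inverse-variance pooled estimate with Cochran's Q and I² for heterogeneity, recomputed with each study left out in turn. This is an illustration only: the effect sizes and standard errors are invented, and a fixed-effect model is assumed, which may not match the method used in the review.

```python
from statistics import NormalDist


def pool_fixed(effects, ses):
    """Inverse-variance fixed-effect pooling with Cochran's Q and I^2."""
    weights = [1 / se ** 2 for se in ses]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total
    se_pooled = (1 / total) ** 0.5
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    p = 2 * (1 - NormalDist().cdf(abs(pooled / se_pooled)))
    return {"pooled": pooled, "se": se_pooled, "Q": q, "I2": i2, "p": p}


def leave_one_out(effects, ses):
    """Re-pool the studies with each one removed in turn."""
    return [
        pool_fixed(effects[:i] + effects[i + 1:], ses[:i] + ses[i + 1:])
        for i in range(len(effects))
    ]


# HYPOTHETICAL standardised effect sizes; the last study is a small outlier.
effects = [0.15, 0.10, 0.20, 0.05, 2.00]
ses = [0.20, 0.25, 0.20, 0.25, 0.40]

full = pool_fixed(effects, ses)
without_outlier = leave_one_out(effects, ses)[4]
print(f"All studies:     d={full['pooled']:.2f}, I2={full['I2']:.0%}, p={full['p']:.3f}")
print(f"Outlier removed: d={without_outlier['pooled']:.2f}, "
      f"I2={without_outlier['I2']:.0%}, p={without_outlier['p']:.3f}")
```

With these invented numbers the pattern matches the one on the slide: dropping the outlier removes the heterogeneity, and the pooled effect is then no longer statistically significant.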
Synthesis: synthetic vs analytic
"I am clear that synthetic phonics should be the first strategy in teaching all children to read." (Ruth Kelly, Times, Mar 21st) "The case for synthetic phonics is overwhelming." (Jim Rose, Times, Mar 21st 2006)
What could have happened? A large field trial could have been undertaken. This could have taken the form of a waiting-list control-group design, with half of schools implementing a synthetic phonics programme in […] and the other half starting in […]. Alternatively, a stepped wedge design could have been used.
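Both designs mentioned on this slide amount to randomising the order in which schools start the programme. The sketch below builds such a roll-out schedule; with two steps it reduces to a waiting-list design, and with more steps it gives a stepped wedge. The school names, the seed and the function name `stepped_wedge` are assumptions made for illustration.

```python
import random


def stepped_wedge(schools, n_steps, seed):
    """Randomly order schools and assign start steps in equal-sized waves.

    Returns {school: [0 or 1 for each period]} over n_steps + 1 periods,
    where period 0 is a baseline in which no school has started yet.
    """
    rng = random.Random(seed)
    order = list(schools)
    rng.shuffle(order)
    schedule = {}
    for idx, school in enumerate(order):
        start = 1 + idx * n_steps // len(order)  # period in which this school starts
        schedule[school] = [1 if period >= start else 0
                            for period in range(n_steps + 1)]
    return schedule


# HYPOTHETICAL set of 8 schools.
schools = [f"school_{i}" for i in range(1, 9)]
wedge = stepped_wedge(schools, n_steps=4, seed=1)          # four waves of two schools
waiting_list = stepped_wedge(schools, n_steps=2, seed=1)   # half now, half later

for school, row in sorted(wedge.items()):
    print(school, row)
```

In either schedule every school eventually receives the programme, and the schools still waiting act as controls for those that have already started.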
Discussion: Large rigorous trials are possible in educational research. Intense opposition to RCTs from many educational researchers. Lack of UK funding of large-scale randomised trials in education. For important policy issues RCTs could and should be used.
Dr Carole Torgerson, Senior Research Fellow, IEE, University of York, Heslington, York YO10 5DD