Sample size issues & Trial Quality David Torgerson.

Chance When we do a trial we want to be sure that any effect we see has not arisen simply by chance. The probability of a given difference occurring by chance declines as the sample size increases. Also, as the sample size increases, smaller differences are likely to be statistically significant.

Statistical significance A p value of 0.05 means that if we repeated the same trial 100 times, we would expect to observe a difference of that size by chance 5 times. A small p value does not indicate the strength of the association: a small difference will have a small p value if the sample size is large.
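
To make the "5 times in 100" interpretation concrete, here is a minimal simulation sketch in Python (standard library only; the two-sample z-test helper and the choice of 100 participants per arm are illustrative assumptions, not from the slides). With no true effect, roughly 5 of 100 simulated trials come out "significant" at the 5% level.

```python
import random
from statistics import NormalDist, mean, stdev

def two_sample_p(a, b):
    """Two-sided p-value from a simple z-test on the difference in means."""
    se = (stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b)) ** 0.5
    z = abs(mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(z))

random.seed(1)
false_positives = 0
for _ in range(100):                    # 100 repeated "trials" with no true effect
    control = [random.gauss(0, 1) for _ in range(100)]
    treatment = [random.gauss(0, 1) for _ in range(100)]
    if two_sample_p(control, treatment) < 0.05:
        false_positives += 1

print(false_positives)                  # typically around 5, i.e. about 5%
```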

Power As well as significance there is the issue of power. A sample size might give 80% power to detect a specified difference at the 5% significance level. In other words, for a given sample size we would have an 80% probability of detecting the effect, if it exists, at 5% significance.

Sample Size Don’t believe statistics textbooks: a trial can NEVER be too big. Most trials in education use tiny sample sizes (e.g., 30 participants). A small trial will miss an important difference. For example, if a school-based intervention increased exam pass rates by 10% this would have very important benefits to society, BUT we would need a trial of at least 800 participants in an individually randomised trial to observe this effect.
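
The slides do not state the baseline pass rate behind the figure of 800. As a hedged check, assuming a baseline of 50% raised to 60% (a 10 percentage point gain), the standard two-proportion formula gives a total of roughly 770, or close to 800 once a continuity correction is applied. The sketch below is illustrative, not the authors' own calculation.

```python
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size for comparing two proportions
    (normal approximation, no continuity correction)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # 1.96 for a two-sided 5% test
    z_beta = z.inv_cdf(power)            # 0.84 for 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2

# Hypothetical baseline pass rate of 50%, raised to 60% by the intervention
total = 2 * n_per_group(0.50, 0.60)
print(round(total))   # roughly 770 in total; nearer 800 with a continuity correction
```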

Sample Size Small trials will miss important differences: bigger is better in trials. Always ask why the number was chosen. A good justification reads, for example, “given an incidence of 10% we wanted to have 80% power to show a halving to 5%”; a poor one simply states “we enrolled 100 participants”.

What is a reasonable difference? In a 1993 review of quasi-experiments in the social sciences, Lipsey and Wilson found that few effective interventions had effect sizes greater than 0.5. Health care produces similar gains of 0.5 of a standard deviation or lower. Lipsey & Wilson, American Psychologist 1993; 48.

Effect size & Sample size We should, therefore, plan trials that are large enough to identify a difference of 0.5 of a standard deviation between the two experimental groups if it exists.

Who needs a statistician? A simple way to calculate a total sample size is to take 32 (for 80% power; 42 for 90%) and divide this by the square of the effect size. 0.5 squared is 0.25, and 32/0.25 = 128. Halving the effect size to 0.25 gives 512: note that halving the effect size quadruples the sample size.
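
The 32 in this rule of thumb is approximately 2(z_{1-α/2} + z_{1-β})², the total-sample-size constant for a two-arm comparison of means at 80% power and 5% two-sided significance (42 corresponds to 90% power). A minimal sketch reproducing the figures above (standard-library Python; the function name is an assumption for illustration):

```python
from statistics import NormalDist

def total_sample_size(effect_size, alpha=0.05, power=0.80):
    """Total N for a two-arm trial comparing means:
    n per group = 2 * (z_{1-alpha/2} + z_{1-beta})^2 / d^2."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    n_per_group = 2 * (z_alpha + z_beta) ** 2 / effect_size ** 2
    return 2 * n_per_group

# Rule of thumb from the slide: 32 / d^2 for 80% power (42 / d^2 for 90%)
print(round(total_sample_size(0.5)))     # ~126, close to 32 / 0.25   = 128
print(round(total_sample_size(0.25)))    # ~502, close to 32 / 0.0625 = 512
```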

Cluster sample size Because in education we will often randomise by classes or schools, we need to take the correlation between pupils (the intraclass correlation) into account in sample size calculations. This can often lead to a doubling or more of the required number of participants.
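
The usual inflation factor is the design effect, 1 + (m - 1)ρ, where m is the average cluster size and ρ is the intraclass correlation. A minimal sketch, with a class size of 25 and an ICC of 0.04 assumed purely for illustration (neither figure is from the slides):

```python
def design_effect(cluster_size, icc):
    """Inflation factor for cluster randomisation: 1 + (m - 1) * rho."""
    return 1 + (cluster_size - 1) * icc

# Hypothetical classes of 25 pupils with an intraclass correlation of 0.04
deff = design_effect(25, 0.04)   # 1.96: roughly doubles the sample
print(round(128 * deff))         # the 128 from the previous slide becomes ~251
```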

Reporting Quality of Trials As well as having an adequate sample size there are other important aspects of trial quality.

Important quality items Allocation method: method of randomisation; secure (concealed) randomisation. Intention-to-treat analysis. Blinding. Attrition.

Blinding Who knew who got what, and when? Was the participant blind? Most IMPORTANT: was outcome assessment blind?

Attrition What was the final number of participants compared with the number randomised? What happened to those lost along the way? Was there equal attrition?

External Validity Once a trial has high internal validity, our next task is to assess whether its results are applicable outside its sample. Are the participants similar to the general population to whom we would apply the intervention? Was the intervention, as used, generalisable?

Methods comparison of trials We undertook a methodological review of RCTs in health and education to answer the following questions: were bad trials prevalent only in health care? Was methodological quality improving over time? Torgerson et al., BERJ, 2004 (accepted).

Study Characteristics

Change in concealed allocation. NB: no education trial used concealed allocation. P = 0.04; P = 0.70.

Blinded Follow-up. P = 0.03; P = 0.13; P = 0.54.

Underpowered. P = 0.22; P = 0.76; P = 0.01.

Mean Change in Items. P = [not transcribed]; P = 0.07; P = 0.03.

Quality conclusions Apart from ‘drug’ trials, the quality of health care trials is poor and not improving outside of major journals. Education trials are bad and getting worse!

Trial Examples CBT vs Fire Safety Education (FSE) for child arsonists. N = 38 boys randomised to receive FSE or CBT. Outcomes measured at 13 weeks and 12 months included firesetting and match-play behaviour. Kolko, J Child Psychiatr 2001;42:359.

Results Outcomes were mixed: some outcomes favoured CBT, whilst for others there was no difference.

Results

Problems with Trial Too SMALL: the trial could have missed very important differences. Outcomes were NOT arson; there were no reports of arson by any children in the study. Unsure whether randomisation was concealed.

Domestic violence experiment 404 men convicted of partner abuse were randomised to probation or counselling. Data were collected at 12 months on re-arrests and on beliefs and behaviours regarding partner abuse. Feder & Dugan 2002; Justice Quarterly 19:343.

Results No difference in re-offending as measured by re-arrest statistics (i.e., 24% in both groups). No differences in attitudes towards partner abuse.

Trial Methods The trial was relatively large (> 200 in each group) and would have had enough power to detect a halving of offending, that is, from 24% down to 12%. For ‘beliefs’ there was a high drop-out rate (50%), which may make those results unreliable. Allocation appeared to be secure. Cross-over was slight. It was unclear whether re-arrest data were collected ‘blindly’.
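
As a rough check of the power claim, assuming about 202 men per group (the slides give only the total of 404) and a two-sided z-test on proportions, a halving of re-arrests from 24% to 12% would be detected with roughly 88% power. The sketch below is illustrative, not the trial's own calculation.

```python
from math import sqrt
from statistics import NormalDist

def power_two_proportions(p1, p2, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_group)
    se_alt = sqrt(p1 * (1 - p1) / n_per_group + p2 * (1 - p2) / n_per_group)
    return z.cdf((abs(p1 - p2) - z_alpha * se_null) / se_alt)

# Re-arrest rate halved from 24% to 12%, assuming ~202 men per group
print(round(power_two_proportions(0.24, 0.12, 202), 2))   # ~0.88
```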

Conclusion Counselling is probably an ineffective method of trying to prevent spousal abuse; other interventions should be sought. Message: if you’re being battered by your spouse, don’t bother with counselling!

Preventing unscheduled school transfers Unscheduled school transfers are associated with poor academic outcomes. The Raising Healthy Children project in Seattle aimed to put in place interventions for high-risk students who exhibited academic or behavioural problems. Fleming et al, 2001: Evaluation Review 25:655.

Design Cluster randomised trial: 10 schools randomised. 5 experimental schools received a variety of interventions to help high-risk students and their families. The analysis was multilevel, to take clustering into account.

Results The intervention produced a statistically significant reduction of two-thirds in transfer rates: 61% versus 45%, a difference of 16 percentage points. NOTE that the intervention schools still had a high transfer rate. Also, the effects of the intervention waned over the 5 years, suggesting it would need to be continuous to be effective.

Study implications The study showed an effective intervention. The number of clusters (10) was on the small side; ideally there should have been more. There was a high chance of missing a smaller effect.

Conclusions The RCT is the BEST evaluative method. RCTs can be, and have been, done in the field of education. We need MORE, larger, and better-quality trials to inform future policy in this area.