Statistics: the ten main mistakes. Didier Concordet, Ecole Nationale Vétérinaire de Toulouse, July 2005.



2 Statistical mistakes are frequent. Many surveys of statistical errors in the medical literature report error rates ranging from 30% to 90% (Altman, 1991; Gore et al., 1976; Pocock et al., 1987; MacArthur, 1984). Reviews of the biomedical literature have consistently found that about half of the articles use incorrect statistical methods (Glantz, 1980).

3 When do they occur? When designing the experiment, when collecting data, when analysing data, and when interpreting results.

4 Design. Lack of proper randomisation: the inference space is not defined; the groups to be compared are poorly balanced; there is no control group (maybe less frequent now); confounding factors exist. Lack of power: the sample size is not large enough to answer the question; the statistical unit is not well defined.

5 Inference space definition (M1). An experiment in 2-year-old beagles showed that the temperature of dogs treated with the antipyretic drug A decreased by 2 °C. Does this result still hold for all 2-year-old beagles? For 3-year-old beagles? For beagles in general? For dogs? For man?

6 Poor balance (M2). Clinical trial: comparison of 2 antipyretics, rectal temperature after treatment. Reference: mean = 39, N = 100, SD = 1. New TRT: mean = 37, N = 100, SD = 1. Reference < New TRT (P < 0.001): the new treatment appears to lower temperature more than the reference.

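7 Poor balance. Clinical trial: comparison of 2 antipyretics, rectal temperature after treatment. Clinical trial 1: Reference: mean = 40, N = 90, SD = 1; New TRT: mean = 42, N = 50, SD = 1; New TRT < Ref (P < 0.001). Clinical trial 2: Reference: mean = 30, N = 10, SD = 1; New TRT: mean = 32, N = 50, SD = 1; New TRT < Ref (P < 0.001). Conclusion: Reference > New TRT in each trial, although pooling the two unbalanced trials gives the opposite ordering (the means 39 and 37 of the previous slide).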
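The reversal on slides 6 and 7 can be checked with a few lines of arithmetic. The sketch below is only an illustration: it uses the means and group sizes read from the slides, and the names `trials` and `pooled_mean` are chosen here for the example.

```python
# (mean rectal temperature, number of animals) per arm, as read from slide 7.
trials = {
    "trial 1": {"Reference": (40.0, 90), "New TRT": (42.0, 50)},
    "trial 2": {"Reference": (30.0, 10), "New TRT": (32.0, 50)},
}


def pooled_mean(arm):
    """Size-weighted mean of one arm across all trials."""
    total = sum(t[arm][0] * t[arm][1] for t in trials.values())
    count = sum(t[arm][1] for t in trials.values())
    return total / count


# Within each trial the new treatment leaves animals 2 degrees warmer.
for name, arms in trials.items():
    diff = arms["New TRT"][0] - arms["Reference"][0]
    print(f"{name}: New TRT - Reference = {diff:+.1f}")

# Pooled over the unbalanced trials, the sign of the difference flips.
pooled_diff = pooled_mean("New TRT") - pooled_mean("Reference")
print(f"pooled: New TRT - Reference = {pooled_diff:+.1f}")
```

The flip comes only from the group sizes: 90 of the 100 reference animals are in the "warm" trial 1, whereas the new treatment is split 50/50 between the trials.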

8 Power (M3). A clinical study to compare the efficacy of two treatments (Ref. and Test). Expected difference between the treatments = 4 SD 2. For the efficacy variable. A parallel two-group design is planned with 5 dogs in each group. What to think about this study? It has 35% power for a type I risk of 5%: even if the expected difference exists, only 35% of the samples of dogs (of size 5) would actually show it!

9 Power. Efficacy variable in two groups of dogs (columns: Ref, Test; rows: Mean 15.4, SD, N 5 5). Student t-test: P = 0.18. Actually, no conclusion can be drawn.
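Power for a small two-group design can be estimated by simulation. The sketch below is not the author's calculation: the standardised difference `effect_size = 1.0` is a hypothetical value (the difference and SD on slide 8 are garbled in this transcript), the data are assumed normal with equal variances, and the critical values of Student's t are hard-coded for the two sample sizes used.

```python
import random
import statistics
from math import sqrt

# Two-sided 5% critical values of Student's t, keyed by degrees of freedom.
T_CRIT_975 = {8: 2.306, 58: 2.002}


def simulated_power(n_per_group, effect_size, n_sim=2000, seed=0):
    """Share of simulated trials in which a two-sample t-test rejects H0.

    effect_size is the true difference in means divided by the common SD
    (a hypothetical input, not a value taken from the slides).
    """
    rng = random.Random(seed)
    t_crit = T_CRIT_975[2 * n_per_group - 2]
    rejections = 0
    for _ in range(n_sim):
        ref = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        test = [rng.gauss(effect_size, 1.0) for _ in range(n_per_group)]
        # Equal group sizes: the pooled variance is the average of the two.
        pooled_var = (statistics.variance(ref) + statistics.variance(test)) / 2
        se_diff = sqrt(pooled_var * 2 / n_per_group)
        t_stat = (statistics.mean(test) - statistics.mean(ref)) / se_diff
        if abs(t_stat) > t_crit:
            rejections += 1
    return rejections / n_sim


power_5 = simulated_power(5, 1.0)
power_30 = simulated_power(30, 1.0)
print(f"n = 5 per group:  power ~ {power_5:.2f}")
print(f"n = 30 per group: power ~ {power_30:.2f}")
```

With 5 dogs per group, most simulated trials miss a difference that really exists, which is why a non-significant result such as P = 0.18 supports no conclusion.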

10 A real story. A study was performed to assess the effect of diet on several biochemical compounds (about 20). To this end, a dog was fed a "normal" diet for 3 months and then the new diet for 3 months. Every two days, a blood sample was taken and the biochemical compounds were assayed. At the end of the experiment, 90 data points were available for each biochemical compound. There was a significant difference between the effects of the two diets for 10 biochemical compounds (P < 0.001). This result was obtained with a sample size of 90.

11 Statistical unit (M4). The statistical unit (an individual) is a statistical object that cannot be divided. We want to generalise results obtained on a finite collection of units (a sample) to a population of units. Despite the apparent "wealth" of data, the sample size was 1, not 90. At the end of the experiment, the only dog in the experiment was well known, but what about the other dogs of the population?

12 Experiment. Missing data not adequately reported. Extreme values excluded. Data ignored because they did not support the hypothesis?

13 Analysis. Failure to check the assumptions of the statistical methods (M5): homoscedasticity (for a t-test, a linear regression, …); using a linear regression without first establishing linearity; correlation. Ignoring informative "missing" data: death and its consequences; data below the LOQ. Choosing the question to get an answer. Multiple comparisons.

14 Homoscedasticity (M5). [Figure: clearance by treatment (1, 2). t-test: P-value = 0.56. After log-transformation: P-value = (value missing). Panel title: What the t-test can see.]

15 Linearity/Correlation (M5). [Figure, two panels. First panel: linear regression, correlation R = (value missing). Second panel: linear regression, correlation R = -0.93.]

16 Linearity/Correlation. [Figure. Linear regression: correlation R = 0.84. A linear model with 3 groups: within-group correlation R = -0.92.]
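A positive overall correlation (R = 0.84 on the slide) can coexist with a strongly negative correlation inside every group (R = -0.92). The sketch below reproduces that pattern with made-up numbers: the data in `groups` are hypothetical, not the data behind the slide, and `pearson` is a small helper written for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: three groups whose centres rise together,
# while y falls as x rises inside each group.
groups = []
for k in range(3):
    xs = [10 * k + d for d in (0.0, 1.0, 2.0, 3.0)]
    ys = [10 * k + d for d in (3.0, 2.2, 0.8, 0.0)]
    groups.append((xs, ys))

within = [pearson(xs, ys) for xs, ys in groups]
all_x = [x for xs, _ in groups for x in xs]
all_y = [y for _, ys in groups for y in ys]
pooled = pearson(all_x, all_y)

print("within-group R:", [round(r, 2) for r in within])
print("pooled R:", round(pooled, 2))
```

Ignoring the group structure here would lead to a conclusion with the wrong sign, which is why a plot of the raw data should come before any regression.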

17 Ignoring data (M6)

18 Ignoring data

19 Choosing the question to get an answer (M7). This occurs frequently in the presentation of clinical trial results. The question becomes random: it changes with the sample of animals. The question is chosen with its answer in hand… Think of a coin-flip game where you win 1 when tails or heads occurs: you choose the decision rule once you know the result of the flip! Such an approach increases the number of false discoveries.

20 Multiple comparisons (M8). One wants to compare the ADG (average daily gain) obtained with 5 different diets in pigs (table: Mean and SD for each diet). Ten t-tests, with a risk of 5% for each comparison: the global risk can be very large.
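The size of the "global risk" can be approximated as follows. This is a rough sketch: it treats the ten pairwise tests as independent, which is an assumption made here for simplicity (pairwise comparisons that share groups are in fact correlated), and the Bonferroni threshold is shown only as one common correction, not as the author's recommendation.

```python
from math import comb

n_diets = 5
alpha = 0.05  # type I risk used for each individual t-test

# Number of pairwise comparisons between 5 diets.
n_comparisons = comb(n_diets, 2)

# Probability of at least one false positive if all diets are equivalent,
# ASSUMING the tests are independent (an approximation).
global_risk = 1 - (1 - alpha) ** n_comparisons

# Bonferroni: per-test threshold that keeps the global risk at or below alpha.
bonferroni_alpha = alpha / n_comparisons

print(f"{n_comparisons} comparisons")
print(f"approximate global risk: {global_risk:.2f}")
print(f"Bonferroni per-test threshold: {bonferroni_alpha:.3f}")
```

Under this approximation the chance of at least one spurious "significant" difference is about 40%, far above the nominal 5%.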

21 Interpretation/presentation. Standard error and standard deviation. P values: non-significant effects. False causality.

22 Standard error / standard deviation (M9). The clearance of the drug was equal to 68 ± 5 mL/min. Two possible meanings, depending on the meaning of 5. If 5 is the standard error of the mean (se), there is a 95% chance that the population mean clearance belongs to [ ; ]. If 5 is the standard deviation (SD), 95% of animals have their clearance within [ ; ].
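The two readings of "68 ± 5" can be made concrete with a normal approximation. The sketch below is an illustration under stated assumptions: clearances are taken to be normally distributed, the 95% bounds use the normal quantile (about 1.96) instead of a Student quantile, and the sample size `n_animals = 25` is hypothetical, since the slide gives none.

```python
from math import sqrt
from statistics import NormalDist


def interval_95(center, spread):
    """center +/- z * spread, with z the 97.5% quantile of the standard normal."""
    z = NormalDist().inv_cdf(0.975)
    return center - z * spread, center + z * spread


# The same bounds carry two very different meanings:
# - if 5 is the se, they bracket the POPULATION MEAN clearance;
# - if 5 is the SD, they bracket the clearance of about 95% of ANIMALS.
low, high = interval_95(68.0, 5.0)
print(f"68 +/- 1.96 * 5 -> [{low:.1f} ; {high:.1f}] mL/min")

# Link between the two quantities: se = SD / sqrt(n).
n_animals = 25  # hypothetical sample size, not given on the slide
sd = 5.0
se = sd / sqrt(n_animals)
ci_mean = interval_95(68.0, se)
print(f"SD = {sd}, n = {n_animals} -> se = {se}, "
      f"mean within [{ci_mean[0]:.2f} ; {ci_mean[1]:.2f}]")
```

Because the se shrinks with the sample size while the SD does not, reporting "±" without saying which one is meant leaves the reader unable to tell a precise estimate of the mean from a description of between-animal variability.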

23 P values (M10). "The difference between the effects of drugs A and B is not significant (P = 0.56), therefore drug A can be substituted by drug B." NO. The only conclusion that can be drawn from such a P value is that you did not see any difference between the effects of drugs A and B. That does not mean that such a difference does not exist. Absence of evidence is not evidence of absence.

24 P values (M10). "Drug A has a higher efficacy than drug B (P = 0.001). Drug C has a higher efficacy than drug B (P = 0.04). Since 0.001 < 0.04, drug A has a higher efficacy than drug C." NO. The only conclusion that can be drawn from such P values is that you are sure that A > B and less sure that C > B. This says nothing about the size of the differences. Significant does not mean important.

25 False causality: lying with statistics. There is a strong positive correlation between the number of firefighters present at a fire and the amount of fire damage. Thus, the firefighters present at a fire cause greater fire damage! The correlation coefficient is nothing more than a measure of the strength of a linear relationship between 2 variables. Correlation cannot establish causality. A strong correlation between X and Y can occur when: "X" causes "Y"; "Y" causes "X"; "Z" causes "X" and "Y" (Z = fire size in the previous example); or by chance, with small sample sizes, when X and Y are independent.
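The last case, a strong correlation arising by chance in small samples, is easy to see by simulation. The sketch below draws X and Y independently from a standard normal distribution; the threshold of 0.8, the sample sizes 4 and 50, and the function names are choices made for this illustration.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def share_of_strong_correlations(n, n_sim=2000, threshold=0.8, seed=1):
    """Share of simulated samples of size n with |R| above the threshold,
    when X and Y are drawn independently (so the true correlation is 0)."""
    rng = random.Random(seed)
    strong = 0
    for _ in range(n_sim):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ys = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if abs(pearson(xs, ys)) > threshold:
            strong += 1
    return strong / n_sim


small = share_of_strong_correlations(4)
large = share_of_strong_correlations(50)
print(f"n = 4:  share of |R| > 0.8 = {small:.2f}")
print(f"n = 50: share of |R| > 0.8 = {large:.2f}")
```

With only 4 points, a sizeable share of samples shows |R| > 0.8 even though X and Y are unrelated; with 50 points this essentially never happens.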

26 How to avoid these mistakes? Consult your preferred statistician for help in the design of complicated experiments. Use basic descriptive statistics first (graphics, summary statistics, …). Use common sense. Consider learning more statistics.