
Warwick Clinical Trials Unit 1 Statistical Errors in Publications October 2010.


1 Warwick Clinical Trials Unit 1 Statistical Errors in Publications October 2010

2 OVERVIEW
Greater emphasis on sections dealing with:
- Design;
- Sample size;
- Statistical methodology;
- Results (Presentation/Interpretation);
- Discussion/Conclusion.

3 SAMPLE PAPERS
- Sample 1: Randomised controlled trial of the management of ankle sprains, comparing elastic support bandage v. Aircast ankle brace (Br. J. Sports Med, 2005);
- Sample 2: Study to assess variables which predict chronic neck pain disability (Arch Phys Med Rehabil, 2004).

4 PREVALENCE OF STATISTICAL ERRORS
- Concerns about the misuse of statistics date back over 70 years (Altman, 2004);
- Despite greater awareness of statistical issues (e.g. CONSORT), such concerns have not diminished.

5 Prevalence of Statistical Errors (cont’d)
- Serious statistical errors were found in 40% of 164 articles published in psychiatry (Altman, 2002);
- At least one serious statistical error occurred in 38% and 25% of papers in Nature and BMJ respectively (Garcia-Berthou and Alcaraz, 2004);
- Many surveys of statistical errors report error rates ranging from 30% to 90% (Altman, 1991; Gore et al., 1976; Pocock et al., 1987; MacArthur, 1984).

6 Why are there so many errors? (Altman, 2004)
- Many investigators are not professional researchers; they are primarily clinicians;
- Training is usually a single course in statistics;
- Training focuses on data analysis, but issues such as statistical reporting and interpretation are not addressed;
- The statistical content and complexity of medical research has increased steadily over recent decades.

7 (Altman, 2004)
“........ When I tell friends outside medicine that many papers published in medical journals are misleading because of methodological weaknesses they are rightly shocked. Huge sums of money are spent annually on research that is seriously flawed through the use of inappropriate designs, unrepresentative samples, small samples, incorrect methods …”

8 Diagram: the research cycle (legend: process, stage, activity)
- Observe (natural course of disease); Hypothesize (frame research question); Test (conduct experiment/clinical trial); Conclude (validate or modify hypothesis)
- Personal & scientific experience; research planning, grant writing, protocol development; data collection & analysis; journal articles, scientific meetings
- Concept development; experimental design; statistical inference

9 DESIGN
- Population: a group of individual persons, objects, or items from which samples are taken.
- Sample: a finite part of a statistical population whose properties are studied to gain information about the whole.
- Sampling: the process of selecting a suitable sample, or a representative part of a population, for the purpose of determining parameters or characteristics of the whole population.
- Purpose of sampling: to draw conclusions about populations from samples, we must use inferential statistics, which enable us to determine a population's characteristics by directly observing only a portion (or sample) of the population.

10 Design (cont’d)
Sampling error: What can make a sample unrepresentative of its population? One of the most frequent causes is sampling error.
Two types of sampling error:
(i) Chance: the error that occurs just because of bad luck.
(ii) Bias: sampling bias is a tendency to favour the selection of units that have particular characteristics (as a result of a poor sampling plan).
To avoid sampling error: plan carefully and select participants at random.
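The difference between chance error and bias can be sketched with a small simulation. This is only an illustration under stated assumptions: the population values, the deliberately poor `biased_sample` rule and all function names are hypothetical and do not come from the slides.

```python
import random
import statistics


def simple_random_sample(population, n, rng):
    """Every unit has the same chance of selection (without replacement)."""
    return rng.sample(population, n)


def biased_sample(population, n):
    """A deliberately poor sampling plan: always take the n largest values."""
    return sorted(population)[-n:]


rng = random.Random(2010)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 scores (illustrative values only).
population = [rng.gauss(50, 10) for _ in range(10_000)]
true_mean = statistics.mean(population)


def spread_of_sample_means(n, repeats=200):
    """Standard deviation of the means of repeated random samples of size n."""
    means = [
        statistics.mean(simple_random_sample(population, n, rng))
        for _ in range(repeats)
    ]
    return statistics.stdev(means)


# Chance: random samples scatter around the true mean, and the scatter
# shrinks as the sample size grows.
spread_small = spread_of_sample_means(20)
spread_large = spread_of_sample_means(500)

# Bias: a systematic error that a larger sample does not remove.
bias_small = statistics.mean(biased_sample(population, 20)) - true_mean
bias_large = statistics.mean(biased_sample(population, 500)) - true_mean

print(f"spread of random-sample means: n=20 {spread_small:.2f}, n=500 {spread_large:.2f}")
print(f"error of the biased sample:    n=20 {bias_small:.2f}, n=500 {bias_large:.2f}")
```

Random selection leaves only chance error, which a larger sample reduces; the biased plan stays far from the population mean however many units it takes.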

11 SAMPLE SIZE
Sample size may be determined by various practical constraints: financial; resources.
- Too small a sample is not representative of a population;
- Too large a sample results in wastefulness and is unethical;
- The larger the sample size, the more likely the results will reflect what will happen in the population.

12 Sample size (Power Calculation) (cont’d)
- Difference: the clinically important difference;
- Significance threshold: the type I error rate, conventionally set at 0.01 or 0.05;
- Power: i.e. 1 minus the type II error rate, conventionally 80% or 90%; how confident you are that the sample will detect a difference, if one really exists in the population;
- Variability: the less variability among patients within each group, the more likely they reflect the overall populations.

13 Sample size (Power Calculation) (cont’d)
The required sample size increases with:
(a) a smaller clinically relevant difference;
(b) an increase in power;
(c) greater variability;
(d) a reduction in the type I error rate.
Allow for dropouts and/or withdrawals.
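How each factor moves the required sample size can be checked with a rough calculation. The sketch below assumes a two-arm comparison of means with equal group sizes and uses the usual normal approximation; the function names, the input values and the way dropouts are handled (dividing by one minus the dropout rate) are assumptions for illustration, not the method used in the sample papers.

```python
import math
from statistics import NormalDist


def n_per_group(diff, sd, alpha=0.05, power=0.80):
    """Approximate participants per group for a two-sided comparison of two
    means, using the normal approximation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # type I error (two-sided)
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - type II error
    return math.ceil(2 * ((z_alpha + z_beta) * sd / diff) ** 2)


def allow_for_dropout(n, dropout_rate):
    """Inflate n so that about n participants remain after dropouts."""
    return math.ceil(n / (1 - dropout_rate))


# Hypothetical baseline: difference 2.0, SD 4.0, alpha 0.05, power 80%.
base = n_per_group(diff=2.0, sd=4.0)

smaller_difference = n_per_group(diff=1.0, sd=4.0)          # (a)
higher_power = n_per_group(diff=2.0, sd=4.0, power=0.90)    # (b)
more_variability = n_per_group(diff=2.0, sd=6.0)            # (c)
stricter_alpha = n_per_group(diff=2.0, sd=4.0, alpha=0.01)  # (d)
with_dropouts = allow_for_dropout(base, dropout_rate=0.25)

print("baseline per group:", base)
print("(a) smaller difference:", smaller_difference)
print("(b) higher power:", higher_power)
print("(c) more variability:", more_variability)
print("(d) lower type I error:", stricter_alpha)
print("baseline allowing for 25% dropout:", with_dropouts)
```

Every change listed on the slide pushes the number above the baseline. A calculation based on the t-distribution would give slightly larger numbers for small groups.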

14 Sample size (cont’d)
Review the two articles in terms of:
- Design
- Sample size

15 Sample size (cont’d)
- “….A major concern in the design of studies is the almost universal lack of reporting of how the sample size was obtained…..” (Altman, 2000).
- “…Basis of the power calculation is inadequately described …” (Malachy, 2004, Vail et al., 2003). (all sample papers)
- “Quite often sample size calculations are computed without allowing for dropouts” (McGuigan, 1995). (all sample papers)

16 Sample size (cont’d)
Small studies:
- Small trials have low power and a high type II error rate;
- If no sample size calculation is provided, the conclusions of the study have little value (as sample 2);
- If the study is underpowered, the conclusions should be taken with caution and the results are inconclusive (as sample 1).

17 Sample size (cont’d)
A description of the sample size in the literature should contain, for example:
“The mean and SD for the RMQ on the active management are 5.91 and 4.27 respectively (Oxfordshire Low Back Pain trial, BMJ, 2005). The smallest difference between the two therapies which is clinically relevant is approximately 2.0. Using this information, the total number of participants required for this study will be 700, allowing for a 25% loss to follow-up and using 90% power with a 1% type I error rate (significance level).”

18 METHODS
“................ All of the problems hinge on the understanding of what a statistical test is doing and what a p-value means....” (Murphy, 2004)

19 METHODS
A statistical test is a procedure used to compute a probability (the p-value) that measures how compatible the observed data are with the null hypothesis.

20 Methods (cont’d)
e.g. H0: H1: Test statistic: t-test =
The test statistic is transformed into a p-value.

21 Methods (cont’d)
- P-value: the strength of the evidence against the null hypothesis, quantified as the probability of a result at least as extreme as the one observed if the null hypothesis were true.
- Neither the statistical test nor the p-value PROVES or DISPROVES the null hypothesis; they provide EVIDENCE about it.
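What a p-value measures can be made concrete with a permutation test. This is a technique swapped in for the slides' t-test so that the sketch needs only the Python standard library. It shuffles the group labels many times and counts how often a difference at least as large as the observed one arises when the null hypothesis of no group difference holds. The data values and names below are hypothetical.

```python
import random
import statistics


def permutation_p_value(group_a, group_b, n_permutations=5_000, seed=0):
    """Two-sided permutation p-value for a difference in means."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are interchangeable.
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # The +1 terms count the observed arrangement as one of the permutations.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical outcome scores (illustrative values only).
control = [5.1, 4.8, 6.0, 5.5, 5.2, 4.9, 5.8, 5.4]
clearly_different = [7.9, 8.3, 7.5, 8.8, 8.1, 7.7, 8.4, 8.0]
very_similar = [5.3, 4.7, 5.9, 5.6, 5.0, 5.1, 5.7, 5.5]

p_different = permutation_p_value(control, clearly_different)
p_similar = permutation_p_value(control, very_similar)

print(f"control v. clearly different group: p = {p_different:.4f}")
print(f"control v. very similar group:      p = {p_similar:.2f}")
```

A small p-value says that the observed difference would be unusual if the null hypothesis were true; a large one says only that the data are compatible with it. Neither proves anything.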

22 Methods (cont’d)
Review the two articles in terms of:
- Methods
- Results (including figures and tables)

23 Methods (cont’d)
- “.. A further issue is the copying of incorrect or inappropriate methods. Once incorrect procedures become common, it is hard to stop them from spreading through the medical literature like a genetic mutation..” (Altman, 2002). (as sample 1)
- “Schwartzer et al. (2000) found that most papers made important errors in the application of new technology such as models for longitudinal data.” (Altman, 2000). (e.g. hierarchical models in sample 1; ROC curves in sample 2)

24 Methods (cont’d)
Most common errors in the Methods section:
- Failure to check assumptions (Nature says that the most common error was not checking for a normal distribution and not stating how normality was tested);
- Using linear regression analysis without first establishing that the relationship is linear;
- Ignoring paired or ordered categories and therefore using an inappropriate test;
- Arbitrarily dividing continuous data into ordinal categories without explanation (“data dredging”);
- Multiple comparisons (these increase the likelihood of a spuriously significant result) (sample 2);
- And many more: sub-group analyses, ignoring repeated measures design, non-matched analysis for matched data, modelling incorrectly (e.g. interactions not included) …….
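The multiple-comparison problem in the list above comes down to simple arithmetic. The sketch below assumes independent tests in which every null hypothesis is true; the Bonferroni adjustment is shown only as one commonly used remedy and is not taken from the slides.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive across n_tests independent
    tests when every null hypothesis is true."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the familywise error rate
    at or below alpha."""
    return alpha / n_tests


for k in (1, 5, 10, 20):
    fwer = familywise_error_rate(0.05, k)
    adjusted = bonferroni_threshold(0.05, k)
    print(f"{k:>2} tests: chance of a false positive = {fwer:.2f}, "
          f"Bonferroni threshold = {adjusted:.4f}")
```

With 20 unadjusted tests at the 0.05 level, at least one "significant" result is more likely than not even when no real effect exists.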

25 Methods (cont’d)
- Begin a statistical analysis with data exploration;
- Check assumptions;
- Identify the type of data: continuous, binary, ordinal, repeated over time, etc.;
- Check for missing values, outliers and the number of withdrawals;
- Be careful with computer output (it often helps to do simple calculations by hand first).

26 RESULTS
- “..The results section must be written so that the average reader can understand the study findings” (Cummings, 2003).
- “… poorly written with excessive jargon …” (Byrne, 2000). (sample 1 and sample 2)
- “.. A major bias is cherry-picking results…” (Malachy, 2004).

27 Results (cont’d)
Common language pitfalls:
- Avoid non-technical uses of technical terms such as “normal”, “significant”, “sample”;
- “No difference” means “lack of evidence of a statistically significant difference” (sample 1);
- Report p-values to 2-digit precision (e.g. p = 0.82);
- Do not reduce p-values to ‘non-significant’ or ‘NS’;
- Report a quantity to a precision that is scientifically relevant (e.g. a mean blood pressure of 115.73 mmHg should be reported as 115.7 mmHg or even 116 mmHg).
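The reporting advice above can be turned into a small formatting helper. This is a sketch only: the function name is hypothetical, "2-digit precision" is read here as two significant digits, and the "p < 0.001" floor for very small values is a common convention assumed for illustration rather than something stated on the slide.

```python
def format_p(p):
    """Report the actual p-value to two significant digits, never 'NS'."""
    if not 0 <= p <= 1:
        raise ValueError("a p-value must lie between 0 and 1")
    if p < 0.001:
        # Assumed convention: very small values are reported as a bound.
        return "p < 0.001"
    return f"p = {p:.2g}"


print(format_p(0.8234))
print(format_p(0.0312))
print(format_p(0.00004))

# Reporting a measurement at a scientifically relevant precision.
mean_bp = 115.73
print(f"mean blood pressure: {mean_bp:.1f} mmHg (or {mean_bp:.0f} mmHg)")
```

Reporting the value itself lets readers judge the strength of evidence for themselves instead of relying on a "significant" or "non-significant" label.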

28 Results (cont’d)
P-values:
- Over-emphasis on the p-value;
- An arbitrary division of the results into “significant” and “non-significant” according to the p-value was not the intention of the founders of statistical inference;
- Smaller p-values indicate stronger evidence against the null hypothesis.

29 Results (cont’d)
Confidence intervals:
- A confidence interval is a range of values expected, with a stated level of confidence (e.g. 95%), to enclose the population value;
- Confidence intervals are preferable to p-values, as they tell us the range of possible effect sizes compatible with the data;
- The larger the sample size, the narrower the confidence interval;
- A confidence interval for a difference (e.g. treatment difference) that contains 0, or for a ratio (e.g. odds ratio) that contains 1, implies a lack of evidence of a statistically significant difference.
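The points about interval width and the "contains 0" check can be seen in a short simulation. The sketch assumes a normal-approximation interval for a difference in means; the simulated data, the seed and the function names are hypothetical. For small samples a t-based interval would be somewhat wider.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def ci_for_difference(group_a, group_b, level=0.95):
    """Approximate confidence interval for mean(group_a) - mean(group_b),
    using the normal approximation with separate group variances."""
    diff = mean(group_a) - mean(group_b)
    se = math.sqrt(
        stdev(group_a) ** 2 / len(group_a) + stdev(group_b) ** 2 / len(group_b)
    )
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff - z * se, diff + z * se


def contains_zero(interval):
    low, high = interval
    return low <= 0 <= high


rng = random.Random(2010)  # fixed seed so the sketch is repeatable


def simulate(n):
    """Two hypothetical groups whose true means differ by 2 (SD 4)."""
    group_a = [rng.gauss(10, 4) for _ in range(n)]
    group_b = [rng.gauss(12, 4) for _ in range(n)]
    return ci_for_difference(group_a, group_b)


ci_small = simulate(20)
ci_large = simulate(500)
width_small = ci_small[1] - ci_small[0]
width_large = ci_large[1] - ci_large[0]

for label, ci in (("n=20 per group", ci_small), ("n=500 per group", ci_large)):
    print(f"{label}: 95% CI {ci[0]:.2f} to {ci[1]:.2f}, "
          f"contains 0: {contains_zero(ci)}")
```

The larger study gives a much narrower interval around the same underlying difference, which is why an interval conveys more than a bare p-value.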

30 Results (cont’d)
And many more pitfalls:
- testing baseline values (sample 1);
- not reporting missing data;
- lack of statistical power not considered;
- misinterpreting and misunderstanding results from models, e.g. no interactions included.

31 PRESENTATION
- In tables that compare groups, include counts (of patients or events) and column percentages;
- Use appropriate summary statistics (e.g. the median rather than the mean for non-normal data);
- In tables of column percentages, do not include a row of counts and percentages of missing data (doing this will distort the other percentages in the table);
- Statistical software packages provide a large amount of output; be selective about what is presented;
- Use graphs as an alternative to tables with many entries; do not duplicate graphs and tables;
- Label graphs and tables correctly (sample 1 and sample 2).

32 INTERPRETATION AND DISCUSSION
- Put the study sample in the context of the population;
- Do not interpret studies with non-significant results and low statistical power as “negative” when they are inconclusive: “The absence of proof is not proof of absence”;
- Errors encountered in the design and analysis of a study can also continue through to errors in interpretation (Rushton, 1999);
- State weaknesses in study design and study strengths so that a clear and accurate impression of the reliability of the data can be formed.

33 And finally…..
- The misuse of statistics is a very important issue;
- Statisticians need to be involved in research at some stage, preferably as early as possible;
- Most errors are relatively unimportant;
- Some can have a major bearing on the validity of the study.
So…….

34 “There are three kinds of lies: lies, damned lies and statistics”. Benjamin Disraeli.


