Empowering Evidence: Basic Statistics June 3, 2015 Julian Wolfson, Ph.D. Division of Biostatistics School of Public Health.


1 Empowering Evidence: Basic Statistics June 3, 2015 Julian Wolfson, Ph.D. Division of Biostatistics School of Public Health

2 My goal for today Introduce the major statistical “potholes” to be on the lookout for when interpreting published research.

3 NOT my goal for today ● Cram an intro stat course into one hour ● Teach you how to analyze data ● Give you a recipe for deciding whether a statistical analysis has been conducted correctly

4 Leek and Peng, Nature (2015)

5 1. Selection bias and confounding 2. Multiple comparisons and post-hoc analysis 3. Statistical vs. scientific significance

6 What is statistics? Statistics is the science which allows us to draw reliable conclusions from data. For the purposes of evaluating evidence, we are mainly interested in statistical inference, which involves quantifying the uncertainty of our conclusions based on the data in hand.

7

8 All roads lead to causality ● In medical science, we mostly seek to understand cause-effect relationships. ● Randomized intervention studies are one important tool. ● But sometimes we are “stuck” with observational data.

9 Selection bias & confounding

10 Selection bias: what to watch out for ● Unmeasured (unmeasurable?) risk factors ● Exclusion of observations with missing data ● Post-randomization comparisons in randomized studies: o “Compliers only” analyses o Surrogate endpoint analyses
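The "exclusion of observations with missing data" item can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the population (systolic blood pressure with mean 135 and SD 15) and the missingness rule (readings above 150 are recorded only 40% of the time) are hypothetical and not taken from the slides.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: systolic blood pressure ~ Normal(135, 15).
sbp = [random.gauss(135, 15) for _ in range(20_000)]

def is_observed(value):
    """Assumed missingness rule: readings above 150 are recorded only 40% of the time."""
    return value <= 150 or random.random() < 0.4

# A "complete case" analysis silently drops everyone with a missing reading.
complete_cases = [x for x in sbp if is_observed(x)]

full_mean = mean(sbp)
complete_case_mean = mean(complete_cases)
print(f"all participants:    {full_mean:.1f}")
print(f"complete cases only: {complete_case_mean:.1f}")
```

Because the chance of being missing depends on the value itself, the complete-case mean comes out lower than the mean over all participants, even though nothing was done wrong in the arithmetic.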

11

12

13 Selection bias & confounding Recommendations: 1. Seek randomized trial evidence wherever possible, but be skeptical of non-ITT (intention-to-treat) analyses. 2. Look for how missing data and drop-out were handled. 3. Do a “mental sensitivity analysis” → how large would the effect of selection bias / confounding have to be to change the scientific conclusion?

14 The almighty p-value For statistical inference, we seek to evaluate the plausibility of the null hypothesis about some characteristic of the population: ● Mean SBP is 135 ● Probability of getting the flu over next 6 months is the same for two vaccines ● Median time to progression is the same for two chemotherapeutic agents

15 The almighty p-value We evaluate the evidence for/against the null hypothesis on the basis of our sample. The p-value tells us how surprised we should be, if the null hypothesis were true, to see data “as or more extreme” than that in our sample. If the p-value based on our sample is small (we are very surprised!), we take a leap of faith and declare that the null hypothesis is false.
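To make "as or more extreme" concrete, here is a minimal sketch of a two-sided p-value for the "mean SBP is 135" null hypothesis. The sample mean (138), standard deviation (15, treated as known) and sample size (100) are invented for illustration, and a z-test is only one of several tests that could be used.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_z_p_value(sample_mean, null_mean, sd, n):
    """Two-sided p-value for a one-sample z-test with known standard deviation."""
    se = sd / sqrt(n)                      # standard error of the sample mean
    z = (sample_mean - null_mean) / se     # how many standard errors from the null value
    # Probability, under the null, of a z at least this far from 0 in either direction.
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical sample: 100 people, mean SBP 138, SD 15; null hypothesis: mean SBP is 135.
p = two_sided_z_p_value(sample_mean=138.0, null_mean=135.0, sd=15.0, n=100)
print(round(p, 4))  # z = 2.0, so p is about 0.0455
```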

16 http://xkcd.com/1478/

17 P-value fishing Hypothesis tests: ● Are designed to control the Type I error rate, the probability of rejecting the null hypothesis when it is true. ● Rely on the assumption that you are performing a single well-defined, repeatable experiment.

18 P-value fishing Problem: Common abuses of significance testing (“p-value fishing”) result in hypothesis testing procedures which do NOT control the Type I error rate. One of the most common abuses is post-hoc subgroup analysis.

19 Example ● Suppose you perform a study to assess the effect of a tuberculosis vaccine vs. placebo ● Overall, there is no effect, but you notice that the vaccine appears more effective in Hispanic women. ● You test the null hypothesis of no vaccine effect for Hispanic women → p = 0.013 ● You report “The vaccine offers protection against tuberculosis for Hispanic females (p = 0.013).”

20

21 Post-hoc subgroup analysis The major problem with post-hoc subgroup analysis is that you are often using the same data to generate and test the hypothesis. Think: What is the repeatable experiment here?
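One way to think about the "repeatable experiment" is to simulate it. The sketch below assumes a study where the vaccine truly has no effect in any of 10 non-overlapping subgroups (the number 10 and the independence of subgroups are assumptions for illustration). It compares testing one subgroup chosen in advance with testing whichever subgroup looks most striking after the fact.

```python
import random
from statistics import NormalDist

random.seed(1)        # fixed seed so the sketch is reproducible
ALPHA = 0.05
N_SUBGROUPS = 10      # hypothetical number of subgroups examined
N_STUDIES = 20_000    # simulated studies in which the vaccine truly has no effect
std_normal = NormalDist()

def two_sided_p(z):
    return 2 * (1 - std_normal.cdf(abs(z)))

prespecified_hits = 0
post_hoc_hits = 0
for _ in range(N_STUDIES):
    # Under the null, each subgroup's test statistic is a standard normal draw.
    z_scores = [random.gauss(0, 1) for _ in range(N_SUBGROUPS)]
    # Pre-specified: always test subgroup 0, chosen before seeing the data.
    if two_sided_p(z_scores[0]) < ALPHA:
        prespecified_hits += 1
    # Post hoc: test whichever subgroup looks most striking in this data set.
    if two_sided_p(max(z_scores, key=abs)) < ALPHA:
        post_hoc_hits += 1

prespecified_rate = prespecified_hits / N_STUDIES
post_hoc_rate = post_hoc_hits / N_STUDIES
# With independent subgroups, theory gives 1 - (1 - alpha)^k for the post-hoc rate.
expected_post_hoc = 1 - (1 - ALPHA) ** N_SUBGROUPS
print(f"pre-specified subgroup: {prespecified_rate:.3f}")
print(f"best-looking subgroup:  {post_hoc_rate:.3f} (theory: {expected_post_hoc:.3f})")
```

The pre-specified test rejects in roughly 5% of these null studies, as designed. The "pick the best-looking subgroup, then test it" procedure rejects in roughly 40% of them, so a reported p = 0.013 from that procedure carries far less evidence than it appears to.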

22 Post-hoc subgroup analysis Recommendation: When evaluating evidence, try to establish whether tested hypotheses were pre-specified. If not, be very cautious about interpretation.

23 Power The power of a hypothesis test is the probability of rejecting the null hypothesis when it is false in some specified way. e.g., “Our study has 80% power to detect a 20 mg/dl drop in total cholesterol.”
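A statement like the cholesterol example can be checked with a back-of-the-envelope calculation. The sketch below uses a two-sided, two-sample z-test approximation; the standard deviation of 40 mg/dl and the group sizes are assumptions made up for illustration, not values from the slides.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_power(delta, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test (known, equal SDs)."""
    std_normal = NormalDist()
    se = sd * sqrt(2 / n_per_group)             # standard error of the difference in means
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # critical value, about 1.96 for alpha = 0.05
    shift = abs(delta) / se                     # true difference in standard-error units
    # Probability that the test statistic lands in either rejection region.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

# Hypothetical design: detect a 20 mg/dl difference, assuming SD = 40 mg/dl.
for n in (20, 63, 150):
    print(n, round(two_sample_power(delta=20, sd=40, n_per_group=n), 2))
```

Under these assumptions, about 63 participants per group gives roughly 80% power; fewer participants give less, more give more. Changing the assumed SD or the target difference changes the answer substantially, which is why the "specified way" in the definition matters.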

24 Sample size Most study designs trade off power and sample size, while keeping the Type I error rate fixed. In many studies, “sample size” is hard to pin down: ● Longitudinal studies ● Adaptive designs ● Cluster randomization

25 Beware large sample sizes! Budding researchers are universally warned about drawing conclusions from small samples. But inferential statistics is very good at quantifying uncertainty in these settings. With bigger sample sizes now feasible to collect and analyze, we need to have a conversation about the dangers of large N.

26 Statistical vs. Scientific Significance As sample sizes increase, smaller differences between groups become “detectable” (likely to yield p < 0.05). Effects which are statistically significant may be so small as to be scientifically insignificant.
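The effect of sample size on "detectability" can be seen by holding a tiny difference fixed and letting N grow. In this sketch the 0.5 mmHg difference in mean blood pressure and the SD of 15 are hypothetical, and the test is a simple two-sample z-test.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(diff, sd, n_per_group):
    """Two-sided p-value for an observed difference in means (two-sample z-test)."""
    se = sd * sqrt(2 / n_per_group)
    return 2 * (1 - NormalDist().cdf(abs(diff) / se))

# Same hypothetical 0.5 mmHg difference (SD = 15), increasingly large studies.
p_by_n = {n: two_sided_p(0.5, 15, n) for n in (100, 10_000, 1_000_000)}
for n, p in p_by_n.items():
    print(f"n per group = {n:>9,}: p = {p:.4f}")
```

The observed difference never changes, yet the p-value moves from clearly non-significant at 100 per group to below 0.05 at 10,000 per group and to essentially zero at a million per group. Whether 0.5 mmHg matters clinically is a scientific question the p-value cannot answer.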

27 Obstet Gynecol. 2010 Feb; 115(2 Pt 1): 357–364.

28 Statistical vs. Scientific Significance Recommendation: Look at the effect estimate (and corresponding confidence interval*) in addition to the p-value. *Confidence interval: An estimated range for the effect, computed by a procedure which should contain the true effect size in 90%, 95%, or 99% of repeated studies (depending on the chosen confidence level).
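The "% of the time" in the confidence interval footnote refers to repeating the whole study many times. A minimal simulation sketch, assuming a hypothetical true mean of 135, a known SD of 15, and 50 participants per study:

```python
import random
from math import sqrt
from statistics import NormalDist, mean

random.seed(2)  # fixed seed so the sketch is reproducible
TRUE_MEAN, SD, N = 135.0, 15.0, 50          # hypothetical population and study size
Z_95 = NormalDist().inv_cdf(0.975)          # about 1.96
half_width = Z_95 * SD / sqrt(N)            # half-width of a 95% z-interval (known SD)

def interval_covers_truth():
    """Run one simulated study and report whether its 95% CI contains the true mean."""
    sample = [random.gauss(TRUE_MEAN, SD) for _ in range(N)]
    centre = mean(sample)
    return centre - half_width <= TRUE_MEAN <= centre + half_width

N_STUDIES = 5_000
coverage = sum(interval_covers_truth() for _ in range(N_STUDIES)) / N_STUDIES
print(f"fraction of intervals containing the true mean: {coverage:.3f}")
```

The fraction comes out close to 0.95. Any single interval either contains the true value or does not; the 95% describes how often the procedure succeeds across repeated studies.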

29 Wrap-up The major “statistical” issues which arise when evaluating evidence are non-technical, mostly scientific issues. (Bio)statisticians are trained to recognize these issues and prevent/correct them or explain their possible impact on results. Evaluating evidence requires a team-based approach which combines statistical and domain expertise.

30 Thank you!

