
1 April Clyburne-Sherin (@april_cs)
Center for Open Science (@OSFramework), http://cos.io/
Fostering openness, integrity, and reproducibility of scientific research

2 Technology to enable change
Training to enact change
Incentives to embrace change

4 Reproducible statistics in the health sciences
April Clyburne-Sherin, Reproducible Research Evangelist
april@cos.io

5 Reproducible statistics in the health sciences
● The problem with the published literature
o Reproducibility
o Power
o Reporting bias
o Research degrees of freedom
● The solution
o Preregistration
● How to evaluate the published literature
o p-values
o Effect sizes and confidence intervals
● How to preregister
o Open Science Framework

6 Reproducible statistics in the health sciences
Learning objectives
● The findings of many studies cannot be reproduced
● Low-powered studies produce inflated effect sizes
● Low-powered studies produce a low chance of finding true positives
● Researcher degrees of freedom lead to inflated false-positive rates
● Selective reporting biases the literature
● Preregistration is a simple solution for reproducible statistics
● A p-value is not enough to establish clinical significance
● Effect sizes plus confidence intervals work better together

7

8 Button et al. (2013) Power in Neuroscience

9 Figure 1. Positive results by discipline. Fanelli D (2010) “Positive” Results Increase Down the Hierarchy of the Sciences. PLoS ONE 5(4): e10068. doi:10.1371/journal.pone.0010068

10 The findings of many studies cannot be reproduced
Why should you care?
● To increase the efficiency of your own work
o It is hard to build on our own work, or the work of others in our lab
● We may not have the knowledge we think we have
o This is hard to even check when reproducibility is low

11 Current barriers to reproducibility
● Statistical
o Low power
o Researcher degrees of freedom
o Ignoring null results
● Transparency
o Poor documentation
o Loss of materials and data
o Infrequent sharing

12 Low-powered studies mean a low chance of finding a true positive
● Low reproducibility due to power
o e.g., at 40% power, only a 16% chance of finding the effect twice (see the sketch below)
● Inflated effect size estimates
● Decreased likelihood of true positives
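
A minimal simulation sketch of the 16% figure, assuming a simple two-group design with a true standardized effect of d = 0.5 and 25 subjects per group (assumed values chosen to give roughly 40% power, so two independent "hits" occur about 0.4 × 0.4 ≈ 16% of the time):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
true_d = 0.5        # assumed true standardized mean difference
n_per_group = 25    # small sample: roughly 40% power for d = 0.5
n_studies = 10_000

def one_study_significant():
    """Run one two-group study; return True if p < .05."""
    a = rng.normal(0.0, 1.0, n_per_group)
    b = rng.normal(true_d, 1.0, n_per_group)
    return stats.ttest_ind(a, b).pvalue < 0.05

power = np.mean([one_study_significant() for _ in range(n_studies)])
print(f"Estimated power of a single study: {power:.2f}")           # ~0.4
print(f"Chance two studies both find the effect: {power**2:.2f}")  # ~0.17, close to the slide's 16%
```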

13 Researcher degrees of freedom lead to inflated false-positive rates
Simmons, Nelson, & Simonsohn (2012)
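
One researcher degree of freedom from the Simmons et al. analysis is optional stopping: peeking at the data and stopping as soon as p < .05. A sketch (hypothetical batch sizes; both groups are drawn from the same null distribution, so every "significant" result is a false positive):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def study_with_peeking(step=10, max_n=100):
    """Collect subjects in batches; test after each batch; stop at first p < .05."""
    a, b = [], []
    while len(a) < max_n:
        a.extend(rng.normal(0, 1, step))   # no true effect in either group
        b.extend(rng.normal(0, 1, step))
        if stats.ttest_ind(a, b).pvalue < 0.05:
            return True                    # "significant" -> stop and publish
    return False

fp_rate = np.mean([study_with_peeking() for _ in range(5_000)])
print(f"False-positive rate with optional stopping: {fp_rate:.2%}")
# Well above the nominal 5% (often around 15-20% with these settings)
```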

14 Selective reporting biases the literature
Selective reporting → outcome reporting bias
● 62% of trials had at least one primary outcome changed, introduced, or omitted
● More than 50% of pre-specified outcomes were not reported
1. Chan, An-Wen, et al. "Empirical evidence for selective reporting of outcomes in randomized trials: comparison of protocols to published articles." JAMA 291.20 (2004): 2457-2465.
2. Macleod, Malcolm R., et al. "Biomedical research: increasing value, reducing waste." The Lancet 383.9912 (2014): 101-104.

15 Why does selective reporting matter?
Selective reporting → outcome reporting bias
Response from a trialist who had analysed data on a prespecified outcome but not reported them:
"When we looked at that data, it actually showed an increase in harm amongst those who got the active treatment, and we ditched it because we weren't expecting it and we were concerned that the presentation of these data would have an impact on people's understanding of the study findings. … The argument was, look, this intervention appears to help people, but if the paper says it may increase harm, that will, it will, be understood differently by, you know, service providers. So we buried it."
Smyth, R. M. D., et al. "Frequency and reasons for outcome reporting bias in clinical trials: interviews with trialists." BMJ 342 (2011): c7153.

16 Solution: Pre-registration
Before data are collected, specify:
● The what of the study
o Research question
o Population
o Primary outcome
o General design
● Pre-analysis plan: information on the exact analysis that will be conducted
o Sample size
o Data processing and cleaning procedures
o Exclusion criteria
o Statistical analyses
● Registered in a read-only, time-stamped format

17 Positive result rate dropped from 57% to 8% after preregistration was required.

18 Pre-registration in the health sciences

19 Evaluating the literature
A p-value is not enough to establish clinical significance
● Missing clinical insight such as treatment effect size, magnitude of change, or direction of the outcome
● Clinically significant differences can be statistically insignificant
● Clinically unimportant differences can be statistically significant
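
Both failure modes are easy to demonstrate with made-up numbers (hypothetical blood-pressure trials, not real data): a clinically trivial 0.5 mmHg difference reaches significance in a huge sample, while a clinically meaningful 8 mmHg difference in a small pilot typically does not:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Case 1: trivially small effect (0.5 mmHg), huge trial -> statistically significant
control = rng.normal(140.0, 15.0, 50_000)
treated = rng.normal(139.5, 15.0, 50_000)
print(f"Large n, tiny effect:  p = {stats.ttest_ind(control, treated).pvalue:.4f}")

# Case 2: clinically meaningful effect (8 mmHg), tiny pilot -> usually p > .05
control = rng.normal(140.0, 15.0, 8)
treated = rng.normal(132.0, 15.0, 8)
print(f"Small n, large effect: p = {stats.ttest_ind(control, treated).pvalue:.4f}")
```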

20 P-values
What is a p-value?
● The probability of getting data at least as extreme as yours if there is no treatment effect
● A significance level of α = 0.05 means there is a 95% probability that the researcher will correctly conclude there is no treatment effect when there really is no treatment effect
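
To make the α = 0.05 statement concrete, a quick sketch: simulate many studies in which the null is true (identical groups). About 5% of p-values fall below .05, so the correct "no effect" conclusion is drawn about 95% of the time:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# 10,000 studies in which the null hypothesis is true (both groups identical)
pvals = np.array([
    stats.ttest_ind(rng.normal(0, 1, 30), rng.normal(0, 1, 30)).pvalue
    for _ in range(10_000)
])
print(f"Share of p < .05 when the null is true: {(pvals < 0.05).mean():.3f}")   # ~0.05
print(f"Share correctly not rejected:           {(pvals >= 0.05).mean():.3f}")  # ~0.95
```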

21 P-values
What is a p-value?
● Generally leads to dichotomous thinking
o Either something is significant or it is not
● Influenced by the number and variability of subjects
● Changes from one sample to the next

22 The dance of the p-values
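
This refers to Geoff Cumming's demonstration that p-values jump around across replications. A few lines reproduce it: every replication below samples from the same population with the same true effect (d = 0.5 and n = 32 per group are assumed values), yet p swings from highly significant to nowhere near significant:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

# Same true effect and same n every time; only random sampling varies
for i in range(10):
    a = rng.normal(0.0, 1.0, 32)
    b = rng.normal(0.5, 1.0, 32)
    p = stats.ttest_ind(a, b).pvalue
    print(f"Replication {i + 1:2d}: p = {p:.3f}")
# Typical output ranges from p < .01 to p > .5 across replications
```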

23 P-values
A p-value is not enough to establish clinical significance
● P-values should be considered along with:
o Effect size
o Confidence intervals
o Power
o Study design

24 Effect size
● A measure of the magnitude of interest; tells us 'how much'
● Generally leads to thinking about estimation, rather than a dichotomous decision about significance
● Often combined with confidence intervals (CIs) to give us a sense of how much uncertainty there is around our estimate
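
A minimal sketch of one common effect-size measure, Cohen's d for two independent groups (pooled-SD version), so that 'how much' gets an actual number:

```python
import numpy as np

def cohens_d(a, b):
    """Cohen's d for two independent groups, using the pooled SD."""
    a, b = np.asarray(a), np.asarray(b)
    pooled_var = (((len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1))
                  / (len(a) + len(b) - 2))
    return (b.mean() - a.mean()) / np.sqrt(pooled_var)

rng = np.random.default_rng(5)
control = rng.normal(0.0, 1.0, 50)
treated = rng.normal(0.5, 1.0, 50)
print(f"Cohen's d = {cohens_d(control, treated):.2f}")  # ~0.5 by construction
```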

25 Confidence intervals
● Provide a 'plausible' range for the effect size in the population
o In 95% of the samples you draw from a population, the interval will contain the true population effect
o This is not the same as saying that 95% of the sample ESs will fall within the interval
● Can also be used for NHST
o If 0 falls outside of the CI, then your test will be statistically significant
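
The 95% statement is a long-run property of the procedure, which a simulation makes concrete: draw many samples from a population with a known mean difference (0.5, an assumed value) and check how often the computed interval contains that true value:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
true_diff = 0.5          # assumed true population mean difference
n = 40                   # per-group sample size
n_sims = 10_000
covered = 0

for _ in range(n_sims):
    a = rng.normal(0.0, 1.0, n)
    b = rng.normal(true_diff, 1.0, n)
    diff = b.mean() - a.mean()
    # pooled-SD standard error of the difference in means
    sp2 = ((n - 1) * a.var(ddof=1) + (n - 1) * b.var(ddof=1)) / (2 * n - 2)
    se = np.sqrt(sp2 * (2 / n))
    t_crit = stats.t.ppf(0.975, df=2 * n - 2)
    if diff - t_crit * se <= true_diff <= diff + t_crit * se:
        covered += 1

print(f"Share of 95% CIs containing the true effect: {covered / n_sims:.3f}")  # ~0.95
```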

26 Better together
● Why should you always report both effect sizes and CIs?
o Effect sizes, like p-values, are bouncy
o A point estimate alone can convey an invalid sense of certainty about your ES
● CIs give you additional information about the plausible upper and lower bounds of bouncing ESs

27 Better together

28 So why use ESs + CIs?
● They give you more fine-grained information about your data
o Point estimates, plausible values, and uncertainty
● They give more information for replication attempts
● They are used for meta-analytic calculations, so they are more helpful for accumulating knowledge across studies

29 Low-powered studies still produce inflated effect sizes
● If I use ESs and CIs rather than p-values, do I still have to worry about sample size? Yes:
o Underpowered studies tend to over-estimate the ES (see the sketch below)
o Larger samples will lead to better estimation of the ES and smaller CIs
o That is, they will have higher levels of precision
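
A sketch of the inflation mechanism (the "significance filter"): fix one modest true effect, run many studies at different sample sizes, and average the effect-size estimates only among the studies that reached p < .05, as a selectively reported literature would. The surviving estimates overshoot the truth, and the smaller the sample, the worse the overshoot (true d = 0.3 is an assumed value):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
true_d = 0.3

for n in (20, 80, 320):                      # per-group sample sizes
    all_ds, sig_ds = [], []
    for _ in range(4_000):
        a = rng.normal(0.0, 1.0, n)
        b = rng.normal(true_d, 1.0, n)
        sp = np.sqrt(((n - 1) * a.var(ddof=1) + (n - 1) * b.var(ddof=1)) / (2 * n - 2))
        d = (b.mean() - a.mean()) / sp       # observed Cohen's d
        all_ds.append(d)
        if stats.ttest_ind(a, b).pvalue < 0.05:
            sig_ds.append(d)                 # the studies that get "published"
    print(f"n={n:4d}  mean d (all studies): {np.mean(all_ds):.2f}   "
          f"mean d (significant only): {np.mean(sig_ds):.2f}")
# Significant-only estimates overshoot the true d = 0.30, worst at small n
```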

30 Precision isn't cheap
● To get high precision (narrow CIs) in any one study, you need large samples
o Example: you need about 250 people to get an accurate, stable estimate of the ES in psychology
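
The general point behind that figure, that CI width shrinks only with the square root of the sample size, so precision gets expensive fast, can be shown with a back-of-the-envelope calculation for a difference in means with SD = 1 (assumed values):

```python
import numpy as np
from scipy import stats

# The 95% CI half-width for a mean difference shrinks with sqrt(n):
# quadrupling the sample only halves the interval.
sd = 1.0
for n in (25, 100, 400, 1600):           # per-group sample size
    se = sd * np.sqrt(2 / n)             # SE of the difference in means
    half_width = stats.norm.ppf(0.975) * se
    print(f"n per group = {n:5d}   95% CI half-width = ±{half_width:.3f}")
```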

31 Precision isn’t cheap

32 Free training on how to make research more reproducible http://cos.io/stats_consulting

33 Find this presentation at https://osf.io/rwtyf/ Questions: contact@cos.io

