Download presentation

Presentation is loading. Please wait.

Published byDavid Dawson Modified about 1 year ago

1
CONTACT Dr Sean Williams | DEPARTMENT FOR HEALTH Sport, Health and Exercise Science –Sean Williams MAGNITUDE-BASED INFERENCES An alternative to hypothesis testing

2
2 Magnitude-based inferences CONTACT Dr Sean Williams | Lecture outline Limitations of null hypothesis significance testing (NHST) Confidence intervals Magnitude-based inferences (MBI) Smallest worthwhile effects Limitations of magnitude-based inferences

3
3 Magnitude-based inferences CONTACT Dr Sean Williams | Null hypothesis significance testing A major aim of research is to make an inference about an effect in a population based on study of a sample. Null-hypothesis testing via the P-value and statistical significance is the traditional approach to making an inference. P-value is the probability of obtaining the observed result, or more extreme results, if the null hypothesis is true. 0 Probability Most likely observation Set of possible results (under the null hypothesis) One-sided p-value Observed data point

4
4 Magnitude-based inferences CONTACT Dr Sean Williams | Limitations of NHST P-values are difficult to understand! P ≤ 0.05 is arbitrary “Surely God loves the 0.06 nearly as much as 0.05.” (Rosnow and Rosenthal, 1989) Some useful effects aren’t statistically significant Some statistically significant effects aren’t useful P > 0.05 often interpreted as unpublishable So good data don’t get published Ignores ‘judgement’, leads to dichotomised thinking Statistical Hypothesis Inference Testing?

5
5 Magnitude-based inferences CONTACT Dr Sean Williams | Limitations of NHST

6
6 Magnitude-based inferences CONTACT Dr Sean Williams | Limitations of NHST

7
7 Magnitude-based inferences CONTACT Dr Sean Williams | Dance of the p-values *** <0.001Very highly significant!!! ** <0.01Highly significant!! * <0.05Significant (phew) ? 0.05 – 0.10“Approaching significance” >0.10Non-significant https://www.youtube.com/watch?v=ez4DgdurRPg

8
8 Magnitude-based inferences CONTACT Dr Sean Williams | A range within which we infer the true, population or large sample value is likely to fall. Likely is usually a probability of 0.95 (for 95% limits). Representation of the limits as a confidence interval : Area = 0.95 upper likely limitlower likely limit observed value probability value of effect statistic 0 positivenegative probability distribution of true value, given the observed value value of effect statistic 0 positivenegative likely range of true value Confidence intervals

9
9 Magnitude-based inferences CONTACT Dr Sean Williams | Confidence intervals (CI) To calculate a confidence interval you need: Confidence level (usually 95%, but can use 90% or 99%) Statistic (e.g. group mean) Margin of error (Critical value x Standard error of the statistic) z-score or t-score CI = [M – MOE, M + MOE] CI = [M – Critical value x SE, M + Critical value x SE] EXAMPLE: You test the vertical jump heights of 100 athletes. The mean and standard deviation of the sample was 60±20 cm. The 95% CIs for this mean would be: Lower CI = 60 – 1.96 x 20/√100 Upper CI = x 20/√100 Lower CI = 60 – 3.92 Upper CI = % CI = 56 to 64 cm

10
10 Magnitude-based inferences CONTACT Dr Sean Williams | Confidence intervals (CI) Confidence intervals also convey the precision of an estimate Wider confidence interval = less precision In the previous example, a smaller sample size (e.g. n=20) would have given less precision for our estimate… Lower CI = 60 – 1.96 x 20/√20 Upper CI = x 20/√20 Lower CI = 60 – 8.77 Upper CI = % CI = 51 to 69 cm

11
11 Magnitude-based inferences CONTACT Dr Sean Williams | Recap P-value = The probability of obtaining the observed result, or more extreme results, if the null hypothesis is true. ‘NHST’ has several limitations, namely, it leads to dichotomised thinking and does not tell us if the effect is important/worthwhile. Confidence intervals tell us the likely range of the true (population) value. It could be red! Confidence intervals also convey precision of our estimate Larger sample size and/or more consistent response = Smaller confidence interval & more precision.

12
12 Magnitude-based inferences CONTACT Dr Sean Williams | For magnitude-based inferences, we interpret confidence limits in relation to the smallest clinically beneficial and harmful effects. These are usually equal and opposite in sign. Harm is the opposite of benefit, not side effects. They define regions of beneficial, trivial, and harmful values: All you need is these two things: the confidence interval and a sense of what is important (e.g., beneficial and harmful). trivial beneficial smallest clinically harmful effect smallest clinically beneficial effect value of effect statistic 0 positivenegative harmful Magnitude-based inferences

13
13 Magnitude-based inferences CONTACT Dr Sean Williams | Put the confidence interval and these regions together to make a decision about clinically significant, clear or decisive effects. 0 value of effect statistic positivenegative trivial harmful beneficial Use it. Yes Use it. Yes Depends No Don’t use it. Yes Don’t use it. No Don’t use it. No Don’t use it. Yes Don’t use it. Yes Unclear: need more research. No MBI Statistically significant? Why hypothesis testing can be unethical and impractical! Use it. No Magnitude-based inferences

14
14 Magnitude-based inferences CONTACT Dr Sean Williams | We calculate probabilities that the true effect could be clinically beneficial, trivial, or harmful ( P beneficial, P trivial, P harmful ). Spreadsheets available at: sportsci.org The Ps allow a more detailed call on magnitude, as follows… smallest harmful value P harmful = 0.05 P trivial = 0.15 probability distribution of true value smallest beneficial value P beneficial = probability value of effect statistic positivenegative observed value Magnitude-based inferences

15
15 Magnitude-based inferences CONTACT Dr Sean Williams | Making a more detailed call on magnitudes using chances of benefit and harm. 0/0/100 Most likely beneficial 0 value of effect statistic positivenegative trivial harmful beneficial 0/7/93 Likely beneficial 1/59/40 0/97/3 2/94/4 28/70/2 74/26/0 Possibly harmful 9/60/31Unclear 2/33/65 Chances (%) that the effect is harmful / trivial / beneficial Possibly beneficial Clinical: unclear Risk of harm >0.5% is unacceptable, unless chance of benefit is high enough. Very likely trivial Likely trivial Possibly harmful 97/3/0 Very likely harmful Magnitude-based inferences Mechanistic: Possibly trivial

16
16 Magnitude-based inferences CONTACT Dr Sean Williams | Use this table for the plain-language version of chances: An effect should be almost certainly not harmful ( 25%) before you decide to use it. But you can tolerate higher chances of harm if chances of benefit are much higher: e.g., 3% harm and 76% benefit = clearly useful. Use an odds ratio of benefit/harm of >66 in such situations. The effect… beneficial/trivial/harmful is almost certainly not… Probability <0.005 Chances <0.5% Odds <1:199 is very unlikely to be… 0.005– –5%1:999–1:19 is unlikely to be…, is probably not… 0.05–0.255–25%1:19–1:3 is possibly (not)…, may (not) be… 0.25–0.7525–75%1:3–3:1 is likely to be…, is probably… is very likely to be… is almost certainly… 0.75– –0.995 > –95% 95–99.5% >99.5% 3:1–19:1 19:1–199:1 >199:1 Magnitude-based inferences

17
17 Magnitude-based inferences CONTACT Dr Sean Williams | P value 0.03 value of statistic 1.5 Conf. level (%) 90 deg. of freedom 18 positivenegative 1 threshold values for clinical chances Confidence limits lowerupper prob (%)odds 783:1 likely, probable clinically positive Chances (% or odds) that the true value of the statistic is783:1 likely, probable 191:4 unlikely, probably not 31:30 very unlikely prob (%)odds 221:3 unlikely, probably not clinically trivial prob (%)odds 01:2071 almost certainly not clinically negative Both these effects are clinically decisive, clear, or significant. Two examples of use of the spreadsheet for clinical chances:

18
18 Magnitude-based inferences CONTACT Dr Sean Williams | How to Publish Clinical Chances Example of a table from a randomized controlled trial: Mean improvement (%) and 90% confidence limits 3.1; ± ; ±1.2Very likely beneficial 0.5; ±1.4Unclear Compared groups Slow - control Explosive - control Slow - explosive Qualitative outcome a Almost certainly beneficial a with reference to a smallest worthwhile change of 0.5%. TABLE 1–Differences in improvements in kayaking sprint speed between slow, explosive and control training groups.

19
19 Magnitude-based inferences CONTACT Dr Sean Williams | Recap P-value = The probability of obtaining the observed result, or more extreme results, if the null hypothesis is true. NHST has several limitations, namely, it does not tell us if the effect is important/worthwhile. Confidence intervals tell us the likely range of the true (population) value. It could be red! Confidence intervals also convey precision of our estimate Larger sample size and/or more consistent response = Smaller confidence interval & more precision.

20
20 Magnitude-based inferences CONTACT Dr Sean Williams | Recap For magnitude-based inferences, we interpret confidence limits in relation to the smallest clinically beneficial and harmful effects. Spreadsheets at sportsci.org provide the % likelihood that an effect is harmful | trivial | beneficial. Effects that cross thresholds for benefit and harm are classed as unclear. An effect should be almost certainly not harmful ( 25%) before you decide to use it. But you can tolerate higher chances of harm if chances of benefit are much higher: e.g., 3% harm and 76% benefit = clearly useful. Use an odds ratio of benefit/harm of >66 in such situations.

21
21 Magnitude-based inferences CONTACT Dr Sean Williams | Problem: what's the smallest clinically important effect? “If you can't answer this question, quit the field”. This problem applies also with hypothesis testing, because it determines sample size you need to test the null properly. 0.3 of a CV gives a top athlete one extra medal every 10 races. This is the smallest important change in performance to aim for in research on, or intended for, elite athletes. 0.9, 1.6, 2.5, 4.0 of a CV gives an extra 3, 5, 7, 9 medals per 10 races (thresholds for moderate, large, very large, extremely large effecs). References: Hopkins et al. MSSE 31, , 1999 and MSSE 41, 3- 12, Smallest worthwhile difference?

22
22 Magnitude-based inferences CONTACT Dr Sean Williams | Smallest worthwhile difference? The default for most other populations and effects is Cohen's set of smallest values. You express the difference or change in the mean as a fraction of the between-subject standard deviation ( mean/SD). It's like a z score or a t statistic. In a controlled trial, it's the SD of all subjects in the pre-test, not the SD of the change scores. The smallest worthwhile difference or change is is equivalent to moving from the 50th to the 58th percentile.

23
23 Magnitude-based inferences CONTACT Dr Sean Williams | Example: The effect of a treatment on strength strength post pre Trivial effect (0.1x SD) strength post pre Very large effect (3x SD) Interpretation of standardised difference or change in means: Cohen < 0.2 Hopkins < > ? trivial small moderate large very large ?>4.0extremely large

24
24 Magnitude-based inferences CONTACT Dr Sean Williams | Relationship of standardised effect to difference or change in percentile: strength area = 50% athlete on 50th percentile strength Standardised effect = 0.20 athlete on 58th percentile area = 58% 98 Standardised effect 0.20 Percentile change 50 60 Smallest worthwhile difference?

25
25 Magnitude-based inferences CONTACT Dr Sean Williams | Smallest worthwhile difference? TrivialSmallModerateLargeVery largeNearly perfect Perfect Correlation Diff. in means Infinite Freq. diff Rel. risk Infinite Odds ratio infinite

26
26 Magnitude-based inferences CONTACT Dr Sean Williams | Problem: these new approaches are not yet mainstream. Confidence limits at least are coming in, so look for and interpret the importance of the lower and upper limits. You can use a spreadsheet to convert a published P value into a more meaningful magnitude-based inference. If the authors state “P<0.05” you can’t do it properly. If they state “P>0.05” or “NS”, you can’t do it at all. More difficult to present and discuss results? ‘Magnitude-based inferences under attack’ Limitations of magnitude-based inferences

27
27 Magnitude-based inferences CONTACT Dr Sean Williams | Summary MBI’s are an alternative to traditional NHST. For magnitude-based inferences, we interpret confidence limits in relation to the smallest clinically beneficial and harmful effects. Smallest worthwhile effects may be based on variability of performance (e.g. 0.3 of CV). Or standardised effects may be used (e.g. Cohen’s D). Spreadsheets available at sportsci.org to carry out MBI’s. Growing in popularity, but still not understood/accepted by many journals and academics. Confidence intervals convey far more information than a P-value alone, and should be presented where possible.

28
28 Magnitude-based inferences CONTACT Dr Sean Williams | Recommended Reading Simulation software: developmental-psychology/esci Batterham, A. M. & Hopkins, W. G. (2006) Making meaningful inferences about magnitudes. International Journal of Sports Physiology and Performance, 1, Hopkins, W., Marshall, S., Batterham, A. & Hanin, J. (2009) Progressive statistics for studies in sports medicine and exercise science. Medicine and Science in Sports and Exercise, 41,

Similar presentations

© 2016 SlidePlayer.com Inc.

All rights reserved.

Ads by Google