Random error, Confidence intervals and p-values. Simon Thornley.


1 Random error, Confidence intervals and p-values. Simon Thornley

2

3 Overview A simulated, repeated epidemiological study to understand: random error; confidence intervals; p-values

4 What is random error? Only one type of error in epidemiological studies – Others? We rarely have whole population, so rely on sample Who is picked? Element of chance? Assumption of representative study?

5 10 × 20-sided dice Rolling dice = outcome, e.g. diarrhoea. 10 rolls exposed; 10 rolls unexposed. Roll dice 20 times each; roll >5 = diarrhoea in exposed; roll >15 = diarrhoea in unexposed

6 1 roll = 1 participant. Exposed (10 rolls): >5 = diarrhoea, ≤5 = no diarrhoea. Unexposed (10 rolls): >15 = diarrhoea, ≤15 = no diarrhoea.

            Diarrhoea   No diarrhoea
Exposed     a           b
Unexposed   c           d
Total       a+c         b+d

7 What is Risk/odds ratio? Assuming the dice are fair, calculate the true risk ratio and odds ratio of the outcome.

8 What is the true odds ratio? Assume the dice are unbiased. Risk in exposed? Risk in unexposed? Odds in exposed? Odds in unexposed?

9 What is Risk/odds ratio? Risk in exposed is 15/20 = 3/4; odds in exposed is 15/5 = 3. Risk in unexposed is 5/20 = 1/4; odds in unexposed is 5/15 = 1/3. Risk ratio (relative risk) = (3/4)/(1/4) = 3. Odds ratio = 3/(1/3) = 9. *Note: what is the theoretical upper limit of the RR?
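The true values on this slide can be checked by counting die faces. The sketch below assumes a fair die numbered 1 to 20 and the "greater than" thresholds from the earlier slide; the function names are illustrative only.

```python
from fractions import Fraction

SIDES = 20  # assumption: a fair die with faces numbered 1..20


def risk_above(threshold, sides=SIDES):
    """Probability that a fair die shows a face strictly greater than threshold."""
    favourable = sum(1 for face in range(1, sides + 1) if face > threshold)
    return Fraction(favourable, sides)


def odds(risk):
    """Convert a risk (probability) into odds."""
    return risk / (1 - risk)


risk_exposed = risk_above(5)     # diarrhoea if the roll is > 5
risk_unexposed = risk_above(15)  # diarrhoea if the roll is > 15

risk_ratio = risk_exposed / risk_unexposed
odds_ratio = odds(risk_exposed) / odds(risk_unexposed)

print(risk_exposed, risk_unexposed)  # 3/4 1/4
print(risk_ratio, odds_ratio)        # 3 9
```

Exact fractions are used so that the results match the hand calculation without rounding.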

10 Aggregate outcomes Rolling dice = outcome, e.g. diarrhoea. 10 students exposed; 5 unexposed. Roll dice 40 times each; roll >5 = diarrhoea in exposed; roll >15 = diarrhoea in unexposed

11 Every pair: Two by two table

            Diarrhoea   No diarrhoea
Exposed     a           b
Unexposed   c           d
Total       a+c         b+d

12 In pairs Calculate the risk ratio, odds ratio and 95% confidence interval for the odds ratio. $$\text{Relative risk} = \frac{\text{incidence in exposed}}{\text{incidence in unexposed}} = \frac{a/(a+b)}{c/(c+d)}$$

13 Odds ratio $$\text{odds ratio} = \frac{\text{odds of exposure in cases}}{\text{odds of exposure in controls}} = \frac{a \times d}{b \times c}$$
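The cross-product a×d / b×c gives the same number whether the odds ratio is framed as odds of exposure (cases vs. controls) or as odds of disease (exposed vs. unexposed). A minimal sketch, using made-up counts chosen only for illustration:

```python
from fractions import Fraction


def odds_ratio_cross_product(a, b, c, d):
    """Odds ratio as the cross-product a*d / (b*c)."""
    return Fraction(a * d, b * c)


def odds_ratio_of_exposure(a, b, c, d):
    """Odds of exposure in cases (a/c) over odds of exposure in non-cases (b/d)."""
    return Fraction(a, c) / Fraction(b, d)


def odds_ratio_of_disease(a, b, c, d):
    """Odds of disease in exposed (a/b) over odds of disease in unexposed (c/d)."""
    return Fraction(a, b) / Fraction(c, d)


def risk_ratio(a, b, c, d):
    """Risk in exposed a/(a+b) over risk in unexposed c/(c+d)."""
    return Fraction(a, a + b) / Fraction(c, c + d)


# Hypothetical 2x2 table (not from the slides): a, b = exposed; c, d = unexposed.
a, b, c, d = 8, 2, 3, 7

print(odds_ratio_cross_product(a, b, c, d))  # 28/3
print(odds_ratio_of_exposure(a, b, c, d))    # 28/3
print(odds_ratio_of_disease(a, b, c, d))     # 28/3
print(risk_ratio(a, b, c, d))                # 8/3
```

All cells must be non-zero for these ratios to be defined.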

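14 95% confidence interval for odds ratio $\text{se}(\log \text{OR}) = \sqrt{1/a + 1/b + 1/c + 1/d}$ (natural log). Error factor: $\text{EF} = \exp[1.96 \times \text{se}(\log \text{OR})]$. 95% CI: $\text{OR}/\text{EF}$ to $\text{OR} \times \text{EF}$. Calculate the z statistic, $z = \log(\text{OR})/\text{se}(\log \text{OR})$, and use tables to convert it to a p-value.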

15 My example

            Diarrhoea   No diarrhoea
Exposed     6           4
Unexposed   2           8
Total       8           12

OR = (6/4)/(2/8) = (6×8)/(2×4) = 48/8 = 6
RR = (6/[6+4])/(2/[2+8]) = 0.6/0.2 = 3
SE(log OR) = √[1/6 + 1/4 + 1/2 + 1/8] = 1.02
Error factor = exp[1.96 × 1.02] = 7.39
95% CI = 6/7.39 to 6 × 7.39 = 0.81 to 44.4
z-statistic = log(OR)/SE(log OR) = 1.76; P = 0.08
True odds ratio is... 9. Does the calculated 95% confidence interval contain the true value?
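The worked example can be reproduced in a few lines. This is a sketch of the error-factor method as described on the slides; the function name and the returned dictionary keys are illustrative, and the two-sided p-value is taken from the standard normal distribution via `math.erfc` instead of printed tables.

```python
import math


def odds_ratio_summary(a, b, c, d, z_crit=1.96):
    """Odds ratio, error-factor 95% CI, z statistic and two-sided p-value.

    Assumes all four cells are non-zero.
    """
    or_hat = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    error_factor = math.exp(z_crit * se_log_or)
    z = math.log(or_hat) / se_log_or
    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return {
        "or": or_hat,
        "se_log_or": se_log_or,
        "error_factor": error_factor,
        "ci_lower": or_hat / error_factor,
        "ci_upper": or_hat * error_factor,
        "z": z,
        "p": p_value,
    }


result = odds_ratio_summary(6, 4, 2, 8)
print(f"OR = {result['or']:.2f}")                     # OR = 6.00
print(f"SE(log OR) = {result['se_log_or']:.2f}")      # SE(log OR) = 1.02
print(f"Error factor = {result['error_factor']:.2f}") # Error factor = 7.39
print(f"95% CI = {result['ci_lower']:.2f} to {result['ci_upper']:.1f}")
print(f"z = {result['z']:.2f}, p = {result['p']:.2f}")
```

The interval is wide because each cell holds only a handful of participants, so 1/a + 1/b + 1/c + 1/d is large.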

16 In pairs Calculate the risk ratio, odds ratio and 95% confidence interval for the odds ratio. $$\text{Relative risk} = \frac{\text{incidence in exposed}}{\text{incidence in unexposed}} = \frac{a/(a+b)}{c/(c+d)}$$

17 No significant diff. What does this mean? What is the “null” hypothesis? What is the “alternate” hypothesis?

18 No significant diff. Null hypothesis: P(Diarrhoea|Exposed) = P(Diarrhoea|Unexposed) = 0.5, or equivalently P(D|E) − P(D|no E) = 0. Alternate hypothesis: P(D|E) − P(D|no E) = delta

19 Errors in hypothesis testing

Null hypothesis      Test result
(no difference)      Accept null     Reject null
True                 OK              Type 1 error
False                Type 2 error    OK

20 Sample size n = sample size in each group; π0 = risk in unexposed (1/4); π1 = risk in exposed (3/4); u = one-sided percentage point of the normal distribution corresponding to 100% − power (e.g. 10%: u = 1.28); v = percentage point of the normal distribution corresponding to the required two-sided significance level (e.g. 5%: v = 1.96)

21 [Figure: sampling distributions under H0: π1 − π_null = 0 (at 0.50) and H_Alt: π1 − π_null = 0.25 (at 0.75), with tail areas labelled u (~10%) and v/2, v/2 (~5%).]

22 Overlap

23 Increase sample by 2

24 Increase sample by 10

25 Our example... n=18.8 per group or 38 total We were unlucky not to find a significant difference between the two samples

26 Simpler formula... n=16 per group or 32 total
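The sample-size formulas themselves did not survive in this transcript. The sketch below assumes the standard formulas for comparing two proportions, written with the symbols defined on the sample size slide (π0, π1, u, v); they are an assumption, but they do reproduce the quoted figures of 18.8 and 16 per group.

```python
import math


def sample_size_full(pi0, pi1, u=1.28, v=1.96):
    """Assumed full formula (not shown in the transcript):

    n = [u*sqrt(pi1(1-pi1) + pi0(1-pi0)) + v*sqrt(2*pbar(1-pbar))]^2 / (pi1-pi0)^2
    where pbar is the average of pi0 and pi1.
    """
    pi_bar = (pi0 + pi1) / 2
    term_power = u * math.sqrt(pi1 * (1 - pi1) + pi0 * (1 - pi0))
    term_alpha = v * math.sqrt(2 * pi_bar * (1 - pi_bar))
    return (term_power + term_alpha) ** 2 / (pi1 - pi0) ** 2


def sample_size_simple(pi0, pi1, u=1.28, v=1.96):
    """Assumed simpler formula: n = (u+v)^2 [pi1(1-pi1) + pi0(1-pi0)] / (pi1-pi0)^2."""
    return (u + v) ** 2 * (pi1 * (1 - pi1) + pi0 * (1 - pi0)) / (pi1 - pi0) ** 2


n_full = sample_size_full(0.25, 0.75)      # 90% power, 5% two-sided significance
n_simple = sample_size_simple(0.25, 0.75)

print(f"full formula:   n = {n_full:.1f} per group")    # 18.8
print(f"simple formula: n = {n_simple:.1f} per group")  # 15.7
print("rounded up:", math.ceil(n_full), "and", math.ceil(n_simple), "per group")
```

Rounding up gives 19 per group (38 in total) for the full formula and 16 per group (32 in total) for the simpler one.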

27 Sample size vs power

28 What about loss to follow up? In most studies, we anticipate that some people won’t contribute data because they are lost to follow-up. High-risk groups include people with addictions, smokers, weight-loss participants, etc. Adjustment factor for x% loss to follow-up = 100/(100 − x). E.g. x = 20%; adjustment factor = 1.25
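The adjustment is a single multiplication. A minimal sketch follows; the function name is illustrative, and the 19-per-group starting point is the rounded-up figure from the earlier sample size slide.

```python
import math


def inflate_for_loss(n, loss_percent):
    """Inflate a sample size n to allow for loss_percent % loss to follow-up."""
    if not 0 <= loss_percent < 100:
        raise ValueError("loss_percent must be in the range [0, 100)")
    return n * 100 / (100 - loss_percent)


print(inflate_for_loss(1, 20))              # 1.25 (the adjustment factor itself)
print(math.ceil(inflate_for_loss(19, 20)))  # 24 per group to recruit
```

With 20% loss, recruiting 24 per group leaves about 19 per group contributing data.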

29 5 different effect measures We have 10 repetitions of the study; How many of these 95% confidence intervals would you expect to contain the true value? How many studies draw the incorrect conclusion of no difference between the treatment and no treatment groups using p-values?

30 2 different approaches Yes/No → ‘p-value’. What is the true difference? → ‘95% confidence interval’

31 Confidence Interval Asks “what is the true difference?”, cf. the p-value, which concerns the probability that we are making the right decision. A 95% confidence interval is not: “a 95% probability that the true effect lies within the confidence interval”. It is: “if we repeated the experiment over and over, and calculated a 95% confidence interval each time, then on average 19/20 of the intervals would contain the true effect”. [Diagram: Population → Sample → Confidence interval]
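The repeated-experiment interpretation can be checked by simulating the dice study many times and counting how often the interval contains the true odds ratio of 9. This is a rough sketch under stated assumptions: 10 participants per group, true risks of 0.75 and 0.25, and 0.5 added to every cell whenever a zero cell occurs (a common continuity fix that the slides do not mention).

```python
import math
import random

TRUE_OR = 9.0  # from risks of 0.75 (exposed) and 0.25 (unexposed)


def simulate_study(rng, n_per_group=10, risk_exposed=0.75, risk_unexposed=0.25):
    """Simulate one study and return the 2x2 cell counts a, b, c, d."""
    a = sum(rng.random() < risk_exposed for _ in range(n_per_group))
    c = sum(rng.random() < risk_unexposed for _ in range(n_per_group))
    return a, n_per_group - a, c, n_per_group - c


def odds_ratio_ci(a, b, c, d, z_crit=1.96):
    """95% CI for the odds ratio by the error-factor method."""
    if 0 in (a, b, c, d):
        # Assumption: add 0.5 to every cell so the odds ratio stays defined.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    or_hat = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    error_factor = math.exp(z_crit * se_log_or)
    return or_hat / error_factor, or_hat * error_factor


def coverage(n_studies=2000, seed=1):
    """Fraction of simulated studies whose 95% CI contains the true odds ratio."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        lower, upper = odds_ratio_ci(*simulate_study(rng))
        hits += lower <= TRUE_OR <= upper
    return hits / n_studies


print(f"Proportion of intervals containing the true OR: {coverage():.3f}")
```

With samples this small the interval is only approximate, so the observed proportion need not be exactly 0.95; it is the long-run behaviour over many repetitions that the "95%" describes, not any single interval.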

32 Confidence Interval If 95% confidence interval does not contain 0 (difference between means) or 1 (ratio effect measures) the null hypothesis can be rejected. Increasing popularity over p-values...

33 P-value Is the observed difference true, or due to sampling error (chance)? Can be used to give a ‘Yes’/‘No’ answer about the ‘null’ hypothesis, e.g. no effect of parental smoking on smoking initiation. P-value = probability of getting results at least as extreme as those observed by chance alone, if the null hypothesis (no difference) is true. It is NOT (a common misinterpretation) the probability that the null hypothesis is true. Does not take into account statistical power; ignores type 2 error (accepting the null when the null is false). If the P-value is low (e.g. p<0.05): chance is an improbable explanation for the result; reject the null hypothesis (i.e. that exposure doesn’t affect the risk of diarrhoea).

34 Problems with P-values Imagine a study in which everyone is given two treatments. At follow-up: which drug do you prefer?

Total        Preference (treatment : placebo)   % preferring treatment   Two-sided p-value
20           15 : 5                             75%                      0.04
200          115 : 85                           57.50%                   0.04
2000         1046 : 954                         52.30%                   0.04
2 000 000    1 001 445 : 998 555                50.07%                   0.04
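The point of the table is that very different effect sizes can share the same p-value. The slide does not say which test was used; the sketch below assumes a sign test (null hypothesis: 50% prefer each treatment) with a normal approximation and continuity correction, which gives about 0.04 in every row.

```python
import math


def two_sided_p(n_total, n_prefer_treatment):
    """Two-sided p-value for H0: half of participants prefer the treatment.

    Assumption: normal approximation to the binomial with a 0.5
    continuity correction.
    """
    expected = n_total / 2
    sd = math.sqrt(n_total) / 2  # sqrt(n * 0.5 * 0.5)
    z = max(abs(n_prefer_treatment - expected) - 0.5, 0.0) / sd
    return math.erfc(z / math.sqrt(2))


# (total participants, number preferring the treatment)
rows = [(20, 15), (200, 115), (2000, 1046), (2_000_000, 1_001_445)]

for n_total, n_prefer in rows:
    percent = 100 * n_prefer / n_total
    p = two_sided_p(n_total, n_prefer)
    print(f"{n_total:>9}  {percent:6.2f}%  p = {p:.2f}")
```

The preference falls from 75% to barely over 50% while the p-value stays put, which is why a p-value alone says little about the size of an effect.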

35 Summary Confidence interval: range of values within which we are reasonably confident that the population measure lies. P-value: strength of evidence against the null hypothesis that the true difference in the population is zero.

