Significance testing. Ioannis Karagiannis (based on previous EPIET material). 18th EPIET/EUPHEM Introductory course, 28.09.2012.


1 Significance testing
Ioannis Karagiannis (based on previous EPIET material)
18th EPIET/EUPHEM Introductory course

2 The idea of statistical inference
Diagram labels: Sample; Population; Conclusions based on the sample; Generalisation to the population; Hypotheses

3 Inferential statistics
Uses patterns in the sample data to draw inferences about the population represented, accounting for randomness.
Two basic approaches:
– Hypothesis testing
– Estimation
Common goal: to draw a conclusion about the effect of an independent variable on a dependent variable.

4 The aim of a statistical test
To reach a deterministic decision (yes or no) about observed data on a probabilistic basis.

5 Why significance testing?
Norovirus outbreak on a Greek island: the risk of illness was higher among people who ate raw seafood (RR = 21.5).
Is the association due to chance?

6 The two hypotheses
– Null Hypothesis (H0): there is NO difference between the two groups (= no effect), e.g. RR = 1
– Alternative Hypothesis (H1): there is a difference between the two groups (= there is an effect), e.g. RR = 21.5
When you perform a test of statistical significance, you reject or do not reject the Null Hypothesis (H0).

7 Norovirus on a Greek island
– Null hypothesis (H0): there is no association between consumption of raw seafood and illness.
– Alternative hypothesis (H1): there is an association between consumption of raw seafood and illness.

8 Hypothesis testing
Tests of statistical significance:
– Data are not consistent with H0: H0 can be rejected in favour of some alternative hypothesis H1 (the objective of our study).
– Data are consistent with H0: H0 cannot be rejected.
You cannot say that H0 is true. You can only decide to reject it or not to reject it.

9 p value
p value = probability that our result (e.g. a difference between proportions or an RR), or more extreme values, could be observed under the null hypothesis.
The decision to reject H0 is based on the reported p value.
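
As a hedged formalisation of the definition above: assume a test statistic T for which larger values count as "more extreme", and write t_obs for the value computed from the sample. Both symbols are notation introduced here, not taken from the slide.

```latex
p = \Pr\left(T \ge t_{\mathrm{obs}} \mid H_0\right)
```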

10 p values – practicalities
Low p values = low degree of compatibility between H0 and the observed data:
– the association is unlikely to be due to chance
– you reject H0; the test is significant
High p values = high degree of compatibility between H0 and the observed data:
– the association is likely to be due to chance
– you do not reject H0; the test is not significant

11 Levels of significance – practicalities
We need a cut-off: 1%, 5% or 10%.
– p value > 0.05: H0 not rejected (non-significant)
– p value ≤ 0.05: H0 rejected (significant)
BUT: always give the exact p value rather than "significant" vs. "non-significant".

12 Examples from the literature
– "The limit for statistical significance was set at p=0.05."
– "There was a strong relationship (p<0.001)."
– "…, but it did not reach statistical significance (ns)."
– "The relationship was statistically significant (p=0.0361)."
p = 0.05 is an agreed convention, not an absolute truth.
"Surely, God loves the 0.06 nearly as much as the 0.05" (Rosnow and Rosenthal, 1991)

13 p = 0.05 and its errors
Level of significance, usually p = 0.05; the p value is used for decision making.
But there are still 2 possible errors:
– H0 should not be rejected, but it was rejected: Type I or alpha error
– H0 should be rejected, but it was not rejected: Type II or beta error

14 Types of errors
Decision based on the p value, compared with the truth:
– Truth is "no difference" (H0 is true) but H0 is rejected: Type I or α error
– Truth is "difference" (H0 is false) but H0 is not rejected: Type II or β error

15 More on errors
Probability of Type I error (α):
– Value of α is determined in advance of the test
– The significance level is the level of α error that we would accept (usually 0.05)
Probability of Type II error (β):
– Value of β depends on the size of the effect (e.g. RR, OR) and the sample size
– 1 − β: statistical power of a study to detect an effect of a specified size (e.g. 0.80)
– Fix β in advance: choose an appropriate sample size
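
The two error probabilities can be illustrated with a small simulation. The sketch below is not from the lecture: the risks (30% and 50%), the group size of 100, the number of simulated studies and the use of a pooled two-sample z test of proportions are all assumptions chosen for illustration. When H0 is true, the share of simulated studies that reject H0 should land near α; when H0 is false, the same share estimates the power 1 − β.

```python
import math
import random


def two_proportion_p_value(x1, n1, x2, n2):
    """Two-sided p value from a pooled two-sample z test of proportions."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0  # no variation at all: no evidence against H0
    z = (x1 / n1 - x2 / n2) / se
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


def rejection_rate(risk1, risk2, n, alpha=0.05, n_sim=2000, seed=1):
    """Share of simulated studies in which H0 is rejected at level alpha."""
    rng = random.Random(seed)
    rejected = 0
    for _ in range(n_sim):
        x1 = sum(rng.random() < risk1 for _ in range(n))
        x2 = sum(rng.random() < risk2 for _ in range(n))
        if two_proportion_p_value(x1, n, x2, n) <= alpha:
            rejected += 1
    return rejected / n_sim


# Hypothetical scenarios, chosen only for illustration.
type_i_rate = rejection_rate(0.30, 0.30, n=100)  # H0 true: estimates alpha
power = rejection_rate(0.30, 0.50, n=100)        # H0 false: estimates 1 - beta
print(f"Type I error rate: {type_i_rate:.3f}")
print(f"Power: {power:.3f}")
```

Re-running `rejection_rate` with a larger `n` shows how sample size raises power for a fixed effect size, which is the sense in which β is fixed in advance.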

16 Quantifying the association
Test of association of exposure and outcome, e.g. chi-square test or Fisher's exact test.
Comparison of proportions.
The chi-square value quantifies the association: the larger the chi-square value, the smaller the p value, i.e. the more the observed data deviate from the assumption of independence (no effect).

17 Chi-square value
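
The content of this slide did not survive in the transcript. For reference, the standard Pearson chi-square statistic, which matches the per-cell terms of the form (observed − expected)²/expected used on the calculation slides, can be written as follows. The symbols O, E, R, C and n are notation introduced here: observed count, expected count, row total, column total and grand total.

```latex
\chi^2 = \sum_{i}\sum_{j} \frac{\left(O_{ij} - E_{ij}\right)^2}{E_{ij}},
\qquad
E_{ij} = \frac{R_i \, C_j}{n}
```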

18 Norovirus on a Greek island
2x2 table: exposure (raw seafood / no raw seafood) by outcome (ill / non-ill); the cell counts are not legible here.
Expected proportion of ill and non-ill: 19% ill, 81% non-ill.
Expected number of ill and non-ill for each cell: row total x 19% ill; row total x 81% non-ill.

19 Chi-square calculation
(observed - expected)² / expected for each cell:
                  Ill            Non ill
Raw seafood       (29-6)²/6      (9-31)²/31
No raw seafood    (5-27)²/27     ( )²/114
χ² = 125, p < 0.001

20 Norovirus on a Greek island
"The attack rate of illness among consumers of raw seafood was 21.5 times higher than among non-consumers of these food items (p<0.001)."
The p value is smaller than the chosen significance level of α = 5%. The null hypothesis is rejected.
There is a probability of less than 0.001 (<1/1000) that the observed association could have occurred by chance, if there were no true association between eating imported raw seafood and illness.

21 C2012 vs facilitators
The ultimate (eye) test.
– H0: the proportion of facilitators wearing glasses during the Tuesday morning sessions was equal to the proportion of fellows wearing glasses.
– H1: the above proportions were different.

22 C2012 vs facilitators
2x2 table (observed counts):
               Glasses   No glasses
Fellow         11        27
Facilitator    6         8
Expected proportion with and without glasses: 33% with glasses (+ve), 67% without (-ve).
Expected number for each cell: row total x 33% +ve; row total x 67% -ve.

23 Chi-square calculation
(observed - expected)² / expected for each cell:
               Glasses          No glasses
Fellow         (11-13)²/13      (27-25)²/25
Facilitator    (6-4.6)²/4.6     (8-9.4)²/9.4
χ² = 1.11, p = 0.343
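
The calculation above can be reproduced with a short script. This is a minimal sketch using only the Python standard library, not the tool used in the lecture. It takes the observed counts from the slide (11, 27, 6, 8), derives unrounded expected counts from the table margins, and applies no continuity correction. The p value uses the identity that, with 1 degree of freedom, the upper tail of the chi-square distribution equals erfc(sqrt(χ²/2)). With unrounded expected counts the statistic comes out at roughly 0.90 rather than 1.11, since the slide rounds two expected counts to 13 and 25, while the p value agrees with the 0.343 shown.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    Returns the statistic and its p value for 1 degree of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Observed counts from the slide: rows = fellows, facilitators;
# columns = glasses, no glasses.
observed = [[11, 27], [6, 8]]
chi2, p = chi_square_2x2(observed)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
```

With expected counts as small as 4.6 in one cell, Fisher's exact test, mentioned earlier in the lecture, is a common alternative.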

24 t-test
– Used to compare means of a continuous variable in two different groups
– Assumes normal distribution

25 t-test
– H0: fellows with glasses do not tend to sit further in the back of the room compared to fellows without glasses
– H1: fellows with glasses tend to sit further in the back of the room compared to fellows without glasses

26 t-test
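
The content of this slide is not preserved in the transcript. As a hedged reference, one common form of the two-sample t statistic is sketched below, assuming equal variances in the two groups (the pooled-variance Student's t-test). The lecture does not say which variant was used, and the symbols are notation introduced here: group means, sample standard deviations s_1 and s_2, and group sizes n_1 and n_2.

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2},
\qquad
\mathrm{df} = n_1 + n_2 - 2
```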

27 Epidemiology and statistics

28 Criticism of significance testing
"Epidemiological applications need more than a decision as to whether chance alone could have produced an association." (Rothman et al. 2008)
Estimation of an effect measure (e.g. RR, OR) rather than significance testing.

29 Suggested reading
– KJ Rothman, S Greenland, TL Lash. Modern Epidemiology. Lippincott Williams & Wilkins, Philadelphia, PA, 2008.
– SN Goodman, R Royall. Evidence and Scientific Research. AJPH 78, 1568, 1988.
– SN Goodman. Toward Evidence-Based Medical Statistics. 1: The P Value Fallacy. Ann Intern Med. 130, 995, 1999.
– C Poole. Low P-Values or Narrow Confidence Intervals: Which Are More Durable? Epidemiology 12, 291,

30 Previous lecturers
Alain Moren, Paolo D'Ancona, Lisa King, Ágnes Hajdu, Preben Aavitsland, Doris Radun, Manuel Dehnert

