Basic Statistics II. Significance/hypothesis tests.

1 Basic Statistics II

2 Significance/hypothesis tests

3 RCT comparing drug A and drug B for the treatment of hypertension. 50 patients allocated to A, 50 patients allocated to B. Outcome = systolic BP at 3 months

4 Results Group A Mean = 145, sd = 9.9 Group B Mean = 135, sd = 10.0

5 Null hypothesis : “μ (A) = μ (B)” [ie. difference equals 0] Alternative hypothesis : “μ (A) ≠ μ (B)” [ie. difference doesn’t equal zero] [where μ = population mean]

6 Statistical problem When can we conclude that the observed difference mean(A) - mean(B) is large enough to suspect that μ (A) - μ (B) is not zero?

7 P-value : “probability of obtaining data at least as extreme as that observed if the null hypothesis were true” [eg. if no difference in systolic BP between the two groups]

8 How do we evaluate the probability?

9 Test Statistic Numerical value which can be compared with a known statistical distribution Expressed in terms of the observed data and the data expected if the null hypothesis were true

10 Test statistic [mean (A) – mean (B)] / sd [mean(A)-mean(B)] Under null hypothesis this ratio will follow a Normal distribution with mean = 0 and sd = 1

11 Hypertension example Test statistic = [mean (A) – mean (B)] / sd [mean(A)-mean(B)] = [ 145 – 135 ] / 1.99 = 5 → p <0.001

12 Interpretation Drug B results in lower systolic blood pressure in patients with hypertension than does Drug A

13 Two-sample t-test Compares two independent groups of Normally distributed data
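As an illustrative sketch (scipy is not part of this presentation), the summary statistics on slide 4 are enough to reproduce the test statistic on slide 11. Note that scipy uses a t distribution with 98 degrees of freedom rather than the Normal approximation on slide 10; at this sample size the two are essentially identical.

```python
# Two-sample t-test computed from summary statistics alone,
# reproducing the hypertension example (slides 4 and 11):
# group A: mean 145, sd 9.9, n 50; group B: mean 135, sd 10.0, n 50.
from scipy.stats import ttest_ind_from_stats

t, p = ttest_ind_from_stats(
    mean1=145, std1=9.9, nobs1=50,
    mean2=135, std2=10.0, nobs2=50,
)
print(round(t, 1), p < 0.001)  # t ≈ 5.0, p well below 0.001
```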

14 Significance test example I

15 Null hypothesis : “μ (A) = μ (B)” [ie. difference equals 0] Alternative hypothesis : “μ (A) ≠ μ (B)” [ie. difference doesn’t equal zero] Two-sided test

16 Null hypothesis : “μ (A) = μ (B) or μ (A) < μ (B) ” Alternative hypothesis : “μ (A) > μ (B)” One-sided test

17 A one-sided test is only appropriate if a difference in the opposite direction would have the same meaning or result in the same action as no difference

18 Paired-sample t-test Compares two dependent groups of Normally distributed data

19 Paired-sample t-test Mean daily dietary intake of 11 women measured over 10 pre-menstrual and 10 post-menstrual days

20 Dietary intake example Pre-menstrual (n=11): Mean=6753kJ, sd=1142 Post-menstrual (n=11): Mean=5433kJ, sd=1217 Difference Mean=1320, sd=367

21 Dietary intake example Test statistic = 1320/[367/sqrt(11)] = 11.9 p<0.001

22 Dietary intake example Dietary intake during the pre- menstrual period was significantly greater than that during the post- menstrual period
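The paired-test statistic on slide 21 can likewise be reproduced from the reported summary figures alone; this is an illustrative sketch, not part of the original deck.

```python
# Paired t-test statistic from the dietary intake summary (slide 20):
# mean difference = 1320 kJ, sd of differences = 367, n = 11 women.
from math import sqrt
from scipy import stats

n, mean_diff, sd_diff = 11, 1320, 367
se = sd_diff / sqrt(n)             # standard error of the mean difference
t = mean_diff / se                 # test statistic, as on slide 21
p = 2 * stats.t.sf(t, df=n - 1)    # two-sided p-value, 10 df
print(round(t, 1), p < 0.001)      # t ≈ 11.9, p < 0.001
```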

23 The equivalent non-parametric tests Mann-Whitney U-test Wilcoxon matched pairs signed rank sum test

24 Non-parametric tests Based on the ranks of the data Use complicated formulae, hence a computer package is recommended
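In practice both rank-based tests are one-liners in a package such as scipy (not part of the original deck). The data below are invented purely for illustration, since the slides give no raw data for these tests.

```python
# The two non-parametric tests named above, on hypothetical samples.
from scipy import stats

# Mann-Whitney U-test: two independent groups
group_a = [145, 152, 138, 160, 149, 141]
group_b = [133, 140, 129, 137, 144, 131]
u_stat, p_u = stats.mannwhitneyu(group_a, group_b)

# Wilcoxon matched-pairs signed-rank test: two paired measurements
before = [6.1, 7.0, 5.8, 6.4, 7.2, 6.9]
after = [5.2, 6.3, 5.9, 5.6, 6.1, 6.4]
w_stat, p_w = stats.wilcoxon(before, after)

print(round(p_u, 3), round(p_w, 3))
```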

25 Significance test example II

26 Type I error Significant result when the null hypothesis is true [probability = significance level, conventionally 0.05] Type II error Non-significant result when the null hypothesis is false [Power = 1 – P(Type II error)]

27 The chi-square test Used to investigate the relationship between two qualitative variables The analysis of cross-tabulations

28 The chi-square test Compares proportions in two independent samples

29 Chi-square test example In an RCT comparing infra-red stimulation (IRS) with placebo on pain caused by osteoarthritis, 9/12 in IRS group ‘improved’ compared with 4/13 in placebo group

30 Chi-square test example

              Improve?
              Yes    No    Total
Placebo        4      9     13
IRS            9      3     12
Total         13     12     25

31 Placebo : 4/13 = 31% improve IRS: 9/12 = 75% improve

32 Cross-tabulations The chi-square test tests the null hypothesis of no relationship between ‘group’ and ‘improvement’ by comparing the observed frequencies with those expected if the null hypothesis were true

33 Cross-tabulations Expected frequency = (row total × col total) / grand total

34 Chi-square test example Expected value for the observed ‘4’ (Placebo, Yes cell) = 13 x 13 / 25 = 6.8

35 Expected values

              Improve?
              Yes    No
Placebo       6.8    6.2
IRS           6.2    5.8

36 Test Statistic = Σ (observed freq – expected freq)² / expected freq

37 Test Statistic = Σ (O – E)² / E = (4 – 6.76)²/6.76 + (9 – 6.24)²/6.24 + (9 – 6.24)²/6.24 + (3 – 5.76)²/5.76 = 4.9 → p=0.027

38 Chi-square test example Statistically significant difference in improvement between the IRS and placebo groups

39 Small samples The chi-square test is valid if: at least 80% of the expected frequencies exceed 5 and all the expected frequencies exceed 1

40 Small samples If criterion not satisfied then combine or delete rows and columns to give bigger expected values

41 Small samples Alternatively: Use Fisher’s Exact Test [calculates probability of observed table of frequencies - or more extreme tables-under null hypothesis]

42 Yates’ Correction Improves the estimation of the discrete distribution of the test statistic by the continuous chi-square distribution

43 Chi-square test with Yates’ correction Subtract ½ from each |O – E| difference: Σ (|O – E| – ½)² / E
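The chi-square analysis above (slides 29–37), Yates’ correction, and Fisher’s exact test (slide 41) can all be run with scipy — an illustrative sketch, not part of the original deck.

```python
# The 2x2 table from slide 29: Placebo 4 improved / 9 not,
# IRS 9 improved / 3 not.
from scipy.stats import chi2_contingency, fisher_exact

observed = [[4, 9],   # Placebo: improved / not improved
            [9, 3]]   # IRS:     improved / not improved

# correction=False matches the plain chi-square sum on slide 37;
# the default correction=True applies Yates' continuity correction.
chi2, p, dof, expected = chi2_contingency(observed, correction=False)
print(round(chi2, 1), round(p, 3))  # ≈ 4.9 and 0.027, as on slide 37

# Fisher's exact test on the same table (useful for small samples)
odds_ratio, p_exact = fisher_exact(observed)
```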

44 Significance test example III

45 McNemar’s test Compares proportions in two matched samples

46 McNemar’s test example Severe cold reported at age 14 (yes/no) cross-tabulated against severe cold reported at a second age (yes/no) for the same subjects

47 McNemar’s test example Null hypothesis = proportions saying ‘yes’ on the 1st and 2nd occasions are the same, ie. the frequencies for ‘yes,no’ and ‘no,yes’ are equal

48 McNemar’s test Test statistic based on observed and expected ‘discordant’ frequencies Similar to that for simple chi-square test

49 McNemar’s test example Test statistic = 31.4 => p <0.001 Significant difference between the two ages
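A minimal sketch of the McNemar statistic (not part of the original deck). Only the two discordant frequencies matter; the counts below are hypothetical, chosen for illustration.

```python
# McNemar's test statistic from the discordant frequencies:
# b = count of 'yes,no' pairs, c = count of 'no,yes' pairs.
from scipy.stats import chi2


def mcnemar_statistic(b, c):
    """Chi-square statistic comparing the discordant frequencies
    against their common expected value (b + c) / 2 under the null;
    this simplifies to (b - c)^2 / (b + c)."""
    return (b - c) ** 2 / (b + c)


stat = mcnemar_statistic(15, 40)   # hypothetical discordant counts
p = chi2.sf(stat, df=1)            # compare with chi-square, 1 df
print(round(stat, 2), p < 0.001)
```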

50 Significance test example IV

51 Comparison of means 2 groups: 2-sample t-test; 3 or more groups: ANOVA

52 One-way analysis of variance Example: Assessing the effect of treatment on the stress levels of a cohort of 60 subjects in 3 age-groups (20 per group): 15-25, 26-45, and a third, older group. Stress measured on scale 0-100

53 Stress levels

Group            Mean (SD)
Group 1 (n=20)   52.8 (11.2)
Group 2 (n=20)   33.4 (15.0)
Group 3 (n=20)   35.6 (11.7)

54 Graph of stress levels

55 ANOVA

                 Sum of squares   df   Mean square   F   Sig
Between groups                     2                     <0.001
Within groups                     57
Total                             59
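The F statistic can be reconstructed from the group summaries on slide 53 alone (means, sds, n = 20 per group) — an illustrative computation, not part of the original deck.

```python
# One-way ANOVA F statistic from summary statistics (slide 53).
from scipy.stats import f as f_dist

n = 20
means = [52.8, 33.4, 35.6]
sds = [11.2, 15.0, 11.7]
k = len(means)                     # number of groups
grand_mean = sum(means) / k        # valid because group sizes are equal

# Between-groups and within-groups sums of squares
ss_between = n * sum((m - grand_mean) ** 2 for m in means)
ss_within = sum((n - 1) * s ** 2 for s in sds)
df_between, df_within = k - 1, k * (n - 1)   # 2 and 57

F = (ss_between / df_between) / (ss_within / df_within)
p = f_dist.sf(F, df_between, df_within)
print(round(F, 1), p < 0.001)  # F ≈ 13.9, consistent with Sig < 0.001
```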

56 Interpretation Significant difference between the three age-groups with respect to stress levels But what about the specific (pairwise) differences?

57 Stress levels

Group            Mean (SD)
Group 1 (n=20)   52.8 (11.2)
Group 2 (n=20)   33.4 (15.0)
Group 3 (n=20)   35.6 (11.7)

58 Multiple comparisons Comparing each pair of means in turn gives a high probability of finding a significant result by chance A multiple comparison method (eg. Scheffé, Duncan, Newman-Keuls) makes appropriate adjustment

59 Scheffé’s test

Comparison       p-value
Group 1 vs 2     p <
Group 1 vs 3     p <
Group 2 vs 3     p = 0.86

60 Stress levels

Group            Mean (SD)
Group 1 (n=20)   52.8 (11.2)
Group 2 (n=20)   33.4 (15.0)
Group 3 (n=20)   35.6 (11.7)

61 Comparison of medians 2 groups: Mann-Whitney; 3 or more groups: Kruskal-Wallis

62 Kruskal-Wallis Example: Stress levels Overall comparison of 3 groups: p<0.001

63 Multiple comparisons There are no non-parametric equivalents to the multiple comparison tests such as Scheffé’s Need to apply Bonferroni’s correction to multiple Mann-Whitney U-tests

64 Bonferroni’s correction For k comparisons between means: multiply each p value by k
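The rule above is a one-line computation; the p-values in this sketch are hypothetical, chosen for illustration.

```python
# Bonferroni's correction: multiply each p-value by the number of
# comparisons k, capping at 1 (a probability cannot exceed 1).
def bonferroni(p_values):
    k = len(p_values)
    return [min(1.0, p * k) for p in p_values]


adjusted = bonferroni([0.01, 0.04, 0.68])      # hypothetical p-values
print([round(p, 3) for p in adjusted])         # → [0.03, 0.12, 1.0]
```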

65 Mann-Whitney U-test

Comparison       p-value
Group 1 vs 2     p <
Group 1 vs 3     p <
Group 2 vs 3     p = 0.68

Need to multiply each p-value by 3

66 Significance test example V
