Published by Marybeth Owen. Modified over 9 years ago.
1
Overview and Common Pitfalls in Statistics and How to Avoid Them
Bandit Thinkhamrop, PhD.(Statistics) Dept. of Biostatistics & Demography Faculty of Public Health Khon Kaen University
2
Roles of Statistics in Research Begin at a clear destination
What is the conclusion?
What is it based on?
Could it be wrong?
Could the data be wrong?
Could the data analysis be wrong?
3
Statistics Quantify the effect and its error
Magnitude of effect
Parameter estimation [95%CI]
Hypothesis testing [P-value]
Quantify errors for further judgments
4
P-value vs. 95%CI (1) An example of a study with dichotomous outcome
A study compared the cure rate between Drug A and Drug B.
Setting: Drug A = alternative treatment; Drug B = conventional treatment
Results: Drug A: n1 = 50, Pa = 80%; Drug B: n2 = 50, Pb = 50%
Pa-Pb = 30% (95%CI: 26% to 34%; P=0.001)
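The reported estimate can be reproduced from the two proportions and sample sizes. The sketch below is one possible calculation, assuming a Wald interval for the difference and a pooled two-sample z-test (the slide does not say which methods were used); the function name `risk_difference` is hypothetical. With 50 patients per group this approach gives an interval of roughly 12% to 48%, wider than the 26% to 34% quoted on the slide, so the slide's interval is probably illustrative rather than derived from these counts.

```python
from math import sqrt
from statistics import NormalDist


def risk_difference(p_a, n_a, p_b, n_b, level=0.95):
    """Return (difference, CI lower, CI upper, two-sided p-value).

    Assumptions (not stated in the slides): Wald interval with an
    unpooled standard error, and a z-test with a pooled standard error.
    """
    diff = p_a - p_b
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)

    # Unpooled standard error, used for the confidence interval.
    se_ci = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)

    # Pooled standard error under H0: Pa = Pb, used for the test.
    pooled = (p_a * n_a + p_b * n_b) / (n_a + n_b)
    se_test = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))

    z = diff / se_test
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, diff - z_crit * se_ci, diff + z_crit * se_ci, p_value


# Values from the slide: Drug A 80% of 50, Drug B 50% of 50.
diff, lower, upper, p_value = risk_difference(0.80, 50, 0.50, 50)
print(f"Pa-Pb = {diff:.0%} (95%CI: {lower:.1%} to {upper:.1%}; P={p_value:.4f})")
```

Reporting all three numbers together keeps the magnitude, its precision, and the test result in one place.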
5
P-value vs. 95%CI (2) Pa > Pb Pb > Pa
Pa-Pb = 30% (95%CI: 26% to 34%; P< 0.05)
6
P-value vs. 95%CI (3) Adapted from: Armitage, P. and Berry, G. Statistical methods in medical research. 3rd edition. Blackwell Scientific Publications, Oxford page 99
7
Tips #6 (b) P-value vs. 95%CI (4)
Adapted from: Armitage, P. and Berry, G. Statistical methods in medical research. 3rd edition. Blackwell Scientific Publications, Oxford, page 99.
There was a statistically significant difference between the two groups.
8
Tips #6 (b) P-value vs. 95%CI (5)
Adapted from: Armitage, P. and Berry, G. Statistical methods in medical research. 3rd edition. Blackwell Scientific Publications, Oxford, page 99.
There was no statistically significant difference between the two groups.
9
P-value vs. 95%CI (6) Safe tips:
Always report the 95%CI together with the p-value; never report the p-value alone.
Always interpret the result based on the lower or upper limit of the confidence interval; the p-value is optional.
Never interpret a p-value > 0.05 as an indication of no difference or no association; only the CI can provide this message.
10
1. Over reliance on p-value
Example:
Significant findings: p-value < 0.05
Non-significant findings: p-value > 0.05
17
Diff walking = 1.3, 95%CI: to 2.62
Diff walking + counting numbers = 3.0, 95%CI: to 5.87
Diff walking + counting months = 3.9, 95%CI: to 6.62
20
1. Over reliance on p-value (cont.)
Example: Significant findings: p-value < 0.05
Tips to avoid it:
Always report the magnitude of effect and its 95%CI.
Always interpret the findings by comparing the magnitude of effect, at either the lower or the upper boundary of the CI, against the minimum meaningful level.
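The tip above can be written down as a small decision rule. This is only a sketch: the function name `interpret_ci`, the category labels, and the 20% minimum meaningful level are assumptions made for illustration, and it assumes an effect measure where larger values mean more benefit.

```python
def interpret_ci(lower, upper, meaningful, null=0.0):
    """Classify a confidence interval against a minimum meaningful level.

    Assumes larger values mean more benefit. The labels are illustrative.
    """
    if lower >= meaningful:
        # Even the lower limit reaches the meaningful level.
        return "meaningful effect"
    if upper < meaningful:
        # Even the upper limit falls short of the meaningful level.
        if lower > null:
            return "statistically significant but below the meaningful level"
        return "no meaningful effect"
    # The interval contains the meaningful level: data cannot decide.
    return "inconclusive"


MEANINGFUL = 0.20  # hypothetical minimum meaningful difference in cure rate

slide_result = interpret_ci(0.26, 0.34, MEANINGFUL)   # interval quoted on the slide
wide_result = interpret_ci(-0.10, 0.70, MEANINGFUL)   # hypothetical small study
small_result = interpret_ci(0.02, 0.08, MEANINGFUL)   # hypothetical large study

for label, result in [("slide", slide_result), ("wide", wide_result), ("small", small_result)]:
    print(f"{label}: {result}")
```

The second and third cases show why the p-value alone is not enough: a non-significant wide interval is inconclusive rather than evidence of no effect, and a significant narrow interval can still exclude any meaningful effect.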
21
2. Test for baseline comparisons
Factors                     Group A (n=20)   Group B       P-value
Age (years)                 39.0 ± 0.2       39.5 ± 0.5    <0.001
Male, n (%)                 2 (10.0%)        6 (30.0%)     0.114
Weight (kg)                 60 ± 52          30 ± 55       0.084
Height (cm)                 160 ± 100        130 ± 99      0.346
SBP at baseline (mmHg)      135 ± 5          130 ± 8       0.023
VAS (pain) at baseline      5 ± 5            9 ± 8         0.067
Values are mean ± SD unless indicated otherwise.
23
Test for baseline
24
2. Test for baseline comparisons (cont.)
Compare all variables that could be related to the association between the exposure and the study outcome.
Imbalance is judged clinically; no statistical test is needed.
The magnitude of the difference matters, NOT the p-value.
Whether a difference is large or small is a clinical judgment.
If a variable is not highly correlated with the study outcome, it can be ignored even if the difference is large.
If in doubt, use multivariable analysis with all imbalanced variables included in the model.
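The baseline table from the previous slide illustrates the point: the p-values do not track the size of the differences. The sketch below lists the raw difference for each continuous variable next to its reported p-value and flags it against a clinical threshold. The thresholds are hypothetical placeholders, not values from the lecture; in practice they come from clinical judgment.

```python
# (variable, mean in Group A, mean in Group B, p-value as reported,
#  hypothetical clinically relevant difference)
baseline = [
    ("Age (years)", 39.0, 39.5, "<0.001", 5.0),
    ("Weight (kg)", 60.0, 30.0, "0.084", 5.0),
    ("Height (cm)", 160.0, 130.0, "0.346", 5.0),
    ("SBP at baseline (mmHg)", 135.0, 130.0, "0.023", 10.0),
    ("VAS (pain) at baseline", 5.0, 9.0, "0.067", 2.0),
]


def flag_imbalance(rows):
    """Return {variable: True if |difference| reaches the threshold}."""
    flagged = {}
    for name, mean_a, mean_b, p_text, threshold in rows:
        diff = mean_a - mean_b
        flagged[name] = abs(diff) >= threshold
        verdict = "imbalanced" if flagged[name] else "balanced"
        print(f"{name:<24} diff = {diff:+6.1f}  P = {p_text:<7} -> {verdict}")
    return flagged


flags = flag_imbalance(baseline)
```

Under these assumed thresholds, age (P < 0.001) differs by only half a year, while weight (P = 0.084) differs by 30 kg.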
25
3. No magnitude of effect presented
Example: See various examples in the class.
Tips to avoid it:
"If you can’t count it, it doesn’t exist…" (Tribe, 1971, p. 1360)
Always quantify the magnitude of effect.
Always provide the confidence interval of the effect that is the primary objective of the study.
27
Factors affecting birth weight
Columns: Number, Mean, Mean Diff, 95%CI, P-value
1. Complete ANC: Yes / No    xxx xx.x xx.x – xx.x 0.xxx
2. Education of mother: Primary school or lower / Secondary school / College or higher
3. Mother's age (years): Less than 20 / 20 – 45 / 45 or older
28
Factors affecting low birth weight
Columns: Number, % LBW, OR, 95%CI, P-value
1. Complete ANC: Yes / No    xxx xx.x 1 x.xx x.xx – x.xx 0.xxx
2. Education of mother: Primary school or lower / Secondary school / College or higher
3. Mother's age (years): Less than 20 / 20 – 45 / 45 or older
29
4. Applied inappropriate methods of analysis
Example:
Method inconsistent with the type of data
Dependency among observations not handled
Sampling design not accounted for
Missing data not handled well
Confounding effects not accounted for
Interaction effects not investigated
Tips to avoid it: Choose the method based on the objective and design of the study.
30
5. Described methods of analysis inappropriately
Example: Description is too general.
Tips to avoid it: Be specific enough that the analysis can be replicated.
31
6. Presented the results inappropriately
Example: See various examples in the class and some examples as follows:
Sex (OR = 3.5)
Age (OR = 1.5)
Marital status (OR = 2.0)
Tips to avoid it:
Always quantify the magnitude of effect.
Always provide the confidence interval of the effect that is the primary objective of the study.
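An odds ratio reported without an interval, as in the example above, says nothing about precision. The sketch below shows one common way to attach a 95%CI, assuming the Woolf (log-based) method, which the slides do not specify. The counts are borrowed from the sex-by-disease table shown later in the deck (8 and 2 for males, 12 and 38 for females); the function name `odds_ratio_ci` is hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_ci(a, b, c, d, level=0.95):
    """Odds ratio and Woolf confidence interval for a 2x2 table.

    a, b = exposed with / without the outcome
    c, d = unexposed with / without the outcome
    All four cells must be non-zero for this method.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    lower = exp(log(odds_ratio) - z_crit * se_log)
    upper = exp(log(odds_ratio) + z_crit * se_log)
    return odds_ratio, lower, upper


# Male: 8 diseased, 2 normal; Female: 12 diseased, 38 normal.
or_est, or_lower, or_upper = odds_ratio_ci(8, 2, 12, 38)
print(f"Sex (OR = {or_est:.1f}; 95%CI: {or_lower:.1f} to {or_upper:.1f})")
```

With a cell count as small as 2, the interval spans from about 2 to well above 60, which is the information a bare "OR = 12.7" would hide.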
32
Repeated measure ANOVA
33
Logistic regression
34
Student’s t-test
35
Correlation coefficient
37
ANOVA and t-test
39
Regression Model
41
Regression model
42
Concluded based on sample statistics NOT on population parameter
43
Within or between group
45
7. Sample size unjustified
Example: Simplified methods might be misleading Tips to avoid it:
46
8. Interpret a confidence interval inappropriately
Example:
Width -> judged only as a wide vs. narrow interval
Crossing the null value -> judged only as significant vs. non-significant
Tips to avoid it: Compare the magnitude of either the lower or the upper boundary of the interval with the meaningful level, then make a judgment.
47
9. Categorization of the continuous variable inappropriately
Example:
Continuous -> categorical
Numerical count -> categorical
Survival outcome -> categorical
Tips to avoid it:
Base the decision on the research question.
Keep the intrinsic type of the variable; categorization can be done for exploratory purposes.
Base the decision on clinical judgment.
48
10. Handle the primary outcome inappropriately
Example: Interchanging among the following:
Continuous outcome
Categorical outcome
Numerical count
Survival outcome
Tips to avoid it:
Base the decision on the research question.
Base the decision on clinical judgment.
49
11. Before-after design
Example: Possible approaches:
Post measurement only
Change score
Fraction
Post measurement adjusted for baseline
Tips to avoid it:
Base the choice on the research question.
Preferably, use the post measurement adjusted for baseline.
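The approaches listed above can give quite different answers on the same data. The sketch below compares three of them on a small made-up data set (all values are hypothetical). The baseline-adjusted estimate is computed as in a two-group ANCOVA with a common slope: the difference in post means minus the pooled within-group slope times the difference in baseline means. The function names are assumptions for illustration.

```python
from statistics import mean


def cross_dev(x, y):
    """Sum of cross-products of deviations from the means."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y))


def compare_approaches(pre_t, post_t, pre_c, post_c):
    """Return (post-only, change-score, baseline-adjusted) differences."""
    post_only = mean(post_t) - mean(post_c)
    change = (mean(post_t) - mean(pre_t)) - (mean(post_c) - mean(pre_c))
    # Pooled within-group slope of post on pre (common-slope ANCOVA).
    slope = (cross_dev(pre_t, post_t) + cross_dev(pre_c, post_c)) / (
        cross_dev(pre_t, pre_t) + cross_dev(pre_c, pre_c)
    )
    adjusted = post_only - slope * (mean(pre_t) - mean(pre_c))
    return post_only, change, adjusted


# Hypothetical scores; the treated group starts 2 points higher at baseline.
pre_control, post_control = [4, 5, 6, 7, 8], [5, 6, 6, 8, 9]
pre_treated, post_treated = [6, 7, 8, 9, 10], [9, 9, 11, 12, 13]

post_only, change, adjusted = compare_approaches(
    pre_treated, post_treated, pre_control, post_control
)
print(f"Post only: {post_only:.2f}")
print(f"Change score: {change:.2f}")
print(f"Adjusted for baseline: {adjusted:.2f}")
```

In this example the post-only comparison absorbs the baseline imbalance, while the change score and the adjusted estimate remove it in different ways. A full analysis would also report the confidence interval of the adjusted difference, which this sketch omits.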
51
Between or within group comparisons?
52
Suggested format of presentation
Time   Group 1 (n=25)   Group 2      Group 3      Diff 2-1 (95%CI, p-value)*   Diff 3-1 (95%CI, p-value)*
Pre    1.5 ± 1.0        1.8 ± 1.2    2.5 ± 2.0    NA                           NA
Post   1.7 ± 1.0                     3.5 ± 3.0    0.8 ( ) P=0.01               1.8 ( ) P=0.03
Late   1.6 ± 1.0        2.9 ± 1.5    4.5 ± 2.0    1.3 ( )                      2.9 ( )
* Mean difference adjusted for baseline using ANCOVA
53
12. Jumping to a non-parametric test without thorough exploration of the distribution of the data
Example: “Since the sample size is small, we decided to use non-parametric test.”
Tips to avoid it:
Raw data could be more informative than a p-value obtained from a non-parametric test.
A small sample cannot be corrected by non-parametric statistics; in fact, there is NOT SUFFICIENT evidence to allow any valid conclusion!
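One way to see why a non-parametric test cannot rescue a small sample: an exact rank-based test has only a limited number of possible outcomes. The sketch below assumes an exact two-sided Wilcoxon rank-sum (Mann-Whitney) test with no ties, where the most extreme ranking has probability 1 / C(n1+n2, n1) in each tail. The function name `smallest_exact_p` is hypothetical.

```python
from math import comb


def smallest_exact_p(n1, n2):
    """Smallest two-sided p-value an exact rank-sum test can produce.

    Assumes no ties: the most extreme arrangement of ranks is one of
    C(n1 + n2, n1) equally likely arrangements, doubled for two sides.
    """
    return min(1.0, 2 / comb(n1 + n2, n1))


min_p = {n: smallest_exact_p(n, n) for n in (3, 4, 5)}
for n, p in min_p.items():
    print(f"n = {n} per group: smallest possible p-value = {p:.4f}")
```

With three observations per group the test can never reach p < 0.05, however large the difference, so switching to it adds no evidence that the data do not already contain.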
54
13. Row total, Column total, Grand total fixed?
Example:
Row total fixed -> cohort study
Column total fixed -> case-control study

Row percentages (row total fixed):
Sex      Disease       Normal        Total
Male     8 (80%)       2 (20%)       10 (100%)
Female   12 (24%)      38 (76%)      50 (100%)
Total    20 (33.3%)    40 (66.7%)    60 (100%)

Column percentages (column total fixed):
Sex      Disease       Normal        Total
Male     8 (40%)       2 (5%)        10 (16.7%)
Female   12 (60%)      38 (95%)      50 (83.3%)
Total    20 (100%)     40 (100%)     60 (100%)

Tips to avoid it: Choose the direction of the percentages based on the study design.
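The two tables above use the same counts and differ only in the denominator. As a minimal sketch, the functions below (names are hypothetical) compute both sets of percentages from the four cell counts, which makes it explicit which total is treated as fixed.

```python
counts = {
    "Male": {"Disease": 8, "Normal": 2},
    "Female": {"Disease": 12, "Normal": 38},
}


def row_percent(table):
    """Percent within each row: row totals are fixed (cohort design)."""
    return {
        row: {col: 100 * n / sum(cells.values()) for col, n in cells.items()}
        for row, cells in table.items()
    }


def column_percent(table):
    """Percent within each column: column totals are fixed (case-control)."""
    totals = {}
    for cells in table.values():
        for col, n in cells.items():
            totals[col] = totals.get(col, 0) + n
    return {
        row: {col: 100 * n / totals[col] for col, n in cells.items()}
        for row, cells in table.items()
    }


by_row = row_percent(counts)
by_col = column_percent(counts)
print(f"Disease among males (row %): {by_row['Male']['Disease']:.0f}%")
print(f"Males among diseased (column %): {by_col['Male']['Disease']:.0f}%")
```

The same cell (8 diseased males) reads as 80% or 40% depending on the denominator, so the choice has to follow the sampling scheme of the study.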
55
14. Conclusions based on opinion or too general, not on the main findings or specific to the study results
Example: “Effective prevention strategies should be formulated. Health education should be provided.”
Tips to avoid it:
Link the conclusion logically to the main finding that answers the primary research question.
Be specific to what was found in the study.
56
Mistakes are our teachers. Q & A